From brett at python.org Fri Nov 1 14:35:38 2013 From: brett at python.org (Brett Cannon) Date: Fri, 1 Nov 2013 09:35:38 -0400 Subject: [Python-Dev] PEP 451 update In-Reply-To: References: Message-ID: On Thu, Oct 31, 2013 at 6:10 PM, Nick Coghlan wrote: > > On 1 Nov 2013 01:37, "PJ Eby" wrote: > . ;-) > > > > I also suspect, that if properly spelled out, those use cases are > > going to boil down to: > > > > 1. Throwing errors if you have an existing module object you can't > > load into, and > > 2. Passing in a previous spec object, if available > > > > In other words, loaders should not really have any responsibility for > > or concept of "reloading" -- they always load into a module object > > (that they may or may not have created), and they may get given a spec > > from a previous load. They should deal only in "module reuse" and > > "spec reuse". While a typical reload() might involve both reuses, > > there are cases where one sort of reuse could occur independently, and > > not all loaders care about both (or even either) condition. > > > > At any rate, it means a loader author doesn't have to figure out how > > to handle "reloading", all they have to figure out is whether they can > > load into a particular module object, and whether they can do > > something useful with a spec that was previously used to load a module > > with the same name -- a spec that may or may not refer to a similar > > previous loader. These are rather more well-defined endeavors than > > trying to determine in the abstract whether one "supports reload". > > ;-) > > It may not have been clear from the email exchange, but that's basically > where we ended up :) > > The change Eric is currently going to make to the PEP is to add an > optional "previous" parameter to the various find_spec APIs. > > At the time when find_spec runs, sys.modules hasn't been touched yet, so > the old module state is still available when reloading. Passing the spec in > lets the loader decide whether or not it can actually load that module > given the information about the previous load operation. > Do you mean loader or finder? You say "find_spec" which is for finders, but then you say "loader". > However, perhaps it makes more sense to pass the entire existing module, > rather than just the spec? I'd like to minimise the need for new-style > loader authors to retrieve things directly from the sys module, and you're > right that "can I reload into this particular module?" is a slightly > different question from "can I reload a module previously loaded using this > particular spec?". > > A spec-based API could still be used to answer the first question by > looking up sys.modules, but so could the name based API. Passing in the > module reference, on the other hand, should let loaders answer both > questions without needing to touch the sys module. > > I also now think this pass-the-module approach when finding the spec > approach could also clean up some of the messiness that is __main__ > initialisation, where we repeatedly load things into the same module object. > See you mention "finding" but before now everything has been "loader". > Let's say we be completely explicit and call the new parameter something > like "load_target". If we do that then I would make the *same* change to > runpy.run_path and runpy.run_module: let you pass in an existing module > object under that name to say "hey, run in this namespace, rather than a > fresh one". 
(Those APIs currently only support pre-populating a *fresh* > namespace, rather than allowing you to directly specify an existing one) > > Most loaders won't care, but those that do will have all the info they > need to throw an exception saying "I could provide a spec for this, but > it's not compatible with that load target". > Exactly what APIs -- new or changed -- are you proposing? Method signatures only please to avoid ambiguity. =) -------------- next part -------------- An HTML attachment was scrubbed... URL: From status at bugs.python.org Fri Nov 1 18:07:35 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 1 Nov 2013 18:07:35 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20131101170735.E65A956A3F@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-10-25 - 2013-11-01) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 4217 (+18) closed 27029 (+64) total 31246 (+82) Open issues with patches: 1955 Issues opened (49) ================== #19394: distutils.core.Extension: empty strings in library_dirs and in http://bugs.python.org/issue19394 opened by robotron #19397: test_pydoc fails with -S http://bugs.python.org/issue19397 opened by pitrou #19398: test_trace fails with -S http://bugs.python.org/issue19398 opened by pitrou #19402: AbstractBasicAuthHandler http://bugs.python.org/issue19402 opened by Perry.Lorier #19403: Make contextlib.redirect_stdout reentrant http://bugs.python.org/issue19403 opened by ncoghlan #19404: Simplify per-instance control of help() output http://bugs.python.org/issue19404 opened by ncoghlan #19406: PEP 453: add the ensurepip module http://bugs.python.org/issue19406 opened by ncoghlan #19407: PEP 453: update the "Installing Python Modules" documentation http://bugs.python.org/issue19407 opened by ncoghlan #19411: binascii.hexlify docs say it returns a string (it returns byte http://bugs.python.org/issue19411 opened by Devin Jeanpierre #19412: Add docs for test.support.requires_docstrings decorator http://bugs.python.org/issue19412 opened by ncoghlan #19414: iter(ordered_dict) yields keys not in dict in some circumstanc http://bugs.python.org/issue19414 opened by Nikratio #19415: test_gdb fails when using --without-doc-strings on Fedora 19 http://bugs.python.org/issue19415 opened by ncoghlan #19417: bdb test coverage http://bugs.python.org/issue19417 opened by Colin.Williams #19419: Use abc.ABC in the collections ABC http://bugs.python.org/issue19419 opened by rhettinger #19420: Leak in _hashopenssl.c http://bugs.python.org/issue19420 opened by skrah #19422: Neither DTLS nor error for SSLSocket.sendto() of UDP socket http://bugs.python.org/issue19422 opened by christian.heimes #19428: marshal: error cases are not documented http://bugs.python.org/issue19428 opened by haypo #19429: OSError constructor does not handle errors correctly http://bugs.python.org/issue19429 opened by haypo #19431: Document PyFrame_FastToLocals() and PyFrame_FastToLocalsWithEr http://bugs.python.org/issue19431 opened by haypo #19433: Define PY_UINT64_T on Windows 32bit http://bugs.python.org/issue19433 opened by christian.heimes #19434: Wrong documentation of MIMENonMultipart class http://bugs.python.org/issue19434 opened by vajrasky #19437: More failures found by pyfailmalloc http://bugs.python.org/issue19437 opened by haypo #19438: Where is NoneType in Python 3? 
http://bugs.python.org/issue19438 opened by mpb #19439: Build _testembed on Windows http://bugs.python.org/issue19439 opened by zach.ware #19440: Clean up test_capi http://bugs.python.org/issue19440 opened by zach.ware #19441: itertools.tee improve documentation http://bugs.python.org/issue19441 opened by Alan.Cristhian #19444: mmap.mmap() allocates a file descriptor that isn't CLOEXEC http://bugs.python.org/issue19444 opened by rfm #19447: py_compile.compile raises if a file has bad encoding http://bugs.python.org/issue19447 opened by bkabrda #19448: SSL: add OID / NID lookup http://bugs.python.org/issue19448 opened by christian.heimes #19449: csv.DictWriter can't handle extra non-string fields http://bugs.python.org/issue19449 opened by tomas_grahn #19450: Bug in sqlite in Windows binaries http://bugs.python.org/issue19450 opened by schlamar #19451: urlparse accepts invalid hostnames http://bugs.python.org/issue19451 opened by daenney #19453: pydoc.py doesn't detect IronPython, help(foo) can hang http://bugs.python.org/issue19453 opened by owlmonkey #19454: devguide: Document what a "platform support" is http://bugs.python.org/issue19454 opened by techtonik #19456: ntpath doesn't join paths correctly when a drive is present http://bugs.python.org/issue19456 opened by gvanrossum #19459: Python does not support the GEORGIAN-PS charset http://bugs.python.org/issue19459 opened by Caolán.McNamara #19460: Add test for MIMENonMultipart http://bugs.python.org/issue19460 opened by vajrasky #19461: RawConfigParser modifies empty strings unconditionally http://bugs.python.org/issue19461 opened by Nacsa.Kristóf #19462: Add remove_argument() method to argparse.ArgumentParser http://bugs.python.org/issue19462 opened by ustinov #19463: assertGdbRepr depends on hash randomization / endianess http://bugs.python.org/issue19463 opened by christian.heimes #19464: Remove warnings from Windows buildbot "clean" script http://bugs.python.org/issue19464 opened by zach.ware #19465: selectors: provide a helper to choose a selector using constra http://bugs.python.org/issue19465 opened by haypo #19466: Clear state of threads earlier in Python shutdown http://bugs.python.org/issue19466 opened by haypo #19467: asyncore documentation: redirect users to the new asyncio modu http://bugs.python.org/issue19467 opened by haypo #19468: Relax the type restriction on reloaded modules http://bugs.python.org/issue19468 opened by eric.snow #19469: Duplicate namespace package portions (but not on Windows) http://bugs.python.org/issue19469 opened by eric.snow #19470: email.header.Header - should not allow two newlines in a row http://bugs.python.org/issue19470 opened by hhm #19472: inspect.getsource() raises a wrong exception type http://bugs.python.org/issue19472 opened by marco.buttu #19474: Argument Clinic causing compiler warnings about uninitialized http://bugs.python.org/issue19474 opened by brett.cannon Most recent 15 issues with no replies (15) ========================================== #19474: Argument Clinic causing compiler warnings about uninitialized http://bugs.python.org/issue19474 #19472: inspect.getsource() raises a wrong exception type http://bugs.python.org/issue19472 #19469: Duplicate namespace package portions (but not on Windows) http://bugs.python.org/issue19469 #19468: Relax the type restriction on reloaded modules http://bugs.python.org/issue19468 #19467: asyncore documentation: redirect users to the new asyncio modu http://bugs.python.org/issue19467 #19466: Clear state of threads earlier in Python
shutdown http://bugs.python.org/issue19466 #19463: assertGdbRepr depends on hash randomization / endianess http://bugs.python.org/issue19463 #19460: Add test for MIMENonMultipart http://bugs.python.org/issue19460 #19456: ntpath doesn't join paths correctly when a drive is present http://bugs.python.org/issue19456 #19454: devguide: Document what a "platform support" is http://bugs.python.org/issue19454 #19453: pydoc.py doesn't detect IronPython, help(foo) can hang http://bugs.python.org/issue19453 #19441: itertools.tee improve documentation http://bugs.python.org/issue19441 #19440: Clean up test_capi http://bugs.python.org/issue19440 #19434: Wrong documentation of MIMENonMultipart class http://bugs.python.org/issue19434 #19431: Document PyFrame_FastToLocals() and PyFrame_FastToLocalsWithEr http://bugs.python.org/issue19431 Most recent 15 issues waiting for review (15) ============================================= #19466: Clear state of threads earlier in Python shutdown http://bugs.python.org/issue19466 #19464: Remove warnings from Windows buildbot "clean" script http://bugs.python.org/issue19464 #19460: Add test for MIMENonMultipart http://bugs.python.org/issue19460 #19449: csv.DictWriter can't handle extra non-string fields http://bugs.python.org/issue19449 #19448: SSL: add OID / NID lookup http://bugs.python.org/issue19448 #19447: py_compile.compile raises if a file has bad encoding http://bugs.python.org/issue19447 #19440: Clean up test_capi http://bugs.python.org/issue19440 #19439: Build _testembed on Windows http://bugs.python.org/issue19439 #19434: Wrong documentation of MIMENonMultipart class http://bugs.python.org/issue19434 #19420: Leak in _hashopenssl.c http://bugs.python.org/issue19420 #19419: Use abc.ABC in the collections ABC http://bugs.python.org/issue19419 #19417: bdb test coverage http://bugs.python.org/issue19417 #19406: PEP 453: add the ensurepip module http://bugs.python.org/issue19406 #19403: Make contextlib.redirect_stdout reentrant http://bugs.python.org/issue19403 #19398: test_trace fails with -S http://bugs.python.org/issue19398 Top 10 most discussed issues (10) ================================= #19183: PEP 456 Secure and interchangeable hash algorithm http://bugs.python.org/issue19183 26 msgs #19414: iter(ordered_dict) yields keys not in dict in some circumstanc http://bugs.python.org/issue19414 19 msgs #19406: PEP 453: add the ensurepip module http://bugs.python.org/issue19406 12 msgs #19412: Add docs for test.support.requires_docstrings decorator http://bugs.python.org/issue19412 11 msgs #19227: test_multiprocessing_xxx hangs under Gentoo buildbots http://bugs.python.org/issue19227 10 msgs #19465: selectors: provide a helper to choose a selector using constra http://bugs.python.org/issue19465 10 msgs #4331: Add functools.partialmethod http://bugs.python.org/issue4331 9 msgs #19433: Define PY_UINT64_T on Windows 32bit http://bugs.python.org/issue19433 8 msgs #19063: Python 3.3.3 encodes emails containing non-ascii data as 7bit http://bugs.python.org/issue19063 7 msgs #19437: More failures found by pyfailmalloc http://bugs.python.org/issue19437 7 msgs Issues closed (64) ================== #11161: futures.ProcessPoolExecutor hangs http://bugs.python.org/issue11161 closed by bquinlan #13234: os.listdir breaks with literal paths http://bugs.python.org/issue13234 closed by tim.golden #13789: _tkinter does not build on Windows 7 http://bugs.python.org/issue13789 closed by terry.reedy #14255: tempfile.gettempdir() didn't return the path with correct case 
http://bugs.python.org/issue14255 closed by tim.golden #15634: Add serialized decorator to the threading module http://bugs.python.org/issue15634 closed by jjdominguezm #15792: Fix compiler options for x64 builds on Windows http://bugs.python.org/issue15792 closed by tim.golden #16427: Faster hash implementation http://bugs.python.org/issue16427 closed by serhiy.storchaka #16809: Tk 8.6.0 introduces TypeError. (Tk 8.5.13 works) http://bugs.python.org/issue16809 closed by serhiy.storchaka #16904: http.client.HTTPConnection.send double send data http://bugs.python.org/issue16904 closed by serhiy.storchaka #17933: format str bug in urllib request.py http://bugs.python.org/issue17933 closed by r.david.murray #17936: O(n**2) behaviour when adding/removing classes http://bugs.python.org/issue17936 closed by pitrou #18408: Fixes crashes found by pyfailmalloc http://bugs.python.org/issue18408 closed by haypo #18509: CJK decoders should return MBERR_EXCEPTION on PyUnicodeWriter http://bugs.python.org/issue18509 closed by haypo #18683: Core dumps on CentOS http://bugs.python.org/issue18683 closed by schlamar #18685: Restore re performance to pre-PEP393 level http://bugs.python.org/issue18685 closed by serhiy.storchaka #18810: Stop doing stat calls in importlib.machinery.FileFinder to see http://bugs.python.org/issue18810 closed by brett.cannon #19141: Windows Launcher fails to respect PATH http://bugs.python.org/issue19141 closed by vinay.sajip #19172: selectors: add keys() method http://bugs.python.org/issue19172 closed by neologix #19329: Faster compiling of charset regexpes http://bugs.python.org/issue19329 closed by serhiy.storchaka #19330: Use public classes for contextlib.suppress and redirect_stdout http://bugs.python.org/issue19330 closed by python-dev #19331: Revise PEP 8 recommendation for class names http://bugs.python.org/issue19331 closed by barry #19332: Guard against changing dict during iteration http://bugs.python.org/issue19332 closed by rhettinger #19339: telnetlib: time.monotonic() should be used instead of time.tim http://bugs.python.org/issue19339 closed by python-dev #19349: Not so correct exception message when running venv http://bugs.python.org/issue19349 closed by python-dev #19353: on RHEL6 simple build fails with test_linux_constants (test.te http://bugs.python.org/issue19353 closed by brett.cannon #19373: Tkinter apps including IDLE may not display properly on OS X 1 http://bugs.python.org/issue19373 closed by ned.deily #19375: Deprecate site-python in site.py http://bugs.python.org/issue19375 closed by pitrou #19387: Explain and test the sre overlap table http://bugs.python.org/issue19387 closed by pitrou #19389: find_file might return a different directory than the tested f http://bugs.python.org/issue19389 closed by ned.deily #19390: clinic.py: unhandled syntax error http://bugs.python.org/issue19390 closed by larry #19392: document importlib.reload() requires __loader__ to be defined http://bugs.python.org/issue19392 closed by brett.cannon #19393: symtable.symtable can return the wrong "top" SymbolTable http://bugs.python.org/issue19393 closed by python-dev #19395: unpickled LZMACompressor is crashy http://bugs.python.org/issue19395 closed by nadeem.vawda #19396: test_contextlib fails with -S http://bugs.python.org/issue19396 closed by ncoghlan #19399: sporadic test_subprocess failure http://bugs.python.org/issue19399 closed by tim.peters #19400: Extension module builds can fail with Xcode 5 on OS X 10.8+ http://bugs.python.org/issue19400 closed by ned.deily #19401: 
Segmentation fault with float arithmatics on OS X Mavericks http://bugs.python.org/issue19401 closed by ned.deily #19405: Fix outdated comments in the _sre module http://bugs.python.org/issue19405 closed by serhiy.storchaka #19408: Regex with set of characters and groups raises error http://bugs.python.org/issue19408 closed by serhiy.storchaka #19409: pkgutil isn't importable from a file or the REPL http://bugs.python.org/issue19409 closed by python-dev #19410: Restore empty string special casing in importlib.machinery.Fil http://bugs.python.org/issue19410 closed by brett.cannon #19413: Reload semantics changed unexpectedly in Python 3.3 http://bugs.python.org/issue19413 closed by eric.snow #19416: NNTP page has incorrect links http://bugs.python.org/issue19416 closed by python-dev #19418: audioop.c giving signed/unsigned warnings on Windows http://bugs.python.org/issue19418 closed by tim.golden #19421: FileIO destructor imports indirectly the io module at exit http://bugs.python.org/issue19421 closed by haypo #19423: permutations len issue http://bugs.python.org/issue19423 closed by amaury.forgeotdarc #19424: _warnings: patch to avoid conversions from/to UTF-8 http://bugs.python.org/issue19424 closed by haypo #19425: multiprocessing.Pool.map hangs if pickling argument raises an http://bugs.python.org/issue19425 closed by sbt #19426: Opening a file in IDLE causes a crash or hang http://bugs.python.org/issue19426 closed by serhiy.storchaka #19427: enhancement: dictionary maths http://bugs.python.org/issue19427 closed by r.david.murray #19430: argparse replaces \$ with $ (if in commandline) http://bugs.python.org/issue19430 closed by r.david.murray #19432: test_multiprocessing_fork failures http://bugs.python.org/issue19432 closed by skrah #19435: Directory traversal attack for CGIHTTPRequestHandler http://bugs.python.org/issue19435 closed by python-dev #19436: Python 2.7 on Linux Ubuntu 12.04 LTS crashes on the help('modu http://bugs.python.org/issue19436 closed by ned.deily #19442: Python crashes when a warning is emitted during shutdown http://bugs.python.org/issue19442 closed by python-dev #19443: add to dict fails after 1,000,000 items on py 2.7.5 http://bugs.python.org/issue19443 closed by benjamin.peterson #19445: heapq.heapify is n log(n), not linear http://bugs.python.org/issue19445 closed by rhettinger #19446: Integer division for negative numbers http://bugs.python.org/issue19446 closed by georg.brandl #19452: ElementTree iterparse documentation http://bugs.python.org/issue19452 closed by eli.bendersky #19455: LoggerAdapter class lacks documented "setLevel" method http://bugs.python.org/issue19455 closed by vinay.sajip #19457: test.test_codeccallbacks.CodecCallbackTest.test_xmlcharrefrepl http://bugs.python.org/issue19457 closed by serhiy.storchaka #19458: spam http://bugs.python.org/issue19458 closed by r.david.murray #19471: pandas DataFrame access results in segfault http://bugs.python.org/issue19471 closed by ezio.melotti #19473: Expose cache validation check in FileFinder http://bugs.python.org/issue19473 closed by brett.cannon From ericsnowcurrently at gmail.com Fri Nov 1 18:03:55 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 1 Nov 2013 11:03:55 -0600 Subject: [Python-Dev] PEP 451 update In-Reply-To: References: Message-ID: On Thu, Oct 31, 2013 at 4:10 PM, Nick Coghlan wrote: > However, perhaps it makes more sense to pass the entire existing module, > rather than just the spec? Sounds good to me. 
:) -eric From allamsetty.anup at gmail.com Fri Nov 1 18:20:28 2013 From: allamsetty.anup at gmail.com (Anup Kumar) Date: Fri, 1 Nov 2013 22:50:28 +0530 Subject: [Python-Dev] Contributing to PSF Message-ID: Hi, My name is Anup. I am pursuing my Engineering degree at Amrita in India. I want to start contributing to PSF by fixing bugs and if possible by helping to develop the Python software. So, I request someone to help me with that. Thanks in advance, Regards, A. Anup, 2nd Year CSE, Amrita University, http://anup07.wordpress.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Fri Nov 1 18:52:47 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 1 Nov 2013 11:52:47 -0600 Subject: [Python-Dev] PEP 451 update In-Reply-To: References: Message-ID: On Fri, Nov 1, 2013 at 7:35 AM, Brett Cannon wrote: > On Thu, Oct 31, 2013 at 6:10 PM, Nick Coghlan wrote: >> At the time when find_spec runs, sys.modules hasn't been touched yet, so >> the old module state is still available when reloading. Passing the spec in >> lets the loader decide whether or not it can actually load that module given >> the information about the previous load operation. > > Do you mean loader or finder? You say "find_spec" which is for finders, but > then you say "loader". > >> >> However, perhaps it makes more sense to pass the entire existing module, >> rather than just the spec? I'd like to minimise the need for new-style >> loader authors to retrieve things directly from the sys module, and you're >> right that "can I reload into this particular module?" is a slightly >> different question from "can I reload a module previously loaded using this >> particular spec?". >> >> A spec-based API could still be used to answer the first question by >> looking up sys.modules, but so could the name based API. Passing in the >> module reference, on the other hand, should let loaders answer both >> questions without needing to touch the sys module. >> >> I also now think this pass-the-module approach when finding the spec >> approach could also clean up some of the messiness that is __main__ >> initialisation, where we repeatedly load things into the same module object. > > > See you mention "finding" but before now everything has been "loader". > >> >> Let's say we be completely explicit and call the new parameter something >> like "load_target". If we do that then I would make the *same* change to >> runpy.run_path and runpy.run_module: let you pass in an existing module >> object under that name to say "hey, run in this namespace, rather than a >> fresh one". (Those APIs currently only support pre-populating a *fresh* >> namespace, rather than allowing you to directly specify an existing one) >> >> Most loaders won't care, but those that do will have all the info they >> need to throw an exception saying "I could provide a spec for this, but it's >> not compatible with that load target". > > Exactly what APIs -- new or changed -- are you proposing? Method signatures > only please to avoid ambiguity. =) I'm about to update the PEP but I'll clarify here. The signature of finder.find_spec() will grow an extra "existing" parameter (defaulting to None) that is an existing module. During reload the module from sys.modules will be passed in. Having this be part of find_spec() is consistent with the job of the finder to identify a loader that should be used to load the module (reload or not).
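For concreteness, a rough sketch of what a finder honoring that signature might look like -- illustrative only, not the actual importlib code: the helper methods are invented, ModuleSpec is the class from the PEP's reference implementation, and (as settled later in this thread) the "existing" parameter is eventually renamed "target":

from importlib.machinery import ModuleSpec

class ExampleFinder:

    def find_spec(self, name, path=None, existing=None):
        # "existing" is None on a normal import; during reload() the
        # module from sys.modules is passed in, before sys.modules is
        # touched.
        loader = self._loader_for(name, path)  # invented helper
        if loader is None:
            return None  # nothing found; the next finder gets a chance
        if existing is not None and not self._can_load_into(existing):
            # Raising ImportError -- rather than returning None --
            # distinguishes "found a loader, but it cannot load into
            # that module" from "could not find a loader", and stops
            # the search.
            raise ImportError("loader for %r cannot load into %r"
                              % (name, existing))
        return ModuleSpec(name, loader)

    def _loader_for(self, name, path):
        return None  # stub for this sketch

    def _can_load_into(self, module):
        return True  # stub for this sketch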
As to the inconsistency in what Nick said, it still fits in with the way I see it. I'm sure he'll correct me if he had something else in mind. :-) finder.find_spec() will be making a decision on what loader (if any) to return. As part of that, when it finds a loader it can further decide on whether or not the loader supports loading into an existing module (if one was provided). As part of that it may or may not consult with the loader. Either way it stands as proxy for the loader in the latter decision. In lieu of an explicit loader.supports_reload() method, a loader will rely on its associated finder to be a proxy for reload-related decisions, likely communicated via spec.loader_state. Perhaps Nick meant something else, but my understanding is that he was referring to this -eric From rdmurray at bitdance.com Fri Nov 1 18:53:40 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Fri, 01 Nov 2013 13:53:40 -0400 Subject: [Python-Dev] Contributing to PSF In-Reply-To: References: Message-ID: <20131101175341.19506250811@webabinitio.net> On Fri, 01 Nov 2013 22:50:28 +0530, Anup Kumar wrote: > My name is Anup. I am pursuing my Enginering degree at Amrita in India. I > want to start contributing to PSF by fixing bugs and if possible by helping > to develop the python software. So, I request some one to help me for that. Welcome, Anup. Good first steps are signing up for the core-mentorship mailing list, and reading the devguide (http://docs.python.org/devguide). --David From brett at python.org Fri Nov 1 18:59:08 2013 From: brett at python.org (Brett Cannon) Date: Fri, 1 Nov 2013 13:59:08 -0400 Subject: [Python-Dev] PEP 451 update In-Reply-To: References: Message-ID: On Fri, Nov 1, 2013 at 1:52 PM, Eric Snow wrote: > On Fri, Nov 1, 2013 at 7:35 AM, Brett Cannon wrote: > > On Thu, Oct 31, 2013 at 6:10 PM, Nick Coghlan > wrote: > >> At the time when find_spec runs, sys.modules hasn't been touched yet, so > >> the old module state is still available when reloading. Passing the > spec in > >> lets the loader decide whether or not it can actually load that module > given > >> the information about the previous load operation. > > > > Do you mean loader or finder? You say "find_spec" which is for finders, > but > > then you say "loader". > > > >> > >> However, perhaps it makes more sense to pass the entire existing module, > >> rather than just the spec? I'd like to minimise the need for new-style > >> loader authors to retrieve things directly from the sys module, and > you're > >> right that "can I reload into this particular module?" is a slightly > >> different question from "can I reload a module previously loaded using > this > >> particular spec?". > >> > >> A spec-based API could still be used to answer the first question by > >> looking up sys.modules, but so could the name based API. Passing in the > >> module reference, on the other hand, should let loaders answer both > >> questions without needing to touch the sys module. > >> > >> I also now think this pass-the-module approach when finding the spec > >> approach could also clean up some of the messiness that is __main__ > >> initialisation, where we repeatedly load things into the same module > object. > > > > > > See you mention "finding" but before now everything has been "loader". > > > >> > >> Let's say we be completely explicit and call the new parameter something > >> like "load_target". 
If we do that then I would make the *same* change to > >> runpy.run_path and runpy.run_module: let you pass in an existing module > >> object under that name to say "hey, run in this namespace, rather than a > >> fresh one". (Those APIs currently only support pre-populating a *fresh* > >> namespace, rather than allowing you to directly specify an existing one) > >> > >> Most loaders won't care, but those that do will have all the info they > >> need to throw an exception saying "I could provide a spec for this, but > it's > >> not compatible with that load target". > > > > Exactly what APIs -- new or changed -- are you proposing? Method > signatures > > only please to avoid ambiguity. =) > > I'm about to update the PEP but I'll clarify here. The signature of > finder.find_spec() will grow an extra "existing" parameter (default to > None) that is an existing module. During reload the module from > sys.modules will be passed in. Having this be part of find_spec() is > consistent with the job of the finder to identify a loader that should > be used to load the module (reload or not). > > As to the inconsistency in what Nick said, it still fits in with the > way I see it. I'm sure he'll correct me if he had something else in > mind. :-) finder.find_spec() will be making a decision on what loader > (if any) to return. As part of that, when it finds a loader it can > further decide on whether or not the loader supports loading into an > existing module (if one was provided). As part of that it may or may > not consult with the loader. Either way it stands as proxy for the > loader in the latter decision. > > In lieu of an explicit loader.supports_reload() method, a loader will > rely on its associated finder to be a proxy for reload-related > decisions, likely communicated via spec.loader_state. Perhaps Nick > meant something else, but my understanding is that he was referring to > this > Thanks for the clarification. It all SGTM. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Fri Nov 1 23:44:25 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 1 Nov 2013 16:44:25 -0600 Subject: [Python-Dev] PEP 451 update In-Reply-To: References: Message-ID: On Fri, Nov 1, 2013 at 11:59 AM, Brett Cannon wrote: > Thanks for the clarification. It all SGTM. I've updated the PEP and the reference implementation. If I've missed anything please let me know. There is only one thing I thought of that I'd like to get an opinion on: In the Finders section, the PEP specifies returning None (or using a different loader) when the found loader does not support loading into an existing module (e.g during reload). An alternative to returning None would be for the finder to raise ImportError with a message like "the loader does not support reloading the module". This may actually be a better approach since "could not find a loader" and "the found loader won't work" are different situations that a single return value (None) may not sufficiently represent. Other than that, I'm not aware of any blockers for the PEP. Once I settle that last question, I'll ask for pronouncement. (Anyone may feel free to pronounce before then. 
) -eric From ncoghlan at gmail.com Sat Nov 2 10:46:12 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 2 Nov 2013 19:46:12 +1000 Subject: [Python-Dev] PEP 451 update In-Reply-To: References: Message-ID: On 2 November 2013 08:44, Eric Snow wrote: > On Fri, Nov 1, 2013 at 11:59 AM, Brett Cannon wrote: >> Thanks for the clarification. It all SGTM. > > I've updated the PEP and the reference implementation. While I was saying loader when I should have been saying finder, Eric correctly divined my intent :) > If I've missed > anything please let me know. There is only one thing I thought of > that I'd like to get an opinion on: > > In the Finders section, the PEP specifies returning None (or using a > different loader) when the found loader does not support loading into > an existing module (e.g during reload). An alternative to returning > None would be for the finder to raise ImportError with a message like > "the loader does not support reloading the module". This may actually > be a better approach since "could not find a loader" and "the found > loader won't work" are different situations that a single return value > (None) may not sufficiently represent. Throwing ImportError for "can load, but cannot load into that target module" sounds good to me. We should also be explicit that throwing an exception from find_spec *stops* the search (unlike returning None). > Other than that, I'm not aware of any blockers for the PEP. The one slight quibble I have with the PEP update is the wording regarding the "existing" parameter. I think runpy should pass the new parameter as well when calling find_spec, but in that case it *won't* be an existing copy of the named module, it will be sys.modules["__main__"]. So we shouldn't set an expectation that the existing module passed in will necessarily be a previous copy of the module being loaded, it's just an indication to the finder that the target namespace for execution will be a pre-existing module rather than a new one. For the same reason, I also have a mild preference for "target" (or the more explicit "load_target") as the name, although I won't object if you and Brett prefer "existing". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From brett at python.org Sat Nov 2 16:21:39 2013 From: brett at python.org (Brett Cannon) Date: Sat, 2 Nov 2013 11:21:39 -0400 Subject: [Python-Dev] PEP 451 update In-Reply-To: References: Message-ID: On Sat, Nov 2, 2013 at 5:46 AM, Nick Coghlan wrote: > On 2 November 2013 08:44, Eric Snow wrote: > > On Fri, Nov 1, 2013 at 11:59 AM, Brett Cannon wrote: > >> Thanks for the clarification. It all SGTM. > > > > I've updated the PEP and the reference implementation. > > While I was saying loader when I should have been saying finder, Eric > correctly divined my intent :) > > > If I've missed > > anything please let me know. There is only one thing I thought of > > that I'd like to get an opinion on: > > > > In the Finders section, the PEP specifies returning None (or using a > > different loader) when the found loader does not support loading into > > an existing module (e.g during reload). An alternative to returning > > None would be for the finder to raise ImportError with a message like > > "the loader does not support reloading the module". This may actually > > be a better approach since "could not find a loader" and "the found > > loader won't work" are different situations that a single return value > > (None) may not sufficiently represent. 
> > Throwing ImportError for "can load, but cannot load into that target > module" sounds good to me. SGTM as well. > We should also be explicit that throwing an > exception from find_spec *stops* the search (unlike returning None). > I thought that was obvious but more docs doesn't hurt. =) > > > Other than that, I'm not aware of any blockers for the PEP. > > The one slight quibble I have with the PEP update is the wording > regarding the "existing" parameter. I think runpy should pass the new > parameter as well when calling find_spec, but in that case it *won't* > be an existing copy of the named module, it will be > sys.modules["__main__"]. So we shouldn't set an expectation that the > existing module passed in will necessarily be a previous copy of the > module being loaded, it's just an indication to the finder that the > target namespace for execution will be a pre-existing module rather > than a new one. > > For the same reason, I also have a mild preference for "target" (or > the more explicit "load_target") as the name, although I won't object > if you and Brett prefer "existing". I would go with "module" or "target" and that is where I shall end the bikeshedding. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Nov 2 16:38:48 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 3 Nov 2013 01:38:48 +1000 Subject: [Python-Dev] PEP 451 update In-Reply-To: References: Message-ID: On 3 November 2013 01:21, Brett Cannon wrote: > On Sat, Nov 2, 2013 at 5:46 AM, Nick Coghlan wrote: >> For the same reason, I also have a mild preference for "target" (or >> the more explicit "load_target") as the name, although I won't object >> if you and Brett prefer "existing". > > I would go with "module" or "target" and that is where I shall end the > bikeshedding. I had a quick reread of the runpy docs: http://docs.python.org/dev/library/runpy#runpy.run_module I can definitely make "target" work (that's why I suggested it originally, re-reading the docs just confirmed that), but I don't see any non-confusing way to use "module". Eric: since "target" was on both Brett's list and mine, it looks like s/existing/target/ and clarifying that "incompatible target module" should be reported by throwing ImportError from find_spec() will get you the pronouncement you're looking for :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From nad at acm.org Sat Nov 2 20:43:52 2013 From: nad at acm.org (Ned Deily) Date: Sat, 02 Nov 2013 12:43:52 -0700 Subject: [Python-Dev] cpython (3.2): Update NEWS for 265d369ad3b9. References: <3dBmq94f4bz7LkF@mail.python.org> Message-ID: In article <3dBmq94f4bz7LkF at mail.python.org>, jason.coombs wrote: > http://hg.python.org/cpython/rev/7d399099334d > changeset: 86861:7d399099334d > branch: 3.2 > user: Jason R. Coombs > date: Sat Nov 02 13:00:01 2013 -0400 > summary: > Update NEWS for 265d369ad3b9. > > files: > Misc/NEWS | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > > diff --git a/Misc/NEWS b/Misc/NEWS > --- a/Misc/NEWS > +++ b/Misc/NEWS > @@ -10,6 +10,9 @@ > Library > ------- > > +- Issue #19286: Directories in ``package_data`` are no longer added to > + the filelist, preventing failure outlined in the ticket. > + > - Issue #19435: Fix directory traversal attack on CGIHttpRequestHandler. > > - Issue #14984: On POSIX systems, when netrc is called without a filename The 3.2 branch is closed to all but security fixes with release manager approval. 
You shouldn't be checking bug fixes in there. -- Ned Deily, nad at acm.org From ericsnowcurrently at gmail.com Sat Nov 2 22:14:58 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 2 Nov 2013 15:14:58 -0600 Subject: [Python-Dev] PEP 451 update In-Reply-To: References: Message-ID: On Sat, Nov 2, 2013 at 9:38 AM, Nick Coghlan wrote: > Eric: since "target" was on both Brett's list and mine, it looks like > s/existing/target/ and clarifying that "incompatible target module" > should be reported by throwing ImportError from find_spec() will get > you the pronouncement you're looking for :) Done and done! :) -eric From victor.stinner at gmail.com Sun Nov 3 14:34:48 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 3 Nov 2013 14:34:48 +0100 Subject: [Python-Dev] PEP 454: tracemalloc (n-th version) Message-ID: "n-th version": Sorry, I don't remember the version number of the PEP :-) http://www.python.org/dev/peps/pep-0454/ Latest changes: * rename disable/enable/is_enabled() to stop/start/is_tracing() * Snapshot.apply_filters() now returns a new Snapshot instance * a traceback now always contains at least 1 frame, use (('', 0),) if the traceback cannot be read or if the traceback limit is 0 * Rename Filter.traceback to Filter.all_frames * write more documentation on filter functions Charles-François doesn't look convinced that temporarily disabling tracing is useful, me neither. Another concern is that it is not possible to disable tracing completely: only tracing of new allocations can be disabled, while deallocations are still traced. I chose a simpler API with limitations. The PEP is not *completely* different from my first proposal. Even if the latest PEP has fewer functions and classes, it's almost as powerful. I replaced complex classes and high-level API with short examples in the documentation: http://www.haypocalc.com/tmp/tracemalloc/library/tracemalloc.html#examples For the implementation of tracemalloc, see: http://bugs.python.org/issue18874 Rietveld link to the latest patch: http://bugs.python.org/review/18874/#ps9797 Victor From rdmurray at bitdance.com Sun Nov 3 17:59:20 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Sun, 03 Nov 2013 11:59:20 -0500 Subject: [Python-Dev] python2 and python3 and vim Message-ID: <20131103165921.49CE125007F@webabinitio.net> I came across this in the VIM documentation: Vim can be built in four ways (:version output): 1. No Python support (-python, -python3) 2. Python 2 support only (+python or +python/dyn, -python3) 3. Python 3 support only (-python, +python3 or +python3/dyn) 4. Python 2 and 3 support (+python/dyn, +python3/dyn) Some more details on the special case 4: When Python 2 and Python 3 are both supported they must be loaded dynamically. When doing this on Linux/Unix systems and importing global symbols, this leads to a crash when the second Python version is used. So either global symbols are loaded but only one Python version is activated, or no global symbols are loaded. The latter makes Python's "import" fail on libraries that expect the symbols to be provided by Vim. I've never played with embedding Python. Does this make sense to anyone who has? Is there some limitation in our embedding and/or import machinery here, or is this more likely to be a shortcoming on the VIM side? Mostly I'm asking out of curiosity in hopes of learning something; I doubt I'll have enough motivation to make time to work on solving this. --David From greg at krypto.org Sun Nov 3 19:41:31 2013 From: greg at krypto.org (Gregory P.
Smith) Date: Sun, 3 Nov 2013 10:41:31 -0800 Subject: [Python-Dev] python2 and python3 and vim In-Reply-To: <20131103165921.49CE125007F@webabinitio.net> References: <20131103165921.49CE125007F@webabinitio.net> Message-ID: On Sun, Nov 3, 2013 at 8:59 AM, R. David Murray wrote: > I came across this in the VIM documentation: > > Vim can be built in four ways (:version output): > 1. No Python support (-python, -python3) > 2. Python 2 support only (+python or +python/dyn, -python3) > 3. Python 3 support only (-python, +python3 or +python3/dyn) > 4. Python 2 and 3 support (+python/dyn, +python3/dyn) > > Some more details on the special case 4: > > When Python 2 and Python 3 are both supported they must be loaded > dynamically. > > When doing this on Linux/Unix systems and importing global symbols, > this leads > to a crash when the second Python version is used. So either global > symbols > are loaded but only one Python version is activated, or no global > symbols are > loaded. The latter makes Python's "import" fail on libraries that > expect the > symbols to be provided by Vim. > > I've never played with embedding Python. Does this make sense to > anyone who has? Is there some limitation in our embedding and/or > import machinery here, or is this more likely to be a shortcoming on > the VIM side? > > Mostly I'm asking out of curiosity in hopes of learning something; > I doubt I'll have enough motivation to make time to work on solving this. > I'm not used to it being possible to have multiple different embedded Python interpreters in one process at all. The Python C API symbols exported by one python interpreter will conflict with another python interpreter. We don't provide isolation of interpreters within a process. IIRC, if you do have multiple even of the same version within a single process they are still all sharing the same python memory allocator pools and GIL and more. That someone even went through the hoops to attempt to get vim to allow having some semblance of both python 2 and python 3 embedded dynamically at runtime seems like overkill to me. Pick one and go with it. Vim should declare at some point in 2014 that the embedded Python in vim is now Python 3 only and move on. The rare users who actually use the embedded bazillion scripting languages within vim will need to update their code, ideally to be 2 and 3 compatible. -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Nov 3 23:04:27 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 4 Nov 2013 08:04:27 +1000 Subject: [Python-Dev] python2 and python3 and vim In-Reply-To: <20131103165921.49CE125007F@webabinitio.net> References: <20131103165921.49CE125007F@webabinitio.net> Message-ID: On 4 Nov 2013 03:00, "R. David Murray" wrote: > > I came across this in the VIM documentation: > > Vim can be built in four ways (:version output): > 1. No Python support (-python, -python3) > 2. Python 2 support only (+python or +python/dyn, -python3) > 3. Python 3 support only (-python, +python3 or +python3/dyn) > 4. Python 2 and 3 support (+python/dyn, +python3/dyn) > > Some more details on the special case 4: > > When Python 2 and Python 3 are both supported they must be loaded dynamically. > > When doing this on Linux/Unix systems and importing global symbols, this leads > to a crash when the second Python version is used. So either global symbols > are loaded but only one Python version is activated, or no global symbols are > loaded. 
The latter makes Python's "import" fail on libraries that expect the > symbols to be provided by Vim. > > I've never played with embedding Python. Does this make sense to > anyone who has? Is there some limitation in our embedding and/or > import machinery here, or is this more likely to be a shortcoming on > the VIM side? > > Mostly I'm asking out of curiosity in hopes of learning something; > I doubt I'll have enough motivation to make time to work on solving this. Essentially what Greg said - we export many of the same symbols from both shared libraries, so if you try to load both CPython 2 and 3 into one process, the dynamic linker isn't going to be happy about it at all. Since making it work would be a lot of work for minimal benefit, the most that could realistically be done is to make it fail a little more gracefully (I believe that would need to be done on the Vim side of things, though). Cheers, Nick. > > --David > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Sun Nov 3 23:14:08 2013 From: barry at python.org (Barry Warsaw) Date: Sun, 3 Nov 2013 17:14:08 -0500 Subject: [Python-Dev] python2 and python3 and vim In-Reply-To: References: <20131103165921.49CE125007F@webabinitio.net> Message-ID: <20131103171408.00599f0d@anarchist> On Nov 04, 2013, at 08:04 AM, Nick Coghlan wrote: >Since making it work would be a lot of work for minimal benefit, the most >that could realistically be done is to make it fail a little more >gracefully (I believe that would need to be done on the Vim side of things, >though). Right. I can't remember the details, but I've seen similar types of dual-plugin support in other applications. Generally it means you can *either* import Python 2 code or Python 3 code in a single process, but once you've fired up the first runtime, the application will prevent you from firing up the other. So yes, I think Vim would have to put that guard in, and I agree that there's not much benefit in trying to make this work in Python. -Barry From piet at vanoostrum.org Mon Nov 4 02:15:51 2013 From: piet at vanoostrum.org (Piet van Oostrum) Date: Sun, 3 Nov 2013 21:15:51 -0400 Subject: [Python-Dev] Problem installing matplotlib 1.3.1 with Python 2.7.6 and 3.3.3 (release candidate 1) Message-ID: <21110.62791.44734.656127@cochabamba.vanoostrum.org> Hello, I tried to install matplotlib 1.3.1 on the release candidates of Python 2.7.6 and 3.3.3. I am on Mac OS X 10.6.8. Although the installation gave no problems, there is a problem with Tcl/Tk. The new Pythons have their own embedded Tcl/Tk, but when installing matplotlib it links to the Frameworks version of Tcl and TK, not to the embedded version. This causes confusion when importing matplotlib.pyplot: objc[70648]: Class TKApplication is implemented in both /Library/Frameworks/Python.framework/Versions/2.7/lib/libtk8.5.dylib and /Library/Frameworks/Tk.framework/Versions/8.5/Tk. One of the two will be used. Which one is undefined. objc[70648]: Class TKMenu is implemented in both /Library/Frameworks/Python.framework/Versions/2.7/lib/libtk8.5.dylib and /Library/Frameworks/Tk.framework/Versions/8.5/Tk. One of the two will be used. Which one is undefined. 
objc[70648]: Class TKContentView is implemented in both /Library/Frameworks/Python.framework/Versions/2.7/lib/libtk8.5.dylib and /Library/Frameworks/Tk.framework/Versions/8.5/Tk. One of the two will be used. Which one is undefined. objc[70648]: Class TKWindow is implemented in both /Library/Frameworks/Python.framework/Versions/2.7/lib/libtk8.5.dylib and /Library/Frameworks/Tk.framework/Versions/8.5/Tk. One of the two will be used. Which one is undefined. And then later it gives a lot of error messages. So I think it should be linked to the embedded version. For this, the matplotlib setupext.py should be adapted to find out if there is an embedded Tcl/Tk in the Python installation and set the link parameters accordingly. However, the installed Python versions (from the DMG's) do not contain the Tcl/Tk header files, only the shared library and the tcl files. So I think the distributed Python should also include the Tcl/Tk header files. -- Piet van Oostrum WWW: http://pietvanoostrum.com/ PGP key: [8DAE142BE17999C4] From zachary.ware at gmail.com Mon Nov 4 07:09:37 2013 From: zachary.ware at gmail.com (Zachary Ware) Date: Mon, 4 Nov 2013 00:09:37 -0600 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #XXXXX: Fix test_idle so that idlelib test cases are actually run In-Reply-To: <5277287F.8060607@udel.edu> References: <3dChMm1bw5z7LjR@mail.python.org> <5277287F.8060607@udel.edu> Message-ID: On Sun, Nov 3, 2013 at 10:54 PM, Terry Reedy wrote: > On 11/3/2013 11:48 PM, terry.reedy wrote: >> >> http://hg.python.org/cpython/rev/cced7981ec4d >> changeset: 86908:cced7981ec4d >> branch: 2.7 >> user: Terry Jan Reedy >> date: Sun Nov 03 23:37:54 2013 -0500 >> summary: >> Issue #XXXXX: Fix test_idle so that idlelib test cases are actually run >> under test.regrtest on 2.7.
> > This message is the one included with the patch by Ned Deily. Because a > message *was* included (not normal), hg import committed the patch > immediately, without giving me a chance to edit the patch or message. As far > as I know, there is no way I could have edited the message after the commit. > If there was, let me know. hg commit has an "--amend" option that allows you to change the last (un-pushed) commit. Even if there aren't actually any changes to be committed, it will pop up a text editor with the commit message of the last commit already filled in, allowing you to change the message. Any time you `hg commit --amend`, it saves the original commit locally as a bundle in the .hg dir, so you can restore it if need be. I haven't tested it, but from a quick look at TortoiseHg Workbench's Commit screen, it looks like --amend can be done from the down arrow on the commit button, "Amend current revision". -- Zach From brian at python.org Mon Nov 4 07:30:56 2013 From: brian at python.org (Brian Curtin) Date: Sun, 3 Nov 2013 22:30:56 -0800 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #XXXXX: Fix test_idle so that idlelib test cases are actually run In-Reply-To: <5277287F.8060607@udel.edu> References: <3dChMm1bw5z7LjR@mail.python.org> <5277287F.8060607@udel.edu> Message-ID: On Sun, Nov 3, 2013 at 8:54 PM, Terry Reedy wrote: > On 11/3/2013 11:48 PM, terry.reedy wrote: >> >> http://hg.python.org/cpython/rev/cced7981ec4d >> changeset: 86908:cced7981ec4d >> branch: 2.7 >> user: Terry Jan Reedy >> date: Sun Nov 03 23:37:54 2013 -0500 >> summary: >> Issue #XXXXX: Fix test_idle so that idlelib test cases are actually run > > This message is the one included with the patch by Ned Deily. Because a > message *was* included (not normal), hg import committed the patch > immediately, without giving me a chance to edit the patch or message. As far > as I know, there is no way I could have edited the message after the commit. > If there was, let me know. Besides what Zach mentions, most of the time you probably want to "hg import --no-commit ", run it, test it, then commit it with whatever message you want. From tjreedy at udel.edu Mon Nov 4 09:56:25 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 04 Nov 2013 03:56:25 -0500 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #XXXXX: Fix test_idle so that idlelib test cases are actually run In-Reply-To: References: <3dChMm1bw5z7LjR@mail.python.org> <5277287F.8060607@udel.edu> Message-ID: On 11/4/2013 1:09 AM, Zachary Ware wrote: > On Sun, Nov 3, 2013 at 10:54 PM, Terry Reedy wrote: >> On 11/3/2013 11:48 PM, terry.reedy wrote: >>> >>> http://hg.python.org/cpython/rev/cced7981ec4d >>> changeset: 86908:cced7981ec4d >>> branch: 2.7 >>> user: Terry Jan Reedy >>> date: Sun Nov 03 23:37:54 2013 -0500 >>> summary: >>> Issue #XXXXX: Fix test_idle so that idlelib test cases are actually run >> >> >> This message is the one included with the patch by Ned Deily. Because a >> message *was* included (not normal), hg import committed the patch >> immediately, without giving me a chance to edit the patch or message. As far >> as I know, there is no way I could have edited the message after the commit. >> If there was, let me know. > > hg commit has an "--amend" option that allows you to change the last > (un-pushed) commit. Even if there aren't actually any changes to be > committed, it will pop up a text editor with the commit message of the > last commit already filled in, allowing you to change the message. > Any time you `hg commit --amend`, it saves the original commit locally > as a bundle in the .hg dir, so you can restore it if need be. > > I haven't tested it, but from a quick look at TortoiseHg Workbench's > Commit screen, it looks like --amend can be done from the down arrow > on the commit button, "Amend current revision". Thank you. I tried that in a local test repository, changed the message, even added an untracked file, hit Amend, and it worked. Editing a file, hitting refresh, and then Amend also worked. The one thing I tried but could not do was to directly change status 'A' back to '?'. -- Terry Jan Reedy From nad at acm.org Mon Nov 4 09:39:53 2013 From: nad at acm.org (Ned Deily) Date: Mon, 04 Nov 2013 01:39:53 -0700 Subject: [Python-Dev] Problem installing matplotlib 1.3.1 with Python 2.7.6 and 3.3.3 (release candidate 1) References: <21110.62791.44734.656127@cochabamba.vanoostrum.org> Message-ID: In article , Georg Brandl wrote: > On 04.11.2013 01:59, Ned Deily wrote: > > In article <21110.62791.44734.656127 at cochabamba.vanoostrum.org>, > > Piet van Oostrum wrote: > >> I tried to install matplotlib 1.3.1 on the release candidates of Python > >> 2.7.6 > >> and 3.3.3. > > > > [...] > > > > Please open an issue on the Python bug tracker for the Python component of > > this. > > > > http://bugs.python.org > And please mark as release blocker, I think this should go into 3.3.3rc2.
http://bugs.python.org/issue19490 -- Ned Deily, nad at acm.org From phd at phdru.name Mon Nov 4 10:04:05 2013 From: phd at phdru.name (Oleg Broytman) Date: Mon, 4 Nov 2013 10:04:05 +0100 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #XXXXX: Fix test_idle so that idlelib test cases are actually run In-Reply-To: References: <3dChMm1bw5z7LjR@mail.python.org> <5277287F.8060607@udel.edu> Message-ID: <20131104090405.GA10765@phdru.name> Hi! On Mon, Nov 04, 2013 at 03:56:25AM -0500, Terry Reedy wrote: > The one > thing I tried but could not do was to directly change status 'A' > back to '?'. hg forget file Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From victor.stinner at gmail.com Mon Nov 4 11:01:23 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 4 Nov 2013 11:01:23 +0100 Subject: [Python-Dev] [Infrastructure] bugs.python.org not reachable in IPv6? In-Reply-To: <1381582294.2497.0.camel@fsol> References: <1381582294.2497.0.camel@fsol> Message-ID: Hi, bugs.python.org is still not responding on IPv6. Can someone please remove the AAAA record from python.org DNS, or fix the IPv6 configuration? It's an issue on my PC because my PC has an IPv6 address and so it cannot reach bugs.python.org. wget is really slow because it tries IPv6 first, but then falls back to IPv4. Firefox tries IPv4 first and so doesn't have the issue. Victor 2013/10/12 Antoine Pitrou : > > Opened issue at http://psf.upfronthosting.co.za/roundup/meta/issue528 > > Regards > > Antoine. > > > > On Saturday 12 October 2013 at 14:40 +0200, Victor Stinner wrote: >> Hi, >> >> The DNS server of python.org announces the IP address 2a01:4f8:131:2480::3: >> >> $ host -t AAAA bugs.python.org >> bugs.python.org has IPv6 address 2a01:4f8:131:2480::3 >> >> >> The problem is that I cannot connect to this IP address: >> >> $ ping6 -c 4 2a01:4f8:131:2480::3 >> PING 2a01:4f8:131:2480::3(2a01:4f8:131:2480::3) 56 data bytes >> >> --- 2a01:4f8:131:2480::3 ping statistics --- >> 4 packets transmitted, 0 received, 100% packet loss, time 2999ms >> >> >> Do you have the same issue, or is it just me? >> >> Victor >> ________________________________________________ >> Infrastructure mailing list >> Infrastructure at python.org >> https://mail.python.org/mailman/listinfo/infrastructure >> Unsubscribe: https://mail.python.org/mailman/options/infrastructure/solipsis%40pitrou.net >> > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com From mcepl at redhat.com Mon Nov 4 15:11:12 2013 From: mcepl at redhat.com (Matěj Cepl) Date: Mon, 04 Nov 2013 15:11:12 +0100 Subject: [Python-Dev] urllib2.HTTPBasicAuthHandler doesn't work with GitHub API Message-ID: <5277AB00.5000301@redhat.com> Hi, GitHub API v3 is intentionally broken (see http://developer.github.com/v3/auth/): > The main difference is that the RFC requires unauthenticated requests > to be answered with 401 Unauthorized responses. In many places, this > would disclose the existence of user data. Instead, the GitHub API > responds with 404 Not Found. This may cause problems for HTTP > libraries that assume a 401 Unauthorized response. The solution is to > manually craft the Authorization header. Unfortunately, urllib2.HTTPBasicAuthHandler relies on the standard-conformant behavior.
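For the record, the workaround the GitHub documentation is hinting at looks roughly like this -- a minimal sketch only, using Python 2 urllib2, with an invented function name:

import base64
import urllib2

def github_get(url, user, password):
    # GitHub never sends the 401 challenge that HTTPBasicAuthHandler
    # waits for, so attach the Authorization header preemptively.
    request = urllib2.Request(url)
    token = base64.b64encode('%s:%s' % (user, password))
    request.add_header('Authorization', 'Basic %s' % token)
    return urllib2.urlopen(request)

Since the credentials are sent up front, no 401 (or 404) challenge round trip is involved at all.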
So a naive programmer (like me) who wants to program against the GitHub API using urllib2 (and foolishly ignores this comment about the API non-conformance, because he thinks GitHub wouldn't be that stupid and break all Python applications) writes something like the attached script, spends a couple of hours hitting this issue, until he tries python-requests (which works) and his (mistaken) conclusion is that urllib2 is a piece of crap which should never be used again. I am not sure how widespread this breaking of the RFC is, but it seems to me that quite a lot (e.g., http://stackoverflow.com/a/9698319/164233 which just en passant expects urllib2 authentication stuff to be useless), and the question is whether it shouldn't be documented somehow and/or urllib2.HTTPBasicAuthHandler shouldn't be modified to try adding the Authorization header first. Any suggestions? Best, Matěj -- http://www.ceplovi.cz/matej/, Jabber: mcepl at ceplovi.cz GPG Finger: 89EF 4BC6 288A BF43 1BAB 25C3 E09F EF25 D964 84AC For a successful technology, reality must take precedence over public relations, for nature cannot be fooled. -- R. P. Feynman's concluding sentence in his appendix to the Challenger Report -------------- next part -------------- A non-text attachment was scrubbed... Name: test3_create_1.py Type: text/x-python Size: 2453 bytes Desc: not available URL: From senthil at uthcode.com Mon Nov 4 15:51:27 2013 From: senthil at uthcode.com (Senthil Kumaran) Date: Mon, 4 Nov 2013 06:51:27 -0800 Subject: [Python-Dev] urllib2.HTTPBasicAuthHandler doesn't work with GitHub API In-Reply-To: <5277AB00.5000301@redhat.com> References: <5277AB00.5000301@redhat.com> Message-ID: On Mon, Nov 4, 2013 at 6:11 AM, Matěj Cepl wrote: > I am not sure how widespread this breaking of the RFC is, but it seems to me > that quite a lot (e.g., http://stackoverflow.com/a/9698319/164233 which > just en passant expects urllib2 authentication stuff to be useless), and > the question is whether it shouldn't be documented somehow and/or > urllib2.HTTPBasicAuthHandler shouldn't be modified to try adding the > Authorization header first. > > Any suggestions? > Please file a bug report at bugs.python.org stating this. urllib2 tries to follow the RFC as much as possible, but of course considers de-facto standards and adapts to them. This can be documented for 2.x and can be fixed in 3.4. HTH, Senthil From mal at egenix.com Mon Nov 4 16:09:06 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 04 Nov 2013 16:09:06 +0100 Subject: [Python-Dev] [Infrastructure] bugs.python.org not reachable in IPv6? In-Reply-To: References: <1381582294.2497.0.camel@fsol> Message-ID: <5277B892.7070901@egenix.com> On 04.11.2013 11:01, Victor Stinner wrote: > Hi, > > bugs.python.org is still not responding on IPv6. Can someone please > remove the AAAA record from python.org DNS, or fix the IPv6 > configuration? > > It's an issue on my PC because my PC has IPv6 address and so it cannot > reach bugs.python.org. wget is really slow because it tries first > IPv6, but then falls back to IPv4. Firefox tries first IPv4 and so > doesn't have the issue. The infrastructure team cannot do anything about this, since the tracker is hosted by Upfront Systems. They are hosting the instance at Hetzner in Germany, which does support IPv6.
Do we have someone with login permission for the server running RoundUp ? If not, we should probably find someone at Upfront Systems to check the configuration. > Victor > > 2013/10/12 Antoine Pitrou : >> >> Opened issue at http://psf.upfronthosting.co.za/roundup/meta/issue528 >> >> Regards >> >> Antoine. >> >> >> >> Le samedi 12 octobre 2013 à 14:40 +0200, Victor Stinner a écrit : >>> Hi, >>> >>> The DNS server of python.org announces the IP address 2a01:4f8:131:2480::3: >>> >>> $ host -t AAAA bugs.python.org >>> bugs.python.org has IPv6 address 2a01:4f8:131:2480::3 >>> >>> >>> The problem is that I cannot connect to this IP address: >>> >>> $ ping6 -c 4 2a01:4f8:131:2480::3 >>> PING 2a01:4f8:131:2480::3(2a01:4f8:131:2480::3) 56 data bytes >>> >>> --- 2a01:4f8:131:2480::3 ping statistics --- >>> 4 packets transmitted, 0 received, 100% packet loss, time 2999ms >>> >>> >>> Do you have the same issue, or is it just me? >>> >>> Victor > -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 04 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-11-19: Python Meeting Duesseldorf ... 15 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From benjamin at python.org Mon Nov 4 16:10:26 2013 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 4 Nov 2013 10:10:26 -0500 Subject: [Python-Dev] [Infrastructure] bugs.python.org not reachable in IPv6? In-Reply-To: <5277B892.7070901@egenix.com> References: <1381582294.2497.0.camel@fsol> <5277B892.7070901@egenix.com> Message-ID: 2013/11/4 M.-A. Lemburg : > On 04.11.2013 11:01, Victor Stinner wrote: >> Hi, >> >> bugs.python.org is still not responding on IPv6. Can someone please >> remove the AAAA record from python.org DNS, or fix the IPv6 >> configuration? >> >> It's an issue on my PC because my PC has IPv6 address and so it cannot >> reach bugs.python.org. wget is really slow because it tries first >> IPv6, but then falls back to IPv4. Firefox tries first IPv4 and so >> doesn't have the issue. > > The infrastructure team cannot do anything about this, since > the tracker is hosted by Upfront Systems. But the PSF controls the DNS, no? ;; AUTHORITY SECTION: python.org. 2396 IN NS ns2.p11.dynect.net. python.org. 2396 IN NS ns3.p11.dynect.net. python.org.
2396 IN NS ns1.p11.dynect.net. python.org. 2396 IN NS ns4.p11.dynect.net. -- Regards, Benjamin From mal at egenix.com Mon Nov 4 16:11:56 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 04 Nov 2013 16:11:56 +0100 Subject: [Python-Dev] [Infrastructure] bugs.python.org not reachable in IPv6? In-Reply-To: References: <1381582294.2497.0.camel@fsol> <5277B892.7070901@egenix.com> Message-ID: <5277B93C.1080307@egenix.com> On 04.11.2013 16:10, Benjamin Peterson wrote: > 2013/11/4 M.-A. Lemburg : >> On 04.11.2013 11:01, Victor Stinner wrote: >>> Hi, >>> >>> bugs.python.org is still not responding on IPv6. Can someone please >>> remove the AAAA record from python.org DNS, or fix the IPv6 >>> configuration? >>> >>> It's an issue on my PC because my PC has IPv6 address and so it cannot >>> reach bugs.python.org. wget is really slow because it tries first >>> IPv6, but then falls back to IPv4. Firefox tries first IPv4 and so >>> doesn't have the issue. >> >> The infrastructure team cannot do anything about this, since >> the tracker is hosted by Upfront Systems. > > But the PSF controls the DNS, no? > > ;; AUTHORITY SECTION: > python.org. 2396 IN NS ns2.p11.dynect.net. > python.org. 2396 IN NS ns3.p11.dynect.net. > python.org. 2396 IN NS ns1.p11.dynect.net. > python.org. 2396 IN NS ns4.p11.dynect.net. Yes, but only Upfront Systems knows about the correct IPv6 address :-) Simply removing it would be a short-term work-around, though. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 04 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-11-19: Python Meeting Duesseldorf ... 15 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From antoine at python.org Mon Nov 4 16:34:46 2013 From: antoine at python.org (Antoine Pitrou) Date: Mon, 04 Nov 2013 16:34:46 +0100 Subject: [Python-Dev] [Infrastructure] bugs.python.org not reachable in IPv6? In-Reply-To: <5277B892.7070901@egenix.com> References: <1381582294.2497.0.camel@fsol> <5277B892.7070901@egenix.com> Message-ID: <1383579286.2883.1.camel@fsol> On lun., 2013-11-04 at 16:09 +0100, M.-A. Lemburg wrote: > On 04.11.2013 11:01, Victor Stinner wrote: > > Hi, > > > > bugs.python.org is still not responding on IPv6. Can someone please > > remove the AAAA record from python.org DNS, or fix the IPv6 > > configuration? > > > > It's an issue on my PC because my PC has IPv6 address and so it cannot > > reach bugs.python.org. wget is really slow because it tries first > > IPv6, but then falls back to IPv4. Firefox tries first IPv4 and so > > doesn't have the issue. > > The infrastructure team cannot do anything about this, since > the tracker is hosted by Upfront Systems. > > They are hosting the instance at Hetzner in Germany, which does > support IPv6. > > Do we have someone with login permission for the server running > RoundUp ? I think Ezio and Martin have access to the machine, but not necessarily root access; and, furthermore, the problem might be somewhere else on the network.
I think the best short-term measure is to disable the IPv6 resolution for bugs.python.org. Otherwise people will run into the same issue over and over again. Regards Antoine. > > If not, we should probably find someone at Upfront Systems to check > the configuration. > > > Victor > > > > 2013/10/12 Antoine Pitrou : > >> > >> Opened issue at http://psf.upfronthosting.co.za/roundup/meta/issue528 > >> > >> Regards > >> > >> Antoine. > >> > >> > >> > >> Le samedi 12 octobre 2013 à 14:40 +0200, Victor Stinner a écrit : > >>> Hi, > >>> > >>> The DNS server of python.org announces the IP address 2a01:4f8:131:2480::3: > >>> > >>> $ host -t AAAA bugs.python.org > >>> bugs.python.org has IPv6 address 2a01:4f8:131:2480::3 > >>> > >>> > >>> The problem is that I cannot connect to this IP address: > >>> > >>> $ ping6 -c 4 2a01:4f8:131:2480::3 > >>> PING 2a01:4f8:131:2480::3(2a01:4f8:131:2480::3) 56 data bytes > >>> > >>> --- 2a01:4f8:131:2480::3 ping statistics --- > >>> 4 packets transmitted, 0 received, 100% packet loss, time 2999ms > >>> > >>> > >>> Do you have the same issue, or is it just me? > >>> > >>> Victor From support at github.com Mon Nov 4 16:44:03 2013 From: support at github.com (James Dennes) Date: Mon, 04 Nov 2013 07:44:03 -0800 Subject: [Python-Dev] urllib2.HTTPBasicAuthHandler doesn't work with GitHub API In-Reply-To: <5277AB00.5000301@redhat.com> References: <5277AB00.5000301@redhat.com> Message-ID: Hi there, Thanks for letting us know, however you'll need to report this bug at: http://bugs.python.org/ I'd recommend using one of the Python libraries listed here if that's possible in your case: http://developer.github.com/v3/libraries/#python I know that the following library is well maintained: https://github.com/sigmavirus24/github3.py Cheers, James From guido at python.org Mon Nov 4 16:52:00 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 4 Nov 2013 07:52:00 -0800 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #XXXXX: Fix test_idle so that idlelib test cases are actually run In-Reply-To: <20131104090405.GA10765@phdru.name> References: <3dChMm1bw5z7LjR@mail.python.org> <5277287F.8060607@udel.edu> <20131104090405.GA10765@phdru.name> Message-ID: Two more approaches that can help when you haven't pushed yet:

- hg rollback undoes the most recent local commit while leaving the local workspace unchanged so the files are now patched but not committed

- hg strip deletes a revision and all its descendants (requires some extension to be configured); it also rolls back your workspace so you'd have to repatch using hg import --no-commit.

(BTW everybody here knows you can give hg import a URL straight from Rietveld's "raw diff download" right?)
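For reference, a consolidated sketch of the commands discussed in this thread, in the style of the transcripts above (patch file name and issue number are placeholders; hg strip requires the mq/strip extension to be enabled):

$ hg import --no-commit fix.patch       # apply the patch without committing
$ ./python -m test.regrtest test_idle   # illustrative: test the change first
$ hg commit -m "Issue #NNNNN: ..."      # commit with the message you want
$ hg commit --amend                     # or: fix up the last un-pushed commit
$ hg rollback                           # or: undo the last local commit entirely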
On Mon, Nov 4, 2013 at 1:04 AM, Oleg Broytman wrote: > Hi! > > On Mon, Nov 04, 2013 at 03:56:25AM -0500, Terry Reedy > wrote: > > The one > > thing I tried but could not do was to directly change status 'A' > > back to '?'. > > hg forget file > > Oleg. > -- > Oleg Broytman http://phdru.name/ phd at phdru.name > Programmers don't die, they just GOSUB without RETURN. -- --Guido van Rossum (python.org/~guido) From rdmurray at bitdance.com Mon Nov 4 17:05:25 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 04 Nov 2013 11:05:25 -0500 Subject: [Python-Dev] [Infrastructure] bugs.python.org not reachable in IPv6? In-Reply-To: <1383579286.2883.1.camel@fsol> References: <1381582294.2497.0.camel@fsol> <5277B892.7070901@egenix.com> <1383579286.2883.1.camel@fsol> Message-ID: <20131104160526.223D1250BFB@webabinitio.net> On Mon, 04 Nov 2013 16:34:46 +0100, Antoine Pitrou wrote: > On lun., 2013-11-04 at 16:09 +0100, M.-A. Lemburg wrote: > > On 04.11.2013 11:01, Victor Stinner wrote: > > > Hi, > > > > > > bugs.python.org is still not responding on IPv6. Can someone please > > > remove the AAAA record from python.org DNS, or fix the IPv6 > > > configuration? > > > > > > It's an issue on my PC because my PC has IPv6 address and so it cannot > > > reach bugs.python.org. wget is really slow because it tries first > > > IPv6, but then falls back to IPv4. Firefox tries first IPv4 and so > > > doesn't have the issue. > > > > The infrastructure team cannot do anything about this, since > > the tracker is hosted by Upfront Systems. > > > > They are hosting the instance at Hetzner in Germany, which does > > support IPv6. > > > > Do we have someone with login permission for the server running > > RoundUp ? > > I think Ezio and Martin have access to the machine, but not necessarily > root access; and, furthermore, the problem might be somewhere else on > the network. > > I think the best short-term measure is to disable the IPv6 resolution > for bugs.python.org. Otherwise people will run into the same issue over > and over again. We have root, but I don't think that helps. I currently see this:

eth0 Link encap:Ethernet HWaddr 6c:62:6d:85:bb:7b
     inet addr:46.4.197.70 Bcast:46.4.197.71 Mask:255.255.255.248
     inet6 addr: 2a01:4f8:131:2480::3/64 Scope:Global

so the problem appears to be a network issue (as far as I can see, there's no firewall running on the box itself). Disabling the AAAA record until we get a response from upfront sounds like a good idea. (I've added the upfront folks as nosy on the ticket Antoine opened on the meta-tracker.) --David From mal at egenix.com Mon Nov 4 17:18:53 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 04 Nov 2013 17:18:53 +0100 Subject: [Python-Dev] [Infrastructure] bugs.python.org not reachable in IPv6? In-Reply-To: <20131104160526.223D1250BFB@webabinitio.net> References: <1381582294.2497.0.camel@fsol> <5277B892.7070901@egenix.com> <1383579286.2883.1.camel@fsol> <20131104160526.223D1250BFB@webabinitio.net> Message-ID: <5277C8ED.1030206@egenix.com> On 04.11.2013 17:05, R. David Murray wrote: > On Mon, 04 Nov 2013 16:34:46 +0100, Antoine Pitrou wrote: >> On lun., 2013-11-04 at 16:09 +0100, M.-A.
Lemburg wrote: >>> On 04.11.2013 11:01, Victor Stinner wrote: >>>> Hi, >>>> >>>> bugs.python.org is still not responding on IPv6. Can someone please >>>> remove the AAAA record from python.org DNS, or fix the IPv6 >>>> configuration? >>>> >>>> It's an issue on my PC because my PC has IPv6 address and so it cannot >>>> reach bugs.python.org. wget is really slow because it tries first >>>> IPv6, but then falls back to IPv4. Firefox tries first IPv4 and so >>>> doesn't have the issue. >>> >>> The infrastructure team cannot do anything about this, since >>> the tracker is hosted by Upfront Systems. >>> >>> They are hosting the instance at Hetzner in Germany, which does >>> support IPv6. >>> >>> Do we have someone with login permission for the server running >>> RoundUp ? >> >> I think Ezio and Martin have access to the machine, but not necessarily >> root access; and, furthermore, the problem might be somewhere else on >> the network. >> >> I think the best short-term measure is to disable the IPv6 resolution >> for bugs.python.org. Otherwise people will run into the same issue over >> and over again. > > We have root, but I don't think that helps. > > I currently see this: > > eth0 Link encap:Ethernet HWaddr 6c:62:6d:85:bb:7b > inet addr:46.4.197.70 Bcast:46.4.197.71 Mask:255.255.255.248 > inet6 addr: 2a01:4f8:131:2480::3/64 Scope:Global > > so the problem appears to be a network issue (as far as I can see, > there's no firewall running on the box itself). Disabling the AAAA > record until we get a response from upfront sounds like a good idea. > (I've added the upfront folks as nosy on the ticket Antoine opened > on the meta-tracker.) Some things to try on the box:

* ping6 2001:888:2000:d::a2 (that's python.org)
* check whether the web server is actually listening on the IPv6 address
* check whether an IPv6 default route is set up

If the first doesn't work, there's likely a router problem with the server, which only Upfront can resolve (via a support request to Hetzner or by checking their server IPv6 settings). Here's the configuration page for the IPv6 setup: http://wiki.hetzner.de/index.php/Netzkonfiguration_Debian/en -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 04 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-11-19: Python Meeting Duesseldorf ... 15 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From victor.stinner at gmail.com Mon Nov 4 20:54:40 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 4 Nov 2013 20:54:40 +0100 Subject: [Python-Dev] [Infrastructure] bugs.python.org not reachable in IPv6? In-Reply-To: <5277C8ED.1030206@egenix.com> References: <1381582294.2497.0.camel@fsol> <5277B892.7070901@egenix.com> <1383579286.2883.1.camel@fsol> <20131104160526.223D1250BFB@webabinitio.net> <5277C8ED.1030206@egenix.com> Message-ID: 2013/11/4 M.-A.
Lemburg : > Some things to try on the box: > > * ping6 2001:888:2000:d::a2 (that's python.org)

$ ping6 -c 4 2001:888:2000:d::a2
PING 2001:888:2000:d::a2(2001:888:2000:d::a2) 56 data bytes
64 bytes from 2001:888:2000:d::a2: icmp_seq=1 ttl=56 time=53.0 ms
64 bytes from 2001:888:2000:d::a2: icmp_seq=2 ttl=56 time=53.0 ms
64 bytes from 2001:888:2000:d::a2: icmp_seq=3 ttl=56 time=58.4 ms
64 bytes from 2001:888:2000:d::a2: icmp_seq=4 ttl=56 time=122 ms

--- 2001:888:2000:d::a2 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3003ms
rtt min/avg/max/mdev = 53.024/71.841/122.817/29.514 ms

=> OK

> * check whether the web server is actually listening on the > IPv6 address

smithers$ wget -O python.html 'http://[2001:888:2000:d::a2]/'
--2013-11-04 20:53:27-- http://[2001:888:2000:d::a2]/
Connecting to [2001:888:2000:d::a2]:80... connecté.
requête HTTP transmise, en attente de la réponse... 302 Found
Emplacement: http://www.python.org [suivant]
--2013-11-04 20:53:27-- http://www.python.org/
Résolution de www.python.org (www.python.org)... 2001:888:2000:d::a2, 82.94.164.162
Reusing existing connection to [2001:888:2000:d::a2]:80.
requête HTTP transmise, en attente de la réponse... 200 OK
Longueur: 20423 (20K) [text/html]
Sauvegarde en : « python.html »

100%[=====================================================================================================>] 20 423 113KB/s ds 0,2s

2013-11-04 20:53:28 (113 KB/s) - « python.html » sauvegardé [20423/20423]

=> OK

> * check whether an IPv6 default route is set up

I don't know how to check that. Victor From christian at python.org Mon Nov 4 20:58:30 2013 From: christian at python.org (Christian Heimes) Date: Mon, 04 Nov 2013 20:58:30 +0100 Subject: [Python-Dev] The new SHA-3 is the old SHA-3 Message-ID: Good news everybody! A while ago John Kelsey presented NIST's plans to change SHA-3 [1] in order to make it faster but also weaker. Last Friday he posted a mail on NIST's internal hash-forum mailing list. NIST is planning to drop these plans and to move forward with the SHA-3 draft in its original state. I can keep the sha3 module in Python 3.4 and don't have to remove it. Maybe I'm going to add the new SHAKE functions to hashlib, too. But first let me explain the background. NIST is going to standardize four replacements for SHA2: SHA3-224, SHA3-256, SHA3-384 and SHA3-512. (The number after the hyphen is the output size in bits.) These are going to be standardized variants of the Keccak-f[1600] sponge construction. [2] Similar to a real sponge, Keccak has a capacity, can absorb data at some rate and can be squeezed to output data. The rate parameter affects performance of the Keccak construction (higher rate == faster). A higher capacity results in better collision resistance. For Keccak-f[1600], rate + capacity = 1600 bits, with capacity/2 bits of collision and pre-image resistance. Contrary to a real sponge, the squeeze function can return arbitrary amounts of data. The old and new parameters for SHA-3 are:

SHA3-224: capacity=448, output=224
SHA3-256: capacity=512, output=256
SHA3-384: capacity=768, output=384
SHA3-512: capacity=1024, output=512

NIST was about to standardize SHA3-224/SHA3-256 with capacity=256 and SHA3-384/SHA3-512 with capacity=512. That's (most likely) off the stove. However NIST is going to standardize two additional functions: SHAKE128 and SHAKE256. These functions support arbitrary length output with a collision resistance of 128/256 bits with a capacity of 256/512.
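To make the sponge arithmetic concrete, here is a tiny illustrative snippet computing the rates implied by the draft parameters quoted above (it uses nothing beyond the numbers already given):

# Keccak-f[1600]: rate + capacity = 1600 bits, and collision resistance
# is roughly capacity/2 bits (per the description above).
for name, capacity in [("SHA3-224", 448), ("SHA3-256", 512),
                       ("SHA3-384", 768), ("SHA3-512", 1024)]:
    rate = 1600 - capacity
    print("%s: rate=%d, capacity=%d, collision resistance=%d bits"
          % (name, rate, capacity, capacity // 2))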
The SHAKE functions can be easily implemented on top of the current code and PEP 247 API with mandatory length arguments for digest() and hexdigest(). Christian [1] http://bristolcrypto.blogspot.de/2013/08/ches-invited-talk-future-of-sha-3.html [2] http://keccak.noekeon.org/ From antoine at python.org Mon Nov 4 21:03:47 2013 From: antoine at python.org (Antoine Pitrou) Date: Mon, 04 Nov 2013 21:03:47 +0100 Subject: [Python-Dev] [Infrastructure] bugs.python.org not reachable in IPv6? In-Reply-To: References: <1381582294.2497.0.camel@fsol> <5277B892.7070901@egenix.com> <1383579286.2883.1.camel@fsol> <20131104160526.223D1250BFB@webabinitio.net> <5277C8ED.1030206@egenix.com> Message-ID: <1383595427.2883.5.camel@fsol> On lun., 2013-11-04 at 20:54 +0100, Victor Stinner wrote: > 2013/11/4 M.-A. Lemburg : > > Some things to try on the box: > > > > * ping6 2001:888:2000:d::a2 (that's python.org) > > $ ping6 -c 4 2001:888:2000:d::a2 [...] On the box, not on your home machine! Regards Antoine. From mal at egenix.com Mon Nov 4 21:06:25 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 04 Nov 2013 21:06:25 +0100 Subject: [Python-Dev] [Infrastructure] bugs.python.org not reachable in IPv6? In-Reply-To: References: <1381582294.2497.0.camel@fsol> <5277B892.7070901@egenix.com> <1383579286.2883.1.camel@fsol> <20131104160526.223D1250BFB@webabinitio.net> <5277C8ED.1030206@egenix.com> Message-ID: <5277FE41.3050407@egenix.com> On 04.11.2013 20:54, Victor Stinner wrote: > 2013/11/4 M.-A. Lemburg : >> Some things to try on the box: >> >> * ping6 2001:888:2000:d::a2 (that's python.org) >
> $ ping6 -c 4 2001:888:2000:d::a2
> PING 2001:888:2000:d::a2(2001:888:2000:d::a2) 56 data bytes
> 64 bytes from 2001:888:2000:d::a2: icmp_seq=1 ttl=56 time=53.0 ms
> 64 bytes from 2001:888:2000:d::a2: icmp_seq=2 ttl=56 time=53.0 ms
> 64 bytes from 2001:888:2000:d::a2: icmp_seq=3 ttl=56 time=58.4 ms
> 64 bytes from 2001:888:2000:d::a2: icmp_seq=4 ttl=56 time=122 ms
>
> --- 2001:888:2000:d::a2 ping statistics ---
> 4 packets transmitted, 4 received, 0% packet loss, time 3003ms
> rtt min/avg/max/mdev = 53.024/71.841/122.817/29.514 ms
>
> => OK
>
>> * check whether the web server is actually listening on the >> IPv6 address
>
> smithers$ wget -O python.html 'http://[2001:888:2000:d::a2]/'
> --2013-11-04 20:53:27-- http://[2001:888:2000:d::a2]/
> Connecting to [2001:888:2000:d::a2]:80... connecté.
> requête HTTP transmise, en attente de la réponse... 302 Found
> Emplacement: http://www.python.org [suivant]
> --2013-11-04 20:53:27-- http://www.python.org/
> Résolution de www.python.org (www.python.org)... 2001:888:2000:d::a2, 82.94.164.162
> Reusing existing connection to [2001:888:2000:d::a2]:80.
> requête HTTP transmise, en attente de la réponse... 200 OK
> Longueur: 20423 (20K) [text/html]
> Sauvegarde en : « python.html »
>
> 100%[=====================================================================================================>] 20 423 113KB/s ds 0,2s
>
> 2013-11-04 20:53:28 (113 KB/s) - « python.html » sauvegardé [20423/20423]
>
> => OK
>
>> * check whether an IPv6 default route is set up
>
> I don't know how to check that.

route -n -A inet6 should print the default route setup for IPv6. This has to list a working and ping6'able gateway. BTW: The above was meant to be run on the bugs.python.org box, not your box :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 04 2013) >>> Python Projects, Consulting and Support ...
http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-11-19: Python Meeting Duesseldorf ... 15 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ethan at stoneleaf.us Mon Nov 4 22:17:44 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 04 Nov 2013 13:17:44 -0800 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #XXXXX: Fix test_idle so that idlelib test cases are actually run In-Reply-To: References: <3dChMm1bw5z7LjR@mail.python.org> <5277287F.8060607@udel.edu> <20131104090405.GA10765@phdru.name> Message-ID: <52780EF8.2070309@stoneleaf.us> On 11/04/2013 07:52 AM, Guido van Rossum wrote: > Two more approaches that can help when you haven't pushed yet: > > - hg rollback undoes the most recent local commit while leaving the local workspace unchanged so the files are now > patched but not committed > > - hg strip deletes a revision and all its descendants (requires some extension to be configured); it also rolls > back your workspace so you'd have to repatch using hg import --no-commit. > > (BTW everybody here knows you can give hg import a URL straight from Rietveld's "raw diff download" right?) Oh, cool! Thanks for the hint! :) -- ~Ethan~ From john at ziaspace.com Mon Nov 4 21:47:53 2013 From: john at ziaspace.com (John Klos) Date: Mon, 4 Nov 2013 20:47:53 +0000 (UTC) Subject: [Python-Dev] VAX NaN evaluations Message-ID: Hi, The nice Python folks who were at SCALE in Los Angeles last year gave me a Python t-shirt for showing Python working on m68k and for suggesting that I'd get it working on VAX. With libffi support for VAX from Miod Vallat, this is now possible. However, when compiling Python, it seems that attempts to evaluate NaN are made in test_complex.py, test_complex_args.py, test_float.py and test_math.py (at least in 2.7.5 - I'm working on compiling 3.3.2 now). The short answer is to skip those tests on VAXen. The better answer is to patch any isnan functions to always return false on VAXen and patch any code which tries to parse, for instance, float("NaN") to use something uncommon, such as the largest representable number (const union __double_u __infinity = { { 0xff, 0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff } };) or something else equally rare. While code which depends on the ability to evaluate NaNs in some meaningful way will likely break on VAXen, I think this is better than raising an exception. Being completely new to Python, I'm not sure what's best to do here. What would be the best way to find code which handles evaluation of "NaN"? Does anyone have any recommendations? Thanks, John Klos From dreamingforward at gmail.com Mon Nov 4 23:02:45 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 4 Nov 2013 14:02:45 -0800 Subject: [Python-Dev] VAX NaN evaluations In-Reply-To: References: Message-ID: > The nice Python folks who were at SCALE in Los Angeles last year gave me a > Python t-shirt for showing Python working on m68k and for suggesting that > I'd get it working on VAX. With libffi support for VAX from Miod Vallat, > this is now possible.
> > However, when compiling Python, it seems that attempts to evaluate NaN are > made in test_complex.py, test_complex_args.py, test_float.py and > test_math.py (at least in 2.7.5 - I'm working on compiling 3.3.2 now). > > The short answer is to skip those tests on VAXen. The better answer is to > patch any isnan functions to always return false on VAXen and patch any code > which tries to parse, for instance, float("NaN") to use something uncommon, > such as the largest representable number (const union __double_u __infinity > = { { 0xff, 0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff } };) or something else > equally rare. I think you are asking for trouble here. As VAX floating point does not appear to have hardware support for NaN, attempts to "emulate" such could be potentially dangerous. If such software were running life-support systems, for example, your decision to emulate it would put you at fault for criminal negligence, whereas a failure simply due to lack of implementation would be more benign incompetence. Probably better for an exception to be thrown or other hard error. Mark From dreamingforward at gmail.com Mon Nov 4 23:14:36 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 4 Nov 2013 14:14:36 -0800 Subject: [Python-Dev] VAX NaN evaluations In-Reply-To: References: Message-ID: I think I spoke too soon in my earlier reply. If you have control over the whole system, you could *set* policy on behalf of a whole platform (like VAX) so you can "safely" use an otherwise non-normal set of bits to designate divide by zero (a negative sign bit with the rest all zeros, for example) or other NaNs. But then you should be on the IEEE floating point committee and make these *universal* for the particular platform in question. Despite the debate from another thread, there's probably no wisdom in using other magical bit patterns (0xDEADBEEF). -- MarkJ Tacoma, Washington From ncoghlan at gmail.com Mon Nov 4 23:15:34 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 5 Nov 2013 08:15:34 +1000 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #XXXXX: Fix test_idle so that idlelib test cases are actually run In-Reply-To: References: <3dChMm1bw5z7LjR@mail.python.org> <5277287F.8060607@udel.edu> <20131104090405.GA10765@phdru.name> Message-ID: On 5 Nov 2013 02:03, "Guido van Rossum" wrote: > > Two more approaches that can help when you haven't pushed yet: > > - hg rollback undoes the most recent local commit while leaving the local workspace unchanged so the files are now patched but not committed > > - hg strip deletes a revision and all its descendants (requires some extension to be configured); it also rolls back your workspace so you'd have to repatch using hg import --no-commit.
>> >> On Mon, Nov 04, 2013 at 03:56:25AM -0500, Terry Reedy wrote: >> > The one >> > thing I tried but could not do was to directly change status 'A' >> > back to '?'. >> >> hg forget file >> >> Oleg. >> -- >> Oleg Broytman http://phdru.name/ phd at phdru.name >> Programmers don't die, they just GOSUB without RETURN. >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From john at ziaspace.com Mon Nov 4 23:28:09 2013 From: john at ziaspace.com (John Klos) Date: Mon, 4 Nov 2013 22:28:09 +0000 (UTC) Subject: [Python-Dev] VAX NaN evaluations In-Reply-To: References: Message-ID: >> The short answer is to skip those tests on VAXen. The better answer is to >> patch any isnan functions to always return false on VAXen and patch any code >> which tries to parse, for instance, float("NaN") to use something uncommon, >> such as the largest representable number (const union __double_u __infinity >> = { { 0xff, 0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff } };) or something else >> equally rare. > > I think you are asking for trouble here. As VAX floating point does > not appear to have hardware support for NaN, attempts to "emulate" > such could be potentially dangerous. If such software were running > life-support systems, for example, your decision to emulate it would > put you at fault for criminal negligence, whereas a failure simply due > to lack of implementation would be more benign incompetence. Probably > better for an exception to be thrown or other hard error. We'd have to have one uncommor and two extremely unlikely events all happen simultaneously for your example to be of concern: One, someone would have to care that NaN representations are done using an arbitrary value rather than an actual NaN representation in IEEE 754. I can imagine instances where this could cause an issue, but people who write proper math code would tend to write correct code anyway. Two, someone would have to decide to use Python with NaN testing / comparison code to run some sort of life-support system. I can't imagine anyone who isn't already horribly incompetent doing anything like this. Three, that someone would have to want to run that code and that life-support system on a VAX (or other system which doesn't handle NaNs. While you make a point worth making, nobody is ever going to be at fault for criminal negligence for having implementation side-cases. If that were even possible, open source software would be litigated out of existence and everyone would be suing Microsoft for their monumental failures. So you indirectly say that you think it'd be best to just skip the tests and leave it so tha any attempts to handle NaNs would dump core. Is that right? 
Thanks, John From alexander.belopolsky at gmail.com Mon Nov 4 23:37:59 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 4 Nov 2013 17:37:59 -0500 Subject: [Python-Dev] VAX NaN evaluations In-Reply-To: References: Message-ID: On Mon, Nov 4, 2013 at 3:47 PM, John Klos wrote: > What would be the best way to find code which handles evaluation of "NaN"? I would be surprised to find NaN handling outside of the math module, the float object and their complex counterparts (cmath and the complex object). Other areas that deal with NaNs such as the decimal module should just call float methods and/or math functions. From dreamingforward at gmail.com Mon Nov 4 23:45:03 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 4 Nov 2013 14:45:03 -0800 Subject: [Python-Dev] VAX NaN evaluations In-Reply-To: References: Message-ID: > We'd have to have one uncommon and two extremely unlikely events all happen > simultaneously for your example to be of concern: Understood. But when things run millions of times a second, "extremely unlikely" things can happen more often than you'd want. > Two, someone would have to decide to use Python with NaN testing / > comparison code to run some sort of life-support system. I can't imagine > anyone who isn't already horribly incompetent doing anything like this. People incorporate third-party software, often unknowingly, trusting the process and "system" even though it is almost always *completely* informal and simply a community of good-hearted individuals. > Three, that someone would have to want to run that code and that > life-support system on a VAX (or other system which doesn't handle NaNs). > > While you make a point worth making, nobody is ever going to be at fault for > criminal negligence for having implementation side-cases. You *could* be right. I'm just saying what would be safe, legally and morally speaking. > If that were even > possible, open source software would be litigated out of existence and > everyone would be suing Microsoft for their monumental failures. End-to-end open source software doesn't (generally) make any guarantees ("without warranties of any kind"), so there's little basis to sue. As for MSFT, people have been suing Microsoft. Apparently, they have good lawyers, and people are smart enough not to use their software too much in mission-critical applications (although apparently this has changed over the years). > So you indirectly say that you think it'd be best to just skip the tests and > leave it so that any attempts to handle NaNs would dump core. Is that right? That keeps your hands most safe, but again, I suggest you research the IEEE standard to see if there's already a standard for VAX rather than spinning your own home-brewed version, or join the IEEE committee and make one. -- MarkJ Tacoma, Washington From larry at hastings.org Mon Nov 4 23:47:56 2013 From: larry at hastings.org (Larry Hastings) Date: Mon, 04 Nov 2013 14:47:56 -0800 Subject: [Python-Dev] "PyObject *module" for module-level functions? Message-ID: <5278241C.2050102@hastings.org> When Clinic generates a function, it knows what kind of callable it represents, and it names the first argument (the "PyObject *") accordingly:

* module-level function ("self"),
* method ("self"),
* class method ("cls"), or
* class static ("null").

I now boldly propose that for the first one, the module-level function, the PyObject * parameter should be named "module".
The object passed in is the module object, it's not a "self" in any conventional sense of the word. This would enhance readability, as I assert the name "self" there is confusing. The argument is rarely used on module-level functions, and very little code is converted right now using Clinic anyway. I therefore assert this change would break very little code, and the code that did get broken by this change could be fixed as part of the process of converting it to Clinic. But now would be the time to make this change, before doing the big push to convert to Clinic. (A couple of weeks ago would have been even better...!) +1? -1? //arry/ From ncoghlan at gmail.com Tue Nov 5 00:36:45 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 5 Nov 2013 09:36:45 +1000 Subject: [Python-Dev] "PyObject *module" for module-level functions? In-Reply-To: <5278241C.2050102@hastings.org> References: <5278241C.2050102@hastings.org> Message-ID: On 5 Nov 2013 08:49, "Larry Hastings" wrote: > > When Clinic generates a function, it knows what kind of callable it represents, and it names the first argument (the "PyObject *") accordingly: > module-level function ("self"), > method ("self"), > class method ("cls"), or > class static ("null"). > I now boldly propose that for the first one, the module-level function, the PyObject * parameter should be named "module". The object passed in is the module object, it's not a "self" in any conventional sense of the word. > > This would enhance readability, as I assert the name "self" there is confusing. The argument is rarely used on module-level functions, and very little code is converted right now using Clinic anyway. I therefore assert this change would break very little code, and the code that did get broken by this change could be fixed as part of the process of converting it to Clinic. > > But now would be the time to make this change, before doing the big push to convert to Clinic. (A couple of weeks ago would have been even better...!) > > +1? -1? +1 from me, as they're not really methods of the module instance (i.e. dynamically bound when retrieved from the module), even though they behave a little like they are. Instead of relying on the descriptor protocol (which doesn't trigger because module level functions are stored in an instance namespace), we play games when creating the Python level callables during module creation in order to prebind the module object. Cheers, Nick. > > > /arry From tjreedy at udel.edu Tue Nov 5 00:43:45 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 04 Nov 2013 18:43:45 -0500 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #XXXXX: Fix test_idle so that idlelib test cases are actually run In-Reply-To: References: <3dChMm1bw5z7LjR@mail.python.org> <5277287F.8060607@udel.edu> <20131104090405.GA10765@phdru.name> Message-ID: On 11/4/2013 5:15 PM, Nick Coghlan wrote: > - I actually have "--no-commit" configured as a standard option for hg > import in my .hgrc file so I never forget On Windows, hg uses .ini files. Do you have any idea what the [section] option = value would look like?
There is a gui dialog for managing the .ini files, but it does not have a page for import options. > - "hg pull --rebase" avoids having a merge in the history for push races > that involve only default branch changes Can this be done routinely for all pulls? Does it hurt if there are working directory changes in 2 or 3 branches? -- Terry Jan Reedy From guido at python.org Tue Nov 5 00:45:03 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 4 Nov 2013 15:45:03 -0800 Subject: [Python-Dev] "PyObject *module" for module-level functions? In-Reply-To: References: <5278241C.2050102@hastings.org> Message-ID: On Mon, Nov 4, 2013 at 3:36 PM, Nick Coghlan wrote: > On 5 Nov 2013 08:49, "Larry Hastings" wrote: > > When Clinic generates a function, it knows what kind of callable it > represents, and it names the first argument (the "PyObject *") accordingly: > > module-level function ("self"), > > method ("self"), > > class method ("cls"), or > > class static ("null"). > > I now boldly propose that for the first one, the module-level function, > the PyObject * parameter should be named "module". The object passed in is > the module object, it's not a "self" in any conventional sense of the word. > > > > This would enhance readability, as I assert the name "self" there is > confusing. The argument is rarely used on module-level functions, and very > little code is converted right now using Clinic anyway. I therefore assert > this change would break very little code, and the code that did get broken > by this change could be fixed as part of the process of converting it to > Clinic. > > > > But now would be the time to make this change, before doing the big push > to convert to Clinic. (A couple of weeks ago would have been even > better...!) > > > > +1? -1? > > +1 from me, as they're not really methods of the module instance (i.e. > dynamically bound when retrieved from the module), even though they behave > a little like they are. > > Instead of relying on the descriptor protocol (which doesn't trigger > because module level functions are stored in an instance namespace), we > play games when creating the Python level callables during module creation > in order to prebind the module object > +1 from me too. This finally fixes a silly mistake I made well over two decades ago! -- --Guido van Rossum (python.org/~guido) From martin at v.loewis.de Tue Nov 5 00:45:26 2013 From: martin at v.loewis.de (martin at v.loewis.de) Date: Tue, 05 Nov 2013 00:45:26 +0100 Subject: [Python-Dev] VAX NaN evaluations In-Reply-To: References: Message-ID: <20131105004526.Horde.kYd4dsHoQU8OkyX353Rcgg1@webmail.df.eu> Quoting John Klos <john at ziaspace.com>: > The nice Python folks who were at SCALE in Los Angeles last year > gave me a Python t-shirt for showing Python working on m68k and for > suggesting that I'd get it working on VAX. With libffi support for > VAX from Miod Vallat, this is now possible. I'm sorry to hear that - you might have been wasting your time (but then, perhaps not). We decided a while ago that the regular Python releases will not support VAX/VMS any longer. So any code you write has zero chance of being integrated into Python (the same holds for m68k code, BTW). That said, you are free to maintain your own branch of Python. In that branch, you can make whatever changes you consider necessary, but you will be on your own.
Kind regards, Martin From ericsnowcurrently at gmail.com Tue Nov 5 00:53:26 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 4 Nov 2013 16:53:26 -0700 Subject: [Python-Dev] "PyObject *module" for module-level functions? In-Reply-To: <5278241C.2050102@hastings.org> References: <5278241C.2050102@hastings.org> Message-ID: On Mon, Nov 4, 2013 at 3:47 PM, Larry Hastings wrote: > When Clinic generates a function, it knows what kind of callable it > represents, and it names the first argument (the "PyObject *") accordingly: > > module-level function ("self"), > method ("self"), > class method ("cls"), or > class static ("null"). > > I now boldly propose that for the first one, the module-level function, the > PyObject * parameter should be named "module". The object passed in is the > module object, it's not a "self" in any conventional sense of the word. > > This would enhance readability, as I assert the name "self" there is > confusing. The argument is rarely used on module-level functions, and very > little code is converted right now using Clinic anyway. I therefore assert > this change would break very little code, and the code that did get broken > by this change could be fixed as part of the process of converting it to > Clinic. > > But now would be the time to make this change, before doing the big push to > convert to Clinic. (A couple of weeks ago would have been even better...!) > > +1? -1? +1 -eric From steve at pearwood.info Tue Nov 5 03:44:30 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 5 Nov 2013 13:44:30 +1100 Subject: [Python-Dev] VAX NaN evaluations In-Reply-To: References: Message-ID: <20131105024429.GI15615@ando> On Mon, Nov 04, 2013 at 08:47:53PM +0000, John Klos wrote: [...] > The short answer is to skip those tests on VAXen. The better answer is to > patch any isnan functions to always return false on VAXen and patch any > code which tries to parse, for instance, float("NaN") to use something > uncommon, such as the largest representable number (const union __double_u > __infinity = { { 0xff, 0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff } };) or > something else equally rare. While code which depends on the ability to > evaluate NaNs in some meaningful way will likely break on VAXen, I think > this is better than raising an exception. (I take it that emulating NANs at the object level isn't an option? That's what the Decimal class does.) I write code that uses NANs, and if running it on a VAX would break my code, I'd rather it broke it nice and cleanly with an exception rather than by just giving the wrong result. With an exception, I can see straight away that something is wrong, and decide the best way to deal with the lack of NANs for my application. If you helpfully "fix" it for me by returning a non-NAN number, or even an infinity, at best I'll get a failure somewhere else in the application, far from where the break actually occurred; at worst, I may never know that my application is actually generating garbage results. I'm reminded of a quote from Chris Smith: "I find it amusing when novice programmers believe their main job is preventing programs from crashing. ... More experienced programmers realize that correct code is great, code that crashes could use improvement, but incorrect code that doesn't crash is a horrible nightmare." Not to imply that you're a novice programmer (sorry for the implication) but please don't do that to your users. The Zen of Python talks about this:

py> import this
The Zen of Python, by Tim Peters
[...]
Errors should never pass silently.

Just to be clear, rather than dump core (which you suggested in a later email), if you cannot support NANs on VAXen, you should modify float to raise a ValueError exception. Pure Python code like float('nan') should never dump core, it should raise:

ValueError: could not convert string to float: 'nan'

-- Steven From g.brandl at gmx.net Tue Nov 5 06:50:51 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 05 Nov 2013 06:50:51 +0100 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #XXXXX: Fix test_idle so that idlelib test cases are actually run In-Reply-To: References: <3dChMm1bw5z7LjR@mail.python.org> <5277287F.8060607@udel.edu> <20131104090405.GA10765@phdru.name> Message-ID: Am 05.11.2013 00:43, schrieb Terry Reedy: > On 11/4/2013 5:15 PM, Nick Coghlan wrote: > >> - I actually have "--no-commit" configured as a standard option for hg >> import in my .hgrc file so I never forget > > On Windows, hg uses .ini files. Do you have any idea what the > [section] > option = value > would look like? It'll look like this:

[defaults]
import = --no-commit

>> - "hg pull --rebase" avoids having a merge in the history for push races >> that involve only default branch changes > > Can this be done routinely for all pulls? Does it hurt if there are > working directory changes in 2 or 3 branches? It won't work with merge commits, so if you locally commit to 3.3 and merge to default, pull --rebase will error out if there are remote changes. cheers, Georg From stefan_ml at behnel.de Tue Nov 5 08:18:09 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 05 Nov 2013 08:18:09 +0100 Subject: [Python-Dev] "PyObject *module" for module-level functions? In-Reply-To: <5278241C.2050102@hastings.org> References: <5278241C.2050102@hastings.org> Message-ID: Larry Hastings, 04.11.2013 23:47: > When Clinic generates a function, it knows what kind of callable it > represents, and it names the first argument (the "PyObject *") accordingly: > > * module-level function ("self"), > * method ("self"), > * class method ("cls"), or > * class static ("null"). > > I now boldly propose that for the first one, the module-level function, the > PyObject * parameter should be named "module". The object passed in is the > module object, it's not a "self" in any conventional sense of the word. > > This would enhance readability, as I assert the name "self" there is > confusing. The argument is rarely used on module-level functions
-- Terry Jan Reedy From g.brandl at gmx.net Tue Nov 5 08:55:01 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 05 Nov 2013 08:55:01 +0100 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Issue #XXXXX: Fix test_idle so that idlelib test cases are actually run In-Reply-To: References: <3dChMm1bw5z7LjR@mail.python.org> <5277287F.8060607@udel.edu> <20131104090405.GA10765@phdru.name> Message-ID: Am 05.11.2013 08:31, schrieb Terry Reedy: > On 11/5/2013 12:50 AM, Georg Brandl wrote: > >> [defaults] >> import = --no-commit > > Thank you. A message in a patch is now ignored rather than > auto-committed. Yes; that's why I'd prefer the commit-and-then-amend way. Georg From larry at hastings.org Tue Nov 5 08:59:53 2013 From: larry at hastings.org (Larry Hastings) Date: Mon, 04 Nov 2013 23:59:53 -0800 Subject: [Python-Dev] "PyObject *module" for module-level functions? In-Reply-To: References: <5278241C.2050102@hastings.org> Message-ID: <5278A579.4000608@hastings.org> On 11/04/2013 11:18 PM, Stefan Behnel wrote: > Since this only relates to the argument clinic, I assume this change > doesn't get in the way of making module level functions real methods of the > module, does it? I'm proposing renaming a parameter for Argument-Clinic-generated code. I can't see how this gets in the way of anything. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Nov 5 11:31:34 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 5 Nov 2013 20:31:34 +1000 Subject: [Python-Dev] "PyObject *module" for module-level functions? In-Reply-To: References: <5278241C.2050102@hastings.org> Message-ID: On 5 Nov 2013 17:19, "Stefan Behnel" wrote: > > Larry Hastings, 04.11.2013 23:47: > > When Clinic generates a function, it knows what kind of callable it > > represents, and it names the first argument (the "PyObject *") accordingly: > > > > * module-level function ("self"), > > * method ("self"), > > * class method ("cls"), or > > * class static ("null"). > > > > I now boldly propose that for the first one, the module-level function, the > > PyObject * parameter should be named "module". The object passed in is the > > module object, it's not a "self" in any conventional sense of the word. > > > > This would enhance readability, as I assert the name "self" there is > > confusing. The argument is rarely used on module-level functions > > Since this only relates to the argument clinic, I assume this change > doesn't get in the way of making module level functions real methods of the > module, does it? Right, in that case (regardless of whether or not someone was using argument clinic to do it), the callables would be declared as actual methods of the custom module type rather than as module level functions. Cheers, Nick. > > Stefan > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Tue Nov 5 13:55:55 2013 From: eric at trueblade.com (Eric V. Smith) Date: Tue, 05 Nov 2013 07:55:55 -0500 Subject: [Python-Dev] [Python-checkins] cpython (3.3): Issue #18345: Added cookbook example illustrating handler customisation. 
In-Reply-To: <3dDRKM2Zv2z7LqL@mail.python.org> References: <3dDRKM2Zv2z7LqL@mail.python.org> Message-ID: <5278EADB.7060908@trueblade.com> On 11/05/2013 05:03 AM, vinay.sajip wrote:

> http://hg.python.org/cpython/rev/5636366db039
> changeset:   86936:5636366db039
> branch:      3.3
> parent:      86933:2c191b0b5e7a
> user:        Vinay Sajip
> date:        Tue Nov 05 10:02:21 2013 +0000
> summary:
> Issue #18345: Added cookbook example illustrating handler customisation.
>
> files:
>   Doc/howto/logging-cookbook.rst | 135 +++++++++++++++++++++
>   1 files changed, 135 insertions(+), 0 deletions(-)
>
>
> diff --git a/Doc/howto/logging-cookbook.rst b/Doc/howto/logging-cookbook.rst
> --- a/Doc/howto/logging-cookbook.rst
> +++ b/Doc/howto/logging-cookbook.rst
> @@ -1694,3 +1694,138 @@
>  Note that the order of items might be different according to the version of
>  Python used.
>
> +.. currentmodule:: logging.config
> +
> +Customising handlers with :func:`dictConfig`
> +--------------------------------------------

I love all things British, but the python source code usually uses "customiz*" (341 instances) over "customis*" (19 instances, 8 of which are in logging). I realize "foolish consistency", and all that, but I think the documentation should all use the same conventions. I'd be happy to change the documentation. Eric. From eric at trueblade.com Tue Nov 5 14:47:05 2013 From: eric at trueblade.com (Eric V. Smith) Date: Tue, 05 Nov 2013 08:47:05 -0500 Subject: [Python-Dev] [Python-checkins] cpython (3.3): Issue #18345: Added cookbook example illustrating handler customisation. In-Reply-To: <5278EADB.7060908@trueblade.com> References: <3dDRKM2Zv2z7LqL@mail.python.org> <5278EADB.7060908@trueblade.com> Message-ID: <5278F6D9.60100@trueblade.com> On 11/05/2013 07:55 AM, Eric V. Smith wrote: > I love all things British, but the python source code usually uses > "customiz*" (341 instances) over "customis*" (19 instances, 8 of which > are in logging). > > I realize "foolish consistency", and all that, but I think the > documentation should all use the same conventions. I'd be happy to > change the documentation. http://bugs.python.org/issue19504 From john at ziaspace.com Tue Nov 5 10:20:07 2013 From: john at ziaspace.com (John Klos) Date: Tue, 5 Nov 2013 09:20:07 +0000 (UTC) Subject: [Python-Dev] VAX NaN evaluations Message-ID: > I'm sorry to hear that - you might have been wasting your time (but > then, perhaps not). > > We decided a while ago that the regular Python releases will not support > VAX/VMS any longer. So any code you write has zero chance of being > integrated into Python (the same holds for m68k code, BTW). > > That said, you are free to maintain your own branch of Python. In that > branch, you can make whatever changes you consider necessary, but you > will be on your own. Oh, it's definitely not wasting time. I seriously doubt I'd be asking Python people to maintain patches upstream, because I seriously doubt we'd have any to maintain - this is for NetBSD/VAX, not VMS. Running NetBSD on VAX points out many instances where programmers make assumptions, so everyone is helped when we address those assumptions. From Steven D'Aprano:

> Just to be clear, rather than dump core (which you suggested in a later
> email), if you cannot support NANs on VAXen, you should modify float to
> raise a ValueError exception.
> Pure Python code like float('nan') should never dump core, it should raise:
>
> ValueError: could not convert string to float: 'nan'

What Steven recommends makes a lot of sense, and if the Python folks don't think it's worth adding a few lines to make NaN assignments optionally raise an exception, we can keep the patches in NetBSD's pkgsrc. So the next time I have a little time, I'll look through the code and see if I can't find where that would be. And while I'm at it, I'll try to learn a little Python ;) Thanks, John From ethan at stoneleaf.us Tue Nov 5 20:49:27 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 05 Nov 2013 11:49:27 -0800 Subject: [Python-Dev] [Python-checkins] cpython (3.3): Issue #18345: Added cookbook example illustrating handler customisation. In-Reply-To: <5278EADB.7060908@trueblade.com> References: <3dDRKM2Zv2z7LqL@mail.python.org> <5278EADB.7060908@trueblade.com> Message-ID: <52794BC7.8050905@stoneleaf.us> On 11/05/2013 04:55 AM, Eric V. Smith wrote: > > I love all things British, but the python source code usually uses > "customiz*" (341 instances) over "customis*" (19 instances, 8 of which > are in logging). > > I realize "foolish consistency", and all that, but I think the > documentation should all use the same conventions. I'd be happy to > change the documentation. Not using the same spelling for the same word would be, I think, a foolish inconsistency. (Curiously enough, although I'm USA, I keep using UK spellings. But not consistently. :/ ) -- ~Ethan~ From victor.stinner at gmail.com Tue Nov 5 22:21:04 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 5 Nov 2013 22:21:04 +0100 Subject: [Python-Dev] PEP 454: tracemalloc (n-th version) In-Reply-To: References: Message-ID: 2013/11/3 Victor Stinner : > "n-th version": Sorry, I don't remember the version number of the PEP :-) > > http://www.python.org/dev/peps/pep-0454/ Charles-François doesn't like the magic Statistic.key attribute which may be a string, a tuple of 2 strings, or a tuple of (filename: str, lineno: str) tuples depending on the key_type parameter of Snapshot.statistics(). So I replaced Statistic.key with Statistic.traceback which is always a traceback (a tuple of (filename: str, lineno: str) tuples). I also renamed the key_type parameter of Snapshot.statistics() to "group_by". Victor From ethan at stoneleaf.us Wed Nov 6 05:38:09 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 05 Nov 2013 20:38:09 -0800 Subject: [Python-Dev] Issue 19332: Guard against changing dict during iteration Message-ID: <5279C7B1.30300@stoneleaf.us> http://bugs.python.org/issue19332

Summary:

--> d = {1: 'one'}
--> for k in d:
...     d[k+1] = d[k] * k
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration

--> for k in d:
...     del d[k]
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration

While the above two cases are covered, the following is not:

--> d = dict.fromkeys('abcd')
--> for i in d:
...     print(i)
...     d[i + 'x'] = None
...     del d[i]
...
d
a
dx
dxx
ax
c
b

Which can yield pretty bizarre results and does not raise an exception. There is a patch to fix this. Raymond's first comment was: "The decision to not monitor adding or removing keys was intentional. It is just not worth the cost in either time or space." Considering that the first two scenarios do raise exceptions, that statement does not seem correct, at least not with current code.
Later, Raymond rejected the patch with the comment: "I disagree with adding such unimportant code to the critical path." I understand we need to keep the critical path as fast as possible, and that it is a balancing act between completely correct code and warnings in the docs. What I'm hoping for is either someone to shed some light on what the "unimportant" part is, or perhaps some performance measurements that would show there is no performance-based reason to not have the patch added. (I can try to do the performance part myself if necessary, I'm just not sure what all the steps are yet.) -- ~Ethan~ From ncoghlan at gmail.com Wed Nov 6 06:41:38 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 6 Nov 2013 15:41:38 +1000 Subject: [Python-Dev] Issue 19332: Guard against changing dict during iteration In-Reply-To: <5279C7B1.30300@stoneleaf.us> References: <5279C7B1.30300@stoneleaf.us> Message-ID: On 6 Nov 2013 15:02, "Ethan Furman" wrote:
>
> http://bugs.python.org/issue19332
>
> Summary:
>
> --> d = {1: 'one'}
> --> for k in d:
> ...     d[k+1] = d[k] * k
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> RuntimeError: dictionary changed size during iteration
>
> --> for k in d:
> ...     del d[k]
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> RuntimeError: dictionary changed size during iteration
>
> While the above two cases are covered, the following is not:
>
> --> d = dict.fromkeys('abcd')
> --> for i in d:
> ...     print(i)
> ...     d[i + 'x'] = None
> ...     del d[i]
> ...
> d
> a
> dx
> dxx
> ax
> c
> b
>
> Which can yield pretty bizarre results and does not raise an exception.
>
> There is a patch to fix this.
>
> Raymond's first comment was:
>
> "The decision to not monitor adding or removing keys was intentional. It is just not worth the cost in either time or space."
>
> Considering that the first two scenarios do raise exceptions, that statement does not seem correct, at least not with current code.
>
> Later, Raymond rejected the patch with the comment:
>
> "I disagree with adding such unimportant code to the critical path."
>
> I understand we need to keep the critical path as fast as possible, and that it is a balancing act between completely correct code and warnings in the docs.
>
> What I'm hoping for is either someone to shed some light on what the "unimportant" part is,

The behaviour of mutating builtin containers while iterating over them is formally undefined beyond "it won't segfault" (one of the few such undefined behaviours in Python). The associated exceptions are thus strictly "best effort given other constraints". Since most loops are unlikely to add and remove exactly balanced numbers of keys, and dict read and write performance impacts essentially all Python code except that which only operates on function locals, only checking for size changes between iteration steps is a nice way to get 99% of the benefits without paying any of the costs.

> or perhaps some performance measurements that would show there is no performance-based reason to not have the patch added. (I can try to do the performance part myself if necessary, I'm just not sure what all the steps are yet.)

If the benchmark suite indicates there's no measurable speed penalty then such a patch may be worth reconsidering. I'd be astonished if that was actually the case, though - the lowest impact approach I can think of is to check for live iterators when setting a dict entry, and that still has non-trivial size and speed implications.

Cheers, Nick.
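P.S. For anyone unfamiliar with the current guard, here is a rough pure-Python model of the size check the dict iterator performs (an illustrative sketch only; the real check lives in C, in Objects/dictobject.c):

class SizeGuardedIter:
    """Mimics the dict iterator's size snapshot; not the real implementation."""
    def __init__(self, d):
        self._d = d
        self._size = len(d)          # snapshot taken when iteration starts
        self._keys = iter(list(d))   # stand-in for walking the real hash table
    def __iter__(self):
        return self
    def __next__(self):
        if len(self._d) != self._size:
            raise RuntimeError("dictionary changed size during iteration")
        return next(self._keys)

Balanced add/remove pairs leave len() unchanged, which is exactly why Ethan's third example slips through the check.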
> > -- > ~Ethan~ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Wed Nov 6 08:10:30 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 6 Nov 2013 00:10:30 -0700 Subject: [Python-Dev] Issue 19332: Guard against changing dict during iteration In-Reply-To: References: <5279C7B1.30300@stoneleaf.us> Message-ID: On Nov 5, 2013 10:42 PM, "Nick Coghlan" wrote: > If the benchmark suite indicates there's no measurable speed penalty then such a patch may be worth reconsidering. I'd be astonished if that was actually the case, though - the lowest impact approach I can think of is to check for live iterators when setting a dict entry, and that still has non-trivial size and speed implications. If I remember right, I had to address something related in my C OrderedDict patch on issue #16991. I don't know if what I did there is applicable to dicts though. It's been a while since I worked on that patch. -eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed Nov 6 13:34:22 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 6 Nov 2013 23:34:22 +1100 Subject: [Python-Dev] Issue 19332: Guard against changing dict during iteration In-Reply-To: <5279C7B1.30300@stoneleaf.us> References: <5279C7B1.30300@stoneleaf.us> Message-ID: <20131106123422.GP15615@ando> On Tue, Nov 05, 2013 at 08:38:09PM -0800, Ethan Furman wrote: > http://bugs.python.org/issue19332 Duplicate of this: http://bugs.python.org/issue6017 The conclusion on that also was that it is not worth guarding against such an unusual circumstance. -- Steven From rdmurray at bitdance.com Wed Nov 6 15:56:07 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Wed, 06 Nov 2013 09:56:07 -0500 Subject: [Python-Dev] Issue 19332: Guard against changing dict during iteration In-Reply-To: <20131106123422.GP15615@ando> References: <5279C7B1.30300@stoneleaf.us> <20131106123422.GP15615@ando> Message-ID: <20131106145608.564A0250C02@webabinitio.net> On Wed, 06 Nov 2013 23:34:22 +1100, Steven D'Aprano wrote: > On Tue, Nov 05, 2013 at 08:38:09PM -0800, Ethan Furman wrote: > > > http://bugs.python.org/issue19332 > > Duplicate of this: http://bugs.python.org/issue6017 > > The conclusion on that also was that it is not worth guarding against > such an unusual circumstance. If I remember correctly (and I may not) the size-change guards were added because without them there were certain cases that could either segfault or result in an infinite loop. --David From victor.stinner at gmail.com Wed Nov 6 16:10:16 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 6 Nov 2013 16:10:16 +0100 Subject: [Python-Dev] Issue 19332: Guard against changing dict during iteration In-Reply-To: <20131106145608.564A0250C02@webabinitio.net> References: <5279C7B1.30300@stoneleaf.us> <20131106123422.GP15615@ando> <20131106145608.564A0250C02@webabinitio.net> Message-ID: 2013/11/6 R. 
David Murray : > On Wed, 06 Nov 2013 23:34:22 +1100, Steven D'Aprano wrote: >> On Tue, Nov 05, 2013 at 08:38:09PM -0800, Ethan Furman wrote: >> >> > http://bugs.python.org/issue19332 >> >> Duplicate of this: http://bugs.python.org/issue6017 >> >> The conclusion on that also was that it is not worth guarding against >> such an unusual circumstance. > > If I remember correctly (and I may not) the size-change guards were > added because without them there were certain cases that could > either segfault or result in an infinite loop. > > --David This exception is quite old:

---
changeset:   17597:32e7d0898eab
branch:      legacy-trunk
user:        Guido van Rossum
date:        Fri Apr 20 19:13:02 2001 +0000
files:       Include/Python.h Include/abstract.h Include/object.h Include/opcode.h Include/pyerrors.h Lib/dis.py Makefile.pre.in Objects/abstra
description:
Iterators phase 1.  This comprises:

new slot tp_iter in type object, plus new flag Py_TPFLAGS_HAVE_ITER
new C API PyObject_GetIter(), calls tp_iter
new builtin iter(), with two forms: iter(obj), and iter(function, sentinel)
new internal object types iterobject and calliterobject
new exception StopIteration
new opcodes for "for" loops, GET_ITER and FOR_ITER (also supported by dis.py)
new magic number for .pyc files
new special method for instances: __iter__() returns an iterator
iteration over dictionaries: "for x in dict" iterates over the keys
iteration over files: "for x in file" iterates over lines

TODO:

documentation
test suite
decide whether to use a different way to spell iter(function, sentinal)
decide whether "for key in dict" is a good idea
use iterators in map/filter/reduce, min/max, and elsewhere (in/not in?)
speed tuning (make next() a slot tp_next???)
---

More recently, I added another exception if a dictionary is modified during a lookup. When I proposed a new frozendict type to secure my pysandbox project, Armin Rigo wrote that CPython segfaults must be fixed. So I fixed a corner case on dictionaries. http://bugs.python.org/issue14205 Thread in python-dev: https://mail.python.org/pipermail/python-dev/2012-March/117290.html The discussion in http://bugs.python.org/issue14205 is interesting. Victor From rdmurray at bitdance.com Wed Nov 6 16:17:59 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Wed, 06 Nov 2013 10:17:59 -0500 Subject: [Python-Dev] Issue 19332: Guard against changing dict during iteration In-Reply-To: References: <5279C7B1.30300@stoneleaf.us> <20131106123422.GP15615@ando> <20131106145608.564A0250C02@webabinitio.net> Message-ID: <20131106151759.CAF29250C02@webabinitio.net> On Wed, 06 Nov 2013 16:10:16 +0100, Victor Stinner wrote: > More recently, I added another exception if a dictionary is modified > during a lookup. > > When I proposed a new frozendict type to secure my pysandbox project, > Armin Rigo wrote that CPython segfaults must be fixed. So I fixed a > corner case on dictionaries. > http://bugs.python.org/issue14205 Ah, that may be what I was (mis)remembering. --David From solipsis at pitrou.net Wed Nov 6 18:16:59 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 06 Nov 2013 18:16:59 +0100 Subject: [Python-Dev] Issue 19332: Guard against changing dict during iteration In-Reply-To: References: <5279C7B1.30300@stoneleaf.us> Message-ID: Le 06/11/2013 06:41, Nick Coghlan a écrit :
>
> The behaviour of mutating builtin containers while iterating over them
> is formally undefined beyond "it won't segfault" (one of the few such
> undefined behaviours in Python).
> The associated exceptions are thus strictly "best effort given other constraints".

Not sure what you mean by "formally undefined". For example, you can perfectly well change a list's contents while iterating over it, and I bet there's a lot of code that relies on that, as in::

    for i, value in enumerate(mylist):
        if some_condition(value):
            mylist[i] = some_function(value)

If you change "builtin containers" to "builtin unordered containers", then you probably are closer to the truth :)

> >or perhaps some performance measurements that would show there is no > performance based reason to not have the patch added. (I can try to do > the performance part myself if necessary, I'm just not sure what all the > steps are yet.) > > If the benchmark suite indicates there's no measurable speed penalty > then such a patch may be worth reconsidering. I'd be astonished if that > was actually the case, though - the lowest impact approach I can think > of is to check for live iterators when setting a dict entry, and that > still has non-trivial size and speed implications.

I think Serhiy's patch is much simpler than that (is it Serhiy's? I think so). Regards Antoine. From storchaka at gmail.com Wed Nov 6 19:02:26 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 06 Nov 2013 20:02:26 +0200 Subject: [Python-Dev] Issue 19332: Guard against changing dict during iteration In-Reply-To: References: <5279C7B1.30300@stoneleaf.us> Message-ID: 06.11.13 07:41, Nick Coghlan wrote: > If the benchmark suite indicates there's no measurable speed penalty > then such a patch may be worth reconsidering. I don't see any significant speed difference even in an artificial, presumably worst case (a lot of item assignments in a tight loop). If you have tests which demonstrate a difference, please show them. > I'd be astonished if that > was actually the case, though - the lowest impact approach I can think > of is to check for live iterators when setting a dict entry, and that > still has non-trivial size and speed implications. Actually we should guard not against changing dict during iteration, but against iterating modifying dict (and only such modifications which change dict's keys). For this we need only a keys modification counter in a dict and its copy in an iterator (this doesn't increase memory requirements however). I suppose Java uses the same technique in HashMap. From ericsnowcurrently at gmail.com Wed Nov 6 20:12:38 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 6 Nov 2013 12:12:38 -0700 Subject: [Python-Dev] Issue 19332: Guard against changing dict during iteration In-Reply-To: References: <5279C7B1.30300@stoneleaf.us> Message-ID: On Wed, Nov 6, 2013 at 11:02 AM, Serhiy Storchaka wrote: > Actually we should guard not against changing dict during iteration, but > against iterating modifying dict (and only such modifications which change > dict's keys). s/iterating modifying/iteration modifying/ ? Just to clarify, do you mean we should only guard against modifications to the dict's keys during its iterator's __next__()? Changes to the values during __next__() would be okay, as would changes to the keys if they happen outside __next__()? Presumably the above restriction also applies to the iterators of the dict's views. OrderedDict would also need to be changed, meaning this counter would have to be accessible to and incrementable by subclasses. OrderedDict makes use of the MappingView classes in collections.abc, so those would also have to be adjusted to check this counter.
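(To make sure I'm reading the proposal right, here's a rough pure-Python sketch of the counter idea -- the names are invented and the real patch would of course be C:)

class CountingDict(dict):
    """Illustration only: a per-dict "keys version" counter."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._version = 0
    def __setitem__(self, key, value):
        if key not in self:
            self._version += 1      # only *key* changes bump the counter
        super().__setitem__(key, value)
    def __delitem__(self, key):
        super().__delitem__(key)
        self._version += 1          # removing a key does too
    def __iter__(self):
        version = self._version     # the iterator keeps a copy
        for key in list(super().__iter__()):   # list() only to sidestep
                                               # the built-in size guard
            if self._version != version:
                raise RuntimeError("dict keys changed during iteration")
            yield key

Replacing the value of an existing key leaves the counter alone, so value-only mutation during iteration would stay legal under this scheme.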
Would MutableMapping also need to be adjusted to accommodate the counter? Given that the MappingView classes would rely on the counter, I'd expect MutableMapping to need some method or variable that the views can rely on. How would we make sure custom methods, particularly __setitem__() and __delitem__(), increment the counter?

> For this we need only a keys modification counter in a dict

A strictly monotonic counter, right? Every mutation method of dict would have to increment this counter. So would that put a limit (albeit a high one) on the number of mutations that can be made to a dict? Would there be conditions under which we'd reset the counter? If so, how would existing iterators cope?

> and > its copy in an iterator (this doesn't increase memory requirements > however).

Because the iterators (and views) already have a pointer to the dict, right? -eric From thatiparthysreenivas at gmail.com Wed Nov 6 20:17:42 2013 From: thatiparthysreenivas at gmail.com (Sreenivas Reddy T) Date: Thu, 7 Nov 2013 00:47:42 +0530 Subject: [Python-Dev] A small patch. Message-ID:

diff -r bec6df56c053 Lib/datetime.py
--- a/Lib/datetime.py	Tue Nov 05 22:01:46 2013 +0200
+++ b/Lib/datetime.py	Thu Nov 07 00:46:34 2013 +0530
@@ -43,7 +43,7 @@

 def _days_in_month(year, month):
     "year, month -> number of days in that month in that year."
-    assert 1 <= month <= 12, month
+    assert 1 <= month <= 12, 'month must be in 1..12'
     if month == 2 and _is_leap(year):
         return 29
     return _DAYS_IN_MONTH[month]

Best Regards, Srinivas Reddy Thatiparthy -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Wed Nov 6 20:47:50 2013 From: brett at python.org (Brett Cannon) Date: Wed, 6 Nov 2013 14:47:50 -0500 Subject: [Python-Dev] A small patch. In-Reply-To: References: Message-ID: Please put all patches on bugs.python.org, else we will lose track of the change. On Wed, Nov 6, 2013 at 2:17 PM, Sreenivas Reddy T < thatiparthysreenivas at gmail.com> wrote:

> diff -r bec6df56c053 Lib/datetime.py
> --- a/Lib/datetime.py	Tue Nov 05 22:01:46 2013 +0200
> +++ b/Lib/datetime.py	Thu Nov 07 00:46:34 2013 +0530
> @@ -43,7 +43,7 @@
>
>  def _days_in_month(year, month):
>      "year, month -> number of days in that month in that year."
> -    assert 1 <= month <= 12, month
> +    assert 1 <= month <= 12, 'month must be in 1..12'
>      if month == 2 and _is_leap(year):
>          return 29
>      return _DAYS_IN_MONTH[month]
>
> Best Regards,
> Srinivas Reddy Thatiparthy
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/brett%40python.org
>
> -------------- next part --------------

An HTML attachment was scrubbed... URL: From storchaka at gmail.com Wed Nov 6 21:16:19 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 06 Nov 2013 22:16:19 +0200 Subject: [Python-Dev] Issue 19332: Guard against changing dict during iteration In-Reply-To: References: <5279C7B1.30300@stoneleaf.us> Message-ID: 06.11.13 21:12, Eric Snow wrote: > Just to clarify, do you mean we should only guard against > modifications to the dict's keys during its iterator's __next__()? > Changes to the values during __next__() would be okay, as would > changes to the keys if they happen outside __next__()? Dict iteration order can be changed after adding or deleting a key, so continuing to iterate after this is not well defined.
__next__() is invalid if the dict's keys were modified after the iterator's creation. This is a common assertion for hashtables in many programming languages. Raising an exception in such a situation just helps a programmer find his mistake. Java raises an exception; in C++ it is undefined behavior. > Presumably the above restriction also applies to the iterators of the > dict's views. Yes. The proposed patch has tests for these cases. > OrderedDict would also need to be changed, meaning this counter would > have to be accessible to and incrementable by subclasses. OrderedDict > makes use of the MappingView classes in collections.abc, so those > would also have to be adjusted to check this counter. Perhaps issue19414 is related. OrderedDict does not need such behavior because its iteration order is well defined and we can implement another reasonable behavior (for example deleted keys are skipped and newly added keys are always iterated, because they are added at the end of the iteration order). > Would MutableMapping also need to be adjusted to accommodate the > counter? Given that the MappingView classes would rely on the > counter, I'd expect MutableMapping to need some method or variable > that the views can rely on. How would we make sure custom methods, > particularly __setitem__() and __delitem__(), increment the counter? MutableMapping implements neither __iter__() nor __setitem__(). The concrete implementation is responsible for this: either provide a reliable __iter__() which has predictable behavior after __setitem__(), or detect such modification and raise an exception, or just ignore the problem. The MappingView classes would get this for free because their iterators iterate over the mapping itself. > A strictly monotonic counter, right? Every mutation method of dict > would have to increment this counter. So would that put a limit > (albeit a high one) on the number of mutations that can be made to a > dict? Would there be conditions under which we'd reset the counter? > If so, how would existing iterators cope? It is very unlikely that code unintentionally causes exactly 2**32 or 2**64 mutations between dict iterations. If this happens, the programmer should use other methods to find his mistakes. > Because the iterators (and views) already have a pointer to the dict, right? Currently the PyDictObject object contains 3 words and adding yet another word is unlikely to change the actually allocated size. Dict iterators already contain a field for the dict's size; it will be replaced by a field for the dict's counter. From skip at pobox.com Wed Nov 6 21:39:17 2013 From: skip at pobox.com (Skip Montanaro) Date: Wed, 6 Nov 2013 14:39:17 -0600 Subject: [Python-Dev] A small patch. In-Reply-To: References: Message-ID:

> -    assert 1 <= month <= 12, month
> +    assert 1 <= month <= 12, 'month must be in 1..12'

In addition to Brett's comment, you might as well include the offending value in your AssertionError message. For example, a value of 0 probably tells you something different about your underlying bug than a value of 2013. Just knowing it's out of range isn't really enough. Skip From solipsis at pitrou.net Wed Nov 6 21:43:36 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 06 Nov 2013 21:43:36 +0100 Subject: [Python-Dev] A small patch.
In-Reply-To: References: Message-ID: Le 06/11/2013 21:39, Skip Montanaro a écrit :

>> -    assert 1 <= month <= 12, month
>> +    assert 1 <= month <= 12, 'month must be in 1..12'
>
> In addition to Brett's comment, you might as well include the
> offending value in your AssertionError message. For example, a value
> of 0 probably tells you something different about your underlying bug
> than a value of 2013. Just knowing it's out of range isn't really
> enough.

Besides, if it's an assertion it's only an internal helper to check implementation correctness. If it's an error that can be caused by erroneous user data, it should be replaced with the proper exception class (perhaps ValueError). Regards Antoine. From alexander.belopolsky at gmail.com Wed Nov 6 22:30:51 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 6 Nov 2013 16:30:51 -0500 Subject: [Python-Dev] A small patch. In-Reply-To: References: Message-ID: On Wed, Nov 6, 2013 at 3:43 PM, Antoine Pitrou wrote: > Besides, if it's an assertion it's only an internal helper to check > implementation correctness. If it's an error that can be caused by > erroneous user data, it should be replaced with the proper exception class > (perhaps ValueError). As far as I can tell, the only way to trigger this assertion is to call the internal _days_in_month() function directly - something that users should not do. For a public function with similar functionality look at calendar.monthrange(). -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Wed Nov 6 22:09:15 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 06 Nov 2013 13:09:15 -0800 Subject: [Python-Dev] Issue 19332: Guard against changing dict during iteration In-Reply-To: <5279C7B1.30300@stoneleaf.us> References: <5279C7B1.30300@stoneleaf.us> Message-ID: <527AAFFB.1000103@stoneleaf.us> Thank you, Raymond, and everyone else who responded both here and on the tracker. -- ~Ethan~ From victor.stinner at gmail.com Wed Nov 6 23:32:33 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 6 Nov 2013 23:32:33 +0100 Subject: [Python-Dev] Avoid formatting an error message on attribute error Message-ID: Hi, I'm trying to avoid unnecessary temporary Unicode strings when possible (see issue #19512). While working on this, I saw that Python likes preparing a user-friendly message to explain why getting an attribute failed. The problem is that in most cases, the caller doesn't care about the message: the exception is simply deleted. For example, hasattr() immediately deletes the AttributeError. It would be nice to only format the message on demand. The AttributeError would keep a reference to the type. Keeping a strong reference to the type might change the behaviour of some applications in some corner cases. (Holding a reference to the object would be worse, and the type looks to be preferred over the object for formatting the error message.) Pseudo-code for the modified AttributeError:

    class AttributeError(Exception):
        def __init__(self, type, attr):
            self.type = type
            self.attr = attr
            self._message = None

        def __str__(self):
            if self._message is None:
                self._message = ("type object %s has no attribute %r"
                                 % (self.type.__name__, self.attr))
            return self._message

AttributeError.args would be (type, attr) instead of (message,). ImportError was also modified to add a new "name" attribute.
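(The ImportError precedent already behaves this way in Python 3.3; for example, with a module name that obviously doesn't exist:)

    try:
        import nosuchmodule
    except ImportError as exc:
        print(exc.name)    # prints 'nosuchmodule'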
If AttributeError cannot be modified (because of backward compatibility), would it be possible to add a new exception inheriting from AttributeError? I have a similar project for OSError (generate the message on demand), but it's more tricky because os.strerror(errno) depends on the current locale...

***

Example of C code raising an AttributeError:

    PyErr_Format(PyExc_AttributeError,
                 "'%.50s' object has no attribute '%U'",
                 tp->tp_name, name);

Example of C code ignoring the AttributeError:

    value = _PyObject_GetAttrId(mod, &PyId___initializing__);
    if (value == NULL)
        PyErr_Clear();
    else {
        ...
    }

Example of Python code ignoring the AttributeError:

    try:
        get_subactions = action._get_subactions
    except AttributeError:
        pass
    else:
        ...

Another example in Python:

    try:
        retattr = getattr(self.socket, attr)
    except AttributeError:
        # the original error is not used
        raise AttributeError("%s instance has no attribute '%s'"
                             % (self.__class__.__name__, attr))

Victor From steve at pearwood.info Thu Nov 7 01:20:31 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 7 Nov 2013 11:20:31 +1100 Subject: [Python-Dev] Avoid formatting an error message on attribute error In-Reply-To: References: Message-ID: <20131107002031.GQ15615@ando> On Wed, Nov 06, 2013 at 11:32:33PM +0100, Victor Stinner wrote: > Hi, > > I'm trying to avoid unnecessary temporary Unicode strings when > possible (see issue #19512). While working on this, I saw that Python > likes preparing a user-friendly message to explain why getting an > attribute failed. The problem is that in most cases, the caller > doesn't care about the message: the exception is simply deleted. For > example, hasattr() immediately deletes the AttributeError. My initial instinct here was to say that sounded like premature optimization, but to my surprise the overhead of generating the error message is actually significant -- at least from pure Python 3.3 code. That is, the time it takes to generate the message:

    "'%s' object has no attribute %r" % (obj.__class__.__name__, attrname)

is about the same as the time to raise and catch an exception:

    try:
        raise AttributeError
    except AttributeError:
        pass

(measured with timeit). Given that, I wonder whether it would be worth coming up with a more general solution to the question of lazily generating error messages rather than changing AttributeError specifically. Just thinking aloud -- perhaps this should go to python-ideas? -- maybe something like this:

    class _LazyMsg:
        def __init__(self, template, obj, attrname):
            self.template = template
            self.typename = type(obj).__name__
            self.attrname = attrname
        def __str__(self):
            return self.template % (self.typename, self.attrname)

Then:

    py> err = AttributeError(
    ...     _LazyMsg("'%s' object has no attribute '%s'", None, "spam"))
    py> raise err
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'NoneType' object has no attribute 'spam'

Giving the same effect from C code, I leave as an exercise for the reader :-)

> It would be nice to only format the message on demand. The > AttributeError would keep a reference to the type.

Since only the type name is used, why keep a reference to the type instead of just type.__name__?

> AttributeError.args would be (type, attr) instead of (message,). > ImportError was also modified to add a new "name" attribute.

I don't like changing the signature of AttributeError. I've got code that raises AttributeError explicitly.
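P.S. For the curious, my measurements were along these lines (a reconstruction of the script, not the exact one I ran):

    import timeit

    t_msg = timeit.timeit(
        "\"'%s' object has no attribute %r\" % (type(obj).__name__, 'spam')",
        setup="obj = 1.0")
    t_exc = timeit.timeit(
        "try: raise AttributeError\nexcept AttributeError: pass")
    print(t_msg, t_exc)   # comparable numbers on my machine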
-- Steven From victor.stinner at gmail.com Thu Nov 7 12:32:50 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 7 Nov 2013 12:32:50 +0100 Subject: [Python-Dev] Avoid formatting an error message on attribute error In-Reply-To: <20131107002031.GQ15615@ando> References: <20131107002031.GQ15615@ando> Message-ID: 2013/11/7 Steven D'Aprano : > My initial instinct here was to say that sounded like premature > optimization, but to my surprise the overhead of generating the error > message is actually significant -- at least from pure Python 3.3 code. I ran a quick and dirty benchmark by replacing the error message with None.

Original:

$ ./python -m timeit 'hasattr(1, "y")'
1000000 loops, best of 3: 0.354 usec per loop
$ ./python -m timeit -s 's=1' 'try: s.y' 'except AttributeError: pass'
1000000 loops, best of 3: 0.471 usec per loop

Patched:

$ ./python -m timeit 'hasattr(1, "y")'
10000000 loops, best of 3: 0.106 usec per loop
$ ./python -m timeit -s 's=1' 'try: s.y' 'except AttributeError: pass'
10000000 loops, best of 3: 0.191 usec per loop

hasattr() is 3.3x faster and try/except is 2.4x faster on such a micro benchmark.

> Given that, I wonder whether it would be worth coming up with a more > general solution to the question of lazily generating error messages > rather than changing AttributeError specifically.

My first question is about keeping strong references to objects (the type object for AttributeError). Is it an issue? If it is an issue, it's maybe better to not modify the code :-)

Yes, the lazy formatting idea can be applied to various other exceptions. For example, the TypeError message is usually built using PyErr_Format() to mention the name of the invalid type. Example:

    PyErr_Format(PyExc_TypeError,
                 "exec() arg 2 must be a dict, not %.100s",
                 globals->ob_type->tp_name);

But it's not easy to store arbitrary C types for PyUnicode_FromFormat() parameters. Argument types can be char*, Py_ssize_t, PyObject*, int, etc.

I proposed to modify (first/only) AttributeError, because it is more common to ignore the AttributeError than other errors like TypeError. (TypeError or UnicodeDecodeError are usually unexpected.)

>> It would be nice to only format the message on demand. The >> AttributeError would keep a reference to the type. > > Since only the type name is used, why keep a reference to the type > instead of just type.__name__? In the C language, type.__name__ does not exist, it's a char* object. If the type object is destroyed, type->tp_name becomes an invalid pointer. So AttributeError should keep a reference on the type object.

>> AttributeError.args would be (type, attr) instead of (message,). >> ImportError was also modified to add a new "name" attribute. > > I don't like changing the signature of AttributeError. I've got code > that raises AttributeError explicitly. The constructor may support different signatures for backward compatibility: AttributeError(message: str) and AttributeError(type: type, attr: str).

I'm asking if anyone relies on the AttributeError.args attribute. Victor From storchaka at gmail.com Thu Nov 7 13:19:18 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 07 Nov 2013 14:19:18 +0200 Subject: [Python-Dev] Avoid formatting an error message on attribute error In-Reply-To: References: Message-ID: 07.11.13 00:32, Victor Stinner wrote: > I'm trying to avoid unnecessary temporary Unicode strings when > possible (see issue #19512).
While working on this, I saw that Python > likes preparing an user friendly message to explain why getting an > attribute failed. The problem is that in most cases, the caller > doesn't care of the message: the exception is simply deleted. For > example, hasattr() deletes immediatly the AttributeError. > > It would be nice to only format the message on demand. The > AttributeError would keep a reference to the type. Keeping a strong > reference to the type might change the behaviour of some applications > in some corner cases. (Holding a reference to the object would be > worse, and the type looks to be preferred over the type to format the > error message.) See also: http://bugs.python.org/issue18156 : Add an 'attr' attribute to AttributeError http://bugs.python.org/issue18162 : Add index attribute to IndexError http://bugs.python.org/issue18163 : Add a 'key' attribute to KeyError From ncoghlan at gmail.com Thu Nov 7 13:29:36 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 7 Nov 2013 22:29:36 +1000 Subject: [Python-Dev] Issue 19332: Guard against changing dict during iteration In-Reply-To: References: <5279C7B1.30300@stoneleaf.us> Message-ID: On 7 Nov 2013 03:18, "Antoine Pitrou" wrote: > > Le 06/11/2013 06:41, Nick Coghlan a ?crit : > >> >> The behaviour of mutating builtin containers while iterating over them >> is formally undefined beyond "it won't segfault" (one of the few such >> undefined behaviours in Python). The associated exceptions are thus >> strictly "best effort given other constraints". > > > Not sure what you mean with "formally undefined". For example, you can perfectly well change a list's contents while iterating over it, and I bet there's a lot of code that relies on that, as in:: > > for i, value in enumerate(mylist): > if some_condition(value): > mylist[i] = some_function(value) > > If you change "builtin containers" to "builtin unordered containers", then you probably are closer to the truth :) I meant changing the length rather than just the contents of an existing entry. Even lists get a little odd if you add and remove entries while iterating (although I agree the sequence case is much better defined than the unordered container case). Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Nov 7 13:41:09 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 7 Nov 2013 22:41:09 +1000 Subject: [Python-Dev] Avoid formatting an error message on attribute error In-Reply-To: References: <20131107002031.GQ15615@ando> Message-ID: On 7 Nov 2013 21:34, "Victor Stinner" wrote: > > 2013/11/7 Steven D'Aprano : > > My initial instinct here was to say that sounded like premature > > optimization, but to my surprise the overhead of generating the error > > message is actually significant -- at least from pure Python 3.3 code. > > I ran a quick and dirty benchmark by replacing the error message with None. > > Original: > > $ ./python -m timeit 'hasattr(1, "y")' > 1000000 loops, best of 3: 0.354 usec per loop > $ ./python -m timeit -s 's=1' 'try: s.y' 'except AttributeError: pass' > 1000000 loops, best of 3: 0.471 usec per loop > > Patched: > > $ ./python -m timeit 'hasattr(1, "y")' > 10000000 loops, best of 3: 0.106 usec per loop > $ ./python -m timeit -s 's=1' 'try: s.y' 'except AttributeError: pass' > 10000000 loops, best of 3: 0.191 usec per loop > > hasattr() is 3.3x faster and try/except is 2.4x faster on such micro benchmark. 
> > > Given that, I wonder whether it would be worth coming up with a more > > general solution to the question of lazily generating error messages > > rather than changing AttributeError specifically. > > My first question is about keeping strong references to objects (type > object for AttributeError). Is it an issue? If it is an issue, it's > maybe better to not modify the code :-) > > > Yes, the lazy formatting idea can be applied to various other > exceptions. For example, TypeError message is usually build using > PyErr_Format() to mention the name of the invalid type. Example: > > PyErr_Format(PyExc_TypeError, "exec() arg 2 must be a dict, not %.100s", > globals->ob_type->tp_name); > > But it's not easy to store arbitary C types for PyUnicode_FromFormat() > parameters. Argument types can be char*, Py_ssize_t, PyObject*, int, > etc. > > I proposed to modify (first/only) AttributeError, because it is more > common to ignore the AttributeError than other errors like TypeError. > (TypeError or UnicodeDecodeError are usually unexpected.) > > >> It would be nice to only format the message on demand. The > >> AttributeError would keep a reference to the type. > > > > Since only the type name is used, why keep a reference to the type > > instead of just type.__name__? > > In the C language, type.__name__ does not exist, it's a char* object. > If the type object is destroyed, type->tp_name becomes an invalid > pointer. So AttributeError should keep a reference on the type object. > > >> AttributeError.args would be (type, attr) instead of (message,). > >> ImportError was also modified to add a new "name "attribute". The existing signature continued to be supported, though. > > > > I don't like changing the signature of AttributeError. I've got code > > that raises AttributeError explicitly. > > The constructor may support different signature for backward > compatibility: AttributeError(message: str) and AttributeError(type: > type, attr: str). > > I'm asking if anyone relies on AttributeError.args attribute. The bigger problem is you can't change the constructor signature in a backwards incompatible way. You would need a new class method as an alternative constructor instead, or else use optional parameters. Cheers, Nick. > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From theller at ctypes.org Thu Nov 7 13:44:49 2013 From: theller at ctypes.org (Thomas Heller) Date: Thu, 07 Nov 2013 13:44:49 +0100 Subject: [Python-Dev] PEP 384 (stable api) question Message-ID: PEP 384 describes the stable Python api, available when Py_LIMITED_API is defined. However, there are some (small) changes in the function prototypes available, one example is (in Python 3.3): PyObject* PyObject_CallFunction(PyObject *callable, char *format, ...) which changed in Python 3.4 to 'const char *format' for the third argument. I know that this is only a subtle difference, but in my case it gives compiler warnings when I compile my stuff (my stuff is a little bit special, I have to admit, but anyway). I thought that the stable API would keep exactly the same across releases - is this expectation wrong or is this a bug? 
Thanks, Thomas From oscar.j.benjamin at gmail.com Thu Nov 7 14:53:57 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Thu, 7 Nov 2013 13:53:57 +0000 Subject: [Python-Dev] PEP 384 (stable api) question In-Reply-To: References: Message-ID: On 7 November 2013 12:44, Thomas Heller wrote: > PEP 384 describes the stable Python api, available when > Py_LIMITED_API is defined. > > However, there are some (small) changes in the function prototypes > available, one example is (in Python 3.3): > PyObject* PyObject_CallFunction(PyObject *callable, char *format, ...) > > which changed in Python 3.4 to 'const char *format' for the > third argument. > > I know that this is only a subtle difference, but in my case it gives > compiler warnings when I compile my stuff (my stuff is a little bit > special, I have to admit, but anyway). > > I thought that the stable API would keep exactly the same across > releases - is this expectation wrong or is this a bug? I think you're confusing API with ABI. PEP 384 describes a stable ABI. This means that an extension module can be *binary* compatible with several different Python versions without needing to be recompiled for each particular CPython release. I don't think the change you refer to breaks binary compatibility. Oscar From christian at python.org Thu Nov 7 19:13:38 2013 From: christian at python.org (Christian Heimes) Date: Thu, 07 Nov 2013 19:13:38 +0100 Subject: [Python-Dev] Simplify and unify SSL verification Message-ID: Hi,

I'd like to simplify, unify and tighten SSL handling and verification code in Python 3.5. Therefore I propose to deprecate some features for Python 3.4. SSLContext objects are going to be the central instance for configuration.

In order to achieve the goal I propose to

- deprecate the key_file, cert_file and check_hostname arguments in various APIs in favor of context. Python 3.5 will no longer have these arguments in http.client, smtplib etc.

- deprecate implicit verify_mode=CERT_NONE. Python 3.5 will default to CERT_REQUIRED.

- enforce hostname matching in CERT_OPTIONAL or CERT_REQUIRED mode by default in Python 3.5.

- add ssl.get_default_context() to acquire a default SSLContext object if the context argument is None. (Python 3.4)

- add a check_cert option to SSLContext and a check_cert() function to SSLSocket in order to unify certificate validation and hostname matching. (Python 3.4)

The SSLContext's check_cert option will support four values:

None (default)
    use match_hostname() if verify_mode is CERT_OPTIONAL or CERT_REQUIRED
True
    always use match_hostname()
False
    never use match_hostname()
callable function:
    call func(sslsock, hostname, initiator, **kwargs)
    The initiator argument is the object that has initiated the SSL
    connection, e.g. an http.client.HTTPSConnection object.
    sslsock.context points to its SSLContext object

A connect method may look like this:

    def connect(self):
        hostname = self.hostname
        s = socket.create_connection((hostname, self.port))
        sock = self.context.wrap_socket(s, server_hostname=hostname)
        sock.check_cert(hostname, initiator=self)
        return sock

The check_cert option for SSLContext makes it easy to customize cert handling. Application developers just need to configure an SSLContext object in order to customize SSL cert verification and matching. In Python 3.4 it will be possible to get the full peer chain, cert information in CERT_NONE verification mode and SPKI data for SSL pinning. A custom check_cert callback can be used to validate self-signed certs or test for pinned certificates.
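To give an idea, a pinning callback under the proposed signature could look roughly like this (a sketch only: ssl.match_hostname() and SSLSocket.getpeercert() exist today, while the check_cert hook itself is the proposed API, and pinning the digest of the whole DER certificate is a crude stand-in for real SPKI pinning):

    import hashlib
    import ssl

    PINS = {"example.org": b"..."}   # made-up pin store, filled elsewhere

    def check_cert(sslsock, hostname, initiator, **kwargs):
        # standard hostname matching first
        ssl.match_hostname(sslsock.getpeercert(), hostname)
        # then compare a digest of the peer's DER certificate to the pin
        digest = hashlib.sha256(sslsock.getpeercert(binary_form=True)).digest()
        expected = PINS.get(hostname)
        if expected is not None and digest != expected:
            raise ssl.CertificateError("certificate pin mismatch for %r"
                                       % hostname)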
The proposal does NOT address the CA certificates difficulty. It's a different story that needs to be addressed for all platforms separately. Most Linux distributions, (Net|Free|Open) BSD and Mac Ports come with SSL certs anyway, although some do not correctly handle the certs' purposes. I have working code for Windows' cert system store that will land in 3.4. Native Mac OS X hasn't been addressed yet. AIX, HP-UX, Solaris etc. don't come with CA certs. Christian From martin at v.loewis.de Thu Nov 7 19:35:08 2013 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 07 Nov 2013 19:35:08 +0100 Subject: [Python-Dev] PEP 384 (stable api) question In-Reply-To: References: Message-ID: <527BDD5C.1090801@v.loewis.de> Am 07.11.13 13:44, schrieb Thomas Heller: > I thought that the stable API would keep exactly the same across > releases - is this expectation wrong or is this a bug? Oscar is right - this change doesn't affect the ABI, just the API. That said, please file an issue reporting what change you see in your compiler warnings. 3.4 is not released yet, perhaps there is something that can be done. Regards, Martin From brett at python.org Thu Nov 7 19:44:39 2013 From: brett at python.org (Brett Cannon) Date: Thu, 7 Nov 2013 13:44:39 -0500 Subject: [Python-Dev] Avoid formatting an error message on attribute error In-Reply-To: References: <20131107002031.GQ15615@ando> Message-ID: On Thu, Nov 7, 2013 at 7:41 AM, Nick Coghlan wrote: > > On 7 Nov 2013 21:34, "Victor Stinner" wrote: > > > > 2013/11/7 Steven D'Aprano : > > > My initial instinct here was to say that sounded like premature > > > optimization, but to my surprise the overhead of generating the error > > > message is actually significant -- at least from pure Python 3.3 code. > > > > I ran a quick and dirty benchmark by replacing the error message with > None. > > > > Original: > > > > $ ./python -m timeit 'hasattr(1, "y")' > > 1000000 loops, best of 3: 0.354 usec per loop > > $ ./python -m timeit -s 's=1' 'try: s.y' 'except AttributeError: pass' > > 1000000 loops, best of 3: 0.471 usec per loop > > > > Patched: > > > > $ ./python -m timeit 'hasattr(1, "y")' > > 10000000 loops, best of 3: 0.106 usec per loop > > $ ./python -m timeit -s 's=1' 'try: s.y' 'except AttributeError: pass' > > 10000000 loops, best of 3: 0.191 usec per loop > > > > hasattr() is 3.3x faster and try/except is 2.4x faster on such micro > benchmark. > > > > > Given that, I wonder whether it would be worth coming up with a more > > > general solution to the question of lazily generating error messages > > > rather than changing AttributeError specifically. > > > > My first question is about keeping strong references to objects (type > > object for AttributeError). Is it an issue? If it is an issue, it's > > maybe better to not modify the code :-) > > > > > > Yes, the lazy formatting idea can be applied to various other > > exceptions. For example, TypeError message is usually build using > > PyErr_Format() to mention the name of the invalid type. Example: > > > > PyErr_Format(PyExc_TypeError, "exec() arg 2 must be a dict, not > %.100s", > > globals->ob_type->tp_name); > > > > But it's not easy to store arbitary C types for PyUnicode_FromFormat() > > parameters. Argument types can be char*, Py_ssize_t, PyObject*, int, > > etc. > > > > I proposed to modify (first/only) AttributeError, because it is more > > common to ignore the AttributeError than other errors like TypeError. > > (TypeError or UnicodeDecodeError are usually unexpected.) 
> > > > >> It would be nice to only format the message on demand. The > > >> AttributeError would keep a reference to the type. > > > > > > Since only the type name is used, why keep a reference to the type > > > instead of just type.__name__? > > > > In the C language, type.__name__ does not exist, it's a char* object. > > If the type object is destroyed, type->tp_name becomes an invalid > > pointer. So AttributeError should keep a reference on the type object. > > > > >> AttributeError.args would be (type, attr) instead of (message,). > > >> ImportError was also modified to add a new "name" attribute. > > The existing signature continued to be supported, though. > > > > > > > I don't like changing the signature of AttributeError. I've got code > > > that raises AttributeError explicitly. > > > > The constructor may support different signatures for backward > > compatibility: AttributeError(message: str) and AttributeError(type: > > type, attr: str). > > > > I'm asking if anyone relies on the AttributeError.args attribute. > > The bigger problem is you can't change the constructor signature in a > backwards incompatible way. You would need a new class method as an > alternative constructor instead, or else use optional parameters. > The optional parameter approach is the one ImportError took for introducing its `name` and `path` attributes, so there is precedent. Currently it doesn't use a default exception message for backwards-compatibility, but adding one wouldn't be difficult technically. In the case of AttributeError, what you could do is follow the suggestion I made in http://bugs.python.org/issue18156 and add an `attr` keyword-only argument and corresponding attribute to AttributeError. If you then add whatever other keyword arguments are needed to generate a good error message (`object` so you have the target, or at least get the string name to prevent gc issues for large objects?) you could construct the message lazily in the __str__ method very easily while also doing away with inconsistencies in the message which should make doctest users happy. Lazy message creation through
If that's an issue you could make args a (settable) property that dynmically returns str(self) if appropriate: @property def args(self): actual = super().args if actual or self.name is None: return actual return (str(self),) > > In a perfect world (Python 4 maybe?) BaseException would take a single > argument which would be an optional message, `args` wouldn't exist, and > people called `str(exc)` to get the message for the exception. That would > allow subclasses to expand the API with keyword-only arguments to carry > extra info and have reasonable default messages that were built on-demand > when __str__ was called. It would also keep `args` from just being a dumping > ground of stuff that has no structure except by calling convention (which is > not how to do an API; explicit > implicit and all). IOW the original dream > of PEP 352 (http://python.org/dev/peps/pep-0352/#retracted-ideas). This reminds me that I need to revisit that idea of reimplementing all the builtin exceptions in pure Python. :) -eric From brett at python.org Thu Nov 7 20:33:31 2013 From: brett at python.org (Brett Cannon) Date: Thu, 7 Nov 2013 14:33:31 -0500 Subject: [Python-Dev] Avoid formatting an error message on attribute error In-Reply-To: References: <20131107002031.GQ15615@ando> Message-ID: On Thu, Nov 7, 2013 at 2:20 PM, Eric Snow wrote: > On Thu, Nov 7, 2013 at 11:44 AM, Brett Cannon wrote: > > Lazy message creation through > > __str__ does leave the message out of `args`, though. > > If that's an issue you could make args a (settable) property that > dynmically returns str(self) if appropriate: > > @property > def args(self): > actual = super().args > if actual or self.name is None: > return actual > return (str(self),) > Good point. That would solve that backwards-compatibility issue and allow the raising of DeprecationWarning. > > > > > In a perfect world (Python 4 maybe?) BaseException would take a single > > argument which would be an optional message, `args` wouldn't exist, and > > people called `str(exc)` to get the message for the exception. That would > > allow subclasses to expand the API with keyword-only arguments to carry > > extra info and have reasonable default messages that were built on-demand > > when __str__ was called. It would also keep `args` from just being a > dumping > > ground of stuff that has no structure except by calling convention > (which is > > not how to do an API; explicit > implicit and all). IOW the original > dream > > of PEP 352 (http://python.org/dev/peps/pep-0352/#retracted-ideas). > > This reminds me that I need to revisit that idea of reimplementing all > the builtin exceptions in pure Python. :) > Ah, that idea. =) Definitely a question of performance. -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Nov 7 21:45:39 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 07 Nov 2013 21:45:39 +0100 Subject: [Python-Dev] Simplify and unify SSL verification In-Reply-To: References: Message-ID: Le 07/11/2013 19:13, Christian Heimes a ?crit : > Hi, > > I'd like to simplify, unify and tighten SSL handling and verification > code in Python 3.5. > Therefore I propose to deprecate some features for > Python 3.4. SSLContext objects are going to be the central instance for > configuration. > > In order to archive the goal I propose to > > - deprecate the key_file, cert_file and check_hostname arguments in > various in favor of context. 
Python 3.5 will no longer have these > arguments in http.client, smtplib etc. I'm in favour but I think 3.5 is too early. Keep in mind SSLContext is quite young. > - deprecate implicit verify_mode=CERT_NONE. Python 3.5 will default > to CERT_REQUIRED. -0.9. This breaks compatibility and doesn't achieve anything, since there's no reliable story for CA certs. > - enforce hostname matching in CERT_OPTIONAL or CERT_REQUIRED mode > by default in Python 3.5. Service matching is an application protocol level thing, it isn't strictly part of SSL. I'd feel warmer if you removed the "by default" part. > - add ssl.get_default_context() to acquire a default SSLContext object > if the context argument is None. (Python 3.4) I think I'm against it, for two reasons: 1. It will in the long term create compatibility and/or security issues (we'll have to keep fragile legacy settings if we want to avoid breaking existing uses of the default context) 2. SSLContexts are mutable, which means some code can affect other code's behaviour without even knowing. It's a recipe for subtle bugs and security issues. Applications and/or higher-level libraries should be smart enough to choose their own security preferences, and it's the better place to do so anyway. > - add check_cert option to SSLContext and a check_cert() function to > SSLSocket in order to unify certificate validation and hostname > matching. (Python 3.4) The check_cert() function wouldn't achieve anything besides calling match_hostname(), right? I'm against calling it check_cert(), it's too generic and only induces confusion. > The SSLContext's check_cert option will support four values: > > None (default) > use match_hostname() if verify_mode is CERT_OPTIONAL or > CERT_REQUIRED > True > always use match_hostname() > False > never use match_hostname() What is the hostname that the cert is matched against? > callable function: > call func(sslsock, hostname, initiator, **kwargs) > The initiator argument is the object that has initiated the > SSL connection, e.g. http.client.HTTPSConnection object. > sslsock.context points to its SSLContext object Where does the "hostname" come from? And why is there an "initiator" object? Personally I prefer to avoid passing opaque user data, since the caller can use a closure anyway. And what are the **kwargs? > A connect method may look like this: > > def connect(self): > hostname = self.hostname > s = socket.create_connection((hostname, self.port)) > sock = self.context.wrap_socket(s, > server_hostname=hostname) > sock.check_cert(hostname, initiator=self) So what does this bring compared to the status quo? I thought at least sock.check_cert() would be called automatically, but if you have to call it yourself it starts looking pointless, IMO. Regards Antoine.
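As a concrete picture of the status quo Antoine is comparing against: with the APIs that already exist, chain validation and hostname matching are two separate manual steps, and nothing calls the second one for you. A minimal sketch using only existing APIs (the function name and the error handling are illustrative, not stdlib code):

    import socket
    import ssl

    def open_verified_connection(hostname, port=443):
        ctx = ssl.SSLContext(ssl.PROTOCOL_SSLv23)
        ctx.options |= ssl.OP_NO_SSLv2
        ctx.verify_mode = ssl.CERT_REQUIRED   # step 1: validate the chain
        ctx.set_default_verify_paths()        # platform CA certs, where available
        sock = ctx.wrap_socket(socket.create_connection((hostname, port)),
                               server_hostname=hostname)
        try:
            # step 2: match the cert against the host name -- nothing
            # does this automatically, so it is easy to forget
            ssl.match_hostname(sock.getpeercert(), hostname)
        except ssl.CertificateError:
            sock.close()
            raise
        return sock

Forgetting that second match_hostname() call is exactly the silent failure mode the check_cert/verifier proposals in this thread are trying to close off.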
From benjamin at python.org Thu Nov 7 22:09:03 2013 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 7 Nov 2013 16:09:03 -0500 Subject: [Python-Dev] [Python-checkins] cpython: Fix _Py_normalize_encoding(): ensure that buffer is big enough to store "utf-8" In-Reply-To: <3dFxbk1dYKz7Lmg@mail.python.org> References: <3dFxbk1dYKz7Lmg@mail.python.org> Message-ID: 2013/11/7 victor.stinner : > http://hg.python.org/cpython/rev/99afa4c74436 > changeset: 86995:99afa4c74436 > user: Victor Stinner > date: Thu Nov 07 13:33:36 2013 +0100 > summary: > Fix _Py_normalize_encoding(): ensure that buffer is big enough to store "utf-8" > if the input string is NULL > > files: > Objects/unicodeobject.c | 2 ++ > 1 files changed, 2 insertions(+), 0 deletions(-) > > > diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c > --- a/Objects/unicodeobject.c > +++ b/Objects/unicodeobject.c > @@ -2983,6 +2983,8 @@ > char *l_end; > > if (encoding == NULL) { > + if (lower_len < 6) How about doing something like strlen("utf-8") rather than hardcoding that? -- Regards, Benjamin From ncoghlan at gmail.com Thu Nov 7 22:35:10 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 8 Nov 2013 07:35:10 +1000 Subject: [Python-Dev] PEP 384 (stable api) question In-Reply-To: <527BDD5C.1090801@v.loewis.de> References: <527BDD5C.1090801@v.loewis.de> Message-ID: On 8 Nov 2013 04:42, Martin v. Löwis wrote: > > On 07.11.13 13:44, Thomas Heller wrote: > > > I thought that the stable API would stay exactly the same across > > releases - is this expectation wrong or is this a bug? > > Oscar is right - this change doesn't affect the ABI, just the API. > > That said, please file an issue reporting what change you see in > your compiler warnings. 3.4 is not released yet, perhaps there is > something that can be done. Thanks for bringing this up, Thomas. I'd spotted warnings from that change in CPython's own build process on Fedora, but not followed it up yet. At a very quick glance, the Fedora issue appeared to be a platform header that publishes a parameter as "char *", but CPython is now calling it with a "const char *" pointer. (I agree with the change to the API signatures in principle, though, since it endeavours to accurately indicate which string inputs CPython may modify and which we guarantee to leave alone. There may just be some specific cases where we can't actually make that guarantee because the underlying platform APIs aren't consistent in their declarations) Cheers, Nick. > > Regards, > Martin > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed...
URL: From victor.stinner at gmail.com Thu Nov 7 22:38:32 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 7 Nov 2013 22:38:32 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Fix _Py_normalize_encoding(): ensure that buffer is big enough to store "utf-8" In-Reply-To: References: <3dFxbk1dYKz7Lmg@mail.python.org> Message-ID: 2013/11/7 Benjamin Peterson : > 2013/11/7 victor.stinner : >> http://hg.python.org/cpython/rev/99afa4c74436 >> changeset: 86995:99afa4c74436 >> user: Victor Stinner >> date: Thu Nov 07 13:33:36 2013 +0100 >> summary: >> Fix _Py_normalize_encoding(): ensure that buffer is big enough to store "utf-8" >> if the input string is NULL >> >> files: >> Objects/unicodeobject.c | 2 ++ >> 1 files changed, 2 insertions(+), 0 deletions(-) >> >> >> diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c >> --- a/Objects/unicodeobject.c >> +++ b/Objects/unicodeobject.c >> @@ -2983,6 +2983,8 @@ >> char *l_end; >> >> if (encoding == NULL) { >> + if (lower_len < 6) >> >> How about doing something like strlen("utf-8") rather than hardcoding that? Full code: if (encoding == NULL) { if (lower_len < 6) return 0; strcpy(lower, "utf-8"); return 1; } In my opinion, it is easy to guess that 6 is len("utf-8") + 1 byte for NUL. Calling strlen() at runtime may slow down a function in the fast path of PyUnicode_Decode() and PyUnicode_AsEncodedString() which are important functions. I know that some compilers can execute strlen() during compilation, but I don't see the need to replace 6 with strlen("utf-8")+1. Victor From christian at python.org Thu Nov 7 22:42:30 2013 From: christian at python.org (Christian Heimes) Date: Thu, 07 Nov 2013 22:42:30 +0100 Subject: [Python-Dev] Simplify and unify SSL verification In-Reply-To: References: Message-ID: On 07.11.2013 21:45, Antoine Pitrou wrote: > I'm in favour but I think 3.5 is too early. Keep in mind SSLContext is > quite young. It's available since 3.3 >> - deprecate implicit verify_mode=CERT_NONE. Python 3.5 will default >> to CERT_REQUIRED. > > -0.9. This breaks compatibility and doesn't achieve anything, since > there's no reliable story for CA certs. I'd like to move to "secure by default". The CA cert situation is solved on most platforms. > Service matching is an application protocol level thing, it isn't > strictly part of SSL. I'd feel warmer if you removed the "by default" part. Hostname matching is done through an explicit call to SSLSocket.check_cert(). All Python stdlib libraries like smtplib, ftplib etc. are going to verify the hostname by default. 3rd party libraries have to call check_cert() explicitly. >> - add ssl.get_default_context() to acquire a default SSLContext object >> if the context argument is None. (Python 3.4) > > I think I'm against it, for two reasons: > > 1. It will in the long term create compatibility and/or security issues > (we'll have to keep fragile legacy settings if we want to avoid breaking > existing uses of the default context) > > 2. SSLContexts are mutable, which means some code can affect other > code's behaviour without even knowing. It's a recipe for subtle bugs and > security issues. > > Applications and/or higher-level libraries should be smart enough to > choose their own security preferences, and it's the better place to do > so anyway. You misunderstood me. I'm not proposing a global SSLContext object but a factory function that creates a context for Python stdlib modules.
Right now urllib, http.client, nntplib, asyncio, ftplib, poplib and imaplib all have duplicated code. I'd like to have ONE function that creates and configures an SSLContext object with sensible default values for Python stdlib. 3rd party apps are free to create their own SSLContext object. > The check_cert() function wouldn't achieve anything besides calling > match_hostname(), right? It does more: http://pastebin.com/gKi7N2WM > I'm against calling it check_cert(), it's too generic and only induces > confusion. Do you have an idea for a better name? > >> The SSLContext's check_cert option will support four values: >> >> None (default) >> use match_hostname() if verify_mode is CERT_OPTIONAL or >> CERT_REQUIRED >> True >> always use match_hostname() >> False >> never use match_hostname() > > What is the hostname that the cert is matched against? The library code must provide it as an argument to check_cert, or it's taken from server_hostname if no hostname is provided. > And why is there an "initiator" object? Personally I prefer to avoid > passing opaque user data, since the caller can use a closure anyway. > And what are the **kwargs? No, they can't use a closure. The callback function is part of the SSLContext object. The context object can be used by multiple SSLSocket objects. The **kwargs make it possible to pass data from the caller of check_cert() to the callback function of the SSLContext instance. > So what does this bring compared to the status quo? I thought at least > sock.check_cert() would be called automatically, but if you have to call > it yourself it starts looking pointless, IMO. As you have said earlier, cert matching is not strictly an SSL connection thing. With the check_cert feature you can do stuff like validation of self-signed certs with known hash sums: def check_cert_cb(sslsock, hostname, initiator, **kwargs): # real code would use SPKI certdigest = sha1(sslsock.getpeercert(True)).digest() if hostname == "my.host.name" and certdigest == b"abcdef...": return True do_other_check(sslsock, hostname) ctx = SSLContext(PROTOCOL_TLSv1, check_cert_cb) ctx.verify_mode = CERT_NONE httpscon = http.client.HTTPSConnection("my.host.name", 443, context=ctx) httpscon = http.client.HTTPSConnection("my.host.name", 443, context=ctx) httpscon = http.client.HTTPSConnection("my.host.name", 443, context=ctx) Christian From donald at stufft.io Thu Nov 7 22:45:37 2013 From: donald at stufft.io (Donald Stufft) Date: Thu, 7 Nov 2013 16:45:37 -0500 Subject: [Python-Dev] Simplify and unify SSL verification In-Reply-To: References: Message-ID: <03FDD10B-C630-4977-8659-A42BBED9104B@stufft.io> On Nov 7, 2013, at 4:42 PM, Christian Heimes wrote: >>> >>> - deprecate implicit verify_mode=CERT_NONE. Python 3.5 will default >>> to CERT_REQUIRED. >> >> -0.9. This breaks compatibility and doesn't achieve anything, since >> there's no reliable story for CA certs. > > I'd like to move to "secure by default". The CA cert situation is solved > on most platforms. Please Yes, secure by default +1000 ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL: From eric at trueblade.com Thu Nov 7 23:06:12 2013 From: eric at trueblade.com (Eric V.
Smith) Date: Thu, 07 Nov 2013 17:06:12 -0500 Subject: [Python-Dev] [Python-checkins] cpython: Fix _Py_normalize_encoding(): ensure that buffer is big enough to store "utf-8" In-Reply-To: References: <3dFxbk1dYKz7Lmg@mail.python.org> Message-ID: <527C0ED4.50702@trueblade.com> On 11/7/2013 4:38 PM, Victor Stinner wrote: > 2013/11/7 Benjamin Peterson : >> 2013/11/7 victor.stinner : >>> http://hg.python.org/cpython/rev/99afa4c74436 >>> changeset: 86995:99afa4c74436 >>> user: Victor Stinner >>> date: Thu Nov 07 13:33:36 2013 +0100 >>> summary: >>> Fix _Py_normalize_encoding(): ensure that buffer is big enough to store "utf-8" >>> if the input string is NULL >>> >>> files: >>> Objects/unicodeobject.c | 2 ++ >>> 1 files changed, 2 insertions(+), 0 deletions(-) >>> >>> >>> diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c >>> --- a/Objects/unicodeobject.c >>> +++ b/Objects/unicodeobject.c >>> @@ -2983,6 +2983,8 @@ >>> char *l_end; >>> >>> if (encoding == NULL) { >>> + if (lower_len < 6) >> >> How about doing something like strlen("utf-8") rather than hardcoding that? > > Full code: > > if (encoding == NULL) { > if (lower_len < 6) > return 0; > strcpy(lower, "utf-8"); > return 1; > } > > In my opinion, it is easy to guess that 6 is len("utf-8") + 1 byte for NUL. > > Calling strlen() at runtime may slow down a function in the fast path > of PyUnicode_Decode() and PyUnicode_AsEncodedString() which are > important functions. I know that some compilers can execute strlen() > during compilation, but I don't see the need to replace 6 with > strlen("utf-8")+1. Then how about at least a comment about how 6 is derived? if (lower_len < 6) /* 6 == strlen("utf-8") + 1 */ return 0; Eric. From ncoghlan at gmail.com Thu Nov 7 23:01:29 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 8 Nov 2013 08:01:29 +1000 Subject: [Python-Dev] Avoid formatting an error message on attribute error In-Reply-To: References: <20131107002031.GQ15615@ando> Message-ID: On 8 Nov 2013 04:44, "Brett Cannon" wrote: > On Thu, Nov 7, 2013 at 7:41 AM, Nick Coghlan wrote: >> >> The bigger problem is you can't change the constructor signature in a backwards incompatible way. You would need a new class method as an alternative constructor instead, or else use optional parameters. > > > The optional parameter approach is the one ImportError took for introducing its `name` and `path` attributes, so there is precedent. Currently it doesn't use a default exception message for backwards-compatibility, but adding one wouldn't be difficult technically. > > In the case of AttributeError, what you could do is follow the suggestion I made in http://bugs.python.org/issue18156 and add an `attr` keyword-only argument and corresponding attribute to AttributeError. If you then add whatever other keyword arguments are needed to generate a good error message (`object` so you have the target, or at least get the string name to prevent gc issues for large objects?) you could construct the message lazily in the __str__ method very easily while also doing away with inconsistencies in the message, which should make doctest users happy. Lazy message creation through __str__ does leave the message out of `args`, though. > > In a perfect world (Python 4 maybe?) BaseException would take a single argument which would be an optional message, `args` wouldn't exist, and people would call `str(exc)` to get the message for the exception.
That would allow subclasses to expand the API with keyword-only arguments to carry extra info and have reasonable default messages that were built on-demand when __str__ was called. It would also keep `args` from just being a dumping ground of stuff that has no structure except by calling convention (which is not how to do an API; explicit > implicit and all). IOW the original dream of PEP 352 ( http://python.org/dev/peps/pep-0352/#retracted-ideas). A related problem (that rich exceptions cause, rather than solve) is one I encountered recently in trying to improve the error messages thrown by the codec machinery to ensure they mention the specific codec operation that failed: you can't easily create a copy of a rich exception with all the same attributes, but a different error message (at least through the C API - copy.copy may work at the Python level, but I don't want to make the codec infrastructure depend on the copy module). Rich exception types that require more than one parameter are also incompatible with PyErr_Format in general. For the current patch [1], I've "solved" the problem by whitelisting a common set of types that the chaining code knows how to handle, but it would be good to have some kind of BaseException.replace() API that did the heavy lifting and returned None rather than throwing an exception if replacement wasn't supported (since blindly bitwise copying instance objects isn't safe for subclasses that add extra C level pointers and throwing an exception would be really annoying for exception chaining code that wants to reraise the original exception if replacement isn't possible). Cheers, Nick. [1] http://bugs.python.org/issue17828 -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Thu Nov 7 23:13:11 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 7 Nov 2013 23:13:11 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Fix _Py_normalize_encoding(): ensure that buffer is big enough to store "utf-8" In-Reply-To: <527C0ED4.50702@trueblade.com> References: <3dFxbk1dYKz7Lmg@mail.python.org> <527C0ED4.50702@trueblade.com> Message-ID: 2013/11/7 Eric V. Smith : > Then how about at least a comment about how 6 is derived? > > if (lower_len < 6) /* 6 == strlen("utf-8") + 1 */ > return 0; Ok, I added the comment in changeset 9c929b9e0a2a. Victor From eric at trueblade.com Thu Nov 7 23:14:18 2013 From: eric at trueblade.com (Eric V. Smith) Date: Thu, 07 Nov 2013 17:14:18 -0500 Subject: [Python-Dev] [Python-checkins] cpython: Fix _Py_normalize_encoding(): ensure that buffer is big enough to store "utf-8" In-Reply-To: References: <3dFxbk1dYKz7Lmg@mail.python.org> <527C0ED4.50702@trueblade.com> Message-ID: <527C10BA.7000804@trueblade.com> On 11/7/2013 5:13 PM, Victor Stinner wrote: > 2013/11/7 Eric V. Smith : >> Then how about at least a comment about how 6 is derived? >> >> if (lower_len < 6) /* 6 == strlen("utf-8") + 1 */ >> return 0; > > Ok, I added the comment in changeset 9c929b9e0a2a. Thanks! Eric. From ncoghlan at gmail.com Thu Nov 7 23:25:25 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 8 Nov 2013 08:25:25 +1000 Subject: [Python-Dev] Simplify and unify SSL verification In-Reply-To: References: Message-ID: On 8 Nov 2013 04:18, "Christian Heimes" wrote: > > Hi, > > I'd like to simplify, unify and tighten SSL handling and verification > code in Python 3.5. Therefore I propose to deprecate some features for > Python 3.4. 
SSLContext objects are going to be the central instance for > configuration. > > In order to achieve the goal I propose to > > - deprecate the key_file, cert_file and check_hostname arguments in > various modules in favor of context. Python 3.5 will no longer have these > arguments in http.client, smtplib etc. > > - deprecate implicit verify_mode=CERT_NONE. Python 3.5 will default > to CERT_REQUIRED. > > - enforce hostname matching in CERT_OPTIONAL or CERT_REQUIRED mode > by default in Python 3.5. > > - add ssl.get_default_context() to acquire a default SSLContext object > if the context argument is None. (Python 3.4) Change this to create_default_context() (to make it clearer there is no global context object) and I'm generally +1. > - add check_cert option to SSLContext and a check_cert() function to > SSLSocket in order to unify certificate validation and hostname > matching. (Python 3.4) > > > The SSLContext's check_cert option will support four values: I suggest just making this a "verifier" argument that's always a callable. > None (default) > use match_hostname() if verify_mode is CERT_OPTIONAL or > CERT_REQUIRED > True > always use match_hostname() > False > never use match_hostname() > callable function: > call func(sslsock, hostname, initiator, **kwargs) This overlaps confusingly with "verify_mode". Perhaps just offer a module level "verify_hostname" API that adapts between the verifier signature and match_hostname, say that is the default verifier, and that the verifier is always called. Also, how does the verifier know the verify_mode? Even if that info is set on the wrapped socket, it may be better to make it an optional argument to verifiers as well, to make it clearer that a verify mode of CERT_NONE may mean verification doesn't actually check anything. For the new method API (name suggestion: verify_cert), I believe that should default to CERT_REQUIRED, regardless of the mode configured on the wrapped socket. > The initiator argument is the object that has initiated the > SSL connection, e.g. http.client.HTTPSConnection object. > sslsock.context points to its SSLContext object > > > A connect method may look like this: > > def connect(self): > hostname = self.hostname > s = socket.create_connection((hostname, self.port)) > sock = self.context.wrap_socket(s, > server_hostname=hostname) > sock.check_cert(hostname, initiator=self) > return sock > > > The check_cert option for SSLContext makes it easy to customize cert > handling. Application developers just need to configure an SSLContext > object in order to customize SSL cert verification and matching. In > Python 3.4 it will be possible to get the full peer chain, cert > information in CERT_NONE verification mode and SPKI data for SSL > pinning. A custom check_cert callback can be used to validate > self-signed certs or test for pinned certificates. > > > The proposal does NOT address the CA certificates difficulty. It's a > different story that needs to be addressed for all platforms separately. > Most Linux distributions, (Net|Free|Open) BSD and Mac Ports come with > SSL certs anyway, although some do not correctly handle the certs' > purposes. I have working code for Windows' cert system store that will > land in 3.4. Native Mac OS X hasn't been addressed yet. AIX, HP-UX, > Solaris etc. don't come with CA certs.
> > Christian > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Thu Nov 7 23:29:37 2013 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 7 Nov 2013 17:29:37 -0500 Subject: [Python-Dev] [Python-checkins] cpython: Fix _Py_normalize_encoding(): ensure that buffer is big enough to store "utf-8" In-Reply-To: References: <3dFxbk1dYKz7Lmg@mail.python.org> Message-ID: 2013/11/7 Victor Stinner : > Calling strlen() at runtime may slow-down a function in the fast-path > of PyUnicode_Decode() and PyUnicode_AsEncodedString() which are > important functions. I know that some developers can execute strlen() > during compilation, but I don't see the need of replacing 6 with > strlen("utf-8")+1. GCC will fold constant strlen calls. Another option would be to use sizeof(). -- Regards, Benjamin From barry at python.org Fri Nov 8 00:09:40 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 7 Nov 2013 18:09:40 -0500 Subject: [Python-Dev] Simplify and unify SSL verification In-Reply-To: References: Message-ID: <20131107180940.650749dd@anarchist> On Nov 07, 2013, at 10:42 PM, Christian Heimes wrote: >You misunderstood me. I'm not proposing a global SSLContext object but a >factory function that creates a context for Python stdlib modules. Right >now every urllib, http.client, nntplib, asyncio, ftplib, poplib and >imaplib have duplicated code. I'd like to have ONE function that creates >and configures a SSLContext object with sensible default values for >Python stdlib. I'm sure you're considering this, but I want to explicitly preserve the ability to register self-signed certificates. It's often necessary in practice, but very useful for testing purposes. ssl.SSLContext.load_cert_chain() is the way to do this, but will this be exposed in your proposed factory function? If not, then I think it's critically important that whatever API is exposed in the client code not hide the SSLContext object, such that clients of the client code can load up those self-signed certificates after the context has been created. -Barry From ncoghlan at gmail.com Fri Nov 8 00:16:13 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 8 Nov 2013 09:16:13 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Issue #19512: Add PyRun_InteractiveOneObject() function In-Reply-To: <3dFFXL6Cxgz7LsB@mail.python.org> References: <3dFFXL6Cxgz7LsB@mail.python.org> Message-ID: New C APIs should either be documented or have an underscore prefix. Also, if they're part of the stable ABI, they need a version guard. Wishlist item: an automated ABI checker that can diff the exported symbols against a reference list (Could ctypes or cffi be used for that?) As long as this kind of thing involves manual review, we're going to keep making similar mistakes :P That said, a potentially simpler first step would be a list of common mistakes to check for/questions to ask during reviews as part of the devguide. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Fri Nov 8 00:18:45 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 8 Nov 2013 09:18:45 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Issue #19512: Add a new _PyDict_DelItemId() function, similar to In-Reply-To: <3dFG004BDdz7Lpy@mail.python.org> References: <3dFG004BDdz7Lpy@mail.python.org> Message-ID: On 7 Nov 2013 04:06, "victor.stinner" wrote: > > http://hg.python.org/cpython/rev/69071054b42f > changeset: 86968:69071054b42f > user: Victor Stinner > date: Wed Nov 06 18:58:22 2013 +0100 > summary: > Issue #19512: Add a new _PyDict_DelItemId() function, similar to > PyDict_DelItemString() but using an identifier for the key > > files: > Include/dictobject.h | 1 + > Objects/dictobject.c | 9 +++++++++ > 2 files changed, 10 insertions(+), 0 deletions(-) > > > diff --git a/Include/dictobject.h b/Include/dictobject.h > --- a/Include/dictobject.h > +++ b/Include/dictobject.h > @@ -109,6 +109,7 @@ > PyAPI_FUNC(int) PyDict_SetItemString(PyObject *dp, const char *key, PyObject *item); > PyAPI_FUNC(int) _PyDict_SetItemId(PyObject *dp, struct _Py_Identifier *key, PyObject *item); > PyAPI_FUNC(int) PyDict_DelItemString(PyObject *dp, const char *key); > +PyAPI_FUNC(int) _PyDict_DelItemId(PyObject *mp, struct _Py_Identifier *key); As a private API, this shouldn't be part of the stable ABI. Cheers, Nick. > > #ifndef Py_LIMITED_API > int _PyObjectDict_SetItem(PyTypeObject *tp, PyObject **dictptr, PyObject *name, PyObject *value); > diff --git a/Objects/dictobject.c b/Objects/dictobject.c > --- a/Objects/dictobject.c > +++ b/Objects/dictobject.c > @@ -2736,6 +2736,15 @@ > } > > int > +_PyDict_DelItemId(PyObject *v, _Py_Identifier *key) > +{ > + PyObject *kv = _PyUnicode_FromId(key); /* borrowed */ > + if (kv == NULL) > + return -1; > + return PyDict_DelItem(v, kv); > +} > + > +int > PyDict_DelItemString(PyObject *v, const char *key) > { > PyObject *kv; > > -- > Repository URL: http://hg.python.org/cpython > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > https://mail.python.org/mailman/listinfo/python-checkins > -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Fri Nov 8 00:27:16 2013 From: christian at python.org (Christian Heimes) Date: Fri, 08 Nov 2013 00:27:16 +0100 Subject: [Python-Dev] Simplify and unify SSL verification In-Reply-To: References: Message-ID: <527C21D4.9020601@python.org> On 07.11.2013 23:25, Nick Coghlan wrote: > Change this to create_default_context() (to make it clearer there is no > global context object) and I'm generally +1. Good idea! > This overlaps confusingly with "verify_mode". Perhaps just offer a > module level "verify_hostname" API that adapts between the verifier > signature and match_hostname, say that is the default verifier, and that > the verifier is always called. verify_mode = CERT_REQUIRED checks a lot of stuff like certificate chain, expiration date, cert purpose and all the other crypto + X.509 magic. It doesn't check that the cert data matches the host name, though. My proposal tries to make the SSLContext object *the* central configuration point for cert validation and give the user an API for a callback that e.g. can check a self-signed certificate or do cert pinning. It already holds all options (protocol, verify mode, CA certs, cert, key, cert chain, NPN protocol, ciphers ...). The context also holds information for SSL session resumption to speed up future connections.
It's just a way to configure hostname matching and post-connection checks. > Also, how does the verifier know the verify_mode? Even if that info is > set on the wrapped socket, it may be better to make it an optional > argument to verifiers as well, to make it clearer that a verify mode of > CERT_NONE may mean verification doesn't actually check anything. It involves some layers of indirection. Since Python 3.2 all SSL sockets are created from an SSLContext object. Even SSLSocket.__init__() and ssl.wrap_socket() create an SSLContext object first. The SSLSocket instance has a reference to its SSLContext in SSLSocket.context. Internally SSLSocket.check_cert(hostname, **kwargs) calls self.context.check_cert(self, hostname, **kwargs) of its SSLContext. SSLSocket.check_cert() also has an option to shut down the SSL connection gracefully when an exception occurs. > For the new method API (name suggestion: verify_cert), I believe that > should default to CERT_REQUIRED, regardless of the mode configured on > the wrapped socket. I don't want to create more confusion between verify_mode and the new feature, so I didn't use the term "verify" in the method name. Do you have a good idea for a better name that does not contain verify? Christian From victor.stinner at gmail.com Fri Nov 8 00:43:37 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 8 Nov 2013 00:43:37 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Issue #19512: Add PyRun_InteractiveOneObject() function In-Reply-To: References: <3dFFXL6Cxgz7LsB@mail.python.org> Message-ID: Hi, [PyRun_InteractiveOneObject()] 2013/11/8 Nick Coghlan : > New C APIs should either be documented or have an underscore prefix. I created the issue #19518 to add documentation (but also to add even more functions :-)). > Also, if they're part of the stable ABI, they need a version guard. Like most PyRun_xxx() functions, the function is declared in a "#ifndef Py_LIMITED_API" block. It means that it is not part of the stable ABI, is that correct? > Wishlist item: an automated ABI checker that can diff the exported symbols > against a reference list (Could ctypes or cffi be used for that?) Yeah, it would be nice to have a tool to list changes on the stable ABI :-) * * * For a previous issue, I also added various other functions like PyParser_ASTFromStringObject() or PyParser_ASTFromFileObject(). Like PyRun_InteractiveOneObject(), they are declared in a "#ifndef Py_LIMITED_API" block in pythonrun.h. These new functions are not documented, because PyParser_AST*() functions are not documented. http://bugs.python.org/issue11619 http://hg.python.org/cpython/rev/df2fdd42b375 The patch parser_unicode.patch was attached to the issue for 2 years. First I didn't want to commit it because it added too many new functions. But I pushed my change because some Windows users like using the full Unicode range in the name of their scripts and modules! If it's not too late, I would appreciate a review of the stable ABI of this changeset.
Victor From victor.stinner at gmail.com Fri Nov 8 00:46:40 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 8 Nov 2013 00:46:40 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Issue #19512: Add a new _PyDict_DelItemId() function, similar to In-Reply-To: References: <3dFG004BDdz7Lpy@mail.python.org> Message-ID: 2013/11/8 Nick Coghlan : >> http://hg.python.org/cpython/rev/69071054b42f >> changeset: 86968:69071054b42f >> user: Victor Stinner >> date: Wed Nov 06 18:58:22 2013 +0100 >> summary: >> Issue #19512: Add a new _PyDict_DelItemId() function, similar to >> PyDict_DelItemString() but using an identifier for the key >> ... > > As a private API, this shouldn't be part of the stable ABI. When I don't know if a function should be made public or not, I compare with other functions. In Python 3.3, _PyDict_GetItemIdWithError(), _PyDict_GetItemId() and _PyDict_SetItemId() are part of the stable ABI if I read correctly dictobject.h. _PyObject_GetAttrId() is also part of the stable ABI. Was it a mistake, or did I misunderstand how stable functions are declared? Victor From ncoghlan at gmail.com Fri Nov 8 00:50:35 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 8 Nov 2013 09:50:35 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Issue #19512: Add PyRun_InteractiveOneObject() function In-Reply-To: References: <3dFFXL6Cxgz7LsB@mail.python.org> Message-ID: On 8 Nov 2013 09:45, "Victor Stinner" wrote: > > Hi, > > [PyRun_InteractiveOneObject()] > > 2013/11/8 Nick Coghlan : > > New C APIs should either be documented or have an underscore prefix. > > I created the issue #19518 to add documentation (but also to add even > more functions :-)). > > > Also, if they're part of the stable ABI, they need a version guard. > > Like most PyRun_xxx() functions, the function is declared in a "#ifndef > Py_LIMITED_API" block. It means that it is not part of the stable ABI, > is that correct? Yeah, that's fine - I'm just replying on my phone at the moment, so checking the broader file context is a pain :) Cheers, Nick. > > > Wishlist item: an automated ABI checker that can diff the exported symbols > > against a reference list (Could ctypes or cffi be used for that?) > > Yeah, it would be nice to have a tool to list changes on the stable ABI :-) > > * * * > > For a previous issue, I also added various other functions like > PyParser_ASTFromStringObject() or PyParser_ASTFromFileObject(). Like > PyRun_InteractiveOneObject(), they are declared in a "#ifndef > Py_LIMITED_API" block in pythonrun.h. These new functions are not > documented, because PyParser_AST*() functions are not documented. > > http://bugs.python.org/issue11619 > http://hg.python.org/cpython/rev/df2fdd42b375 > > The patch parser_unicode.patch was attached to the issue for 2 > years. First I didn't want to commit it because it added too many new > functions. But I pushed my change because some Windows users like > using the full Unicode range in the name of their scripts and modules! > > If it's not too late, I would appreciate a review of the stable ABI of > this changeset. > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed...
URL: From christian at python.org Fri Nov 8 00:50:57 2013 From: christian at python.org (Christian Heimes) Date: Fri, 08 Nov 2013 00:50:57 +0100 Subject: [Python-Dev] Simplify and unify SSL verification In-Reply-To: <20131107180940.650749dd@anarchist> References: <20131107180940.650749dd@anarchist> Message-ID: On 08.11.2013 00:09, Barry Warsaw wrote: > I'm sure you're considering this, but I want to explicitly preserve the > ability to register self-signed certificates. It's often necessary in > practice, but very useful for testing purposes. > > ssl.SSLContext.load_cert_chain() is the way to do this, but will this be > exposed in your proposed factory function? If not, then I think it's > critically important that whatever API is exposed in the client code not hide > the SSLContext object, such that clients of the client code can load up those > self-signed certificates after the context has been created. If you want full control over the context then you can still create your own context object. Nobody is going to stop you from that. The factory function removes code duplication. Right now 6 modules have the same code for PROTOCOL_SSLv23 with OP_NO_SSLv2. Old code -------- class HTTPSConnection: def __init__(self, hostname, port, key_file=None, cert_file=None, context=None): if context is None: # Some reasonable defaults context = ssl.SSLContext(ssl.PROTOCOL_SSLv23) context.options |= ssl.OP_NO_SSLv2 if key_file or cert_file: context.load_cert_chain(cert_file, key_file) New code -------- def create_default_context(protocol=None): if protocol is None: context = SSLContext(PROTOCOL_SSLv23) context.options |= OP_NO_SSLv2 else: context = SSLContext(protocol) return context class HTTPSConnection: def __init__(self, hostname, port, context=None): if context is None: context = ssl.create_default_context() self.context = context If you want full control ------------------------ barrys_special_context = ssl.SSLContext(ssl.PROTOCOL_TLSv1) barrys_special_context.load_cert_chain(cert_file, key_file) con = HTTPSConnection(host, port, barrys_special_context) With my proposed new option for SSLContext() you also gain full control over hostname matching and extra cert checks. Super Barry power! :) Christian From victor.stinner at gmail.com Fri Nov 8 00:55:54 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 8 Nov 2013 00:55:54 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Issue #19512: Add PyRun_InteractiveOneObject() function In-Reply-To: References: <3dFFXL6Cxgz7LsB@mail.python.org> Message-ID: 2013/11/8 Nick Coghlan : >> [PyRun_InteractiveOneObject()] >> (...) >> > Also, if they're part of the stable ABI, they need a version guard. >> >> Like most PyRun_xxx() functions, the function is declared in a "#ifndef >> Py_LIMITED_API" block. It means that it is not part of the stable ABI, >> is that correct? > > Yeah, that's fine - I'm just replying on my phone at the moment, so checking > the broader file context is a pain :) About the 72523 functions PyRun_xxx(), I don't understand something. PyRun_FileEx() is documented in the Python "very high level" C API to use Python. But this function is not part of the stable ABI. So there is no "very high level" function in the stable ABI to use Python? Another question: it's not documented whether a function is part of the stable ABI. So as a user of the API, it is hard to check whether a function is part of the stable ABI or not.
Victor From victor.stinner at gmail.com Fri Nov 8 00:58:43 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 8 Nov 2013 00:58:43 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Issue #19512: Add PyRun_InteractiveOneObject() function In-Reply-To: References: <3dFFXL6Cxgz7LsB@mail.python.org> Message-ID: 2013/11/8 Victor Stinner : > Another question: it's not documented whether a function is part of > the stable ABI. So as a user of the API, it is hard to check > whether a function is part of the stable ABI or not. A solution for that may be to split the documentation of the C API in two parts: * Stable C ABI * Private C ABI If it's stupid, at least functions of the stable ABI should be marked in the documentation. Victor From christian at python.org Fri Nov 8 01:02:33 2013 From: christian at python.org (Christian Heimes) Date: Fri, 08 Nov 2013 01:02:33 +0100 Subject: [Python-Dev] Simplify and unify SSL verification In-Reply-To: References: <527C21D4.9020601@python.org> Message-ID: <527C2A19.6010709@python.org> Somehow your mail didn't end up on Python-dev On 08.11.2013 00:38, Nick Coghlan wrote: > In that case, it sounds like you need *two* new options rather than > one. "verify_hostname", with the None/True/False behaviour and a > separate postverify hook. Mmmh, yes, you are making an intriguing point. Two different options are easier to understand and more powerful. > It contains the word verify, but if I'm correct in thinking you > intend for the new callback to be invoked only if the checks > specified by verify_mode pass, then I would suggest "postverify", > and skip adding the separate method. The tests specified by verify_mode are done by OpenSSL during the protocol handshake. The SSLSocket object has no peer, peer cert and transport information before the handshake is done. So yes, these checks are always done before Python can match the hostname of the peer's cert and before the postverify hook can run. OpenSSL has a verify callback hook that is called for each certificate in the trust chain starting with the peer cert up to a root cert. This callback is too low level and too complex to be useful for the majority of users. Python would also have to gain wrappers for X509_STORE and X509_STORE_CTX objects... You don't want to know the difference :) From ericsnowcurrently at gmail.com Fri Nov 8 01:15:35 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 7 Nov 2013 17:15:35 -0700 Subject: [Python-Dev] [Python-checkins] cpython: Issue #19512: Add PyRun_InteractiveOneObject() function In-Reply-To: References: <3dFFXL6Cxgz7LsB@mail.python.org> Message-ID: On Thu, Nov 7, 2013 at 4:55 PM, Victor Stinner wrote: > About the 72523 functions PyRun_xxx(), I don't understand something. > PyRun_FileEx() is documented in the Python "very high level" C API to > use Python. But this function is not part of the stable ABI. So there > is no "very high level" function in the stable ABI to use Python? My understanding is that if you are using the PyRun_* API (embedding?), you aren't concerned with maintaining binary compatibility across Python versions. If you are writing an extension module, you probably aren't using PyRun_*. > > Another question: it's not documented whether a function is part of > the stable ABI. So as a user of the API, it is hard to check > whether a function is part of the stable ABI or not. The best we have is probably PEP 384.
I've been meaning to work on the C-API docs for a while and add a concise reference page that would include a summary of the stable API. Alas, other things have taken precedence. -eric From solipsis at pitrou.net Fri Nov 8 08:42:53 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 08 Nov 2013 08:42:53 +0100 Subject: [Python-Dev] Simplify and unify SSL verification In-Reply-To: References: Message-ID: On 07/11/2013 22:42, Christian Heimes wrote: > On 07.11.2013 21:45, Antoine Pitrou wrote: >> I'm in favour but I think 3.5 is too early. Keep in mind SSLContext is >> quite young. > > It's available since 3.3 3.2 actually, while many code bases are still 2.x-compatible. There's no need to make migration more annoying right now. >>> - deprecate implicit verify_mode=CERT_NONE. Python 3.5 will default >>> to CERT_REQUIRED. >> >> -0.9. This breaks compatibility and doesn't achieve anything, since >> there's no reliable story for CA certs. > > I'd like to move to "secure by default". The CA cert situation is solved > on most platforms. If it's not solved on *all* platforms then you're introducing a nasty discrepancy between platforms. That said, "secure by default" could be a property of the "default context factory" (see below). >>> - add ssl.get_default_context() to acquire a default SSLContext object >>> if the context argument is None. (Python 3.4) >> >> I think I'm against it, for two reasons: >> >> 1. It will in the long term create compatibility and/or security issues >> (we'll have to keep fragile legacy settings if we want to avoid breaking >> existing uses of the default context) >> >> 2. SSLContexts are mutable, which means some code can affect other >> code's behaviour without even knowing. It's a recipe for subtle bugs and >> security issues. >> >> Applications and/or higher-level libraries should be smart enough to >> choose their own security preferences, and it's the better place to do >> so anyway. > > You misunderstood me. I'm not proposing a global SSLContext object but a > factory function that creates a context for Python stdlib modules. Right > now urllib, http.client, nntplib, asyncio, ftplib, poplib and > imaplib all have duplicated code. I'd like to have ONE function that creates > and configures an SSLContext object with sensible default values for > Python stdlib. Ok, so what about the long term compatibility issues? I'd be ok if you changed the name as suggested by Nick *and* if it was explicit that the security settings for this context can change between releases, therefore working code can break due to upgraded security standards. >> I'm against calling it check_cert(), it's too generic and only induces >> confusion. > Do you have an idea for a better name? match_service()? match_cert()? But, really, it strikes me that adding this method to SSLSocket may make things more confusing. The docs will have to be clear that certificate *validation* and certificate *matching* are two different things (that is, you can't pass CERT_NONE and then expect calling match_cert() is sufficient). >> And why is there an "initiator" object? Personally I prefer to avoid >> passing opaque user data, since the caller can use a closure anyway. >> And what are the **kwargs? > No, they can't use a closure. The callback function is part of the > SSLContext object. > The context object can be used by multiple SSLSocket > objects. So what? If you want to use multiple callbacks or closures, you can create multiple context objects.
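To spell out the closure alternative: any per-use data can be captured by the callback itself, with one context per configuration, so no "initiator" argument or **kwargs channel is required. A hypothetical sketch written against the check_cert callback idea from this thread (the check_cert attribute does not exist in the stdlib, and make_pinned_context and the digests are made up for illustration):

    import hashlib
    import ssl

    def make_pinned_context(pinned_digest):
        def check_cert_cb(sslsock, hostname):
            # pinned_digest comes from the enclosing scope -- no extra
            # user-data channel is needed to get it in here.
            cert = sslsock.getpeercert(True)   # DER bytes of the peer cert
            return hashlib.sha256(cert).digest() == pinned_digest

        ctx = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
        ctx.check_cert = check_cert_cb   # hypothetical hook from the proposal
        return ctx

    # One context per pin; different user data simply means another context.
    ctx_a = make_pinned_context(bytes.fromhex("aa" * 32))
    ctx_b = make_pinned_context(bytes.fromhex("bb" * 32))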
> The **kwargs make it possible to pass data from the caller of > check_cert() to the callback function of the SSLContext instance. Well, I think such explicit "user data" needn't exist in Python. >> So what does this bring compared to the status quo? I thought at least >> sock.check_cert() would be called automatically, but if you have to call >> it yourself it starts looking pointless, IMO. > As you have said earlier, cert matching is not strictly an SSL connection > thing. With the check_cert feature you can do stuff like validation of > self-signed certs with known hash sums: Ok, so you can pass that context to e.g. httplib, right? I'm convinced, except on the "user data" part :) Regards Antoine. From ncoghlan at gmail.com Fri Nov 8 10:25:35 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 8 Nov 2013 19:25:35 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Issue #19512: Add a new _PyDict_DelItemId() function, similar to In-Reply-To: References: <3dFG004BDdz7Lpy@mail.python.org> Message-ID: On 8 Nov 2013 09:48, "Victor Stinner" wrote: > > 2013/11/8 Nick Coghlan : > >> In Python 3.3, _PyDict_GetItemIdWithError(), _PyDict_GetItemId() and > >> _PyDict_SetItemId() are part of the stable ABI if I read correctly > >> dictobject.h. _PyObject_GetAttrId() is also part of the stable ABI. > >> Was it a mistake, or did I misunderstand how stable functions are > >> declared? > > Likely a mistake - the stable ABI is hard to review properly (since it can > depend on non local preprocessor checks, so a mistake may not be obvious in > a diff), we don't currently have a systematic approach to handling changes > and there's no automated test to catch inadvertent additions or (worse) > removals :( This may be a good thing for us to look at more generally when things settle > down a bit after the beta 1 feature freeze. Cheers, Nick. > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed...
If you are writing an extension > module, you probably aren't using PyRun_*. > > > > > Another question: it's not documented if a function is part or not > > part of the stable ABI. So as an user of the API, it is hard to check > > if a function is part of the stable ABI or not. > > The best we have is probably PEP 384. I've been meaning to work on > the C-API docs for a while and add a concise reference page that would > include a summary of the stable API. Alas, other things have taken > precedence. Give it six months or so and I'll likely once again be looking for people to help out with the "make embedding CPython less painful and arcane" project that is PEP 432. There's a *lot* we can do in that space, along with the PEP 451 based extension loading enhancements. Just throwin' it out there ;) Cheers, Nick. > > -eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Fri Nov 8 12:19:19 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 8 Nov 2013 12:19:19 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Issue #19512: Add a new _PyDict_DelItemId() function, similar to In-Reply-To: References: <3dFG004BDdz7Lpy@mail.python.org> Message-ID: 2013/11/8 Nick Coghlan : >> In Python 3.3, _PyDict_GetItemIdWithError(), _PyDict_GetItemId() and >> _PyDict_SetItemId() are part of the stable ABI if I read correctly >> dictobject.h. _PyObject_GetAttrId() is also part of the stable ABI. >> Was it a mistake, or did I misunderstand how stable functions are >> declared? > > Likely a mistake - the stable ABI is hard to review properly (since it can > depend on non local preprocessor checks, so a mistake may not be obvious in > a diff), we don't currently have a systematic approach to handling changes > and there's no automated test to catch inadvertent additions or (worse) > removals :( Would it be possible to remove them from the stable ABI in Python 3.4? They are marked as private using the "_Py" prefix... > This may be a good thing for us to look at more generally when things settle > down a bit after the beta 1 feature freeze. I created the following issue to not forget it: http://bugs.python.org/issue19526 Victor From theller at ctypes.org Fri Nov 8 13:03:55 2013 From: theller at ctypes.org (Thomas Heller) Date: Fri, 08 Nov 2013 13:03:55 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Issue #19512: Add a new _PyDict_DelItemId() function, similar to In-Reply-To: References: <3dFG004BDdz7Lpy@mail.python.org> Message-ID: Am 08.11.2013 12:19, schrieb Victor Stinner: > 2013/11/8 Nick Coghlan : >>> In Python 3.3, _PyDict_GetItemIdWithError(), _PyDict_GetItemId() and >>> _PyDict_SetItemId() are part of the stable ABI if I read correctly >>> dictobject.h. _PyObject_GetAttrId() is also part of the stable ABI. >>> Was it a mistake, or did I misunderstand how stable functions are >>> declared? >> >> Likely a mistake - the stable ABI is hard to review properly (since it can >> depend on non local preprocessor checks, so a mistake may not be obvious in >> a diff), we don't currently have a systematic approach to handling changes >> and there's no automated test to catch inadvertent additions or (worse) >> removals :( > > Would it be possible to remove them from the stable ABI in Python 3.4? > They are marked as private using the "_Py" prefix... I may be confusing API and ABI (see my other message), but adding to or removing functions from the stable ABI seems to be a very serious mistake, IMO - private or not. 
Unless my understanding of the word 'stable' is wrong... >> This may be a good thing for us to look at more generally when things settle >> down a bit after the beta 1 feature freeze. > > I created the following issue to not forget it: > http://bugs.python.org/issue19526 Thomas From theller at ctypes.org Fri Nov 8 13:13:39 2013 From: theller at ctypes.org (Thomas Heller) Date: Fri, 08 Nov 2013 13:13:39 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Issue #19512: Add a new _PyDict_DelItemId() function, similar to In-Reply-To: References: <3dFG004BDdz7Lpy@mail.python.org> Message-ID: Am 08.11.2013 13:03, schrieb Thomas Heller: > I may be confusing API and ABI (see my other message), but adding to > or removing functions from the stable ABI seems to be a very serious > mistake, IMO - private or not. Unless my understanding of the word > 'stable' is wrong... Ok - my mistake. The PEP allows functions to be added (but not to be removed). From PEP 384: """ During evolution of Python, new ABI functions will be added. Applications using them will then have a requirement on a minimum version of Python; this PEP provides no mechanism for such applications to fall back when the Python library is too old. """ From ncoghlan at gmail.com Fri Nov 8 13:55:17 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 8 Nov 2013 22:55:17 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Issue #19512: Add a new _PyDict_DelItemId() function, similar to In-Reply-To: References: <3dFG004BDdz7Lpy@mail.python.org> Message-ID: On 8 Nov 2013 22:03, "Thomas Heller" wrote: > > Am 08.11.2013 12:19, schrieb Victor Stinner: > >> 2013/11/8 Nick Coghlan : >>>> >>>> In Python 3.3, _PyDict_GetItemIdWithError(), _PyDict_GetItemId() and >>>> _PyDict_SetItemId() are part of the stable ABI if I read correctly >>>> dictobject.h. _PyObject_GetAttrId() is also part of the stable ABI. >>>> Was it a mistake, or did I misunderstand how stable functions are >>>> declared? >>> >>> >>> Likely a mistake - the stable ABI is hard to review properly (since it can >>> depend on non local preprocessor checks, so a mistake may not be obvious in >>> a diff), we don't currently have a systematic approach to handling changes >>> and there's no automated test to catch inadvertent additions or (worse) >>> removals :( >> >> >> Would it be possible to remove them from the stable ABI in Python 3.4? >> They are marked as private using the "_Py" prefix... > > > I may be confusing API and ABI (see my other message), but adding to > or removing functions from the stable ABI seems to be a very serious > mistake, IMO - private or not. Unless my understanding of the word > 'stable' is wrong... Yeah, we can add things to the stable ABI with an appropriate version guard, but we won't remove even private APIs if they were previously published in 3.3. The main thing I get out of this is that we need to figure out a way to test it automatically - the nature of the problem means that code review is an inherently unreliable mechanism for spotting mistakes. Cheers, Nick. > > >>> This may be a good thing for us to look at more generally when things settle >>> down a bit after the beta 1 feature freeze. 
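For the removal half of that automated test, very little machinery is needed: ctypes can probe a built libpython for every name in a reference list. A rough sketch (the library path and reference file are assumptions; this catches removed symbols only, while spotting accidental *additions* would need a real symbol dump, e.g. from nm):

    import ctypes

    def missing_stable_symbols(libpython_path, reference_file):
        lib = ctypes.CDLL(libpython_path)
        missing = []
        with open(reference_file) as f:
            for line in f:
                name = line.strip()
                if not name or name.startswith("#"):
                    continue
                try:
                    getattr(lib, name)   # resolved via dlsym() under the hood
                except AttributeError:
                    missing.append(name)
        return missing

    # e.g. missing_stable_symbols("libpython3.4m.so.1.0", "stable_abi.txt")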
>> >> >> I created the following issue to not forget it: >> http://bugs.python.org/issue19526 > > > Thomas > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Fri Nov 8 17:39:04 2013 From: christian at python.org (Christian Heimes) Date: Fri, 08 Nov 2013 17:39:04 +0100 Subject: [Python-Dev] Simplify and unify SSL verification In-Reply-To: References: Message-ID: Am 08.11.2013 08:42, schrieb Antoine Pitrou: > 3.2 actually, while many code bases are still 2.x-compatible. > There's no need to make migration more annoying right now. There is also no need to hinder and delay future improvements for the sake of 2.x compatibility. Python 2.7.0 has been released in July 2010. Python 3.5 will be released late 2015. 3rd parties can backport Python 3.x ssl module if they need the new feature in 2.x, too. Seriously, it's time to set Python 3 free. Python 2.7 shall no longer be a millstone around Python 3's neck. > If it's not solved on *all* platforms then you're introducing a nasty > discrepancy between platforms. > > That said, "secure by default" could be a property of the "default > context factory" (see below). I think you forget that incompatible changes won't happen until Python 3.5. Developer and users can still have the insecure, non verified mode but they have to configure it explictly. > Ok, so what about the long term compatibility issues? > > I'd be ok if you changed the name as suggested by Nick *and* if it was > explicit that the security settings for this context can change between > releases, therefore working code can break due to upgraded security > standards. Absolutely! Applications shall create their own context if they require full control. create_default_context() return a sensible context object that is subject to changes between releases. > But, really, it strikes me that adding this method to SSLSocket may make > things more confusing. The docs will have to be clear that certificate > *validation* and certificate *matching* are two different things (that > is, you can't pass CERT_NONE and then expect calling match_cert() is > sufficient). I like Nick's idea with verify_hostname flag and a postverify callback. At first the callback is invoked. Then the hostname is verified if either verify_hostname is true or None with CERT_OPTIONAL / CERT_REQUIRED. Similar to __exit__() the callback can return a true value in order to skip hostname matching. >> The **kwargs make it possible to pass data from the caller of >> check_cert() to the callback function of the SSLContext instance. > > Well, I think such explicit "user data" needn't exist in Python. Sure, people can just use thread local storage or walk up the stack with sys._getframe(). That's going to work well for asyncio! :p Christian From status at bugs.python.org Fri Nov 8 18:07:42 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 8 Nov 2013 18:07:42 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20131108170742.E67F856A27@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-11-01 - 2013-11-08) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 4227 (+10) closed 27074 (+45) total 31301 (+55) Open issues with patches: 1960 Issues opened (43) ================== #10197: subprocess.getoutput fails on win32 http://bugs.python.org/issue10197 reopened by gregory.p.smith #19475: Add microsecond flag to datetime isoformat() http://bugs.python.org/issue19475 opened by skip.montanaro #19476: Add a dedicated specification for module "reloading" to the la http://bugs.python.org/issue19476 opened by eric.snow #19477: document tp_print() as being dead in Py3 http://bugs.python.org/issue19477 opened by scoder #19478: Add ability to prefix posix semaphore names created by multipr http://bugs.python.org/issue19478 opened by Marc.Liyanage #19479: textwrap.dedent doesn't work properly with strings containing http://bugs.python.org/issue19479 opened by alexis.d #19481: IDLE hangs while printing instance of Unicode subclass http://bugs.python.org/issue19481 opened by tim.peters #19482: _pickle build warnings on Fedora 19 http://bugs.python.org/issue19482 opened by ncoghlan #19483: Pure-Python ElementTree classes no longer available since 3.3 http://bugs.python.org/issue19483 opened by brechtm #19486: Bump up version of Tcl/Tk in building Python in Windows platfo http://bugs.python.org/issue19486 opened by vajrasky #19489: move quick search box above TOC http://bugs.python.org/issue19489 opened by georg.brandl #19490: Problem installing matplotlib 1.3.1 with Python 2.7.6rc1 and 3 http://bugs.python.org/issue19490 opened by ned.deily #19491: Python Crashing When Saving Documents http://bugs.python.org/issue19491 opened by carolyn.reed at talktalk.net #19492: Report skipped distutils tests as skipped http://bugs.python.org/issue19492 opened by serhiy.storchaka #19493: Report skipped ctypes tests as skipped http://bugs.python.org/issue19493 opened by serhiy.storchaka #19494: urllib2.HTTPBasicAuthHandler (or urllib.request.HTTPBasicAuthH http://bugs.python.org/issue19494 opened by mcepl #19495: Enhancement for timeit: measure time to run blocks of code usi http://bugs.python.org/issue19495 opened by dmoore #19499: "import this" is cached in sys.modules http://bugs.python.org/issue19499 opened by rhettinger #19500: Error when connecting to FTPS servers not supporting SSL sessi http://bugs.python.org/issue19500 opened by Ye.Wang #19501: Buildbot testing of 3.2 broken http://bugs.python.org/issue19501 opened by zach.ware #19502: Wrong time zone offset, when using time.strftime() with a give http://bugs.python.org/issue19502 opened by pwronisz #19504: Change "customise" to "customize". 
http://bugs.python.org/issue19504 opened by eric.smith #19505: OrderedDict views don't implement __reversed__ http://bugs.python.org/issue19505 opened by ThiefMaster #19506: subprocess.communicate() should use a memoryview http://bugs.python.org/issue19506 opened by haypo #19507: ssl.wrap_socket() with server_hostname should imply match_host http://bugs.python.org/issue19507 opened by christian.heimes #19508: Add warning that Python doesn't verify SSL certs by default http://bugs.python.org/issue19508 opened by christian.heimes #19509: No SSL match_hostname() in ftp, imap, nntp, pop, smtp modules http://bugs.python.org/issue19509 opened by christian.heimes #19510: lib2to3.fixes.fix_import gets confused if implicit relative im http://bugs.python.org/issue19510 opened by durin42 #19511: lib2to3 Grammar file is no longer a Python 3 superset http://bugs.python.org/issue19511 opened by ncoghlan #19512: Avoid temporary Unicode strings, use identifiers to only creat http://bugs.python.org/issue19512 opened by haypo #19513: Use PyUnicodeWriter instead of PyAccu in repr(tuple) and repr( http://bugs.python.org/issue19513 opened by haypo #19515: Share duplicated _Py_IDENTIFIER identifiers in C code http://bugs.python.org/issue19515 opened by haypo #19517: sysconfig variables introduced by PEP-3149 are currently undoc http://bugs.python.org/issue19517 opened by zaytsev #19518: Add new PyRun_xxx() functions to not encode the filename http://bugs.python.org/issue19518 opened by haypo #19519: Parser: don't transcode input string to UTF-8 if it is already http://bugs.python.org/issue19519 opened by haypo #19520: Win32 compiler warning in _sha3 http://bugs.python.org/issue19520 opened by zach.ware #19521: parallel build race condition on AIX since python-3.2 http://bugs.python.org/issue19521 opened by haubi #19523: logging.FileHandler - using of delay argument case handle leak http://bugs.python.org/issue19523 opened by DDGG #19524: ResourceWarning when urlopen() forgets the HTTPConnection obje http://bugs.python.org/issue19524 opened by vadmium #19526: Review additions to the stable ABI of Python 3.4 http://bugs.python.org/issue19526 opened by haypo #19527: Test failures with COUNT_ALLOCS http://bugs.python.org/issue19527 opened by bkabrda #19528: logger.config.fileConfig cant cope with files starting with an http://bugs.python.org/issue19528 opened by 51m0n #19529: Fix unicode_aswidechar() with 4byte unicode and 2byte wchar_t, http://bugs.python.org/issue19529 opened by haubi Most recent 15 issues with no replies (15) ========================================== #19524: ResourceWarning when urlopen() forgets the HTTPConnection obje http://bugs.python.org/issue19524 #19517: sysconfig variables introduced by PEP-3149 are currently undoc http://bugs.python.org/issue19517 #19511: lib2to3 Grammar file is no longer a Python 3 superset http://bugs.python.org/issue19511 #19510: lib2to3.fixes.fix_import gets confused if implicit relative im http://bugs.python.org/issue19510 #19509: No SSL match_hostname() in ftp, imap, nntp, pop, smtp modules http://bugs.python.org/issue19509 #19506: subprocess.communicate() should use a memoryview http://bugs.python.org/issue19506 #19500: Error when connecting to FTPS servers not supporting SSL sessi http://bugs.python.org/issue19500 #19493: Report skipped ctypes tests as skipped http://bugs.python.org/issue19493 #19489: move quick search box above TOC http://bugs.python.org/issue19489 #19482: _pickle build warnings on Fedora 19 http://bugs.python.org/issue19482 #19477: document 
tp_print() as being dead in Py3 http://bugs.python.org/issue19477 #19476: Add a dedicated specification for module "reloading" to the la http://bugs.python.org/issue19476 #19474: Argument Clinic causing compiler warnings about uninitialized http://bugs.python.org/issue19474 #19472: inspect.getsource() raises a wrong exception type http://bugs.python.org/issue19472 #19469: Duplicate namespace package portions (but not on Windows) http://bugs.python.org/issue19469 Most recent 15 issues waiting for review (15) ============================================= #19529: Fix unicode_aswidechar() with 4byte unicode and 2byte wchar_t, http://bugs.python.org/issue19529 #19527: Test failures with COUNT_ALLOCS http://bugs.python.org/issue19527 #19521: parallel build race condition on AIX since python-3.2 http://bugs.python.org/issue19521 #19520: Win32 compiler warning in _sha3 http://bugs.python.org/issue19520 #19519: Parser: don't transcode input string to UTF-8 if it is already http://bugs.python.org/issue19519 #19518: Add new PyRun_xxx() functions to not encode the filename http://bugs.python.org/issue19518 #19513: Use PyUnicodeWriter instead of PyAccu in repr(tuple) and repr( http://bugs.python.org/issue19513 #19512: Avoid temporary Unicode strings, use identifiers to only creat http://bugs.python.org/issue19512 #19505: OrderedDict views don't implement __reversed__ http://bugs.python.org/issue19505 #19504: Change "customise" to "customize". http://bugs.python.org/issue19504 #19501: Buildbot testing of 3.2 broken http://bugs.python.org/issue19501 #19493: Report skipped ctypes tests as skipped http://bugs.python.org/issue19493 #19492: Report skipped distutils tests as skipped http://bugs.python.org/issue19492 #19486: Bump up version of Tcl/Tk in building Python in Windows platfo http://bugs.python.org/issue19486 #19481: IDLE hangs while printing instance of Unicode subclass http://bugs.python.org/issue19481 Top 10 most discussed issues (10) ================================= #19475: Add microsecond flag to datetime isoformat() http://bugs.python.org/issue19475 20 msgs #19512: Avoid temporary Unicode strings, use identifiers to only creat http://bugs.python.org/issue19512 20 msgs #19085: Add tkinter basic options tests http://bugs.python.org/issue19085 19 msgs #19515: Share duplicated _Py_IDENTIFIER identifiers in C code http://bugs.python.org/issue19515 16 msgs #19406: PEP 453: add the ensurepip module http://bugs.python.org/issue19406 12 msgs #17828: More informative error handling when encoding and decoding http://bugs.python.org/issue17828 9 msgs #19481: IDLE hangs while printing instance of Unicode subclass http://bugs.python.org/issue19481 9 msgs #19499: "import this" is cached in sys.modules http://bugs.python.org/issue19499 9 msgs #19513: Use PyUnicodeWriter instead of PyAccu in repr(tuple) and repr( http://bugs.python.org/issue19513 9 msgs #19518: Add new PyRun_xxx() functions to not encode the filename http://bugs.python.org/issue19518 9 msgs Issues closed (39) ================== #4331: Add functools.partialmethod http://bugs.python.org/issue4331 closed by ncoghlan #4965: Can doc index of html version be separately scrollable? http://bugs.python.org/issue4965 closed by ezio.melotti #6157: Tkinter.Text: changes for bbox, debug, and edit methods. 
http://bugs.python.org/issue6157 closed by serhiy.storchaka #6160: Tkinter.Spinbox: fix bbox method http://bugs.python.org/issue6160 closed by serhiy.storchaka #6868: Check errno of epoll_ctrl http://bugs.python.org/issue6868 closed by akuchling #7593: Computed-goto patch for RE engine http://bugs.python.org/issue7593 closed by akuchling #8003: Fragile and unexpected error-handling in asyncore http://bugs.python.org/issue8003 closed by akuchling #10375: 2to3 print(single argument) http://bugs.python.org/issue10375 closed by terry.reedy #11096: Multiple turtle tracers http://bugs.python.org/issue11096 closed by terry.reedy #11965: Simplify context manager in os.popen http://bugs.python.org/issue11965 closed by akuchling #17080: A better error message for float() http://bugs.python.org/issue17080 closed by ezio.melotti #17827: Document codecs.encode and codecs.decode http://bugs.python.org/issue17827 closed by python-dev #18345: logging: file creation options with FileHandler and friends http://bugs.python.org/issue18345 closed by python-dev #18678: Wrong struct members name for spwd module http://bugs.python.org/issue18678 closed by r.david.murray #18702: Report skipped tests as skipped http://bugs.python.org/issue18702 closed by serhiy.storchaka #18809: Expose mtime check in importlib.machinery.FileFinder http://bugs.python.org/issue18809 closed by brett.cannon #18985: Improve the documentation in fcntl module http://bugs.python.org/issue18985 closed by r.david.murray #19286: error: can't copy '': doesn't exist or not a regular http://bugs.python.org/issue19286 closed by jason.coombs #19378: Clean up Python 3.4 API additions in the dis module http://bugs.python.org/issue19378 closed by python-dev #19391: Fix PCbuild/readme.txt in 2.7 and 3.3 http://bugs.python.org/issue19391 closed by zach.ware #19397: test_pydoc fails with -S http://bugs.python.org/issue19397 closed by terry.reedy #19403: Make contextlib.redirect_stdout reentrant http://bugs.python.org/issue19403 closed by python-dev #19411: binascii.hexlify docs say it returns a string (it returns byte http://bugs.python.org/issue19411 closed by r.david.murray #19439: Build _testembed on Windows http://bugs.python.org/issue19439 closed by python-dev #19464: Remove warnings from Windows buildbot "clean" script http://bugs.python.org/issue19464 closed by tim.golden #19480: HTMLParser fails to handle some characters in the starttag http://bugs.python.org/issue19480 closed by ezio.melotti #19484: Idle crashes when opening file http://bugs.python.org/issue19484 closed by ned.deily #19485: Slightly incorrect doc for get_param method in Lib/email/messa http://bugs.python.org/issue19485 closed by r.david.murray #19487: Correct the error of the example given in the doc of partialme http://bugs.python.org/issue19487 closed by ncoghlan #19488: idlelib tests are silently skipped by test.regrtest on 2.7 http://bugs.python.org/issue19488 closed by terry.reedy #19496: Website link http://bugs.python.org/issue19496 closed by ned.deily #19497: selectors and modify() http://bugs.python.org/issue19497 closed by gvanrossum #19498: IDLE is behaving badly in Python 2.7.6rc1 http://bugs.python.org/issue19498 closed by ned.deily #19503: does gc_list_merge merge properly? 
http://bugs.python.org/issue19503 closed by mark.dickinson #19514: Merge duplicated _Py_IDENTIFIER identifiers in C code http://bugs.python.org/issue19514 closed by loewis #19516: segmentation fault using a dict as a key http://bugs.python.org/issue19516 closed by ned.deily #19522: A suggestion: python 3.* is not as convenient as python 2.* http://bugs.python.org/issue19522 closed by eric.snow #19525: Strict indentation in Python3 http://bugs.python.org/issue19525 closed by Arfrever #834840: Unhelpful error message from cgi module http://bugs.python.org/issue834840 closed by akuchling From solipsis at pitrou.net Fri Nov 8 18:39:35 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 08 Nov 2013 18:39:35 +0100 Subject: [Python-Dev] Simplify and unify SSL verification In-Reply-To: References: Message-ID: Le 08/11/2013 17:39, Christian Heimes a ?crit : > Am 08.11.2013 08:42, schrieb Antoine Pitrou: >> 3.2 actually, while many code bases are still 2.x-compatible. >> There's no need to make migration more annoying right now. > > There is also no need to hinder and delay future improvements for the > sake of 2.x compatibility. It's not an improvement, you are proposing to remove existing arguments to stdlib functions/constructors. I agree it would be cleaner without them, but there's no rush really. >> If it's not solved on *all* platforms then you're introducing a nasty >> discrepancy between platforms. >> >> That said, "secure by default" could be a property of the "default >> context factory" (see below). > > I think you forget that incompatible changes won't happen until Python > 3.5. Developer and users can still have the insecure, non verified mode > but they have to configure it explictly. Whether "secure by default" happens in 3.4 or 3.5 doesn't change my point. >> I'd be ok if you changed the name as suggested by Nick *and* if it was >> explicit that the security settings for this context can change between >> releases, therefore working code can break due to upgraded security >> standards. > > Absolutely! Applications shall create their own context if they require > full control. create_default_context() return a sensible context object > that is subject to changes between releases. Ok, fine with me, then. >> But, really, it strikes me that adding this method to SSLSocket may make >> things more confusing. The docs will have to be clear that certificate >> *validation* and certificate *matching* are two different things (that >> is, you can't pass CERT_NONE and then expect calling match_cert() is >> sufficient). > > I like Nick's idea with verify_hostname flag and a postverify callback. I don't really like it. There are already two steps: cert validation and cert matching, this idea is adding a third step and therefore making things extra confusing. >>> The **kwargs make it possible to pass data from the caller of >>> check_cert() to the callback function of the SSLContext instance. >> >> Well, I think such explicit "user data" needn't exist in Python. > > Sure, people can just use thread local storage or walk up the stack with > sys._getframe(). That's going to work well for asyncio! Actually, sensible people will use a closure, but you can prefer thread-local storage or sys._getframe() if that's your taste in programming. Regards Antoine. 
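Antoine's closure suggestion is easy to illustrate. A minimal sketch, assuming the postverify-callback API discussed above (set_postverify_callback is hypothetical, sketched from the thread, not an existing ssl method): the "user data" is simply captured by the closure, so no **kwargs plumbing is needed.

import ssl

def make_postverify_callback(userdata):
    def postverify(cert):
        # userdata is captured by the closure, not threaded through the API
        print("verified certificate for", userdata["hostname"])
        return False  # do not skip the hostname matching step
    return postverify

context = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
# set_postverify_callback is hypothetical (the API under discussion)
context.set_postverify_callback(make_postverify_callback({"hostname": "example.com"}))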
From ericsnowcurrently at gmail.com Fri Nov 8 20:49:54 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 8 Nov 2013 12:49:54 -0700 Subject: [Python-Dev] PEP 451 (ModuleSpec) is accepted Message-ID: I'm pleased to announce that Brett Cannon and Nick Coghlan (the co-BDFL-delegates) have accepted PEP 451 for inclusion in Python 3.4. Both of them have contributed substantially to the discussions of the proposal and have helped make it solid. I'm grateful for their diligence, attentiveness, and expertise. I'd also like to thank PJ Eby, who provided important insight on several occasions and helped out with some sticky backward-compatibility situations. http://www.python.org/dev/peps/pep-0451/ -eric From victor.stinner at gmail.com Fri Nov 8 21:02:52 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 8 Nov 2013 21:02:52 +0100 Subject: [Python-Dev] PEP 451 (ModuleSpec) is accepted In-Reply-To: References: Message-ID: Oh congrats Eric for your PEP. I like it. Victor Le vendredi 8 novembre 2013, Eric Snow a ?crit : > I'm pleased to announce that Brett Cannon and Nick Coghlan (the > co-BDFL-delegates) have accepted PEP 451 for inclusion in Python 3.4. > Both of them have contributed substantially to the discussions of the > proposal and have helped make it solid. I'm grateful for their > diligence, attentiveness, and expertise. I'd also like to thank PJ > Eby, who provided important insight on several occasions and helped > out with some sticky backward-compatibility situations. > > http://www.python.org/dev/peps/pep-0451/ > > -eric > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Fri Nov 8 21:36:24 2013 From: brett at python.org (Brett Cannon) Date: Fri, 8 Nov 2013 15:36:24 -0500 Subject: [Python-Dev] PEP 451 (ModuleSpec) is accepted In-Reply-To: References: Message-ID: I just want to publicly thank Eric for his hard work on this PEP. I think Nick, Eric, and myself all came to the same conclusion independently about needing to update the APIs used by import to make them more flexible and able to grow in the future, but Eric is the one that took the time to champion the changes, write the PEP, and handle the huge amount of feedback both on the import SIG and here along with his current patch. I know I'm quite pleased with the way this PEP turned out and am already looking forward to using the new APIs hopefully in Python 3.5 for introducing a lazy loader mechanism to the stdlib. On Fri, Nov 8, 2013 at 2:49 PM, Eric Snow wrote: > I'm pleased to announce that Brett Cannon and Nick Coghlan (the > co-BDFL-delegates) have accepted PEP 451 for inclusion in Python 3.4. > Both of them have contributed substantially to the discussions of the > proposal and have helped make it solid. I'm grateful for their > diligence, attentiveness, and expertise. I'd also like to thank PJ > Eby, who provided important insight on several occasions and helped > out with some sticky backward-compatibility situations. 
> > http://www.python.org/dev/peps/pep-0451/ > > -eric > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Nov 8 21:42:09 2013 From: guido at python.org (Guido van Rossum) Date: Fri, 8 Nov 2013 12:42:09 -0800 Subject: [Python-Dev] PEP 451 (ModuleSpec) is accepted In-Reply-To: References: Message-ID: Congrats are deserved all around. This is a solid PEP. I'm glad you all took such great care with the design and the review. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Nov 8 22:15:05 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 9 Nov 2013 07:15:05 +1000 Subject: [Python-Dev] Simplify and unify SSL verification In-Reply-To: References: Message-ID: On 9 Nov 2013 02:41, "Christian Heimes" wrote: > > Am 08.11.2013 08:42, schrieb Antoine Pitrou: > > 3.2 actually, while many code bases are still 2.x-compatible. > > There's no need to make migration more annoying right now. > > There is also no need to hinder and delay future improvements for the > sake of 2.x compatibility. Python 2.7.0 has been released in July 2010. > Python 3.5 will be released late 2015. 3rd parties can backport Python > 3.x ssl module if they need the new feature in 2.x, too. > > Seriously, it's time to set Python 3 free. Python 2.7 shall no longer be > a millstone around Python 3's neck. As long as there is a backports.ssl module, support in tulip and in requests, it's likely fine in terms of supporting the common 2/3 subset. However, since this change mostly *can* be published via PyPI, I also don't think it makes sense to rush the design of this in the next two weeks. - I'm already trying to help get the binary codec changes, ensurepip and ModuleSpec landed - the hash update PEP still needs to be finalised - there's a couple of other context management features assigned to me to check before the deadline Heck, testing on PyPI first would even allow for fixing the misnomer - currently we're importing the ssl module to access Python's TLS support, which can't be helping the widespread misunderstanding of the fact they're not quite the same thing (even though TLS superseded SSL). The "create_default_context" idea and ensuring affected APIs accept a context object doesn't need to wait, nor does deprecating the implied CERT_NONE verification mode. It's just the parameter deprecations and the matching customisation API I think should be delayed, since those have broader implications that I don't believe we can give adequate consideration in two weeks when there's already plenty of other work we're still trying to finish up. Cheers, Nick. > > > If it's not solved on *all* platforms then you're introducing a nasty > > discrepancy between platforms. > > > > That said, "secure by default" could be a property of the "default > > context factory" (see below). > > I think you forget that incompatible changes won't happen until Python > 3.5. Developer and users can still have the insecure, non verified mode > but they have to configure it explictly. > > > > Ok, so what about the long term compatibility issues? 
> > > > I'd be ok if you changed the name as suggested by Nick *and* if it was > > explicit that the security settings for this context can change between > > releases, therefore working code can break due to upgraded security > > standards. > > Absolutely! Applications shall create their own context if they require > full control. create_default_context() return a sensible context object > that is subject to changes between releases. > > > > But, really, it strikes me that adding this method to SSLSocket may make > > things more confusing. The docs will have to be clear that certificate > > *validation* and certificate *matching* are two different things (that > > is, you can't pass CERT_NONE and then expect calling match_cert() is > > sufficient). > > I like Nick's idea with verify_hostname flag and a postverify callback. > > At first the callback is invoked. Then the hostname is verified if > either verify_hostname is true or None with CERT_OPTIONAL / > CERT_REQUIRED. Similar to __exit__() the callback can return a true > value in order to skip hostname matching. > > > >> The **kwargs make it possible to pass data from the caller of > >> check_cert() to the callback function of the SSLContext instance. > > > > Well, I think such explicit "user data" needn't exist in Python. > > Sure, people can just use thread local storage or walk up the stack with > sys._getframe(). That's going to work well for asyncio! > > :p > > Christian > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Fri Nov 8 21:57:19 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 08 Nov 2013 12:57:19 -0800 Subject: [Python-Dev] PEP 451 (ModuleSpec) is accepted In-Reply-To: References: Message-ID: <527D502F.1020501@stoneleaf.us> On 11/08/2013 11:49 AM, Eric Snow wrote: > I'm pleased to announce that Brett Cannon and Nick Coghlan (the > co-BDFL-delegates) have accepted PEP 451 for inclusion in Python 3.4. Excellent! Congrats, and thanks to everyone who worked on it. -- ~Ethan~ From barry at python.org Fri Nov 8 22:28:49 2013 From: barry at python.org (Barry Warsaw) Date: Fri, 8 Nov 2013 16:28:49 -0500 Subject: [Python-Dev] PEP 451 (ModuleSpec) is accepted In-Reply-To: References: Message-ID: <20131108162849.367bd2d8@anarchist> On Nov 08, 2013, at 12:49 PM, Eric Snow wrote: >I'm pleased to announce that Brett Cannon and Nick Coghlan (the >co-BDFL-delegates) have accepted PEP 451 for inclusion in Python 3.4. >Both of them have contributed substantially to the discussions of the >proposal and have helped make it solid. I'm grateful for their >diligence, attentiveness, and expertise. I'd also like to thank PJ >Eby, who provided important insight on several occasions and helped >out with some sticky backward-compatibility situations. > >http://www.python.org/dev/peps/pep-0451/ Congratulations for all the hard work on a great PEP. -Barry From ncoghlan at gmail.com Fri Nov 8 22:57:59 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 9 Nov 2013 07:57:59 +1000 Subject: [Python-Dev] PEP 451 (ModuleSpec) is accepted In-Reply-To: References: Message-ID: On 9 Nov 2013 06:39, "Brett Cannon" wrote: > > I just want to publicly thank Eric for his hard work on this PEP. 
I think Nick, Eric, and myself all came to the same conclusion independently about needing to update the APIs used by import to make them more flexible and able to grow in the future, but Eric is the one that took the time to champion the changes, write the PEP, and handle the huge amount of feedback both on the import SIG and here along with his current patch. > > I know I'm quite pleased with the way this PEP turned out and am already looking forward to using the new APIs hopefully in Python 3.5 for introducing a lazy loader mechanism to the stdlib. We should even be able to finally make circular imports work more consistently, better handle partial failures of packages that implicitly import submodules and address some of the import traps described in PEP 395. This change also enables better integration of runpy with the normal import system, the opportunity for much cleaner reload support (it's currently rife with opportunities for implicit, hard to debug, failures) and the possibility of bringing C extension loading up to a similar level of capability to Python module execution (e.g. allowing extension modules to support execution via runpy) Many of those related changes won't happen until Python 3.5, but PEP 451 is an important step towards making them practical. Awesome work, Eric - this is a far more significant upgrade to the import system than my indirect import idea that originally spawned the discussion on import-sig :) Cheers, Nick. > > > On Fri, Nov 8, 2013 at 2:49 PM, Eric Snow wrote: >> >> I'm pleased to announce that Brett Cannon and Nick Coghlan (the >> co-BDFL-delegates) have accepted PEP 451 for inclusion in Python 3.4. >> Both of them have contributed substantially to the discussions of the >> proposal and have helped make it solid. I'm grateful for their >> diligence, attentiveness, and expertise. I'd also like to thank PJ >> Eby, who provided important insight on several occasions and helped >> out with some sticky backward-compatibility situations. >> >> http://www.python.org/dev/peps/pep-0451/ >> >> -eric >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/brett%40python.org > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at v.loewis.de Sat Nov 9 00:14:52 2013 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 09 Nov 2013 00:14:52 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Issue #19512: Add a new _PyDict_DelItemId() function, similar to In-Reply-To: References: <3dFG004BDdz7Lpy@mail.python.org> Message-ID: <527D706C.6020905@v.loewis.de> Am 08.11.13 10:25, schrieb Nick Coghlan: > Likely a mistake - the stable ABI is hard to review properly (since it > can depend on non local preprocessor checks, so a mistake may not be > obvious in a diff), we don't currently have a systematic approach to > handling changes and there's no automated test to catch inadvertent > additions or (worse) removals :( Actually, there is, for Windows. The .def file is an explicit list of symbols; additions can easily be reviewed. 
Everything that's not added there is not in the stable ABI (whether or not it is included in the #ifdef). Removals cause linker failures. It would be possible to automatically find out whether any functions declared under the stable API are actually exported also, by parsing the preprocessor output for Python.h Regards, Martin From theller at ctypes.org Sat Nov 9 21:31:18 2013 From: theller at ctypes.org (Thomas Heller) Date: Sat, 09 Nov 2013 21:31:18 +0100 Subject: [Python-Dev] PEP 384 (stable api) question In-Reply-To: <527BDD5C.1090801@v.loewis.de> References: <527BDD5C.1090801@v.loewis.de> Message-ID: Am 07.11.2013 19:35, schrieb "Martin v. L?wis": > Am 07.11.13 13:44, schrieb Thomas Heller: > >> I thought that the stable API would keep exactly the same across >> releases - is this expectation wrong or is this a bug? > > Oscar is right - this change doesn't affect the ABI, just the API. > > That said, please file an issue reporting what change you see in > your compiler warnings. 3.4 is not released yet, perhaps there is > something that can be done. http://bugs.python.org/issue19538 Thanks, Thomas From techtonik at gmail.com Sun Nov 10 06:29:38 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Sun, 10 Nov 2013 08:29:38 +0300 Subject: [Python-Dev] Assign(expr* targets, expr value) - why targetS? Message-ID: http://hg.python.org/cpython/file/1ee45eb6aab9/Parser/Python.asdl In Assign(expr* targets, expr value), why the first argument is a list? -- anatoly t. From benjamin at python.org Sun Nov 10 06:34:29 2013 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 10 Nov 2013 00:34:29 -0500 Subject: [Python-Dev] Assign(expr* targets, expr value) - why targetS? In-Reply-To: References: Message-ID: 2013/11/10 anatoly techtonik : > http://hg.python.org/cpython/file/1ee45eb6aab9/Parser/Python.asdl > > In Assign(expr* targets, expr value), why the first argument is a list? x = y = 42 -- Regards, Benjamin From benjamin at python.org Sun Nov 10 20:29:37 2013 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 10 Nov 2013 14:29:37 -0500 Subject: [Python-Dev] [RELEASE] Python 2.7.6 Message-ID: Python 2.7.6 is now available. This release resolves crashes of the interactive interpreter on OS X 10.9. The final release also fixes several issues identified in the release candidate. Importantly, a security bug in CGIHTTPServer was fixed [1]. Thank you to those who tested the 2.7.6 release candidate and reported these bugs! All the changes in Python 2.7.6 are described in full detail in the Misc/NEWS file of the source tarball. You can also view online at http://hg.python.org/cpython/raw-file/99d03261c1ba/Misc/NEWS Downloads are at http://python.org/download/releases/2.7.6/ As always, report bugs to http://bugs.python.org/ Have a nice November, Benjamin Peterson 2.7 Release Manager [1] http://bugs.python.org/issue19435 From tjreedy at udel.edu Mon Nov 11 01:08:17 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 10 Nov 2013 19:08:17 -0500 Subject: [Python-Dev] [RELEASE] Python 2.7.6 In-Reply-To: References: Message-ID: On 11/10/2013 2:29 PM, Benjamin Peterson wrote: > Python 2.7.6 is now available. > > This release resolves crashes of the interactive interpreter on OS X 10.9. The > final release also fixes several issues identified in the release > candidate. Importantly, a security bug in CGIHTTPServer was fixed [1]. Thank you > to those who tested the 2.7.6 release candidate and reported these bugs! 
> > All the changes in Python 2.7.6 are described in full detail in the Misc/NEWS > file of the source tarball. You can also view online at > > http://hg.python.org/cpython/raw-file/99d03261c1ba/Misc/NEWS > > Downloads are at > > http://python.org/download/releases/2.7.6/ ''' Windows x86 MSI Installer (2.7.6) (sig) Windows x86 MSI program database (2.7.6) (sig) Windows X86-64 MSI Installer (2.7.6) [1] (sig) Windows X86-64 program database (2.7.6) [1] (sig) Windows help file (sig) ''' I presume line 4 should be 'MSI program database'. (3.x pages do not have this.) Line 5 could add "(included in MSI installers)" for those who do not know. Some programs, like Open/Libre Office, require a separate download and install for Windows help. (This comment applies to 3.x also.) -- Terry Jan Reedy From victor.stinner at gmail.com Mon Nov 11 02:44:35 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 11 Nov 2013 02:44:35 +0100 Subject: [Python-Dev] [RELEASE] Python 2.7.6 In-Reply-To: References: Message-ID: 2013/11/10 Benjamin Peterson : > All the changes in Python 2.7.6 are described in full detail in the Misc/NEWS > file of the source tarball. You can also view online at > > http://hg.python.org/cpython/raw-file/99d03261c1ba/Misc/NEWS - Issue #18747: Re-seed OpenSSL's pseudo-random number generator after fork. A pthread_atfork() parent handler is used to seed the PRNG with pid, time and some stack data. Is this changeset in Python 2.7.6 or not? Victor From benjamin at python.org Mon Nov 11 02:48:47 2013 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 10 Nov 2013 20:48:47 -0500 Subject: [Python-Dev] [RELEASE] Python 2.7.6 In-Reply-To: References: Message-ID: 2013/11/10 Victor Stinner : > 2013/11/10 Benjamin Peterson : >> All the changes in Python 2.7.6 are described in full detail in the Misc/NEWS >> file of the source tarball. You can also view online at >> >> http://hg.python.org/cpython/raw-file/99d03261c1ba/Misc/NEWS > > - Issue #18747: Re-seed OpenSSL's pseudo-random number generator after fork. > A pthread_atfork() parent handler is used to seed the PRNG with pid, time > and some stack data. > > Is this changeset in Python 2.7.6 or not? It was backed out, but evidently the news entry wasn't removed. -- Regards, Benjamin From amk at amk.ca Mon Nov 11 03:30:02 2013 From: amk at amk.ca (A.M. Kuchling) Date: Sun, 10 Nov 2013 21:30:02 -0500 Subject: [Python-Dev] [Python-checkins] cpython: #1097797: Add CP273 codec, and exercise it in the test suite In-Reply-To: <52801B4E.1020204@udel.edu> References: <3dHkgx6mzFz7Ltb@mail.python.org> <52801B4E.1020204@udel.edu> Message-ID: <20131111023002.GA62967@datlandrewk.home> On Sun, Nov 10, 2013 at 06:48:30PM -0500, Terry Reedy wrote: > >+ def encode(self,input,errors='strict'): > >+ return codecs.charmap_encode(input,errors,encoding_table) > > You left out the spaces after the call commas (PEP8) Thanks for noting this. This file is programmatically generated by the scripts in Tools/unicode/ . Many of the files in this directory also have similar PEP8 violations: grep -r 'def encode' Lib/encodings/ |grep ',inpu' |less ... 
Lib/encodings//koi8_r.py: def encode(self,input,errors='strict'):
Lib/encodings//koi8_u.py: def encode(self,input,errors='strict'):
Lib/encodings//mac_arabic.py: def encode(self,input,errors='strict'):
Lib/encodings//mac_centeuro.py: def encode(self,input,errors='strict'):

So I'll take a look at fixing the code generator to generate properly formatted code, but it probably isn't worth modifying all of the existing files in Lib/encodings.

--amk

From cf.natali at gmail.com Mon Nov 11 08:57:01 2013
From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=)
Date: Mon, 11 Nov 2013 08:57:01 +0100
Subject: [Python-Dev] PEP 454 (tracemalloc) close to pronouncement
Message-ID:

Hi,

After several exchanges with Victor, PEP 454 has reached a status which I consider ready for pronouncement [1]: so if you have any last-minute comments, now is the time!

Cheers,

cf

[1] http://www.python.org/dev/peps/pep-0454/

From victor.stinner at gmail.com Mon Nov 11 12:58:43 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 11 Nov 2013 12:58:43 +0100
Subject: [Python-Dev] PEP 454 (tracemalloc) close to pronouncement
In-Reply-To: References: Message-ID:

I'm happy with the latest PEP because it is minimal, but the core features are still present:

- get the traceback where an object was allocated,
- get traces of memory blocks,
- compute statistics on memory usage per line, file or traceback,
- compute differences between two snapshots.

We now have an object API to access low-level traces without killing performance. Temporary objects (read-only views) are generated on demand. Generating concrete objects for each trace is really too slow because there are as many traces as memory blocks (500,000 memory blocks at the end of the Python test suite, for example).

Victor

Le lundi 11 novembre 2013, Charles-François Natali a écrit :
> Hi,
>
> After several exchanges with Victor, PEP 454 has reached a status
> which I consider ready for pronouncement [1]: so if you have any last
> minute comment, now is the time!
>
> Cheers,
>
> cf
>
> [1] http://www.python.org/dev/peps/pep-0454/
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From amk at amk.ca Mon Nov 11 22:30:38 2013
From: amk at amk.ca (A.M. Kuchling)
Date: Mon, 11 Nov 2013 16:30:38 -0500
Subject: [Python-Dev] Is pygettext.py deprecated?
Message-ID: <20131111213038.GA74478@DATLANDREWK.local>

I was looking at http://bugs.python.org/issue1098749, which adds extraction of multiple-line strings to pygettext.py, but on the ticket Mark Lawrence suggested that "pygettext should be deprecated". Is it deprecated? (It's not listed in PEP 4, but it isn't a module.)

Debian wrote a man page for pygettext that starts with:

    pygettext is deprecated. The current version of xgettext
    supports many languages, including Python.

GNU xgettext didn't understand any languages other than C at one time, so pygettext.py was written to handle Python code but not C code. But now xgettext will parse Python code. I assume xgettext's parsing is just using regexes, not an actual parse tree, so it may work equally well for Python 2.x and 3.x.

Do we want to deprecate pygettext.py? If so, we should rewrite http://docs.python.org/3/library/gettext.html, which encourages people to use pygettext, and add it to PEP 4.

--amk
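For reference, current GNU xgettext extracts from Python sources directly; an invocation along these lines (paths illustrative) produces the same kind of .pot file pygettext does:

    xgettext --language=Python --keyword=_ --output=messages.pot src/*.py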
From victor.stinner at gmail.com Tue Nov 12 00:25:39 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 12 Nov 2013 00:25:39 +0100
Subject: [Python-Dev] PEP 454 (tracemalloc) close to pronouncement
In-Reply-To: References: Message-ID:

2013/11/11 Charles-François Natali :
> After several exchanges with Victor, PEP 454 has reached a status
> which I consider ready for pronouncement [1]: so if you have any last
> minute comment, now is the time!

Because the PEP has a long history, 49 Mercurial revisions between September and November 2013, I tried to summarize it. The most important changes between the initial versions of PEP 454 and the current (final?) version:

- tracemalloc can store a whole traceback instead of just the filename and line number of the most recent frame

- tracemalloc is no longer a high-level tool, but a core module exposing only one thing, traces on memory blocks, with a light Snapshot class to compute statistics. Tasks, the DisplayTop class, the command line interface and metrics have been removed.

- many functions and features with no real use cases were removed. For example, get_trace(address) was taking a raw address, whereas such an address is not directly accessible in Python. It was replaced with get_object_traceback(obj), which has a better API.

- better API providing access to all data, from traces to statistics. Raw traces are accessible via Snapshot.traces, which generates temporary read-only views to get an object API.

- minimalist API, ex: no more Snapshot.timestamp attribute

- no more "premature optimizations". For example, statistics are no longer computed during capture in C, but computed on a snapshot in Python.

Charles-François did a great job converting a high-level and specialized tool into a reusable and generic module. Thanks for all your advice! Without all these changes, it would be harder to extend tracemalloc later and to reuse tracemalloc for different use cases.

Victor
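A short sketch of what the API described above looks like in use (method names as given in the PEP; the report formatting is illustrative):

import tracemalloc

tracemalloc.start(10)                       # store up to 10 frames per trace
before = tracemalloc.take_snapshot()
data = [bytes(1000) for _ in range(1000)]   # allocate something measurable
after = tracemalloc.take_snapshot()
for stat in after.compare_to(before, 'lineno')[:3]:
    print(stat)                             # top 3 differences, grouped by line
print(tracemalloc.get_object_traceback(data[0]))  # where data[0] was allocated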
From barry at python.org Tue Nov 12 00:56:21 2013
From: barry at python.org (Barry Warsaw)
Date: Mon, 11 Nov 2013 18:56:21 -0500
Subject: [Python-Dev] Is pygettext.py deprecated?
In-Reply-To: <20131111213038.GA74478@DATLANDREWK.local>
References: <20131111213038.GA74478@DATLANDREWK.local>
Message-ID: <20131111185621.52a73e05@anarchist>

On Nov 11, 2013, at 04:30 PM, A.M. Kuchling wrote:
>Do we want to deprecate pygettext.py? If so, we should rewrite
>http://docs.python.org/3/library/gettext.html, which encourages people
>to use pygettext, and add it to PEP 4.

I think we should deprecate it. If we find things that xgettext can't handle that pygettext can, we should help to fix the former.

-Barry

From pjenvey at underboss.org Tue Nov 12 02:51:02 2013
From: pjenvey at underboss.org (Philip Jenvey)
Date: Mon, 11 Nov 2013 17:51:02 -0800
Subject: [Python-Dev] Is pygettext.py deprecated?
In-Reply-To: <20131111213038.GA74478@DATLANDREWK.local>
References: <20131111213038.GA74478@DATLANDREWK.local>
Message-ID: <02ADFF8B-9192-493E-88F5-FBD693462896@underboss.org>

On Nov 11, 2013, at 1:30 PM, A.M. Kuchling wrote:

> I was looking at http://bugs.python.org/issue1098749, which adds
> extraction of multiple-line strings to pygettext.py, but on the ticket
> Mark Lawrence suggested that "pygettext should be deprecated". Is it
> deprecated? (It's not listed in PEP 4, but it isn't a module.)
>
> Debian wrote a man page for pygettext that starts with:
>
>     pygettext is deprecated. The current version of xgettext
>     supports many languages, including Python.
>
> GNU xgettext didn't understand any languages other than C at one time,
> so pygettext.py was written to handle Python code but not C code. But
> now xgettext will parse Python code. I assume xgettext's
> parsing is just using regexes, not an actual parse tree, so it may
> work equally well for Python 2.x and 3.x.
>
> Do we want to deprecate pygettext.py? If so, we should rewrite
> http://docs.python.org/3/library/gettext.html, which encourages people
> to use pygettext, and add it to PEP 4.

+1 on deprecating it, though I'd prefer to recommend the pure-Python Babel instead of xgettext. Babel can extract gettext messages from Python code and also has plugins for extracting from various popular Python templating languages. It's also easier to set up for Python projects using distutils/setuptools.

-- Philip Jenvey

From martin at v.loewis.de Tue Nov 12 05:05:48 2013
From: martin at v.loewis.de (martin at v.loewis.de)
Date: Tue, 12 Nov 2013 05:05:48 +0100
Subject: [Python-Dev] Is pygettext.py deprecated?
In-Reply-To: <20131111213038.GA74478@DATLANDREWK.local>
References: <20131111213038.GA74478@DATLANDREWK.local>
Message-ID: <20131112050548.Horde.IskbBGORnOy7_a7xGICkJA7@webmail.df.eu>

Quoting "A.M. Kuchling" :

> GNU xgettext didn't understand any languages other than C at one time,
> so pygettext.py was written to handle Python code but not C code.

That's not the only reason. Another two reasons are that

a) it works on Windows (for xgettext, you'll have to install Cygwin, which some consider a bit heavy - if all you need is xgettext)

b) it comes with Python (interesting on Unix systems that don't come with a pre-built xgettext; less relevant now as most Unix systems are Linux these days)

I see no real harm done by keeping (and possibly fixing) pygettext. I also see little harm in removing it, although I guess that some people might still rely on it.

Regards, Martin

From techtonik at gmail.com Tue Nov 12 07:43:59 2013
From: techtonik at gmail.com (anatoly techtonik)
Date: Tue, 12 Nov 2013 09:43:59 +0300
Subject: [Python-Dev] Assign(expr* targets, expr value) - why targetS?
In-Reply-To: References: Message-ID:

On Sun, Nov 10, 2013 at 8:34 AM, Benjamin Peterson wrote:
> 2013/11/10 anatoly techtonik :
>> http://hg.python.org/cpython/file/1ee45eb6aab9/Parser/Python.asdl
>>
>> In Assign(expr* targets, expr value), why is the first argument a list?
>
> x = y = 42

Thanks.

Speaking of this ASDL. `expr* targets` means that multiple entities of `expr` under the name 'targets' can be passed to the Assign statement. Assign uses them as left values. But the `expr` definition contains things that cannot be used as left-side assignment targets:

expr = BoolOp(boolop op, expr* values)
     | BinOp(expr left, operator op, expr right)
     ...
     | Str(string s) -- need to specify raw, unicode, etc?
     | Bytes(bytes s)
     | NameConstant(singleton value)
     | Ellipsis

     -- the following expressions can appear in assignment context
     | Attribute(expr value, identifier attr, expr_context ctx)
     | Subscript(expr value, slice slice, expr_context ctx)
     | Starred(expr value, expr_context ctx)
     | Name(identifier id, expr_context ctx)
     | List(expr* elts, expr_context ctx)
     | Tuple(expr* elts, expr_context ctx)

If I understand correctly, this is compiled into C struct definitions (Python-ast.c), and there is code to traverse the structure, but where is the code that validates that the structure is correct? Is it done on the first level - text file parsing, before the AST is built? If so, then what is the role of this ASDL exactly that the first step is unable to solve?

Is it possible to fix the ASDL to move the `expr` nodes that are allowed in Assign into an `expr` subset? What effect will it achieve? I mean - will the ASDL compiler complain about wrong stuff on the left side, or will it still be the role of some other component? Which one?

From georg at python.org Tue Nov 12 08:10:38 2013
From: georg at python.org (Georg Brandl)
Date: Tue, 12 Nov 2013 08:10:38 +0100
Subject: [Python-Dev] [RELEASED] Python 3.3.3 release candidate 2
Message-ID: <5281D46E.5010404@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On behalf of the Python development team, I'm quite happy to announce the Python 3.3.3 release candidate 2.

Python 3.3.3 includes several security fixes and over 150 bug fixes compared to the Python 3.3.2 release. This release fully supports OS X 10.9 Mavericks. In particular, this release fixes an issue that could cause previous versions of Python to crash when typing in interactive mode on OS X 10.9.

Python 3.3 includes a range of improvements of the 3.x series, as well as easier porting between 2.x and 3.x. In total, almost 500 API items are new or improved in Python 3.3. For a more extensive list of changes in the 3.3 series, see http://docs.python.org/3.3/whatsnew/3.3.html

To download Python 3.3.3 rc2 visit: http://www.python.org/download/releases/3.3.3/

This is a preview release, please report any bugs to http://bugs.python.org/

Enjoy!

- -- Georg Brandl, Release Manager georg at python.org (on behalf of the entire python-dev team and 3.3's contributors)

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iEYEARECAAYFAlKB1G4ACgkQN9GcIYhpnLAu5gCfRkfpnEs+rmtZ9iTjaaZcHDx3 sNYAn180Q4cFZmKtwJdaG+g/3jHAVd97 =n/Tt
-----END PGP SIGNATURE-----

From greg.ewing at canterbury.ac.nz Tue Nov 12 08:14:30 2013
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 12 Nov 2013 20:14:30 +1300
Subject: [Python-Dev] Assign(expr* targets, expr value) - why targetS?
In-Reply-To: References: Message-ID: <5281D556.8070406@canterbury.ac.nz>

anatoly techtonik wrote:
> where is the code that validates that the structure is correct? Is it done
> on the first level - text file parsing, before the AST is built? If so,
> then what is the role of this ASDL exactly that the first step is
> unable to solve?

I think it's done by the code that traverses the AST and generates bytecode. It complains if it comes across something that it doesn't know how to generate assignment code for.

The reason it's done this way is that Python's LL parser can't see far enough ahead to know that it's processing an assignment until it has parsed the whole LHS and gets to the '='. Up to that point, it could well be just something to be evaluated, so it has to accept a general expression.
I suppose a validation step could be performed before putting it into the AST, but there's no point, because the problem will become obvious during code generation anyway.

-- Greg

From benjamin at python.org Tue Nov 12 15:08:34 2013
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 12 Nov 2013 09:08:34 -0500
Subject: [Python-Dev] Assign(expr* targets, expr value) - why targetS?
In-Reply-To: References: Message-ID:

2013/11/12 anatoly techtonik :
> On Sun, Nov 10, 2013 at 8:34 AM, Benjamin Peterson wrote:
>> 2013/11/10 anatoly techtonik :
>>> http://hg.python.org/cpython/file/1ee45eb6aab9/Parser/Python.asdl
>>>
>>> In Assign(expr* targets, expr value), why is the first argument a list?
>>
>> x = y = 42
>
> Thanks.
>
> Speaking of this ASDL. `expr* targets` means that multiple entities of
> `expr` under the name 'targets' can be passed to the Assign statement.
> Assign uses them as left values. But the `expr` definition contains things
> that cannot be used as left-side assignment targets:
>
> expr = BoolOp(boolop op, expr* values)
>      | BinOp(expr left, operator op, expr right)
>      ...
>      | Str(string s) -- need to specify raw, unicode, etc?
>      | Bytes(bytes s)
>      | NameConstant(singleton value)
>      | Ellipsis
>
>      -- the following expressions can appear in assignment context
>      | Attribute(expr value, identifier attr, expr_context ctx)
>      | Subscript(expr value, slice slice, expr_context ctx)
>      | Starred(expr value, expr_context ctx)
>      | Name(identifier id, expr_context ctx)
>      | List(expr* elts, expr_context ctx)
>      | Tuple(expr* elts, expr_context ctx)
>
> If I understand correctly, this is compiled into C struct definitions
> (Python-ast.c), and there is code to traverse the structure, but
> where is the code that validates that the structure is correct? Is it done
> on the first level - text file parsing, before the AST is built? If so,
> then what is the role of this ASDL exactly that the first step is
> unable to solve?

Only valid expression targets are allowed during AST construction. See set_expr_context in ast.c.

>
> Is it possible to fix the ASDL to move the `expr` nodes that are allowed in Assign
> into an `expr` subset? What effect will it achieve? I mean - will the ASDL
> compiler complain about wrong stuff on the left side, or will it still
> be the role of some other component? Which one?

I'm not sure what you mean by an `expr` subset.

-- Regards, Benjamin
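An interpreter session makes both answers concrete: chained assignment produces two entries in targets, and an invalid target is rejected with a SyntaxError when the AST is built (session from a CPython 3.3/3.4-era interpreter, traceback abbreviated):

>>> import ast
>>> ast.dump(ast.parse("x = y = 42").body[0])
"Assign(targets=[Name(id='x', ctx=Store()), Name(id='y', ctx=Store())], value=Num(n=42))"
>>> ast.parse("f() = 42")
Traceback (most recent call last):
  ...
SyntaxError: can't assign to function call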
From victor.stinner at gmail.com Tue Nov 12 22:16:55 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 12 Nov 2013 22:16:55 +0100
Subject: [Python-Dev] The pysandbox project is broken
Message-ID:

Hi,

After working for three years on the pysandbox project to sandbox untrusted code, I have now reached a point where I am convinced that pysandbox is broken by design. Different developers tried to convince me before that the pysandbox design is unsafe, but I had to experience it myself to be convinced.

It would also be nice to help developers looking for a sandbox for their application. Please tell me if you know sandbox projects for Python so I can redirect users of pysandbox to a safer solution. I already know PyPy sandbox.

I would like to share my experience because I know that other developers are using sandboxes in production and that there is a real need for sandboxing.

Origin of pysandbox
===================

In 2010, a developer called Tav wrote a sandbox called "safelite.py": the sandbox hides sensitive attributes to separate a trusted namespace and an untrusted namespace. Tav challenged Python core developers to break his sandbox and... the sandbox was quickly broken. Even though it was quickly broken, I was convinced that Tav had found something interesting and that there is a real need for sandboxing Python. I continued his work by putting more protections on the untrusted namespace. I published pysandbox 1.0 in June 2010.

History of pysandbox
====================

pysandbox was used to build an IRC bot on a French Python channel. The bot executed Python code in the sandbox. The bot was mainly used by hackers to test the sandbox and try to find a vulnerability. It was nice to have such a bot on a Python help channel.

Three months after the release of pysandbox 1.0, the first vulnerability was found: it was possible to modify the __builtins__ dictionary to hack the sandbox functions and so escape from the sandbox. I had to blacklist common instructions like "dict.pop()" or "del dict[key]" to protect the __builtins__ dictionary. I would have preferred to use a custom type for __builtins__, but CPython requires a real dictionary: Python/ceval.c has an inlined version of PyDict_GetItem. For your information, I modified CPython 3.3 to accept arbitrary mapping types for __builtins__.

Just after this fix, another vulnerability was found: it was still possible to modify __builtins__ using the dict.__init__() method. The access to this method was also blocked.

Seven months later, new vulnerabilities were found. The "timeout" protection was removed because it is not effective on CPU-intensive functions implemented in C. And to work around a known bug in CPython crashing the interpreter, the access to the type.__bases__ attribute was also blocked. But this protection had to be disabled on CPython 2.5 because of another CPython bug... The access to the func_defaults/__defaults__ attributes of a function was also blocked to protect the sandbox, even if it was not exploitable to escape from the sandbox.

Recent events
=============

A few weeks ago, a security challenge targeted pysandbox. In less than one day, two vulnerabilities were found. First, the compile() builtin function was used to read an arbitrary file on disk line by line, by triggering a syntax error: the offending line is displayed in the traceback. Second, a context manager was used to retrieve a traceback object: from traceback.tb_frame, it was possible to navigate the frames (using frame.f_back) to retrieve a frame of the trusted namespace, and then use the f_globals attribute of the frame to retrieve a global name. Game over.

I fixed these two vulnerabilities in pysandbox 1.5.1: compile() is now blocked by default, and the access to traceback.tb_frame, frame.f_back and frame.f_globals has been blocked.
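The second vulnerability is easy to sketch. Assuming the sandbox still let untrusted code define a context manager and raise an exception, something along these lines was enough to reach the trusted globals:

class Escape:
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, tb):
        frame = tb.tb_frame
        while frame.f_back is not None:
            frame = frame.f_back           # climb out of the sandboxed frames
        trusted_globals = frame.f_globals  # globals of the trusted namespace
        return True                        # swallow the exception

with Escape():
    raise ValueError  # any exception provides a traceback object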
I also started to work on a new design of pysandbox (a version currently called "pysandbox 1.6", which might become pysandbox 2.0 later): run untrusted code in a subprocess to have a safer design. Using a subprocess, it becomes easier to limit the memory usage, set up a real timeout, limit bytes written to stdout, limit the size of data sent to and received from the child process, etc. But my main motivation was to not crash the whole application if the untrusted code exploits a known Python bug to crash the process. There are (too) many ways to crash Python using common types and functions...

The problem is that after each release it becomes harder to write Python code in the sandbox. For example, it becomes very hard to give access to objects from the trusted namespace to the untrusted namespace, because the whole object must be serialized to be passed to the child process. It also becomes harder to debug bugs in the sandboxed code, because the traceback feature doesn't work well in the sandbox.

Pysandbox is broken
===================

In my opinion, the compile() vulnerability is the proof that it is not possible to put a sandbox in CPython. Blocking access to the open() builtin function and the file type constructor is not enough if unrelated functions can indirectly give access to the file system. Having read access to the file system is a critical vulnerability in pysandbox, and modifying CPython to not print the source code line in a traceback is also not acceptable.

I now agree that putting a sandbox in CPython is the wrong design. There are too many ways to escape the untrusted namespace using the various introspection features of the Python language. To guarantee the safety of a security product, the code should be carefully audited and the code to review must be as small as possible. Using pysandbox, the "code" is the whole Python core, which is a really huge code base. For example, the Python and Objects directories of Python 3.4 contain more than 126,000 lines of C code.

The security of pysandbox is the security of its weakest part. A single bug is enough to escape the whole sandbox. Attackers had original and varied ideas like hacking __builtins__, using warnings, context managers, syntax errors, arbitrary bytecode, etc. It is hard to protect the untrusted namespace against all these different Python features.

It might be possible to invest a lot of time to add enough protections to secure the untrusted namespace, but that leads to my second point: pysandbox cannot be used in practice.

pysandbox cannot be used in practice
====================================

To protect the untrusted namespace, pysandbox installs a lot of different protections. Because of all these protections, it becomes hard to write Python code. Basic features like "del dict[key]" are denied. Passing an object into the sandbox is not possible: pysandbox is unable to proxify arbitrary objects. For something more complex than evaluating "1+(2*3)", pysandbox cannot be used in practice, because of all these protections. Individual protections cannot be disabled; all protections are required to get a secure sandbox.

So what should be used to sandbox Python?
=========================================

I developed pysandbox for fun in my free time. But I was contacted by different companies interested in using pysandbox in production on their web applications. So I think that there is a real need to execute arbitrary untrusted code.

I now think that putting a sandbox directly in Python cannot be secure. To build a secure sandbox, the whole Python process must be put in an external sandbox. There are, for example, projects using the Linux SECCOMP security feature to isolate the Python process. PyPy has a similar design: it implemented something similar to SECCOMP, but in a portable way.

Please tell me if you know sandbox projects for Python so I can redirect users of pysandbox to a safer solution. I already know PyPy sandbox.

Victor
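For readers looking for a starting point, the external-sandbox direction can be sketched with a subprocess plus OS resource limits. This is POSIX-only and is not a sandbox by itself; it only bounds memory and CPU, and real isolation still needs SECCOMP, containers or similar:

import resource
import subprocess
import sys

untrusted_source = "print(1 + (2 * 3))"   # stands in for the untrusted code

def set_limits():
    # cap the address space at 100 MB and CPU time at 5 seconds
    resource.setrlimit(resource.RLIMIT_AS, (100 * 1024 * 1024, 100 * 1024 * 1024))
    resource.setrlimit(resource.RLIMIT_CPU, (5, 5))

proc = subprocess.Popen([sys.executable, "-c", untrusted_source],
                        preexec_fn=set_limits,
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = proc.communicate()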
Victor

From victor.stinner at gmail.com  Tue Nov 12 23:52:56 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Tue, 12 Nov 2013 23:52:56 +0100
Subject: [Python-Dev] [Python-checkins] cpython: Provide a more readable representation of socket on repr().
In-Reply-To: <3dK35t3mfqz7LkL@mail.python.org>
References: <3dK35t3mfqz7LkL@mail.python.org>
Message-ID:

Hi Giampaolo,

You forgot to update tests after your change in repr(socket). Tests
are failing on buildbots, just one example:

======================================================================
FAIL: test_repr (test.test_socket.GeneralModuleTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/lib/buildslave/3.x.murray-gentoo/build/Lib/test/test_socket.py",
line 653, in test_repr
    self.assertIn('family=%i' % socket.AF_INET, repr(s))
AssertionError: 'family=2' not found in "<socket.socket fd=...,
family=AddressFamily.AF_INET, type=SocketType.SOCK_STREAM, proto=0,
laddr=('0.0.0.0', 0)>"

----------------------------------------------------------------------

Victor

2013/11/12 giampaolo.rodola :
> http://hg.python.org/cpython/rev/c5751f01b09b
> changeset:   87074:c5751f01b09b
> parent:      85942:0d079c66dc23
> user:        Giampaolo Rodola'
> date:        Thu Oct 03 21:01:43 2013 +0200
> summary:
>   Provide a more readable representation of socket on repr().
>
> Before:
> <socket.socket fd=..., family=2, type=1, proto=0>
>
> Now:
> <socket.socket fd=..., family=AddressFamily.AF_INET,
> type=SocketType.SOCK_STREAM, proto=0>
>
> files:
>   Lib/socket.py |  2 +-
>   1 files changed, 1 insertions(+), 1 deletions(-)
>
>
> diff --git a/Lib/socket.py b/Lib/socket.py
> --- a/Lib/socket.py
> +++ b/Lib/socket.py
> @@ -136,7 +136,7 @@
>          address(es).
>          """
>          closed = getattr(self, '_closed', False)
> -        s = "<%s.%s%s fd=%i, family=%i, type=%i, proto=%i" \
> +        s = "<%s.%s%s fd=%i, family=%s, type=%s, proto=%i" \
>              % (self.__class__.__module__,
>                self.__class__.__name__,
>                " [closed]" if closed else "",
>
> --
> Repository URL: http://hg.python.org/cpython
>
> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> https://mail.python.org/mailman/listinfo/python-checkins
>

From ncoghlan at gmail.com  Wed Nov 13 00:40:48 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 13 Nov 2013 09:40:48 +1000
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To:
References:
Message-ID:

On 13 Nov 2013 07:18, "Victor Stinner" wrote:
>
> Please tell me if you know sandbox projects for Python so I can
> redirect users of pysandbox to a safer solution. I already know PyPy
> sandbox.

Sandboxing is hard enough (see also the many JVM vulnerabilities) that
the only ones I even remotely trust are the platform level mechanisms
that form the foundation of the various PaaS services, including
SELinux and Linux containers.

Cross platform? In process? Even Lua is hard to secure in that
situation, and it has a much smaller attack surface than CPython or
Java.

Cheers,
Nick.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tjreedy at udel.edu  Wed Nov 13 00:48:55 2013
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 12 Nov 2013 18:48:55 -0500
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To:
References:
Message-ID:

On 11/12/2013 4:16 PM, Victor Stinner wrote:

> It would also be nice to help developers looking for a sandbox for
> their application. Please tell me if you know sandbox projects for
> Python so I can redirect users of pysandbox to a safer solution. I
> already know PyPy sandbox.

There are several websites running submitted Python code (and in some
cases, many other languages).
ProjectEuler
CodeAcademy (I think they use someone else's code box)
CheckIO.org - python only
other coding challenge sites
I suspect they use sandboxed processes but have not seen anyone talk
about what they are doing.
-- 
Terry Jan Reedy

From josiah.carlson at gmail.com  Wed Nov 13 00:49:28 2013
From: josiah.carlson at gmail.com (Josiah Carlson)
Date: Tue, 12 Nov 2013 15:49:28 -0800
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To:
References:
Message-ID:

Python-dev is for the development of the Python core language, the
CPython runtime, and libraries. Your sandbox, despite using and
requiring deep knowledge of the runtime, is not developing those
things. If you had a series of requests for the language or runtime
that would make your job easier, then your thread would be on-topic.

I replied off-list because I didn't want to contribute to the
off-topic posting, but if posting on-list is required for you to pay
attention, so be it.

- Josiah

On Nov 12, 2013 2:51 PM, "Victor Stinner" wrote:
> 2013/11/12 Josiah Carlson :
> > I'm replying off-list because I didn't want to bother the other folks in
> > python-dev (also, your post might have been better on python-list, but I
> > digress).
>
> I don't understand why you are writing to me directly. I won't reply
> if you don't write publicly on python-dev.
>
> Summary of my email: it's not possible to write a sandbox in CPython.
> So it's very specific to CPython internals. I'm not subscribed to
> python-list.
>
> Victor
>
> >
> > Long story short, I think that you are right, and I think that you are
> > wrong.
> >
> > I think that you are right that your current pysandbox implementation is
> > likely broken by design. You are starting from a completely working Python
> > runtime, then eliminating/hiding/blocking certain features. This makes it a
> > game of whack-a-mole: for every vulnerability you fix, a new one comes up
> > later. The only way to fix this problem is to change your design.
> >
> > If you wanted to do it right, instead of removing things that are
> > vulnerable, start by defining what is safe, and expose only those safe
> > things. As an example, you did the right thing by splitting your main and
> > subprocess into two pieces. But you don't need to serialize your objects
> > from the trusted namespace to give access to the sandbox (that exposes your
> > "trusted" objects to the sandbox in a raw manner, in obvious preparation for
> > exploitation). Instead you would just expose a proxy object whose method
> > calls/attribute references are made across your pipe (or socket, or
> > whatever) to the trusted controlling process. Is it slower? Yes. Does it
> > matter? Not if it keeps the sandbox secure.
> >
> > Now if you start by saying, "what is allowed?", the most obvious destination
> > is that you will more or less end up writing your own Python runtime. That's
> > not necessarily a bad thing, as if you know that a new runtime is your
> > destination, you can look for a viable alternate-language runtime to begin
> > with to short-circuit your work. The best option that I can come up with at
> > this point is Javascript as a destination language, as there are several
> > Python to Javascript compilers out there, Javascript is sandboxed by design,
> > and you can arbitrarily eliminate portions of the py->js compilation
> > opportunities to eliminate attack vectors (specifically keeping only those
> > that you know won't lead to an attack).
> >
> > Another option is Lua, though I don't really know of any viable Python to
> > Lua transpilers out there.
> >
> > Good luck with whatever you decide to do.
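A minimal sketch of the proxy idea described in the quoted reply
above; all names here are hypothetical rather than actual pysandbox
APIs, and "channel" stands for any pipe-like object with send()/recv()
methods, such as one end of multiprocessing.Pipe():

    class TrustedProxy:
        """Stand-in that lives in the sandboxed child and forwards
        method calls to the real object in the trusted parent."""

        def __init__(self, channel, obj_id):
            self._channel = channel
            self._obj_id = obj_id

        def __getattr__(self, name):
            def remote_call(*args):
                self._channel.send((self._obj_id, name, args))
                ok, value = self._channel.recv()
                if not ok:
                    # Only a sanitized error message crosses the boundary.
                    raise RuntimeError(value)
                return value
            return remote_call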
> >
> > Regards,
> > - Josiah
> >
> > On Tue, Nov 12, 2013 at 1:16 PM, Victor Stinner <victor.stinner at gmail.com>
> > wrote:
> >>
> >> [snip: full quote of Victor's original message, reproduced above]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From victor.stinner at gmail.com  Wed Nov 13 00:58:42 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 13 Nov 2013 00:58:42 +0100
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To:
References:
Message-ID:

2013/11/13 Josiah Carlson :
> Python-dev is for the development of the Python core language, the CPython
> runtime, and libraries. Your sandbox, despite using and requiring deep
> knowledge of the runtime, is not developing those things. If you had a
> series of requests for the language or runtime that would make your job
> easier, then your thread would be on-topic.

My initial goal was to put pysandbox directly into CPython when it
would be considered safe and stable. The PEP 416 (frozendict) was a
first step in this direction, but the PEP was rejected.

I now gave up on sandboxing Python. I just would like to warn other
core developers that trying to put a sandbox in Python is not a good
idea :-)

Victor

From victor.stinner at gmail.com  Wed Nov 13 00:53:53 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 13 Nov 2013 00:53:53 +0100
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To:
References:
Message-ID:

2013/11/13 Terry Reedy :
> There are several websites running submitted Python code (and in some cases,
> many other languages).
> ProjectEuler
> CodeAcademy (I think they use someone else's code box)
> CheckIO.org - python only
> other coding challenge sites
> I suspect they use sandboxed processes but have not seen anyone talk about
> what they are doing.
It's probably a sandbox around the Python process, not inside the
process.

There is also http://shell.appspot.com/ which uses Google AppEngine.
In my opinion, Google AppEngine doesn't use a sandbox in Python, but
outside Python.

Victor

From guido at python.org  Wed Nov 13 01:04:49 2013
From: guido at python.org (Guido van Rossum)
Date: Tue, 12 Nov 2013 16:04:49 -0800
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To:
References:
Message-ID:

On Tue, Nov 12, 2013 at 3:53 PM, Victor Stinner wrote:
> 2013/11/13 Terry Reedy :
> > There are several websites running submitted Python code (and in some
> > cases, many other languages).
> > ProjectEuler
> > CodeAcademy (I think they use someone else's code box)
> > CheckIO.org - python only
> > other coding challenge sites
> > I suspect they use sandboxed processes but have not seen anyone talk
> > about what they are doing.

I sure hope so.

> It's probably a sandbox around the Python process, not inside the process.
>
> There is also http://shell.appspot.com/ which uses Google AppEngine.
> In my opinion, Google AppEngine doesn't use a sandbox in Python, but
> outside Python.

That's not just your opinion, it's a fact.

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From steve at pearwood.info  Wed Nov 13 01:11:22 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 13 Nov 2013 11:11:22 +1100
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To:
References:
Message-ID: <20131113001121.GG2085@ando>

On Wed, Nov 13, 2013 at 12:58:42AM +0100, Victor Stinner wrote:

> I now gave up on sandboxing Python. I just would like to warn other
> core developers that trying to put a sandbox in Python is not a good
> idea :-)

Do you mean CPython?

Do you think it would be productive to create an independent Python
compiler, designed with sandboxing in mind from the beginning?

-- 
Steven

From v+python at g.nevcal.com  Wed Nov 13 01:47:47 2013
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Tue, 12 Nov 2013 16:47:47 -0800
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To: <20131113001121.GG2085@ando>
References: <20131113001121.GG2085@ando>
Message-ID: <5282CC33.7010806@g.nevcal.com>

On 11/12/2013 4:11 PM, Steven D'Aprano wrote:
> On Wed, Nov 13, 2013 at 12:58:42AM +0100, Victor Stinner wrote:
>
>> >I now gave up on sandboxing Python. I just would like to warn other
>> >core developers that trying to put a sandbox in Python is not a good
>> >idea:-)
> Do you mean CPython?
>
> Do you think it would be productive to create an independent Python
> compiler, designed with sandboxing in mind from the beginning?

In reading this thread, which I took as an on-topic dismissal of an
integrated CPython sandbox, I also wondered if it was a CPython
implementation issue, or a language design issue.

If it is an implementation issue, then perhaps a different
implementation would help. Or perhaps a "safe compiler".

If it is a language design issue, then a different implementation
wouldn't help; it would require a new language, or a restricted
subset. I'm not sure whether some of the onerous sounding restrictions
result from language or implementation issues; some of them certainly
sounded like implementation issues.

A restricted subset, compiled by a validating compiler, might still be
a useful language, even if the execution speed has to be reduced by a
validating runtime.
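A minimal sketch of what such a validating compiler could look like,
using the ast module (purely illustrative: a real design would need a
whitelist rather than this kind of blacklist, for exactly the reasons
discussed elsewhere in this thread):

    import ast

    def compile_restricted(source):
        # Toy validator: reject imports and underscore attributes,
        # then compile the vetted AST.
        tree = ast.parse(source, '<untrusted>', 'exec')
        for node in ast.walk(tree):
            if isinstance(node, (ast.Import, ast.ImportFrom)):
                raise ValueError('imports are forbidden')
            if isinstance(node, ast.Attribute) and node.attr.startswith('_'):
                raise ValueError('access to %r is forbidden' % node.attr)
        return compile(tree, '<untrusted>', 'exec')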
Perhaps exception handling for exceptions hit inside a sandbox needs
to stop at the sandbox boundary. That is, exceptions within the
sandbox stay within the sandbox, and exceptions generated due to
sandbox calls to the implementation need to stay outside the sandbox,
with only sanitized and limited information passed back into the
sandbox.

Perhaps a different/restricted set of builtins must be provided within
the sandbox.

These ideas may perhaps still allow a CPython sandbox to be written,
or may only help a new implementation.

Is there technology in the smartphone OSes that could be applied? iOS
seems to not even provide a file system to its apps, and there is
limited sharing of data from one app to the next. Android provides an
explicit subset of system services to its apps.

Thanks, Victor, for the update on your sandbox efforts. I was hoping
you would be successful, and then I was wondering if you had abandoned
the effort, and now I know what the current status is.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From christian at python.org  Wed Nov 13 03:09:25 2013
From: christian at python.org (Christian Heimes)
Date: Wed, 13 Nov 2013 03:09:25 +0100
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To: <5282CC33.7010806@g.nevcal.com>
References: <20131113001121.GG2085@ando> <5282CC33.7010806@g.nevcal.com>
Message-ID: <5282DF55.5050701@python.org>

On 13.11.2013 01:47, Glenn Linderman wrote:
> If it is an implementation issue, then perhaps a different
> implementation would help. Or perhaps a "safe compiler".
>
> If it is a language design issue, then a different implementation
> wouldn't help; it would require a new language, or a restricted subset.
> I'm not sure whether some of the onerous sounding restrictions result
> from language or implementation issues; some of them certainly sounded
> like implementation issues.
>
> A restricted subset, compiled by a validating compiler, might still be a
> useful language, even if the execution speed has to be reduced by a
> validating runtime.

A limited and well-defined subset of Python may do the trick, perhaps
a project based on RPython. Zope has a long history of restricted
Python code with safe-guards and security proxies. Any project must
start with a proper threat model and goals. Does sandboxed code need
to access frame objects and use compile()? Could we perhaps use
limited subinterpreters with reduced / modified builtins to achieve
isolation?

CPython still has a couple of crashers, too. These must be resolved.
You don't want sandboxed code to generate a segfault, do you?

> Is there technology in the smartphone OSes that could be applied? iOS
> seems to not even provide a file system to its apps, and there is
> limited sharing of data from one app to the next. Android provides an
> explicit subset of system services to its apps.

On Linux seccomp may be a feasible way to prevent syscalls. Seccomp
basically can limit the capability of a thread so it can no longer do
certain syscalls. Chrome uses it for sandboxing.

Christian

From ned at nedbatchelder.com  Wed Nov 13 04:37:11 2013
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Tue, 12 Nov 2013 22:37:11 -0500
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To:
References:
Message-ID: <5282F3E7.9060104@nedbatchelder.com>

On 11/12/13 6:48 PM, Terry Reedy wrote:
> On 11/12/2013 4:16 PM, Victor Stinner wrote:
Please tell me if you know sandbox projects for >> Python so I can redirect users of pysandbox to a safer solution. I >> already know PyPy sandbox. > > There are several websites running submitted Python code (and in some > cases, many other languages). > ProjectEuler > CodeAcademy (I think they use someone else's code box) > CheckIO.org - python only > other coding challenge sites > I suspect they use sandboxed processes but have not seen anyone talk > about what they are doing. > At edX, we use CodeJail to apply OS-level sandboxing to untrusted Python code: https://github.com/edx/codejail --Ned. From ncoghlan at gmail.com Wed Nov 13 07:40:56 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 13 Nov 2013 16:40:56 +1000 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: <5282F3E7.9060104@nedbatchelder.com> References: <5282F3E7.9060104@nedbatchelder.com> Message-ID: On 13 Nov 2013 13:44, "Ned Batchelder" wrote: > > On 11/12/13 6:48 PM, Terry Reedy wrote: >> >> On 11/12/2013 4:16 PM, Victor Stinner wrote: >> >>> It would also be nice to help developers looking for a sandbox for >>> their application. Please tell me if you know sandbox projects for >>> Python so I can redirect users of pysandbox to a safer solution. I >>> already know PyPy sandbox. >> >> >> There are several websites running submitted Python code (and in some cases, many other languages). >> ProjectEuler >> CodeAcademy (I think they use someone else's code box) >> CheckIO.org - python only >> other coding challenge sites >> I suspect they use sandboxed processes but have not seen anyone talk about what they are doing. >> > > At edX, we use CodeJail to apply OS-level sandboxing to untrusted Python code: https://github.com/edx/codejail A couple of years ago at PyCon AU, Tim Dawborn went over the sandboxing approach used for the National Computer Science School infrastructure: http://m.youtube.com/watch?v=y-WPPdhTKBU&feature=plpp&p=PLpKCScKXUAmerE_uUsImVlPsmhLaYQuQy Cheers, Nick. > > --Ned. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Nov 13 07:48:03 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 13 Nov 2013 16:48:03 +1000 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: <5282DF55.5050701@python.org> References: <20131113001121.GG2085@ando> <5282CC33.7010806@g.nevcal.com> <5282DF55.5050701@python.org> Message-ID: On 13 Nov 2013 12:11, "Christian Heimes" wrote: > > Am 13.11.2013 01:47, schrieb Glenn Linderman: > > If it is an implementation issue, then perhaps a different > > implementation would help. Or perhaps a "safe compiler". > > > > If it is a language design issue, then a different implementation > > wouldn't help, it would require a new language, or a restricted subset. > > I'm not sure whether some of the onerous sounding restrictions result > > from language or implementation issues; some of them certainly sounded > > like implementation issues. > > > > A restricted subset, compiled by a validating compiler, might still be a > > useful language, even if the execution speed has to be reduced by a > > validating runtime. > > A limited and well-defined subset of Python may do the trick, perhaps a > project based on RPython. 
> Zope has a long history of restricted Python code with safe-guards
> and security proxies. Any project must start with a proper threat
> model and goals. Does sandboxed code need to access frame objects and
> use compile()? Could we perhaps use limited subinterpreters with
> reduced / modified builtins to achieve isolation?

Brett Cannon also spent some time exploring the idea of a security
capability based model for a Python implementation.

> CPython still has a couple of crashers, too. These must be resolved.
> You don't want sandboxed code to generate a segfault, do you?

Indeed - it would be interesting to see if any of those have been
resolved by the various edge case fixes in recent months.

> > Is there technology in the smartphone OSes that could be applied? iOS
> > seems to not even provide a file system to its apps, and there is
> > limited sharing of data from one app to the next. Android provides an
> > explicit subset of system services to its apps.
>
> On Linux seccomp may be a feasible way to prevent syscalls. Seccomp
> basically can limit the capability of a thread so it can no longer do
> certain syscalls. Chrome uses it for sandboxing.

Yeah, there's a reason our standard answer to "How do I sandbox Python
code?" has been "Use a subprocess and the OS provided process
sandboxing facilities" for quite some time. Sandboxing software *at
all* is difficult, doing it cross-platform is even harder.

Cheers,
Nick.

> Christian
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com  Wed Nov 13 07:54:30 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 13 Nov 2013 16:54:30 +1000
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To:
References:
Message-ID:

On 13 Nov 2013 09:56, "Josiah Carlson" wrote:
>
> Python-dev is for the development of the Python core language, the
> CPython runtime, and libraries. Your sandbox, despite using and
> requiring deep knowledge of the runtime, is not developing those
> things. If you had a series of requests for the language or runtime
> that would make your job easier, then your thread would be on-topic.

While it may seem off-topic at first glance, pysandbox started out as
Victor's attempt to prove those of us who were saying this wouldn't
work wrong, back when he proposed replacing the long dead rexec and
Bastion with something more robust.

I actually applaud his decision to post his final conclusion to the
list, even though it wasn't the outcome he was hoping for. Negative
data is still data :)

Cheers,
Nick.

> [snip: the rest of the quoted exchange duplicates Josiah's message above]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From g.brandl at gmx.net  Wed Nov 13 08:04:04 2013
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 13 Nov 2013 08:04:04 +0100
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To:
References:
Message-ID:

On 13.11.2013 00:49, Josiah Carlson wrote:
> Python-dev is for the development of the Python core language, the CPython
> runtime, and libraries. Your sandbox, despite using and requiring deep knowledge
> of the runtime, is not developing those things. If you had a series of requests
> for the language or runtime that would make your job easier, then your thread
> would be on-topic.

Can we please exempt core committers from these misdemeanor notices?

Thanks,
Georg

From fijall at gmail.com  Wed Nov 13 08:37:37 2013
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Wed, 13 Nov 2013 09:37:37 +0200
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To: <20131113001121.GG2085@ando>
References: <20131113001121.GG2085@ando>
Message-ID:

On Wed, Nov 13, 2013 at 2:11 AM, Steven D'Aprano wrote:
> On Wed, Nov 13, 2013 at 12:58:42AM +0100, Victor Stinner wrote:
>
>> I now gave up on sandboxing Python. I just would like to warn other
>> core developers that trying to put a sandbox in Python is not a good
>> idea :-)
>
> Do you mean CPython?
>
> Do you think it would be productive to create an independent Python
> compiler, designed with sandboxing in mind from the beginning?

PyPy sandbox does work, FYI.

It might not do exactly what you want, but it both provides a full
python and security.

Cheers,
fijal

From victor.stinner at gmail.com  Wed Nov 13 09:28:41 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 13 Nov 2013 09:28:41 +0100
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To: <5282CC33.7010806@g.nevcal.com>
References: <20131113001121.GG2085@ando> <5282CC33.7010806@g.nevcal.com>
Message-ID:

2013/11/13 Glenn Linderman :
> If it is an implementation issue, then perhaps a different implementation
> would help. Or perhaps a "safe compiler".

There is PyPy with its sandbox.

> If it is a language design issue, then a different implementation wouldn't
> help; it would require a new language, or a restricted subset. I'm not sure
> whether some of the onerous sounding restrictions result from language or
> implementation issues; some of them certainly sounded like implementation
> issues.
> A restricted subset, compiled by a validating compiler, might still be a
> useful language, even if the execution speed has to be reduced by a
> validating runtime.
>
> Perhaps exception handling for exceptions hit inside a sandbox needs to
> stop at the sandbox boundary. That is, exceptions within the sandbox stay
> within the sandbox, and exceptions generated due to sandbox calls to the
> implementation need to stay outside the sandbox, with only sanitized and
> limited information passed back into the sandbox.
>
> Perhaps a different/restricted set of builtins must be provided within
> the sandbox.

The problem is to draw a line between the trusted namespace and the
untrusted namespace. Tracebacks are just one example; there are too
many others. Just another example: from type.__bases__, you may reach
all available types in Python, even "sensitive" types.

If you cannot draw a line because it is too complex, it probably means
that it's simpler to consider that the whole Python process is
untrusted. In this case, you have to put the sandbox outside Python,
not inside.

The second problem is that if you modify the Python language and write
a limited implementation of Python, it is no longer the Python
language. What is the purpose of your sandbox if you cannot use the
full Python language and the stdlib?

It also depends on how you use the sandbox. If it's just to evaluate
basic mathematical expressions, it's easier to use Python with an
external sandbox. If you want to plug the sandbox "in your
application", it's more complex because you have to give access to
your sensitive data through a proxy, so the proxy must be carefully
written.

Victor

From hodgestar+pythondev at gmail.com  Wed Nov 13 10:33:24 2013
From: hodgestar+pythondev at gmail.com (Simon Cross)
Date: Wed, 13 Nov 2013 11:33:24 +0200
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To:
References:
Message-ID:

Thanks for writing this email. It's well written, and it takes a lot
of character to stand up and say you went down the wrong road. While
I'm here - thanks also for all your work on core Python. As a Python
user I really appreciate it.

Schiavo
Simon

From facundobatista at gmail.com  Wed Nov 13 12:30:05 2013
From: facundobatista at gmail.com (Facundo Batista)
Date: Wed, 13 Nov 2013 09:30:05 -0200
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To:
References: <20131113001121.GG2085@ando>
Message-ID:

On Wed, Nov 13, 2013 at 4:37 AM, Maciej Fijalkowski wrote:

>> Do you think it would be productive to create an independent Python
>> compiler, designed with sandboxing in mind from the beginning?
>
> PyPy sandbox does work, FYI.
>
> It might not do exactly what you want, but it both provides a full
> python and security.

If we have sandboxing using PyPy... what else do we need to put Python
in the browser? (like JavaScript, you know)

Thanks!

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/
Twitter: @facundobatista

From ncoghlan at gmail.com  Wed Nov 13 13:26:07 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 13 Nov 2013 22:26:07 +1000
Subject: [Python-Dev] [Python-checkins] cpython: Provide a more readable representation of socket on repr().
In-Reply-To:
References: <3dK35t3mfqz7LkL@mail.python.org>
Message-ID:

On 13 November 2013 08:52, Victor Stinner wrote:
> Hi Giampaolo,
>
> You forgot to update tests after your change in repr(socket).
> Tests are failing on buildbots, just one example:
>
> ======================================================================
> FAIL: test_repr (test.test_socket.GeneralModuleTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/var/lib/buildslave/3.x.murray-gentoo/build/Lib/test/test_socket.py",
> line 653, in test_repr
>     self.assertIn('family=%i' % socket.AF_INET, repr(s))
> AssertionError: 'family=2' not found in "<socket.socket fd=...,
> family=AddressFamily.AF_INET, type=SocketType.SOCK_STREAM, proto=0,
> laddr=('0.0.0.0', 0)>"

I fixed this to turn the buildbots green again before committing the
codec error handling changes.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Wed Nov 13 15:29:51 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 14 Nov 2013 00:29:51 +1000
Subject: [Python-Dev] Restoring the aliases for the non-Unicode codecs
Message-ID:

Back in Python 3.2, the non-Unicode codecs were restored to the
standard library, but without the associated aliases (mostly due to
some thoroughly confusing error messages when they were mistakenly
used with the Unicode encoding convenience methods).

The long gory history in http://bugs.python.org/issue7475 took a
different turn earlier this year when I noticed
(http://bugs.python.org/issue7475#msg187698) that the codecs module
already *had* type neutral helper functions in the form of
codecs.encode and codecs.decode, and has had them since Python 2.4.
These were covered in the test suite, but not in the documentation.

That realisation substantially changed my perspective on the issue,
since it was no longer a matter of adding a new API, but of better
documenting and facilitating the use of one we already had (one that
has been supported all the way back to Python 2.4, making it easy to
use in single-source Python 2/3 projects as well). Since then, three
key supporting issues have been addressed for Python 3.4:

* codecs.encode() and codecs.decode() have been documented in 2.7, 3.3
and default (http://bugs.python.org/issue17827)
* codec errors have been updated to incorporate the name of the codec
whenever feasible, and output type errors in the Unicode encoding
convenience methods refer users to the type-neutral object<->object
convenience functions (http://bugs.python.org/issue17839)
* the especially confusing errors from the base64 codecs have been
eliminated by updating the base64 module to use memoryview to process
binary input rather than explicit typechecks on builtin types
(http://bugs.python.org/issue17839)

So, with those underlying issues resolved, I would now like to restore
the aliases for the non-Unicode codecs that were removed in
http://bugs.python.org/issue10807 (aliases) and
http://bugs.python.org/issue17841 (docs).

I also looked into the possibility of providing appropriate 2to3
fixers, but the data driven nature of the problem makes that
impractical. However, the presence of codecs.decode and codecs.encode
in Python 2.4+ makes a new Py3k warning in 2.7.7 a viable option:
http://bugs.python.org/issue19543

Regards,
Nick.

P.S. Until the next docs rebuild, you can see a summary of the
non-Unicode codec handling improvements in the What's New diff here:
http://hg.python.org/cpython/rev/854a2cea31b9
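A quick illustration of those type-neutral functions with one of the
binary codecs (a minimal sketch; note the full codec name, which works
even while the aliases are absent):

    >>> import codecs
    >>> codecs.encode(b'hello', 'hex_codec')
    b'68656c6c6f'
    >>> codecs.decode(b'68656c6c6f', 'hex_codec')
    b'hello'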
-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From mal at egenix.com  Wed Nov 13 15:36:01 2013
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 13 Nov 2013 15:36:01 +0100
Subject: [Python-Dev] Restoring the aliases for the non-Unicode codecs
In-Reply-To:
References:
Message-ID: <52838E51.5070902@egenix.com>

On 13.11.2013 15:29, Nick Coghlan wrote:
> Back in Python 3.2, the non-Unicode codecs were restored to the
> standard library, but without the associated aliases (mostly due to
> some thoroughly confusing error messages when they were mistakenly
> used with the Unicode encoding convenience methods).
> [...]
> So, with those underlying issues resolved, I would now like to restore
> the aliases for the non-Unicode codecs that were removed in
> http://bugs.python.org/issue10807 (aliases) and
> http://bugs.python.org/issue17841 (docs).

+1 and thanks for your work on this.

> [...]

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Nov 13 2013)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2013-11-19: Python Meeting Duesseldorf ...                 6 days to go

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg
    Registered at Amtsgericht Duesseldorf: HRB 46611
    http://www.egenix.com/company/contact/

From walter at livinglogic.de  Wed Nov 13 15:30:38 2013
From: walter at livinglogic.de (Walter Dörwald)
Date: Wed, 13 Nov 2013 15:30:38 +0100
Subject: [Python-Dev] [Python-checkins] cpython: Close #17828: better handling of codec errors
In-Reply-To: <3dKS143jbKz7Lk8@mail.python.org>
References: <3dKS143jbKz7Lk8@mail.python.org>
Message-ID: <52838D0E.5040606@livinglogic.de>

On 13.11.13 14:51, nick.coghlan wrote:
> http://hg.python.org/cpython/rev/854a2cea31b9
> changeset: 87084:854a2cea31b9
> user: Nick Coghlan
> date: Wed Nov 13 23:49:21 2013 +1000
> summary:
> Close #17828: better handling of codec errors
>
> - output type errors now redirect users to the type-neutral
>   convenience functions in the codecs module
> - stateless errors that occur during encoding and decoding
>   will now be automatically wrapped in exceptions that give
>   the name of the codec involved

Wouldn't it be better to add an annotation API to the exception
classes? This would make it possible to annotate all exceptions
without having to replace the exception object. I.e. BaseException
would have an additional method annotate():

    try:
        dostuff(param)
    except Exception as exc:
        exc.annotate("while doing stuff with {}".format(param))

annotate() would simply append the message to an internal list
attribute. BaseException.__str__() could then use this to format an
appropriate exception message.

Servus,
   Walter

From brett at python.org  Wed Nov 13 15:58:55 2013
From: brett at python.org (Brett Cannon)
Date: Wed, 13 Nov 2013 09:58:55 -0500
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To:
References: <20131113001121.GG2085@ando>
Message-ID:

On Wed, Nov 13, 2013 at 6:30 AM, Facundo Batista wrote:
> On Wed, Nov 13, 2013 at 4:37 AM, Maciej Fijalkowski wrote:
>
>>> Do you think it would be productive to create an independent Python
>>> compiler, designed with sandboxing in mind from the beginning?
>>
>> PyPy sandbox does work, FYI.
>>
>> It might not do exactly what you want, but it both provides a full
>> python and security.
>
> If we have sandboxing using PyPy... what else do we need to put Python
> in the browser? (like JavaScript, you know)
>
> Thanks!

You can try to get PNaCl to work with Python to get a Python
executable that at least Chrome can run.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tseaver at palladion.com  Wed Nov 13 16:21:49 2013
From: tseaver at palladion.com (Tres Seaver)
Date: Wed, 13 Nov 2013 10:21:49 -0500
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To:
References:
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/13/2013 01:54 AM, Nick Coghlan wrote:

> I actually applaud his decision to post his final conclusion to the
> list, even though it wasn't the outcome he was hoping for. Negative
> data is still data :)

Amen!  I also applaud the work he put into the idea.

Tres.
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iEYEARECAAYFAlKDmQ0ACgkQ+gerLs4ltQ5XkACgn8riIUhp624gmVxSDqp6KNK+ A74An0Z3BKtc8CU0fSTsr4U76MTMw3Hi =YtJz -----END PGP SIGNATURE----- From breamoreboy at yahoo.co.uk Wed Nov 13 16:29:47 2013 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Wed, 13 Nov 2013 15:29:47 +0000 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: References: Message-ID: On 13/11/2013 09:33, Simon Cross wrote: > Thanks for writing this email. It's well written and it takes a lot of > character to stand up and say you went down the wrong road. While I'm > here - thanks also for all your work on core Python. As a Python user > I really appreciate it. > > Schiavo > Simon > Big +1s from me to Victor Stinner for his original email and to Simon Cross for this response. -- Python is the second best programming language in the world. But the best has yet to be invented. Christian Tismer Mark Lawrence From stephen at xemacs.org Wed Nov 13 16:43:01 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 14 Nov 2013 00:43:01 +0900 Subject: [Python-Dev] Restoring the aliases for the non-Unicode codecs In-Reply-To: References: Message-ID: <87ppq4gvei.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > The long gory history in http://bugs.python.org/issue7475 took a > different turn earlier this year when I noticed > (http://bugs.python.org/issue7475#msg187698) that the codecs module > already *had* type neutral helper functions in the form of > codecs.encode and codecs.decode, and has had them since Python 2.4. > These were covered in the test suite, but not in the documentation. As long as the agreement on _methods_ (documented in http://bugs.python.org/issue7475#msg96240) isn't changed, I'm +1 on this. From ncoghlan at gmail.com Wed Nov 13 17:12:30 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Nov 2013 02:12:30 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Close #17828: better handling of codec errors In-Reply-To: <52838D0E.5040606@livinglogic.de> References: <3dKS143jbKz7Lk8@mail.python.org> <52838D0E.5040606@livinglogic.de> Message-ID: On 14 November 2013 00:30, Walter D?rwald wrote: > On 13.11.13 14:51, nick.coghlan wrote: > >> http://hg.python.org/cpython/rev/854a2cea31b9 >> changeset: 87084:854a2cea31b9 >> user: Nick Coghlan >> date: Wed Nov 13 23:49:21 2013 +1000 >> summary: >> Close #17828: better handling of codec errors >> >> - output type errors now redirect users to the type-neutral >> convenience functions in the codecs module >> - stateless errors that occur during encoding and decoding >> will now be automatically wrapped in exceptions that give >> the name of the codec involved > > > Wouldn't it be better to add an annotation API to the exceptions classes? > This would allow to annotate all exceptions without having to replace the > exception object. There's a reason the C API for this is private - it's a band aid fix, because solving it properly is hard :) http://bugs.python.org/issue18861 covers the fact this is just one instance of a broader category of related problems related to providing good exception context information (and the issue started with a different one). Cheers, Nick. 
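For concreteness, the annotation API Walter sketches above might look roughly like the following pure-Python sketch. The annotate() helper and the _annotations attribute are hypothetical names; the actual proposal would build this into BaseException itself and teach BaseException.__str__() (or the traceback machinery) to display the notes:

    def annotate(exc, note):
        # Append the note to a list stored on the exception; in the real
        # proposal, BaseException.__str__() would consult this list when
        # formatting the exception message.
        notes = getattr(exc, '_annotations', None)
        if notes is None:
            notes = exc._annotations = []
        notes.append(note)

    def dostuff(param):
        raise ValueError("something went wrong")

    try:
        dostuff("spam")
    except Exception as exc:
        annotate(exc, "while doing stuff with {}".format("spam"))
        raise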
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Nov 13 17:25:40 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Nov 2013 02:25:40 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Close #17828: better handling of codec errors In-Reply-To: References: <3dKS143jbKz7Lk8@mail.python.org> <52838D0E.5040606@livinglogic.de> Message-ID: On 14 November 2013 02:12, Nick Coghlan wrote: > On 14 November 2013 00:30, Walter D?rwald wrote: >> On 13.11.13 14:51, nick.coghlan wrote: >> >>> http://hg.python.org/cpython/rev/854a2cea31b9 >>> changeset: 87084:854a2cea31b9 >>> user: Nick Coghlan >>> date: Wed Nov 13 23:49:21 2013 +1000 >>> summary: >>> Close #17828: better handling of codec errors >>> >>> - output type errors now redirect users to the type-neutral >>> convenience functions in the codecs module >>> - stateless errors that occur during encoding and decoding >>> will now be automatically wrapped in exceptions that give >>> the name of the codec involved >> >> >> Wouldn't it be better to add an annotation API to the exceptions classes? >> This would allow to annotate all exceptions without having to replace the >> exception object. > > There's a reason the C API for this is private - it's a band aid fix, > because solving it properly is hard :) Note that the specific problem with just annotating the exception rather than a specific frame is that you lose the stack context for where the annotation occurred. The current chaining workaround doesn't just change the exception message, it also breaks the stack into two pieces (inside and outside the codec) that get displayed separately. Mostly though, it boils down to the fact that I'm far more comfortable changing codec exception stack trace details in some cases than I am proposing a new API for all exceptions this close to the Python 3.4 feature freeze. A more elegant (and comprehensive) solution as a PEP for 3.5 would certainly be a nice thing to have, but I think this is still much better than the 3.3 status quo. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Nov 13 17:30:38 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Nov 2013 02:30:38 +1000 Subject: [Python-Dev] Restoring the aliases for the non-Unicode codecs In-Reply-To: <87ppq4gvei.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87ppq4gvei.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 14 November 2013 01:43, Stephen J. Turnbull wrote: > Nick Coghlan writes: > > > The long gory history in http://bugs.python.org/issue7475 took a > > different turn earlier this year when I noticed > > (http://bugs.python.org/issue7475#msg187698) that the codecs module > > already *had* type neutral helper functions in the form of > > codecs.encode and codecs.decode, and has had them since Python 2.4. > > These were covered in the test suite, but not in the documentation. > > As long as the agreement on _methods_ (documented in > http://bugs.python.org/issue7475#msg96240) isn't changed, I'm +1 on > this. 
Yeah, those have actually been updated to point to the type neutral functions: >>> "bad output".encode("rot_13") Traceback (most recent call last): File "", line 1, in TypeError: 'rot_13' encoder returned 'str' instead of 'bytes'; use codecs.encode() to encode to arbitrary types >>> b"bad output".decode("quopri_codec") Traceback (most recent call last): File "", line 1, in TypeError: 'quopri_codec' decoder returned 'bytes' instead of 'str'; use codecs.decode() to decode to arbitrary types Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eliben at gmail.com Wed Nov 13 19:05:56 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 13 Nov 2013 10:05:56 -0800 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: References: <20131113001121.GG2085@ando> Message-ID: On Wed, Nov 13, 2013 at 6:58 AM, Brett Cannon wrote: > > > > On Wed, Nov 13, 2013 at 6:30 AM, Facundo Batista > wrote: > >> On Wed, Nov 13, 2013 at 4:37 AM, Maciej Fijalkowski >> wrote: >> >> >> Do you think it would be productive to create an independent Python >> >> compiler, designed with sandboxing in mind from the beginning? >> > >> > PyPy sandbox does work FYI >> > >> > It might not do exactly what you want, but it both provides a full >> > python and security. >> >> If we have sandboxing using PyPy... what also we need to put Python >> running in the browser? (like javascript, you know) >> >> Thanks! >> > > You can try to get PNaCl to work with Python to get a Python executable > that at least Chrome can run. > Two corrections: 1. CPython already works with NaCl and PNaCl (there are working patches in naclports to build it) 2. It can be used outside Chrome as well, using the standalone "sel_ldr" tool that will then allow to run a sandboxed CPython .nexe from the command line Note that this is a fundamentally different sandboxing model (the whole interpreter is run in a sandbox), but it's also more secure. PNaCl has shipped publicly yesterday, so Chrome runs native code *from the web* on your machine - a lot of security research and work went into making this possible. As for performance, the sandboxing overhead of NaCl is very low (< 10% in most cases). Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Wed Nov 13 19:27:06 2013 From: brett at python.org (Brett Cannon) Date: Wed, 13 Nov 2013 13:27:06 -0500 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: References: <20131113001121.GG2085@ando> Message-ID: On Wed, Nov 13, 2013 at 1:05 PM, Eli Bendersky wrote: > > > > On Wed, Nov 13, 2013 at 6:58 AM, Brett Cannon wrote: > >> >> >> >> On Wed, Nov 13, 2013 at 6:30 AM, Facundo Batista < >> facundobatista at gmail.com> wrote: >> >>> On Wed, Nov 13, 2013 at 4:37 AM, Maciej Fijalkowski >>> wrote: >>> >>> >> Do you think it would be productive to create an independent Python >>> >> compiler, designed with sandboxing in mind from the beginning? >>> > >>> > PyPy sandbox does work FYI >>> > >>> > It might not do exactly what you want, but it both provides a full >>> > python and security. >>> >>> If we have sandboxing using PyPy... what also we need to put Python >>> running in the browser? (like javascript, you know) >>> >>> Thanks! >>> >> >> You can try to get PNaCl to work with Python to get a Python executable >> that at least Chrome can run. >> > > Two corrections: > > 1. CPython already works with NaCl and PNaCl (there are working patches in > naclports to build it) > Anything that should be upstreamed? > 2. 
It can be used outside Chrome as well, using the standalone "sel_ldr" > tool that will then allow to run a sandboxed CPython .nexe from the command > line > Sure, but I was just thinking about the "in browser" question Facundo asked about. > > Note that this is a fundamentally different sandboxing model (the whole > interpreter is run in a sandbox), but it's also more secure. PNaCl has > shipped publicly yesterday, so Chrome runs native code *from the web* on > your machine - a lot of security research and work went into making this > possible. > > As for performance, the sandboxing overhead of NaCl is very low (< 10% in > most cases). > I feel like we need to have a page at python.org (or somewhere) that provides every which way to run Python from the browser for people to try the interpreter out as easily as possible. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Wed Nov 13 23:37:46 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 13 Nov 2013 14:37:46 -0800 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: References: <20131113001121.GG2085@ando> Message-ID: On Wed, Nov 13, 2013 at 10:27 AM, Brett Cannon wrote: > > > > On Wed, Nov 13, 2013 at 1:05 PM, Eli Bendersky wrote: > >> >> >> >> On Wed, Nov 13, 2013 at 6:58 AM, Brett Cannon wrote: >> >>> >>> >>> >>> On Wed, Nov 13, 2013 at 6:30 AM, Facundo Batista < >>> facundobatista at gmail.com> wrote: >>> >>>> On Wed, Nov 13, 2013 at 4:37 AM, Maciej Fijalkowski >>>> wrote: >>>> >>>> >> Do you think it would be productive to create an independent Python >>>> >> compiler, designed with sandboxing in mind from the beginning? >>>> > >>>> > PyPy sandbox does work FYI >>>> > >>>> > It might not do exactly what you want, but it both provides a full >>>> > python and security. >>>> >>>> If we have sandboxing using PyPy... what also we need to put Python >>>> running in the browser? (like javascript, you know) >>>> >>>> Thanks! >>>> >>> >>> You can try to get PNaCl to work with Python to get a Python executable >>> that at least Chrome can run. >>> >> >> Two corrections: >> >> 1. CPython already works with NaCl and PNaCl (there are working patches >> in naclports to build it) >> > > Anything that should be upstreamed? > Yeah, it definitely could. There are two problems currently: 1) the patches are for 2.7.x and 2) they have some ugly hacks in them. But I will talk to the guy who worked on that and hopefully we'll be able to have something cleaned up for upstreaming into default/3.x Anyhow, the webstore app is: https://chrome.google.com/webstore/detail/python/nodpmmidbgeganfponihbgmfcoiibffi And the code is in: https://code.google.com/p/naclports/wiki/PortList > > >> 2. It can be used outside Chrome as well, using the standalone "sel_ldr" >> tool that will then allow to run a sandboxed CPython .nexe from the command >> line >> > > Sure, but I was just thinking about the "in browser" question Facundo > asked about. > Yep, see link above for in-the-browser Python. Same can be done with PNaCl and not require the web store (this can actually be built by anyone from the NaCl SDK today). Eli -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From christian at python.org Wed Nov 13 23:48:14 2013 From: christian at python.org (Christian Heimes) Date: Wed, 13 Nov 2013 23:48:14 +0100 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: References: <20131113001121.GG2085@ando> Message-ID: Am 13.11.2013 23:37, schrieb Eli Bendersky: > Yeah, it definitely could. There are two problems currently: 1) the > patches are for 2.7.x and 2) they have some ugly hacks in them. But I > will talk to the guy who worked on that and hopefully we'll be able to > have something cleaned up for upstreaming into default/3.x > > Anyhow, the webstore app is: > https://chrome.google.com/webstore/detail/python/nodpmmidbgeganfponihbgmfcoiibffi > > And the code is in: https://code.google.com/p/naclports/wiki/PortList The patch https://code.google.com/p/naclports/source/browse/trunk/src/libraries/python/nacl.patch looks rather small and simple. Some of the hacks may not be required in Python 3.4, too. I'd love to have PNaCl support in Python 3.4! Christian From eliben at gmail.com Thu Nov 14 00:06:48 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 13 Nov 2013 15:06:48 -0800 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: References: <20131113001121.GG2085@ando> Message-ID: On Wed, Nov 13, 2013 at 2:48 PM, Christian Heimes wrote: > Am 13.11.2013 23:37, schrieb Eli Bendersky: > > Yeah, it definitely could. There are two problems currently: 1) the > > patches are for 2.7.x and 2) they have some ugly hacks in them. But I > > will talk to the guy who worked on that and hopefully we'll be able to > > have something cleaned up for upstreaming into default/3.x > > > > Anyhow, the webstore app is: > > > https://chrome.google.com/webstore/detail/python/nodpmmidbgeganfponihbgmfcoiibffi > > > > And the code is in: https://code.google.com/p/naclports/wiki/PortList > > The patch > > https://code.google.com/p/naclports/source/browse/trunk/src/libraries/python/nacl.patch > looks rather small and simple. Some of the hacks may not be required in > Python 3.4, too. I'd love to have PNaCl support in Python 3.4! > Feel free to chime in, if you want. You can download the PNaCl toolchain from the latest Native Client SDK and hack on this easily. I'll be happy to look at any patches you have for this. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Nov 14 01:50:18 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Nov 2013 10:50:18 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Issue #17828: va_start() must be accompanied by va_end() In-Reply-To: <3dKkNm0mwGz7LlT@mail.python.org> References: <3dKkNm0mwGz7LlT@mail.python.org> Message-ID: On 14 Nov 2013 10:40, "christian.heimes" wrote: > > http://hg.python.org/cpython/rev/99ba1772c469 > changeset: 87089:99ba1772c469 > user: Christian Heimes > date: Thu Nov 14 01:39:35 2013 +0100 > summary: > Issue #17828: va_start() must be accompanied by va_end() > CID 1128793: Missing varargs init or cleanup (VARARGS) Today I learned... :) Thanks! Cheers, Nick. 
> > files: > Objects/exceptions.c | 13 +++++++------ > 1 files changed, 7 insertions(+), 6 deletions(-) > > > diff --git a/Objects/exceptions.c b/Objects/exceptions.c > --- a/Objects/exceptions.c > +++ b/Objects/exceptions.c > @@ -2632,12 +2632,6 @@ > PyObject *new_exc, *new_val, *new_tb; > va_list vargs; > > -#ifdef HAVE_STDARG_PROTOTYPES > - va_start(vargs, format); > -#else > - va_start(vargs); > -#endif > - > PyErr_Fetch(&exc, &val, &tb); > caught_type = (PyTypeObject *) exc; > /* Ensure type info indicates no extra state is stored at the C level */ > @@ -2690,7 +2684,14 @@ > * types as well, but that's quite a bit trickier due to the extra > * state potentially stored on OSError instances. > */ > + > +#ifdef HAVE_STDARG_PROTOTYPES > + va_start(vargs, format); > +#else > + va_start(vargs); > +#endif > msg_prefix = PyUnicode_FromFormatV(format, vargs); > + va_end(vargs); > if (msg_prefix == NULL) > return NULL; > > > -- > Repository URL: http://hg.python.org/cpython > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > https://mail.python.org/mailman/listinfo/python-checkins > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Nov 14 10:00:38 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 14 Nov 2013 10:00:38 +0100 Subject: [Python-Dev] peps: PEP 456: add some of the new implementation details to the PEP's text References: <3dKgZQ31fjz7Lqj@mail.python.org> Message-ID: <20131114100038.7797e663@fsol> On Wed, 13 Nov 2013 23:33:02 +0100 (CET) christian.heimes wrote: > > > +Small string optimization > +========================= > + > +Hash functions like SipHash24 have a costly initialization and finalization > +code that can dominate speed of the algorithm for very short strings. On the > +other hand Python calculates the hash value of short strings quite often. A > +simple and fast function for especially for hashing of small strings can make > +a measurably impact on performance. For example these measurements were taken > +during a run of Python's regression tests. Additional measurements of other > +code have shown a similar distribution. Well, the text above talks about a "measurably (typo?) impact on performance", but you aren't giving any performance numbers, which doesn't help the reader of those lines. Regards Antoine. From ncoghlan at gmail.com Thu Nov 14 12:58:16 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Nov 2013 21:58:16 +1000 Subject: [Python-Dev] [Python-checkins] Daily reference leaks (784a02ec2a26): sum=522 In-Reply-To: References: Message-ID: On 14 Nov 2013 13:52, wrote: > > results for 784a02ec2a26 on branch "default" > -------------------------------------------- > > test_codeccallbacks leaked [40, 40, 40] references, sum=120 > test_codeccallbacks leaked [40, 40, 40] memory blocks, sum=120 > test_codecs leaked [38, 38, 38] references, sum=114 > test_codecs leaked [24, 24, 24] memory blocks, sum=72 > test_email leaked [16, 16, 16] references, sum=48 > test_email leaked [16, 16, 16] memory blocks, sum=48 Hmm, it appears I have a reference leak somewhere. Cheers, Nick. 
> > > Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogx2QIb_', '-x'] > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > https://mail.python.org/mailman/listinfo/python-checkins -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Nov 14 13:01:32 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Nov 2013 22:01:32 +1000 Subject: [Python-Dev] [Python-checkins] Daily reference leaks (784a02ec2a26): sum=522 In-Reply-To: References: Message-ID: On 14 Nov 2013 21:58, "Nick Coghlan" wrote: > > > On 14 Nov 2013 13:52, wrote: > > > > results for 784a02ec2a26 on branch "default" > > -------------------------------------------- > > > > test_codeccallbacks leaked [40, 40, 40] references, sum=120 > > test_codeccallbacks leaked [40, 40, 40] memory blocks, sum=120 > > test_codecs leaked [38, 38, 38] references, sum=114 > > test_codecs leaked [24, 24, 24] memory blocks, sum=72 > > test_email leaked [16, 16, 16] references, sum=48 > > test_email leaked [16, 16, 16] memory blocks, sum=48 > > Hmm, it appears I have a reference leak somewhere. Ah, Benjamin fixed it already. Thanks! :) Cheers, Nick. > > Cheers, > Nick. > > > > > > > Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R', '3:3:/home/antoine/cpython/refleaks/reflogx2QIb_', '-x'] > > _______________________________________________ > > Python-checkins mailing list > > Python-checkins at python.org > > https://mail.python.org/mailman/listinfo/python-checkins -------------- next part -------------- An HTML attachment was scrubbed... URL: From walter at livinglogic.de Thu Nov 14 14:22:16 2013 From: walter at livinglogic.de (=?UTF-8?B?V2FsdGVyIETDtnJ3YWxk?=) Date: Thu, 14 Nov 2013 14:22:16 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Close #17828: better handling of codec errors In-Reply-To: References: <3dKS143jbKz7Lk8@mail.python.org> <52838D0E.5040606@livinglogic.de> Message-ID: <5284CE88.7080101@livinglogic.de> On 13.11.13 17:25, Nick Coghlan wrote: > On 14 November 2013 02:12, Nick Coghlan wrote: >> On 14 November 2013 00:30, Walter D?rwald wrote: >>> On 13.11.13 14:51, nick.coghlan wrote: >>> >>>> http://hg.python.org/cpython/rev/854a2cea31b9 >>>> changeset: 87084:854a2cea31b9 >>>> user: Nick Coghlan >>>> date: Wed Nov 13 23:49:21 2013 +1000 >>>> summary: >>>> Close #17828: better handling of codec errors >>>> >>>> - output type errors now redirect users to the type-neutral >>>> convenience functions in the codecs module >>>> - stateless errors that occur during encoding and decoding >>>> will now be automatically wrapped in exceptions that give >>>> the name of the codec involved >>> >>> >>> Wouldn't it be better to add an annotation API to the exceptions classes? >>> This would allow to annotate all exceptions without having to replace the >>> exception object. Hmm, it might be better to have the traceback machinery print the annotation information instead of BaseException.__str__, so we don't get any compatibility issues with custom __str__ implementations. >> There's a reason the C API for this is private - it's a band aid fix, >> because solving it properly is hard :) > > Note that the specific problem with just annotating the exception > rather than a specific frame is that you lose the stack context for > where the annotation occurred. 
The current chaining workaround doesn't > just change the exception message, it also breaks the stack into two > pieces (inside and outside the codec) that get displayed separately. > > Mostly though, it boils down to the fact that I'm far more comfortable > changing codec exception stack trace details in some cases than I am > proposing a new API for all exceptions this close to the Python 3.4 > feature freeze. Sure, this is something that might go into 3.5, but not 3.4. > A more elegant (and comprehensive) solution as a PEP for 3.5 would > certainly be a nice thing to have, but I think this is still much > better than the 3.3 status quo. Thinking further about this, I like your "frame annotation" suggestion Tracebacks could then look like this: >>> b"hello".decode("uu_codec") Traceback (most recent call last): File "", line 1, in : decoding with 'uu_codec' codec failed ValueError: Missing "begin" line in input data In fact the traceback already lays out the chain of events. What is missing is simply a little additional information. Could frame annotation be added via decorators, i.e. something like this: @annotate("while doing something with {param}") def func(param): do something annotate() would catch the exception, call .format() on the annotation string with the local variables of the frame as keyword arguments, attach the result to a special attribute of the frame and reraise the exception. The traceback machinery would simply have to print this additional attribute. Servus, Walter Servus, Walter From benjamin at python.org Thu Nov 14 15:14:46 2013 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 14 Nov 2013 09:14:46 -0500 Subject: [Python-Dev] [Python-checkins] Daily reference leaks (784a02ec2a26): sum=522 In-Reply-To: <20131114130611.263ee550@fsol> References: <20131114130611.263ee550@fsol> Message-ID: 2013/11/14 Antoine Pitrou : > On Thu, 14 Nov 2013 22:01:32 +1000 > Nick Coghlan wrote: >> On 14 Nov 2013 21:58, "Nick Coghlan" wrote: >> > >> > >> > On 14 Nov 2013 13:52, wrote: >> > > >> > > results for 784a02ec2a26 on branch "default" >> > > -------------------------------------------- >> > > >> > > test_codeccallbacks leaked [40, 40, 40] references, sum=120 >> > > test_codeccallbacks leaked [40, 40, 40] memory blocks, sum=120 >> > > test_codecs leaked [38, 38, 38] references, sum=114 >> > > test_codecs leaked [24, 24, 24] memory blocks, sum=72 >> > > test_email leaked [16, 16, 16] references, sum=48 >> > > test_email leaked [16, 16, 16] memory blocks, sum=48 >> > >> > Hmm, it appears I have a reference leak somewhere. >> >> Ah, Benjamin fixed it already. Thanks! :) > > The reference leak task has been running for quite some time on my > personal machine and I believe it has proven useful. I have no problem > continuing running it on the same machine (which is mostly sitting idle > anyway), but maybe it should rather be hosted on our CI infrastructure? > Any suggestions? Thank you very much for running that, btw. I'm sure we would have released a lot of horribly leaking stuff without it. > > (the script is quite rough with hardcoded stuff, but beating it into > better shape could be a nice target for first-time contributors) Perhaps someone can figure out how to run it on one of the the buildbots? 
-- Regards, Benjamin From walter at livinglogic.de Thu Nov 14 18:27:37 2013 From: walter at livinglogic.de (Walter Dörwald) Date: Thu, 14 Nov 2013 18:27:37 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Close #17828: better handling of codec errors In-Reply-To: <5284CE88.7080101@livinglogic.de> References: <3dKS143jbKz7Lk8@mail.python.org> <52838D0E.5040606@livinglogic.de> <5284CE88.7080101@livinglogic.de> Message-ID: <52850809.3080908@livinglogic.de> On 14.11.13 14:22, Walter Dörwald wrote: > On 13.11.13 17:25, Nick Coghlan wrote: > >> [...] >> A more elegant (and comprehensive) solution as a PEP for 3.5 would >> certainly be a nice thing to have, but I think this is still much >> better than the 3.3 status quo. > > Thinking further about this, I like your "frame annotation" suggestion > > Tracebacks could then look like this: > > >>> b"hello".decode("uu_codec") > Traceback (most recent call last): > File "", line 1, in : decoding with 'uu_codec' codec > failed > ValueError: Missing "begin" line in input data > > In fact the traceback already lays out the chain of events. What is > missing is simply a little additional information. > > Could frame annotation be added via decorators, i.e. something like this: > > @annotate("while doing something with {param}") > def func(param): > do something > > annotate() would catch the exception, call .format() on the annotation > string with the local variables of the frame as keyword arguments, > attach the result to a special attribute of the frame and reraise the > exception. > > The traceback machinery would simply have to print this additional > attribute. http://bugs.python.org/issue19585 is a patch that implements that. With the patch the following code: import traceback @traceback.annotate("while handling x={x!r}") def handle(x): raise ValueError(42) handle("spam") will give the traceback: Traceback (most recent call last): File "spam.py", line 8, in handle("spam") File "frame-annotation/Lib/traceback.py", line 322, in wrapped f(*args, **kwargs) File "spam.py", line 5, in handle: while handling x='spam' raise ValueError(42) ValueError: 42 Unfortunately the frame from the decorator shows up in the traceback. Servus, Walter From chris.barker at noaa.gov Thu Nov 14 18:32:10 2013 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 14 Nov 2013 09:32:10 -0800 Subject: [Python-Dev] unicode Exception messages in py2.7 Message-ID: Folks, (note this is about 2.7 -- sorry, but a lot of us still use that! I can only assume that in 3.* this is a non-issue) I just discovered an issue that's been around a long time: If you create an Exception with a unicode object for the message, the message can be silently ignored if it can not be encoded to ASCII (or, more properly, the default encoding). In my use-case, I was parsing a text file (utf-8), and wanted a bit of that text to be part of the Exception message (an error reading the file, I wanted the user to know what the text was surrounding the ill-formatted part of the text file). What I got was a blank message, and it took a lot of poking at it to figure out why. My solution was: msg = u"Problem with line %i: %s This is not a valid time slot"%(linenum, line) raise ValueError(msg.encode('ascii', 'ignore')) which is really pretty painfully clunky. This is an issue brought up in various tutorial and blog posts, and all the solutions I've seen involve some similar clunkiness.
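To make the failure mode concrete, here is a minimal Python 2.7 sketch of the behaviour being described, along with a 'replace' variant of the workaround (the message text is invented for illustration):

    e = ValueError(u"Problem with line 42: caf\xe9")

    print repr(unicode(e))   # u'Problem with line 42: caf\xe9' -- recoverable
    try:
        str(e)               # the implicit ASCII encoding of the message fails...
    except UnicodeEncodeError:
        pass                 # ...which is why the default traceback shows no message

    # Encoding with 'replace' instead of 'ignore' at least leaves placeholders:
    safe = ValueError(u"Problem with line 42: caf\xe9".encode('ascii', 'replace'))
    print str(safe)          # Problem with line 42: caf?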
I also found this issue in the issue tracker: http://bugs.python.org/issue2517 Which was resolved years ago, but as far as I can tell, only solved the problem of being able to do: unicode(an_exception) and get the proper unicode message object. But we still can't raise the darn thing and expect the user to see the message. Why is this the case? I can print a unicode object to the terminal, why can't raising an Exception print a unicode object? I can imagine for backward compatibility, or maybe for non-unicode terminals, or ??? Exceptions do need to print as ascii. However, having a message simply get swallowed up and disappear seems like the wrong solution. - auto-conversion to a default encoding is fraught with problems all over the board -- I know that. I also know that too much code would break too often if we didn't have auto-conversion. - for the most part, the auto-conversion uses 'strict' mode -- I generally dislike this, as it means code crashes when odd stuff gets introduced after testing, but I can see why it is done. - However, I can see why for raising Exceptions, the decision was made to swallow that error, so that the actual Exception intended is raised, rather than a new UnicodeEncodeError. - But combining 'strict' with ignoring the encoding exception seems like the worst of both worlds. So a proposal: Use 'replace" mode for the encoding to the default, and at least the user would see SOMETHING of the message. In a common case, it would be a lot of ascii, and in the worse case it would be a lot of question marks -- still better than a totally blank message. Another option would be to use the str(repr(the_message)) so the user would get the escaped version. Though I think that would be more ugly. What am I missing? This seems so obvious, and easy to do (though maybe it's buried in the C implementation of Exceptions) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From eliben at gmail.com Thu Nov 14 19:28:54 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 14 Nov 2013 10:28:54 -0800 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: References: <20131113001121.GG2085@ando> Message-ID: On Wed, Nov 13, 2013 at 10:27 AM, Brett Cannon wrote: > > > > On Wed, Nov 13, 2013 at 1:05 PM, Eli Bendersky wrote: > >> >> >> >> On Wed, Nov 13, 2013 at 6:58 AM, Brett Cannon wrote: >> >>> >>> >>> >>> On Wed, Nov 13, 2013 at 6:30 AM, Facundo Batista < >>> facundobatista at gmail.com> wrote: >>> >>>> On Wed, Nov 13, 2013 at 4:37 AM, Maciej Fijalkowski >>>> wrote: >>>> >>>> >> Do you think it would be productive to create an independent Python >>>> >> compiler, designed with sandboxing in mind from the beginning? >>>> > >>>> > PyPy sandbox does work FYI >>>> > >>>> > It might not do exactly what you want, but it both provides a full >>>> > python and security. >>>> >>>> If we have sandboxing using PyPy... what also we need to put Python >>>> running in the browser? (like javascript, you know) >>>> >>>> Thanks! >>>> >>> >>> You can try to get PNaCl to work with Python to get a Python executable >>> that at least Chrome can run. >>> >> >> Two corrections: >> >> 1. CPython already works with NaCl and PNaCl (there are working patches >> in naclports to build it) >> > > Anything that should be upstreamed? > > >> 2. 
It can be used outside Chrome as well, using the standalone "sel_ldr" >> tool that will then allow to run a sandboxed CPython .nexe from the command >> line >> > > Sure, but I was just thinking about the "in browser" question Facundo > asked about. > FWIW, if you already have Chrome 31, go to: http://commondatastorage.googleapis.com/nativeclient-mirror/naclports/pepper_33/988/publish/python/pnacl/index.html This is CPython running on top of PNaCl, at near-native speed. With C extensions. With threads. It's 2.7.5 but we'll put up 3.4 too soon (anyone can do it though - based on naclports). The first load takes a bit of time, afterwards it's cached and instantaneous. Now all that's left is for someone to come up with a friendly API to wrap around the Pepper interface to conveniently access DOM :-) Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Thu Nov 14 22:02:40 2013 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 14 Nov 2013 16:02:40 -0500 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: Message-ID: 2013/11/14 Chris Barker : > So a proposal: > > Use 'replace" mode for the encoding to the default, and at least the > user would see SOMETHING of the message. In a common case, it would be > a lot of ascii, and in the worse case it would be a lot of question > marks -- still better than a totally blank message. > > Another option would be to use the str(repr(the_message)) so the user > would get the escaped version. Though I think that would be more ugly. Unfortunately both of these things change behavior so cannot be changed in Python 2.7. > > What am I missing? This seems so obvious, and easy to do (though maybe > it's buried in the C implementation of Exceptions) -- Regards, Benjamin From victor.stinner at gmail.com Thu Nov 14 22:20:52 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 14 Nov 2013 22:20:52 +0100 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: Message-ID: 2013/11/14 Chris Barker : > (note this is about 2.7 -- sorry, but a lot of us still use that! I > can only assume that in 3.* this is a non-issue) > > I just discovered an issue that's been around a long time: > > If you create an Exception with a unicode object for the message, (...) In Python 2, there are too many similar corner cases. It is impossible to fix these bugs without taking the risk of introducing a regression. Seriously, *all* these tricky bugs are fixed in Python 3. So don't loose time on trying to workaround them, but invest in the future: upgrade to Python 3! Victor From arigo at tunes.org Thu Nov 14 22:43:55 2013 From: arigo at tunes.org (Armin Rigo) Date: Thu, 14 Nov 2013 22:43:55 +0100 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: References: Message-ID: Hi Victor, On Wed, Nov 13, 2013 at 12:58 AM, Victor Stinner wrote: > I now gave up on sandboxing Python. I just would like to warn other > core developers that trying to put a sandbox in Python is not a good > idea :-) I cannot thank you enough for writing this mail :-) It is a great place to point people to when they come along with some superficial idea about sandboxing Python. A bient?t, Armin. 
From tseaver at palladion.com Thu Nov 14 22:55:19 2013 From: tseaver at palladion.com (Tres Seaver) Date: Thu, 14 Nov 2013 16:55:19 -0500 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/14/2013 04:02 PM, Benjamin Peterson wrote: > 2013/11/14 Chris Barker : >> So a proposal: >> >> Use 'replace" mode for the encoding to the default, and at least the >> user would see SOMETHING of the message. In a common case, it would be >> a lot of ascii, and in the worse case it would be a lot of question >> marks -- still better than a totally blank message. >> >> Another option would be to use the str(repr(the_message)) so the user >> would get the escaped version. Though I think that would be more ugly. > > Unfortunately both of these things change behavior so cannot be > changed in Python 2.7. Fixing any bug is "changing behavior"; 2.7 is not frozen for bugfixes. The real question is whether third-party code will break when the now-empty error messages appear with '?' littered through them? About the only things I can think of which might break would be doctests, but people *expect* those to break across third-dot releases of Python (one reason why I hate them). Exception repr is explicitly *not* part of any backward-compatibility guarantees in Python. Or code which explicitly works around the breakage could fail (urlparse changes between 2.7.3 and 2.7.4, anyone?) Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iEYEARECAAYFAlKFRscACgkQ+gerLs4ltQ6JIgCgvNxHugjjbR3L1crSDK0QJiLb LSYAn2cJnZ8almcfCmWHKhOnCP69bpB3 =MIFq -----END PGP SIGNATURE----- From victor.stinner at gmail.com Thu Nov 14 23:32:32 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 14 Nov 2013 23:32:32 +0100 Subject: [Python-Dev] Add transform() and untranform() methods Message-ID: Hi, I saw that Nick Coghlan documented codecs.encode() and codecs.decode(), and changed the exception raised when codecs like rot_13 are used on bytes.decode() and str.encode(). I don't like the functions codecs.encode() and codecs.decode() because the type of the result depends on the encoding (second parameter). We try to avoid this in Python. I would prefer to split the registry of codecs to have 3 registries: - "encoding" (a better name can be found): encode str=>bytes, decode bytes=>str - bytes: encode bytes=>bytes, decode bytes=>bytes - str: encode str=>str, decode str=>str And add transform() and untransform() methods to bytes and str types. In practice, it might be the same codecs registry for all codecs, just with a new attribute. Examples: - utf8: encoding - zlib: bytes - rot13: str The result type of bytes.transform/untransform would be bytes, and the result type of str.transform/untransform would be str. I don't know which exception should be raised when a codec is used in the wrong method. LookupError? TypeError "codec xxx cannot be used with method xxx.xx"? Something else? codecs.encode/decode() documentation should be removed. The functions should be kept, just in case someone uses them. Victor
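As a rough sketch, the proposed methods can be approximated today on top of the existing type-neutral functions. Here transform() and untransform() are free functions standing in for the proposed methods, and the names and the TypeError wording are illustrative only:

    import codecs

    def transform(obj, name):
        result = codecs.encode(obj, name)
        if type(result) is not type(obj):
            # Note: a check like this only fires *after* the codec has
            # already run, which is the weakness Serhiy raises later in
            # the thread.
            raise TypeError("codec %r cannot be used with transform()" % name)
        return result

    def untransform(obj, name):
        result = codecs.decode(obj, name)
        if type(result) is not type(obj):
            raise TypeError("codec %r cannot be used with untransform()" % name)
        return result

    print(transform("hello", "rot_13"))                     # 'uryyb' (str -> str)
    data = transform(b"spam" * 10, "zlib_codec")            # bytes -> bytes
    print(untransform(data, "zlib_codec") == b"spam" * 10)  # True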
From victor.stinner at gmail.com Thu Nov 14 23:40:41 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 14 Nov 2013 23:40:41 +0100 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: Message-ID: Oh, I forgot to mention that I sent this email in reaction to this issue: http://bugs.python.org/issue19585 Modifying the critical PyFrameObject because the codecs API raises surprising errors doesn't sound correct. I prefer to fix how codecs are used rather than modifying the PyFrameObject. For more information, see issue #7475, which has a long history (4 years) and many messages. Martin von Loewis wrote "I would still be opposed to such a change, and I think it needs a PEP." and I still agree with him on this point. Because there are different opinions and no consensus, a PEP is required to explain why we took this decision and list rejected alternatives. http://bugs.python.org/issue7475 Victor 2013/11/14 Victor Stinner : > Hi, > > I saw that Nick Coghlan documented codecs.encode() and > codecs.decode(), and changed the exception raised when codecs like > rot_13 are used on bytes.decode() and str.encode(). > > I don't like the functions codecs.encode() and codecs.decode() because > the type of the result depends on the encoding (second parameter). We > try to avoid this in Python.
-- Terry Jan Reedy From tjreedy at udel.edu Thu Nov 14 23:59:08 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 14 Nov 2013 17:59:08 -0500 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: Message-ID: On 11/14/2013 4:55 PM, Tres Seaver wrote: > About the only things I can think of which might break would be doctests, > but people *expect* those to break across third-dot releases of Python > (one reason why I hate them). My impression is that we avoid enhancing correct exception messages in bugfix (third-dot) releases because of both doctests and other in-code examination of messages. > Exception repr is explicitly *not* part of > any backward-compatibility guarantees in Python. So we more freely change exception messages in version (second-dot) releases, without deprecation notices or waiting periods. -- Terry Jan Reedy From greg.ewing at canterbury.ac.nz Fri Nov 15 00:02:07 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 15 Nov 2013 12:02:07 +1300 Subject: [Python-Dev] [Python-checkins] cpython: Close #17828: better handling of codec errors In-Reply-To: <52850809.3080908@livinglogic.de> References: <3dKS143jbKz7Lk8@mail.python.org> <52838D0E.5040606@livinglogic.de> <5284CE88.7080101@livinglogic.de> <52850809.3080908@livinglogic.de> Message-ID: <5285566F.3060300@canterbury.ac.nz> Walter D?rwald wrote: > Unfortunaty the frame from the decorator shows up in the traceback. Maybe the decorator could remove its own frame from the traceback? -- Greg From ncoghlan at gmail.com Fri Nov 15 00:03:37 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 15 Nov 2013 09:03:37 +1000 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: Message-ID: On 15 Nov 2013 08:34, "Victor Stinner" wrote: > > Hi, > > I saw that Nick Coghlan documented codecs.encode() and > codecs.decode(), and changed the exception raised when codecs like > rot_13 are used on bytes.decode() and str.encode(). > > I don't like the functions codecs.encode() and codecs.decode() because > the type of the result depends on the encoding (second parameter). We > try to avoid this in Python. The type signature of those functions is just object -> object (Similar to the way the 2.x convenience methods were actually basestring -> basestring). > I would prefer to split the registry of codecs to have 3 registries: > > - "encoding" (a better name can found): encode str=>bytes, decode bytes=>str > - bytes: encode bytes=>bytes, decode bytes=>bytes > - str: encode str=>str, decode str=>str > You have to get it out of your head that codecs are just about text and and binary data. They're not: they're arbitrary type transforms, and MAL deliberately wrote the module that way. > And add transform() and untransform() methods to bytes and str types. > In practice, it might be same codecs registry for all codecs just with > a new attribute. This is completely the wrong approach. There's zero justification for adding new builtin methods for this use case - encoding and decoding are generic operations, they should use functions not methods. What could be useful is allowing CodecInfo objects to supply an "expected input type" and an "expected output type" (ABCs and instance check overrides make that quite flexible). > > Examples: > > - utf8: encoding > - zlib: bytes > - rot13: str > > The result type of bytes.transform/untransform would be bytes, and the > result type of str.transform/untransform would be str. 
> > I don't know which exception should be raised when a codec is used in > the wrong method. LookupError? TypeError "codec xxx cannot be used > with method xxx.xx"? Something else? We already do this check in the existing convenience methods - it raises TypeError. > > codecs.encode/decode() documentation should be removed. The functions > should be kept, just in case if someone uses them. No. They're part of the regression test suite, and have been since Python 2.4. They embody MAL's intended "arbitrary type transform library" approach. They provide a source compatible mechanism for using binary codecs in single code base Python 2/3 projects. At this point, the only person that can get me to revert this clarification of MAL's original vision for the codecs module is Guido, since anything else completely fails to address the Python 3 adoption barrier posed by the current state of Python 3's binary codec support. Note that the only behavioural changes in the commits so far were to exception handling - everything else was just docs. The next planned commit (to restore the binary codec aliases) *is* a behavioural change - that's why I posted to the list about it (it received only two responses, both +1) Cheers, Nick. > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Nov 15 00:11:02 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 15 Nov 2013 09:11:02 +1000 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: Message-ID: On 15 Nov 2013 08:42, "Victor Stinner" wrote: > > Oh, I forgot to mention that I sent this email in reaction to this issue: > > http://bugs.python.org/issue19585 > > Modifying the critical PyFrameObject because the codecs API raises > surprising errors doesn't sound correct. I prefer to fix how codecs > are used, than modifying the PyFrameObject. > > For more information, see the issue #7475 which a long history (4 > years) and many messages. Martin von Loewis wrote "I would still be > opposed to such a change, and I think it needs a PEP." and I still > agree with him on this point. Because they are different opinions and > no consensus, a PEP is required to explain why we took this decision > and list rejected alternatives. > > http://bugs.python.org/issue7475 Martin wrote that before it was pointed out there were existing functions to handle the problem (I was asking for a PEP back then, too). I posted my plan for dealing with this months ago without receiving any complaints, and I'm annoyed you waited until I had actually followed through and implemented it to complain about it and ask for Python 3's binary codec support to stay broken instead :P (Starting a new thread instead of replying to the one where I specifically asked about taking the next step does nothing to improve my mood) Regards, Nick. > > Victor > > 2013/11/14 Victor Stinner : > > Hi, > > > > I saw that Nick Coghlan documented codecs.encode() and > > codecs.decode(), and changed the exception raised when codecs like > > rot_13 are used on bytes.decode() and str.encode(). > > > > I don't like the functions codecs.encode() and codecs.decode() because > > the type of the result depends on the encoding (second parameter). We > > try to avoid this in Python. 
> > > > I would prefer to split the registry of codecs to have 3 registries: > > > > - "encoding" (a better name can found): encode str=>bytes, decode bytes=>str > > - bytes: encode bytes=>bytes, decode bytes=>bytes > > - str: encode str=>str, decode str=>str > > > > And add transform() and untransform() methods to bytes and str types. > > In practice, it might be same codecs registry for all codecs just with > > a new attribute. > > > > Examples: > > > > - utf8: encoding > > - zlib: bytes > > - rot13: str > > > > The result type of bytes.transform/untransform would be bytes, and the > > result type of str.transform/untransform would be str. > > > > I don't know which exception should be raised when a codec is used in > > the wrong method. LookupError? TypeError "codec xxx cannot be used > > with method xxx.xx"? Something else? > > > > codecs.encode/decode() documentation should be removed. The functions > > should be kept, just in case if someone uses them. > > > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Fri Nov 15 00:30:22 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 15 Nov 2013 01:30:22 +0200 Subject: [Python-Dev] "*zip-bomb" via codecs Message-ID: It is possible make a DDoS using the fact that codecs registry provides access to gzip and bzip2 decompressor. Someone can send HTTP request or email message with specified "gzip_codec" or "bzip2_codec" as content encoding and great well compressed gzip- or bzip2-file as a content. Naive server will use the bytes.decode() method to decompress a content. It is possible to create small compressed files which require very much time and memory to decompress. Of course bytes.decode() will fail becouse decoder returns bytes instead string, but time and memory are already wasted. I have no working example but I'm sure it will be easy to create it. I suspect many services will be vulnerable for this attack. Simple solution for this problem is check any foreign encoding that it is conteined in a special set of safe encodings. But every program should check it explicitly. For more general solution bytes.decode() should reject encoding *before* starting of decoding. I.e. either all bytes->str decoders should be registered in separated registry, or all codecs should have additional attributes which determines input and output type. From ethan at stoneleaf.us Fri Nov 15 00:02:30 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 14 Nov 2013 15:02:30 -0800 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: Message-ID: <52855686.2050603@stoneleaf.us> On 11/14/2013 02:59 PM, Terry Reedy wrote: > On 11/14/2013 4:55 PM, Tres Seaver wrote: > >> About the only things I can think of which might break would be doctests, >> but people *expect* those to break across third-dot releases of Python >> (one reason why I hate them). > > My impression is that we avoid enhancing correct exception messages in bugfix (third-dot) releases because of both > doctests and other in-code examination of messages. But these exception messages are incorrect, and so we are okay to fix them, yes? 
-- ~Ethan~ From storchaka at gmail.com Fri Nov 15 00:37:19 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 15 Nov 2013 01:37:19 +0200 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: Message-ID: 15.11.13 01:03, Nick Coghlan ???????(??): > We already do this check in the existing convenience methods - it raises > TypeError. The problem with this check is that it happens *after* encoding/decoding. This opens door for DoS (see my last message). From ncoghlan at gmail.com Fri Nov 15 00:44:58 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 15 Nov 2013 09:44:58 +1000 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: Message-ID: On 15 Nov 2013 09:11, "Nick Coghlan" wrote: > > > On 15 Nov 2013 08:42, "Victor Stinner" wrote: > > > > Oh, I forgot to mention that I sent this email in reaction to this issue: > > > > http://bugs.python.org/issue19585 > > > > Modifying the critical PyFrameObject because the codecs API raises > > surprising errors doesn't sound correct. I prefer to fix how codecs > > are used, than modifying the PyFrameObject. > > > > For more information, see the issue #7475 which a long history (4 > > years) and many messages. Martin von Loewis wrote "I would still be > > opposed to such a change, and I think it needs a PEP." and I still > > agree with him on this point. Because they are different opinions and > > no consensus, a PEP is required to explain why we took this decision > > and list rejected alternatives. > > > > http://bugs.python.org/issue7475 > > Martin wrote that before it was pointed out there were existing functions to handle the problem (I was asking for a PEP back then, too). > > I posted my plan for dealing with this months ago without receiving any complaints, and I'm annoyed you waited until I had actually followed through and implemented it to complain about it and ask for Python 3's binary codec support to stay broken instead :P Something I *would* be entirely happy to do is write a retroactive PEP after beta 1 is out the door, explaining the history of this issue in a more coherent form than the comment history on issue 7475 and the many child issues it spawned. This would also provide a better launching point for other enhancements in Python 3.5 (frame annotations to remove the need for the exception chaining hack and better input validation mechanisms for codecs that allow the convenience methods to check that case explicitly rather than relying on the exception chaining). Cheers, Nick. > > (Starting a new thread instead of replying to the one where I specifically asked about taking the next step does nothing to improve my mood) > > Regards, > Nick. > > > > > Victor > > > > 2013/11/14 Victor Stinner : > > > Hi, > > > > > > I saw that Nick Coghlan documented codecs.encode() and > > > codecs.decode(), and changed the exception raised when codecs like > > > rot_13 are used on bytes.decode() and str.encode(). > > > > > > I don't like the functions codecs.encode() and codecs.decode() because > > > the type of the result depends on the encoding (second parameter). We > > > try to avoid this in Python. > > > > > > I would prefer to split the registry of codecs to have 3 registries: > > > > > > - "encoding" (a better name can found): encode str=>bytes, decode bytes=>str > > > - bytes: encode bytes=>bytes, decode bytes=>bytes > > > - str: encode str=>str, decode str=>str > > > > > > And add transform() and untransform() methods to bytes and str types. 
> > > In practice, it might be the same codecs registry for all codecs, just with > > > a new attribute. > > > > > > Examples: > > > > > > - utf8: encoding > > > - zlib: bytes > > > - rot13: str > > > > > > The result type of bytes.transform/untransform would be bytes, and the > > > result type of str.transform/untransform would be str. > > > > > > I don't know which exception should be raised when a codec is used in > > > the wrong method. LookupError? TypeError "codec xxx cannot be used > > > with method xxx.xx"? Something else? > > > > > > codecs.encode/decode() documentation should be removed. The functions > > > should be kept, just in case someone uses them. > > > > > > Victor From storchaka at gmail.com Fri Nov 15 00:42:21 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 15 Nov 2013 01:42:21 +0200 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: Message-ID: 15.11.13 00:32, Victor Stinner wrote: > And add transform() and untransform() methods to bytes and str types. > In practice, it might be the same codecs registry for all codecs, just with > a new attribute. If the transform() method is added, I prefer to have only one transformation method and specify the direction by the transformation name ("bzip2"/"unbzip2"). From steve at pearwood.info Fri Nov 15 00:58:42 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 15 Nov 2013 10:58:42 +1100 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: Message-ID: <20131114235841.GL2085@ando> On Thu, Nov 14, 2013 at 04:55:19PM -0500, Tres Seaver wrote: > Fixing any bug is "changing behavior"; 2.7 is not frozen for bugfixes. It's not a given that the current behaviour *is* a bug. Exception messages in 2 are byte-strings, not Unicode. Trying to use Unicode instead is not, as far as I can tell, supported behaviour. If the exception message cannot be converted to a byte-string, suppressing the display of the message seems like perfectly reasonable behaviour to me: py> class NoString: ... def __str__(self): ... raise ValueError ... py> msg = NoString py> msg = NoString() py> print msg Traceback (most recent call last): File "<stdin>", line 1, in ? File "<stdin>", line 3, in __str__ ValueError py> raise TypeError(msg) Traceback (most recent call last): File "<stdin>", line 1, in ? TypeErrorpy> although it would be nice if a newline was used so the prompt was bumped to the next line. The point is, I'm not convinced that this is a bug at all. > The real question is whether third-party code will break when the > now-empty error messages appear with '?' littered through them? This behaviour goes back to at least Python 2.4, the oldest version I have easy access to at the moment that includes Unicode. Given that this alleged bug has been around for so long, I don't think that it affects terribly many people. That implies that fixing it won't benefit many people either. > About the only things I can think of which might break would be doctests, > but people *expect* those to break across third-dot releases of Python Which people? I certainly don't expect doctests to break unless I've done something silly. > (one reason why I hate them).
> Exception repr is explicitly *not* part of > any backward-compatibility guarantees in Python. Do you have a link for that explicit non-guarantee from the docs please? -- Steven From chris.barker at noaa.gov Fri Nov 15 01:02:17 2013 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 14 Nov 2013 16:02:17 -0800 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: Message-ID: On Thu, Nov 14, 2013 at 1:55 PM, Tres Seaver wrote: > Fixing any bug is "changing behavior"; 2.7 is not frozen for bugfixes. Thank you. > The real question is whether third-party code will break when the > now-empty error messages appear with '?' littered through them? right -- any bugfix changes behaviour, and any change can break a test or code that is expecting (or working around) that behavior. So the key question here is: are there many (any?) tests or functional code out there that are counting on an empty message if and only if there happens to be a non-ascii character in an assigned message. It's hard for me to imagine that that's a common thing to test for, but then I've been known to lack imagination ;-) > About the only things I can think of which might break would be doctests, > but people *expect* those to break across third-dot releases of Python > (one reason why I hate them). Exception repr is explicitly *not* part of > any backward-compatibility guarantees in Python. Or code which > explicitly works around the breakage could fail (urlparse changes between > 2.7.3 and 2.7.4, anyone?) Sounds do-able to me, then... -Thanks, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Fri Nov 15 00:57:05 2013 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 14 Nov 2013 15:57:05 -0800 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: Message-ID: On Thu, Nov 14, 2013 at 1:20 PM, Victor Stinner wrote: >> If you create an Exception with a unicode object for the message, (...) > In Python 2, there are too many similar corner cases. It is impossible > to fix these bugs without taking the risk of introducing a regression. Yes, there are -- the auto-encoding is a serious pain everywhere. However, this is a case where the resulting Exception is silenced -- it's the only one I know of, and there can't be many like that. > Seriously, *all* these tricky bugs are fixed in Python 3. So don't > lose time on trying to work around them, but invest in the future: > upgrade to Python 3! Maybe so -- but we are either maintaining 2.7 or not -- it WILL be around for a long time yet... (amazing to me how many people are still using <=2.7, actually, even for new projects .. thank you Red Hat "Enterprise" Linux ;-) ) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Fri Nov 15 01:41:35 2013 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 14 Nov 2013 16:41:35 -0800 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: <20131114235841.GL2085@ando> References: <20131114235841.GL2085@ando> Message-ID: On Thu, Nov 14, 2013 at 3:58 PM, Steven D'Aprano wrote: > It's not a given that the current behaviour *is* a bug.
I'll concede that it's not a bug unless someone said somewhere that unicode messages should work .. but that's kind of a semantic argument. I have to say it's a very odd choice to me that it suppresses the message, rather than raising an encoding error, like what happens everywhere else the default encoding is used. In fact, I noticed that the message can be anything that can be stringified, which makes it particularly wacky that you can't use a unicode object. > Exception > messages in 2 are byte-strings, not Unicode. well, they are anything that you can call str() on anyway... > Trying to use Unicode > instead is not, as far as I can tell, supported behaviour. clearly not > If the exception message cannot be converted to a byte-string, > suppressing the display of the message seems like perfectly reasonable > behaviour to me: well, yes and no -- the fact is that unicode objects ARE special -- and it wouldn't hurt to treat them that way. And I'm not sure that suppressing the message when you've passed in a weird object that raises an exception when you try to convert it to a string makes sense either -- suppressing an exception is really not a good idea in general -- you really should have a good reason for it. I'm guessing that this was put in to save a lot of crashing from unicode objects, but what do I know? Actually, when I think about it, Exceptions being raised when you call str() on something are probably pretty rare -- if you define a class with no __str__ method, you get a default string version -- there can't be many use-cases where you want to make sure no one tries to make a string out of your object... > although it would be nice if a newline was used so the prompt was bumped > to the next line. yup -- that would be good. > The point is, I'm not convinced that this is a bug at all. OK -- to clarify the discussion a bit: I think we all agree that this is not a fatal bug that MUST be fixed. Is this something that could be improved, or is the current behavior the best we could have, given the limitations of strings and unicode in py2 anyway? If it's not a desirable change, then we're done -- sorry for the noise. If it is a desirable change, then is the benefit worth the possible breakage of code? To assess that, you need to trade off the size of the benefit with the amount of breakage. I think it would be a pretty nice benefit, and I can't see that it would cause a lot of breakage. Any idea how we could assess how much code or tests are out there in the wild that this would affect? I contend that it wouldn't be much because: If I had thought to write a test for this, I would have thought to fix my code so that it would either never use a unicode object for a message, or, like I have done in my code, encode it when passing it in to the Exception. There is certainly a chance that some doctests would break, if people had not looked carefully at them -- i.e. they wanted to test that the exception was raised, but did not notice that the message didn't get through. How many are there? Who knows? -Chris -- Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From tjreedy at udel.edu Fri Nov 15 02:10:32 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 14 Nov 2013 20:10:32 -0500 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: Message-ID: On 11/14/2013 5:32 PM, Victor Stinner wrote: > I don't like the functions codecs.encode() and codecs.decode() because > the type of the result depends on the encoding (second parameter). We > try to avoid this in Python. Such dependence is common with arithmetic. >>> 1 + 2 3 >>> 1 + 2.0 3.0 >>> 1 + 2+0j (3+0j) >>> sum((1,2,3), 0) 6 >>> sum((1,2,3), 0.0) 6.0 >>> sum((1,2,3), 0.0+0j) (6+0j) for f in (compile, eval, getattr, iter, max, min, next, open, pow, round, type, vars): type(f(*args)) # depends on the inputs That is a large fraction of the non-class builtin functions. -- Terry Jan Reedy From tjreedy at udel.edu Fri Nov 15 02:35:29 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 14 Nov 2013 20:35:29 -0500 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: Message-ID: On 11/14/2013 6:03 PM, Nick Coghlan wrote: > You have to get it out of your head that codecs are just about text and binary data. 99+% of the current codecs module doc leads one to that impression. The fact that codecs are expected to have a file reader and writer, and that the default 'strict' error handler is specified in 2 out of the 3 mostly redundant lists as raising a UnicodeError, reinforces the impression. > They're not: they're arbitrary type transforms, and MAL > deliberately wrote the module that way. Generic functions are quite pythonic. However, I am not sure how much benefit there is to registering an arbitrary pair of bijective functions. > This is completely the wrong approach. There's zero justification for > adding new builtin methods for this use case - encoding and decoding are > generic operations, they should use functions not methods. Making 2&3 code easier is certainly a good reason for the codecs approach. > The next planned commit (to restore the binary codec aliases) *is* a > behavioural change - that's why I posted to the list about it (it > received only two responses, both +1) If I understand correctly, I am mildly +1, but did not respond, thinking that 2 to 0 was sufficient response for you to continue ;-). -- Terry Jan Reedy From tjreedy at udel.edu Fri Nov 15 03:09:06 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 14 Nov 2013 21:09:06 -0500 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: Message-ID: On 11/14/2013 6:57 PM, Chris Barker wrote: > On Thu, Nov 14, 2013 at 1:20 PM, Victor Stinner wrote: >> Seriously, *all* these tricky bugs are fixed in Python 3. So don't >> lose time on trying to work around them, but invest in the future: >> upgrade to Python 3! > > Maybe so -- but we are either maintaining 2.7 or not That statement is too 'binary'. We normally fix general bugs* for two years and security bugs for 3 more years. That is already 'trinary'. For 2.7, we have already done 3 1/2 years of general bug fixing. I expect that that will taper off for the next 1 1/2 years. * We sometimes do not back port a bug fix that theoretically could be backported because we think it would be too disruptive (because people depend on the bug).
When we fix a bug with a feature change that cannot be backported, we do not usually create a separate backport patch unless the bug is severe. In either case, people who want the fix must upgrade. Many unicode bugs in 2.x were fixed in 3.0 by making unicode the text type. For some but not all unicode issues, separate patches have been made for 2.7. People who want the general fix must upgrade. (The unicode future import gives some of the benefits, but maybe not all.) A few more unicode bugs were fixed in 3.3 with the flexible string representation. People who want the 3.3 fix must upgrade, even from 3.2. > -- it WILL be around for a long time yet... 1.5 was around for a long time; not sure if it is completely gone yet. -- Terry Jan Reedy From steve at pearwood.info Fri Nov 15 04:05:29 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 15 Nov 2013 14:05:29 +1100 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: Message-ID: <20131115030529.GM2085@ando> On Thu, Nov 14, 2013 at 09:09:06PM -0500, Terry Reedy wrote: > 1.5 was around for a long time; not sure if it is completely gone yet. It's not. I forget the details, but after the last American PyCon, somebody posted a message about a fellow they met who was still using 1.5 in production. -- Steven From steve at pearwood.info Fri Nov 15 04:08:55 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 15 Nov 2013 14:08:55 +1100 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: Message-ID: <20131115030855.GN2085@ando> On Thu, Nov 14, 2013 at 04:02:17PM -0800, Chris Barker wrote: > On Thu, Nov 14, 2013 at 1:55 PM, Tres Seaver wrote: > > > Fixing any bug is "changing behavior"; 2.7 is not frozen for bugfixes. > > Thank you. > > > The real question is whether third-party code will break when the > > now-empty error messages appear with '?' littered through them? > > right -- any bugfix changes behaviour It isn't clear that this is a bug at all. Non-ascii Unicode strings are just a special case of the more general problem of what to do if printing the exception raises. If str(exception.message) raises, suppressing the message seems like a perfectly reasonable approach to me. -- Steven From tjreedy at udel.edu Fri Nov 15 04:10:58 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 14 Nov 2013 22:10:58 -0500 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: <20131114235841.GL2085@ando> Message-ID: On 11/14/2013 7:41 PM, Chris Barker wrote: > On Thu, Nov 14, 2013 at 3:58 PM, Steven D'Aprano wrote: > >> It's not a given that the current behaviour *is* a bug. > > I'll concede that it's not a bug unless someone said somewhere that > unicode messages should work In particular, what does the reference manual say? > .. but that's kind of a semantic argument. Given that committing a patch to an existing version is a binary action -- done or not -- we have to have a binary semantic decision, 'bug' or not, even when the best answer is 'sort of'. We cannot 'sort of' apply a patch ;-). > I have to say it's a very odd choice to me that it suppresses the > message, rather than raising an encoding error, like what happens > everywhere else the default encoding is used. An encoding exception is raised but ignored. Exception handling has changed in some details in 3.x. Sometimes two sensible actions interact in certain contexts to produce an odd result.
> In fact, I noticed that the message can be anything that can be > stringified, which makes it particularly wacky that you can't use a > unicode object. You can, as long as it can be stringified with the default args. If it cannot be, then convert it yourself, with the alternative you choose (raise or substitute). > Is this something that could be improved or is the current behavior > the best we could have, given the limitations of strings and unicode in > py2 anyway? From our (core developer) viewpoint, that is the wrong question. 2.7 does not get enhancements. The situation would be different if there were going to be a 2.8. -- Terry Jan Reedy From cs at zip.com.au Fri Nov 15 04:21:24 2013 From: cs at zip.com.au (Cameron Simpson) Date: Fri, 15 Nov 2013 14:21:24 +1100 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: Message-ID: <20131115032124.GA35038@cskk.homeip.net> On 14Nov2013 15:57, Chris Barker - NOAA Federal wrote: > (amazing to me how many people are still using <=2.7, actually, even > for new projects .. thank you Red Hat "Enterprise" Linux ;-) ) Well, one of the things RHEL gets you is platform stability (they backport fixes; primarily security in the older RHEL streams). So of course the Python dates to the time of the release. I install a current Python 2.7 into /usr/local on many RHEL boxes and target that for custom code. -- Cameron Simpson There is this special biologist word we use for 'stable'. It is 'dead'. - Jack Cohen From cs at zip.com.au Fri Nov 15 04:28:48 2013 From: cs at zip.com.au (Cameron Simpson) Date: Fri, 15 Nov 2013 14:28:48 +1100 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: <20131115030855.GN2085@ando> References: <20131115030855.GN2085@ando> Message-ID: <20131115032848.GA39820@cskk.homeip.net> On 15Nov2013 14:08, Steven D'Aprano wrote: > On Thu, Nov 14, 2013 at 04:02:17PM -0800, Chris Barker wrote: > > right -- any bugfix changes behaviour > > It isn't clear that this is a bug at all. > > Non-ascii Unicode strings are just a special case of the more general > problem of what to do if printing the exception raises. If > str(exception.message) raises, suppressing the message seems like a > perfectly reasonable approach to me. Not to me. Silent failure is really nasty. In fact, doesn't the Zen speak explicitly against it? I'm debugging a program right now with silent failures; my own code, with functions submitted to a queue for asynchronous execution, and the queue preserves the function result (or exception) for collection later; if that collection doesn't happen you get... silent failure! I think that if an exception escapes to the outside for reporting, and the reporting raises an exception (especially an "expectable" one like unicode coding/decoding errors), the reporting should have at least a layer of "ouch, report failed, try something uglier but more conservative". At least you'd know there had been a failure.
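Something like this, say (an untested sketch; the helper name and the exact fallbacks are mine):

import sys

def report_exception(exc):
    # First try the normal, pretty message.
    try:
        msg = str(exc)
    except Exception:
        # Uglier but more conservative fallbacks.
        try:
            msg = unicode(exc).encode('ascii', 'backslashreplace')
        except Exception:
            msg = '<unprintable %s instance>' % type(exc).__name__
    sys.stderr.write('%s: %s\n' % (type(exc).__name__, msg))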
Cheers, -- Cameron Simpson Windows is really user friendly - it doesn't crash on its own, it first opens a dialog box, saying it will crash and you have to click OK :-) - Zoltan Kocsi From steve at pearwood.info Fri Nov 15 06:42:07 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 15 Nov 2013 16:42:07 +1100 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: <20131115032848.GA39820@cskk.homeip.net> References: <20131115030855.GN2085@ando> <20131115032848.GA39820@cskk.homeip.net> Message-ID: <20131115054207.GO2085@ando> On Fri, Nov 15, 2013 at 02:28:48PM +1100, Cameron Simpson wrote: > > Non-ascii Unicode strings are just a special case of the more general > > problem of what to do if printing the exception raises. If > > str(exception.message) raises, suppressing the message seems like a > > perfectly reasonable approach to me. > > Not to me. Silent failure is really nasty. In fact, doesn't the Zen > speak explicitly against it? But it's not really a silent failure, since you're already dealing with an exception, and that's the important one. The original exception is not suppressed, just the error message. If the original exception was replaced with a different exception: # this doesn't actually happen py> raise ValueError(u"¿what?") Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: error displaying exception message py> or lost altogether: # neither does this py> raise ValueError(u"¿what?") py> then I would consider that a bug. Ideally, we should get a chained exception so you can see both the original and subsequent exceptions: Traceback (most recent call last): File "<stdin>", line 2, in <module> ValueError: During handling of the above exception, another exception occurred: Traceback (most recent call last): File "<stdin>", line 4, in <module> UnicodeEncodeError: 'ascii' codec can't encode character u'\xbf' in position 0: ordinal not in range(128) but Python 2 doesn't have chained exceptions so that's not an option. As for the Zen, the nice thing about that is that it can argue both sides of most questions :-) The Zen has something else to say about this: Special cases aren't special enough to break the rules. Except as the next line in the Zen suggests, sometimes they are :-) Unicode encoding errors are just a special case of arbitrary objects that can't be converted to byte strings. If the exception message can't be stringified, in general there's really nothing you can do about it. I suppose one might argue for inserting a meta-error message: ValueError: ***the error message could not be displayed*** but that strikes me as too subtle, potentially confusing, and generally problematic. Ultimately, in the absence of chained exceptions I don't think there's any good solution to the general problem, and I'm not convinced that treating Unicode strings as a special case is justified. It's been at least four, and possibly six (back to 2.2) point releases with this behaviour, and until now apparently nobody has noticed. -- Steven From walter at livinglogic.de Fri Nov 15 08:09:47 2013 From: walter at livinglogic.de (Walter Dörwald) Date: Fri, 15 Nov 2013 08:09:47 +0100 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: Message-ID: <051807F7-D6A0-4BC5-BE4C-7EA014B42F75@livinglogic.de> On 15.11.2013 at 00:42, Serhiy Storchaka wrote: > > 15.11.13 00:32, Victor Stinner wrote: >> And add transform() and untransform() methods to bytes and str types.
>> In practice, it might be the same codecs registry for all codecs, just with >> a new attribute. > > If the transform() method is added, I prefer to have only one transformation method and specify the direction by the transformation name ("bzip2"/"unbzip2"). +1 Some of the transformations might not be revertible (s.transform("lower")? ;)) And the transform function probably doesn't need any error handling machinery. What about the stream/iterator/incremental parts of the codec API? Servus, Walter From ncoghlan at gmail.com Fri Nov 15 08:13:34 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 15 Nov 2013 17:13:34 +1000 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: Message-ID: On 15 November 2013 11:10, Terry Reedy wrote: > On 11/14/2013 5:32 PM, Victor Stinner wrote: > >> I don't like the functions codecs.encode() and codecs.decode() because >> the type of the result depends on the encoding (second parameter). We >> try to avoid this in Python. > > > Such dependence is common with arithmetic. > >>>> 1 + 2 > 3 >>>> 1 + 2.0 > 3.0 >>>> 1 + 2+0j > (3+0j) > >>>> sum((1,2,3), 0) > 6 >>>> sum((1,2,3), 0.0) > 6.0 >>>> sum((1,2,3), 0.0+0j) > (6+0j) > > for f in (compile, eval, getattr, iter, max, min, next, open, pow, round, > type, vars): > type(f(*args)) # depends on the inputs > That is a large fraction of the non-class builtin functions. *Type* dependence between inputs and outputs is common (and completely non-controversial). The codecs system is different, since the supported input and output types are *value* dependent, driven by the name of the codec. That's the part which makes the codec machinery interesting in general, since it combines a value driven lazy loading mechanism (based on the codec name) with the subsequent invocation of that mechanism: the default codec search algorithm goes hunting in the "encodings" package (or the alias dictionary), but you can register custom search algorithms and provide encodings any way you want. It does mean, however, that the most you can claim for the type signature of codecs.encode and codecs.decode is that they accept an object and return an object. Beyond that, it's completely driven by the value of the codec. In Python 2.x, the type constraints imposed by the str and unicode convenience methods are "basestring in, basestring out". As it happens, all of the standard library codecs abide by that restriction, so it was easy to interpret the codecs module itself as having the same "basestring in, basestring out" limitation, especially given the heavy focus on text encodings in the way it was documented. In practice, the codecs weren't that open ended - some of them only accepted 8 bit strings, some only accepted unicode, some accepted both (perhaps relying on implicit decoding to unicode). The migration to Python 3 made the contrast between the two far more stark, however, hence the long and involved discussion on issue 7475, and the fact that the non-Unicode codecs are currently still missing their shorthand aliases. The proposal I posted to issue 7475 back in April (and, in the absence of any objections to the proposal, finally implemented over the past few weeks) was to take advantage of the fact that the codecs.encode and codecs.decode convenience functions exist (and have been covered by the regression test suite) as far back as Python 2.4.
I did this merely by documenting the existence of the functions for Python 2.7, 3.3 and 3.4, changing the exception messages thrown for codec output type errors on the convenience methods to reference them, and by updating the Python 3.4 What's New document to explain the changes. This approach provides a Python 2/3 compatible solution for usage of non-Unicode encodings: users simply need to call the existing module level functions in the codecs module, rather than using the methods on specific builtin types. This approach also means that the binary codecs can be used with any bytes-like object (including memoryview and array.array), rather than being limited to types that implement a new method (like "transform"), and can also be used in Python 2/3 source compatible APIs (since the data driven nature of the problem makes 2to3 unusable as a solution, and that doesn't help single code base projects anyway). From my point of view, this is now just a matter of better documenting the status quo, and nudging people in the right direction when it comes to using the appropriate API for non-Unicode codecs. Since we now realise these functions have existed since Python 2.4, it doesn't make sense to try to fundamentally change direction, but instead to work on making it better. A few things I noticed while implementing the recent updates: - as you noted in your other email, while MAL is on record as saying the codecs module is intended for arbitrary codecs, not just Unicode encodings, readers of the current docs can definitely be forgiven for not realising that. We really need to better separate the codecs module docs from the text model docs (two new sections in the language reference, one for the codecs machinery and one for the text model would likely be appropriate. The io module docs and those for the builtin open function may also be affected) - a mechanism for annotating frames would help avoid the need for nasty hacks like the exception wrapping that aims to make codec failures easier to debug - if codecs exposed a way to separate the input type check from the invocation of the codec, we could redirect users to the module API for bad input types as well (e.g. calling "input str".encode("bz2")) - if we want something that doesn't need to be imported, then encode() and decode() builtins make more sense than new methods on str, bytes and bytearray objects (since builtins would support memoryview and array.array as well, and it avoids ambiguity regarding the direction of the operation) - an ABC for "ByteSequence" might be a good idea (ducktyped to check if memoryview can retrieve a 1-D byte array) - an ABC for "String" might be a good idea (but opens a big can of worms) - the codecs module should offer a way to register a new alias for an existing codec - the codecs module should offer a way to map a name to a CodecInfo object without registering a new search function - encodings should be converted to a namespace package Cheers, Nick.
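P.S. For concreteness, the module level spelling in question (a minimal sketch, using the canonical codec names since the shorthand aliases are only now being restored):

import codecs

data = b"some payload"
packed = codecs.encode(data, "zlib_codec")    # bytes -> bytes, on 2.4+ and 3.x
assert codecs.decode(packed, "zlib_codec") == data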
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stefan_ml at behnel.de Fri Nov 15 08:22:59 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 15 Nov 2013 08:22:59 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Close #17828: better handling of codec errors In-Reply-To: References: <3dKS143jbKz7Lk8@mail.python.org> <52838D0E.5040606@livinglogic.de> Message-ID: Nick Coghlan, 13.11.2013 17:25: > Note that the specific problem with just annotating the exception > rather than a specific frame is that you lose the stack context for > where the annotation occurred. The current chaining workaround doesn't > just change the exception message, it also breaks the stack into two > pieces (inside and outside the codec) that get displayed separately. I find this specific chain of exceptions a bit excessive, though: """ Failed example: str(result) Expected: Traceback (most recent call last): ... LookupError: unknown encoding: UCS4 Got: LookupError: unknown encoding: UCS4 The above exception was the direct cause of the following exception: Traceback (most recent call last): File ".../py3km/python/lib/python3.4/doctest.py", line 1291, in __run compileflags, 1), test.globs) File "", line 1, in str(result) File "xslt.pxi", line 727, in lxml.etree._XSLTResultTree.__str__ (src/lxml/lxml.etree.c:143584) File "xslt.pxi", line 750, in lxml.etree._XSLTResultTree.__unicode__ (src/lxml/lxml.etree.c:143853) LookupError: decoding with 'UCS4' codec failed (LookupError: unknown encoding: UCS4) """ I can't see any bit of information being added by chaining the exceptions in this specific case. Remember that each change to exception messages and/or exception chaining will break someone's doctests somewhere, and it's really ugly to work around chained exceptions in (cross-Py-version) doctests. I understand that this is helpful *in general*, though, i.e. for other kinds of exceptions in codecs, so maybe changing the exception handling in the doctest module could be a work-around for this kind of change? Stefan From mal at egenix.com Fri Nov 15 09:37:09 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 15 Nov 2013 09:37:09 +0100 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: Message-ID: <5285DD35.5050401@egenix.com> On 15.11.2013 08:13, Nick Coghlan wrote: > On 15 November 2013 11:10, Terry Reedy wrote: >> On 11/14/2013 5:32 PM, Victor Stinner wrote: >> >>> I don't like the functions codecs.encode() and codecs.decode() because >>> the type of the result depends on the encoding (second parameter). We >>> try to avoid this in Python. >> >> >> Such dependence is common with arithmetic. >> >>>>> 1 + 2 >> 3 >>>>> 1 + 2.0 >> 3.0 >>>>> 1 + 2+0j >> (3+0j) >> >>>>> sum((1,2,3), 0) >> 6 >>>>> sum((1,2,3), 0.0) >> 6.0 >>>>> sum((1,2,3), 0.0+0j) >> (6+0j) >> >> for f in (compile, eval, getattr, iter, max, min, next, open, pow, round, >> type, vars): >> type(f(*args)) # depends on the inputs >> That is a large fraction of the non-class builtin functions. > > *Type* dependence between inputs and outputs is common (and completely > non-controversial). The codecs system is different, since the > supported input and output types are *value* dependent, driven by the > name of the codec. 
> > That's the part which makes the codec machinery interesting in > general, since it combines a value driven lazy loading mechanism > (based on the codec name) with the subsequent invocation of that > mechanism: the default codec search algorithm goes hunting in the > "encodings" package (or the alias dictionary), but you can register > custom search algorithms and provide encodings any way you want. It > does mean, however, that the most you can claim for the type signature > of codecs.encode and codecs.decode is that they accept an object and > return an object. Beyond that, it's completely driven by the value of > the codec. Indeed. You have to think of the codec registry as a mere lookup mechanism - very much like an import. The implementation of the imported module defines which types are supported and how the encode/decode steps work. > In Python 2.x, the type constraints imposed by the str and unicode > convenience methods are "basestring in, basestring out". As it happens, > all of the standard library codecs abide by that restriction, so it > was easy to interpret the codecs module itself as having the same > "basestring in, basestring out" limitation, especially given the heavy > focus on text encodings in the way it was documented. In practice, the > codecs weren't that open ended - some of them only accepted 8 bit > strings, some only accepted unicode, some accepted both (perhaps > relying on implicit decoding to unicode). > > The migration to Python 3 made the contrast between the two far more > stark, however, hence the long and involved discussion on issue 7475, > and the fact that the non-Unicode codecs are currently still missing > their shorthand aliases. > > The proposal I posted to issue 7475 back in April (and, in the absence > of any objections to the proposal, finally implemented over the past > few weeks) was to take advantage of the fact that the codecs.encode > and codecs.decode convenience functions exist (and have been covered > by the regression test suite) as far back as Python 2.4. I did this > merely by documenting the existence of the functions for Python 2.7, > 3.3 and 3.4, changing the exception messages thrown for codec output > type errors on the convenience methods to reference them, and by > updating the Python 3.4 What's New document to explain the changes. > > This approach provides a Python 2/3 compatible solution for usage of > non-Unicode encodings: users simply need to call the existing module > level functions in the codecs module, rather than using the methods on > specific builtin types. This approach also means that the binary > codecs can be used with any bytes-like object (including memoryview > and array.array), rather than being limited to types that implement a > new method (like "transform"), and can also be used in Python 2/3 > source compatible APIs (since the data driven nature of the problem > makes 2to3 unusable as a solution, and that doesn't help single code > base projects anyway). Right, and that was the main point in making codecs flexible in this respect. There are many other types which can serve as input and output - in the stdlib and interpreter as well as in extension modules that implement their own types. > From my point of view, this is now just a matter of better documenting > the status quo, and nudging people in the right direction when it > comes to using the appropriate API for non-Unicode codecs.
> Since we now realise these functions have existed since Python 2.4, it doesn't > make sense to try to fundamentally change direction, but instead to > work on making it better. > > A few things I noticed while implementing the recent updates: > > - as you noted in your other email, while MAL is on record as saying > the codecs module is intended for arbitrary codecs, not just Unicode > encodings, readers of the current docs can definitely be forgiven for > not realising that. We really need to better separate the codecs > module docs from the text model docs (two new sections in the language > reference, one for the codecs machinery and one for the text model > would likely be appropriate. The io module docs and those for the > builtin open function may also be affected) Agreed. > - a mechanism for annotating frames would help avoid the need for > nasty hacks like the exception wrapping that aims to make codec > failures easier to debug > - if codecs exposed a way to separate the input type check from the > invocation of the codec, we could redirect users to the module API for > bad input types as well (e.g. calling "input str".encode("bz2")) This is one feature that's still missing from the codec design: there's currently no way to do input/output type introspection. It would be great to have codecs expose a mapping of input to output types in their CodecInfo structure. > - if we want something that doesn't need to be imported, then encode() > and decode() builtins make more sense than new methods on str, bytes > and bytearray objects (since builtins would support memoryview and > array.array as well, and it avoids ambiguity regarding the direction > of the operation) I'm not sure I understand this part. > - an ABC for "ByteSequence" might be a good idea (ducktyped to check > if memoryview can retrieve a 1-D byte array) > - an ABC for "String" might be a good idea (but opens a big can of worms) > - the codecs module should offer a way to register a new alias for an > existing codec For the builtin codecs, this is already possible via the encodings.aliases module, but I agree: making the registry more flexible in this respect would probably be better. I must admit that the codecs module is somewhat over-engineered with respect to the search function idea. The motivation was to abstract the lookup mechanism, in order to make it possible to e.g. dynamically load codecs in other ways, or to have codecs which implement a whole set of encodings with a single implementation, or to implement new ways of aliasing encoding names, etc. At the time I designed this, it wasn't clear which approach would be used. Today, I guess most people simply rely on the search function that is implemented in the encodings package. It may be a good idea to make the encodings package search function the preferred search function for codecs in Python and then slowly deprecate the search function registry and replace it with a more direct codec registry built into the encodings package. > - the codecs module should offer a way to map a name to a CodecInfo > object without registering a new search function Yep. See above. > - encodings should be converted to a namespace package Agreed. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 15 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ...
http://python.egenix.com/ ________________________________________________________________________ 2013-11-19: Python Meeting Duesseldorf ... 4 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From solipsis at pitrou.net Fri Nov 15 10:22:28 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 15 Nov 2013 10:22:28 +0100 Subject: [Python-Dev] Add transform() and untranform() methods References: Message-ID: <20131115102228.4d828516@fsol> On Fri, 15 Nov 2013 09:03:37 +1000 Nick Coghlan wrote: > > > And add transform() and untransform() methods to bytes and str types. > > In practice, it might be the same codecs registry for all codecs, just with > > a new attribute. > > This is completely the wrong approach. There's zero justification for > adding new builtin methods for this use case - encoding and decoding are > generic operations, they should use functions not methods. I'm sorry, I disagree. The question is what use case it is solving, and there's zero benefit in writing codecs.encode(b, "zlib") compared to e.g. zlib.compress(b). A transform() or untransform() method, however, allows for a much more convenient spelling, with easy cascading, e.g.: b.transform("zlib").transform("base64") In other words, there's zero justification for codecs.encode() and codecs.decode(). The fact that the codecs machinery works on arbitrary object transformation is a pointless genericity, if it doesn't bring any additional convenience compared to the canonical functions in their respective modules. > At this point, the only person that can get me to revert this clarification > of MAL's original vision for the codecs module is Guido, since anything > else completely fails to address the Python 3 adoption barrier posed by the > current state of Python 3's binary codec support. I'd like to challenge your assertion that your change addresses anything. It's not easier to change b.encode("zlib") into codecs.encode(b, "zlib"), than it is to change it into zlib.compress(b). Regards, Antoine. From tim.peters at gmail.com Fri Nov 15 07:48:33 2013 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 15 Nov 2013 00:48:33 -0600 Subject: [Python-Dev] Finding overlapping matches with re assertions: bug or feature? Message-ID: I was surprised to find that "this works": if you want to find all _overlapping_ matches for a regexp R, wrap it in (?=(R)) and feed it to (say) finditer. Here's a very simple example, finding all overlapping occurrences of "xx": pat = re.compile("(?=(xx))") for it in pat.finditer("xxxx"): print(it.span(1)) That displays: (0, 2) (1, 3) (2, 4) Is that a feature? Or an accident?
It's very surprising to find a non-empty match inside an empty match (the outermost lookahead assertion). If it's intended behavior, it's just in time for the holiday season; e.g., to generate ASCII art for half an upside-down Christmas tree: pat = re.compile("(?=(x+))") for it in pat.finditer("xxxxxxxxxx"): print(it.group(1)) From p.f.moore at gmail.com Fri Nov 15 10:47:56 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 15 Nov 2013 09:47:56 +0000 Subject: [Python-Dev] Finding overlapping matches with re assertions: bug or feature? In-Reply-To: References: Message-ID: On 15 November 2013 06:48, Tim Peters wrote: > Is that a feature? Or an accident? It's very surprising to find a > non-empty match inside an empty match (the outermost lookahead > assertion). Personally, I would read (?=(R)) as finding an empty match at a point where R starts. There's no implication that R is in any sense "inside" the match. (?=(\b\w\w\w\w\w\w))\w\w\w finds the first 3 characters of words that are 6 or more characters long. Once again, the lookahead extends beyond the extent of the main match. It's obscure and a little bizarre, but I'd say it's intended and a logical consequence of the definitions. Paul From techtonik at gmail.com Fri Nov 15 10:54:25 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 15 Nov 2013 12:54:25 +0300 Subject: [Python-Dev] Assign(expr* targets, expr value) - why targetS? In-Reply-To: References: Message-ID: On Tue, Nov 12, 2013 at 5:08 PM, Benjamin Peterson wrote: > 2013/11/12 anatoly techtonik : >> On Sun, Nov 10, 2013 at 8:34 AM, Benjamin Peterson wrote: >>> 2013/11/10 anatoly techtonik : >>>> http://hg.python.org/cpython/file/1ee45eb6aab9/Parser/Python.asdl >>>> >>>> In Assign(expr* targets, expr value), why is the first argument a list? >>> >>> x = y = 42 >> >> Thanks. >> >> Speaking of this ASDL. `expr* targets` means that multiple entities of >> `expr` under the name 'targets' can be passed to the Assign statement. >> Assign uses them as the left value. But the `expr` definition contains things >> that cannot be used as left-side assignment targets: >> >> expr = BoolOp(boolop op, expr* values) >> | BinOp(expr left, operator op, expr right) >> ... >> | Str(string s) -- need to specify raw, unicode, etc? >> | Bytes(bytes s) >> | NameConstant(singleton value) >> | Ellipsis >> >> -- the following expression can appear in assignment context >> | Attribute(expr value, identifier attr, expr_context ctx) >> | Subscript(expr value, slice slice, expr_context ctx) >> | Starred(expr value, expr_context ctx) >> | Name(identifier id, expr_context ctx) >> | List(expr* elts, expr_context ctx) >> | Tuple(expr* elts, expr_context ctx) >> >> If I understand correctly, this is compiled into C struct definitions >> (Python-ast.c), and there is code to traverse the structure, but >> where is the code that validates that the structure is correct? Is it done >> on the first level - text file parsing, before ASDL is built? If so, >> then what exactly does this ASDL do that the first step cannot? > > Only valid expression targets are allowed during AST construction. See > set_expr_context in ast.c. Oh my. Now there is also a CST in addition to the AST. This stuff - http://docs.python.org/devguide/ - badly needs diagrams of the data transformation toolchain from Python source code to machine execution instructions. I'd like some pretty stuff, but a raw blockdiag hack will do the job http://blockdiag.com/en/blockdiag/index.html There is no set_expr_context in my copy of the CPython code, which seems to be some alpha of Python 3.4 >> Is it possible to fix the ASDL to move the `expr` variants that are allowed in Assign >> into an `expr` subset? What effect will it achieve? I mean - will the ASDL >> compiler complain about wrong stuff on the left side, or will that still >> be the role of some other component? Which one? > > I'm not sure what you mean by an `expr` subset. Transform this: expr = BoolOp(boolop op, expr* values) | BinOp(expr left, operator op, expr right) ... | Str(string s) -- need to specify raw, unicode, etc?
| Bytes(bytes s) | NameConstant(singleton value) | Ellipsis -- the following expression can appear in assignment context | Attribute(expr value, identifier attr, expr_context ctx) | Subscript(expr value, slice slice, expr_context ctx) | Starred(expr value, expr_context ctx) | Name(identifier id, expr_context ctx) | List(expr* elts, expr_context ctx) | Tuple(expr* elts, expr_context ctx) to this: expr = BoolOp(boolop op, expr* values) | BinOp(expr left, operator op, expr right) ... | Str(string s) -- need to specify raw, unicode, etc? | Bytes(bytes s) | NameConstant(singleton value) | Ellipsis -- the following expression can appear in assignment context | expr_asgn expr_asgn = Attribute(expr value, identifier attr, expr_context ctx) | Subscript(expr value, slice slice, expr_context ctx) | Starred(expr value, expr_context ctx) | Name(identifier id, expr_context ctx) | List(expr* elts, expr_context ctx) | Tuple(expr* elts, expr_context ctx) From techtonik at gmail.com Fri Nov 15 10:56:39 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 15 Nov 2013 12:56:39 +0300 Subject: [Python-Dev] Assign(expr* targets, expr value) - why targetS? In-Reply-To: References: Message-ID: On Fri, Nov 15, 2013 at 12:54 PM, anatoly techtonik wrote: > On Tue, Nov 12, 2013 at 5:08 PM, Benjamin Peterson wrote: >> 2013/11/12 anatoly techtonik : >>> On Sun, Nov 10, 2013 at 8:34 AM, Benjamin Peterson wrote: >>>> 2013/11/10 anatoly techtonik : >>>>> http://hg.python.org/cpython/file/1ee45eb6aab9/Parser/Python.asdl >>>>> >>>>> In Assign(expr* targets, expr value), why is the first argument a list? >>>> >>>> x = y = 42 >>> >>> Thanks. >>> >>> Speaking of this ASDL. `expr* targets` means that multiple entities of >>> `expr` under the name 'targets' can be passed to the Assign statement. >>> Assign uses them as the left value. But the `expr` definition contains things >>> that cannot be used as left-side assignment targets: >>> >>> expr = BoolOp(boolop op, expr* values) >>> | BinOp(expr left, operator op, expr right) >>> ... >>> | Str(string s) -- need to specify raw, unicode, etc? >>> | Bytes(bytes s) >>> | NameConstant(singleton value) >>> | Ellipsis >>> >>> -- the following expression can appear in assignment context >>> | Attribute(expr value, identifier attr, expr_context ctx) >>> | Subscript(expr value, slice slice, expr_context ctx) >>> | Starred(expr value, expr_context ctx) >>> | Name(identifier id, expr_context ctx) >>> | List(expr* elts, expr_context ctx) >>> | Tuple(expr* elts, expr_context ctx) >>> >>> If I understand correctly, this is compiled into C struct definitions >>> (Python-ast.c), and there is code to traverse the structure, but >>> where is the code that validates that the structure is correct? Is it done >>> on the first level - text file parsing, before ASDL is built? If so, >>> then what exactly does this ASDL do that the first step cannot? >> >> Only valid expression targets are allowed during AST construction. See >> set_expr_context in ast.c. > > Oh my. Now there is also a CST in addition to the AST. This stuff - > http://docs.python.org/devguide/ - badly needs diagrams of the data > transformation toolchain from Python source code to machine > execution instructions.
> I'd like some pretty stuff, but a raw blockdiag hack will do the job http://blockdiag.com/en/blockdiag/index.html > > There is no set_expr_context in my copy of the CPython code, which > seems to be some alpha of Python 3.4 > >>> Is it possible to fix the ASDL to move the `expr` variants that are allowed in Assign >>> into an `expr` subset? What effect will it achieve? I mean - will the ASDL >>> compiler complain about wrong stuff on the left side, or will that still >>> be the role of some other component? Which one? >> >> I'm not sure what you mean by an `expr` subset. > > Transform this: > > expr = BoolOp(boolop op, expr* values) > | BinOp(expr left, operator op, expr right) > ... > | Str(string s) -- need to specify raw, unicode, etc? > | Bytes(bytes s) > | NameConstant(singleton value) > | Ellipsis > > -- the following expression can appear in assignment context > | Attribute(expr value, identifier attr, expr_context ctx) > | Subscript(expr value, slice slice, expr_context ctx) > | Starred(expr value, expr_context ctx) > | Name(identifier id, expr_context ctx) > | List(expr* elts, expr_context ctx) > | Tuple(expr* elts, expr_context ctx) > > to this: > > expr = BoolOp(boolop op, expr* values) > | BinOp(expr left, operator op, expr right) > ... > | Str(string s) -- need to specify raw, unicode, etc? > | Bytes(bytes s) > | NameConstant(singleton value) > | Ellipsis > > -- the following expression can appear in assignment context > | expr_asgn > > expr_asgn = Attribute(expr value, identifier attr, expr_context ctx) > | Subscript(expr value, slice slice, expr_context ctx) > | Starred(expr value, expr_context ctx) > | Name(identifier id, expr_context ctx) > | List(expr* elts, expr_context ctx) > | Tuple(expr* elts, expr_context ctx) And also this: | Assign(expr* targets, expr value) to this: | Assign(expr_asgn* targets, expr value) From steve at pearwood.info Fri Nov 15 11:02:41 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 15 Nov 2013 21:02:41 +1100 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: Message-ID: <20131115100241.GP2085@ando> On Fri, Nov 15, 2013 at 05:13:34PM +1000, Nick Coghlan wrote: > A few things I noticed while implementing the recent updates: > > - as you noted in your other email, while MAL is on record as saying > the codecs module is intended for arbitrary codecs, not just Unicode > encodings, readers of the current docs can definitely be forgiven for > not realising that. We really need to better separate the codecs > module docs from the text model docs (two new sections in the language > reference, one for the codecs machinery and one for the text model > would likely be appropriate. The io module docs and those for the > builtin open function may also be affected) > - a mechanism for annotating frames would help avoid the need for > nasty hacks like the exception wrapping that aims to make codec > failures easier to debug > - if codecs exposed a way to separate the input type check from the > invocation of the codec, we could redirect users to the module API for > bad input types as well (e.g. calling "input str".encode("bz2")) > - if we want something that doesn't need to be imported, then encode() > and decode() builtins make more sense than new methods on str, bytes > and bytearray objects (since builtins would support memoryview and > array.array as well, and it avoids ambiguity regarding the direction > of the operation) Sounds good to me.
> - the codecs module should offer a way to register a new alias for an > existing codec > - the codecs module should offer a way to map a name to a CodecInfo > object without registering a new search function It would be really good to be able to query the available codecs. For example, many applications offer an "Encoding" menu, where you can specify the codec used for text. That's hard in Python, since you can't retrieve a list of known codecs. -- Steven From storchaka at gmail.com Fri Nov 15 11:10:41 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 15 Nov 2013 12:10:41 +0200 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: <20131115100241.GP2085@ando> References: <20131115100241.GP2085@ando> Message-ID: 15.11.13 12:02, Steven D'Aprano wrote: > It would be really good to be able to query the available codecs. For > example, many applications offer an "Encoding" menu, where you can > specify the codec used for text. That's hard in Python, since you > can't retrieve a list of known codecs. And you can't determine which codecs are binary<->text encodings. From steve at pearwood.info Fri Nov 15 11:28:35 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 15 Nov 2013 21:28:35 +1100 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: <20131115102228.4d828516@fsol> References: <20131115102228.4d828516@fsol> Message-ID: <20131115102835.GQ2085@ando> On Fri, Nov 15, 2013 at 10:22:28AM +0100, Antoine Pitrou wrote: > On Fri, 15 Nov 2013 09:03:37 +1000 Nick Coghlan wrote: > > > > > And add transform() and untransform() methods to bytes and str types. > > > In practice, it might be the same codecs registry for all codecs, just with > > > a new attribute. > > > > This is completely the wrong approach. There's zero justification for > > adding new builtin methods for this use case - encoding and decoding are > > generic operations, they should use functions not methods. > > I'm sorry, I disagree. The question is what use case it is solving, and > there's zero benefit in writing codecs.encode(b, "zlib") compared to e.g. > zlib.compress(b). One benefit is: import codecs codec = get_name_of_compression_codec() result = codecs.encode(data, codec) versus: codec = get_name_of_compression_codec() if codec == "zlib": import zlib encoder = zlib.compress elif codec == "bz2": import bz2 encoder = bz2.compress elif codec == "gzip": import gzip encoder = gzip.compress elif codec == "squash": import mySquashLib encoder = mySquashLib.squash elif ...: # and so on result = encoder(data) > A transform() or untransform() method, however, allows for a much more > convenient spelling, with easy cascading, e.g.: > > b.transform("zlib").transform("base64") Yes, that's quite nice. Although it need not be a method, a built-in function works for me too: # either of these: transform(transform(b, "zlib"), "base64") encode(encode(b, "zlib"), "base64") If encoding/decoding is intended to be completely generic (even if 99% of the uses will be with strings and bytes), is there any reason to prefer built-in functions rather than methods on object?
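For what it's worth, the existing module-level functions already chain tolerably well today. A quick sketch (assuming the bytes-to-bytes codecs are available under their canonical names):

import codecs

def encode_chain(data, *codec_names):
    # Apply each named codec in turn (e.g. compress, then base64).
    for name in codec_names:
        data = codecs.encode(data, name)
    return data

packed = encode_chain(b"payload", "zlib_codec", "base64_codec")
original = codecs.decode(codecs.decode(packed, "base64_codec"), "zlib_codec")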
--
Steven

From solipsis at pitrou.net Fri Nov 15 11:33:47 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 15 Nov 2013 11:33:47 +0100
Subject: [Python-Dev] Add transform() and untranform() methods
References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando>
Message-ID: <20131115113347.52f5b10c@fsol>

On Fri, 15 Nov 2013 21:28:35 +1100 Steven D'Aprano wrote:
>
> One benefit is:
>
> import codecs
> codec = get_name_of_compression_codec()
> result = codecs.encode(data, codec)

That's a good point.

> If encoding/decoding is intended to be completely generic (even if 99%
> of the uses will be with strings and bytes), is there any reason to
> prefer built-in functions rather than methods on object?

Practicality beats purity. Personally, I've never used codecs on
anything else than str and bytes objects.

Regards

Antoine.

From storchaka at gmail.com Fri Nov 15 11:46:41 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Fri, 15 Nov 2013 12:46:41 +0200
Subject: [Python-Dev] Add transform() and untranform() methods
In-Reply-To: <20131115102835.GQ2085@ando>
References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando>
Message-ID: 

15.11.13 12:28, Steven D'Aprano wrote:
> One benefit is:
>
> import codecs
> codec = get_name_of_compression_codec()
> result = codecs.encode(data, codec)

And this is a security hole if you don't check the codec name before
calling the codec. See the earlier topic about exploiting zip bombs via
the codecs machinery.

Also, you usually need more than just decompressing binary data by codec
name. You need to map the external compression name to an internal Python
codec name, configure the decompressor object with specific options, and
perhaps use different buffering strategies for different compression
algorithms. See for example the zipfile and tarfile sources.

From arigo at tunes.org Fri Nov 15 11:48:38 2013
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 15 Nov 2013 11:48:38 +0100
Subject: [Python-Dev] unicode Exception messages in py2.7
In-Reply-To: <20131115054207.GO2085@ando>
References: <20131115030855.GN2085@ando> <20131115032848.GA39820@cskk.homeip.net> <20131115054207.GO2085@ando>
Message-ID: 

Hi,

FWIW, the pure Python traceback.py module has a slightly different
(and saner) behavior:

>>> e = Exception(u"xx\u1234yy")
>>> traceback.print_exception(Exception, e, None)
Exception: xx\u1234yy

I'd suggest that the behavior of the two should be unified anyway.
The traceback module uses value.encode("ascii", "backslashreplace")
for any unicode object.

A bientôt,

Armin.

From ncoghlan at gmail.com Fri Nov 15 12:11:07 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 15 Nov 2013 21:11:07 +1000
Subject: [Python-Dev] [Python-checkins] cpython: Close #17828: better handling of codec errors
In-Reply-To: 
References: <3dKS143jbKz7Lk8@mail.python.org> <52838D0E.5040606@livinglogic.de>
Message-ID: 

On 15 November 2013 17:22, Stefan Behnel wrote:
>
> I can't see any bit of information being added by chaining the exceptions
> in this specific case.
>
> Remember that each change to exception messages and/or exception chaining
> will break someone's doctests somewhere, and it's really ugly to work
> around chained exceptions in (cross-Py-version) doctests.
>
> I understand that this is helpful *in general*, though, i.e. for other
> kinds of exceptions in codecs, so maybe changing the exception handling in
> the doctest module could be a work-around for this kind of change?
IIRC, doctest ignores the traceback contents by default - this is just a bug where the chaining is also triggering for the initial codec lookup when it should avoid doing that. Created http://bugs.python.org/issue19609 Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri Nov 15 12:45:31 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 15 Nov 2013 21:45:31 +1000 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: <20131115113347.52f5b10c@fsol> References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> Message-ID: On 15 November 2013 20:33, Antoine Pitrou wrote: > On Fri, 15 Nov 2013 21:28:35 +1100 > Steven D'Aprano wrote: >> >> One benefit is: >> >> import codecs >> codec = get_name_of_compression_codec() >> result = codecs.encode(data, codec) > > That's a good point. > >> If encoding/decoding is intended to be completely generic (even if 99% >> of the uses will be with strings and bytes), is there any reason to >> prefer built-in functions rather than methods on object? > > Practicality beats purity. Personally, I've never used codecs on > anything else than str and bytes objects. The reason I'm now putting some effort into better documenting the status quo for codec handling in Python 3 and filing off some of the rough edges (rather than proposing adding any new APIs to Python 3.x) is because the users I care about in this matter are web developers that already make use of the binary codecs and are adopting the single-source approach to handle supporting both Python 2 and Python 3. Armin Ronacher is the one who's been most vocal about the problem, but he's definitely not alone. A new API for binary transforms is potentially an academically interesting concept, but it solves zero current real world problems. By contrast, being clear about the fact that codecs.encode and codecs.decode exist and are available as far back as Python 2.4 helps to eliminate a genuine barrier to Python 3 adoption for a subset of the community. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Fri Nov 15 12:53:52 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 15 Nov 2013 12:53:52 +0100 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> Message-ID: <20131115125352.5e1e8c0b@fsol> On Fri, 15 Nov 2013 21:45:31 +1000 Nick Coghlan wrote: > > The reason I'm now putting some effort into better documenting the > status quo for codec handling in Python 3 and filing off some of the > rough edges (rather than proposing adding any new APIs to Python 3.x) > is because the users I care about in this matter are web developers > that already make use of the binary codecs and are adopting the > single-source approach to handle supporting both Python 2 and Python > 3. Armin Ronacher is the one who's been most vocal about the problem, > but he's definitely not alone. zlib.compress(something) works on both Python 2 and Python 3, why do you need something else? Regards Antoine. 
From victor.stinner at gmail.com Fri Nov 15 13:07:02 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 15 Nov 2013 13:07:02 +0100
Subject: [Python-Dev] Add transform() and untranform() methods
In-Reply-To: 
References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol>
Message-ID: 

2013/11/15 Nick Coghlan :
> The reason I'm now putting some effort into better documenting the
> status quo for codec handling in Python 3 and filing off some of the
> rough edges (rather than proposing adding any new APIs to Python 3.x)
> is because the users I care about in this matter are web developers
> that already make use of the binary codecs and are adopting the
> single-source approach to handle supporting both Python 2 and Python
> 3. Armin Ronacher is the one who's been most vocal about the problem,
> but he's definitely not alone.

Apart from Armin Ronacher, I have never seen anyone blocked while porting
a project to Python 3 because of these bytes=>bytes and str=>str codecs.
I did a quick search on Google but I failed to find a question "how can I
write .encode("hex") or .encode("zlib") in Python 3?". It was just a quick
search, and it's likely that many developers hit this Python 3 regression,
but I'm confident that developers are able to work around the regression
themselves (e.g. by using the right Python module directly).

I have seen a lot of huge code bases ported to Python 3 without needing
these codecs. For example, Django, which is a web framework, has been
ported to Python 3; I know that Armin Ronacher also works on web things
(I don't know exactly what).

> A new API for binary transforms is potentially an academically
> interesting concept, but it solves zero current real world problems.

I would like to reply the same for these codecs: they are not solving
any real world problem :-)

Victor

From p.f.moore at gmail.com Fri Nov 15 13:24:07 2013
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 15 Nov 2013 12:24:07 +0000
Subject: [Python-Dev] Add transform() and untranform() methods
In-Reply-To: 
References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol>
Message-ID: 

On 15 November 2013 12:07, Victor Stinner wrote:
>> A new API for binary transforms is potentially an academically
>> interesting concept, but it solves zero current real world problems.
>
> I would like to reply the same for these codecs: they are not solving
> any real world problem :-)

As Nick is only documenting long-existing functions, I fail to see the
issue here.

If someone were to propose new methods, builtins, or module functions,
then I could see a reason for debate. But surely simply documenting
existing functions is not worth all this pushback?

Paul

From mal at egenix.com Fri Nov 15 13:28:05 2013
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 15 Nov 2013 13:28:05 +0100
Subject: [Python-Dev] Add transform() and untranform() methods
In-Reply-To: 
References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol>
Message-ID: <52861355.7030800@egenix.com>

On 15.11.2013 12:45, Nick Coghlan wrote:
> On 15 November 2013 20:33, Antoine Pitrou wrote:
>> On Fri, 15 Nov 2013 21:28:35 +1100
>> Steven D'Aprano wrote:
>>>
>>> One benefit is:
>>>
>>> import codecs
>>> codec = get_name_of_compression_codec()
>>> result = codecs.encode(data, codec)
>>
>> That's a good point.
>> >>> If encoding/decoding is intended to be completely generic (even if 99% >>> of the uses will be with strings and bytes), is there any reason to >>> prefer built-in functions rather than methods on object? >> >> Practicality beats purity. Personally, I've never used codecs on >> anything else than str and bytes objects. > > The reason I'm now putting some effort into better documenting the > status quo for codec handling in Python 3 and filing off some of the > rough edges (rather than proposing adding any new APIs to Python 3.x) > is because the users I care about in this matter are web developers > that already make use of the binary codecs and are adopting the > single-source approach to handle supporting both Python 2 and Python > 3. Armin Ronacher is the one who's been most vocal about the problem, > but he's definitely not alone. You can add me to that list :-). Esp. the hex codec is very handy. Google returns a few thousand hits for that codec alone. One detail that people often tend to forget is the extensibility of the codec system. It is easily possible to add new codecs to the system to e.g. perform encoding, escaping, compression or other conversion operations, so the set of codecs in the stdlib is not the complete set of codecs used in the wild - and it's not intended to be. As example: We've written codecs for customers that perform special types of XML un/escaping. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 15 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-11-19: Python Meeting Duesseldorf ... 4 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From facundobatista at gmail.com Fri Nov 15 13:58:05 2013 From: facundobatista at gmail.com (Facundo Batista) Date: Fri, 15 Nov 2013 10:58:05 -0200 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> Message-ID: On Thu, Nov 14, 2013 at 7:32 PM, Victor Stinner wrote: > I would prefer to split the registry of codecs to have 3 registries: > > - "encoding" (a better name can found): encode str=>bytes, decode bytes=>str > - bytes: encode bytes=>bytes, decode bytes=>bytes > - str: encode str=>str, decode str=>str > > And add transform() and untransform() methods to bytes and str types. > In practice, it might be same codecs registry for all codecs just with > a new attribute. I like this idea very much. But to see IIUC, let me be more explicit... you'll have (of course, always py3k-speaking): - bytes.decode() -> str ... here you can only use unicode encodings - no bytes.encode(), like today - bytes.transform() -> bytes ... here you can only use things like zlib, rot13, etc - str.encode() -> bytes ... here you can only use unicode encodings - no str.decode(), like today - str.transform() -> str ... here you can only use things like... like what? 
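One existing example for that last bucket is rot13; a quick sketch with
today's module-level API (the method spelling is exactly what's being
discussed here):

import codecs

# rot_13 is a str->str codec in Python 3, so it would fall into
# the proposed str.transform() category:
s = codecs.encode("attack at dawn", "rot_13")
print(s)                           # nggnpx ng qnja
print(codecs.decode(s, "rot_13"))  # attack at dawn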
When to use decode/encode was always a major pain point for people, so doing this extra separation and cleaning would bring more clarity to when to use what. Thanks! -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ Twitter: @facundobatista From martin at v.loewis.de Fri Nov 15 14:24:50 2013 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 15 Nov 2013 14:24:50 +0100 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: Message-ID: <528620A2.5010101@v.loewis.de> Am 15.11.13 00:57, schrieb Chris Barker: > Maybe so -- but we are either maintaining 2.7 or not -- it WIL be > around for along time yet... Procedurally, it's really easy. Ultimately it's up to the release manager to decide which changes go into a release and which don't, and Benjamin has already voiced an opinion. In addition, Guido van Rossum has voiced an opinion a while ago that he doesn't consider fixing bugs for 2.7 very useful, and would rather see maintenance focus on ongoing support for new operating systems, compiler, build environments, etc. The rationale is that people who have lived with the glitches of 2.x for so long surely have already made their work-arounds, so they aren't helped with receiving bug fixes. The same is true in your case: you indicated that you *already* work around the problem. It may have been tedious when you had to do it, but now it's done - and you might not even change your code even if Python 2.7.x gets changed, since you might want to support older 2.7.x release for some time. Regards, Martin From ncoghlan at gmail.com Fri Nov 15 14:50:23 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 15 Nov 2013 23:50:23 +1000 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> Message-ID: On 15 November 2013 22:24, Paul Moore wrote: > On 15 November 2013 12:07, Victor Stinner wrote: >>> A new API for binary transforms is potentially an academically >>> interesting concept, but it solves zero current real world problems. >> >> I would like to reply the same for these codecs: they are not solving >> any real world problem :-) > > As Nick is only documenting long-existing functions, I fail to see the > issue here. > > If someone were to propose new methods, builtins, or module functions, > then I could see a reason for debate. But surely simply documenting > existing functions is not worth all this pushback? There's a bit more to it than that (and that's why I started the other thread about the codec aliases before proceeding to the final step). 
One of the changes Victor is concerned about is that when you use an
incorrect codec in one of the Unicode-encoding-only convenience methods,
the recent exception updates explicitly push users towards using those
module level functions instead:

>>> import codecs
>>> "no good".encode("rot_13")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'rot_13' encoder returned 'str' instead of 'bytes'; use
codecs.encode() to encode to arbitrary types
>>> codecs.encode("just fine", "rot_13")
'whfg svar'
>>> b"no good".decode("quopri_codec")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'quopri_codec' decoder returned 'bytes' instead of 'str'; use
codecs.decode() to decode to arbitrary types
>>> codecs.decode(b"just fine", "quopri_codec")
b'just fine'

My perspective is that, in current Python, that *is* the right thing
for people to do, and any hypothetical new API proposed for Python 3.5
would do nothing to change what's right for Python 3.4 code (or Python
2/3 compatible code). I also find it bizarre that several of those
arguing that this is too niche a feature to be worth refining are
simultaneously in favour of a proposal to add new *methods on builtin
types* for the same niche feature.

The other part is the fact that I updated the What's New document to
highlight these tweaks:
http://docs.python.org/dev/whatsnew/3.4.html#improvements-to-handling-of-non-unicode-codecs

As noted earlier in the thread, Armin Ronacher has been the most vocal
of the users of this feature in Python 2 who lamented its absence in
Python 3 (see, for example,
http://lucumr.pocoo.org/2012/8/11/codec-confusion/), but I've also
received plenty of subsequent feedback along the lines of "what he
said!" (such as http://bugs.python.org/issue7475#msg187630).

Many of the proposed solutions from the people affected by the change
haven't been usable (since they've often been based on a misunderstanding
of why the method behaviour changed in Python 3 in the first place), but
the pain they experience is genuine, and it can unnecessarily sour their
whole experience of the transition.

I consider documenting the existing module level functions and nudging
users towards them when they try to use the affected codecs to be an
expedient way to say "yes, this is still available if you really want
to use it, but the required spelling is different".

However, the one thing I'm *not* going to do at this point is restore
the shorthand aliases, so those opposing the lowering of this barrier to
transition can take comfort in the fact they have succeeded in ensuring
that the out-of-the-box experience for users of this feature migrating
from Python 2 remains the unfriendly:

>>> b"abcdef".decode("hex")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: hex

Rather than the more useful:

>>> b"abcdef".decode("hex")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'hex' decoder returned 'bytes' instead of 'str'; use
codecs.decode() to decode to arbitrary types

Which would then lead them to the working (and still Python 2 compatible)
code:

>>> codecs.decode(b"abcdef", "hex")
b'\xab\xcd\xef'
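For single-source projects that want to hide even that spelling
difference, one small wrapper is enough; a minimal sketch (assuming only
the "hex_codec" name, which exists on Python 2.4+ and again on 3.2+):

import codecs

def unhexlify_compat(data):
    # Runs unchanged on Python 2 and Python 3, unlike the
    # data.decode("hex") spelling, which is Python 2 only.
    return codecs.decode(data, "hex_codec")

print(unhexlify_compat(b"abcdef"))  # b'\xab\xcd\xef' on Python 3

Regards,
Nick.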
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From solipsis at pitrou.net Fri Nov 15 15:04:33 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 15 Nov 2013 15:04:33 +0100
Subject: [Python-Dev] Add transform() and untranform() methods
In-Reply-To: 
References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol>
Message-ID: <20131115150433.5a54c08f@fsol>

On Fri, 15 Nov 2013 23:50:23 +1000 Nick Coghlan wrote:
>
> My perspective is that, in current Python, that *is* the right thing
> for people to do, and any hypothetical new API proposed for Python 3.5
> would do nothing to change what's right for Python 3.4 code (or Python
> 2/3 compatible code). I also find it bizarre that several of those
> arguing that this is too niche a feature to be worth refining are
> simultaneously in favour of a proposal to add new *methods on builtin
> types* for the same niche feature.

I am not claiming it is a niche feature, I am claiming codecs.encode()
and codecs.decode() don't solve the use case like the .transform() and
.untransform() methods do.

(I do think it is a nice feature in Python 2, although I find myself
using it mainly at the interpreter prompt, rather than in production
code)

> As noted earlier in the thread, Armin Ronacher has been the most vocal
> of the users of this feature in Python 2 who lamented its absence in
> Python 3 (see, for example,
> http://lucumr.pocoo.org/2012/8/11/codec-confusion/), but I've also
> received plenty of subsequent feedback along the lines of "what he
> said!" (such as http://bugs.python.org/issue7475#msg187630).

The way I read it, the positive feedback was about .transform() and
.untransform(), not about recommending people switch to codecs.encode()
and codecs.decode().

> Rather than the more useful:
>
> >>> b"abcdef".decode("hex")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: 'hex' decoder returned 'bytes' instead of 'str'; use
> codecs.decode() to decode to arbitrary types

I think this may be confusing. TypeError seems to suggest that the
parameter type sent by the user to the method is wrong, which is not
the actual cause of the error.

Regards

Antoine.

From benjamin at python.org Fri Nov 15 15:43:32 2013
From: benjamin at python.org (Benjamin Peterson)
Date: Fri, 15 Nov 2013 09:43:32 -0500
Subject: [Python-Dev] Assign(expr* targets, expr value) - why targetS?
In-Reply-To: 
References: 
Message-ID: 

2013/11/15 anatoly techtonik :
> On Tue, Nov 12, 2013 at 5:08 PM, Benjamin Peterson wrote:
>> 2013/11/12 anatoly techtonik :
>>> On Sun, Nov 10, 2013 at 8:34 AM, Benjamin Peterson wrote:
>>>> 2013/11/10 anatoly techtonik :
>>>>> http://hg.python.org/cpython/file/1ee45eb6aab9/Parser/Python.asdl
>>>>>
>>>>> In Assign(expr* targets, expr value), why the first argument is a list?
>>>>
>>>> x = y = 42
>>>
>>> Thanks.
>>>
>>> Speaking of this ASDL. `expr* targets` means that multiple entities of
>>> `expr` under the name 'targets' can be passed to Assign statement.
>>> Assign uses them as left value. But `expr` definition contains things
>>> that can not be used as left side assignment targets:
>>>
>>> expr = BoolOp(boolop op, expr* values)
>>>      | BinOp(expr left, operator op, expr right)
>>>      ...
>>>      | Str(string s) -- need to specify raw, unicode, etc?
>>>      | Bytes(bytes s)
>>>      | NameConstant(singleton value)
>>>      | Ellipsis
>>>
>>>      -- the following expression can appear in assignment context
>>>      | Attribute(expr value, identifier attr, expr_context ctx)
>>>      | Subscript(expr value, slice slice, expr_context ctx)
>>>      | Starred(expr value, expr_context ctx)
>>>      | Name(identifier id, expr_context ctx)
>>>      | List(expr* elts, expr_context ctx)
>>>      | Tuple(expr* elts, expr_context ctx)
>>>
>>> If I understand correctly, this is compiled into C struct definitions
>>> (Python-ast.c), and there is code to traverse the structure, but
>>> where is the code that validates that the structure is correct? Is it
>>> done on the first level - text file parsing, before ASDL is built? If
>>> so, then what is the role of this ASDL exactly that the first step is
>>> unable to solve?
>>
>> Only valid expression targets are allowed during AST construction. See
>> set_expr_context in ast.c.
>
> Oh my. Now there is also CST in addition to AST. This stuff -
> http://docs.python.org/devguide/ - badly needs diagrams about data
> transformation toolchain from Python source code to machine
> execution instructions. I'd like some pretty stuff, but raw blockdiag
> hack will do the job http://blockdiag.com/en/blockdiag/index.html
>
> There is no set_expr_context in my copy of CPython code, which
> seems to be some alpha of Python 3.4

It's actually called set_context.

>
>>> Is it possible to fix ASDL to move `expr` that are allowed in Assign
>>> into an `expr` subset? What effect will it achieve? I mean - will the
>>> ASDL compiler complain about wrong stuff on the left side, or will it
>>> still be the role of some other component? Which one?
>>
>> I'm not sure what you mean by an `expr` subset.
>
> Transform this:
>
> expr = BoolOp(boolop op, expr* values)
>      | BinOp(expr left, operator op, expr right)
>      ...
>      | Str(string s) -- need to specify raw, unicode, etc?
>      | Bytes(bytes s)
>      | NameConstant(singleton value)
>      | Ellipsis
>
>      -- the following expression can appear in assignment context
>      | Attribute(expr value, identifier attr, expr_context ctx)
>      | Subscript(expr value, slice slice, expr_context ctx)
>      | Starred(expr value, expr_context ctx)
>      | Name(identifier id, expr_context ctx)
>      | List(expr* elts, expr_context ctx)
>      | Tuple(expr* elts, expr_context ctx)
>
> to this:
>
> expr = BoolOp(boolop op, expr* values)
>      | BinOp(expr left, operator op, expr right)
>      ...
>      | Str(string s) -- need to specify raw, unicode, etc?
>      | Bytes(bytes s)
>      | NameConstant(singleton value)
>      | Ellipsis
>
>      -- the following expression can appear in assignment context
>      | expr_asgn
>
> expr_asgn =
>        Attribute(expr value, identifier attr, expr_context ctx)
>      | Subscript(expr value, slice slice, expr_context ctx)
>      | Starred(expr value, expr_context ctx)
>      | Name(identifier id, expr_context ctx)
>      | List(expr* elts, expr_context ctx)
>      | Tuple(expr* elts, expr_context ctx)

I doubt ASDL will let you do that.
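For reference, that check currently lives in the CST-to-AST conversion
step rather than in the ASDL layer; a quick sketch of the existing
behaviour:

import ast

# set_context() in Python/ast.c rejects invalid assignment targets
# while the parse tree is being converted into the AST:
try:
    ast.parse("1 = x")
except SyntaxError as exc:
    print(exc)  # can't assign to literal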
--
Regards,
Benjamin

From ncoghlan at gmail.com Fri Nov 15 15:46:15 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 16 Nov 2013 00:46:15 +1000
Subject: [Python-Dev] Add transform() and untranform() methods
In-Reply-To: 
References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> <20131115150433.5a54c08f@fsol>
Message-ID: 

On 16 November 2013 00:04, Antoine Pitrou wrote:
>> Rather than the more useful:
>>
>> >>> b"abcdef".decode("hex")
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> TypeError: 'hex' decoder returned 'bytes' instead of 'str'; use
>> codecs.decode() to decode to arbitrary types
>
> I think this may be confusing. TypeError seems to suggest that the
> parameter type sent by the user to the method is wrong, which is not
> the actual cause of the error.

The TypeError isn't new, only the part after the semi-colon telling
them that codecs.decode() doesn't include the typecheck (because it
isn't constrained by the text model).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From stephen at xemacs.org Fri Nov 15 16:57:00 2013
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 16 Nov 2013 00:57:00 +0900
Subject: [Python-Dev] Add transform() and untranform() methods
In-Reply-To: <051807F7-D6A0-4BC5-BE4C-7EA014B42F75@livinglogic.de>
References: <051807F7-D6A0-4BC5-BE4C-7EA014B42F75@livinglogic.de>
Message-ID: <87a9h5hd4j.fsf@uwakimon.sk.tsukuba.ac.jp>

Walter Dörwald writes:
> On 15.11.2013, at 00:42, Serhiy Storchaka wrote:
> >
> > 15.11.13 00:32, Victor Stinner wrote:
> >> And add transform() and untransform() methods to bytes and str types.
> >> In practice, it might be same codecs registry for all codecs just with
> >> a new attribute.
> >
> > If the transform() method will be added, I prefer to have only
> > one transformation method and specify a direction by the
> > transformation name ("bzip2"/"unbzip2").
>
> +1

-1

I can't support adding such methods (and that's why I ended up giving
Nick's proposal for exposing codecs.encode and codecs.decode a +1).

People think about these transformations as "en- or de-coding", not
"transforming", most of the time. Even for a transformation that is
an involution (eg, rot13), people have a very clear idea of what's
encoded and what's not, and they are going to prefer the names
"encode" and "decode" for these (generic) operations in many cases.

Eg, I don't think "s.transform(decoder)" is an improvement over
"decode(s, codec)" (but tastes vary).[1] It does mean that we need
to add a redundant method, and I don't really see an advantage to it.
The semantics seem slightly off to me, since the purpose of the
operation is to create a new object, not transform the original
in-place. (But of course str.encode and bytes.decode are precedents
for those semantics.)

Footnotes:
[1] Arguments "decoder" and "codec" are identifiers, not metavariables.
From ethan at stoneleaf.us Fri Nov 15 17:04:55 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 15 Nov 2013 08:04:55 -0800 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: Message-ID: <52864627.7010503@stoneleaf.us> On 11/14/2013 11:13 PM, Nick Coghlan wrote: > > The proposal I posted to issue 7475 back in April (and, in the absence > of any objections to the proposal, finally implemented over the past > few weeks) was to take advantage of the fact that the codecs.encode > and codecs.decode convenience functions exist (and have been covered > by the regression test suite) as far back as Python 2.4. I did this > merely by documenting the existing of the functions for Python 2.7, > 3.3 and 3.4, changing the exception messages thrown for codec output > type errors on the convenience methods to reference them, and by > updating the Python 3.4 What's New document to explain the changes. Thanks for doing this work, Nick! -- ~Ethan~ From solipsis at pitrou.net Fri Nov 15 17:34:58 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 15 Nov 2013 17:34:58 +0100 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> <20131115150433.5a54c08f@fsol> Message-ID: <20131115173458.123494af@fsol> On Sat, 16 Nov 2013 00:46:15 +1000 Nick Coghlan wrote: > On 16 November 2013 00:04, Antoine Pitrou wrote: > >> Rather than the more useful: > >> > >> >>> b"abcdef".decode("hex") > >> Traceback (most recent call last): > >> File "", line 1, in > >> TypeError: 'hex' decoder returned 'bytes' instead of 'str'; use > >> codecs.decode() to decode to arbitrary types > > > > I think this may be confusing. TypeError seems to suggest that the > > parameter type sent by the user to the method is wrong, which is not > > the actual cause of the error. > > The TypeError isn't new, Really? That's not what your message said. Regards Antoine. From status at bugs.python.org Fri Nov 15 18:07:45 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 15 Nov 2013 18:07:45 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20131115170745.6C108568C5@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-11-08 - 2013-11-15) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 4265 (+38) closed 27119 (+45) total 31384 (+83) Open issues with patches: 1979 Issues opened (74) ================== #1180: Option to ignore or substitute ~/.pydistutils.cfg http://bugs.python.org/issue1180 reopened by jason.coombs #6466: duplicate get_version() code between cygwinccompiler and emxcc http://bugs.python.org/issue6466 reopened by jason.coombs #6516: reset owner/group to root for distutils tarballs http://bugs.python.org/issue6516 reopened by jason.coombs #16286: Use hash if available to optimize a==b and a!=b for bytes and http://bugs.python.org/issue16286 reopened by haypo #17354: TypeError when running setup.py upload --show-response http://bugs.python.org/issue17354 reopened by berker.peksag #19466: Clear state of threads earlier in Python shutdown http://bugs.python.org/issue19466 reopened by haypo #19531: Loading -OO bytecode files if -O was requested can lead to pro http://bugs.python.org/issue19531 opened by Sworddragon #19532: compileall -f doesn't force to write bytecode files http://bugs.python.org/issue19532 opened by Sworddragon #19533: Unloading docstrings from memory if -OO is given http://bugs.python.org/issue19533 opened by Sworddragon #19534: normalize() in locale.py fails for sr_RS.UTF-8 at latin http://bugs.python.org/issue19534 opened by mfabian #19535: Test failures with -OO http://bugs.python.org/issue19535 opened by serhiy.storchaka #19536: MatchObject should offer __getitem__() http://bugs.python.org/issue19536 opened by brandon-rhodes #19537: Fix misalignment in fastsearch_memchr_1char http://bugs.python.org/issue19537 opened by schwab #19538: Changed function prototypes in the PEP 384 stable ABI http://bugs.python.org/issue19538 opened by theller #19539: The 'raw_unicode_escape' codec buggy + not appropriate for Pyt http://bugs.python.org/issue19539 opened by zuo #19541: ast.dump(indent=True) prettyprinting http://bugs.python.org/issue19541 opened by techtonik #19542: WeakValueDictionary bug in setdefault()&pop() http://bugs.python.org/issue19542 opened by arigo #19543: Add -3 warnings for codec convenience method changes http://bugs.python.org/issue19543 opened by ncoghlan #19544: Port distutils as found in Python 2.7 to Python 3.x. 
http://bugs.python.org/issue19544 opened by jason.coombs #19545: time.strptime exception context http://bugs.python.org/issue19545 opened by Claudiu.Popa #19546: configparser leaks implementation detail http://bugs.python.org/issue19546 opened by Claudiu.Popa #19547: HTTPS proxy support missing without warning http://bugs.python.org/issue19547 opened by 02strich #19548: 'codecs' module docs improvements http://bugs.python.org/issue19548 opened by zuo #19549: PKG-INFO is created with CRLF on Windows http://bugs.python.org/issue19549 opened by techtonik #19550: PEP 453: Windows installer integration http://bugs.python.org/issue19550 opened by ncoghlan #19551: PEP 453: Mac OS X installer integration http://bugs.python.org/issue19551 opened by ncoghlan #19552: PEP 453: venv module and pyvenv integration http://bugs.python.org/issue19552 opened by ncoghlan #19553: PEP 453: "make install" and "make altinstall" integration http://bugs.python.org/issue19553 opened by ncoghlan #19554: Enable all freebsd* host platforms http://bugs.python.org/issue19554 opened by wg #19555: "SO" config var not getting set http://bugs.python.org/issue19555 opened by Marc.Abramowitz #19557: ast - docs for every node type are missing http://bugs.python.org/issue19557 opened by techtonik #19558: Provide Tcl/Tk linkage information for extension module builds http://bugs.python.org/issue19558 opened by ned.deily #19561: request to reopen Issue837046 - pyport.h redeclares gethostnam http://bugs.python.org/issue19561 opened by risto3 #19562: Added description for assert statement http://bugs.python.org/issue19562 opened by thatiparthy #19563: Changing barry's email to barry at python.org http://bugs.python.org/issue19563 opened by thatiparthy #19564: test_multiprocessing_spawn hangs http://bugs.python.org/issue19564 opened by haypo #19565: test_multiprocessing_spawn: RuntimeError and assertion error o http://bugs.python.org/issue19565 opened by haypo #19566: ERROR: test_close (test.test_asyncio.test_unix_events.FastChil http://bugs.python.org/issue19566 opened by haypo #19568: bytearray_setslice_linear() leaves the bytearray in an inconsi http://bugs.python.org/issue19568 opened by haypo #19569: Use __attribute__((deprecated)) to warn usage of deprecated fu http://bugs.python.org/issue19569 opened by haypo #19570: distutils' Command.ensure_dirname fails on Unicode http://bugs.python.org/issue19570 opened by saschpe #19572: Report more silently skipped tests as skipped http://bugs.python.org/issue19572 opened by zach.ware #19573: Fix the docstring of inspect.Parameter and the implementation http://bugs.python.org/issue19573 opened by Antony.Lee #19575: subprocess.Popen with multiple threads: Redirected stdout/stde http://bugs.python.org/issue19575 opened by Bernt.R??skar.Brenna #19576: "Non-Python created threads" documentation doesn't mention PyE http://bugs.python.org/issue19576 opened by haypo #19577: memoryview bind (the opposite of release) http://bugs.python.org/issue19577 opened by mpb #19578: del list[a:b:c] doesn't handle correctly list_resize() failure http://bugs.python.org/issue19578 opened by haypo #19581: PyUnicodeWriter: change the overallocation factor for Windows http://bugs.python.org/issue19581 opened by haypo #19582: Tkinter is not working with Py_SetPath http://bugs.python.org/issue19582 opened by Debarshi.Goswami #19584: IDLE fails - Python V2.7.6 - 64b on Win7 64b http://bugs.python.org/issue19584 opened by bob7wake #19585: Frame annotation http://bugs.python.org/issue19585 opened by doerwalter #19586: 
distutils: Remove assertEquals and assert_ deprecation warning http://bugs.python.org/issue19586 opened by Gregory.Salvan #19587: Remove empty tests in test_bytes.FixedStringTest http://bugs.python.org/issue19587 opened by zach.ware #19588: Silently skipped test in test_random http://bugs.python.org/issue19588 opened by zach.ware #19590: Use specific asserts in test_email http://bugs.python.org/issue19590 opened by serhiy.storchaka #19591: Use specific asserts in ctype tests http://bugs.python.org/issue19591 opened by serhiy.storchaka #19593: Use specific asserts in importlib tests http://bugs.python.org/issue19593 opened by serhiy.storchaka #19594: Use specific asserts in unittest tests http://bugs.python.org/issue19594 opened by serhiy.storchaka #19595: Silently skipped test in test_winsound http://bugs.python.org/issue19595 opened by zach.ware #19596: Silently skipped tests in test_importlib http://bugs.python.org/issue19596 opened by zach.ware #19597: Add decorator for tests not yet implemented http://bugs.python.org/issue19597 opened by zach.ware #19599: Failure of test_async_timeout() of test_multiprocessing_spawn: http://bugs.python.org/issue19599 opened by haypo #19600: Use specific asserts in distutils tests http://bugs.python.org/issue19600 opened by serhiy.storchaka #19601: Use specific asserts in sqlite3 tests http://bugs.python.org/issue19601 opened by serhiy.storchaka #19602: Use specific asserts in tkinter tests http://bugs.python.org/issue19602 opened by serhiy.storchaka #19603: Use specific asserts in test_decr http://bugs.python.org/issue19603 opened by serhiy.storchaka #19604: Use specific asserts in array tests http://bugs.python.org/issue19604 opened by serhiy.storchaka #19605: Use specific asserts in datetime tests http://bugs.python.org/issue19605 opened by serhiy.storchaka #19606: Use specific asserts in http.cookiejar tests http://bugs.python.org/issue19606 opened by serhiy.storchaka #19607: Use specific asserts in weakref tests http://bugs.python.org/issue19607 opened by serhiy.storchaka #19608: devguide needs pictures http://bugs.python.org/issue19608 opened by techtonik #19610: TypeError in distutils.command.upload http://bugs.python.org/issue19610 opened by berker.peksag #19611: inspect.getcallargs doesn't properly interpret set comprehensi http://bugs.python.org/issue19611 opened by nedbat #19612: test_subprocess: sporadic failure of test_communicate_epipe() http://bugs.python.org/issue19612 opened by haypo Most recent 15 issues with no replies (15) ========================================== #19612: test_subprocess: sporadic failure of test_communicate_epipe() http://bugs.python.org/issue19612 #19607: Use specific asserts in weakref tests http://bugs.python.org/issue19607 #19606: Use specific asserts in http.cookiejar tests http://bugs.python.org/issue19606 #19605: Use specific asserts in datetime tests http://bugs.python.org/issue19605 #19604: Use specific asserts in array tests http://bugs.python.org/issue19604 #19603: Use specific asserts in test_decr http://bugs.python.org/issue19603 #19602: Use specific asserts in tkinter tests http://bugs.python.org/issue19602 #19601: Use specific asserts in sqlite3 tests http://bugs.python.org/issue19601 #19600: Use specific asserts in distutils tests http://bugs.python.org/issue19600 #19599: Failure of test_async_timeout() of test_multiprocessing_spawn: http://bugs.python.org/issue19599 #19594: Use specific asserts in unittest tests http://bugs.python.org/issue19594 #19591: Use specific asserts in ctype tests 
http://bugs.python.org/issue19591 #19584: IDLE fails - Python V2.7.6 - 64b on Win7 64b http://bugs.python.org/issue19584 #19578: del list[a:b:c] doesn't handle correctly list_resize() failure http://bugs.python.org/issue19578 #19573: Fix the docstring of inspect.Parameter and the implementation http://bugs.python.org/issue19573 Most recent 15 issues waiting for review (15) ============================================= #19610: TypeError in distutils.command.upload http://bugs.python.org/issue19610 #19607: Use specific asserts in weakref tests http://bugs.python.org/issue19607 #19606: Use specific asserts in http.cookiejar tests http://bugs.python.org/issue19606 #19605: Use specific asserts in datetime tests http://bugs.python.org/issue19605 #19604: Use specific asserts in array tests http://bugs.python.org/issue19604 #19603: Use specific asserts in test_decr http://bugs.python.org/issue19603 #19602: Use specific asserts in tkinter tests http://bugs.python.org/issue19602 #19601: Use specific asserts in sqlite3 tests http://bugs.python.org/issue19601 #19600: Use specific asserts in distutils tests http://bugs.python.org/issue19600 #19597: Add decorator for tests not yet implemented http://bugs.python.org/issue19597 #19596: Silently skipped tests in test_importlib http://bugs.python.org/issue19596 #19594: Use specific asserts in unittest tests http://bugs.python.org/issue19594 #19593: Use specific asserts in importlib tests http://bugs.python.org/issue19593 #19591: Use specific asserts in ctype tests http://bugs.python.org/issue19591 #19590: Use specific asserts in test_email http://bugs.python.org/issue19590 Top 10 most discussed issues (10) ================================= #18864: Implementation for PEP 451 (importlib.machinery.ModuleSpec) http://bugs.python.org/issue18864 25 msgs #19575: subprocess.Popen with multiple threads: Redirected stdout/stde http://bugs.python.org/issue19575 16 msgs #19566: ERROR: test_close (test.test_asyncio.test_unix_events.FastChil http://bugs.python.org/issue19566 12 msgs #5815: locale.getdefaultlocale() missing corner case http://bugs.python.org/issue5815 10 msgs #19544: Port distutils as found in Python 2.7 to Python 3.x. 
http://bugs.python.org/issue19544 9 msgs #19568: bytearray_setslice_linear() leaves the bytearray in an inconsi http://bugs.python.org/issue19568 9 msgs #18861: Problems with recursive automatic exception chaining http://bugs.python.org/issue18861 7 msgs #19565: test_multiprocessing_spawn: RuntimeError and assertion error o http://bugs.python.org/issue19565 7 msgs #19533: Unloading docstrings from memory if -OO is given http://bugs.python.org/issue19533 6 msgs #19534: normalize() in locale.py fails for sr_RS.UTF-8 at latin http://bugs.python.org/issue19534 6 msgs Issues closed (43) ================== #6683: smtplib authentication - try all mechanisms http://bugs.python.org/issue6683 closed by python-dev #9878: Avoid parsing pyconfig.h and Makefile by autogenerating extens http://bugs.python.org/issue9878 closed by barry #9922: subprocess.getstatusoutput can fail with utf8 UnicodeDecodeErr http://bugs.python.org/issue9922 closed by tim.golden #10262: Add --soabi option to `configure` http://bugs.python.org/issue10262 closed by barry #10734: test_ttk test_heading_callback fails with newer Tk 8.5 http://bugs.python.org/issue10734 closed by serhiy.storchaka #11473: upload command no longer accepts repository by section name http://bugs.python.org/issue11473 closed by jason.coombs #11513: chained exception/incorrect exception from tarfile.open on a n http://bugs.python.org/issue11513 closed by barry #11677: make test has horrendous performance on an ecryptfs http://bugs.python.org/issue11677 closed by barry #12828: xml.dom.minicompat is not documented http://bugs.python.org/issue12828 closed by python-dev #13674: crash in datetime.strftime http://bugs.python.org/issue13674 closed by tim.golden #15440: multiprocess fails to re-raise exception which has mandatory a http://bugs.python.org/issue15440 closed by sbt #16685: audioop functions shouldn't accept strings http://bugs.python.org/issue16685 closed by serhiy.storchaka #16803: Make test_importlib run tests under both _frozen_importlib and http://bugs.python.org/issue16803 closed by brett.cannon #17823: 2to3 fixers for missing codecs http://bugs.python.org/issue17823 closed by ncoghlan #17828: More informative error handling when encoding and decoding http://bugs.python.org/issue17828 closed by python-dev #17874: ProcessPoolExecutor in interactive shell doesn't work in Windo http://bugs.python.org/issue17874 closed by sbt #18923: Use the new selectors module in the subprocess module http://bugs.python.org/issue18923 closed by neologix #19227: test_multiprocessing_xxx hangs under Gentoo buildbots http://bugs.python.org/issue19227 closed by benjamin.peterson #19249: Enumeration.__eq__ http://bugs.python.org/issue19249 closed by ethan.furman #19261: Add support for 24-bit in the sunau module http://bugs.python.org/issue19261 closed by serhiy.storchaka #19406: PEP 453: add the ensurepip module http://bugs.python.org/issue19406 closed by python-dev #19429: OSError constructor does not handle errors correctly http://bugs.python.org/issue19429 closed by haypo #19440: Clean up test_capi http://bugs.python.org/issue19440 closed by zach.ware #19490: Problem installing matplotlib 1.3.1 with Python 2.7.6rc1 and 3 http://bugs.python.org/issue19490 closed by ned.deily #19499: "import this" is cached in sys.modules http://bugs.python.org/issue19499 closed by barry #19515: Share duplicated _Py_IDENTIFIER identifiers in C code http://bugs.python.org/issue19515 closed by haypo #19530: cross thread shutdown of UDP socket exhibits unexpected behavi 
http://bugs.python.org/issue19530 closed by neologix #19540: PEP339: Fix link to Zephyr ASDL paper http://bugs.python.org/issue19540 closed by benjamin.peterson #19556: A missing link to Python-2.7.6.tar.bz2 in Download page. http://bugs.python.org/issue19556 closed by ned.deily #19559: Interactive interpreter crashes after any two commands http://bugs.python.org/issue19559 closed by ezio.melotti #19560: PEP 8 operator precedence across parens http://bugs.python.org/issue19560 closed by ezio.melotti #19567: mimetypes.init() raise unhandled excption in windows http://bugs.python.org/issue19567 closed by r.david.murray #19571: urlparse.parse_qs with empty string http://bugs.python.org/issue19571 closed by r.david.murray #19574: Negative integer division error http://bugs.python.org/issue19574 closed by r.david.murray #19579: test_asyncio: test__run_once timings should be relaxed http://bugs.python.org/issue19579 closed by haypo #19580: Got resource warning when running test_base_events (test_async http://bugs.python.org/issue19580 closed by haypo #19583: time.strftime fails to use %:z time formatter of the underlyin http://bugs.python.org/issue19583 closed by deronnax #19589: Use specific asserts in test_asyncio http://bugs.python.org/issue19589 closed by serhiy.storchaka #19592: Use specific asserts in lib2to3 tests http://bugs.python.org/issue19592 closed by serhiy.storchaka #19598: Timeout in test_popen() of test_asyncio.test_windows_utils.Pop http://bugs.python.org/issue19598 closed by haypo #19609: Codec exception chaining shouldn't cover the initial codec loo http://bugs.python.org/issue19609 closed by ncoghlan #1097797: Encoding for Code Page 273 used by EBCDIC Germany Austria http://bugs.python.org/issue1097797 closed by akuchling #1109602: Need some setup.py sanity http://bugs.python.org/issue1109602 closed by ned.deily From trent at snakebite.org Fri Nov 15 17:56:30 2013 From: trent at snakebite.org (Trent Nelson) Date: Fri, 15 Nov 2013 11:56:30 -0500 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: References: Message-ID: <20131115165629.GA1060@snakebite.org> On Tue, Nov 12, 2013 at 01:16:55PM -0800, Victor Stinner wrote: > pysandbox cannot be used in practice > ==================================== > > To protect the untrusted namespace, pysandbox installs a lot of > different protections. Because of all these protections, it becomes > hard to write Python code. Basic features like "del dict[key]" are > denied. Passing an object to a sandbox is not possible to sandbox, > pysandbox is unable to proxify arbitary objects. > > For something more complex than evaluating "1+(2*3)", pysandbox cannot > be used in practice, because of all these protections. Individual > protections cannot be disabled, all protections are required to get a > secure sandbox. This sounds a lot like the work I initially did with PyParallel to try and intercept/prevent parallel threads mutating main-thread objects. I ended up arriving at a much better solution by just relying on memory protection; main thread pages are set read-only prior to parallel threads being able to run. If a parallel thread attempts to mutate a main thread object; a SEH is raised (SIGSEV on POSIX), which I catch in the ceval loop and convert into an exception. See slide 138 of this: https://speakerdeck.com/trent/pyparallel-how-we-removed-the-gil-and-exploited-all-cores-1 I'm wondering if this sort of an approach (which worked surprisingly well) could be leveraged to also provide a sandbox environment? 
The goals are the same: robust protection against mutation of memory
allocated outside of the sandbox. (I'm purely talking about memory
mutation; haven't thought about how that could be extended to prevent
file system interaction as well.)

    Trent.

From arigo at tunes.org Fri Nov 15 18:21:31 2013
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 15 Nov 2013 18:21:31 +0100
Subject: [Python-Dev] unicode Exception messages in py2.7
In-Reply-To: 
References: <20131115030855.GN2085@ando> <20131115032848.GA39820@cskk.homeip.net> <20131115054207.GO2085@ando>
Message-ID: 

Hi again,

I figured that even using the traceback.py module and getting
"Exception: \u1234\u1235\u5321" is rather useless if you tried to
raise an exception with a message in Thai. I believe this to also be
a bug, so I opened https://bugs.pypy.org/issue1634 . According to
this thread, however, python-dev is against it, so I didn't bother
adding a CPython bug.

A bientôt,

Armin.

From victor.stinner at gmail.com Fri Nov 15 18:34:20 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Fri, 15 Nov 2013 18:34:20 +0100
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To: <20131115165629.GA1060@snakebite.org>
References: <20131115165629.GA1060@snakebite.org>
Message-ID: 

2013/11/15 Trent Nelson :
> This sounds a lot like the work I initially did with PyParallel to
> try and intercept/prevent parallel threads mutating main-thread
> objects.
>
> I ended up arriving at a much better solution by just relying on
> memory protection; main thread pages are set read-only prior to
> parallel threads being able to run. If a parallel thread attempts
> to mutate a main thread object; a SEH is raised (SIGSEV on POSIX),
> which I catch in the ceval loop and convert into an exception.

Read-only is not enough; an attacker must not be able to read sensitive
data. Protection of memory pages sounds very low-level, so not very
portable :-/

How do you know if SIGSEGV comes from a legal call (parallel thread
thing) or a real bug?

Victor

From chris.barker at noaa.gov Fri Nov 15 18:41:58 2013
From: chris.barker at noaa.gov (Chris Barker)
Date: Fri, 15 Nov 2013 09:41:58 -0800
Subject: [Python-Dev] unicode Exception messages in py2.7
In-Reply-To: 
References: <20131115030855.GN2085@ando> <20131115032848.GA39820@cskk.homeip.net> <20131115054207.GO2085@ando>
Message-ID: 

On Fri, Nov 15, 2013 at 9:21 AM, Armin Rigo wrote:

> I figured that even using the traceback.py module and getting
> "Exception: \u1234\u1235\u5321" is rather useless if you tried to
> raise an exception with a message in Thai.

yup.

> I believe this to also be
> a bug, so I opened https://bugs.pypy.org/issue1634 . According to
> this thread, however, python-dev is against it, so I didn't bother
> adding a CPython bug.

According to that bug report, it looks like CPython doesn't completely
handle unicode Exception messages even in py3? Is that really the case?

And from this thread, I'd say that it's unlikely anyone wants to change
this in py2, but I don't know that making py3 better in this regard is
off the table.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From trent at snakebite.org Fri Nov 15 18:51:00 2013
From: trent at snakebite.org (Trent Nelson)
Date: Fri, 15 Nov 2013 09:51:00 -0800
Subject: [Python-Dev] The pysandbox project is broken
In-Reply-To: 
References: <20131115165629.GA1060@snakebite.org>
Message-ID: <47FA6E04-AF61-4510-9667-59B23341DD76@snakebite.org>

On Nov 15, 2013, at 12:34 PM, Victor Stinner wrote:

> 2013/11/15 Trent Nelson :
>> This sounds a lot like the work I initially did with PyParallel to
>> try and intercept/prevent parallel threads mutating main-thread
>> objects.
>>
>> I ended up arriving at a much better solution by just relying on
>> memory protection; main thread pages are set read-only prior to
>> parallel threads being able to run. If a parallel thread attempts
>> to mutate a main thread object; a SEH is raised (SIGSEV on POSIX),
>> which I catch in the ceval loop and convert into an exception.
>
> Read-only is not enough; an attacker must not be able to read sensitive
> data.

Well you could remove both write *and* read perms from pages, such that
you would trap on read attempts too.

What's an example of sensitive data that you'd need to have residing in
the same process that you also want to sandbox? I was going to suggest
something like:

with memory.protected:
    htpasswd = open('htpasswd', 'r').read()
    ...

But then I couldn't think of why you'd persist the sensitive data past
the point you'd need it.

> Protection of memory pages sounds very low-level, so not very portable :-/

It's a pretty fundamental provision provided by operating systems;
granted, the interface differs (mprotect() versus VirtualProtect()),
but the result is the same.

> How do you know if SIGSEGV comes from a legal call (parallel thread
> thing) or a real bug?

You don't, but it doesn't really matter. It'll be pretty obvious from
looking at the offending line of code in the exception whether it was a
legitimate memory protection error, or a bug in an extension
module/CPython internals. And having a ProtectionError bubble all the
way back up to the top of the stack with exact details about the
offending frame/line could be considered a nicer alternative to dumping
core ;-) (Unless you happen to be in an `except: pass` block.)

> Victor

    Trent.

From brett at python.org Fri Nov 15 19:03:06 2013
From: brett at python.org (Brett Cannon)
Date: Fri, 15 Nov 2013 13:03:06 -0500
Subject: [Python-Dev] unicode Exception messages in py2.7
In-Reply-To: 
References: <20131115030855.GN2085@ando> <20131115032848.GA39820@cskk.homeip.net> <20131115054207.GO2085@ando>
Message-ID: 

On Fri, Nov 15, 2013 at 12:41 PM, Chris Barker wrote:

> On Fri, Nov 15, 2013 at 9:21 AM, Armin Rigo wrote:
>
> > I figured that even using the traceback.py module and getting
> > "Exception: \u1234\u1235\u5321" is rather useless if you tried to
> > raise an exception with a message in Thai.
>
> yup.
>
> > I believe this to also be
> > a bug, so I opened https://bugs.pypy.org/issue1634 . According to
> > this thread, however, python-dev is against it, so I didn't bother
> > adding a CPython bug.
>
> According to that bug report, it looks like CPython doesn't completely
> handle unicode Exception messages even in py3? Is that really the case?
> And from this thread, I'd say that it's unlikely anyone wants to change
> this in py2, but I don't know that making py3 better in this regard is
> off the table.

Making changes and improvements to Python 3 is totally an option.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From chris.barker at noaa.gov Fri Nov 15 19:02:07 2013
From: chris.barker at noaa.gov (Chris Barker)
Date: Fri, 15 Nov 2013 10:02:07 -0800
Subject: [Python-Dev] unicode Exception messages in py2.7
In-Reply-To: <528620A2.5010101@v.loewis.de>
References: <528620A2.5010101@v.loewis.de>
Message-ID: 

On Fri, Nov 15, 2013 at 5:24 AM, "Martin v. Löwis" wrote:

> Procedurally, it's really easy. Ultimately it's up to the release
> manager to decide which changes go into a release and which don't, and
> Benjamin has already voiced an opinion.

Very early in the conversation, though honestly, probably nothing
compelling has been brought up...

> In addition, Guido van Rossum has voiced an opinion a while ago that
> he doesn't consider fixing bugs for 2.7 very useful, and would rather
> see maintenance focus on ongoing support for new operating systems,
> compiler, build environments, etc. The rationale is that people who
> have lived with the glitches of 2.x for so long surely have already
> made their work-arounds, so they aren't helped with receiving bug fixes.

If that's the policy, then that's the policy, but ...

> The same is true in your case: you indicated that you *already* work
> around the problem.

only in one script, and I just looked, and I missed a number of locations
in that script. I've been running that script for years with very few
changes, but last week, someone gave me a utf-8 data file -- it was really
easy to read the file as utf-8 (change one line of code), and bingo!
everything else just worked. Then I hit an exception, and banged my head
against the wall for a while -- though I guess this is what we always deal
with anywhere we introduce unicode to a previously-non-unicode-aware
application.

I'm still a bit dumbfounded that you can't use a unicode message in an
Exception, though, still not sure why that's required...

> It may have been tedious when you had to do it,
> but now it's done - and you might not even change your code even if
> Python 2.7.x gets changed, since you might want to support older 2.7.x
> release for some time.

In this case, no -- but really this is more about making it easier to
just dump unicode in somewhere, or, in fact, simply give people more
meaningful errors when they do that... And I have a lot of code that
ignores this problem, and I'm sure it will come up for me and others over
and over again. But yes, it clearly hasn't been a deal-breaker so far!

On Fri, Nov 15, 2013 at 2:48 AM, Armin Rigo wrote:

> FWIW, the pure Python traceback.py module has a slightly different
> (and saner) behavior:
>
>>>> e = Exception(u"xx\u1234yy")
>>>> traceback.print_exception(Exception, e, None)
> Exception: xx\u1234yy
>
> I'd suggest that the behavior of the two should be unified anyway.
> The traceback module uses value.encode("ascii", "backslashreplace")
> for any unicode object.

Nice observation -- so at least someone else agreed with me about what
the "right" thing to do is -- oh well.

On Thu, Nov 14, 2013 at 9:42 PM, Steven D'Aprano wrote:

> I'm not
> convinced that treating Unicode strings as a special case is justified.
> It's been at least four, and possibly six (back to 2.2) point releases
> with this behaviour, and until now apparently nobody has noticed.
Not true -- apparently no one has brought it up on python-dev or posted an issue, but I confirmed that I understood what was going on with a little googling, including: http://pythonhosted.org/kitchen/unicode-frustrations.html#frustration-5-exceptions That document was written 19 March 2011, and at the time the library worked with Python 2.3 and later. Anyway, back to two questions: 1) could it be improved? It seems there is some disagreement on that one. And 2) Is this a big enough deal to change 2.* ? From what Martin says, No. So we don't need to argue about (1). I sure hope py3 behavior is solid on this (sorry, no py3 to test on here...) But I can't help myself: Of all the guidelines for writing good code, the one I come back to again and again is DRY -- it drives almost all of my code structure decisions. So, in this case, now I need to think about whether to put in a kludge every single time I raise an Exception. In the script at hand, I needed to change 7 instances of raising an Exception, out of 10 total. Contrast that with one line of code changed in the Exception code. In fact, what I'll probably do is write a little wrapper that does the encoding for an arbitrary exception, and use that, something like: def my_raise(exp, msg): raise exp(unicode(msg).encode('ascii', 'replace')) But does it really make sense for me to write that and use it all over the place, as well as everyone else doing their own kludges? Oh well, I suppose the real lesson is go to Python 3.... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From walter at livinglogic.de Fri Nov 15 21:00:50 2013 From: walter at livinglogic.de (Walter Dörwald) Date: Fri, 15 Nov 2013 21:00:50 +0100 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: <87a9h5hd4j.fsf@uwakimon.sk.tsukuba.ac.jp> References: <051807F7-D6A0-4BC5-BE4C-7EA014B42F75@livinglogic.de> <87a9h5hd4j.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 15.11.2013 at 16:57, "Stephen J. Turnbull" wrote: > > Walter Dörwald writes: >>> On 15.11.2013 at 00:42, Serhiy Storchaka wrote: >>> >>> On 15.11.13 00:32, Victor Stinner wrote: >>>> And add transform() and untransform() methods to bytes and str types. >>>> In practice, it might be same codecs registry for all codecs just with >>>> a new attribute. >>> >>> If the transform() method will be added, I prefer to have only >>> one transformation method and specify a direction by the >>> transformation name ("bzip2"/"unbzip2"). >> >> +1 > > -1 > > I can't support adding such methods (and that's why I ended up giving > Nick's proposal for exposing codecs.encode and codecs.decode a +1). My +1 was only for having the transformation be one-way under the condition that it is added at all. > People think about these transformations as "en- or de-coding", not > "transforming", most of the time. Even for a transformation that is > an involution (eg, rot13), people have a very clear idea of what's > encoded and what's not, and they are going to prefer the names > "encode" and "decode" for these (generic) operations in many cases. > > Eg, I don't think "s.transform(decoder)" is an improvement over > "decode(s, codec)" (but tastes vary).[1] It does mean that we need > to add a redundant method, and I don't really see an advantage to it. Actually my preferred method would be codec.decode(s). 
codec being the module that implements the functionality. I don't think we need to invent another function registry. > The semantics seem slightly off to me, since the purpose of the > operation is to create a new object, not transform the original > in-place. This would mean the method would have to be called transformed()? > (But of course str.encode and bytes.decode are precedents > for those semantics.) > > > Footnotes: > [1] Arguments "decoder" and "codec" are identifiers, not metavariables. Servus, Walter From tim.peters at gmail.com Fri Nov 15 22:44:56 2013 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 15 Nov 2013 15:44:56 -0600 Subject: [Python-Dev] Finding overlapping matches with re assertions: bug or feature? In-Reply-To: References: Message-ID: [Tim] >> Is that a feature? Or an accident? It's very surprising to find a >> non-empty match inside an empty match (the outermost lookahead >> assertion). [Paul Moore] > Personally, I would read "(?=(R))" as finding an empty match at a point > where R starts. There's no implication that R is in any sense "inside" > the match. > > (?=(\<\w\w\w\w\w\w)\w\w\w) finds the first 3 characters of words that > are 6 or more characters long. Once again, the lookahead extends > beyond the extent of the main match. > > It's obscure and a little bizarre, but I'd say it's intended and a > logical consequence of the definitions. After sleeping on it, I woke up a lot less surprised. You'd think that after decades of regexps, I'd be used to that by now ;-) Thanks for the response! Your points sound valid to me, and I agree. From christian at python.org Fri Nov 15 23:05:55 2013 From: christian at python.org (Christian Heimes) Date: Fri, 15 Nov 2013 23:05:55 +0100 Subject: [Python-Dev] cpython: Issue #19544 and Issue #6516: Restore support for --user and --group parameters In-Reply-To: <3dLnZZ0Vjnz7LqJ@mail.python.org> References: <3dLnZZ0Vjnz7LqJ@mail.python.org> Message-ID: <52869AC3.1040405@python.org> On 15.11.2013 19:07, jason.coombs wrote: > http://hg.python.org/cpython/rev/b9c9c4b2effe > changeset: 87119:b9c9c4b2effe > user: Andrew Kuchling > date: Fri Nov 15 13:01:52 2013 -0500 > summary: > Issue #19544 and Issue #6516: Restore support for --user and --group parameters to sdist command as found in Python 2.7 and originally slated for Python 3.2 but accidentally rolled back as part of the distutils2 rollback. Closes Issue #6516. > Your commit has broken the build: ./python -E -S -m sysconfig --generate-posix-vars Could not find platform dependent libraries Consider setting $PYTHONHOME to [:] Traceback (most recent call last): File "./setup.py", line 11, in from distutils.core import Extension, setup File "/home/heimes/dev/python/cpython/Lib/distutils/core.py", line 18, in from distutils.cmd import Command File "/home/heimes/dev/python/cpython/Lib/distutils/cmd.py", line 9, in from distutils import util, dir_util, file_util, archive_util, dep_util File "/home/heimes/dev/python/cpython/Lib/distutils/archive_util.py", line 27, in from grp import getgrnam ImportError: No module named 'grp' The grp module is built later by setup.py. From tismer at stackless.com Fri Nov 15 23:24:21 2013 From: tismer at stackless.com (Christian Tismer) Date: Fri, 15 Nov 2013 23:24:21 +0100 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: References: Message-ID: <52869F15.7050701@stackless.com> On 13/11/13 00:49, Josiah Carlson wrote: > > Python-dev is for the development of the Python core language, the > CPython runtime, and libraries. 
Your sandbox, despite using and > requiring deep knowledge of the runtime, is not developing those > things. If you had a series of requests for the language or runtime > that would make your job easier, then your thread would be on-topic. > I think you should consider re-defining your perception of the purpose of the python-dev list. Simple feature requests are not everything. Instead, this list also touches the general direction where Python should go, and discusses the current hard-to-solve problems. The sand-boxing feature via rexec, bastion etc. was perceived as a useful, quite safe thing, until it was proven to be completely broken (Samuele Pedroni et al., 2003 I think). After that, CPython simply removed those features and failed completely to provide a better solution. I appreciate very much that Victor tried his best to fill that old gap. And after that breakage happened again, I think it is urgent to have an in-depth discussion how that situation should be treated in the future. -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From greg.ewing at canterbury.ac.nz Sat Nov 16 00:58:56 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 16 Nov 2013 12:58:56 +1300 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: <20131115030855.GN2085@ando> <20131115032848.GA39820@cskk.homeip.net> <20131115054207.GO2085@ando> Message-ID: <5286B540.70804@canterbury.ac.nz> Armin Rigo wrote: > I figured that even using the traceback.py module and getting > "Exception: \u1234\u1235\u5321" is rather useless if you tried to > raise an exception with a message in Thai. But at least it tells you that *something* went wrong, and points to the place in the code where it happened. That has to be better than pretending that nothing happened at all. Also, if the escaping preserves the original byte sequence of the message, there's a chance that someone will be able to figure out what the message said. -- Greg From ncoghlan at gmail.com Sat Nov 16 01:26:02 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 16 Nov 2013 10:26:02 +1000 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: <20131115173458.123494af@fsol> References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> <20131115150433.5a54c08f@fsol> <20131115173458.123494af@fsol> Message-ID: On 16 Nov 2013 02:36, "Antoine Pitrou" wrote: > > On Sat, 16 Nov 2013 00:46:15 +1000 > Nick Coghlan wrote: > > On 16 November 2013 00:04, Antoine Pitrou wrote: > > >> Rather than the more useful: > > >> > > >> >>> b"abcdef".decode("hex") > > >> Traceback (most recent call last): > > >> File "", line 1, in > > >> TypeError: 'hex' decoder returned 'bytes' instead of 'str'; use > > >> codecs.decode() to decode to arbitrary types > > > > > > I think this may be confusing. TypeError seems to suggest that the > > > parameter type sent by the user to the method is wrong, which is not > > > the actual cause of the error. > > > > The TypeError isn't new, > > Really? That's not what your message said. 
The second example in my post included restoring the "hex" alias for "hex_codec" (its absence is the reason for the current "unknown encoding" error). The 3.2 and 3.3 error message for a restored alias would have been "TypeError: 'hex' decoder returned 'bytes' instead of 'str'", which I agree is confusing and uninformative - that's why I added the reference to the module level functions to the output type errors *before* proposing the restoration of the aliases. So you can already use "codecs.decode(s, 'hex_codec')" in Python 3, you just won't get a useful error leading you there if you use the more common 'hex' alias instead. To address Serhiy's security concerns with the compression codecs (which are technically independent of the question of restoring the aliases), I also plan to document how to systematically blacklist particular codecs in an application by setting attributes on the encodings module and/or appropriate entries in sys.modules. Finally, I now plan to write a documentation PEP that suggests clearly splitting the codecs module docs into two layers: the type agnostic core infrastructure and the specific application of that infrastructure to the implementation of the text encoding model. The only functional *change* I'd still like to make for 3.4 is to restore the shorthand aliases for the non-Unicode codecs (to ease the migration for folks coming from Python 2), but this thread has convinced me I likely need to write the PEP *before* doing that, and I still have to integrate ensurepip into pyvenv before the beta 1 deadline. So unless you and Victor are prepared to +1 the restoration of the codec aliases (closing issue 7475) in anticipation of that codecs infrastructure documentation PEP, the change to restore the aliases probably won't be in 3.4. (I *might* get the PEP written in time regardless, but I'm not betting on it at this point). Cheers, Nick. > > Regards > > Antoine. From ncoghlan at gmail.com Sat Nov 16 01:31:05 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 16 Nov 2013 10:31:05 +1000 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: <52869F15.7050701@stackless.com> References: <52869F15.7050701@stackless.com> Message-ID: On 16 Nov 2013 08:25, "Christian Tismer" wrote: > > On 13/11/13 00:49, Josiah Carlson wrote: >> >> >> Python-dev is for the development of the Python core language, the CPython runtime, and libraries. Your sandbox, despite using and requiring deep knowledge of the runtime, is not developing those things. If you had a series of requests for the language or runtime that would make your job easier, then your thread would be on-topic. >> > > I think you should consider re-defining your perception of the purpose > of the python-dev list. Simple feature requests are not everything. > Instead, this list also touches the general direction where Python should > go, and discusses the current hard-to-solve problems. > > The sand-boxing feature via rexec, bastion etc. was perceived as a useful, quite > safe thing, until it was proven to be completely broken (Samuele Pedroni et al., 2003 > I think). After that, CPython simply removed those features and failed completely to > provide a better solution. 
"Use an OS level sandbox" *is* better from a security point of view. It's just not portable :P Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sat Nov 16 01:35:20 2013 From: guido at python.org (Guido van Rossum) Date: Fri, 15 Nov 2013 16:35:20 -0800 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: References: <52869F15.7050701@stackless.com> Message-ID: On Fri, Nov 15, 2013 at 4:31 PM, Nick Coghlan wrote: > "Use an OS level sandbox" *is* better from a security point of view. It's > just not portable :P > Honestly, I don't believe in portable security. :-) BTW, in case it wasn't clear, I think it was a courageous step by Victor to declare defeat. Negative results are also results, and they need to be published. Thanks Victor! -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Sat Nov 16 01:47:38 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Sat, 16 Nov 2013 01:47:38 +0100 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> <20131115150433.5a54c08f@fsol> <20131115173458.123494af@fsol> Message-ID: 2013/11/16 Nick Coghlan : > To address Serhiy's security concerns with the compression codecs (which are > technically independent of the question of restoring the aliases), I also > plan to document how to systematically blacklist particular codecs in an > application by setting attributes on the encodings module and/or appropriate > entries in sys.modules. I would be simpler and safer to blacklist bytes=>bytes and str=>str codecs from bytes.decode() and str.encode() directly. Marc Andre Lemburg proposed to add new attributes in CodecInfo to specify input and output types. > The only functional *change* I'd still like to make for 3.4 is to restore > the shorthand aliases for the non-Unicode codecs (to ease the migration for > folks coming from Python 2), but this thread has convinced me I likely need > to write the PEP *before* doing that, and I still have to integrate > ensurepip into pyvenv before the beta 1 deadline. > > So unless you and Victor are prepared to +1 the restoration of the codec > aliases (closing issue 7475) in anticipation of that codecs infrastructure > documentation PEP, the change to restore the aliases probably won't be in > 3.4. (I *might* get the PEP written in time regardless, but I'm not betting > on it at this point). Using StackOverflow search engine, I found some posts where people asks for "hex" codec on Python 3. There are two answers: use binascii module or use codecs.encode(). So even if codecs.encode() was never documented, it looks like it is used. So I now agree that documenting it would not make the situation worse. Adding transform()/untransform() method to bytes and str is a non trivial change and not everybody likes them. Anyway, it's too late for Python 3.4. In my opinion, the best option is to add new input_type/output_type attributes to CodecInfo right now, and modify the codecs so "abc".encode("hex") raises a LookupError (instead of tricky error message with some evil low-level hacks on the traceback and the exception, which is my initial concern in this mail thread). It fixes also the security vulnerability. 
To keep backward compatibility (even with custom codecs registered manually), if input_type/output_type is not defined, we should consider that the codec is a classical text encoding (encode str=>bytes, decode bytes=>str). The type of codecs.encode() result is my least concern in this topic. I created the following issue to implement my idea: http://bugs.python.org/issue19619 Victor From tismer at stackless.com Sat Nov 16 02:35:53 2013 From: tismer at stackless.com (Christian Tismer) Date: Sat, 16 Nov 2013 02:35:53 +0100 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: References: <52869F15.7050701@stackless.com> Message-ID: <5286CBF9.1080809@stackless.com> On 16.11.13 01:35, Guido van Rossum wrote: > On Fri, Nov 15, 2013 at 4:31 PM, Nick Coghlan > wrote: > > "Use an OS level sandbox" *is* better from a security point of > view. It's just not portable :P > > > Honestly, I don't believe in portable security. :-) > > BTW, in case it wasn't clear, I think it was a courageous step by > Victor to declare defeat. Negative results are also results, and they > need to be published. Thanks Victor! Sure it was, and it was great to follow Victor's project! I was about to use it in production, until I saw its flaws, a while back. Nevertheless, the issue has never been treated as much as to be able to say "this way you implement that security in Python", whatever "that" should be. So I think it is worth discussing, if only to identify the levels of security involved, to help people identify their individual needs. My question is, actually: Do we need to address this topic, or is it already crystal clear that something like PyPy's approach is necessary and sufficient to solve the common, undefined problem of "run some script on whatnot, with the following security constraint"? IOW: Do we really need a full abstraction, embedded in a virtual OS, or is there already a compromise that suits 98 percent of the common needs? I think as a starter, categorizing the expectations of some measure of 'secure python' would make sense. And I'm asking the people with better knowledge of these matters than I have. (and not asking those who don't... ;-) ) cheers -- Chris -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tjreedy at udel.edu Sat Nov 16 02:51:30 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 15 Nov 2013 20:51:30 -0500 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) Message-ID: http://bugs.python.org/issue19562 proposes to change the first assert in Lib/datetime.py assert 1 <= month <= 12, month to assert 1 <= month <= 12, 'month must be in 1..12' to match the next two asserts out of the *53* in the file. I think that is the wrong direction of change, but that is not my question here. Should stdlib code use assert at all? If user input can trigger an assert, then the code should raise a normal exception that will not disappear with -OO. If the assert is testing program logic, then it seems that the test belongs in the test file, in this case, test/test_datetime.py. For example, consider the following (backwards) code. 
_DI4Y = _days_before_year(5) # A 4-year cycle has an extra leap day over what we'd get from pasting # together 4 single years. assert _DI4Y == 4 * 365 + 1 To me, the constant should be directly set to its known value. _DI4Y = 4*365 + 1. The function should then be tested in test_datetime. self.assertEqual(dt._days_before_year(5), dt._DI4Y) Is there any policy on use of assert in stdlib production code? -- Terry Jan Reedy From tim.peters at gmail.com Sat Nov 16 03:10:33 2013 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 15 Nov 2013 20:10:33 -0600 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: References: Message-ID: [Terry Reedy] > Should stdlib code use assert at all? Of course, and for exactly the same reasons we use `assert()` in Python's C code: to verify preconditions, postconditions, and invariants that should never fail. Assertions should never be used to, e.g., verify user-supplied input (or anything else we believe _may_ fail). > If user input can trigger an assert, then the code should raise a normal > exception that will not disappear with -OO. Agreed. > If the assert is testing program logic, then it seems that the test belongs > in the test file, in this case, test/test_datetime.py. For example, consider > the following (backwards) code. > _DI4Y = _days_before_year(5) > # A 4-year cycle has an extra leap day over what we'd get from pasting > # together 4 single years. > assert _DI4Y == 4 * 365 + 1 > > To me, the constant should be directly set to its known value. > _DI4Y = 4*365 + 1. > The function should then be tested in test_datetime. > self.assertEqual(dt._days_before_year(5), dt._DI4Y) I think making that change would be pointless code churn. Harmful, even. As the guy who happened to have written that code ;-), I think it's valuable to have the _code_ (not off buried in some monstrously tedious test file) explain what the comments there do explain, and verify with the assert. If anyone needs to muck with the implementation of datetime, it's crucial they understand what DI4Y _means_, and that it's identical to _days_before_year(5). Its actual value (4*365 + 1) isn't really interesting. Defining _DI4Y _as_ _days_before_year(5) captures its _meaning_. Ain't broke - don't fix. From dreamingforward at gmail.com Sat Nov 16 04:32:17 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Fri, 15 Nov 2013 19:32:17 -0800 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: References: Message-ID: > Should stdlib code use assert at all? > > If user input can trigger an assert, then the code should raise a normal > exception that will not disappear with -OO. > > If the assert is testing program logic, then it seems that the test belongs > in the test file, in this case, test/test_datetime.py. For example, consider > the following (backwards) code. > > Is there any policy on use of assert in stdlib production code? It is my assertion that assert should only be used where a system-level problem would occur, where you cannot trap an error condition. 
-- MarkJ Tacoma, Washington From ethan at stoneleaf.us Sat Nov 16 05:07:15 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 15 Nov 2013 20:07:15 -0800 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: <52869F15.7050701@stackless.com> References: <52869F15.7050701@stackless.com> Message-ID: <5286EF73.7070400@stoneleaf.us> On 11/15/2013 02:24 PM, Christian Tismer wrote: > > I appreciate very much that Victor tried his best to fill that old gap. > And after that breakage happened again, I think it is urgent to have an > in-depth discussion how that situation should be treated in the > future. +1 -- ~Ethan~ From larry at hastings.org Sat Nov 16 10:50:47 2013 From: larry at hastings.org (Larry Hastings) Date: Sat, 16 Nov 2013 01:50:47 -0800 Subject: [Python-Dev] Reminder: Python 3.4 feature freeze in a week Message-ID: <52873FF7.70907@hastings.org> We're on schedule to tag Python 3.4 Beta 1 next Saturday. And when that happens we go into "feature freeze" on Python trunk; no new features will be accepted in trunk until we branch the 3.4 release branch next February. Time to get those checkins in, folks. Last call, everyone, //arry/ From ncoghlan at gmail.com Sat Nov 16 10:44:51 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 16 Nov 2013 19:44:51 +1000 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> <20131115150433.5a54c08f@fsol> <20131115173458.123494af@fsol> Message-ID: On 16 Nov 2013 10:47, "Victor Stinner" wrote: > > 2013/11/16 Nick Coghlan : > > To address Serhiy's security concerns with the compression codecs (which are > > technically independent of the question of restoring the aliases), I also > > plan to document how to systematically blacklist particular codecs in an > > application by setting attributes on the encodings module and/or appropriate > > entries in sys.modules. > > It would be simpler and safer to blacklist bytes=>bytes and str=>str > codecs from bytes.decode() and str.encode() directly. Marc Andre > Lemburg proposed to add new attributes in CodecInfo to specify input > and output types. Yes, but that type compatibility introspection is a change for 3.5 at the earliest (although I commented on http://bugs.python.org/issue19619 with two alternate suggestions that I think would be reasonable to implement for 3.4). Everything codec related that I am doing at the moment is about improving the state of 3.4 and source compatible 2/3 code. Proposals for further 3.5+ only improvements are relevant only in the sense that I don't want to lock us out from future improvements (which is why my main aim is to clarify the status quo, with the only functional changes related to restoring feature parity with Python 2 for non-Unicode codecs). > > The only functional *change* I'd still like to make for 3.4 is to restore > > the shorthand aliases for the non-Unicode codecs (to ease the migration for > > folks coming from Python 2), but this thread has convinced me I likely need > > to write the PEP *before* doing that, and I still have to integrate > > ensurepip into pyvenv before the beta 1 deadline. > > > > So unless you and Victor are prepared to +1 the restoration of the codec > > aliases (closing issue 7475) in anticipation of that codecs infrastructure > > documentation PEP, the change to restore the aliases probably won't be in > > 3.4. 
(I *might* get the PEP written in time regardless, but I'm not betting > > on it at this point). > > Using StackOverflow search engine, I found some posts where people > asks for "hex" codec on Python 3. There are two answers: use binascii > module or use codecs.encode(). So even if codecs.encode() was never > documented, it looks like it is used. So I now agree that documenting > it would not make the situation worse. Aye, that was my conclusion (hence my proposal on issue 7475 back in April). Can I take that observation as a +1 for restoring the aliases as well? (That and more efficiently rejecting the non-Unicode codecs from str.encode, bytes.decode and bytearray.decode are the only aspects of this subject to the beta 1 deadline - we can be a bit more leisurely when it comes to working out the details of the docs updates) > Adding transform()/untransform() method to bytes and str is a non > trivial change and not everybody likes them. Anyway, it's too late for > Python 3.4. > > In my opinion, the best option is to add new input_type/output_type > attributes to CodecInfo right now, and modify the codecs so > "abc".encode("hex") raises a LookupError (instead of tricky error > message with some evil low-level hacks on the traceback and the > exception, which is my initial concern in this mail thread). It fixes > also the security vulnerability. The C level code for catching the input type errors only looks evil because: - the C level equivalent of "except Exception as Y: raise X from Y" is just plain ugly in the first place - the chaining includes a *lot* of checks of the original exception to ensure that no data is lost by raising a new instance of the same exception Type and chaining - it chains ValueError, AttributeError and any other currently stateless (aside from a str description) error the codec might throw, not just input type validation errors (it deliberately doesn't chain stateful errors as doing so might be backwards incompatible with existing error handling). However, the ugliness of that code is the reason I'm intrigued by the possibility of traceback annotations as a potentially cleaner solution than trying to seamlessly wrap exceptions with a new one that adds more context information. While I think the gain in codec debuggability is worth it in this case, my concern over the complexity and the current limitations are the reason I didn't make it a public C API. > To keep backward compatibility (even with custom codecs registered > manually), if input_type/output_type is not defined, we should > consider that the codec is a classical text encoding (encode > str=>bytes, decode bytes=>str). Without an already existing ByteSequence ABC, it isn't feasible to propose and implement this completely in the 3.4 time frame (since you would need such an ABC to express the input type accepted by our Unicode and binary codecs - the only one that wouldn't need it is rot_13, since that's str->str). However, the output types could be expressed solely as concrete types, and that's all we need for the blacklist (since we could replace the current instance check on the result with a subclass check on the specified output type (if any) prior to decoding). Cheers, Nick. 
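(For readers following the C discussion above, a simplified pure-Python rendering of that chaining pattern -- the real implementation lives in CPython's C codec machinery and performs more checks than shown here:)

    import codecs

    def decode(data, encoding):
        try:
            return codecs.lookup(encoding).decode(data)[0]
        except (TypeError, ValueError, AttributeError) as err:
            # Only rewrap simple, stateless errors; chain the original
            # as __cause__, i.e. "raise X from Y".
            if len(err.args) == 1 and isinstance(err.args[0], str):
                msg = "decoding with %r codec failed (%s)" % (encoding, err)
                raise type(err)(msg) from err
            raise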
From ncoghlan at gmail.com Sat Nov 16 11:12:18 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 16 Nov 2013 20:12:18 +1000 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: <5286CBF9.1080809@stackless.com> References: <52869F15.7050701@stackless.com> <5286CBF9.1080809@stackless.com> Message-ID: On 16 Nov 2013 11:35, "Christian Tismer" wrote: > IOW: Do we really need a full abstraction, embedded in a virtual OS, or > is there already a compromise that suits 98 percent of the common needs? > > I think as a starter, categorizing the expectations of some measure of 'secure python' > would make sense. And I'm asking the people with better knowledge of these matters > than I have. (and not asking those who don't... ;-) ) The litany of vulnerability reports against the Java sandbox has long confirmed my impression that secure sandboxing is a hard, not completely solved problem, best left to better resourced platform developers (or at least taking the appropriate steps to benefit from their work). A self-hosted language runtime level sandbox is, at best, a first line of defence that protects against basic, naive attacks. One of the assumptions I see from the folks working on operating systems, virtual machine and container security is that the sandboxes *will* be compromised at some point, so you have to make sure to understand what the consequences of those breaches will be, and the best answer is "they run into the next line of defence, so the only thing they have gained is the ability to attack that". In terms of in-process sandboxing of CPython (*at all*, let alone self-hosted), we're currently missing some key foundational components: - the ability for a host process to cleanly configure the capabilities of an embedded CPython interpreter (that's what PEP 432 is all about) - elimination of all of the mechanisms by which hostile untrusted code can trigger a segfault in the runtime (any segfault bug can reasonably be assumed to be a security vulnerability waiting to be exploited, the only question is whether the CPython runtime is part of the exposed attack surface, and what the consequences are of compromising the runtime). While Victor Stinner's recent work with failmalloc has been a big step forward here, as have been various other changes in the CPython code base (like adding recursion depth constraints to the compiler toolchain), we're still a long way from being able to say "CPython cannot be segfaulted by legal Python code that doesn't use ctypes or an equivalent FFI library". This is why I share Guido's (and the PyPy team's) view that secure, cross-platform sandboxing of (C)Python is currently not possible. Secure in-process sandboxing is hard even for languages like Lua, JavaScript and Java that were designed from the ground up with sandboxing in mind - sure, you can lock things down to the point where untrusted code assuredly can't do any damage, but it often can't do anything *useful* in that state, either. By contrast, the PyPy sandbox model which uses a deliberately constrained runtime to execute untrusted code in an OS level process that is designed to only permit communication with the parent process is *exactly* the kind of paranoid defence-in-depth approach that should be employed when running untrusted code. 
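(As a toy illustration of that separate-process pattern -- POSIX-only, and nothing like a complete sandbox: rlimits alone stop none of the attacks discussed in this thread, and the -I isolated-mode flag assumed below only exists from CPython 3.4 on:)

    import resource
    import subprocess
    import sys

    def run_untrusted(source, cpu_seconds=5, mem_bytes=256 * 2**20):
        def lock_down():
            # Runs in the child between fork() and exec().
            resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
            resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
        proc = subprocess.Popen(
            [sys.executable, "-I", "-c", source],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            stderr=subprocess.PIPE, preexec_fn=lock_down)
        # The child can only talk to us over the standard streams.
        out, err = proc.communicate()
        return proc.returncode, out, err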
Ideally, all of the platform level "this child process is not allowed to do anything except talk to me over stdin and stdout" would also be brought to bear on the sandboxed runtime, so that as yet undiscovered vulnerabilities in the PyPy sandbox don't result in a system compromise. Anyone interested in sandboxing of Python code would be well-advised to direct their efforts towards the parent process bindings for http://doc.pypy.org/en/latest/sandbox.html, as well as identifying the associated platform specific settings to lock out the child process from all system access except communication with the parent process over the standard streams. Cheers, Nick. From victor.stinner at gmail.com Sat Nov 16 11:45:08 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Sat, 16 Nov 2013 11:45:08 +0100 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> <20131115150433.5a54c08f@fsol> <20131115173458.123494af@fsol> Message-ID: Why not using str type for str and str subtypes, and bytes type for bytes and bytes-like object (bytearray, memoryview)? I don't think that we need an ABC here. Victor On 16 Nov 2013 10:44, "Nick Coghlan" wrote: > On 16 Nov 2013 10:47, "Victor Stinner" wrote: > > > > 2013/11/16 Nick Coghlan : > > > To address Serhiy's security concerns with the compression codecs > (which are > > > technically independent of the question of restoring the aliases), I > also > > > plan to document how to systematically blacklist particular codecs in > an > > > application by setting attributes on the encodings module and/or > appropriate > > > entries in sys.modules. > > > > It would be simpler and safer to blacklist bytes=>bytes and str=>str > > codecs from bytes.decode() and str.encode() directly. Marc Andre > > Lemburg proposed to add new attributes in CodecInfo to specify input > > and output types. > > Yes, but that type compatibility introspection is a change for 3.5 at > the earliest (although I commented on > http://bugs.python.org/issue19619 with two alternate suggestions that > I think would be reasonable to implement for 3.4). > > Everything codec related that I am doing at the moment is about > improving the state of 3.4 and source compatible 2/3 code. Proposals > for further 3.5+ only improvements are relevant only in the sense that > I don't want to lock us out from future improvements (which is why my > main aim is to clarify the status quo, with the only functional > changes related to restoring feature parity with Python 2 for > non-Unicode codecs). > > > > The only functional *change* I'd still like to make for 3.4 is to > restore > > > the shorthand aliases for the non-Unicode codecs (to ease the > migration for > > > folks coming from Python 2), but this thread has convinced me I likely > need > > > to write the PEP *before* doing that, and I still have to integrate > > > ensurepip into pyvenv before the beta 1 deadline. > > > > > > So unless you and Victor are prepared to +1 the restoration of the > codec > > > aliases (closing issue 7475) in anticipation of that codecs > infrastructure > > > documentation PEP, the change to restore the aliases probably won't be > in > > > 3.4. (I *might* get the PEP written in time regardless, but I'm not > betting > > > on it at this point). > > > > Using StackOverflow search engine, I found some posts where people > > asks for "hex" codec on Python 3. 
There are two answers: use binascii > > module or use codecs.encode(). So even if codecs.encode() was never > > documented, it looks like it is used. So I now agree that documenting > > it would not make the situation worse. > > Aye, that was my conclusion (hence my proposal on issue 7475 back in > April). > > Can I take that observation as a +1 for restoring the aliases as well? > (That and more efficiently rejecting the non-Unicode codecs from > str.encode, bytes.decode and bytearray.decode are the only aspects of > this subject to the beta 1 deadline - we can be a bit more leisurely > when it comes to working out the details of the docs updates) > > > Adding transform()/untransform() method to bytes and str is a non > > trivial change and not everybody likes them. Anyway, it's too late for > > Python 3.4. > > > > In my opinion, the best option is to add new input_type/output_type > > attributes to CodecInfo right now, and modify the codecs so > > "abc".encode("hex") raises a LookupError (instead of tricky error > > message with some evil low-level hacks on the traceback and the > > exception, which is my initial concern in this mail thread). It fixes > > also the security vulnerability. > > The C level code for catching the input type errors only looks evil > because: > > - the C level equivalent of "except Exception as Y: raise X from Y" > is just plain ugly in the first place > - the chaining includes a *lot* of checks of the original exception to > ensure that no data is lost by raising a new instance of the same > exception Type and chaining > - it chains ValueError, AttributeError and any other currently > stateless (aside from a str description) error the codec might throw, > not just input type validation errors (it deliberately doesn't chain > stateful errors as doing so might be backwards incompatible with > existing error handling). > > However, the ugliness of that code is the reason I'm intrigued by the > possibility of traceback annotations as a potentially cleaner solution > than trying to seamlessly wrap exceptions with a new one that adds > more context information. While I think the gain in codec > debuggability is worth it in this case, my concern over the complexity > and the current limitations are the reason I didn't make it a public C > API. > > > To keep backward compatibility (even with custom codecs registered > > manually), if input_type/output_type is not defined, we should > > consider that the codec is a classical text encoding (encode > > str=>bytes, decode bytes=>str). > > Without an already existing ByteSequence ABC, it isn't feasible to > propose and implement this completely in the 3.4 time frame (since you > would need such an ABC to express the input type accepted by our > Unicode and binary codecs - the only one that wouldn't need it is > rot_13, since that's str->str). > > However, the output types could be expressed solely as concrete types, > and that's all we need for the blacklist (since we could replace the > current instance check on the result with a subclass check on the > specified output type (if any) prior to decoding). > > Cheers, > Nick. 
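(Concretely, the two answers mentioned above, with outputs as Python 3 prints them -- note both forms operate on bytes, and the bare "hex" name only resolves once the 3.4 alias restoration lands:)

    import binascii
    import codecs

    binascii.hexlify(b"hello")                 # -> b'68656c6c6f'
    binascii.unhexlify(b"68656c6c6f")          # -> b'hello'
    codecs.encode(b"hello", "hex_codec")       # -> b'68656c6c6f'
    codecs.decode(b"68656c6c6f", "hex_codec")  # -> b'hello'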
From fijall at gmail.com Sat Nov 16 11:50:48 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sat, 16 Nov 2013 12:50:48 +0200 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: References: <52869F15.7050701@stackless.com> <5286CBF9.1080809@stackless.com> Message-ID: On Sat, Nov 16, 2013 at 12:12 PM, Nick Coghlan wrote: > On 16 Nov 2013 11:35, "Christian Tismer" wrote: >> IOW: Do we really need a full abstraction, embedded in a virtual OS, or >> is there already a compromise that suits 98 percent of the common needs? >> >> I think as a starter, categorizing the expectations of some measure of 'secure python' >> would make sense. And I'm asking the people with better knowledge of these matters >> than I have. (and not asking those who don't... ;-) ) > > The litany of vulnerability reports against the Java sandbox has long > confirmed my impression that secure sandboxing is a hard, not > completely solved problem, best left to better resourced platform > developers (or at least taking the appropriate steps to benefit from > their work). > > A self-hosted language runtime level sandbox is, at best, a first line > of defence that protects against basic, naive attacks. One of the > assumptions I see from the folks working on operating systems, virtual > machine and container security is that the sandboxes *will* be > compromised at some point, so you have to make sure to understand what > the consequences of those breaches will be, and the best answer is > "they run into the next line of defence, so the only thing they have > gained is the ability to attack that". > > In terms of in-process sandboxing of CPython (*at all*, let alone > self-hosted), we're currently missing some key foundational > components: > > - the ability for a host process to cleanly configure the capabilities > of an embedded CPython interpreter (that's what PEP 432 is all about) > - elimination of all of the mechanisms by which hostile untrusted code > can trigger a segfault in the runtime (any segfault bug can reasonably > be assumed to be a security vulnerability waiting to be exploited, the > only question is whether the CPython runtime is part of the exposed > attack surface, and what the consequences are of compromising the > runtime). While Victor Stinner's recent work with failmalloc has been > a big step forward here, as have been various other changes in the > CPython code base (like adding recursion depth constraints to the > compiler toolchain), we're still a long way from being able to say > "CPython cannot be segfaulted by legal Python code that doesn't use > ctypes or an equivalent FFI library". > > This is why I share Guido's (and the PyPy team's) view that secure, > cross-platform sandboxing of (C)Python is currently not possible. > Secure in-process sandboxing is hard even for languages like Lua, > JavaScript and Java that were designed from the ground up with > sandboxing in mind - sure, you can lock things down to the point where > untrusted code assuredly can't do any damage, but it often can't do > anything *useful* in that state, either. > > By contrast, the PyPy sandbox model which uses a deliberately > constrained runtime to execute untrusted code in an OS level process > that is designed to only permit communication with the parent process > is *exactly* the kind of paranoid defence-in-depth approach that > should be employed when running untrusted code. 
Ideally, all of the > platform level "this child process is not allowed to do anything > except talk to me over stdin and stdout" would also be brought to bear > on the sandboxed runtime, so that as yet undiscovered vulnerabilities > in the PyPy sandbox don't result in a system compromise. > > Anyone interested in sandboxing of Python code would be well-advised > to direct their efforts towards the parent process bindings for > http://doc.pypy.org/en/latest/sandbox.html, as well as identifying the > associated platform specific settings to lock out the child process > from all system access except communication with the parent process > over the standard streams. Note, Nick, that the part that runs stuff in a child process (as opposed to having two different Pythons running in the same process) is really not a limitation of the approach. It's just that it's a proof of concept and various other options are also possible, just no one seems to be interested in pursuing them. Additional OS level blocking really only works against potential segfaults, since we know that there is no IO possible from the inner process. A JIT-less PyPy sandbox can be made very secure by locking the executable pages as non-writable (we know the code does not do any IO). Cheers, fijal From fijall at gmail.com Sat Nov 16 11:53:22 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sat, 16 Nov 2013 12:53:22 +0200 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: <20131115165629.GA1060@snakebite.org> References: <20131115165629.GA1060@snakebite.org> Message-ID: On Fri, Nov 15, 2013 at 6:56 PM, Trent Nelson wrote: > On Tue, Nov 12, 2013 at 01:16:55PM -0800, Victor Stinner wrote: >> pysandbox cannot be used in practice >> ==================================== >> >> To protect the untrusted namespace, pysandbox installs a lot of >> different protections. Because of all these protections, it becomes >> hard to write Python code. Basic features like "del dict[key]" are >> denied. Passing an object to a sandbox is not possible to sandbox, >> pysandbox is unable to proxify arbitary objects. >> >> For something more complex than evaluating "1+(2*3)", pysandbox cannot >> be used in practice, because of all these protections. Individual >> protections cannot be disabled, all protections are required to get a >> secure sandbox. > > This sounds a lot like the work I initially did with PyParallel to > try and intercept/prevent parallel threads mutating main-thread > objects. > > I ended up arriving at a much better solution by just relying on > memory protection; main thread pages are set read-only prior to > parallel threads being able to run. If a parallel thread attempts > to mutate a main thread object; a SEH is raised (SIGSEV on POSIX), > which I catch in the ceval loop and convert into an exception. > > See slide 138 of this: https://speakerdeck.com/trent/pyparallel-how-we-removed-the-gil-and-exploited-all-cores-1 > > I'm wondering if this sort of an approach (which worked surprisingly > well) could be leveraged to also provide a sandbox environment? The > goals are the same: robust protection against mutation of memory > allocated outside of the sandbox. > > (I'm purely talking about memory mutation; haven't thought about how > that could be extended to prevent file system interaction as well.) > > > Trent. 
Trent, you should read the mail more carefully. Notably, the same issues that make it impossible to create a sandbox make it impossible to make PyParallel really work. Being read-only is absolutely not enough - you can read some internal structures in an inconsistent state that leads to crashes and/or very unexpected behavior even without modifying anything. PS. We really did a lot of work analyzing how STM-pypy can lead to conflicts and/or inconsistent behavior. Cheers, fijal From ncoghlan at gmail.com Sat Nov 16 12:38:47 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 16 Nov 2013 21:38:47 +1000 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> <20131115150433.5a54c08f@fsol> <20131115173458.123494af@fsol> Message-ID: On 16 November 2013 20:45, Victor Stinner wrote: > Why not using str type for str and str subtypes, and bytes type for bytes > and bytes-like object (bytearray, memoryview)? I don't think that we need an > ABC here. We'd only need an ABC if info was added for supported input types. However, that's not necessary since "encodes_to" and "decodes_to" are enough to identify Unicode encodings: "encodes_to in (None, bytes) and decodes_to in (None, str)", so we don't need to track input type support at all if the main question we want to answer is "is this a Unicode codec or not?". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mal at egenix.com Sat Nov 16 12:49:55 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 16 Nov 2013 12:49:55 +0100 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> <20131115150433.5a54c08f@fsol> <20131115173458.123494af@fsol> Message-ID: <52875BE3.6070308@egenix.com> On 16.11.2013 01:47, Victor Stinner wrote: > Adding transform()/untransform() method to bytes and str is a non > trivial change and not everybody likes them. Anyway, it's too late for > Python 3.4. Just to clarify: I still like the idea of adding those methods. I just don't see what this addition has to do with the codecs.encode()/ .decode() functions. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 16 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-11-19: Python Meeting Duesseldorf ... 3 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From solipsis at pitrou.net Sat Nov 16 12:52:55 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 16 Nov 2013 12:52:55 +0100 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> <20131115150433.5a54c08f@fsol> <20131115173458.123494af@fsol> Message-ID: <20131116125255.2abd43d8@fsol> On Sat, 16 Nov 2013 19:44:51 +1000 Nick Coghlan wrote: > > Aye, that was my conclusion (hence my proposal on issue 7475 back in April). > > Can I take that observation as a +1 for restoring the aliases as well? I see no harm in restoring the aliases personally, so +1 from me. Regards Antoine. From ncoghlan at gmail.com Sat Nov 16 13:48:06 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 16 Nov 2013 22:48:06 +1000 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> <20131115150433.5a54c08f@fsol> <20131115173458.123494af@fsol> Message-ID: On 16 November 2013 21:38, Nick Coghlan wrote: > On 16 November 2013 20:45, Victor Stinner wrote: >> Why not using str type for str and str subtypes, and bytes type for bytes >> and bytes-like object (bytearray, memoryview)? I don't think that we need an >> ABC here. > > We'd only need an ABC if info was added for supported input types. > However, that's not necessary since "encodes_to" and "decodes_to" are > enough to identify Unicode encodings: "encodes_to in (None, bytes) and > decodes_to in (None, str)", so we don't need to track input type > support at all if the main question we want to answer is "is this a > Unicode codec or not?". I realised I misunderstood your proposal because of the field names you initially suggested. I've now proposed a variation with different field names (encodes_to instead of output_type and decodes_to instead of input_type) and a "codecs.is_text_encoding" query function (http://bugs.python.org/issue19619?#msg203037) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat Nov 16 14:05:49 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 16 Nov 2013 23:05:49 +1000 Subject: [Python-Dev] Add transform() and untranform() methods In-Reply-To: <52875BE3.6070308@egenix.com> References: <20131115102228.4d828516@fsol> <20131115102835.GQ2085@ando> <20131115113347.52f5b10c@fsol> <20131115150433.5a54c08f@fsol> <20131115173458.123494af@fsol> <52875BE3.6070308@egenix.com> Message-ID: On 16 November 2013 21:49, M.-A. Lemburg wrote: > On 16.11.2013 01:47, Victor Stinner wrote: >> Adding transform()/untransform() method to bytes and str is a non >> trivial change and not everybody likes them. Anyway, it's too late for >> Python 3.4. > > Just to clarify: I still like the idea of adding those methods. > > I just don't see what this addition has to do with the codecs.encode()/ > .decode() functions. 
Part of the interest here is in making Python 3 better compete with the ease of the following in Python 2: >>> "68656c6c6f".decode("hex") 'hello' >>> "hello".encode("hex") '68656c6c6f' Until recently, I (and others) thought the best Python 3 had to offer was: >>> import codecs >>> codecs.getencoder("hex_codec")(b"hello")[0] b'68656c6c6f' >>> codecs.getdecoder("hex_codec")(b"68656c6c6f")[0] b'hello' In reality, though, Python 3 has always supported the following, it just wasn't documented so I (and others) didn't know it had actually been available as an alternative interface to the codecs machinery since Python 2.4: >>> from codecs import encode, decode >>> encode(b"hello", "hex_codec") b'68656c6c6f' >>> decode(b"68656c6c6f", "hex_codec") b'hello' That's almost as clean as the Python 2 version, it just requires the initial import of the convenience functions from the codecs module. The fact it is supported in Python 2 means that 2/3 compatible codecs can also use it. Accordingly, I now see ensuring that everyone has a common understanding of *what is already available* as an essential next step, and only then consider significant changes in the codecs mechanisms*. I know I learned a hell of a lot about the distinction between the type agnostic codec infrastructure and the Unicode text model over the past several months, and I think this thread shows clearly that there's still a lot of confusion over the matter, even amongst core developers. That's a problem, and something we need to fix before giving further consideration to the transform/untransform idea. *(Victor's proposal in issue 19619 is actually relatively modest, now that I understand it properly, and entails taking the existing output type checks and making it possible to do them in advance, without touching input type checks) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From fijall at gmail.com Sat Nov 16 14:17:33 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sat, 16 Nov 2013 15:17:33 +0200 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: References: Message-ID: On Sat, Nov 16, 2013 at 3:51 AM, Terry Reedy wrote: > http://bugs.python.org/issue19562 > proposes to change the first assert in Lib/datetime.py > assert 1 <= month <= 12, month > to > assert 1 <= month <= 12, 'month must be in 1..12' > to match the next two asserts out of the *53* in the file. I think that is > the wrong direction of change, but that is not my question here. > > Should stdlib code use assert at all? > > If user input can trigger an assert, then the code should raise a normal > exception that will not disappear with -OO. May I assert that -OO should instead be killed? From ncoghlan at gmail.com Sat Nov 16 16:09:23 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 17 Nov 2013 01:09:23 +1000 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: References: Message-ID: On 16 November 2013 23:17, Maciej Fijalkowski wrote: > On Sat, Nov 16, 2013 at 3:51 AM, Terry Reedy wrote: >> If user input can trigger an assert, then the code should raise a normal >> exception that will not disappear with -OO. > > May I assert that -OO should instead be killed? "I don't care about embedded devices" is not a good rationale for killing features that really only benefit people running Python on such systems. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
From fijall at gmail.com Sat Nov 16 16:33:23 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sat, 16 Nov 2013 17:33:23 +0200 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: References: Message-ID: On Sat, Nov 16, 2013 at 5:09 PM, Nick Coghlan wrote: > On 16 November 2013 23:17, Maciej Fijalkowski wrote: >> On Sat, Nov 16, 2013 at 3:51 AM, Terry Reedy wrote: >>> If user input can trigger an assert, then the code should raise a normal >>> exception that will not disappear with -OO. >> >> May I assert that -OO should instead be killed? > > "I don't care about embedded devices" is not a good rationale for > killing features that really only benefit people running Python on > such systems. > > Cheers, > Nick. Can I see some writeup how -OO benefits embedded devices?
From fijall at gmail.com Sat Nov 16 16:34:15 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sat, 16 Nov 2013 17:34:15 +0200 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: References: Message-ID: On Sat, Nov 16, 2013 at 5:33 PM, Maciej Fijalkowski wrote: > On Sat, Nov 16, 2013 at 5:09 PM, Nick Coghlan wrote: >> On 16 November 2013 23:17, Maciej Fijalkowski wrote: >>> On Sat, Nov 16, 2013 at 3:51 AM, Terry Reedy wrote: >>>> If user input can trigger an assert, then the code should raise a normal >>>> exception that will not disappear with -OO. >>> >>> May I assert that -OO should instead be killed? >> >> "I don't care about embedded devices" is not a good rationale for >> killing features that really only benefit people running Python on >> such systems. >> >> Cheers, >> Nick. > > Can I see some writeup how -OO benefits embedded devices? Or more importantly, how removing asserts does. And how renaming it to --remove-asserts would not help (people really do believe it would optimize their code)
From solipsis at pitrou.net Sat Nov 16 16:46:00 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 16 Nov 2013 16:46:00 +0100 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) References: Message-ID: <20131116164600.74e4e703@fsol> On Sat, 16 Nov 2013 17:34:15 +0200 Maciej Fijalkowski wrote: > On Sat, Nov 16, 2013 at 5:33 PM, Maciej Fijalkowski wrote: > > On Sat, Nov 16, 2013 at 5:09 PM, Nick Coghlan wrote: > >> On 16 November 2013 23:17, Maciej Fijalkowski wrote: > >>> On Sat, Nov 16, 2013 at 3:51 AM, Terry Reedy wrote: > >>>> If user input can trigger an assert, then the code should raise a normal > >>>> exception that will not disappear with -OO. > >>> > >>> May I assert that -OO should instead be killed? > >> > >> "I don't care about embedded devices" is not a good rationale for > >> killing features that really only benefit people running Python on > >> such systems. > >> > >> Cheers, > >> Nick. > > > > Can I see some writeup how -OO benefits embedded devices? > > Or more importantly, how removing asserts does. And how renaming it to > --remove-asserts would not help (people really do believe it > would optimize their code) I agree that conflating the two doesn't help the discussion. While removing docstrings may be beneficial on memory-constrained devices, I can't remember a single situation where I've wanted to remove asserts on a production system. (I also tend to write fewer and fewer asserts in production code, since all of them tend to go in unit tests instead, with the help of e.g. mock objects) Regards Antoine.
From ncoghlan at gmail.com Sat Nov 16 16:50:31 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 17 Nov 2013 01:50:31 +1000 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: References: Message-ID: On 17 November 2013 01:34, Maciej Fijalkowski wrote: > On Sat, Nov 16, 2013 at 5:33 PM, Maciej Fijalkowski wrote: >> On Sat, Nov 16, 2013 at 5:09 PM, Nick Coghlan wrote: >>> On 16 November 2013 23:17, Maciej Fijalkowski wrote: >>>> On Sat, Nov 16, 2013 at 3:51 AM, Terry Reedy wrote: >>>>> If user input can trigger an assert, then the code should raise a normal >>>>> exception that will not disappear with -OO. >>>> >>>> May I assert that -OO should instead be killed? >>> >>> "I don't care about embedded devices" is not a good rationale for >>> killing features that really only benefit people running Python on >>> such systems. >>> >>> Cheers, >>> Nick. >> >> Can I see some writeup how -OO benefits embedded devices? > > Or more importantly, how removing asserts does. And how renaming it to > --remove-asserts would not help (people really do believe it > would optimize their code) No, that's the wrong question to ask. The onus is on *you* to ask "Who is this feature for? Do they still need it? Can we meet their needs in a different way?". You're the one proposing to break things, so it's up to you to make the case for why that's an OK thing to do. And until you ask those questions, and openly and honestly do the research to answer them (rather than assuming the answer you want), and can provide evidence of having done so, then it's entirely reasonable for me to dismiss the suggestion as you saying "this doesn't benefit me, so it doesn't benefit anyone, so it's OK to get rid of it". That's not the way this works - backwards compatibility is sacrosanct, and it requires some seriously compelling evidence to justify a breach. (This even applies to the Python 3 transition: the really annoying discrepancies between Python 2 and 3 are the ones where we allowed a backwards compatibility break without adequate justification, but now we're locked in to the decision due to internal backwards compatibility constraints within the Python 3 series). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
From ncoghlan at gmail.com Sat Nov 16 17:08:14 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 17 Nov 2013 02:08:14 +1000 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: <20131116164600.74e4e703@fsol> References: <20131116164600.74e4e703@fsol> Message-ID: On 17 November 2013 01:46, Antoine Pitrou wrote: > I agree that conflating the two doesn't help the discussion. > While removing docstrings may be beneficial on memory-constrained > devices, I can't remember a single situation where I've wanted to > remove asserts on a production system. While I actually agree that having separate flags for --omit-debug, --omit-asserts and --omit-docstrings would make more sense than the current optimization levels, Maciej first proposed killing off -OO (where the most significant effect is removing docstrings, which can result in substantial program footprint reductions for embedded systems), and only later switched to asking about removing asserts (part of -O, which also removes blocks guarded by "if __debug__", both of which help embedded systems preserve precious ROM space, although to a lesser degree than removing docstrings can save RAM).
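A minimal illustration of the difference between the two levels (the file name here is arbitrary). Given a module odemo.py containing:

    """Docstring: stripped under -OO."""

    if __debug__:
        print("debug checks enabled")

    assert 2 + 2 == 4, "asserts are compiled out under -O and -OO"
    print("__doc__ is", __doc__)

plain "python odemo.py" prints both lines; "python -O odemo.py" compiles out the assert and the whole "if __debug__" block, so only the __doc__ line remains; and "python -OO odemo.py" additionally strips the docstring, reporting "__doc__ is None".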
One of the most important questions to ask when proposing the removal of something is "What replacement are we offering for those users that actually need (or even just think they need) this feature?". Sometimes the answer is "Nothing", sometimes it's something that only covers a subset of previous use cases, and sometimes it's a complete functional equivalent with an improved spelling. But not asking the question at all (or, worse, dismissing the concerns of affected users as irrelevant and uninteresting) is a guaranteed way to annoy the very people that actually rely on the feature that is up for removal or replacement, when you *really* want them engaged and clearly explaining their use cases. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
From donald at stufft.io Sat Nov 16 17:16:48 2013 From: donald at stufft.io (Donald Stufft) Date: Sat, 16 Nov 2013 11:16:48 -0500 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: References: <20131116164600.74e4e703@fsol> Message-ID: <4E8F1FF1-E22F-4746-9354-36E55ED9A8B6@stufft.io> Personally I think that none of the -O* should be removing asserts. It feels like a foot gun to me. I've seen more than one codebase that would be completely broken under -O* because they used asserts without even knowing -O* existed. Removing __debug__ blocks and doc strings I don't think is as big of a deal, although removing doc strings can break code as well. On Nov 16, 2013, at 11:08 AM, Nick Coghlan wrote: > On 17 November 2013 01:46, Antoine Pitrou wrote: >> I agree that conflating the two doesn't help the discussion. >> While removing docstrings may be beneficial on memory-constrained >> devices, I can't remember a single situation where I've wanted to >> remove asserts on a production system. > > While I actually agree that having separate flags for --omit-debug, > --omit-asserts and --omit-docstrings would make more sense than the > current optimization levels, Maciej first proposed killing off -OO > (where the most significant effect is removing docstrings which can > result in substantial program footprint reductions for embedded > systems), and only later switched to asking about removing asserts > (part of -O, which also removes blocks guarded by "if __debug__", both > of which help embedded systems preserve precious ROM space, although > to a lesser degree than removing docstrings can save RAM). > > One of the most important questions to ask when proposing the removal > of something is "What replacement are we offering for those users that > actually need (or even just think they need) this feature?". Sometimes > the answer is "Nothing", sometimes it's something that only covers a > subset of previous use cases, and sometimes it's a complete functional > equivalent with an improved spelling. But not asking the question at > all (or, worse, dismissing the concerns of affected users as > irrelevant and uninteresting) is a guaranteed way to annoy the very > people that actually rely on the feature that is up for removal or > replacement, when you *really* want them engaged and clearly > explaining their use cases. > > Cheers, > Nick.
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/donald%40stufft.io ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 801 bytes Desc: Message signed with OpenPGP using GPGMail URL:
From solipsis at pitrou.net Sat Nov 16 17:34:58 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 16 Nov 2013 17:34:58 +0100 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: <4E8F1FF1-E22F-4746-9354-36E55ED9A8B6@stufft.io> References: <20131116164600.74e4e703@fsol> <4E8F1FF1-E22F-4746-9354-36E55ED9A8B6@stufft.io> Message-ID: <20131116173458.5f022e90@fsol> On Sat, 16 Nov 2013 11:16:48 -0500 Donald Stufft wrote: > Personally I think that none of the -O* should be removing asserts. It feels > like a foot gun to me. I've seen more than one codebase that would be > completely broken under -O* because they used asserts without even knowing > -O* existed. Originally it was probably done so as to mimic C, where compiling in optimized mode will indeed disable any assert()s in the code. Regards Antoine.
From apieum at gmail.com Sat Nov 16 17:41:41 2013 From: apieum at gmail.com (Gregory Salvan) Date: Sat, 16 Nov 2013 17:41:41 +0100 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: <4E8F1FF1-E22F-4746-9354-36E55ED9A8B6@stufft.io> References: <20131116164600.74e4e703@fsol> <4E8F1FF1-E22F-4746-9354-36E55ED9A8B6@stufft.io> Message-ID: Hi, Some languages (C#, Java) do the reverse by removing assertions unless we tell the compiler to keep them. Personally, I find that approach quite sensible, as I expect assertions not to be run in production. It would be painful to introduce this behaviour in Python now, but I hope we'll keep a way to remove assertions, and I find the idea of specific flags (--omit-debug, --omit-asserts and --omit-docstrings) interesting. cheers, Grégory 2013/11/16 Donald Stufft > Personally I think that none of the -O* should be removing asserts. It > feels > like a foot gun to me. I've seen more than one codebase that would be > completely broken under -O* because they used asserts without even knowing > -O* existed. > > Removing __debug__ blocks and doc strings I don't think is as big of a deal, > although removing doc strings can break code as well. > > On Nov 16, 2013, at 11:08 AM, Nick Coghlan wrote: > > > On 17 November 2013 01:46, Antoine Pitrou wrote: > >> I agree that conflating the two doesn't help the discussion. > >> While removing docstrings may be beneficial on memory-constrained > >> devices, I can't remember a single situation where I've wanted to > >> remove asserts on a production system.
> > > > While I actually agree that having separate flags for --omit-debug, > > --omit-asserts and --omit-docstrings would make more sense than the > > current optimization levels, Maciej first proposed killing off -OO > > (where the most significant effect is removing docstrings which can > > result in substantial program footprint reductions for embedded > > systems), and only later switched to asking about removing asserts > > (part of -O, which also removes blocks guarded by "if __debug__", both > > of which help embedded systems preserve precious ROM space, although > > to a lesser degree than removing docstrings can save RAM). > > > > One of the most important questions to ask when proposing the removal > > of something is "What replacement are we offering for those users that > > actually need (or even just think they need) this feature?". Sometimes > > the answer is "Nothing", sometimes it's something that only covers a > > subset of previous use cases, and sometimes it's a complete functional > > equivalent with an improved spelling. But not asking the question at > > all (or, worse, dismissing the concerns of affected users as > > irrelevant and uninteresting) is a guaranteed way to annoy the very > > people that actually rely on the feature that is up for removal or > > replacement, when you *really* want them engaged and clearly > > explaining their use cases. > > > > Cheers, > > Nick. > > > > -- > > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/donald%40stufft.io > > > ----------------- > Donald Stufft > PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 > DCFA > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/apieum%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From trent at snakebite.org Sat Nov 16 17:45:00 2013 From: trent at snakebite.org (Trent Nelson) Date: Sat, 16 Nov 2013 11:45:00 -0500 Subject: [Python-Dev] The pysandbox project is broken In-Reply-To: References: <20131115165629.GA1060@snakebite.org> Message-ID: <20131116164500.GA3152@snakebite.org> On Sat, Nov 16, 2013 at 02:53:22AM -0800, Maciej Fijalkowski wrote: > On Fri, Nov 15, 2013 at 6:56 PM, Trent Nelson wrote: > > On Tue, Nov 12, 2013 at 01:16:55PM -0800, Victor Stinner wrote: > >> pysandbox cannot be used in practice > >> ==================================== > >> > >> To protect the untrusted namespace, pysandbox installs a lot of > >> different protections. Because of all these protections, it becomes > >> hard to write Python code. Basic features like "del dict[key]" are > >> denied. Passing an object into the sandbox is not possible either: > >> pysandbox is unable to proxify arbitrary objects. > >> > >> For something more complex than evaluating "1+(2*3)", pysandbox cannot > >> be used in practice, because of all these protections. Individual > >> protections cannot be disabled, all protections are required to get a > >> secure sandbox. > > > > This sounds a lot like the work I initially did with PyParallel to > > try and intercept/prevent parallel threads mutating main-thread > > objects.
> > > > I ended up arriving at a much better solution by just relying on > > memory protection; main thread pages are set read-only prior to > > parallel threads being able to run. If a parallel thread attempts > > to mutate a main thread object, a SEH is raised (SIGSEGV on POSIX), > > which I catch in the ceval loop and convert into an exception. > > > > See slide 138 of this: https://speakerdeck.com/trent/pyparallel-how-we-removed-the-gil-and-exploited-all-cores-1 > > > > I'm wondering if this sort of an approach (which worked surprisingly > > well) could be leveraged to also provide a sandbox environment? The > > goals are the same: robust protection against mutation of memory > > allocated outside of the sandbox. > > > > (I'm purely talking about memory mutation; haven't thought about how > > that could be extended to prevent file system interaction as well.) > > > > > > Trent. > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com > > Trent, you should read the mail more carefully. Notably the same > issues that make it impossible to create a sandbox make it impossible > to make PyParallel really work. Being read-only is absolutely not > enough - you can read some internal structures in an inconsistent state > that leads to crashes and/or very unexpected behavior even without > modifying anything. What do you mean by inconsistent state? Like a dict half way through `a['foo'] = 'bar'`? That can't happen with PyParallel; parallel threads don't run when the main thread runs and vice versa. The main thread's memory (and internal object structure) will always be consistent by the time the parallel threads run. > PS. We really did a lot of work analyzing how STM-pypy can lead to > conflicts and/or inconsistent behavior. But you support free-threading though, right? As in, code that subclasses threading.Thread should be able to benefit from your STM work? I explicitly don't support free-threading. Your threading.Thread code will not magically run faster with PyParallel. You'll need to re-write your code using the parallel and async façade APIs I expose. On the plus side, I can completely control everything about the main thread and parallel thread execution environments; obviating the need to protect against internal inconsistencies by virtue of the fact that the main thread will always be in a consistent state when the parallel threads are running. (And it works really well in practice; I ported SimpleHTTPServer to use my new async stuff and it flies -- it'll automatically exploit all your cores if there is sufficient incoming load. An unexpected side-effect of my implementation is that code executing in parallel callbacks actually runs faster than normal single-threaded Python code; no need to do reference counting, GC, and the memory model is ridiculously cache and TLB friendly.) This is getting off-topic though and I don't want to hijack the sandbox thread. I was planning on sending an e-mail in a few days when the PyData video of my talk is live -- we can debate the merits of my parallel/async approach then :-) > Cheers, > fijal Trent.
From ericsnowcurrently at gmail.com Sat Nov 16 18:48:38 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 16 Nov 2013 10:48:38 -0700 Subject: [Python-Dev] What's the story on Py_FrozenMain?
Message-ID: While looking at something unrelated, I happened to peek at Python/frozenmain.c and found Py_FrozenMain(). I kind of get the idea of it, but am curious what motivated the addition and who might be using it. The function is not documented and doesn't have much explanation. I'm guessing that not many are familiar with it (e.g. http://bugs.python.org/issue15893). FWIW the function was added quite a while ago (and hasn't been touched a whole lot since): changeset: 1270:14369a5e61679364deeae9a9a0deedbd593a72e0 branch: legacy-trunk user: Guido van Rossum date: Thu Apr 01 20:59:32 1993 +0000 summary: Support for frozen scripts; added -i option. -eric From mal at egenix.com Sat Nov 16 18:55:35 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 16 Nov 2013 18:55:35 +0100 Subject: [Python-Dev] What's the story on Py_FrozenMain? In-Reply-To: References: Message-ID: <5287B197.5090206@egenix.com> On 16.11.2013 18:48, Eric Snow wrote: > While looking at something unrelated, I happened to peek at > Python/frozenmain.c and found Py_FrozenMain(). I kind of get the idea > of it, but am curious what motivated the addition and who might be > using it. The function is not documented and doesn't have much > explanation. I'm guessing that not many are familiar with it (e.g. > http://bugs.python.org/issue15893). > > FWIW the function was added quite a while ago (and hasn't been touched > a whole lot since): > > changeset: 1270:14369a5e61679364deeae9a9a0deedbd593a72e0 > branch: legacy-trunk > user: Guido van Rossum > date: Thu Apr 01 20:59:32 1993 +0000 > summary: Support for frozen scripts; added -i option. It's used as main()-function for frozen Python interpreters. See eGenix PyRun as an example and the freeze tool in Tools/freeze/ for the implementation that uses this API: http://www.egenix.com/products/python/PyRun/ -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 16 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-11-19: Python Meeting Duesseldorf ... 3 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From guido at python.org Sat Nov 16 18:57:20 2013 From: guido at python.org (Guido van Rossum) Date: Sat, 16 Nov 2013 09:57:20 -0800 Subject: [Python-Dev] What's the story on Py_FrozenMain? In-Reply-To: References: Message-ID: This is very old DNA. The persistent user request was a way to bundle up a Python program as a single executable file that could be sent to a friend or colleague and run without first having to install Python. If you Google for python freeze you'll still see old references to it. IIRC I did the original version -- it would scan your main program and try to follow all your imports to get a list of modules (yours and stdlib) that would be needed, and it would then byte-compile all of these and produce a huge C file. You would then compile and link that C file with the rest of the Python executable. All extensions would have to be statically linked. I think this was also used as the basis of a similar tool that worked for Windows. 
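The import-scanning half of that workflow still lives in the stdlib as the modulefinder module, which Tools/freeze builds on. A rough sketch of just that first step (the script name is made up):

    from modulefinder import ModuleFinder

    finder = ModuleFinder()
    finder.run_script("myprogram.py")  # recursively follows imports
    for name, mod in sorted(finder.modules.items()):
        # freeze byte-compiles each of these into the generated C file
        print(name, getattr(mod, "__file__", None))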
Nowadays installers are much more accessible and easier to use, and Python isn't so new and unknown any more, so there's not much demand left. --Guido On Sat, Nov 16, 2013 at 9:48 AM, Eric Snow wrote: > While looking at something unrelated, I happened to peek at > Python/frozenmain.c and found Py_FrozenMain(). I kind of get the idea > of it, but am curious what motivated the addition and who might be > using it. The function is not documented and doesn't have much > explanation. I'm guessing that not many are familiar with it (e.g. > http://bugs.python.org/issue15893). > > FWIW the function was added quite a while ago (and hasn't been touched > a whole lot since): > > changeset: 1270:14369a5e61679364deeae9a9a0deedbd593a72e0 > branch: legacy-trunk > user: Guido van Rossum > date: Thu Apr 01 20:59:32 1993 +0000 > summary: Support for frozen scripts; added -i option. > > -eric > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL:
From ericsnowcurrently at gmail.com Sat Nov 16 19:07:04 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 16 Nov 2013 11:07:04 -0700 Subject: [Python-Dev] What's the story on Py_FrozenMain? In-Reply-To: References: Message-ID: On Sat, Nov 16, 2013 at 10:55 AM, M.-A. Lemburg wrote: > It's used as main()-function for frozen Python interpreters. > > See eGenix PyRun as an example and the freeze tool in Tools/freeze/ > for the implementation that uses this API: > > http://www.egenix.com/products/python/PyRun/ On Sat, Nov 16, 2013 at 10:57 AM, Guido van Rossum wrote: > This is very old DNA. The persistent user request was a way to bundle up a > Python program as a single executable file that could be sent to a friend or > colleague and run without first having to install Python. If you Google for > python freeze you'll still see old references to it. > > IIRC I did the original version -- it would scan your main program and try > to follow all your imports to get a list of modules (yours and stdlib) that > would be needed, and it would then byte-compile all of these and produce a > huge C file. You would then compile and link that C file with the rest of > the Python executable. All extensions would have to be statically linked. > > I think this was also used as the basis of a similar tool that worked for > Windows. > > Nowadays installers are much more accessible and easier to use, and Python > isn't so new and unknown any more, so there's not much demand left. Thanks for the explanations. It's interesting stuff. :) -eric
From solipsis at pitrou.net Sat Nov 16 19:15:26 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 16 Nov 2013 19:15:26 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? Message-ID: <20131116191526.16f9b79c@fsol> Hello, Alexandre Vassalotti (thanks a lot!) has recently finalized his work on the PEP 3154 implementation - pickle protocol 4. I think it would be good to get the PEP and the implementation accepted for 3.4. As far as I can say, this has been a low-controversy proposal, and it brings fairly obvious improvements to the table (which table?).
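As with previous protocol bumps, nothing changes by default; opting in is a one-liner (a sketch -- protocol 4 is only available once the implementation lands):

    import pickle

    data = {"payload": b"x" * 100}
    blob = pickle.dumps(data, protocol=4)  # explicit opt-in to the new protocol
    assert pickle.loads(blob) == data      # loads() detects the protocol itself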
I still need some kind of BDFL or BDFL delegate to do that, though -- unless I am allowed to mark my own PEP accepted :-) (I've asked Tim, specifically, for comments, since he contributed a lot to previous versions of the pickle protocol.) The PEP is at http://www.python.org/dev/peps/pep-3154/ (should be rebuilt soon by the server, I'd say) Alexandre's implementation is tracked at http://bugs.python.org/issue17810 Regards Antoine. From trent at snakebite.org Sat Nov 16 19:13:28 2013 From: trent at snakebite.org (Trent Nelson) Date: Sat, 16 Nov 2013 13:13:28 -0500 Subject: [Python-Dev] PyParallel: alternate async I/O and GIL removal Message-ID: <20131116181328.GB3152@snakebite.org> Hi folks, Video of the presentation I gave last weekend at PyData NYC regarding PyParallel just went live: https://vimeo.com/79539317 Slides are here: https://speakerdeck.com/trent/pyparallel-how-we-removed-the-gil-and-exploited-all-cores-1 The work was driven by the async I/O discussions around this time last year on python-ideas. That resulted in me sending this: http://markmail.org/thread/kh3qgjbydvxt3exw#query:+page:1+mid:arua62vllzugjy2v+state:results ....where I attempted to argue that there was a better way of doing async I/O on Windows than the status quo of single-threaded, non-blocking I/O with an event multiplex syscall. I wasn't successful in convincing anyone at the time; I had no code to back it up and I didn't articulate my plans for GIL removal at the time either (figuring the initial suggestion would be met with enough scepticism as is). So, in the video above, I spend a lot of time detailing how IOCP works on Windows, how it presents us with a better environment than UNIX for doing asynchronous I/O, and how it paired nicely with the other work I did on coming up with a way for multiple threads to execute simultaneously across all cores without introducing any speed penalties. I'm particularly interested to hear if the video/slides helped UNIX-centric people gain a better understanding of how Windows does IOCP and why it would be preferable when doing async I/O. The reverse is also true: if you still think single-threaded, non- blocking synchronous I/O via kqueue/epoll is better than the approach afforded by IOCP, I'm interested in hearing why. As crazy as it sounds, my long term goal would be to try and influence Linux and BSD kernels to implement thread-agnostic I/O support such that an IOCP-like mechanism could be exposed; Solaris and AIX already do this via event ports and AIX's verbatim copy of Windows' IOCP API. (There is some promising work already being done on Linux; see recent MegaPipe paper for an example.) Regards, Trent. From ericsnowcurrently at gmail.com Sat Nov 16 19:40:18 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 16 Nov 2013 11:40:18 -0700 Subject: [Python-Dev] Mixed up core/module source file locations in CPython Message-ID: If you look at the Python and Modules directories in the cpython repo, you'll find modules in Python/ and core files (like python.c and main.c) in Modules/. (It's like parking on a driveway and driving on a parkway. ) It's not that big a deal and not that hard to figure out (so I'm fine with the status quo), but it is a bit surprising. When I was first getting familiar with the code base a few years ago (as a C non-expert), it was a not insignificant but not major stumbling block. The situation is mostly a consequence of history, if I understand correctly. 
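Concretely, the mismatch looks like this in today's tree (a few representative files):

    Modules/python.c      # the interpreter's main() entry point
    Modules/main.c        # the Py_Main() implementation
    Python/sysmodule.c    # the sys module
    Python/bltinmodule.c  # the builtins module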
The subject has come up before and I don't recall any objections to doing something about it. I haven't had the time to track down those earlier discussions, though I remember Benjamin having some comment about it. Would it be too disruptive (churn, etc.) to clean this up in 3.5? I see it similarly to when I moved a light switch from outside my bathroom to inside. For a while, but not that long, I kept unconsciously reaching for the switch that was no longer there on the outside. Regardless I'm glad I did it. Likewise, moving the handful of files around is a relatively inconsequential change that would make the project just a little less surprising, particularly for new contributors. -eric p.s. Either way I'll probably take some time (it shouldn't take long) after the PEP 451 implementation is done to put together a patch that moves the files around, just to see what difference it makes. From ericsnowcurrently at gmail.com Sat Nov 16 19:44:31 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 16 Nov 2013 11:44:31 -0700 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <20131116191526.16f9b79c@fsol> References: <20131116191526.16f9b79c@fsol> Message-ID: On Sat, Nov 16, 2013 at 11:15 AM, Antoine Pitrou wrote: > > Hello, > > Alexandre Vassalotti (thanks a lot!) has recently finalized his work on > the PEP 3154 implementation - pickle protocol 4. > > I think it would be good to get the PEP and the implementation accepted > for 3.4. +1 Once the core portion of the PEP 451 implementation is done, I plan on tweaking pickle to take advantage of module.__spec__ (where applicable). It would be nice to have the PEP 3154 implementation in place before I do anything in that space. -eric From solipsis at pitrou.net Sat Nov 16 20:15:03 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 16 Nov 2013 20:15:03 +0100 Subject: [Python-Dev] NTPath or WindowsPath? Message-ID: <20131116201503.3c4209dc@fsol> Hello, In a (private) discussion about PEP 428 and pathlib, Guido proposed that maybe NTPath should be renamed to WindowsPath, since the name is more likely to stay relevant in the middle term. What do you think? Regards Antoine. From benjamin at python.org Sat Nov 16 20:21:30 2013 From: benjamin at python.org (Benjamin Peterson) Date: Sat, 16 Nov 2013 14:21:30 -0500 Subject: [Python-Dev] NTPath or WindowsPath? In-Reply-To: <20131116201503.3c4209dc@fsol> References: <20131116201503.3c4209dc@fsol> Message-ID: 2013/11/16 Antoine Pitrou : > > Hello, > > In a (private) discussion about PEP 428 and pathlib, Guido proposed > that maybe NTPath should be renamed to WindowsPath, since the name is > more likely to stay relevant in the middle term. What do you think? I agree with Guido. -- Regards, Benjamin From Steve.Dower at microsoft.com Sat Nov 16 20:30:19 2013 From: Steve.Dower at microsoft.com (Steve Dower) Date: Sat, 16 Nov 2013 19:30:19 +0000 Subject: [Python-Dev] NTPath or WindowsPath? In-Reply-To: References: <20131116201503.3c4209dc@fsol>, Message-ID: Sounds good to me. NT is already an obsolete term - Win32 would be more accurate - but WinRT hasn't changed the path format, so WindowsPath will be accurate for the foreseeable future. Cheers, Steve Top posted from my Windows Phone ________________________________ From: Benjamin Peterson Sent: ?11/?16/?2013 11:22 To: Antoine Pitrou Cc: Python Dev Subject: Re: [Python-Dev] NTPath or WindowsPath? 
2013/11/16 Antoine Pitrou : > > Hello, > > In a (private) discussion about PEP 428 and pathlib, Guido proposed > that maybe NTPath should be renamed to WindowsPath, since the name is > more likely to stay relevant in the middle term. What do you think? I agree with Guido. -- Regards, Benjamin _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/steve.dower%40microsoft.com -------------- next part -------------- An HTML attachment was scrubbed... URL:
From storchaka at gmail.com Sat Nov 16 21:50:37 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 16 Nov 2013 22:50:37 +0200 Subject: [Python-Dev] NTPath or WindowsPath? In-Reply-To: <20131116201503.3c4209dc@fsol> References: <20131116201503.3c4209dc@fsol> Message-ID: On 16.11.13 21:15, Antoine Pitrou wrote: > In a (private) discussion about PEP 428 and pathlib, Guido proposed > that maybe NTPath should be renamed to WindowsPath, since the name is > more likely to stay relevant in the middle term. What do you think? What about nturl2path, os.name, sysconfig.get_scheme_names()?
From brett at python.org Sun Nov 17 00:44:56 2013 From: brett at python.org (Brett Cannon) Date: Sat, 16 Nov 2013 18:44:56 -0500 Subject: [Python-Dev] Mixed up core/module source file locations in CPython In-Reply-To: References: Message-ID: On Sat, Nov 16, 2013 at 1:40 PM, Eric Snow wrote: > If you look at the Python and Modules directories in the cpython repo, > you'll find modules in Python/ and core files (like python.c and > main.c) in Modules/. (It's like parking on a driveway and driving on > a parkway. ) It's not that big a deal and not that hard to > figure out (so I'm fine with the status quo), but it is a bit > surprising. When I was first getting familiar with the code base a > few years ago (as a C non-expert), it was a not insignificant but not > major stumbling block. > > The situation is mostly a consequence of history, if I understand > correctly. The subject has come up before and I don't recall any > objections to doing something about it. I haven't had the time to > track down those earlier discussions, though I remember Benjamin > having some comment about it. > > Would it be too disruptive (churn, etc.) to clean this up in 3.5? I > see it similarly to when I moved a light switch from outside my > bathroom to inside. For a while, but not that long, I kept > unconsciously reaching for the switch that was no longer there on the > outside. Regardless I'm glad I did it. Likewise, moving the handful > of files around is a relatively inconsequential change that would make > the project just a little less surprising, particularly for new > contributors. > > -eric > > p.s. Either way I'll probably take some time (it shouldn't take long) > after the PEP 451 implementation is done to put together a patch that > moves the files around, just to see what difference it makes. > I personally think it would be a good idea to re-arrange the files to make things more beginner-friendly. I believe Nick was also talking about renaming directories, etc. at some point. -------------- next part -------------- An HTML attachment was scrubbed... URL:
From ncoghlan at gmail.com Sun Nov 17 02:41:15 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 17 Nov 2013 11:41:15 +1000 Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: References: <20131116191526.16f9b79c@fsol> Message-ID: On 17 Nov 2013 04:45, "Eric Snow" wrote: > > On Sat, Nov 16, 2013 at 11:15 AM, Antoine Pitrou wrote: > > > > Hello, > > > > Alexandre Vassalotti (thanks a lot!) has recently finalized his work on > > the PEP 3154 implementation - pickle protocol 4. > > > > I think it would be good to get the PEP and the implementation accepted > > for 3.4. > > +1 > > Once the core portion of the PEP 451 implementation is done, I plan on > tweaking pickle to take advantage of module.__spec__ (where > applicable). It would be nice to have the PEP 3154 implementation in > place before I do anything in that space. +1 from me, too. Cheers, Nick. > > -eric > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Nov 17 02:46:30 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 17 Nov 2013 11:46:30 +1000 Subject: [Python-Dev] Mixed up core/module source file locations in CPython In-Reply-To: References: Message-ID: On 17 Nov 2013 09:48, "Brett Cannon" wrote: > > > I personally think it would be a good idea to re-arrange the files to make things more beginner-friendly. I believe Nick was also talking about renaming directories, etc. at some point. Yeah, the main ones I am interested in are a separate "Programs" dir for the C files that define C main functions and breaking up the monster that is pythonrun.c. I was looking into that as part of the PEP 432 implementation, so it fell by the wayside when I deferred that to help out with packaging issues instead. Cheers, Nick. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun Nov 17 02:39:13 2013 From: guido at python.org (Guido van Rossum) Date: Sat, 16 Nov 2013 17:39:13 -0800 Subject: [Python-Dev] PyParallel: alternate async I/O and GIL removal In-Reply-To: <20131116181328.GB3152@snakebite.org> References: <20131116181328.GB3152@snakebite.org> Message-ID: Trent, I watched your video and read your slides. (Does the word "motormouth" mean anything to you? :-) Clearly your work isn't ready for python-dev -- it is just too speculative. I've moved python-dev to BCC and added python-ideas. It possibly doesn't even belong on python-ideas -- if you are serious about wanting to change Linux or other *NIX variants, you'll have to go find a venue where people who do forward-looking kernel work hang out. Finally, I'm not sure why you are so confrontational about the way Twisted and Tulip do things. We are doing things the only way they *can* be done without overhauling the entire CPython implementation (which you have proven will take several major release cycles, probably until 4.0). It's fine that you are looking further forward than most of us. I don't think it makes sense that you are blaming the rest of us for writing libraries that can be used today. 
On Sat, Nov 16, 2013 at 10:13 AM, Trent Nelson wrote: > Hi folks, > > Video of the presentation I gave last weekend at PyData NYC > regarding PyParallel just went live: https://vimeo.com/79539317 > > Slides are here: > https://speakerdeck.com/trent/pyparallel-how-we-removed-the-gil-and-exploited-all-cores-1 > > The work was driven by the async I/O discussions around this time > last year on python-ideas. That resulted in me sending this: > > > http://markmail.org/thread/kh3qgjbydvxt3exw#query:+page:1+mid:arua62vllzugjy2v+state:results > > ....where I attempted to argue that there was a better way of > doing async I/O on Windows than the status quo of single-threaded, > non-blocking I/O with an event multiplex syscall. > > I wasn't successful in convincing anyone at the time; I had no code > to back it up and I didn't articulate my plans for GIL removal at > the time either (figuring the initial suggestion would be met with > enough scepticism as is). > > So, in the video above, I spend a lot of time detailing how IOCP > works on Windows, how it presents us with a better environment than > UNIX for doing asynchronous I/O, and how it paired nicely with the > other work I did on coming up with a way for multiple threads to > execute simultaneously across all cores without introducing any > speed penalties. > > I'm particularly interested to hear if the video/slides helped > UNIX-centric people gain a better understanding of how Windows does > IOCP and why it would be preferable when doing async I/O. > > The reverse is also true: if you still think single-threaded, non- > blocking synchronous I/O via kqueue/epoll is better than the > approach afforded by IOCP, I'm interested in hearing why. > > As crazy as it sounds, my long term goal would be to try and > influence Linux and BSD kernels to implement thread-agnostic I/O > support such that an IOCP-like mechanism could be exposed; Solaris > and AIX already do this via event ports and AIX's verbatim copy of > Windows' IOCP API. > > (There is some promising work already being done on Linux; see > recent MegaPipe paper for an example.) > > Regards, > > Trent. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun Nov 17 03:27:31 2013 From: guido at python.org (Guido van Rossum) Date: Sat, 16 Nov 2013 18:27:31 -0800 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <20131116191526.16f9b79c@fsol> References: <20131116191526.16f9b79c@fsol> Message-ID: On Sat, Nov 16, 2013 at 10:15 AM, Antoine Pitrou wrote: > Alexandre Vassalotti (thanks a lot!) has recently finalized his work on > the PEP 3154 implementation - pickle protocol 4. > > I think it would be good to get the PEP and the implementation accepted > for 3.4. As far as I can say, this has been a low-controvery proposal, > and it brings fairly obvious improvements to the table (which table?). > I still need some kind of BDFL or BDFL delegate to do that, though -- > unless I am allowed to mark my own PEP accepted :-) > > (I've asked Tim, specifically, for comments, since he contributed a lot > to previous versions of the pickle protocol.) 
> > The PEP is at http://www.python.org/dev/peps/pep-3154/ (should be > rebuilt soon by the server, I'd say) > > Alexandre's implementation is tracked at > http://bugs.python.org/issue17810 > Assuming Tim doesn't object (hi Tim!) I think this PEP is fine to accept -- all the ideas sound good, and I agree with moving to support 8-byte sizes and framing. I haven't looked at the implementation but I trust you as a reviewer and the beta release process. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL:
From tjreedy at udel.edu Sun Nov 17 07:20:09 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 17 Nov 2013 01:20:09 -0500 Subject: [Python-Dev] PyParallel: alternate async I/O and GIL removal In-Reply-To: References: <20131116181328.GB3152@snakebite.org> Message-ID: On 11/16/2013 8:39 PM, Guido van Rossum wrote: > Trent, I watched your video and read your slides. I only read the slides. > (Does the word "motormouth" mean anything to you? :-) The extra background (and repetition) was helpful to me in filling in things, especially about Windows, that I could not follow in the discussion a year ago. But I noticed that it might be tedious to people already familiar with the subject. > Clearly your work isn't ready for python-dev -- it is just too > speculative. I've moved python-dev to BCC and added python-ideas. > > It possibly doesn't even belong on python-ideas -- if you are serious > about wanting to change Linux or other *NIX variants, you'll have to go > find a venue where people who do forward-looking kernel work hang out. A working and useful Windows-only PyParallel (integrated with CPython or not) might help stimulate open-source *nix improvements to match the new Windows stuff. > Finally, I'm not sure why you are so confrontational about the way > Twisted and Tulip do things. We are doing things the only way they *can* > be done without overhauling the entire CPython implementation (which you > have proven will take several major release cycles, probably until 4.0). > It's fine that you are looking further forward than most of us. I don't > think it makes sense that you are blaming the rest of us for writing > libraries that can be used today. Just reading the slides, I did not pick up as much confrontation and blame as you did. I certainly appreciate that asyncio is working *now*, and on all three major target systems. I look forward to looking at the doc when it is added. I will understand a bit better for having read Trent's slides. -- Terry Jan Reedy
From steve at pearwood.info Sun Nov 17 08:04:43 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 17 Nov 2013 18:04:43 +1100 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: <20131116164600.74e4e703@fsol> References: <20131116164600.74e4e703@fsol> Message-ID: <20131117070442.GR2085@ando> On Sat, Nov 16, 2013 at 04:46:00PM +0100, Antoine Pitrou wrote: > I agree that conflating the two doesn't help the discussion. > While removing docstrings may be beneficial on memory-constrained > devices, I can't remember a single situation where I've wanted to > remove asserts on a production system. > > (I also tend to write fewer and fewer asserts in production code, since > all of them tend to go in unit tests instead, with the help of e.g. > mock objects) I'm the opposite. I like using asserts in my code, and while I don't *typically* run it with -O, I do think it is valuable to have the opportunity to remove asserts.
I've certainly written code where -O has given a major micro-optimization of individual functions (anything up to 50% speedup). I've never measured whether that led to a significant whole-application speedup, but I assume there was some benefit. (I know, premature optimization and all that...) -- Steven
From steve at pearwood.info Sun Nov 17 08:08:14 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 17 Nov 2013 18:08:14 +1100 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: References: Message-ID: <20131117070813.GS2085@ando> On Sun, Nov 17, 2013 at 01:50:31AM +1000, Nick Coghlan wrote: > No, that's the wrong question to ask. The onus is on *you* to ask "Who > is this feature for? Do they still need it? Can we meet their needs in > a different way?". You're the one proposing to break things, so it's > up to you to make the case for why that's an OK thing to do. [...] > That's not the way this works - backwards compatibility is sacrosanct, > and it requires some seriously compelling evidence to justify a > breach. Thanks for saying this. I get frustrated by the number of times people propose removing -O (apparently) just because they personally don't use it. I personally have never got any benefit from the tarfile or multiprocessing modules, but I don't ask for them to be removed :-) -- Steven
From steve at pearwood.info Sun Nov 17 08:42:25 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 17 Nov 2013 18:42:25 +1100 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: <4E8F1FF1-E22F-4746-9354-36E55ED9A8B6@stufft.io> References: <20131116164600.74e4e703@fsol> <4E8F1FF1-E22F-4746-9354-36E55ED9A8B6@stufft.io> Message-ID: <20131117074225.GU2085@ando> On Sat, Nov 16, 2013 at 11:16:48AM -0500, Donald Stufft wrote: > Personally I think that none of the -O* should be removing asserts. It feels > like a foot gun to me. I've seen more than one codebase that would be > completely broken under -O* because they used asserts without even knowing > -O* existed. I agree that many people misuse asserts. I agree that this is a problem for them. But I disagree that the solution is to remove what I consider to be a useful feature, and a common one in other languages, just to protect people from their broken code. I prefer to see better efforts to educate them. To that end, I've taken the liberty of sending a post to python-list at python.org describing when to use asserts, and when not to use them: https://mail.python.org/pipermail/python-list/2013-November/660401.html I will continue to use my best effort to encourage good use of assertions in Python. -- Steven
From solipsis at pitrou.net Sun Nov 17 11:00:50 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 17 Nov 2013 11:00:50 +0100 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) References: <20131116164600.74e4e703@fsol> <20131117070442.GR2085@ando> Message-ID: <20131117110050.7f420587@fsol> On Sun, 17 Nov 2013 18:04:43 +1100 Steven D'Aprano wrote: > On Sat, Nov 16, 2013 at 04:46:00PM +0100, Antoine Pitrou wrote: > > > I agree that conflating the two doesn't help the discussion. > > While removing docstrings may be beneficial on memory-constrained > > devices, I can't remember a single situation where I've wanted to > > remove asserts on a production system. > > > > (I also tend to write fewer and fewer asserts in production code, since > > all of them tend to go in unit tests instead, with the help of e.g.
> > mock objects) > > I'm the opposite. I like using asserts in my code, and while I don't > *typically* run it with -O, I do think it is valuable to have the > opportunity to remove asserts. I've certainly written code where -O has > given a major micro-optimization of individual functions (anything up to > 50% speedup). I've never measured whether that led to a significant > whole-application speedup, but I assume there was some benefit. (I know, > premature optimization and all that...) So you assumed there was some benefit, but you never used it anyway? Regards Antoine.
From steve at pearwood.info Sun Nov 17 11:27:24 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 17 Nov 2013 21:27:24 +1100 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: <20131117110050.7f420587@fsol> References: <20131116164600.74e4e703@fsol> <20131117070442.GR2085@ando> <20131117110050.7f420587@fsol> Message-ID: <20131117102723.GV2085@ando> On Sun, Nov 17, 2013 at 11:00:50AM +0100, Antoine Pitrou wrote: > On Sun, 17 Nov 2013 18:04:43 +1100 > Steven D'Aprano wrote: > > On Sat, Nov 16, 2013 at 04:46:00PM +0100, Antoine Pitrou wrote: > > > > > I agree that conflating the two doesn't help the discussion. > > > While removing docstrings may be beneficial on memory-constrained > > > devices, I can't remember a single situation where I've wanted to > > > remove asserts on a production system. > > > > > > (I also tend to write fewer and fewer asserts in production code, since > > > all of them tend to go in unit tests instead, with the help of e.g. > > > mock objects) > > > > I'm the opposite. I like using asserts in my code, and while I don't > > *typically* run it with -O, I do think it is valuable to have the > > opportunity to remove asserts. I've certainly written code where -O has > > given a major micro-optimization of individual functions (anything up to > > 50% speedup). I've never measured whether that led to a significant > > whole-application speedup, but I assume there was some benefit. (I know, > > premature optimization and all that...) > > So you assumed there was some benefit, but you never used it anyway? I had a demonstrable non-trivial speedup when timing individual functions. At the time I considered that "good enough". If you want to call that "premature optimization", I can't entirely disagree, but can you honestly say you've never done the same? Optimize a function to make it run faster, even if you have no proof that it was a bottleneck in your application? -- Steven
From solipsis at pitrou.net Sun Nov 17 11:35:21 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 17 Nov 2013 11:35:21 +0100 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) References: <20131116164600.74e4e703@fsol> <20131117070442.GR2085@ando> <20131117110050.7f420587@fsol> <20131117102723.GV2085@ando> Message-ID: <20131117113521.2ee4db30@fsol> On Sun, 17 Nov 2013 21:27:24 +1100 Steven D'Aprano wrote: > On Sun, Nov 17, 2013 at 11:00:50AM +0100, Antoine Pitrou wrote: > > On Sun, 17 Nov 2013 18:04:43 +1100 > > Steven D'Aprano wrote: > > > On Sat, Nov 16, 2013 at 04:46:00PM +0100, Antoine Pitrou wrote: > > > > > > > I agree that conflating the two doesn't help the discussion. > > > > While removing docstrings may be beneficial on memory-constrained > > > > devices, I can't remember a single situation where I've wanted to > > > > remove asserts on a production system.
> > > > > > > > (I also tend to write fewer and fewer asserts in production code, since > > > > all of them tend to go in unit tests instead, with the help of e.g. > > > > mock objects) > > > > > > I'm the opposite. I like using asserts in my code, and while I don't > > > *typically* run it with -O, I do think it is valuable to have the > > > opportunity to remove asserts. I've certainly written code where -O has > > > given a major micro-optimization of individual functions (anything up to > > > 50% speedup). I've never measured whether that led to a significant > > > whole-application speedup, but I assume there was some benefit. (I know, > > > premature optimization and all that...) > > > > So you assumed there was some benefit, but you never used it anyway? > > I had a demonstrable non-trivial speedup when timing individual > functions. At the time I considered that "good enough". If you want to > call that "premature optimization", I can't entirely disagree, but can > you honestly say you've never done the same? You didn't answer my question: did you actually use -OO in production, or not? Saying that -OO could have helped you optimize something you didn't care about isn't a very strong argument for -OO :) What I would like to know is if people *knowingly* add costly asserts to performance-critical code, with the intent of disabling them at runtime using -OO. (in Python, that is, I know people do that in C where the programming culture and tools are different) Regards Antoine.
From steve at pearwood.info Sun Nov 17 12:27:12 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 17 Nov 2013 22:27:12 +1100 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: <20131117113521.2ee4db30@fsol> References: <20131116164600.74e4e703@fsol> <20131117070442.GR2085@ando> <20131117110050.7f420587@fsol> <20131117102723.GV2085@ando> <20131117113521.2ee4db30@fsol> Message-ID: <20131117112711.GW2085@ando> On Sun, Nov 17, 2013 at 11:35:21AM +0100, Antoine Pitrou wrote: > You didn't answer my question: did you actually use -OO in production, > or not? Saying that -OO could have helped you optimize something you > didn't care about isn't a very strong argument for -OO :) Ah, sorry, I misunderstood your question. I didn't use -OO, since I had no reason to remove docstrings. But I did use -O to remove asserts. I was talking about assertions, not docstrings. Since I haven't had to write code for embedded devices with severely constrained memory, I've never cared about using -OO in production. > What I would like to know is if people *knowingly* add costly asserts > to performance-critical code, with the intent of disabling them at > runtime using -OO. Yes, I have knowingly added costly asserts to code with the intent of disabling them at runtime. Was it *performance-critical* code? I don't know, that was the point of my earlier rambling -- I could demonstrate a speedup of the individual functions in benchmarks, but nobody spent the effort to determine which functions were performance critical.
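A generic reconstruction of the kind of function in question (illustrative only -- not the actual code):

    def merge_sorted(a, b):
        """Merge two already-sorted lists into a single sorted list."""
        # Expensive preconditions: two O(n) scans guarding an O(n) merge.
        # Under "python -O" both asserts are compiled out, roughly halving
        # the work the function does on each call.
        assert all(a[i] <= a[i + 1] for i in range(len(a) - 1)), "a not sorted"
        assert all(b[i] <= b[i + 1] for i in range(len(b) - 1)), "b not sorted"
        out, i, j = [], 0, 0
        while i < len(a) and j < len(b):
            if a[i] <= b[j]:
                out.append(a[i])
                i += 1
            else:
                out.append(b[j])
                j += 1
        out.extend(a[i:])
        out.extend(b[j:])
        return out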
-- Steven From chambon.pascal at wanadoo.fr Sun Nov 17 14:02:37 2013 From: chambon.pascal at wanadoo.fr (Pascal Chambon) Date: Sun, 17 Nov 2013 14:02:37 +0100 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: <20131117112711.GW2085@ando> References: <20131116164600.74e4e703@fsol> <20131117070442.GR2085@ando> <20131117110050.7f420587@fsol> <20131117102723.GV2085@ando> <20131117113521.2ee4db30@fsol> <20131117112711.GW2085@ando> Message-ID: <5288BE6D.1030002@wanadoo.fr> Le 17/11/2013 12:27, Steven D'Aprano a ?crit : >> What I would like to know is if people *knowingly* add costly asserts >> to performance-critical code, with the intent of disabling them at >> runtime using -OO. > Yes, I have knowingly added costly asserts to code with the intend of > disabling them at runtime. Was it *performance-critical* code? I don't > know, that was the point of my earlier rambling -- I could demonstrate a > speedup of the individual functions in benchmarks, but nobody spent the > effort to determine which functions were performance critical. Hi, my 2 cents: asserts have been of a great help in the robustness of our provisioning framework, these are like tests embedded in code, to *consistently* check what would be VERY hard to test from the outside, from unit-tests. It makes us gain much time when we develop, because asserts (often used for "method contract checking") immediately break stuffs if we make dumb programming errors, like giving the wrong type of variable as parameter etc. (if you send a string instead of a list of strings to a method, it could take a while before the errors gets noticed, since their behaviour is quite close) We also add asserts with very expensive operations (like fully checking the proper synchronization of our DBs with the mockups of remote partners, after each provisioning command treated), so that we don't need to call something like that after every line of unit-test we write. In production, we then make sure we use -O flag to avoid doubling our treatments times and traffic. regards, Pascal From eliben at gmail.com Sun Nov 17 14:58:08 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 17 Nov 2013 05:58:08 -0800 Subject: [Python-Dev] Mixed up core/module source file locations in CPython In-Reply-To: References: Message-ID: On Sat, Nov 16, 2013 at 3:44 PM, Brett Cannon wrote: > > > > On Sat, Nov 16, 2013 at 1:40 PM, Eric Snow wrote: > >> If you look at the Python and Modules directories in the cpython repo, >> you'll find modules in Python/ and core files (like python.c and >> main.c) in Modules/. (It's like parking on a driveway and driving on >> a parkway. ) It's not that big a deal and not that hard to >> figure out (so I'm fine with the status quo), but it is a bit >> surprising. When I was first getting familiar with the code base a >> few years ago (as a C non-expert), it was a not insignificant but not >> major stumbling block. >> >> The situation is mostly a consequence of history, if I understand >> correctly. The subject has come up before and I don't recall any >> objections to doing something about it. I haven't had the time to >> track down those earlier discussions, though I remember Benjamin >> having some comment about it. >> >> Would it be too disruptive (churn, etc.) to clean this up in 3.5? I >> see it similarly to when I moved a light switch from outside my >> bathroom to inside. For a while, but not that long, I kept >> unconsciously reaching for the switch that was no longer there on the >> outside. 
Regardless I'm glad I did it. Likewise, moving the handful >> of files around is a relatively inconsequential change that would make >> the project just a little less surprising, particularly for new >> contributors. >> >> -eric >> >> p.s. Either way I'll probably take some time (it shouldn't take long) >> after the PEP 451 implementation is done to put together a patch that >> moves the files around, just to see what difference it makes. >> > > I personally think it would be a good idea to re-arrange the files to make > things more beginner-friendly. I believe Nick was also talking about > renaming directories, etc. at some point. > If we're concerned with the beginner-friendliness of our source layout, I'll have to mention that I have a full ASDL parser lying around that's written in Python 3.4 (enums!) without using Spark. So that's one less custom tool to carry around with Python, less files and less LOCs in general. Just sayin' ;-) Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Sun Nov 17 15:20:09 2013 From: brett at python.org (Brett Cannon) Date: Sun, 17 Nov 2013 09:20:09 -0500 Subject: [Python-Dev] Mixed up core/module source file locations in CPython In-Reply-To: References: Message-ID: On Nov 17, 2013 8:58 AM, "Eli Bendersky" wrote: > > > > > On Sat, Nov 16, 2013 at 3:44 PM, Brett Cannon wrote: >> >> >> >> >> On Sat, Nov 16, 2013 at 1:40 PM, Eric Snow wrote: >>> >>> If you look at the Python and Modules directories in the cpython repo, >>> you'll find modules in Python/ and core files (like python.c and >>> main.c) in Modules/. (It's like parking on a driveway and driving on >>> a parkway. ) It's not that big a deal and not that hard to >>> figure out (so I'm fine with the status quo), but it is a bit >>> surprising. When I was first getting familiar with the code base a >>> few years ago (as a C non-expert), it was a not insignificant but not >>> major stumbling block. >>> >>> The situation is mostly a consequence of history, if I understand >>> correctly. The subject has come up before and I don't recall any >>> objections to doing something about it. I haven't had the time to >>> track down those earlier discussions, though I remember Benjamin >>> having some comment about it. >>> >>> Would it be too disruptive (churn, etc.) to clean this up in 3.5? I >>> see it similarly to when I moved a light switch from outside my >>> bathroom to inside. For a while, but not that long, I kept >>> unconsciously reaching for the switch that was no longer there on the >>> outside. Regardless I'm glad I did it. Likewise, moving the handful >>> of files around is a relatively inconsequential change that would make >>> the project just a little less surprising, particularly for new >>> contributors. >>> >>> -eric >>> >>> p.s. Either way I'll probably take some time (it shouldn't take long) >>> after the PEP 451 implementation is done to put together a patch that >>> moves the files around, just to see what difference it makes. >> >> >> I personally think it would be a good idea to re-arrange the files to make things more beginner-friendly. I believe Nick was also talking about renaming directories, etc. at some point. > > > If we're concerned with the beginner-friendliness of our source layout, I'll have to mention that I have a full ASDL parser lying around that's written in Python 3.4 (enums!) without using Spark. So that's one less custom tool to carry around with Python, less files and less LOCs in general. 
Just sayin' ;-) Then stop saying and check it in (or at least open a big for it). :) > > Eli > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Sun Nov 17 15:23:20 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 17 Nov 2013 06:23:20 -0800 Subject: [Python-Dev] Mixed up core/module source file locations in CPython In-Reply-To: References: Message-ID: On Sun, Nov 17, 2013 at 6:20 AM, Brett Cannon wrote: > > On Nov 17, 2013 8:58 AM, "Eli Bendersky" wrote: > > > > > > > > > > On Sat, Nov 16, 2013 at 3:44 PM, Brett Cannon wrote: > >> > >> > >> > >> > >> On Sat, Nov 16, 2013 at 1:40 PM, Eric Snow > wrote: > >>> > >>> If you look at the Python and Modules directories in the cpython repo, > >>> you'll find modules in Python/ and core files (like python.c and > >>> main.c) in Modules/. (It's like parking on a driveway and driving on > >>> a parkway. ) It's not that big a deal and not that hard to > >>> figure out (so I'm fine with the status quo), but it is a bit > >>> surprising. When I was first getting familiar with the code base a > >>> few years ago (as a C non-expert), it was a not insignificant but not > >>> major stumbling block. > >>> > >>> The situation is mostly a consequence of history, if I understand > >>> correctly. The subject has come up before and I don't recall any > >>> objections to doing something about it. I haven't had the time to > >>> track down those earlier discussions, though I remember Benjamin > >>> having some comment about it. > >>> > >>> Would it be too disruptive (churn, etc.) to clean this up in 3.5? I > >>> see it similarly to when I moved a light switch from outside my > >>> bathroom to inside. For a while, but not that long, I kept > >>> unconsciously reaching for the switch that was no longer there on the > >>> outside. Regardless I'm glad I did it. Likewise, moving the handful > >>> of files around is a relatively inconsequential change that would make > >>> the project just a little less surprising, particularly for new > >>> contributors. > >>> > >>> -eric > >>> > >>> p.s. Either way I'll probably take some time (it shouldn't take long) > >>> after the PEP 451 implementation is done to put together a patch that > >>> moves the files around, just to see what difference it makes. > >> > >> > >> I personally think it would be a good idea to re-arrange the files to > make things more beginner-friendly. I believe Nick was also talking about > renaming directories, etc. at some point. > > > > > > If we're concerned with the beginner-friendliness of our source layout, > I'll have to mention that I have a full ASDL parser lying around that's > written in Python 3.4 (enums!) without using Spark. So that's one less > custom tool to carry around with Python, less files and less LOCs in > general. Just sayin' ;-) > > Then stop saying and check it in (or at least open a big for it). :) > I'll clean it up and will open an issue. This shouldn't be a problem w.r.t. the beta freeze since it's not a user-visible thing, and its validation is trivial given that we can compare the output files produced. Eli -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From victor.stinner at gmail.com  Sun Nov 17 17:14:38 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 17 Nov 2013 17:14:38 +0100
Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py)
In-Reply-To: 
References: 
Message-ID: 

2013/11/16 Maciej Fijalkowski :
> Can I see some writeup how -OO benefit embedded devices?

You get smaller .pyc files. In an embedded device, the whole OS may be
written in a small memory, something like 64 MB or smaller. Removing
docstrings helps to fit in 64 MB.

I don't know if dropping "assert" statements is required on embedded
devices.

Victor

From guido at python.org  Sun Nov 17 17:18:48 2013
From: guido at python.org (Guido van Rossum)
Date: Sun, 17 Nov 2013 08:18:48 -0800
Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py)
In-Reply-To: <20131117074225.GU2085@ando>
References: <20131116164600.74e4e703@fsol>
	<4E8F1FF1-E22F-4746-9354-36E55ED9A8B6@stufft.io>
	<20131117074225.GU2085@ando>
Message-ID: 

On Saturday, November 16, 2013, Steven D'Aprano wrote:
> On Sat, Nov 16, 2013 at 11:16:48AM -0500, Donald Stufft wrote:
> > Personally I think that none of the -O* should be removing asserts. It
> > feels like a foot gun to me. I've seen more than one codebase that would
> > be completely broken under -O* because they used asserts without even
> > knowing -O* existed.
>
> I agree that many people misuse asserts. I agree that this is a problem
> for them. But I disagree that the solution is to remove what I consider
> to be a useful feature, and a common one in other languages, just to
> protect people from their broken code.
>
> I prefer to see better efforts to educate them. To that end, I've taken
> the liberty of sending a post to python-list at python.org describing when
> to use asserts, and when not to use them:
>
> https://mail.python.org/pipermail/python-list/2013-November/660401.html

Nice post!

> I will continue to use my best effort to encourage good use of
> assertions in Python.

Thank you! --Guido

>
> --
--Guido van Rossum (on iPad)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From apieum at gmail.com  Sun Nov 17 14:08:57 2013
From: apieum at gmail.com (Gregory Salvan)
Date: Sun, 17 Nov 2013 14:08:57 +0100
Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py)
In-Reply-To: <20131117112711.GW2085@ando>
References: <20131116164600.74e4e703@fsol>
	<20131117070442.GR2085@ando> <20131117110050.7f420587@fsol>
	<20131117102723.GV2085@ando> <20131117113521.2ee4db30@fsol>
	<20131117112711.GW2085@ando>
Message-ID: 

I believe the point of removing assertions is also to avoid throwing
unhandled developer errors at the end user, and not only "performance".
It's like a "raise" without a "try" block. That's certainly because I
consider "assert" a developer utility, providing concrete documentation
about method signatures.

With this in mind I've made a small benchmark to determine whether it is
worthwhile for an assertion lib to have null functions as default.
https://github.com/apieum/BenchAssert

I don't write performance-critical applications, but I still doubt the
real value of keeping dead code around as a way to document better.

2013/11/17 Steven D'Aprano 

> On Sun, Nov 17, 2013 at 11:35:21AM +0100, Antoine Pitrou wrote:
>
> > You didn't answer my question: did you actually use -OO in production,
> > or not? Saying that -OO could have helped you optimize something you
> > didn't care about isn't a very strong argument for -OO :)
>
> Ah, sorry, I misunderstood your question.
> > I didn't use -OO, since I had no reason to remove docstrings. But I did > use -O to remove asserts. I was talking about assertions, not > docstrings. Since I haven't had to write code for embedded devices with > severely constrained memory, I've never cared about using -OO in > production. > > > > What I would like to know is if people *knowingly* add costly asserts > > to performance-critical code, with the intent of disabling them at > > runtime using -OO. > > Yes, I have knowingly added costly asserts to code with the intend of > disabling them at runtime. Was it *performance-critical* code? I don't > know, that was the point of my earlier rambling -- I could demonstrate a > speedup of the individual functions in benchmarks, but nobody spent the > effort to determine which functions were performance critical. > > > > -- > Steven > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/apieum%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdmurray at bitdance.com Sun Nov 17 18:50:13 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Sun, 17 Nov 2013 12:50:13 -0500 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: References: Message-ID: <20131117175014.3CA3A250BBA@webabinitio.net> On Sun, 17 Nov 2013 17:14:38 +0100, Victor Stinner wrote: > 2013/11/16 Maciej Fijalkowski : > > Can I see some writeup how -OO benefit embedded devices? > > You get smaller .pyc files. In an embedded device, the whole OS may be > written in a small memory, something like 64 MB or smaller. Removing > doctrings help to fit in 64 MB. > > I don't know if dropping "assert" statements is required on embedded device. I've worked on a project (mobile platform) where both of these were true. Did we gain much by dropping asserts? As with Steve's case, no one bothered to measure, but it was a no-brainer win so we did it. Especially since we really needed to kill the docstrings to save memory. (Killing linecache was an even bigger win on the memory side). I unfortunately don't remember how much difference removing docstrings made in the process size, but I'm pretty sure it was more than a MB, and on a device with no swap in a process that must be running all the time, even 1MB makes a significant difference. On the assert side, if there hadn't been an option to remove the asserts, I'm sure that the call would have come down to manually remove all the asserts for production, which would have been a pain. In that case they *might* have measured, but since you'd have to do the removal work to do the measurement, they'd probably not have bothered. I'd say that if someone wants to drop assert removal, they'd need to prove that it *didn't* result in a speedup, which would be pretty hard considering that the checks covered by assert/if __debug__ can be arbitrarily complex :) But...yes, it would have been nice to have been able to remove docstrings and asserts *separately*. We might have measured the delta and decided to keep them in the beta in that case, since the asserts did catch some bugs. 
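For reference, the effects being weighed here are easy to observe at
runtime. A small probe (illustrative, not code from this thread) shows
what each level strips in CPython:

    import sys

    def probe():
        """A docstring that -OO strips from the .pyc."""
        assert True, "an assert that both -O and -OO remove"

    # Under plain python:  sys.flags.optimize == 0, __debug__ is True,
    #                      probe.__doc__ is the string above.
    # Under python -O:     optimize == 1, __debug__ is False, and
    #                      asserts are compiled away.
    # Under python -OO:    optimize == 2, and probe.__doc__ is None too.
    print(sys.flags.optimize, __debug__, probe.__doc__)

As David notes, the two effects are bundled: -O strips asserts alone,
but there is no flag that strips docstrings without also dropping
asserts, short of post-processing the .pyc files.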
--David From barry at python.org Sun Nov 17 20:02:31 2013 From: barry at python.org (Barry Warsaw) Date: Sun, 17 Nov 2013 14:02:31 -0500 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: References: Message-ID: <20131117140231.6e3234ac@anarchist> On Nov 17, 2013, at 05:14 PM, Victor Stinner wrote: >2013/11/16 Maciej Fijalkowski : >> Can I see some writeup how -OO benefit embedded devices? > >You get smaller .pyc files. In an embedded device, the whole OS may be >written in a small memory, something like 64 MB or smaller. Removing >doctrings help to fit in 64 MB. I'm in support of separate flags for stripping docstrings and asserts. I'd even be fine with eliminating a flag to strip docstrings if we had a post-processing tool that you could apply to pyc files to strip out the docstrings. Another problem that I had while addressing these options in Debian was the use of .pyo for both -O and -OO level. -Barry From fijall at gmail.com Sun Nov 17 22:05:07 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 17 Nov 2013 23:05:07 +0200 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: <20131117140231.6e3234ac@anarchist> References: <20131117140231.6e3234ac@anarchist> Message-ID: On Sun, Nov 17, 2013 at 9:02 PM, Barry Warsaw wrote: > On Nov 17, 2013, at 05:14 PM, Victor Stinner wrote: > >>2013/11/16 Maciej Fijalkowski : >>> Can I see some writeup how -OO benefit embedded devices? >> >>You get smaller .pyc files. In an embedded device, the whole OS may be >>written in a small memory, something like 64 MB or smaller. Removing >>doctrings help to fit in 64 MB. > > I'm in support of separate flags for stripping docstrings and asserts. I'd > even be fine with eliminating a flag to strip docstrings if we had a > post-processing tool that you could apply to pyc files to strip out the > docstrings. Another problem that I had while addressing these options in > Debian was the use of .pyo for both -O and -OO level. > > -Barry > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/fijall%40gmail.com My problem with -O and -OO is that their arguments are very circular. Indeed, I understand the need why you would want in certain and limited cases to remove both docstrings and asserts. So some options for doing so are ok. But a lot of arguments I see are along the lines of "don't use asserts because -O removes them". If the option was named --remove-asserts, noone would care, but people care since -O is documented as "do optimizations" and people *assume* this is what it does (makes code faster) and as unintended consequence removes asserts. Cheers, fijal From greg.ewing at canterbury.ac.nz Sun Nov 17 23:05:18 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 18 Nov 2013 11:05:18 +1300 Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py) In-Reply-To: References: <20131116164600.74e4e703@fsol> <20131117070442.GR2085@ando> <20131117110050.7f420587@fsol> <20131117102723.GV2085@ando> <20131117113521.2ee4db30@fsol> <20131117112711.GW2085@ando> Message-ID: <52893D9E.5010307@canterbury.ac.nz> Gregory Salvan wrote: > I believe the point of removing assertions is also to avoid throwing > unhandled developper errors to end-user I'm not sure I buy that. An assert is saying that something should never happen. 
If it does happen, either it's going to lead to an exception further
down the track, or produce incorrect results. In the first case,
you've failed to shield the user from exceptions; in the second,
better to log an exception than silently pretend nothing has gone
wrong.

--
Greg

From guido at python.org  Sun Nov 17 23:05:44 2013
From: guido at python.org (Guido van Rossum)
Date: Sun, 17 Nov 2013 14:05:44 -0800
Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py)
In-Reply-To: 
References: <20131117140231.6e3234ac@anarchist>
Message-ID: 

On Sun, Nov 17, 2013 at 1:05 PM, Maciej Fijalkowski wrote:
> My problem with -O and -OO is that their arguments are very circular.
> Indeed, I understand the need why you would want in certain and
> limited cases to remove both docstrings and asserts. So some options
> for doing so are ok. But a lot of arguments I see are along the lines
> of "don't use asserts because -O removes them". If the option was
> named --remove-asserts, noone would care, but people care since -O is
> documented as "do optimizations" and people *assume* this is what it
> does (makes code faster) and as unintended consequence removes
> asserts.
>

It's circular indeed (and not just the shape of the letter :-). I expect
that if there was a separate --remove-asserts option, people would write
essential asserts and just claim "don't turn off asserts when using this
code". But that's just wrong because such an option (whether named -O or
--remove-asserts) is a global choice not under control of the author of
the program.

The correct rule should be "don't use assert (the statement) to check
for valid user input" and the stated reason should be that the assert
statement was *designed* to be disabled globally, not to be a shorthand
for "if not X: raise (mumble) Y". A corollary should also be that
unittests should not use the assert statement; some frameworks sadly
encourage the anti-pattern of using it in tests.

(There are valid reasons to want to run unittests with -O or -OO: for
example, to validate that the code does not depend on side effects of
assert statements. If there was a --remove-asserts option, running
unittests with that option would be similarly useful.)

That said I think our man page and --help output could be more
forthcoming about this implication of "optimize". (The language
reference is quite clear about it.)

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From brian at python.org  Mon Nov 18 03:31:13 2013
From: brian at python.org (Brian Curtin)
Date: Sun, 17 Nov 2013 20:31:13 -0600
Subject: [Python-Dev] NTPath or WindowsPath?
In-Reply-To: 
References: <20131116201503.3c4209dc@fsol>
Message-ID: 

On Sat, Nov 16, 2013 at 2:50 PM, Serhiy Storchaka wrote:
> 16.11.13 21:15, Antoine Pitrou wrote:
>
>> In a (private) discussion about PEP 428 and pathlib, Guido proposed
>> that maybe NTPath should be renamed to WindowsPath, since the name is
>> more likely to stay relevant in the middle term. What do you think?
>
>
> What about nturl2path, os.name, sysconfig.get_scheme_names()?

What about them?
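For background to the naming question, a quick probe (hypothetical, not
code from this thread) of how a Windows build of CPython answers "where
am I running?" in the places being discussed:

    import os
    import platform
    import sys

    # On Windows these report, respectively:
    #   os.name           -> 'nt'       (the spelling NTPath echoes)
    #   sys.platform      -> 'win32'    (even on 64-bit Windows)
    #   platform.system() -> 'Windows'  (the spelling proposed for pathlib)
    print(os.name, sys.platform, platform.system())

The rename under discussion affects only the pathlib class name; the
values above are the backward-compatibility constraints raised in the
follow-ups.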
From zuo at chopin.edu.pl  Mon Nov 18 03:34:50 2013
From: zuo at chopin.edu.pl (Jan Kaliszewski)
Date: Mon, 18 Nov 2013 03:34:50 +0100
Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py)
In-Reply-To: 
References: <20131117140231.6e3234ac@anarchist>
Message-ID: <22ece9dc999b33d36584b88937315626@chopin.edu.pl>

17.11.2013 23:05, Guido van Rossum wrote:

> The correct rule should be "don't use assert (the statement) to check
> for valid user input" and the stated reason should be that the assert
> statement was *designed* to be disabled globally, not to be a
> shorthand for "if not X: raise (mumble) Y". A corollary should also be
> that unittests should not use the assert statement; some frameworks
> sadly encourage the anti-pattern of using it in tests.

My problem with -O (and -OO) is that even though my code is valid (in
terms of the rule 'use assert only for should-never-happen cases') I
have no control over 3rd party library code: I can never know whether
it breaks if I turn -O or -OO on (as long as I do not analyze carefully
the code of the libraries I use, including writing regression tests
[for 3rd party code]...).

Wouldn't it be useful to add the possibility of placing an
"optimisation cookie" (syntactically analogous to an encoding cookie)
at the beginning of each of my source files (when I know they are
"-O"-safe), e.g.:

# -*- opt: asserts -*-

or even combined with an encoding cookie:

# -*- coding: utf-8; opt: asserts, docstrings -*-

Then:

* The -O flag would be effectively applied *only* to a file containing
such a cookie and *exactly* according to the cookie content (whether
asserts, whether docstrings...).

* Running without -O/-OO would mean ignoring optimisation cookies.

* The -OO flag would mean removing both asserts and docstrings (i.e.
the status quo of -OO).

* Fine-grained explicit command line flags such as --remove-asserts
and --remove-docstrings could also be useful.

(Of course, the '-*-' fragments in the above examples are purely
conventional; the actual regex would not include them, as it does not
include them now for encoding cookies.)

Cheers.
*j

From barry at python.org  Mon Nov 18 03:50:37 2013
From: barry at python.org (Barry Warsaw)
Date: Sun, 17 Nov 2013 21:50:37 -0500
Subject: [Python-Dev] (#19562) Asserts in Python stdlib code (datetime.py)
In-Reply-To: 
References: <20131117140231.6e3234ac@anarchist>
Message-ID: <20131117215037.6c72061f@anarchist>

On Nov 17, 2013, at 11:05 PM, Maciej Fijalkowski wrote:

>My problem with -O and -OO is that their arguments are very circular.
>Indeed, I understand the need why you would want in certain and
>limited cases to remove both docstrings and asserts. So some options
>for doing so are ok. But a lot of arguments I see are along the lines
>of "don't use asserts because -O removes them". If the option was
>named --remove-asserts, noone would care, but people care since -O is
>documented as "do optimizations" and people *assume* this is what it
>does (makes code faster) and as unintended consequence removes
>asserts.

The reason I want to split them is more because I'd like an option to
remove docstrings but not asserts. As others have also said, I use
asserts for conditions that can't possibly happen (or so I think).
E.g. using an if/elif on an enumeration where the else should never be
hit. Or saying that after some calculation, this variable can never be
None. In a sense, assertions are executable documentation about the
state of behavior at a particular line of code.
If an assert ever *does* get triggered, it means you didn't understand
what you wrote.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From tim.peters at gmail.com  Mon Nov 18 06:53:09 2013
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 17 Nov 2013 23:53:09 -0600
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <20131116191526.16f9b79c@fsol>
References: <20131116191526.16f9b79c@fsol>
Message-ID: 

[Antoine Pitrou]
> Alexandre Vassalotti (thanks a lot!) has recently finalized his work on
> the PEP 3154 implementation - pickle protocol 4.
>
> I think it would be good to get the PEP and the implementation accepted
> for 3.4. As far as I can say, this has been a low-controversy proposal,
> and it brings fairly obvious improvements to the table (which table?).
> I still need some kind of BDFL or BDFL delegate to do that, though --
> unless I am allowed to mark my own PEP accepted :-)

Try it and see whether anyone complains ;-)

I like it. I didn't review the code, but the PEP addresses real
issues, and the solutions look good on paper ;-)

One thing I wonder about: I don't know that non-seekable pickle
streams are important use cases, but am willing to be told that they
are. In which case, framing is a great idea. But I wonder why it isn't
done with a new framing opcode instead (say, FRAME followed by 8-byte
count). I suppose that would be like the "prefetch" idea, except that
framing opcodes would be mandatory (instead of optional) in proto 4.
Why I initially like that:

- Uniform decoding loop ("the next thing" _always_ starts with an opcode).

- Some easy sanity checking due to the tiny redundancy (if the byte
immediately following the current frame is not a FRAME opcode, the
pickle is corrupt; and also corrupt if a FRAME opcode is encountered
_inside_ the current frame). When slinging 8-byte counts, _some_
sanity-checking seems like a good idea ;-)

From storchaka at gmail.com  Mon Nov 18 11:08:32 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 18 Nov 2013 12:08:32 +0200
Subject: [Python-Dev] NTPath or WindowsPath?
In-Reply-To: 
References: <20131116201503.3c4209dc@fsol>
Message-ID: 

18.11.13 04:31, Brian Curtin wrote:
> On Sat, Nov 16, 2013 at 2:50 PM, Serhiy Storchaka wrote:
>> 16.11.13 21:15, Antoine Pitrou wrote:
>>
>>> In a (private) discussion about PEP 428 and pathlib, Guido proposed
>>> that maybe NTPath should be renamed to WindowsPath, since the name is
>>> more likely to stay relevant in the middle term. What do you think?
>>
>>
>> What about nturl2path, os.name, sysconfig.get_scheme_names()?
>
> What about them?

Should nturl2path be renamed to windowsurl2path? Should os.name return
"windows" on Windows, and should sysconfig support "windows" scheme name?

From mail at timgolden.me.uk  Mon Nov 18 11:18:12 2013
From: mail at timgolden.me.uk (Tim Golden)
Date: Mon, 18 Nov 2013 10:18:12 +0000
Subject: [Python-Dev] NTPath or WindowsPath?
In-Reply-To: 
References: <20131116201503.3c4209dc@fsol>
Message-ID: <5289E964.7030309@timgolden.me.uk>

On 18/11/2013 10:08, Serhiy Storchaka wrote:
> 18.11.13 04:31, Brian Curtin wrote:
>> On Sat, Nov 16, 2013 at 2:50 PM, Serhiy Storchaka
>> wrote:
>>> 16.11.13 21:15, Antoine Pitrou wrote:
>>>
>>>> In a (private) discussion about PEP 428 and pathlib, Guido proposed
>>>> that maybe NTPath should be renamed to WindowsPath, since the name is
>>>> more likely to stay relevant in the middle term. What do you think?
>>>
>>>
>>> What about nturl2path, os.name, sysconfig.get_scheme_names()?
>>
>> What about them?
>
> Should nturl2path be renamed to windowsurl2path? Should os.name return
> "windows" on Windows, and should sysconfig support "windows" scheme name?

I think the answer's fairly obviously "no" for reasons of backwards
compatibility. (Although that doesn't rule out aliases if someone felt
strongly enough about it).

I don't say that there's no case for trying to address the considerable
variety of Python & C ways of saying "I'm on Windows" with the core &
stdlib. But I do say that whoever makes a case had better be *very* sure
of the impact.

Going forwards, however, I'm happy to see new Python code using
"Windows" rather than the size-specific "Win32/64" or the dated and
over-specific "NT".

TJG

From p.f.moore at gmail.com  Mon Nov 18 11:30:05 2013
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 18 Nov 2013 10:30:05 +0000
Subject: [Python-Dev] NTPath or WindowsPath?
In-Reply-To: <5289E964.7030309@timgolden.me.uk>
References: <20131116201503.3c4209dc@fsol>
	<5289E964.7030309@timgolden.me.uk>
Message-ID: 

On 18 November 2013 10:18, Tim Golden wrote:
>> Should nturl2path be renamed to windowsurl2path? Should os.name return
>> "windows" on Windows, and should sysconfig support "windows" scheme name?
>
> I think the answer's fairly obviously "no" for reasons of backwards
> compatibility. (Although that doesn't rule out aliases if someone felt
> strongly enough about it).
>
> I don't say that there's no case for trying to address the considerable
> variety of Python & C ways of saying "I'm on Windows" with the core &
> stdlib. But I do say that whoever makes a case had better be *very* sure
> of the impact.
>
> Going forwards, however, I'm happy to see new Python code using
> "Windows" rather than the size-specific "Win32/64" or the dated and
> over-specific "NT".

Agreed on all counts.
Paul

From storchaka at gmail.com  Mon Nov 18 12:28:00 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 18 Nov 2013 13:28:00 +0200
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol>
Message-ID: 

18.11.13 07:53, Tim Peters wrote:
> - Some easy sanity checking due to the tiny redundancy (if the byte
> immediately following the current frame is not a FRAME opcode, the
> pickle is corrupt; and also corrupt if a FRAME opcode is encountered
> _inside_ the current frame).

For efficient unpickling, a FRAME opcode followed by an 8-byte count
should be the *last* thing in a frame (unless it is the last frame).

From guido at python.org  Mon Nov 18 16:48:27 2013
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Nov 2013 07:48:27 -0800
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol>
Message-ID: 

On Mon, Nov 18, 2013 at 3:28 AM, Serhiy Storchaka wrote:

> 18.11.13 07:53, Tim Peters wrote:
>
> - Some easy sanity checking due to the tiny redundancy (if the byte
>> immediately following the current frame is not a FRAME opcode, the
>> pickle is corrupt; and also corrupt if a FRAME opcode is encountered
>> _inside_ the current frame).
>>
>
> For efficient unpickling, a FRAME opcode followed by an 8-byte count
> should be the *last* thing in a frame (unless it is the last frame).
>

I don't understand that.

Clearly the framing is the weakest point of the PEP (== elicits the most
bikeshedding). I am also unsure about the value of framing when pickles
are written to strings.

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net  Mon Nov 18 17:10:15 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 18 Nov 2013 17:10:15 +0100
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
References: <20131116191526.16f9b79c@fsol>
Message-ID: <20131118171015.24cc4aba@fsol>

On Mon, 18 Nov 2013 07:48:27 -0800
Guido van Rossum wrote:
> On Mon, Nov 18, 2013 at 3:28 AM, Serhiy Storchaka wrote:
>
> > 18.11.13 07:53, Tim Peters wrote:
> >
> > - Some easy sanity checking due to the tiny redundancy (if the byte
> > >> immediately following the current frame is not a FRAME opcode, the
> > >> pickle is corrupt; and also corrupt if a FRAME opcode is encountered
> > >> _inside_ the current frame).
> > >>
> >
> > For efficient unpickling, a FRAME opcode followed by an 8-byte count
> > should be the *last* thing in a frame (unless it is the last frame).
> >
>
> I don't understand that.
>
> Clearly the framing is the weakest point of the PEP (== elicits the most
> bikeshedding). I am also unsure about the value of framing when pickles
> are written to strings.

It hasn't much value in that case, but the cost is also small (8 bytes
every 64KB, roughly).

Regards
Antoine.

From solipsis at pitrou.net  Mon Nov 18 17:14:21 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 18 Nov 2013 17:14:21 +0100
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol>
Message-ID: <20131118171421.2dc8e9a2@fsol>

On Sun, 17 Nov 2013 23:53:09 -0600
Tim Peters wrote:
>
> But I wonder why it isn't done with a new framing opcode instead (say,
> FRAME followed by 8-byte count). I suppose that would be like the
> "prefetch" idea, except that framing opcodes would be mandatory
> (instead of optional) in proto 4. Why I initially like that:
>
> - Uniform decoding loop ("the next thing" _always_ starts with an opcode).

But it's not actually uniform. A frame isn't a normal opcode, it's a
large section of bytes that contains potentially many opcodes.

The framing layer is really below the opcode layer, it makes also sense
to implement it like that.

(I also tried to implement Serhiy's PREFETCH idea, but it didn't bring
any actual simplification)

> When slinging 8-byte counts, _some_ sanity-checking seems like a good idea ;-)

I don't know. It's not much worse (for denial of service opportunities)
than a 4-byte count, which already exists in earlier protocols.

Regards
Antoine.
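To make the layering concrete, here is a minimal sketch of a reader
with framing below the opcode layer. It assumes each frame is preceded
by an 8-byte little-endian length and that no opcode spans a frame
boundary; the exact layout was still being decided at this point, so
the format here is illustrative only:

    import struct

    class FramedReader:
        # Sketch only: framing is handled below the opcode layer, so
        # the unpickling loop just calls read() and never sees frames.
        def __init__(self, fileobj):
            self.file = fileobj
            self.buffer = b''
            self.pos = 0

        def _load_frame(self):
            header = self.file.read(8)
            if len(header) < 8:
                raise EOFError('truncated frame header')
            length, = struct.unpack('<Q', header)
            # Sanity check in the spirit of Tim's remark: an absurd
            # 8-byte count fails fast instead of attempting a huge read.
            if length > 2**31:
                raise ValueError('implausible frame length: %d' % length)
            self.buffer = self.file.read(length)
            self.pos = 0

        def read(self, n):
            if self.pos == len(self.buffer):
                self._load_frame()
            data = self.buffer[self.pos:self.pos + n]
            self.pos += len(data)
            return data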
From chris.barker at noaa.gov  Mon Nov 18 18:15:32 2013
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 18 Nov 2013 09:15:32 -0800
Subject: [Python-Dev] unicode Exception messages in py2.7
In-Reply-To: <5286B540.70804@canterbury.ac.nz>
References: <20131115030855.GN2085@ando>
	<20131115032848.GA39820@cskk.homeip.net>
	<20131115054207.GO2085@ando> <5286B540.70804@canterbury.ac.nz>
Message-ID: 

Folks,

It seems changing anything about how Exception messages are handled is
off the table for 2.7, and a non-issue in 3.*

Is it worth opening an issue on this, just so we can close it as won't
fix? Or is this thread documentation enough?

NOTE about py3: I'm assuming unicode messages are handled properly, as
unicode is the default text in py3. However, this has got me thinking
about how Exception messages are handled in general. I'm still not
clear on where it tries to convert to a string -- it seems to do
different things depending on whether you are running on the console,
or using the traceback module's print_exception function. I'm thinking
that perhaps Exception messages should be as generic as possible, i.e.
allow any python object as an Exception message, and simply pass that
off to whatever wants to do something with it. So far, that seems to be
happening with the traceback.print_exception function, but I'm not sure
where that conversion is happening in the console report -- clearly
it's not using traceback.print_exception, but it is doing it before a
final simple:

print exp.message

at the console. But perhaps it should defer to that last step, i.e. out
of the Exception handling code altogether. (or maybe py3 already does
that -- in which case, never mind).

-Chris

On Fri, Nov 15, 2013 at 3:58 PM, Greg Ewing wrote:

> Armin Rigo wrote:
>
>> I figured that even using the traceback.py module and getting
>> "Exception: \u1234\u1235\u5321" is rather useless if you tried to
>> raise an exception with a message in Thai.
>>
>
> But at least it tells you that *something* went wrong,
> and points to the place in the code where it happened.
> That has to be better than pretending that nothing
> happened at all.
>
> Also, if the escaping preserves the original byte
> sequence of the message, there's a chance that someone
> will be able to figure out what the message said.
>
> --
> Greg

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

Chris.Barker at noaa.gov

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From brett at python.org Mon Nov 18 18:58:01 2013 From: brett at python.org (Brett Cannon) Date: Mon, 18 Nov 2013 12:58:01 -0500 Subject: [Python-Dev] unicode Exception messages in py2.7 In-Reply-To: References: <20131115030855.GN2085@ando> <20131115032848.GA39820@cskk.homeip.net> <20131115054207.GO2085@ando> <5286B540.70804@canterbury.ac.nz> Message-ID: On Mon, Nov 18, 2013 at 12:15 PM, Chris Barker wrote: > Folks, > > It seems changing anything about how Exception messages are handles is off > the table for 2.7, and a non-issue in 3.* > > Is it worth opening an issue on this, just so we can close it as won't > fix? Or is this thread documentation enough? > A closed bug wouldn't hurt, but I wouldn't view it as necessary. > > NOTE about py3: I'm assuming unicode messages are handled properly, as > unicode is the default text in py3. However, this has got me thinking about > how Exception messages are handled in general. I'm still not clear on where > it tries to convert to a string -- it seems to do different things > depending on whether you are running on the console, or using the traceback > module's print_exception function. I'm thinking that perhaps Exception > messages should be as generic as possible. i.e allow any python object as > an Exception message, and simple pass that off to whatever wants to > do something with it. So far, that seems to be happening with the > traceback.print_exception message, but I'm not sure where that conversion > is happening in the console report -- clearly it's not using > traceback.print_exception, but it is doing it before a final simple: > > print exp.message > > at the console. But perhaps it should defer to that last step, i.e. out of > the Exception handling code altogether. (or maybe py3 already does that -- > in which case, never mind). > > The conversion is probably happening in Exception.__str__ (don't know about the traceback module w/o looking at it; probably skipping `str(exc)` and building the string from scratch). -Brett > -Chris > > > > > > > > > > > > > -Chris > > > > > On Fri, Nov 15, 2013 at 3:58 PM, Greg Ewing wrote: > >> Armin Rigo wrote: >> >>> I figured that even using the traceback.py module and getting >>> "Exception: \u1234\u1235\u5321" is rather useless if you tried to >>> raise an exception with a message in Thai. >>> >> >> But at least it tells you that *something* went wrong, >> and points to the place in the code where it happened. >> That has to be better than pretending that nothing >> happened at all. >> >> Also, if the escaping preserves the original byte >> sequence of the message, there's a chance that someone >> will be able to figure out what the message said. >> >> -- >> Greg >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: https://mail.python.org/mailman/options/python-dev/ >> chris.barker%40noaa.gov >> > > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/brett%40python.org > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Mon Nov 18 19:07:59 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 18 Nov 2013 10:07:59 -0800 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <20131118171015.24cc4aba@fsol> References: <20131116191526.16f9b79c@fsol> <20131118171015.24cc4aba@fsol> Message-ID: On Mon, Nov 18, 2013 at 8:10 AM, Antoine Pitrou wrote: > On Mon, 18 Nov 2013 07:48:27 -0800 > Guido van Rossum wrote: > > On Mon, Nov 18, 2013 at 3:28 AM, Serhiy Storchaka >wrote: > > > > > 18.11.13 07:53, Tim Peters ???????(??): > > > > > > - Some easy sanity checking due to the tiny redundancy (if the byte > > >> immediately following the current frame is not a FRAME opcode, the > > >> pickle is corrupt; and also corrupt if a FRAME opcode is encountered > > >> _inside_ the current frame). > > >> > > > > > > For efficient unpickling a FRAME opcode followed by 8-byte count > should be > > > *last* thing in a frame (unless it is a last frame). > > > > > > > I don't understand that. > > > > Clearly the framing is the weakest point of the PEP (== elicits the most > > bikeshedding). I am also unsure about the value of framing when pickles > are > > written to strings. > > It hasn't much value in that case, but the cost is also small (8 bytes > every 64KB, roughly). > That's small if your pickle is large, but for small pickles it can add up. Still, not enough to reject the PEP. Just get Tim to agree with you, or switch to Tim's proposal. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Mon Nov 18 19:29:09 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 18 Nov 2013 19:29:09 +0100 Subject: [Python-Dev] New to the community - Introduction References: Message-ID: <20131118192909.3589ce44@fsol> On Mon, 18 Nov 2013 23:24:36 +0530 Avichal Dayal wrote: > Hello, > > I'm Avichal Dayal and I'm new to this community rather the whole open > source world. > > I've been "lurking" around for a while, reading bugs, code and asking > relevant doubts on the core-mentorship mailing list. > > Now I intend to see how the development goes on and be a part of it. > Hope to be a part of this community and contribute! Welcome, Avichal. I hope you'll find something to work on that will make you want to contribute! Regards Antoine. From solipsis at pitrou.net Mon Nov 18 19:41:54 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 18 Nov 2013 19:41:54 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> Message-ID: <20131118194154.672332e5@fsol> On Mon, 18 Nov 2013 17:14:21 +0100 Antoine Pitrou wrote: > On Sun, 17 Nov 2013 23:53:09 -0600 > Tim Peters wrote: > > > > But I wonder why it isn't done with a new framing opcode instead (say, > > FRAME followed by 8-byte count). I suppose that would be like the > > "prefetch" idea, except that framing opcodes would be mandatory > > (instead of optional) in proto 4. Why I initially like that: > > > > - Uniform decoding loop ("the next thing" _always_ starts with an opcode). > > But it's not actually uniform. A frame isn't a normal opcode, it's a > large section of bytes that contains potentially many opcodes. > > The framing layer is really below the opcode layer, it makes also sense > to implement it like that. 
> > (I also tried to implement Serhiy's PREFETCH idea, but it didn't bring > any actual simplification) > > > When slinging 8-byte counts, _some_ sanity-checking seems like a good idea ;-) > > I don't know. It's not much worse (for denial of service opportunities) > than a 4-byte count, which already exists in earlier protocols. Actually, now that I think of it, it's even better. A 2**63 bytes allocation is guaranteed to fail, since most 64-bit CPUs have a smaller address space than that (for example, x86-64 CPUs seem to have a 48 bits virtual address space). On the other hand, a 2**31 bytes allocation may very well succeed, eat almost all the RAM and/or slow down the machine by swapping out. Regards Antoine. From tim.peters at gmail.com Mon Nov 18 23:02:31 2013 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 18 Nov 2013 16:02:31 -0600 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <20131118171015.24cc4aba@fsol> References: <20131116191526.16f9b79c@fsol> <20131118171015.24cc4aba@fsol> Message-ID: [Guido] >> Clearly the framing is the weakest point of the PEP (== elicits the most >> bikeshedding). I am also unsure about the value of framing when pickles are >> written to strings. [Antoine] > It hasn't much value in that case, It has _no_ value in that case, yes? It doesn't appear to have _much_ value in the case of a seekable stream, either - the implementation has always been free to read ahead then. The real value appears to be in cases of non-seekable streams. > but the cost is also small (8 bytes every 64KB, roughly). >> That's small if your pickle is large, but for small pickles it can add up. Which is annoying. It was already annoying when the PROTO opcode was introduced, and the size of small pickles increased by 2 bytes. That added up too :-( >> Still, not enough to reject the PEP. Just get Tim to agree with you, >> or switch to Tim's proposal. I just asked a question ;-) If a mandatory proto 4 FRAME opcode were added, it would just increase the bloat for small pickles (up from the currently proposed 8 bytes of additional overhead to 9). From solipsis at pitrou.net Mon Nov 18 23:08:52 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 18 Nov 2013 23:08:52 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131118171015.24cc4aba@fsol> Message-ID: <20131118230852.3ca29739@fsol> On Mon, 18 Nov 2013 16:02:31 -0600 Tim Peters wrote: > [Guido] > >> Clearly the framing is the weakest point of the PEP (== elicits the most > >> bikeshedding). I am also unsure about the value of framing when pickles are > >> written to strings. > > [Antoine] > > It hasn't much value in that case, > > It has _no_ value in that case, yes? It doesn't appear to have _much_ > value in the case of a seekable stream, either - the implementation > has always been free to read ahead then. The real value appears to be > in cases of non-seekable streams. > > > > but the cost is also small (8 bytes every 64KB, roughly). > > >> That's small if your pickle is large, but for small pickles it can add up. > > Which is annoying. It was already annoying when the PROTO opcode was > introduced, and the size of small pickles increased by 2 bytes. That > added up too :-( Are very small pickles that size-sensitive? I have the impression that if 8 bytes vs. e.g. 15 bytes makes a difference for your application, you'd be better off with a hand-made format. Regards Antoine. 
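The arithmetic behind this exchange is easy to check. An illustrative
measurement (byte counts vary by Python version and protocol, so treat
them as approximate):

    import pickle

    # A protocol-2 pickle of a small int is about 5 bytes (PROTO,
    # BININT1, STOP), so a fixed 8-byte frame word more than doubles
    # it -- while for a 64KB payload the same 8 bytes are noise.
    for obj in (1, "abc", list(range(3)), b"x" * 65536):
        size = len(pickle.dumps(obj, 2))
        print(type(obj).__name__, size, "->", size + 8, "with framing")

For oceans of tiny pickles the relative overhead is what stings; for
the huge pickles that motivated the 8-byte-size opcodes it is
invisible.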
From tim.peters at gmail.com  Mon Nov 18 23:18:21 2013
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 18 Nov 2013 16:18:21 -0600
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <20131118171421.2dc8e9a2@fsol>
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol>
Message-ID: 

[Tim]
>> But I wonder why it isn't done with a new framing opcode instead (say,
>> FRAME followed by 8-byte count). I suppose that would be like the
>> "prefetch" idea, except that framing opcodes would be mandatory
>> (instead of optional) in proto 4. Why I initially like that:
>>
>> - Uniform decoding loop ("the next thing" _always_ starts with an opcode).

[Antoine]
> But it's not actually uniform. A frame isn't a normal opcode, it's a
> large section of bytes that contains potentially many opcodes.
>
> The framing layer is really below the opcode layer, it makes also sense
> to implement it like that.

That makes sense to me.

> (I also tried to implement Serhiy's PREFETCH idea, but it didn't bring
> any actual simplification)

But it has a different kind of advantage: PREFETCH was optional. As
Guido said, it's annoying to bloat the size of small pickles (which
may, although individually small, occur in great numbers) by 8 bytes
each. There's really no point to framing small chunks of data, right?

Which leads to another idea: after the PROTO opcode, there is, or is
not, an optional PREFETCH opcode with an 8-byte argument. If the
PREFETCH opcode exists, then it gives the number of bytes up to and
including the pickle's STOP opcode. So there's exactly 0 or 1
PREFETCH opcodes per pickle.

Is there an advantage to spraying multiple 8-byte "frame counts"
throughout a pickle stream? 8 bytes is surely enough to specify the
size of any single pickle for half a generation ;-) to come.

>> When slinging 8-byte counts, _some_ sanity-checking seems like a good idea ;-)

> I don't know. It's not much worse (for denial of service opportunities)
> than a 4-byte count, which already exists in earlier protocols.

I'm not thinking of DOS at all, just general sanity as data objects
get larger & larger. Pickles have almost no internal checks now. But
I've seen my share of corrupted pickles! About the only thing that
catches them early is hitting a byte that isn't a legitimate pickle
opcode. That _used_ to be a much stronger check than it is now,
because the 8-bit opcode space was sparsely populated at first. But,
over time, more and more opcodes get added, so the chance of
mistaking a garbage byte for a legit opcode has increased
correspondingly. A PREFETCH opcode with a "bytes until STOP" count
makes for a decent sanity check too ;-)

From tim.peters at gmail.com  Mon Nov 18 23:25:07 2013
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 18 Nov 2013 16:25:07 -0600
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <20131118230852.3ca29739@fsol>
References: <20131116191526.16f9b79c@fsol> <20131118171015.24cc4aba@fsol>
	<20131118230852.3ca29739@fsol>
Message-ID: 

[Tim]
>> ...
>> It was already annoying when the PROTO opcode was introduced,
>> and the size of small pickles increased by 2 bytes. That
>> added up too :-(

[Antoine]
> Are very small pickles that size-sensitive? I have the impression that
> if 8 bytes vs. e.g. 15 bytes makes a difference for your application,
> you'd be better off with a hand-made format.

The difference between 8 and 15 is, e.g., nearly doubling the amount
of network traffic (for apps that use pickles across processes or
machines).
A general approach has no way to guess how it will be used. For example, `multiprocessing` uses pickles extensively for inter-process communication of Python data. Some users try broadcasting giant arrays across processes, while others broadcast oceans of tiny integers (like indices into giant arrays inherited via fork()). Since pickle intends to be "the" Python serialization format, it really should try to be friendly for all plausible uses. From solipsis at pitrou.net Mon Nov 18 23:31:34 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 18 Nov 2013 23:31:34 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> Message-ID: <20131118233134.02bb4edb@fsol> On Mon, 18 Nov 2013 16:18:21 -0600 Tim Peters wrote: > > But it has a different kind of advantage: PREFETCH was optional. As > Guido said, it's annoying to bloat the size of small pickles (which > may, although individually small, occur in great numbers) by 8 bytes > each. There's really no point to framing small chunks of data, right? You can't know how much space the pickle will take until the pickling ends, though, which makes it difficult to decide whether you want to emit a PREFETCH opcode or not. > Which leads to another idea: after the PROTO opcode, there is, or is > not, an optional PREFETCH opcde with an 8-byte argument. If the > PREFETCH opcode exists, then it gives the number of bytes up to and > including the pickle's STOP opcode. So there's exactly 0 or 1 > PREFETCH opcodes per pickle. > > Is there an advantage to spraying multiple 8-byte "frame counts" > throughout a pickle stream? Well, yes: much better memory usage for large pickles. Some people use pickles to store huge data, which was the motivation to add the 8-byte-size opcodes after all. > I'm not thinking of DOS at all, just general sanity as data objects > get larger & larger. Pickles have almost no internal checks now. But > I've seen my share of corrupted pickles! IMO, any validation should use a dedicated CRC-like scheme, rather than relying on the fact that correct pickles are statistically unlikely :-) (or you can implicitly rely on TCP or UDP checksums, or you disk subsystem's own sanity checks) Regards Antoine. From solipsis at pitrou.net Mon Nov 18 23:39:16 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 18 Nov 2013 23:39:16 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131118171015.24cc4aba@fsol> <20131118230852.3ca29739@fsol> Message-ID: <20131118233916.096ec10a@fsol> On Mon, 18 Nov 2013 16:25:07 -0600 Tim Peters wrote: > > [Antoine] > > Are very small pickles that size-sensitive? I have the impression that > > if 8 bytes vs. e.g. 15 bytes makes a difference for your application, > > you'd be better off with a hand-made format. > > The difference between 8 and 15 is, e.g., nearly doubling the amount > of network traffic (for apps that use pickles across processes or > machines). > > A general approach has no way to guess how it will be used. For > example, `multiprocessing` uses pickles extensively for inter-process > communication of Python data. Some users try broadcasting giant > arrays across processes, while others broadcast oceans of tiny > integers (like indices into giant arrays inherited via fork()). 
Well, sending oceans of tiny integers will also incur many system calls
and additional synchronization costs, since sending data on a
multiprocessing Queue has to acquire a semaphore. So it generally
sounds like a bad idea, IMHO.

That said, I agree with:

> Since pickle intends to be "the" Python serialization format, it
> really should try to be friendly for all plausible uses.

I simply don't think adding a fixed 8-byte overhead is actually
annoying. It's less than the PyObject overhead in 64-bit mode...

Regards

Antoine.

From tim.peters at gmail.com Mon Nov 18 23:44:59 2013
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 18 Nov 2013 16:44:59 -0600
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <20131118233134.02bb4edb@fsol>
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol>
Message-ID: 

[Tim]
>> But it has a different kind of advantage: PREFETCH was optional. As
>> Guido said, it's annoying to bloat the size of small pickles (which
>> may, although individually small, occur in great numbers) by 8 bytes
>> each. There's really no point to framing small chunks of data, right?

[Antoine]
> You can't know how much space the pickle will take until the pickling
> ends, though, which makes it difficult to decide whether you want to
> emit a PREFETCH opcode or not.

Ah, of course. Presumably the outgoing pickle stream is first stored
in some memory buffer, right? If pickling completes before the buffer
is first flushed, then you know exactly how large the entire pickle
is. If "it's small" (say, < 100 bytes), don't write out the PREFETCH
part. Else do.

>> Which leads to another idea: after the PROTO opcode, there is, or is
>> not, an optional PREFETCH opcode with an 8-byte argument. If the
>> PREFETCH opcode exists, then it gives the number of bytes up to and
>> including the pickle's STOP opcode. So there's exactly 0 or 1
>> PREFETCH opcodes per pickle.
>>
>> Is there an advantage to spraying multiple 8-byte "frame counts"
>> throughout a pickle stream?

> Well, yes: much better memory usage for large pickles.
> Some people use pickles to store huge data, which was the motivation to
> add the 8-byte-size opcodes after all.

We'd have the same advantage _if_ it were feasible to know the entire
size up front. I understand now that it's not feasible.

> ...
> IMO, any validation should use a dedicated CRC-like scheme, rather than
> relying on the fact that correct pickles are statistically unlikely :-)

OK! A new CRC opcode every two bytes ;-)

> (or you can implicitly rely on TCP or UDP checksums, or your disk
> subsystem's own sanity checks)

Of course we've been doing that all along. Yet I've still seen my
share of corrupted pickles anyway. Don't underestimate the
determination of users to screw up everything possible in every way
possible ;-)

From ncoghlan at gmail.com Tue Nov 19 00:02:50 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 19 Nov 2013 09:02:50 +1000
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol>
Message-ID: 

On 19 Nov 2013 08:52, "Tim Peters" wrote:
>
> [Tim]
> >> But it has a different kind of advantage: PREFETCH was optional. As
> >> Guido said, it's annoying to bloat the size of small pickles (which
> >> may, although individually small, occur in great numbers) by 8 bytes
> >> each. There's really no point to framing small chunks of data, right?
> > [Antoine]
> > You can't know how much space the pickle will take until the pickling
> > ends, though, which makes it difficult to decide whether you want to
> > emit a PREFETCH opcode or not.
>
> Ah, of course. Presumably the outgoing pickle stream is first stored
> in some memory buffer, right? If pickling completes before the buffer
> is first flushed, then you know exactly how large the entire pickle
> is. If "it's small" (say, < 100 bytes), don't write out the PREFETCH
> part. Else do.
>
> >> Which leads to another idea: after the PROTO opcode, there is, or is
> >> not, an optional PREFETCH opcode with an 8-byte argument. If the
> >> PREFETCH opcode exists, then it gives the number of bytes up to and
> >> including the pickle's STOP opcode. So there's exactly 0 or 1
> >> PREFETCH opcodes per pickle.
> >>
> >> Is there an advantage to spraying multiple 8-byte "frame counts"
> >> throughout a pickle stream?
>
> > Well, yes: much better memory usage for large pickles.
> > Some people use pickles to store huge data, which was the motivation to
> > add the 8-byte-size opcodes after all.
>
> We'd have the same advantage _if_ it were feasible to know the entire
> size up front. I understand now that it's not feasible.

This may be a dumb suggestion (since I don't know the pickle
infrastructure at all), but could there be a short section of unframed
data before switching to framing mode? For example, emit up to 256
bytes unframed in all pickles, then start emitting appropriate FRAME
opcodes if the pickle continues on?

Cheers,
Nick.

> > ...
> > IMO, any validation should use a dedicated CRC-like scheme, rather than
> > relying on the fact that correct pickles are statistically unlikely :-)
>
> OK! A new CRC opcode every two bytes ;-)
>
> > (or you can implicitly rely on TCP or UDP checksums, or your disk
> > subsystem's own sanity checks)
>
> Of course we've been doing that all along. Yet I've still seen my
> share of corrupted pickles anyway. Don't underestimate the
> determination of users to screw up everything possible in every way
> possible ;-)
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net Tue Nov 19 00:10:14 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 19 Nov 2013 00:10:14 +0100
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol>
Message-ID: <20131119001014.58c20a77@fsol>

Ok, how about merging the two sub-threads :-)

On Mon, 18 Nov 2013 16:44:59 -0600
Tim Peters wrote:
> [Antoine]
> > You can't know how much space the pickle will take until the pickling
> > ends, though, which makes it difficult to decide whether you want to
> > emit a PREFETCH opcode or not.
>
> Ah, of course. Presumably the outgoing pickle stream is first stored
> in some memory buffer, right? If pickling completes before the buffer
> is first flushed, then you know exactly how large the entire pickle
> is. If "it's small" (say, < 100 bytes), don't write out the PREFETCH
> part. Else do.

That's true. We could also have a SMALLPREFETCH opcode with a one-byte
length to still get the benefits of prefetching.

> > Well, yes: much better memory usage for large pickles.
> > Some people use pickles to store huge data, which was the motivation to > > add the 8-byte-size opcodes after all. > > We'd have the same advantage _if_ it were feasible to know the entire > size up front. I understand now that it's not feasible. AFAICT, it would only be possible by doing two-pass pickling, which would also slow it down massively. > A long-running process can legitimately put billions of items on work > queues, far more than could ever fit in RAM simultaneously. Comparing > this to PyObject overhead makes no sense to me. Neither does the line > of argument "there are several kinds of overheads, so making this > overhead worse too doesn't matter". Well, it's a question of cost / benefit: does it make sense to optimize something that will be dwarfed by other factors in real world situations? > When possible, we should strive not to add overheads that don't repay > their costs. For small pickles, an 8-byte size field doesn't appear > to buy anything. But I appreciate that it costs implementation effort > to avoid producing it in these cases. I share the concern, although I still don't think the "ocean of tiny pickles" is a reasonable use case :-) That said, assuming you think this is important (do you?), we're left with the following constraints: - it would be nice to have this PEP in 3.4 - 3.4 beta1 and feature freeze is in approximately one week - switching to the PREFETCH scheme requires some non-trivial work on the current patch, work done by either Alexandre or me (but I already have pathlib (PEP 428) on my plate, so it'll have to be Alexandre) - unless you want to do it, of course? What do you think? Regards Antoine. From tim.peters at gmail.com Mon Nov 18 23:54:33 2013 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 18 Nov 2013 16:54:33 -0600 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <20131118233916.096ec10a@fsol> References: <20131116191526.16f9b79c@fsol> <20131118171015.24cc4aba@fsol> <20131118230852.3ca29739@fsol> <20131118233916.096ec10a@fsol> Message-ID: [Antoine] > Well, sending oceans of tiny integers will also incur many system calls > and additional synchronization costs, since sending data on a > multiprocessing Queue has to acquire a semaphore. So it generally > sounds like a bad idea, IMHO. > > That said, I agree with: >> Since pickle intends to be "the" Python serialization format, it >> really should try to be friendly for all plausible uses. > > I simply don't think adding a fixed 8-byte overhead is actually > annoying. It's less than the PyObject overhead in 64-bit mode... A long-running process can legitimately put billions of items on work queues, far more than could ever fit in RAM simultaneously. Comparing this to PyObject overhead makes no sense to me. Neither does the line of argument "there are several kinds of overheads, so making this overhead worse too doesn't matter". When possible, we should strive not to add overheads that don't repay their costs. For small pickles, an 8-byte size field doesn't appear to buy anything. But I appreciate that it costs implementation effort to avoid producing it in these cases. From shibturn at gmail.com Tue Nov 19 00:31:04 2013 From: shibturn at gmail.com (Richard Oudkerk) Date: Mon, 18 Nov 2013 23:31:04 +0000 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? 
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol> <20131118171015.24cc4aba@fsol> <20131118230852.3ca29739@fsol>
Message-ID: <528AA338.8060701@gmail.com>

On 18/11/2013 10:25pm, Tim Peters wrote:
> The difference between 8 and 15 is, e.g., nearly doubling the amount
> of network traffic (for apps that use pickles across processes or
> machines).

I tried using multiprocessing.Pipe() and send_bytes()/recv_bytes() to
send messages between processes:

8 bytes messages -- 525,000 msgs/sec
15 bytes messages -- 556,000 msgs/sec

So the size of small messages does not seem to make much difference.

(This was in a Linux VM with Python 2.7. Python 3.3 is much slower
because it is not implemented in C, but gives similarly close results.)

-- Richard

From ncoghlan at gmail.com Tue Nov 19 00:51:23 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 19 Nov 2013 09:51:23 +1000
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol> <20131118171015.24cc4aba@fsol> <20131118230852.3ca29739@fsol> <20131118233916.096ec10a@fsol>
Message-ID: 

On 19 Nov 2013 09:33, "Tim Peters" wrote:
>
> [Antoine]
> > Well, sending oceans of tiny integers will also incur many system calls
> > and additional synchronization costs, since sending data on a
> > multiprocessing Queue has to acquire a semaphore. So it generally
> > sounds like a bad idea, IMHO.
> >
> > That said, I agree with:
> >> Since pickle intends to be "the" Python serialization format, it
> >> really should try to be friendly for all plausible uses.
> >
> > I simply don't think adding a fixed 8-byte overhead is actually
> > annoying. It's less than the PyObject overhead in 64-bit mode...
>
> A long-running process can legitimately put billions of items on work
> queues, far more than could ever fit in RAM simultaneously. Comparing
> this to PyObject overhead makes no sense to me. Neither does the line
> of argument "there are several kinds of overheads, so making this
> overhead worse too doesn't matter".
>
> When possible, we should strive not to add overheads that don't repay
> their costs. For small pickles, an 8-byte size field doesn't appear
> to buy anything. But I appreciate that it costs implementation effort
> to avoid producing it in these cases.

Given that cases where the overhead matters can drop back to proto 3 if
absolutely necessary, perhaps it's reasonable to just run with the
current proposal?

Cheers,
Nick.
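Richard's message doesn't include the benchmark script itself; a rough
reconstruction of what it might have looked like (the message count,
names, and overall structure are guesses, and absolute rates will vary
wildly by machine):

    import time
    from multiprocessing import Pipe, Process

    def sender(conn, msg, n):
        # Pump n copies of msg through one end of the pipe.
        for _ in range(n):
            conn.send_bytes(msg)
        conn.close()

    if __name__ == '__main__':
        n = 100000
        for size in (8, 15):
            left, right = Pipe()
            msg = b'x' * size
            proc = Process(target=sender, args=(left, msg, n))
            start = time.time()
            proc.start()
            for _ in range(n):
                right.recv_bytes()
            proc.join()
            rate = n / (time.time() - start)
            print('%2d bytes messages -- %d msgs/sec' % (size, rate))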
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net Tue Nov 19 00:57:08 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 19 Nov 2013 00:57:08 +0100
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol>
Message-ID: <20131119005708.1c638214@fsol>

On Mon, 18 Nov 2013 16:44:59 -0600
Tim Peters wrote:
> [Tim]
> >> But it has a different kind of advantage: PREFETCH was optional. As
> >> Guido said, it's annoying to bloat the size of small pickles (which
> >> may, although individually small, occur in great numbers) by 8 bytes
> >> each. There's really no point to framing small chunks of data, right?
>
> [Antoine]
> > You can't know how much space the pickle will take until the pickling
> > ends, though, which makes it difficult to decide whether you want to
> > emit a PREFETCH opcode or not.
>
> Ah, of course. Presumably the outgoing pickle stream is first stored
> in some memory buffer, right? If pickling completes before the buffer
> is first flushed, then you know exactly how large the entire pickle
> is. If "it's small" (say, < 100 bytes), don't write out the PREFETCH
> part. Else do.

Yet another possibility: keep framing but use a variable-length
encoding for the frame size:

- first byte: bits 7-5: N (= frame size bytes length - 1)
- first byte: bits 4-0: first 5 bits of frame size
- remaining N bytes: remaining bits of frame size

With this scheme, very small pickles have a one-byte overhead; small
ones a two-byte overhead; and the max frame size is 2**61 rather than
2**64, which should still be sufficient.

And the frame size is read using either one or two read() calls, which
is efficient.

Regards

Antoine.

From ncoghlan at gmail.com Tue Nov 19 01:30:45 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 19 Nov 2013 10:30:45 +1000
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <20131119005708.1c638214@fsol>
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol>
Message-ID: 

On 19 November 2013 09:57, Antoine Pitrou wrote:
> On Mon, 18 Nov 2013 16:44:59 -0600
> Tim Peters wrote:
>> [Tim]
>> >> But it has a different kind of advantage: PREFETCH was optional. As
>> >> Guido said, it's annoying to bloat the size of small pickles (which
>> >> may, although individually small, occur in great numbers) by 8 bytes
>> >> each. There's really no point to framing small chunks of data, right?
>>
>> [Antoine]
>> > You can't know how much space the pickle will take until the pickling
>> > ends, though, which makes it difficult to decide whether you want to
>> > emit a PREFETCH opcode or not.
>>
>> Ah, of course. Presumably the outgoing pickle stream is first stored
>> in some memory buffer, right? If pickling completes before the buffer
>> is first flushed, then you know exactly how large the entire pickle
>> is. If "it's small" (say, < 100 bytes), don't write out the PREFETCH
>> part. Else do.
> > Yet another possibility: keep framing but use a variable-length
> > encoding for the frame size:
> >
> > - first byte: bits 7-5: N (= frame size bytes length - 1)
> > - first byte: bits 4-0: first 5 bits of frame size
> > - remaining N bytes: remaining bits of frame size
> >
> > With this scheme, very small pickles have a one-byte overhead; small
> > ones a two-byte overhead; and the max frame size is 2**61 rather than
> > 2**64, which should still be sufficient.
> >
> > And the frame size is read using either one or two read() calls, which
> > is efficient.

And it's only a minimal change from the current patch. Sounds good to me.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From guido at python.org Tue Nov 19 01:48:05 2013
From: guido at python.org (Guido van Rossum)
Date: Mon, 18 Nov 2013 16:48:05 -0800
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol>
Message-ID: 

On Mon, Nov 18, 2013 at 4:30 PM, Nick Coghlan wrote:

> On 19 November 2013 09:57, Antoine Pitrou wrote:
> > On Mon, 18 Nov 2013 16:44:59 -0600
> > Tim Peters wrote:
> >> [Tim]
> >> >> But it has a different kind of advantage: PREFETCH was optional. As
> >> >> Guido said, it's annoying to bloat the size of small pickles (which
> >> >> may, although individually small, occur in great numbers) by 8 bytes
> >> >> each. There's really no point to framing small chunks of data, right?
> >>
> >> [Antoine]
> >> > You can't know how much space the pickle will take until the pickling
> >> > ends, though, which makes it difficult to decide whether you want to
> >> > emit a PREFETCH opcode or not.
> >>
> >> Ah, of course. Presumably the outgoing pickle stream is first stored
> >> in some memory buffer, right? If pickling completes before the buffer
> >> is first flushed, then you know exactly how large the entire pickle
> >> is. If "it's small" (say, < 100 bytes), don't write out the PREFETCH
> >> part. Else do.
> >
> > Yet another possibility: keep framing but use a variable-length
> > encoding for the frame size:
> >
> > - first byte: bits 7-5: N (= frame size bytes length - 1)
> > - first byte: bits 4-0: first 5 bits of frame size
> > - remaining N bytes: remaining bits of frame size
> >
> > With this scheme, very small pickles have a one-byte overhead; small
> > ones a two-byte overhead; and the max frame size is 2**61 rather than
> > 2**64, which should still be sufficient.
> >
> > And the frame size is read using either one or two read() calls, which
> > is efficient.
>
> And it's only a minimal change from the current patch. Sounds good to me.
>

Food for thought: maybe we should have variable-encoding lengths for all
opcodes, rather than the current cumbersome scheme?

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tim.peters at gmail.com Tue Nov 19 01:50:01 2013
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 18 Nov 2013 18:50:01 -0600
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <20131119005708.1c638214@fsol>
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol>
Message-ID: 

[Antoine]
> Yet another possibility: keep framing but use a variable-length
> encoding for the frame size:
>
> - first byte: bits 7-5: N (= frame size bytes length - 1)
> - first byte: bits 4-0: first 5 bits of frame size
> - remaining N bytes: remaining bits of frame size
>
> With this scheme, very small pickles have a one-byte overhead; small
> ones a two-byte overhead; and the max frame size is 2**61 rather than
> 2**64, which should still be sufficient.
>
> And the frame size is read using either one or two read() calls, which
> is efficient.

That would be a happy compromise :-)

I'm unclear on how that would work for, e.g., encoding 40 =
0b000101000. That has 6 significant bits. Would you store 0 in the
leading byte and 40 in the second byte? That would work.

2**61 = 2,305,843,009,213,693,952 is a lot of bytes, especially for a
pickle ;-)

From tim.peters at gmail.com Tue Nov 19 01:53:00 2013
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 18 Nov 2013 18:53:00 -0600
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol>
Message-ID: 

[Guido]
> Food for thought: maybe we should have variable-encoding lengths for all
> opcodes, rather than the current cumbersome scheme?

Yes, but not for protocol 4 - time's running out fast for that.

When we "only" had the XXX1, XXX2, and XXX4 opcodes, it was kinda
silly, but after adding XXX8 flavors to all of those it's become
unbearable.

From tim.peters at gmail.com Tue Nov 19 01:55:10 2013
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 18 Nov 2013 18:55:10 -0600
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <528AA338.8060701@gmail.com>
References: <20131116191526.16f9b79c@fsol> <20131118171015.24cc4aba@fsol> <20131118230852.3ca29739@fsol> <528AA338.8060701@gmail.com>
Message-ID: 

[Richard Oudkerk]
> I tried using multiprocessing.Pipe() and send_bytes()/recv_bytes() to send
> messages between processes:
>
> 8 bytes messages -- 525,000 msgs/sec
> 15 bytes messages -- 556,000 msgs/sec
>
> So the size of small messages does not seem to make much difference.

To the contrary, the lesson is clear: to speed up multiprocessing,
the larger the messages the faster it goes ;-)

From solipsis at pitrou.net Tue Nov 19 01:56:41 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 19 Nov 2013 01:56:41 +0100
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol>
Message-ID: <20131119015641.53791621@fsol>

On Mon, 18 Nov 2013 18:50:01 -0600
Tim Peters wrote:
> [Antoine]
> > Yet another possibility: keep framing but use a variable-length
> > encoding for the frame size:
> >
> > - first byte: bits 7-5: N (= frame size bytes length - 1)
> > - first byte: bits 4-0: first 5 bits of frame size
> > - remaining N bytes: remaining bits of frame size
> >
> > With this scheme, very small pickles have a one-byte overhead; small
> > ones a two-byte overhead; and the max frame size is 2**61 rather than
> > 2**64, which should still be sufficient.
> >
> > And the frame size is read using either one or two read() calls, which
> > is efficient.
> That would be a happy compromise :-)
>
> I'm unclear on how that would work for, e.g., encoding 40 =
> 0b000101000. That has 6 significant bits. Would you store 0 in the
> leading byte and 40 in the second byte? That would work.

Yeah, I haven't decided whether it would be big-endian or
little-endian. It doesn't matter much. Big-endian sounds a bit easier
to decode and encode (no bit shifts needed), but it's also less
consistent with the rest of the pickle opcodes.

> 2**61 = 2,305,843,009,213,693,952 is a lot of bytes, especially for a pickle ;-)

Let's call it a biggle :-)

Regards

Antoine.

From tim.peters at gmail.com Tue Nov 19 02:06:05 2013
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 18 Nov 2013 19:06:05 -0600
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <20131119015641.53791621@fsol>
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119015641.53791621@fsol>
Message-ID: 

[Antoine]
>>> - first byte: bits 7-5: N (= frame size bytes length - 1)
>>> - first byte: bits 4-0: first 5 bits of frame size
>>> - remaining N bytes: remaining bits of frame size

[Tim]
>> I'm unclear on how that would work for, e.g., encoding 40 =
>> 0b000101000. That has 6 significant bits. Would you store 0 in the
>> leading byte and 40 in the second byte? That would work.

[Antoine]
> Yeah, I haven't decided whether it would be big-endian or
> little-endian. It doesn't matter much. Big-endian sounds a bit easier
> to decode and encode (no bit shifts needed), but it's also less
> consistent with the rest of the pickle opcodes.

As you've told me, the framing layer is beneath the opcode layer, so
what opcodes do is irrelevant ;-)

Big-endian would be my choice (easier (en)(de)coding, both via
software and via eyeball when staring at dumps).

From shibturn at gmail.com Tue Nov 19 02:09:23 2013
From: shibturn at gmail.com (Richard Oudkerk)
Date: Tue, 19 Nov 2013 01:09:23 +0000
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol> <20131118171015.24cc4aba@fsol> <20131118230852.3ca29739@fsol> <528AA338.8060701@gmail.com>
Message-ID: <528ABA43.5050506@gmail.com>

On 19/11/2013 12:55am, Tim Peters wrote:
> [Richard Oudkerk]
>> I tried using multiprocessing.Pipe() and send_bytes()/recv_bytes() to send
>> messages between processes:
>>
>> 8 bytes messages -- 525,000 msgs/sec
>> 15 bytes messages -- 556,000 msgs/sec
>>
>> So the size of small messages does not seem to make much difference.
>
> To the contrary, the lesson is clear: to speed up multiprocessing,
> the larger the messages the faster it goes ;-)

Ah, yes. It was probably round the other way.

-- Richard
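To make the proposed size encoding concrete, here is an illustrative
sketch in Python of the big-endian variant discussed above (an
illustration of the idea from this thread only - not the framing
scheme protocol 4 eventually shipped with). It stores 40 exactly as
suggested: 0 in the leading byte, 40 in the second byte:

    import io

    def encode_frame_size(n):
        # Bits 7-5 of the first byte hold N, the number of extra size
        # bytes; bits 4-0 hold the most significant 5 bits of the size.
        if not 0 <= n < 2**61:
            raise ValueError("frame size out of range")
        extra = 0
        while (n >> (8 * extra)) >= 32:  # high part must fit in 5 bits
            extra += 1
        first = (extra << 5) | (n >> (8 * extra))
        low = n & ((1 << (8 * extra)) - 1)
        return bytes([first]) + low.to_bytes(extra, 'big')

    def decode_frame_size(read):
        # `read` is a file-like read() method; at most two calls are made.
        first = read(1)[0]
        extra, high = first >> 5, first & 0x1f
        if not extra:
            return high
        return (high << (8 * extra)) | int.from_bytes(read(extra), 'big')

    # Round-trips Tim's example: 0x20 (N=1, high bits 0), then 40.
    assert encode_frame_size(40) == b'\x20\x28'
    assert decode_frame_size(io.BytesIO(b'\x20\x28').read) == 40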
From tim.peters at gmail.com Tue Nov 19 02:17:24 2013
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 18 Nov 2013 19:17:24 -0600
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <20131119001014.58c20a77@fsol>
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119001014.58c20a77@fsol>
Message-ID: 

[Antoine]
> ...
> Well, it's a question of cost / benefit: does it make sense to optimize
> something that will be dwarfed by other factors in real world
> situations?

For most of my career, a megabyte of RAM was an unthinkable luxury.
Now I'm running on an OS that needs a gigabyte of RAM just to boot.
So I'm old enough to think from the opposite end: "does it make sense
to bloat something that's already working fine?". It's the "death of
a thousand cuts" (well, this thing doesn't matter _that_ much - and
neither does that, nor those others over there ...) that leads to a
GiB-swallowing OS and a once-every-4-years crusade to reduce Python's
startup time. For examples ;-)

> ...
> That said, assuming you think this is important (do you?),

Honestly, it's more important to me "in theory" to oppose unnecessary
bloat (of all kinds). But, over time, it's attitude that shapes
results, so I'm not apologetic about that.

> we're left with the following constraints:
> - it would be nice to have this PEP in 3.4
> - 3.4 beta1 and feature freeze is in approximately one week
> - switching to the PREFETCH scheme requires some non-trivial work on the
> current patch, work done by either Alexandre or me (but I already
> have pathlib (PEP 428) on my plate, so it'll have to be Alexandre) -
> unless you want to do it, of course?
>
> What do you think?

I wouldn't hold up PEP acceptance for this. It won't be a disaster in
any case, just - at worst - another little ratchet in the "needless
bloat" direction. And the PEP has more things that are pure wins.

If it's possible to squeeze in the variable-length encoding, that
would be great. If I were you, though, I'd check the patch in as-is,
just in case. I can't tell whether I'll have time to work on it (have
other things going now outside my control).

From georg at python.org Tue Nov 19 07:59:50 2013
From: georg at python.org (Georg Brandl)
Date: Tue, 19 Nov 2013 07:59:50 +0100
Subject: [Python-Dev] [RELEASED] Python 3.3.3 final
Message-ID: <528B0C66.5090502@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On behalf of the Python development team, I'm very happy to announce
the release of Python 3.3.3.

Python 3.3.3 includes several security fixes and over 150 bug fixes
compared to the Python 3.3.2 release. Importantly, a security bug in
CGIHTTPServer was fixed [1].

Thank you to those who tested the 3.3.3 release candidate!

This release fully supports OS X 10.9 Mavericks. In particular, this
release fixes an issue that could cause previous versions of Python
to crash when typing in interactive mode on OS X 10.9.

Python 3.3 includes a range of improvements of the 3.x series, as
well as easier porting between 2.x and 3.x. In total, almost 500 API
items are new or improved in Python 3.3.
For a more extensive list of changes in the 3.3 series, see

    http://docs.python.org/3.3/whatsnew/3.3.html

To download Python 3.3.3 rc2 visit:

    http://www.python.org/download/releases/3.3.3/

This is a production release, please report any bugs to

    http://bugs.python.org/

Enjoy!

- --
Georg Brandl, Release Manager
georg at python.org
(on behalf of the entire python-dev team and 3.3's contributors)

[1] http://bugs.python.org/issue19435

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iEYEARECAAYFAlKLDGYACgkQN9GcIYhpnLCjewCfQ+EJHpzGzIyTvYOrYmsRmnbv
n40AniVM0UuQWpPrlCu349fu7Edt+d4+
=WSIj
-----END PGP SIGNATURE-----

From storchaka at gmail.com Tue Nov 19 08:52:56 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 19 Nov 2013 09:52:56 +0200
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <20131119001014.58c20a77@fsol>
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119001014.58c20a77@fsol>
Message-ID: 

On 19.11.13 01:10, Antoine Pitrou wrote:
> - switching to the PREFETCH scheme requires some non-trivial work on the
> current patch, work done by either Alexandre or me (but I already
> have pathlib (PEP 428) on my plate, so it'll have to be Alexandre) -
> unless you want to do it, of course?

I implemented an optimized PREFETCH scheme months ago (the code was
smaller than with the framing layer). I need a day or two to update
the code to Alexandre's current patch.

From walter at livinglogic.de Tue Nov 19 13:24:19 2013
From: walter at livinglogic.de (Walter Dörwald)
Date: Tue, 19 Nov 2013 13:24:19 +0100
Subject: [Python-Dev] [Python-checkins] cpython: Close #17828: better handling of codec errors
In-Reply-To: <5285566F.3060300@canterbury.ac.nz>
References: <3dKS143jbKz7Lk8@mail.python.org> <52838D0E.5040606@livinglogic.de> <5284CE88.7080101@livinglogic.de> <52850809.3080908@livinglogic.de> <5285566F.3060300@canterbury.ac.nz>
Message-ID: <528B5873.6040907@livinglogic.de>

On 15.11.13 00:02, Greg Ewing wrote:
> Walter Dörwald wrote:
>> Unfortunately the frame from the decorator shows up in the traceback.
>
> Maybe the decorator could remove its own frame from
> the traceback?
True, this could be done via either an additional attribute on the
frame, or a special value for frame.f_annotation.

Would we want to add frame annotations to every function call in the
Python stdlib? Certainly not. So which functions would get annotations
and which ones won't?

When we have many annotations, doing it with a decorator might be a
performance problem, as each function call goes through another stack
level.

Is there any other way to implement it?

Servus,
Walter

From ncoghlan at gmail.com Tue Nov 19 13:50:39 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 19 Nov 2013 22:50:39 +1000
Subject: [Python-Dev] [Python-checkins] cpython: Close #17828: better handling of codec errors
In-Reply-To: <528B5873.6040907@livinglogic.de>
References: <3dKS143jbKz7Lk8@mail.python.org> <52838D0E.5040606@livinglogic.de> <5284CE88.7080101@livinglogic.de> <52850809.3080908@livinglogic.de> <5285566F.3060300@canterbury.ac.nz> <528B5873.6040907@livinglogic.de>
Message-ID: 

On 19 November 2013 22:24, Walter Dörwald wrote:
> On 15.11.13 00:02, Greg Ewing wrote:
>> Walter Dörwald wrote:
>>>
>>> Unfortunately the frame from the decorator shows up in the traceback.
>>
>> Maybe the decorator could remove its own frame from
>> the traceback?
>
> True, this could be done via either an additional attribute on the frame, or
> a special value for frame.f_annotation.
>
> Would we want to add frame annotations to every function call in the Python
> stdlib? Certainly not. So which functions would get annotations and which
> ones won't?
>
> When we have many annotations, doing it with a decorator might be a
> performance problem, as each function call goes through another stack level.
>
> Is there any other way to implement it?

Yep, you make the annotations a mapping and use module-based naming as
a convention to avoid conflicts:
http://bugs.python.org/issue18861#msg202754

However, that issue also goes into why this is definitely a PEP level
question - there's a bunch of things that are currently painful that a
general frame annotation mechanism could help simplify. The challenge
is to do it in a way that doesn't hurt performance in the normal case,
that is acceptable to other interpreter implementations, and to show
that it actually *does* make it possible to clean up at least the
already noted issues:

- avoiding inadvertent suppression of the original context when
another exception gets replaced or suppressed inside an exception
handler
- more reliably hiding traceback frames involving the importlib
machinery
- more reliably reporting the codec involved in an encoding or
decoding error (for example, the exception chaining I added for 3.4
can't provide any context for failures in the bz2_codec input
validation, because that throws a stateful OSError that the chaining
system can't handle, and thus doesn't wrap)

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From eliben at gmail.com Tue Nov 19 15:02:56 2013
From: eliben at gmail.com (Eli Bendersky)
Date: Tue, 19 Nov 2013 06:02:56 -0800
Subject: [Python-Dev] Mixed up core/module source file locations in CPython
In-Reply-To: 
References: 
Message-ID: 

On Sun, Nov 17, 2013 at 6:20 AM, Brett Cannon wrote:
>
> On Nov 17, 2013 8:58 AM, "Eli Bendersky" wrote:
> >
> > On Sat, Nov 16, 2013 at 3:44 PM, Brett Cannon wrote:
> >>
> >> On Sat, Nov 16, 2013 at 1:40 PM, Eric Snow wrote:
> >>>
> >>> If you look at the Python and Modules directories in the cpython repo,
> >>> you'll find modules in Python/ and core files (like python.c and
> >>> main.c) in Modules/. (It's like parking on a driveway and driving on
> >>> a parkway. ) It's not that big a deal and not that hard to
> >>> figure out (so I'm fine with the status quo), but it is a bit
> >>> surprising. When I was first getting familiar with the code base a
> >>> few years ago (as a C non-expert), it was a not insignificant but not
> >>> major stumbling block.
> >>>
> >>> The situation is mostly a consequence of history, if I understand
> >>> correctly. The subject has come up before and I don't recall any
> >>> objections to doing something about it. I haven't had the time to
> >>> track down those earlier discussions, though I remember Benjamin
> >>> having some comment about it.
> >>>
> >>> Would it be too disruptive (churn, etc.) to clean this up in 3.5? I
> >>> see it similarly to when I moved a light switch from outside my
> >>> bathroom to inside. For a while, but not that long, I kept
> >>> unconsciously reaching for the switch that was no longer there on the
> >>> outside. Regardless I'm glad I did it. Likewise, moving the handful
Likewise, moving the handful > >>> of files around is a relatively inconsequential change that would make > >>> the project just a little less surprising, particularly for new > >>> contributors. > >>> > >>> -eric > >>> > >>> p.s. Either way I'll probably take some time (it shouldn't take long) > >>> after the PEP 451 implementation is done to put together a patch that > >>> moves the files around, just to see what difference it makes. > >> > >> > >> I personally think it would be a good idea to re-arrange the files to > make things more beginner-friendly. I believe Nick was also talking about > renaming directories, etc. at some point. > > > > > > If we're concerned with the beginner-friendliness of our source layout, > I'll have to mention that I have a full ASDL parser lying around that's > written in Python 3.4 (enums!) without using Spark. So that's one less > custom tool to carry around with Python, less files and less LOCs in > general. Just sayin' ;-) > > Then stop saying and check it in (or at least open a big for it). :) > > http://bugs.python.org/issue19655 Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From xiscu at email.de Tue Nov 19 19:36:01 2013 From: xiscu at email.de (xiscu) Date: Tue, 19 Nov 2013 19:36:01 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> Message-ID: <528BAF91.60307@email.de> > Food for thought: maybe we should have variable-encoding lengths for > all opcodes, rather than the current cumbersome scheme? > Funny, it sounds like UTF-8 :-) From francismb at email.de Tue Nov 19 19:39:03 2013 From: francismb at email.de (francis) Date: Tue, 19 Nov 2013 19:39:03 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> Message-ID: <528BB047.1000009@email.de> > Food for thought: maybe we should have variable-encoding lengths for > all opcodes, rather than the current cumbersome scheme? > Funny, it sounds like UTF-8 :-) From solipsis at pitrou.net Tue Nov 19 19:51:10 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 19 Nov 2013 19:51:10 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> Message-ID: <20131119195110.3904ba25@fsol> On Mon, 18 Nov 2013 16:48:05 -0800 Guido van Rossum wrote: > > Food for thought: maybe we should have variable-encoding lengths for all > opcodes, rather than the current cumbersome scheme? Well, it's not that cumbersome... If you look at CPU encodings, they also tend to have different opcodes for different immediate lengths. In your case, I'd say it mostly leads to a bit of code duplication. But the opcode space is far from exhausted right now :) Regards Antoine. From guido at python.org Tue Nov 19 19:52:58 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Nov 2013 10:52:58 -0800 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <20131119195110.3904ba25@fsol> References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> Message-ID: So why is framing different? 
On Tue, Nov 19, 2013 at 10:51 AM, Antoine Pitrou wrote: > On Mon, 18 Nov 2013 16:48:05 -0800 > Guido van Rossum wrote: > > > > Food for thought: maybe we should have variable-encoding lengths for all > > opcodes, rather than the current cumbersome scheme? > > Well, it's not that cumbersome... If you look at CPU encodings, they > also tend to have different opcodes for different immediate lengths. > > In your case, I'd say it mostly leads to a bit of code duplication. But > the opcode space is far from exhausted right now :) > > Regards > > Antoine. > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Nov 19 19:53:52 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 19 Nov 2013 19:53:52 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> Message-ID: <20131119195352.1d26bf11@fsol> On Tue, 19 Nov 2013 19:51:10 +0100 Antoine Pitrou wrote: > On Mon, 18 Nov 2013 16:48:05 -0800 > Guido van Rossum wrote: > > > > Food for thought: maybe we should have variable-encoding lengths for all > > opcodes, rather than the current cumbersome scheme? > > Well, it's not that cumbersome... If you look at CPU encodings, they > also tend to have different opcodes for different immediate lengths. > > In your case, I'd say it mostly leads to a bit of code duplication. But > the opcode space is far from exhausted right now :) Oops... Make that "in our case", of course. cheers Antoine. From solipsis at pitrou.net Tue Nov 19 19:57:35 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 19 Nov 2013 19:57:35 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> Message-ID: <20131119195735.036cb16c@fsol> On Tue, 19 Nov 2013 10:52:58 -0800 Guido van Rossum wrote: > So why is framing different? Because it doesn't use opcodes, so it can't use different opcodes to differentiate between different frame size widths :-) Regards Antoine. From guido at python.org Tue Nov 19 20:05:45 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Nov 2013 11:05:45 -0800 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <20131119195735.036cb16c@fsol> References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> Message-ID: So using an opcode for framing is out? (Sorry, I've lost track of the back-and-forth.) On Tue, Nov 19, 2013 at 10:57 AM, Antoine Pitrou wrote: > On Tue, 19 Nov 2013 10:52:58 -0800 > Guido van Rossum wrote: > > So why is framing different? > > Because it doesn't use opcodes, so it can't use different opcodes to > differentiate between different frame size widths :-) > > Regards > > Antoine. > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Nov 19 20:11:08 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 19 Nov 2013 20:11:08 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? 
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol>
Message-ID: <20131119201108.1ade7cac@fsol>

On Tue, 19 Nov 2013 11:05:45 -0800
Guido van Rossum wrote:

> So using an opcode for framing is out? (Sorry, I've lost track of the
> back-and-forth.)

It doesn't seem to bring anything, and it makes the overhead worse for
tiny pickles (since it will be two bytes at least, instead of one byte
with the current variable-length encoding proposal).

If overhead doesn't matter, I'm fine with keeping a simple 8-byte
frame size :-)

Regards

Antoine.

From tim.peters at gmail.com Tue Nov 19 20:22:52 2013
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 19 Nov 2013 13:22:52 -0600
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol>
Message-ID: 

[Guido]
> So using an opcode for framing is out? (Sorry, I've lost track of the
> back-and-forth.)

It was never in ;-) I'd *prefer* one, but not enough to try to block
the PEP. As is, framing is done at a "lower level" than opcode
decoding. I fear this is brittle, for all the usual "explicit is
better than implicit" kinds of reasons. The only way now to know that
you're looking at a frame size is to keep a running count of bytes
processed and realize you've reached a byte offset where a frame size
"is expected".

With an opcode, framing could also be optional (whenever desired),
because frame sizes would be _explicitly_ marked in the byte stream.
Then the framing overhead for small pickles could drop to 0 bytes
(instead of the current 8, or 1 thru 9 under various other schemes).

Ideal would be an explicit framing opcode combined with
variable-length size encoding. That would never require more bytes
than the current scheme, and "almost always" require fewer. But even
I don't think it's of much value to chop a few bytes off every 64KB
of pickle ;-)

From guido at python.org Tue Nov 19 20:26:44 2013
From: guido at python.org (Guido van Rossum)
Date: Tue, 19 Nov 2013 11:26:44 -0800
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <20131119201108.1ade7cac@fsol>
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119201108.1ade7cac@fsol>
Message-ID: 

Well, both fixed 8-byte framing and variable-size framing introduce a
new way of representing numbers in the stream, which means that
everyone parsing and generating pickles must be able to support both
styles. (But fixed is easier since the XXX8 opcodes use the same
format.)

I'm thinking of how you correctly read a pickle from a non-buffering
pipe with the minimum number of read() calls without ever reading
beyond the end of a valid pickle. (That's a requirement, right?)
If you know it's protocol 4:

- with fixed framing: read 10 bytes, that's the magic word plus the
  first frame size; then you can start buffering
- with variable framing: read 3 bytes, then depending on the 3rd byte
  read some more to find the frame size; then you can start buffering
- with a mandatory frame opcode: pretty much the same
- with an optional frame opcode: pretty much the same (the 3rd byte
  must be a valid opcode, even if it isn't a frame opcode)

If you don't know the protocol number:

- read the first byte, then read the second byte (or not, if it's not
  explicitly versioned); then you know the protocol and can do the
  rest as above

On Tue, Nov 19, 2013 at 11:11 AM, Antoine Pitrou wrote:

> On Tue, 19 Nov 2013 11:05:45 -0800
> Guido van Rossum wrote:
>
> > So using an opcode for framing is out? (Sorry, I've lost track of the
> > back-and-forth.)
>
> It doesn't seem to bring anything, and it makes the overhead worse for
> tiny pickles (since it will be two bytes at least, instead of one byte
> with the current variable-length encoding proposal).
>
> If overhead doesn't matter, I'm fine with keeping a simple 8-byte
> frame size :-)
>
> Regards
>
> Antoine.

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From g.brandl at gmx.net Tue Nov 19 20:40:31 2013
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 19 Nov 2013 20:40:31 +0100
Subject: [Python-Dev] [RELEASED] Python 3.3.3 final
In-Reply-To: 
References: <528B0C66.5090502@python.org>
Message-ID: 

On 19.11.2013 17:14, Mark Lawrence wrote:
> On 19/11/2013 06:59, Georg Brandl wrote:
>>
>> To download Python 3.3.3 rc2 visit:
>>
>> http://www.python.org/download/releases/3.3.3/
>>
>
> Please make your mind up, final or rc2?
>
> Thanks everybody for your efforts, much appreciated :)

It's my firm belief that every announce should have a small error to
appease the gods of regression. *ahem* :)

Georg

From solipsis at pitrou.net Tue Nov 19 20:59:20 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 19 Nov 2013 20:59:20 +0100
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: 
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol>
Message-ID: <20131119205920.26b9eb90@fsol>

On Tue, 19 Nov 2013 13:22:52 -0600
Tim Peters wrote:
> [Guido]
> > So using an opcode for framing is out? (Sorry, I've lost track of the
> > back-and-forth.)
>
> It was never in ;-) I'd *prefer* one, but not enough to try to block
> the PEP. As is, framing is done at a "lower level" than opcode
> decoding. I fear this is brittle, for all the usual "explicit is
> better than implicit" kinds of reasons. The only way now to know that
> you're looking at a frame size is to keep a running count of bytes
> processed and realize you've reached a byte offset where a frame size
> "is expected".

That's integrated into the built-in buffering. It's not really an
additional constraint: the frame sizes simply dictate how buffering
happens in practice. The main point of framing is to *simplify* the
buffering logic (of course, the old buffering logic is still there for
protocols <= 3, unfortunately).

Note some drawbacks of frame opcodes:
- the decoder has to sanity check the frame opcodes (what if a frame
opcode is encountered when already inside a frame?)
- a pickle-mutating function such as pickletools.optimize() may naively
ignore the frame opcodes while rearranging the pickle stream, only to
emit a new pickle with invalid frame sizes.

Regards

Antoine.

From martin at v.loewis.de Tue Nov 19 21:25:34 2013
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 19 Nov 2013 21:25:34 +0100
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <20131119205920.26b9eb90@fsol>
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol>
Message-ID: <528BC93E.9000702@v.loewis.de>

On 19.11.13 20:59, Antoine Pitrou wrote:
> That's integrated into the built-in buffering. It's not really an
> additional constraint: the frame sizes simply dictate how buffering
> happens in practice. The main point of framing is to *simplify* the
> buffering logic (of course, the old buffering logic is still there for
> protocols <= 3, unfortunately).

I wonder why this needs to be part of the pickle protocol at all,
if it really is "below" the opcodes. Anybody desiring framing could
just implement a framing version of the io.BufferedReader, which
could be used on top of a socket connection (say) to allow fetching
larger blocks from the network stack. This would then be transparent
to the pickle implementation; the framing reader would, of course,
provide the peek() operation to allow the unpickler to continue to use
buffering.

Such a framing BufferedReader might even be included in the standard
library.

Regards,
Martin

From solipsis at pitrou.net Tue Nov 19 21:28:39 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 19 Nov 2013 21:28:39 +0100
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <528BC93E.9000702@v.loewis.de>
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <528BC93E.9000702@v.loewis.de>
Message-ID: <20131119212839.266b3d67@fsol>

On Tue, 19 Nov 2013 21:25:34 +0100
"Martin v. Löwis" wrote:
> On 19.11.13 20:59, Antoine Pitrou wrote:
> > That's integrated into the built-in buffering. It's not really an
> > additional constraint: the frame sizes simply dictate how buffering
> > happens in practice. The main point of framing is to *simplify* the
> > buffering logic (of course, the old buffering logic is still there for
> > protocols <= 3, unfortunately).
>
> I wonder why this needs to be part of the pickle protocol at all,
> if it really is "below" the opcodes. Anybody desiring framing could
> just implement a framing version of the io.BufferedReader, which
> could be used on top of a socket connection (say) to allow fetching
> larger blocks from the network stack. This would then be transparent
> to the pickle implementation; the framing reader would, of course,
> provide the peek() operation to allow the unpickler to continue to use
> buffering.
>
> Such a framing BufferedReader might even be included in the standard
> library.

Well, unless you propose a patch before Saturday, I will happily
ignore your proposal.

Regards

Antoine.

From storchaka at gmail.com Tue Nov 19 21:32:11 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 19 Nov 2013 22:32:11 +0200
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <20131119205920.26b9eb90@fsol>
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol>
Message-ID: 

On 19.11.13 21:59, Antoine Pitrou wrote:
> Note some drawbacks of frame opcodes:
> - the decoder has to sanity check the frame opcodes (what if a frame
> opcode is encountered when already inside a frame?)

This is only one simple check when reading the frame opcode.

> - a pickle-mutating function such as pickletools.optimize() may naively
> ignore the frame opcodes while rearranging the pickle stream, only to
> emit a new pickle with invalid frame sizes

But with naked frame sizes, without opcodes, it has an even greater
chance of producing an invalid pickle.

From solipsis at pitrou.net Tue Nov 19 22:04:00 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 19 Nov 2013 22:04:00 +0100
Subject: [Python-Dev] PEP 428 - pathlib - ready for approval
Message-ID: <20131119220400.695a8bcd@fsol>

Hello,

Guido has told me that he was ready to approve PEP 428 (pathlib) in
its latest amended form. Here is the last call for any comments or
arguments against approval, before Guido marks the PEP accepted (or
changes his mind :-)).

Regards

Antoine.

From tim.peters at gmail.com Tue Nov 19 22:17:06 2013
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 19 Nov 2013 15:17:06 -0600
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
In-Reply-To: <20131119205920.26b9eb90@fsol>
References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol>
Message-ID: 

[Tim]
>> ...
>> better than implicit" kinds of reasons. The only way now to know that
>> you're looking at a frame size is to keep a running count of bytes
>> processed and realize you've reached a byte offset where a frame size
>> "is expected".

[Antoine]
> That's integrated into the built-in buffering.

Well, obviously, because it wouldn't work at all unless the built-in
buffering knew all about it ;-)

> It's not really an additional constraint: the frame sizes simply
> dictate how buffering happens in practice. The main point of
> framing is to *simplify* the buffering logic (of course, the old
> buffering logic is still there for protocols <= 3, unfortunately).

And always will be - there are no pickle simplifications, because
everything always sticks around forever. Over time, pickle just gets
more complicated. That's in the nature of the beast.

> Note some drawbacks of frame opcodes:
> - the decoder has to sanity check the frame opcodes (what if a frame
> opcode is encountered when already inside a frame?)
> - a pickle-mutating function such as pickletools.optimize() may naively
> ignore the frame opcodes while rearranging the pickle stream, only to
> emit a new pickle with invalid frame sizes

I suspect we have very different mental models here. By "has an
opcode", I do NOT mean "must be visible to the opcode-decoding loop".
I just mean "has a unique byte assigned in the pickle opcode space".

I expect that in the CPython implementation of unpickling, the
buffering layer would _consume_ the FRAME opcode, along with the frame
size. The opcode-decoding loop would never see it.
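As a rough sketch of that mental model (deliberately simplified: it
slurps each frame into memory and assumes reads never straddle frame
boundaries; the opcode value and the 8-byte little-endian size are the
ones protocol 4 ultimately assigned to FRAME, but treat the rest as
illustrative, not as CPython's actual implementation):

    class FramedReader:
        # Toy model of the buffering layer: it consumes each FRAME
        # opcode and its size argument itself, so the opcode-decoding
        # loop above it only ever sees the bytes *inside* frames.
        FRAME = 0x95

        def __init__(self, file):
            self.file = file
            self.buffer = b''
            self.pos = 0

        def _load_next_frame(self):
            header = self.file.read(9)  # FRAME opcode + 8-byte size
            if header[0] != self.FRAME:
                raise ValueError("expected a FRAME opcode")
            size = int.from_bytes(header[1:9], 'little')
            self.buffer = self.file.read(size)
            self.pos = 0

        def read(self, n):
            if self.pos >= len(self.buffer):
                self._load_next_frame()
            data = self.buffer[self.pos:self.pos + n]
            self.pos += n
            return data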
But if some _other_ implementation of unpickling didn't give a hoot about framing, having an explicit opcode means that implementation could ignore the whole scheme very easily: just implement the FRAME opcode in *its* opcode-decoding loop to consume the FRAME argument, ignore it, and move on. As-is, all other implementations _have_ to know everything about the buffering scheme because it's all implicit low-level magic. So, then, to the 2 points you raised: 1. If the CPython decoder ever sees a FRAME opcode, I expect it to raise an exception. That's all - it's an invalid pickle (or bug in the code) if it contains a FRAME the buffering layer didn't consume. 2. pickletools.optimize() in the CPython implementation should never see a FRAME opcode either. Initially, all I desperately ;-) want changed here is for the _buffering layer_, on the writing end, to write 9 bytes instead of 8 (1 new one for a FRAME opcode), and on the reading end to consume 9 bytes instead of 8 (extra credit if it checked the first byte to verify it really is a FRAME opcode - there's nothing wrong with sanity checks). Then it becomes _possible_ to optimize "small pickles" later (in the sense of not bothering to frame them at all). So long as frames remain implicit magic, that's impossible without moving to yet another new protocol level. From solipsis at pitrou.net Tue Nov 19 22:28:14 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 19 Nov 2013 22:28:14 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> Message-ID: <20131119222814.45cacd30@fsol> On Tue, 19 Nov 2013 15:17:06 -0600 Tim Peters wrote: > > > Note some drawbacks of frame opcodes: > > - the decoder has to sanity check the frame opcodes (what if a frame > > opcode is encountered when already inside a frame?) > > - a pickle-mutating function such as pickletools.optimize() may naively > > ignore the frame opcodes while rearranging the pickle stream, only to > > emit a new pickle with invalid frame sizes > > I suspect we have very different mental models here. By "has an > opcode", I do NOT mean "must be visible to the opcode-decoding loop". > I just mean "has a unique byte assigned in the pickle opcode space". > > I expect that in the CPython implementation of unpickling, the > buffering layer would _consume_ the FRAME opcode, along with the frame > size. The opcode-decoding loop would never see it. > > But if some _other_ implementation of unpickling didn't give a hoot > about framing, having an explicit opcode means that implementation > could ignore the whole scheme very easily: just implement the FRAME > opcode in *its* opcode-decoding loop to consume the FRAME argument, > ignore it, and move on. As-is, all other implementations _have_ to > know everything about the buffering scheme because it's all implicit > low-level magic. Ahah, ok, I see where you're going. But how many other implementations of unpickling are there? > Initially, all I desperately ;-) want changed here is for the > _buffering layer_, on the writing end, to write 9 bytes instead of 8 > (1 new one for a FRAME opcode), and on the reading end to consume 9 > bytes instead of 8 (extra credit if it checked the first byte to > verify it really is a FRAME opcode - there's nothing wrong with sanity > checks). 
> > Then it becomes _possible_ to optimize "small pickles" later (in the > sense of not bothering to frame them at all). So the CPython unpickler must be able to work with and without framing by detecting the FRAME opcode? Otherwise the "later optimization" can't work. Regards Antoine. From tim.peters at gmail.com Tue Nov 19 22:41:51 2013 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 19 Nov 2013 15:41:51 -0600 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <20131119222814.45cacd30@fsol> References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <20131119222814.45cacd30@fsol> Message-ID: [Tim] >> ... >> But if some _other_ implementation of unpickling didn't give a hoot >> about framing, having an explicit opcode means that implementation >> could ignore the whole scheme very easily: just implement the FRAME >> opcode in *its* opcode-decoding loop to consume the FRAME argument, >> ignore it, and move on. As-is, all other implementations _have_ to >> know everything about the buffering scheme because it's all implicit >> low-level magic. [Antoine] > Ahah, ok, I see where you're going. But how many other implementations > of unpickling are there? That's something you should have researched when writing the PEP ;-) How many implementations of Python aren't CPython? That's probably the answer. I'm not an expert on that, but there's more than one. >> Initially, all I desperately ;-) want changed here is for the >> _buffering layer_, on the writing end, to write 9 bytes instead of 8 >> (1 new one for a FRAME opcode), and on the reading end to consume 9 >> bytes instead of 8 (extra credit if it checked the first byte to >> verify it really is a FRAME opcode - there's nothing wrong with sanity >> checks). >> >> Then it becomes _possible_ to optimize "small pickles" later (in the >> sense of not bothering to frame them at all). > So the CPython unpickler must be able to work with and without framing > by detecting the FRAME opcode? Not at first, no. At first the buffering layer could raise an exception if there's no FRAME opcode when it expected one. Or just read up garbage bytes and assume it's a frame size, which is effectively what it's doing now anyway ;-) > Otherwise the "later optimization" can't work. Right. _If_ reducing framing overhead to "nothing" for small pickles turns out to be sufficiently desirable, then the buffering layer would need to learn how to turn itself off in the absence of a FRAME opcode immediately following the current frame. Perhaps the opcode decoding loop would also need to learn how to turn the buffering layer back on again too (next time a FRAME opcode showed up). Sounds annoying, but not impossible. From solipsis at pitrou.net Tue Nov 19 22:54:14 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 19 Nov 2013 22:54:14 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <20131119222814.45cacd30@fsol> Message-ID: <20131119225414.0604316e@fsol> On Tue, 19 Nov 2013 15:41:51 -0600 Tim Peters wrote: > [Tim] > >> ... 
> >> But if some _other_ implementation of unpickling didn't give a hoot > >> about framing, having an explicit opcode means that implementation > >> could ignore the whole scheme very easily: just implement the FRAME > >> opcode in *its* opcode-decoding loop to consume the FRAME argument, > >> ignore it, and move on. As-is, all other implementations _have_ to > >> know everything about the buffering scheme because it's all implicit > >> low-level magic. > > [Antoine] > > Ahah, ok, I see where you're going. But how many other implementations > > of unpickling are there? > > That's something you should have researched when writing the PEP ;-) > How many implementations of Python aren't CPython? That's probably > the answer. I'm not an expert on that, but there's more than one. But "how many of them use something else than Lib/pickle.py" is the actual question. > > Otherwise the "later optimization" can't work. > > Right. _If_ reducing framing overhead to "nothing" for small pickles > turns out to be sufficiently desirable, then the buffering layer would > need to learn how to turn itself off in the absence of a FRAME opcode > immediately following the current frame. Perhaps the opcode decoding > loop would also need to learn how to turn the buffering layer back on > again too (next time a FRAME opcode showed up). Sounds annoying, but > not impossible. The problem with "let's make the unpickler more lenient in a later version" is that then you have protocol 4 pickles that won't work with all protocol 4-accepting versions of the pickle module. Regards Antoine. From brett at python.org Tue Nov 19 23:02:15 2013 From: brett at python.org (Brett Cannon) Date: Tue, 19 Nov 2013 17:02:15 -0500 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: <20131119220400.695a8bcd@fsol> References: <20131119220400.695a8bcd@fsol> Message-ID: On Tue, Nov 19, 2013 at 4:04 PM, Antoine Pitrou wrote: > > Hello, > > Guido has told me that he was ready to approve PEP 428 (pathlib) in its > latest amended form. Here is the last call for any comments or > arguments against approval, before Guido marks the PEP accepted (or > changes his mind :-)). > Is 'ext' going to exist with 'suffix'? Seems redundant (I'm guessing the example is out-of-date and 'ext' was changed to suffix). And a very minor grammatical thing: "you have to lookup a dedicate attribute" -> "dedicated" -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Tue Nov 19 23:06:22 2013 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 19 Nov 2013 16:06:22 -0600 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <20131119225414.0604316e@fsol> References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <20131119222814.45cacd30@fsol> <20131119225414.0604316e@fsol> Message-ID: [Antoine] >>> Ahah, ok, I see where you're going. But how many other implementations >>> of unpickling are there? [Tim] >> That's something you should have researched when writing the PEP ;-) >> How many implementations of Python aren't CPython? That's probably >> the answer. I'm not an expert on that, but there's more than one. [Antoine] > But "how many of them use something else than Lib/pickle.py" is the > actual question. 
I don't know - and neither do you ;-) I do know that I'd like, e.g., a version of pickletools.dis() in CPython that _did_ show the framing bits, for debugging. That's a bare-bones "unpickler". I don't know how many other "partial unpicklers" exist in the wild either. But their lives would also be much easier if the framing stuff were explicit. "Mandatory optimization" should be an oxymoron ;-) > ... > The problem with "let's make the unpickler more lenient in a later > version" is that then you have protocol 4 pickles that won't work with > all protocol 4-accepting versions of the pickle module. Yup. s/4/5/ would need to be part of a delayed optimization. From solipsis at pitrou.net Tue Nov 19 23:06:41 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 19 Nov 2013 23:06:41 +0100 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: References: <20131119220400.695a8bcd@fsol> Message-ID: <20131119230641.0dd6e8e2@fsol> On Tue, 19 Nov 2013 17:02:15 -0500 Brett Cannon wrote: > On Tue, Nov 19, 2013 at 4:04 PM, Antoine Pitrou wrote: > > > > > Hello, > > > > Guido has told me that he was ready to approve PEP 428 (pathlib) in its > > latest amended form. Here is the last call for any comments or > > arguments against approval, before Guido marks the PEP accepted (or > > changes his mind :-)). > > > > Is 'ext' going to exist with 'suffix'? Seems redundant (I'm guessing the > example is out-of-date and 'ext' was changed to suffix). > > And a very minor grammatical thing: "you have to lookup a dedicate > attribute" -> "dedicated" Thanks. Both errors are now fixed. Regards Antoine. From solipsis at pitrou.net Tue Nov 19 23:09:40 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 19 Nov 2013 23:09:40 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <20131119222814.45cacd30@fsol> <20131119225414.0604316e@fsol> Message-ID: <20131119230940.35e0c912@fsol> On Tue, 19 Nov 2013 16:06:22 -0600 Tim Peters wrote: > [Antoine] > >>> Ahah, ok, I see where you're going. But how many other implementations > >>> of unpickling are there? > > [Tim] > >> That's something you should have researched when writing the PEP ;-) > >> How many implementations of Python aren't CPython? That's probably > >> the answer. I'm not an expert on that, but there's more than one. > > [Antoine] > > But "how many of them use something else than Lib/pickle.py" is the > > actual question. > > I don't know - and neither do you ;-) > > I do know that I'd like, e.g., a version of pickletools.dis() in > CPython that _did_ show the framing bits, for debugging. That's a > bare-bones "unpickler". I don't know how many other "partial > unpicklers" exist in the wild either. But their lives would also be > much easier if the framing stuff were explicit. "Mandatory > optimization" should be an oxymoron ;-) Well, I don't think it's a big deal to add a FRAME opcode if it doesn't change the current framing logic. I'd like to defer to Alexandre on this one, anyway. Regards Antoine. From martin at v.loewis.de Tue Nov 19 23:05:07 2013 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Tue, 19 Nov 2013 23:05:07 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? 
In-Reply-To: <20131119212839.266b3d67@fsol> References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <528BC93E.9000702@v.loewis.de> <20131119212839.266b3d67@fsol> Message-ID: <528BE093.7020902@v.loewis.de> Am 19.11.13 21:28, schrieb Antoine Pitrou: > Well, unless you propose a patch before Saturday, I will happily ignore > your proposal. See http://bugs.python.org/file32709/framing.diff Regards, Martin

From solipsis at pitrou.net Tue Nov 19 23:50:15 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 19 Nov 2013 23:50:15 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <528BE093.7020902@v.loewis.de> References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <528BC93E.9000702@v.loewis.de> <20131119212839.266b3d67@fsol> <528BE093.7020902@v.loewis.de> Message-ID: <20131119235015.2fb70088@fsol> On Tue, 19 Nov 2013 23:05:07 +0100 "Martin v. Löwis" wrote: > Am 19.11.13 21:28, schrieb Antoine Pitrou: > > > Well, unless you propose a patch before Saturday, I will happily ignore > > your proposal. > > See > > http://bugs.python.org/file32709/framing.diff Ok, thanks. So now that I look at the patch I see the following problems with this idea:
- "pickle + framing" becomes a different protocol than "pickle" alone, which means we lose the benefit of protocol autodetection. It's as though pickle.load() required you to give the protocol number, instead of inferring it from the pickle bytestream.
- it is less efficient than framing built inside pickle, since it adds separate buffers and memory copies (while the point of framing is to make buffering more efficient).
Your idea is morally similar to saying "we don't need to optimize the size of pickles, since you can gzip them anyway". However, the fact that the _pickle module currently goes to lengths to try to optimize buffering implies to me that it's reasonable to also improve the pickle protocol so as to optimize buffering. Regards Antoine.

From martin at v.loewis.de Wed Nov 20 00:56:13 2013 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed, 20 Nov 2013 00:56:13 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
> - it is less efficient than framing built inside pickle, since it > adds separate buffers and memory copies (while the point of framing > is to make buffering more efficient). Correct. However, if the intent is to reduce the number of system calls, then this is still achieved. > Your idea is morally similar to saying "we don't need to optimize the > size of pickles, since you can gzip them anyway". Not really. In the case of gzip, it might be that the size reduction of properly saving bytes in pickle might be even larger. Here, the wire representation, and the number of system calls is actually (nearly) identical. > However, the fact > that the _pickle module currently goes to lengths to try to optimize > buffering, implies to me that it's reasonable to also improve the > pickle protocol so as to optimize buffering. AFAICT, the real driving force is the desire to not read-ahead more than the pickle is long. This is what complicates the code. The easiest (and most space-efficient) solution to that problem would be to prefix the entire pickle with a data size field (possibly in a variable-length representation), i.e. to make a single frame. If that was done, I would guess that Tim's concerns about brittleness would go away (as you couldn't have a length field in the middle of data). IMO, the PEP has nearly the same flaw as the HTTP chunked transfer, which also puts length fields in the middle of the payload (except that HTTP makes it worse by making them optional). Of course, a single length field has other drawbacks, such as having to pickle everything before sending out the first bytes. Regards, Martin From victor.stinner at gmail.com Wed Nov 20 01:09:15 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 20 Nov 2013 01:09:15 +0100 Subject: [Python-Dev] PEP 454 (tracemalloc) close to pronouncement In-Reply-To: References: Message-ID: 2013/11/11 Charles-Fran?ois Natali : > After several exchanges with Victor, PEP 454 has reached a status > which I consider ready for pronuncement [1]: so if you have any last > minute comment, now is the time! So, what's going on? The deadline is Saturday, in 5 days. If the PEP is accepted (I hope so), should the code be merged before the beta1? Victor From solipsis at pitrou.net Wed Nov 20 01:10:50 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 20 Nov 2013 01:10:50 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <528BFA9D.8010400@v.loewis.de> References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <528BC93E.9000702@v.loewis.de> <20131119212839.266b3d67@fsol> <528BE093.7020902@v.loewis.de> <20131119235015.2fb70088@fsol> <528BFA9D.8010400@v.loewis.de> Message-ID: <20131120011050.19e88622@fsol> On Wed, 20 Nov 2013 00:56:13 +0100 "Martin v. L?wis" wrote: > AFAICT, the real driving force is the desire to not read-ahead > more than the pickle is long. This is what complicates the code. > The easiest (and most space-efficient) solution to that problem > would be to prefix the entire pickle with a data size field > (possibly in a variable-length representation), i.e. to make a > single frame. Pickling then becomes very problematic: you have to keep the entire pickle in memory until the end, when you finally can write the size at the beginning of the pickle. 
> If that was done, I would guess that Tim's concerns about brittleness > would go away (as you couldn't have a length field in the middle of > data). IMO, the PEP has nearly the same flaw as the HTTP chunked > transfer, which also puts length fields in the middle of the payload > (except that HTTP makes it worse by making them optional). Tim's concern is easily addressed with a FRAME opcode, without changing the overall scheme (as he lately proposed). Regards Antoine. From guido at python.org Wed Nov 20 01:22:10 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 19 Nov 2013 16:22:10 -0800 Subject: [Python-Dev] PEP 454 (tracemalloc) close to pronouncement In-Reply-To: References: Message-ID: I'm guessing CF is the PEP-BDFL? So he should approve it (the period for final comments is long over) and then you can merge. On Tue, Nov 19, 2013 at 4:09 PM, Victor Stinner wrote: > 2013/11/11 Charles-Fran?ois Natali : > > After several exchanges with Victor, PEP 454 has reached a status > > which I consider ready for pronuncement [1]: so if you have any last > > minute comment, now is the time! > > So, what's going on? The deadline is Saturday, in 5 days. > > If the PEP is accepted (I hope so), should the code be merged before the > beta1? > > Victor > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Wed Nov 20 01:46:20 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 20 Nov 2013 01:46:20 +0100 Subject: [Python-Dev] PEP 454 (tracemalloc) close to pronouncement In-Reply-To: References: Message-ID: 2013/11/20 Guido van Rossum : > I'm guessing CF is the PEP-BDFL? Yes, you delegated him this PEP :-) > So he should approve it (the period for > final comments is long over) and then you can merge. I'm asking if the code *must* be merged before the beta1, or if it can be merged later. Charles-Fran?ois told me that he would like to review the code after he reviewed the PEP. Victor > > On Tue, Nov 19, 2013 at 4:09 PM, Victor Stinner > wrote: >> >> 2013/11/11 Charles-Fran?ois Natali : >> > After several exchanges with Victor, PEP 454 has reached a status >> > which I consider ready for pronuncement [1]: so if you have any last >> > minute comment, now is the time! >> >> So, what's going on? The deadline is Saturday, in 5 days. >> >> If the PEP is accepted (I hope so), should the code be merged before the >> beta1? >> >> Victor >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > > -- > --Guido van Rossum (python.org/~guido) From jimjjewett at gmail.com Wed Nov 20 02:28:48 2013 From: jimjjewett at gmail.com (Jim J. Jewett) Date: Tue, 19 Nov 2013 17:28:48 -0800 (PST) Subject: [Python-Dev] Which direction is UnTransform? / Unicode is different In-Reply-To: <87a9h5hd4j.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <528c1050.8742e00a.03c3.ffffd600@mx.google.com> (Fri Nov 15 16:57:00 CET 2013) Stephen J. 
Turnbull wrote: > Serhiy Storchaka wrote: > > If the transform() method will be added, I prefer to have only > > one transformation method and specify a direction by the > > transformation name ("bzip2"/"unbzip2"). Me too. Until I consider special cases like "compress", or "lower", and realize that there are enough special cases to become a major wart if generic transforms ever became popular. > People think about these transformations as "en- or de-coding", not > "transforming", most of the time. Even for a transformation that is > an involution (eg, rot13), people have an very clear idea of what's > encoded and what's not, and they are going to prefer the names > "encode" and "decode" for these (generic) operations in many cases. I think this is one of the major stumbling blocks with unicode. I originally disagreed strongly with what Stephen wrote -- but then I realized that all my counterexamples involved unicode text. I can tell whether something is tarred or untarred, zipped or unzipped. But an 8-bit (even Latin-1, let alone ASCII) bytestring really doesn't seem "encoded", and it doesn't make sense to "decode" a perfectly readable (ASCII) string into a sequence of "code units". Nor does it help that http://www.unicode.org/glossary/#code_unit defines "code unit" as "The minimal bit combination that can represent a unit of encoded text for processing or interchange. The Unicode Standard uses 8-bit code units in the UTF-8 encoding form, 16-bit code units in the UTF-16 encoding form, and 32-bit code units in the UTF-32 encoding form. (See definition D77 in Section 3.9, Unicode Encoding Forms.)" I have to read that very carefully to avoid mentally translating it into "Code Units are *en*coded, and there are lots of different complicated encodings that I wouldn't use unless I were doing special processing or interchange." If I'm not using the network, or if my "interchange format" already looks like readable ASCII, then unicode sure sounds like a complication. I *will* get confused over which direction is encoding and which is decoding. (Removing .decode() from the (unicode) str type in 3 does help a lot, if I have a Python 3 interpreter running to check against.) I'm not sure exactly what implications the above has, but it certainly supports separating the Text Processing from the generic codecs, both in the documentation and in any potential new methods. Instead of relying on introspection of .decodes_to and .encodes_to, it would be useful to have charsetcodecs and tranformcodecs as entirely different modules, with their own separate registries. I will even note that the existing help(codecs) seems more appropriate for charsetcodecs than it does for the current conjoined module. -jJ -- If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. -jJ From ncoghlan at gmail.com Wed Nov 20 02:30:52 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 20 Nov 2013 11:30:52 +1000 Subject: [Python-Dev] PEP 454 (tracemalloc) close to pronouncement In-Reply-To: References: Message-ID: On 20 Nov 2013 10:48, "Victor Stinner" wrote: > > 2013/11/20 Guido van Rossum : > > I'm guessing CF is the PEP-BDFL? > > Yes, you delegated him this PEP :-) > > > So he should approve it (the period for > > final comments is long over) and then you can merge. > > I'm asking if the code *must* be merged before the beta1, or if it can > be merged later. Charles-Fran?ois told me that he would like to review > the code after he reviewed the PEP. 
The feature needs to be available and the API stable in beta 1, but implementation details are free to change until the first RC. And yes, this does mean that beta 1 releases are often a bit "interesting" :) Cheers, Nick. > > Victor > > > > > On Tue, Nov 19, 2013 at 4:09 PM, Victor Stinner < victor.stinner at gmail.com> > > wrote: > >> > >> 2013/11/11 Charles-Fran?ois Natali : > >> > After several exchanges with Victor, PEP 454 has reached a status > >> > which I consider ready for pronuncement [1]: so if you have any last > >> > minute comment, now is the time! > >> > >> So, what's going on? The deadline is Saturday, in 5 days. > >> > >> If the PEP is accepted (I hope so), should the code be merged before the > >> beta1? > >> > >> Victor > >> _______________________________________________ > >> Python-Dev mailing list > >> Python-Dev at python.org > >> https://mail.python.org/mailman/listinfo/python-dev > >> Unsubscribe: > >> https://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > > > > > > > -- > > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Wed Nov 20 02:35:29 2013 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 19 Nov 2013 17:35:29 -0800 Subject: [Python-Dev] [RELEASED] Python 3.3.3 final In-Reply-To: References: <528B0C66.5090502@python.org> Message-ID: How else would you know that anyone is listening? :) [yay, thanks for our being release monkey!] -gps On Tue, Nov 19, 2013 at 11:40 AM, Georg Brandl wrote: > Am 19.11.2013 17:14, schrieb Mark Lawrence: > > On 19/11/2013 06:59, Georg Brandl wrote: > >> > >> To download Python 3.3.3 rc2 visit: > >> > >> http://www.python.org/download/releases/3.3.3/ > >> > > > > Please make your mind up, final or rc2? > > > > Thanks everybody for your efforts, much appreciated :) > > It's my firm belief that every announce should have a small error > to appease the gods of regression. *ahem* > > :) > Georg > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/greg%40krypto.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Nov 20 06:18:09 2013 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 19 Nov 2013 23:18:09 -0600 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <528BFA9D.8010400@v.loewis.de> References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <528BC93E.9000702@v.loewis.de> <20131119212839.266b3d67@fsol> <528BE093.7020902@v.loewis.de> <20131119235015.2fb70088@fsol> <528BFA9D.8010400@v.loewis.de> Message-ID: [Martin v. L?wis] > ... > AFAICT, the real driving force is the desire to not read-ahead > more than the pickle is long. This is what complicates the code. > The easiest (and most space-efficient) solution to that problem > would be to prefix the entire pickle with a data size field > (possibly in a variable-length representation), i.e. 
to make a > single frame. In a bout of giddy optimism, I suggested that earlier in the thread. It would be sweet :-) > If that was done, I would guess that Tim's concerns about brittleness > would go away (as you couldn't have a length field in the middle of > data). IMO, the PEP has nearly the same flaw as the HTTP chunked > transfer, which also puts length fields in the middle of the payload > (except that HTTP makes it worse by making them optional). > > Of course, a single length field has other drawbacks, such as having > to pickle everything before sending out the first bytes. And that's the killer. Pickle strings are generally produced incrementally, in smallish pieces. But that may go on for very many iterations, and there's no way to guess the final size in advance. I only see three ways to do it: 1. Hope the whole string fits in RAM. 2. Pickle twice, the first time just to get the final size (& throw the pickle pieces away on the first pass while summing their sizes). 3. Flush the pickle string to disk periodically, then after it's done read it up and copy it to the intended output stream. All of those really suck :-( BTW, I'm not a web guy: in what way is HTTP chunked transfer mode viewed as being flawed? Everything I ever read about it seemed to think it was A Good Idea. From greg.ewing at canterbury.ac.nz Wed Nov 20 06:19:05 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 20 Nov 2013 18:19:05 +1300 Subject: [Python-Dev] Which direction is UnTransform? / Unicode is different In-Reply-To: <528c1050.8742e00a.03c3.ffffd600@mx.google.com> References: <528c1050.8742e00a.03c3.ffffd600@mx.google.com> Message-ID: <528C4649.20707@canterbury.ac.nz> My thought on this for the day, for what it's worth: Anything that doesn't have directions clearly identifiable as "encoding" and "decoding" maybe shouldn't be called a "codec"? -- Greg From alexandre at peadrop.com Wed Nov 20 09:40:38 2013 From: alexandre at peadrop.com (Alexandre Vassalotti) Date: Wed, 20 Nov 2013 00:40:38 -0800 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <20131119230940.35e0c912@fsol> References: <20131116191526.16f9b79c@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <20131119222814.45cacd30@fsol> <20131119225414.0604316e@fsol> <20131119230940.35e0c912@fsol> Message-ID: On Tue, Nov 19, 2013 at 2:09 PM, Antoine Pitrou wrote: > > Well, I don't think it's a big deal to add a FRAME opcode if it doesn't > change the current framing logic. I'd like to defer to Alexandre on this > one, anyway. Looking at the different options available to us: 1A. Mandatory framing (+) Allows the internal buffering layer of the Unpickler to rely on the presence of framing to simplify its implementation. (-) Forces all implementations of pickle to include support for framing if they want to use the new protocol. (-) Cannot be removed from future versions of the Unpickler without breaking protocols which mandates framing. 1B. Optional framing (+) Could allow optimizations to disable framing if beneficial (e.g., when pickling to and unpickling from a string). 2A. With explicit FRAME opcode (+) Makes optional framing simpler to implement. (+) Makes variable-length encoding of the frame size simpler to implement. (+) Makes framing visible to pickletools. (-) Adds an extra byte of overhead to each frames. 2B. No opcode 3A. 
With fixed 8-bytes headers (+) Is simple to implement (-) Adds overhead to small pickles. 3B. With variable-length headers (-) Requires Pickler implemention to do extra data copies when pickling to strings. 4A. Framing baked-in the pickle protocol (+) Enables faster implementations 4B. Framing through a specialized I/O buffering layer (+) Could be reused by other modules I may change my mind as I work on the implementation, but at least for now, I think the combination of 1B, 2A, 3A, 4A will be a reasonable compromise here. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed Nov 20 11:07:28 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 20 Nov 2013 20:07:28 +1000 Subject: [Python-Dev] Accepting PEP 456 (Secure hash algorithm) Message-ID: Christian has indicated he now considers PEP 456, which adds an updated and configurable hash algorithm ready for pronouncement ( http://www.python.org/dev/peps/pep-0456/) I am happy the PEP and the associated implementation represent a desirable improvement to CPython, and approve of the proposal to experiment further with the small string optimisation during the beta period, before either enabling it or removing it entirely in beta 2. Accordingly, I am formally accepting the PEP for inclusion in Python 3.4 :) Regards, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Wed Nov 20 11:50:38 2013 From: christian at python.org (Christian Heimes) Date: Wed, 20 Nov 2013 11:50:38 +0100 Subject: [Python-Dev] Accepting PEP 456 (Secure hash algorithm) In-Reply-To: References: Message-ID: <528C93FE.70208@python.org> Am 20.11.2013 11:07, schrieb Nick Coghlan: > Christian has indicated he now considers PEP 456, which adds an updated > and configurable hash algorithm ready for pronouncement > (http://www.python.org/dev/peps/pep-0456/) > > I am happy the PEP and the associated implementation represent a > desirable improvement to CPython, and approve of the proposal to > experiment further with the small string optimisation during the beta > period, before either enabling it or removing it entirely in beta 2. > > Accordingly, I am formally accepting the PEP for inclusion in Python 3.4 :) Thanks a lot, Nick! :) The PEP has landed in revision http://hg.python.org/cpython/rev/adb471b9cba1 . I don't expect any test failures as I have tested the PEP on a lot of platforms. The new code compiles and passes its tests on Linux, Windows, BSD, HUPX, Solaris with all supported CPUs (little and big endian, 32 bit and 64 bit). Christian From victor.stinner at gmail.com Wed Nov 20 12:29:33 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 20 Nov 2013 12:29:33 +0100 Subject: [Python-Dev] Accepting PEP 456 (Secure hash algorithm) In-Reply-To: <528C93FE.70208@python.org> References: <528C93FE.70208@python.org> Message-ID: 2013/11/20 Christian Heimes : > The PEP has landed in revision > http://hg.python.org/cpython/rev/adb471b9cba1 . I don't expect any test > failures as I have tested the PEP on a lot of platforms. The new code > compiles and passes its tests on Linux, Windows, BSD, HUPX, Solaris with > all supported CPUs (little and big endian, 32 bit and 64 bit). It looks like dict, set and frozenset representation (repr(...)) now depends on the platform (probably 32 bit vs 64 bit), even if PYTHONHASHSEED is set. I don't know if it's an issue or not. This is why test_gdb was failing. 
I fixed test_gdb in: http://hg.python.org/cpython/rev/11cb1c8faf11 The fix runs Python with a fixed PYTHONHASHSEED to get repr(value), and then compare it to the value from gdb. Victor From victor.stinner at gmail.com Wed Nov 20 12:41:09 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Wed, 20 Nov 2013 12:41:09 +0100 Subject: [Python-Dev] Accepting PEP 456 (Secure hash algorithm) In-Reply-To: References: <528C93FE.70208@python.org> Message-ID: 2013/11/20 Victor Stinner : > It looks like dict, set and frozenset representation (repr(...)) now > depends on the platform (probably 32 bit vs 64 bit), even if > PYTHONHASHSEED is set. I don't know if it's an issue or not. In Python 3.3, repr(set("abcd")) with PYTHONHASHSEED=0 always give "{'a', 'c', 'b', 'd'}" on 32 bit and 64 bit platforms. In Python 3.4, repr(set("abcd")) with PYTHONHASHSEED=0 now gives "{'b', 'a', 'c', 'd'}" on 32 bit.and "{'a', 'b', 'c', 'd'}" on 64 bit. Victor From christian at python.org Wed Nov 20 12:54:08 2013 From: christian at python.org (Christian Heimes) Date: Wed, 20 Nov 2013 12:54:08 +0100 Subject: [Python-Dev] Accepting PEP 456 (Secure hash algorithm) In-Reply-To: References: <528C93FE.70208@python.org> Message-ID: <528CA2E0.3050305@python.org> Am 20.11.2013 12:41, schrieb Victor Stinner: > 2013/11/20 Victor Stinner : >> It looks like dict, set and frozenset representation (repr(...)) >> now depends on the platform (probably 32 bit vs 64 bit), even if >> PYTHONHASHSEED is set. I don't know if it's an issue or not. > > In Python 3.3, repr(set("abcd")) with PYTHONHASHSEED=0 always give > "{'a', 'c', 'b', 'd'}" on 32 bit and 64 bit platforms. > > In Python 3.4, repr(set("abcd")) with PYTHONHASHSEED=0 now gives > "{'b', 'a', 'c', 'd'}" on 32 bit.and "{'a', 'b', 'c', 'd'}" on 64 > bit. I have fixed the problem in http://hg.python.org/cpython/rev/961d832d8734 . My attempt to mix the hash's MSB into LSB for 32 bit builds was the culprit. It should be secure enough without the op. From steve at pearwood.info Wed Nov 20 13:03:03 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 20 Nov 2013 23:03:03 +1100 Subject: [Python-Dev] Which direction is UnTransform? / Unicode is different In-Reply-To: <528c1050.8742e00a.03c3.ffffd600@mx.google.com> References: <87a9h5hd4j.fsf@uwakimon.sk.tsukuba.ac.jp> <528c1050.8742e00a.03c3.ffffd600@mx.google.com> Message-ID: <20131120120302.GD2085@ando> On Tue, Nov 19, 2013 at 05:28:48PM -0800, Jim J. Jewett wrote: > > > (Fri Nov 15 16:57:00 CET 2013) Stephen J. Turnbull wrote: > > > Serhiy Storchaka wrote: > > > > If the transform() method will be added, I prefer to have only > > > one transformation method and specify a direction by the > > > transformation name ("bzip2"/"unbzip2"). > > Me too. Until I consider special cases like "compress", or "lower", > and realize that there are enough special cases to become a major wart > if generic transforms ever became popular. I'm not sure I understand this comment. Why are "compress" and "lower" special cases? If there's a "compress" codec, presumably there'll be an "uncompress" or "expand" that reverses it. In the case of "lower", it's not losslessly reversable, but there's certainly a reverse transformation, "upper". Some transformations are their own reverse, e.g. "rot13". In that case, there's no need for an unrot13 codec, since applying it twice undoes it. > > People think about these transformations as "en- or de-coding", not > > "transforming", most of the time. 
Even for a transformation that is > > an involution (eg, rot13), people have an very clear idea of what's > > encoded and what's not, and they are going to prefer the names > > "encode" and "decode" for these (generic) operations in many cases. > > I think this is one of the major stumbling blocks with unicode. > > I originally disagreed strongly with what Stephen wrote -- but then > I realized that all my counterexamples involved unicode text. Counterexamples to what? Again, I'm afraid I can't really understand what point you're trying to make here. Perhaps an explicit counterexample, and an explicit statement of what you're disagreeing with (e.g. "I disagree that people have a clear example of what's encoded and what's not") will help. [...] > But an 8-bit (even Latin-1, let alone ASCII) bytestring really doesn't > seem "encoded", and it doesn't make sense to "decode" a perfectly > readable (ASCII) string into a sequence of "code units". Of course it is encoded. There's nothing "a"-like about the byte 0x61, byte 0x2E is nothing like a period, and there is nothing about the byte 0x0A that forces text editors to start a new line -- or should that be 0x0D, or even possibly 0x85? There's nothing that distinguishes the text "spam" from the four-byte integer 1936744813 (0x7370616d in hex) except the semantics that we grant it, and that includes an implicit transformation 0x73 <-> "s", etc. Reading this may help: www.joelonsoftware.com/articles/Unicode.html? > Nor does it help that http://www.unicode.org/glossary/#code_unit > defines "code unit" as "The minimal bit combination that can represent > a unit of encoded text for processing or interchange. The Unicode > Standard uses 8-bit code units in the UTF-8 encoding form, 16-bit code > units in the UTF-16 encoding form, and 32-bit code units in the UTF-32 > encoding form. (See definition D77 in Section 3.9, Unicode Encoding > Forms.)" I agree that the official Unicode glossary is unfortunately confusing. It has a huge amount of information, often with confusingly similar terminology (code points and code units are, in a sense, opposites), and it's quite hard for beginners to Unicode to make sense of it all. > I have to read that very carefully to avoid mentally translating it > into "Code Units are *en*coded, Code units *are* encoded, in the sense that we say a burger is cooked. Take a raw meat patty and cook it, and you get a burger. Similarly, code units are the product of an encoding process, hence have been encoded. Code points (think of them as characters, modulo a few technicalities) are encoded *into* code units, which are bytes. Which code units you get depend on the encoding form you use, i.e. the codec. If you start with the character "a", and apply the UTF-8 encoding, you get a single 8-bit (one byte) code unit, 0x61. If you apply the UTF-16 (big endian) encoding, you get a single 16-bit (two bytes) code unit, 0x0061. If you apply UTF-32be codec, you get a single 32-bit (four bytes) code unit, 0x00000061. > and there are lots of different > complicated encodings that I wouldn't use unless I were doing special > processing or interchange." Very few of those encodings are Unicode. With the exception of a small handful of UTF-* codecs, and maybe one or two others, the vast majority are legacy encodings from the Bad Old Days when just about every computer had it's own distinct character set, or sets. 
If you're a Windows user, the non-UTF codecs (all the Latin-whatever codecs, Big5, cp-whatever, koi8-whatever, there are dozens of them) are basically old Windows code pages and the equivalent from other computer systems. And yes, it is best to avoid them like the plague except when you need them for interoperability with legacy data. > If I'm not using the network, or if my > "interchange format" already looks like readable ASCII, then unicode > sure sounds like a complication. It's not, not compared to the Bad Old Days. If you're like me, you remember when you couldn't exchange text files from Macintosh to Windows and visa versa without data being corrupted. Now, so long as both sides use Unicode, data corruption ought to be a thing of the past. It's not, but only because some operating systems still insist on using non-Unicode encodings by default. > I *will* get confused over which > direction is encoding and which is decoding. (Removing .decode() > from the (unicode) str type in 3 does help a lot, if I have a Python 3 > interpreter running to check against.) It took me a long time to learn that text encodes to bytes, and bytes decode back to text. Using Python 3 really helped with that. -- Steven From rosuav at gmail.com Wed Nov 20 13:13:32 2013 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 20 Nov 2013 23:13:32 +1100 Subject: [Python-Dev] Which direction is UnTransform? / Unicode is different In-Reply-To: <20131120120302.GD2085@ando> References: <87a9h5hd4j.fsf@uwakimon.sk.tsukuba.ac.jp> <528c1050.8742e00a.03c3.ffffd600@mx.google.com> <20131120120302.GD2085@ando> Message-ID: On Wed, Nov 20, 2013 at 11:03 PM, Steven D'Aprano wrote: >> I *will* get confused over which >> direction is encoding and which is decoding. (Removing .decode() >> from the (unicode) str type in 3 does help a lot, if I have a Python 3 >> interpreter running to check against.) > > It took me a long time to learn that text encodes to bytes, and bytes > decode back to text. Using Python 3 really helped with that. Rule of thumb: Stuff gets encoded for transmission/storage and decoded for usage. That covers encryption (you transmit the coded form and read the decoded), compression (you store the tighter form and use the expanded), Unicode (you store bytes, you work with characters), and quite a few others. I don't know that it's an iron-clad rule (though I can't off-hand think of a contrary example), but it's certainly an easy way to remember a lot of the encode/decode pairs. ChrisA From garth at garthy.com Wed Nov 20 13:25:20 2013 From: garth at garthy.com (Garth Bushell) Date: Wed, 20 Nov 2013 12:25:20 +0000 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: <20131119230641.0dd6e8e2@fsol> References: <20131119220400.695a8bcd@fsol> <20131119230641.0dd6e8e2@fsol> Message-ID: I've noticed in pathlib.py the following error on line 39 if sys.getwindowsversion()[:2] >= (6, 0) and sys.version_info >= (3, 2): it should be:- if sys.getwindowsversion()[2:] >= (6, 0) and sys.version_info >= (3, 2): I'm also quite uneasy on the case insensitive comparison on Windows as the File system NTFS is case sensitive. """Current Windows file systems, like NTFS, are case-sensitive; that is a readme.txt and a Readme.txt can exist in the same directory. 
Windows disallows the user to create a second file differing only in case due to compatibility issues with older software not designed for such operation.""" (http://en.wikipedia.org/wiki/Case_sensitivity) If people create .PY files it wouldn't work on Linux so why make it work on windows? Cheers Garth On 19 November 2013 22:06, Antoine Pitrou wrote: On Tue, 19 Nov 2013 17:02:15 -0500 Brett Cannon wrote: > On Tue, Nov 19, 2013 at 4:04 PM, Antoine Pitrou wrote: > > > > > Hello, > > > > Guido has told me that he was ready to approve PEP 428 (pathlib) in its > > latest amended form. Here is the last call for any comments or > > arguments against approval, before Guido marks the PEP accepted (or > > changes his mind :-)). > > > > Is 'ext' going to exist with 'suffix'? Seems redundant (I'm guessing the > example is out-of-date and 'ext' was changed to suffix). > > And a very minor grammatical thing: "you have to lookup a dedicate > attribute" -> "dedicated" Thanks. Both errors are now fixed. Regards Antoine. _______________________________________________ Python-Dev mailing list Python-Dev at python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/garth%40garthy.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Wed Nov 20 13:49:17 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 20 Nov 2013 13:49:17 +0100 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: References: <20131119220400.695a8bcd@fsol> <20131119230641.0dd6e8e2@fsol> Message-ID: <20131120134917.33725342@fsol> On Wed, 20 Nov 2013 12:25:20 +0000 Garth Bushell wrote: > > I'm also quite uneasy on the case insensitive comparison on Windows as the > File system NTFS is case sensitive. > > """Current Windows file systems, like NTFS, are case-sensitive; that is a > readme.txt and a Readme.txt can exist in the same directory. Windows > disallows the user to create a second file differing only in case due to > compatibility issues with older software not designed for such > operation.""" (http://en.wikipedia.org/wiki/Case_sensitivity) Well the path class is named WindowsPath, not NTFSPath. In other words, it embodies path semantics as exposed by the Windows system and API, not what NTFS is able to do. Having per-filesystem concrete path classes would quickly grow of control (do we need a separate class for FAT32 filesystems? what if Windows later switches to another filesystem?). The PEP already points to a corresponding discussion: http://www.python.org/dev/peps/pep-0428/#case-sensitivity > If people create .PY files it wouldn't work on Linux so why make it work on > windows? What do you mean with "work"? What I know is that if I save a something.PY file under Windows and then double-click on it in the Explorer, it will be launched with the Python interpreter. Regards Antoine. From walter at livinglogic.de Wed Nov 20 14:38:46 2013 From: walter at livinglogic.de (=?UTF-8?B?V2FsdGVyIETDtnJ3YWxk?=) Date: Wed, 20 Nov 2013 14:38:46 +0100 Subject: [Python-Dev] Which direction is UnTransform? / Unicode is different In-Reply-To: <528c1050.8742e00a.03c3.ffffd600@mx.google.com> References: <528c1050.8742e00a.03c3.ffffd600@mx.google.com> Message-ID: <528CBB66.1040604@livinglogic.de> On 20.11.13 02:28, Jim J. Jewett wrote: > [...] 
> Instead of relying on introspection of .decodes_to and .encodes_to, it > would be useful to have charsetcodecs and tranformcodecs as entirely > different modules, with their own separate registries. I will even > note that the existing help(codecs) seems more appropriate for > charsetcodecs than it does for the current conjoined module. I don't understand how a registry of transformation functions would simplify code. Without the transform() method I would write: >>> import binascii >>> binascii.hexlify(b'foo') b'666f6f' With the transform() method I should be able to write: >>> b'foo'.transform("hex") However how does the hex transformer get registered in the registry? If the hex transformer is not part of the stdlib, there must be some code that does the registration, but to get that code to execute, I'd have to import a module, so we're back to square one, as I'd have to write: >>> import hex_transformer >>> b'foo'.transform("hex") A way around this would be some kind of import magic, but is this really neccessary to be able to avoid one import statement? Furthermore different transformation functions might have different additional options. Supporting those is simple when we have simple transformation functions: The functions has arguments, and those are documented where the function is documented. If we want to support custom options for the .transform() method, transform() would have to pass along *args, **kwargs to the underlying transformer. However this is difficult to document in a way that makes it easy to find which options exist for a particular transformer. Servus, Walter From ncoghlan at gmail.com Wed Nov 20 14:44:57 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 20 Nov 2013 23:44:57 +1000 Subject: [Python-Dev] Which direction is UnTransform? / Unicode is different In-Reply-To: <528CBB66.1040604@livinglogic.de> References: <528c1050.8742e00a.03c3.ffffd600@mx.google.com> <528CBB66.1040604@livinglogic.de> Message-ID: On 20 November 2013 23:38, Walter D?rwald wrote: > On 20.11.13 02:28, Jim J. Jewett wrote: > >> [...] >> >> Instead of relying on introspection of .decodes_to and .encodes_to, it >> would be useful to have charsetcodecs and tranformcodecs as entirely >> different modules, with their own separate registries. I will even >> note that the existing help(codecs) seems more appropriate for >> charsetcodecs than it does for the current conjoined module. > > > I don't understand how a registry of transformation functions would simplify > code. Without the transform() method I would write: > > >>> import binascii > >>> binascii.hexlify(b'foo') > b'666f6f' > > With the transform() method I should be able to write: > > >>> b'foo'.transform("hex") > > However how does the hex transformer get registered in the registry? If the > hex transformer is not part of the stdlib, there must be some code that does > the registration, but to get that code to execute, I'd have to import a > module, so we're back to square one, as I'd have to write: > > >>> import hex_transformer > >>> b'foo'.transform("hex") > > A way around this would be some kind of import magic, but is this really > neccessary to be able to avoid one import statement? Could we please move discussion of hypothetical future divisions of the codec namespace into additional APIs to python-ideas? 
We don't even have consensus to restore the codecs module to parity with its Python 2 functionality at this point, let alone agreement on adding more convenience APIs for functionality that we have core developers arguing shouldn't be restored at all. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ethan at stoneleaf.us Wed Nov 20 15:01:04 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 20 Nov 2013 06:01:04 -0800 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: References: <20131119220400.695a8bcd@fsol> <20131119230641.0dd6e8e2@fsol> Message-ID: <528CC0A0.3070105@stoneleaf.us> On 11/20/2013 04:25 AM, Garth Bushell wrote: > > I'm also quite uneasy on the case insensitive comparison on Windows as the File system NTFS is case sensitive. No, it's case-preserving. > """Current Windows file systems, like NTFS, are case-sensitive; that is a readme.txt and a Readme.txt can exist in the > same directory. Windows disallows the user to create a second file differing only in case due to compatibility issues > with older software not designed for such operation.""" (http://en.wikipedia.org/wiki/Case_sensitivity) I just did some tests on my Win7 NTFS file system and I was unable to get two files in the same directory which differed only by case using either cmd or explorer. > If people create .PY files it wouldn't work on Linux so why make it work on windows? Because it *does* work on windows. c:\temp > copy con: hello.PY print "Hello World!" ^Z c:\temp > \python273\python --> import hello Hello World! Just one of the many things to be aware of if attempting to write cross-platform code. -- ~Ethan~ From guido at python.org Wed Nov 20 16:54:14 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Nov 2013 07:54:14 -0800 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: <20131120134917.33725342@fsol> References: <20131119220400.695a8bcd@fsol> <20131119230641.0dd6e8e2@fsol> <20131120134917.33725342@fsol> Message-ID: On Wed, Nov 20, 2013 at 4:49 AM, Antoine Pitrou wrote: > On Wed, 20 Nov 2013 12:25:20 +0000 > Garth Bushell wrote: > > > > I'm also quite uneasy on the case insensitive comparison on Windows as > the > > File system NTFS is case sensitive. > > > > """Current Windows file systems, like NTFS, are case-sensitive; that is a > > readme.txt and a Readme.txt can exist in the same directory. Windows > > disallows the user to create a second file differing only in case due to > > compatibility issues with older software not designed for such > > operation.""" (http://en.wikipedia.org/wiki/Case_sensitivity) > > Well the path class is named WindowsPath, not NTFSPath. In other words, > it embodies path semantics as exposed by the Windows system and API, > not what NTFS is able to do. Having per-filesystem concrete path > classes would quickly grow of control (do we need a separate class for > FAT32 filesystems? what if Windows later switches to another > filesystem?). > > The PEP already points to a corresponding discussion: > http://www.python.org/dev/peps/pep-0428/#case-sensitivity > > > If people create .PY files it wouldn't work on Linux so why make it work > on > > windows? > > What do you mean with "work"? > What I know is that if I save a something.PY file under Windows and then > double-click on it in the Explorer, it will be launched with the Python > interpreter. 
> Also, let's not forget that apart from comparison (Path('a') == Path('A')), matching and globbing, the WindowsPath class does not normalize the case of pathname components (only slash vs. backslash, redundant [back]slashes, and redundant '.' components are handled). So if you are in the unusual circumstances where you have to use case-sensitive paths on Windows, you can still use WindowsPath. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at v.loewis.de Wed Nov 20 16:44:21 2013 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 20 Nov 2013 16:44:21 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <528BC93E.9000702@v.loewis.de> <20131119212839.266b3d67@fsol> <528BE093.7020902@v.loewis.de> <20131119235015.2fb70088@fsol> <528BFA9D.8010400@v.loewis.de> Message-ID: <528CD8D5.4010702@v.loewis.de> Am 20.11.13 06:18, schrieb Tim Peters: > BTW, I'm not a web guy: in what way is HTTP chunked transfer mode > viewed as being flawed? Everything I ever read about it seemed to > think it was A Good Idea. It just didn't work for some time, see e.g. http://bugs.python.org/issue1486335 http://bugs.python.org/issue1966 http://bugs.python.org/issue1312980 http://bugs.python.org/issue3761 It's not that the protocol was underspecified - just the implementation was "brittle" (if I understand that word correctly). And I believe (and agree with you) that the cause of this "difficult to implement" property is putting the framing "in the middle" of the stack (i.e. not really *below* pickle itself, but into pickle, below the opcodes - just like HTTP chunked transfer is "in" HTTP, but below the content encoding). Regards, Martin From eric at trueblade.com Wed Nov 20 17:04:43 2013 From: eric at trueblade.com (Eric V. Smith) Date: Wed, 20 Nov 2013 11:04:43 -0500 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: <528CC0A0.3070105@stoneleaf.us> References: <20131119220400.695a8bcd@fsol> <20131119230641.0dd6e8e2@fsol> <528CC0A0.3070105@stoneleaf.us> Message-ID: <528CDD9B.7000709@trueblade.com> On 11/20/2013 09:01 AM, Ethan Furman wrote: > On 11/20/2013 04:25 AM, Garth Bushell wrote: >> >> I'm also quite uneasy on the case insensitive comparison on Windows as >> the File system NTFS is case sensitive. > > No, it's case-preserving. > >> """Current Windows file systems, like NTFS, are case-sensitive; that >> is a readme.txt and a Readme.txt can exist in the >> same directory. Windows disallows the user to create a second file >> differing only in case due to compatibility issues >> with older software not designed for such operation.""" >> (http://en.wikipedia.org/wiki/Case_sensitivity) > > I just did some tests on my Win7 NTFS file system and I was unable to > get two files in the same directory which differed only by case using > either cmd or explorer. I think the confusion comes from the difference between what NTFS can do and what the Win32 (or whatever it's now called) layer allows you to do. Rumor has it that the old POSIX subsystem allowed NTFS to create 2 files in the same directory that differed only by case.
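A quick way to see the Win32-level behaviour from Python is a sketch like the following (illustrative only; run it in a scratch directory on an NTFS volume):

import os

# Under the default Win32 semantics the second open() resolves to the
# *same* file, so only one directory entry remains; on a POSIX
# filesystem the two names denote two distinct files.
with open('readme.txt', 'w') as f:
    f.write('lower\n')
with open('README.TXT', 'w') as f:
    f.write('upper\n')
print(os.listdir('.'))            # one entry on Windows, two on Linux
print(open('readme.txt').read())  # prints 'upper' on Windows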
"Filenames are Case Sensitive on NTFS Volumes" http://support.microsoft.com/kb/100625 I realize that for practical purposes, NTFS is seen as case-preserving, but deep down inside it's case-sensitive. Eric. From guido at python.org Wed Nov 20 17:07:15 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Nov 2013 08:07:15 -0800 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <528CD8D5.4010702@v.loewis.de> References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <528BC93E.9000702@v.loewis.de> <20131119212839.266b3d67@fsol> <528BE093.7020902@v.loewis.de> <20131119235015.2fb70088@fsol> <528BFA9D.8010400@v.loewis.de> <528CD8D5.4010702@v.loewis.de> Message-ID: A problem with chunked IIRC is that the frame headers are variable-length (a CRLF followed by a hex number followed by some optional gunk followed by CRLF) so you have to drop back into one-byte-at-a-time to read it. (Well, I suppose you could read 5 bytes, which is the minimum: CR, LF, X, CR, LF, and when the second CR isn't among these, you have a lower bound for how much more to read, although at that point you better load up on coffee before writing the rest of the code. :-) Some good things about it: - Explicit final frame (byte count zero), so no need to rely on the data to know the end. - The redundancy in the format (start and end with CRLF, hex numbers) makes it more likely that framing errors (e.g. due to an incorrect counting or some layer collapsing CRLF into LF) are detected. On Wed, Nov 20, 2013 at 7:44 AM, "Martin v. L?wis" wrote: > Am 20.11.13 06:18, schrieb Tim Peters: > > BTW, I'm not a web guy: in what way is HTTP chunked transfer mode > > viewed as being flawed? Everything I ever read about it seemed to > > think it was A Good Idea. > > It just didn't work for some time, see e.g. > > http://bugs.python.org/issue1486335 > http://bugs.python.org/issue1966 > http://bugs.python.org/issue1312980 > http://bugs.python.org/issue3761 > > It's not that the protocol was underspecified - just the implementation > was "brittle" (if I understand that word correctly). And I believe (and > agree with you) that the cause for this "difficult to implement" > property is that the framing is in putting framing "in the middle" > of the stack (i.e. not really *below* pickle itself, but into pickle > but below the opcodes - just like http chunked transfer is "in" http, > but below the content encoding). > > Regards, > Martin > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Wed Nov 20 17:09:52 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Nov 2013 08:09:52 -0800 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: <528CC0A0.3070105@stoneleaf.us> References: <20131119220400.695a8bcd@fsol> <20131119230641.0dd6e8e2@fsol> <528CC0A0.3070105@stoneleaf.us> Message-ID: On Wed, Nov 20, 2013 at 6:01 AM, Ethan Furman wrote: > On 11/20/2013 04:25 AM, Garth Bushell wrote: > >> >> I'm also quite uneasy on the case insensitive comparison on Windows as >> the File system NTFS is case sensitive. >> > > No, it's case-preserving. > It's quite possible that you are both right -- possibly the filesystem driver supports foo and FOO in the same directory but the kernel I/O layer prohibits that. Stranger things have happened. (IIRC the behavior might have been intended so that NT could get a "POSIX compliant" stamp of approval -- using a different set of kernel interfaces that don't enforce case insensitive matching.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From python-dev at masklinn.net Wed Nov 20 17:55:21 2013 From: python-dev at masklinn.net (Xavier Morel) Date: Wed, 20 Nov 2013 17:55:21 +0100 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: References: <20131119220400.695a8bcd@fsol> <20131119230641.0dd6e8e2@fsol> <528CC0A0.3070105@stoneleaf.us> Message-ID: <3D8B8E04-BB9B-4353-856C-04250B01EA50@masklinn.net> On 2013-11-20, at 17:09 , Guido van Rossum wrote: > On Wed, Nov 20, 2013 at 6:01 AM, Ethan Furman wrote: > On 11/20/2013 04:25 AM, Garth Bushell wrote: > > I'm also quite uneasy on the case insensitive comparison on Windows as the File system NTFS is case sensitive. > > No, it's case-preserving. > > It's quite possible that you are both right -- possibly the filesystem driver supports foo and FOO in the same directory but the kernel I/O layer prohibits that. Stranger things have happened. (IIRC the behaviour might have been intended so that NT could get a "POSIX compliant" stamp of approval -- using a different set of kernel interfaces that don't enforce case insensitive matching.) The Subsystem for Unix-based Applications (SUA, formerly WSU and née Interix) does expose NTFS's case sensitivity, but FWIW it was originally developed outside of Microsoft and independently from NTFS (and a few years later). It looks like NTFS's designers simply didn't feel completely beholden to the limitations of the Windows API. From martin at v.loewis.de Wed Nov 20 18:16:35 2013 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 20 Nov 2013 18:16:35 +0100 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: <528CDD9B.7000709@trueblade.com> References: <20131119220400.695a8bcd@fsol> <20131119230641.0dd6e8e2@fsol> <528CC0A0.3070105@stoneleaf.us> <528CDD9B.7000709@trueblade.com> Message-ID: <528CEE73.3000604@v.loewis.de> Am 20.11.13 17:04, schrieb Eric V. Smith: > I think the confusion comes from the difference between what NTFS can do > and what the Win32 (or whatever it's now called) layer allows you to do. > Rumor has it that the old Posix subsystem allowed NTFS to create 2 files > in the same directory that differed only by case. Actually, the Windows API *does* support case-sensitivity in file names, through the FILE_FLAG_POSIX_SEMANTICS flag to CreateFileW.
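For illustration, the call can be made from Python with ctypes -- a minimal, untested sketch (the flag value 0x01000000 is taken from the SDK headers):

import ctypes
from ctypes import wintypes

GENERIC_WRITE = 0x40000000
CREATE_NEW = 1
FILE_ATTRIBUTE_NORMAL = 0x00000080
FILE_FLAG_POSIX_SEMANTICS = 0x01000000  # from the SDK headers

kernel32 = ctypes.windll.kernel32
kernel32.CreateFileW.restype = wintypes.HANDLE

# Request POSIX (case-sensitive) name semantics for this one call;
# whether it is honored depends on the kernel setting discussed below.
h = kernel32.CreateFileW('README.TXT', GENERIC_WRITE, 0, None,
                         CREATE_NEW,
                         FILE_ATTRIBUTE_NORMAL | FILE_FLAG_POSIX_SEMANTICS,
                         None)
if h == wintypes.HANDLE(-1).value:
    raise ctypes.WinError()
kernel32.CloseHandle(h)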
Unfortunately, in recent Windows versions, the kernel object manager (a layer above the file system driver, but below the Windows API) started setting the kernel variable for case insensitivity of object names to true, defeating the purpose of this flag: http://support.microsoft.com/kb/929110 So if you set ObCaseInsensitive to 0, reboot, then use CreateFileW with FILE_FLAG_POSIX_SEMANTICS, then you should be able to create two files in the same directory that differ only in case - i.e. the system would be case-sensitive. Regards, Martin From solipsis at pitrou.net Wed Nov 20 19:08:39 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 20 Nov 2013 19:08:39 +0100 Subject: [Python-Dev] PEP 3156 (asyncio) formally accepted Message-ID: <20131120190839.5b91d65d@fsol> Hello, I have decided to mark PEP 3156 accepted. This reflects the fact that the implementation is now in the stdlib tree, and its API has been pretty much validated during all previous discussions (mostly on the tulip mailing-list). I cannot stress enough that the main preoccupation right now should be to write and integrate module docs inside the CPython source tree :-) Congratulations to Guido and all people involved! This is a large and very significant addition to the stdlib, and as far as I can tell both design and implementation have benefitted from the contributions of many people (perhaps someone should add them to Misc/ACKS?). Regards Antoine. From guido at python.org Wed Nov 20 19:35:38 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Nov 2013 10:35:38 -0800 Subject: [Python-Dev] PEP 3156 (asyncio) formally accepted In-Reply-To: <20131120190839.5b91d65d@fsol> References: <20131120190839.5b91d65d@fsol> Message-ID: Thanks Antoine! I will stop pretending to myself that I can finish the PEP this week and instead focus on getting the docs in the CPython repo bootstrapped. I would also like to thank the many contributors to the design and implementation. (I've tried to mention everyone in the PEP's Acknowledgements section -- if you know of someone who has contributed but isn't mentioned yet, please let me know.) It's been a great ride! (And it isn't over yet. :-) --Guido On Wed, Nov 20, 2013 at 10:08 AM, Antoine Pitrou wrote: > > Hello, > > I have decided to mark PEP 3156 accepted. This reflects the fact that > the implementation is now in the stdlib tree, and its API has been > pretty much validated during all previous discussions (mostly on the > tulip mailing-list). > > I cannot stress enough that the main preoccupation right now should be > to write and integrate module docs inside the CPython source tree :-) > > Congratulations to Guido and all people involved! This is a large and > very significant addition to the stdlib, and as far as I can tell both > design and implementation have benefitted from the contributions of many > people (perhaps someone should add them to Misc/ACKS?). > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tismer at stackless.com Wed Nov 20 21:52:10 2013 From: tismer at stackless.com (Christian Tismer) Date: Wed, 20 Nov 2013 21:52:10 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 Message-ID: <528D20FA.8040902@stackless.com> Howdy friends, according to PEP 404, there will never be an official Python 2.8. The migration path is from 2.7 to 3.x. I agree with this strategy in almost all consequences but this one: Many customers are forced to stick with Python 2.X because of other products, but they require a Python 2.X version which can be compiled using Visual Studio 2010 or better. This is considered an improvement and not a bug fix, where I disagree. So I see many Python 2.7 repositories on BB/GH which are hacked to support VS 2010, but they are not converted with the same quality in mind that fits our standards. My question ----------- I have created a very clean Python 2.7.6+ based CPython with the Stackless additions, that compiles with VS 2010, using the adapted project structure of Python 3.3.X, and I want to publish that on the Stackless website as the official "Stackless Python 2.8". If you consider Stackless as official ;-) . This compiler change is currently the only deviation from CPython 2.7, but we may support a few easy back-ports on customer demand. We don't add any random stuff, of course. My question is if that is safe to do: Can I rely on PEP 404 that the "Python 2.8" namespace never will clash with CPython? And if not, what do you suggest then? It will be submitted by end of November, thanks for your quick responses! all the best -- Chris -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From jsbueno at python.org.br Wed Nov 20 22:15:02 2013 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Wed, 20 Nov 2013 19:15:02 -0200 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528D20FA.8040902@stackless.com> References: <528D20FA.8040902@stackless.com> Message-ID: I'd say publishing high-profile installable code with a "Python 2.8" name would cause a lot of undesired confusion to start with. I usually lecture on Python to present the language to college students and I.T. workers - and explaining away the current versioning scheme (use either 2.7 or 3.3) is hard already. Having to add an explanation about a downloadable and installable Python 2.8 that would be incompatible with extensions compiled on PyPI would be tough, and I doubt it could even be done without making your project look bad in the process. Can't you just mark it as a "Visual Studio 2010" version instead? js -><- On 20 November 2013 18:52, Christian Tismer wrote: > Howdy friends, > > according to PEP 404, there will never be an official Python 2.8. > The migration path is from 2.7 to 3.x. > > I agree with this strategy in almost all consequences but this one: > > Many customers are forced to stick with Python 2.X because of other > products, but they require a Python 2.X version which can be compiled > using Visual Studio 2010 or better. > This is considered an improvement and not a bug fix, where I disagree. 
> So I see many Python 2.7 repositories on BB/GH which are hacked to support > VS 2010, but they are not converted with the same quality in mind > that fits our standards. > > My question > ----------- > > I have created a very clean Python 2.7.6+ based CPython with the Stackless > additions, that compiles with VS 2010, using the adapted project structure > of Python 3.3.X, and I want to publish that on the Stackless website as the > official "Stackless Python 2.8". If you consider Stackless as official ;-) > . > > This compiler change is currently the only deviation from CPython 2.7, > but we may support a few easy back-ports on customer demand. We don't add > any random stuff, of course. > > My question is if that is safe to do: > Can I rely on PEP 404 that the "Python 2.8" namespace never will clash > with CPython? > > And if not, what do you suggest then? > It will be submitted by end of November, thanks for your quick responses! > > all the best -- Chris > > -- > Christian Tismer :^) > Software Consulting : Have a break! Take a ride on Python's > Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ > 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de > phone +49 173 24 18 776 fax +49 (30) 700143-0023 > PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 > whom do you want to sponsor today? http://www.stackless.com/ > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > jsbueno%40python.org.br > -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.rodola at gmail.com Wed Nov 20 22:39:43 2013 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Wed, 20 Nov 2013 22:39:43 +0100 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: <20131119220400.695a8bcd@fsol> References: <20131119220400.695a8bcd@fsol> Message-ID: On Tue, Nov 19, 2013 at 10:04 PM, Antoine Pitrou wrote: > > Hello, > > Guido has told me that he was ready to approve PEP 428 (pathlib) in its > latest amended form. Here is the last call for any comments or > arguments against approval, before Guido marks the PEP accepted (or > changes his mind :-)). > > Regards > > Antoine. > Isn't this redundant? >>> Path.cwd() PosixPath('/home/antoine/pathlib') Probably this is just personal taste but I'd prefer the more explicit: >>> Path(os.getcwd()) PosixPath('/home/antoine/pathlib') I understand all the os.* replication (Path.rename, Path.stat etc.) but all these methods assume you're dealing with an instantiated Path instance whereas Path.cwd is the only one which doesn't. Not a big deal anyway, it just caught my eye because it's different. Other than that the module looks absolutely awesome and a big improvement over os.path! --- Giampaolo https://code.google.com/p/pyftpdlib/ https://code.google.com/p/psutil/ https://code.google.com/p/pysendfile/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Nov 20 22:42:42 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Nov 2013 13:42:42 -0800 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: <20131119220400.695a8bcd@fsol> References: <20131119220400.695a8bcd@fsol> Message-ID: On Tue, Nov 19, 2013 at 1:04 PM, Antoine Pitrou wrote: > Guido has told me that he was ready to approve PEP 428 (pathlib) in its > latest amended form. 
Here is the last call for any comments or > arguments against approval, before Guido marks the PEP accepted (or > changes his mind :-)). > Congrats Antoine! I've approved your PEP. Go ahead with the integration. Everybody: pathlib is an immediately useful standard way for manipulating file paths, both platform-specific and platform-independent (so you can work with Windows paths on a POSIX system and vice versa). pathlib won't replace os.path immediately (the current status is "provisional", which means that the API may still be tweaked after 3.4 is released), but in the long term I expect os.path to die a quiet death (as well as the host of 3rd party path modules that preceded pathlib). Great job Antoine! -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Wed Nov 20 23:01:48 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 20 Nov 2013 23:01:48 +0100 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: References: <20131119220400.695a8bcd@fsol> Message-ID: <20131120230148.7e42d073@fsol> On Wed, 20 Nov 2013 13:42:42 -0800 Guido van Rossum wrote: > On Tue, Nov 19, 2013 at 1:04 PM, Antoine Pitrou wrote: > > > Guido has told me that he was ready to approve PEP 428 (pathlib) in its > > latest amended form. Here is the last call for any comments or > > arguments against approval, before Guido marks the PEP accepted (or > > changes his mind :-)). > > > > Congrats Antoine! I've approved your PEP. Go ahead with the integration. > > Everybody: pathlib is an immediately useful standard way for manipulating > file paths, both platform-specific and platform-independent (so you can > work with Windows paths on a POSIX system and vice versa). > > pathlib won't replace os.path immediately (the current status is > "provisional", which means that the API may still be tweaked after 3.4 is > released), but in the long term I expect os.path to die a quiet death (as > well as the host of 3rd party path modules that preceded pathlib). pathlib imports many modules at startup, so for scripts for which startup time is critical using os.path may still be the best option. But thanks anyway :) Regards Antoine. From tismer at stackless.com Wed Nov 20 23:04:28 2013 From: tismer at stackless.com (Christian Tismer) Date: Wed, 20 Nov 2013 23:04:28 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> Message-ID: <528D31EC.5030908@stackless.com> My question is not answered at all, sorry Joao! I did not ask a teacher for his opinion on Stackless, but the community about the validity of PEP 404. I don't want a Python 2.7 that does not install correctly, because people don't read instructions. And exactly that will happen if I submit a modified Python 2.7 to PyPI. This is a topic on Stackless Python, and I am asking python-dev before I do it. But people know this has its limits. cheers - chris On 20/11/13 22:15, Joao S. O. Bueno wrote: > I'd say publishing high-profile installable code with a "Python 2.8" > name would cause a lot of undesired confusion to start with. > > I usually lecture on Python to present the language to college > students and I.T. workers - and explaining away the current versioning > scheme (use either 2.7 or 3.3) is hard already. Having to add an > explanation about a downloadable and installable Python 2.8 that would > be incompatible with extensions compiled on PyPI would be tough,
and I > doubt it could even be done without making your project look bad in > the process. > > Can't you just mark it as a "Visual Studio 2010" version instead? > > js > -><- > > > > > > On 20 November 2013 18:52, Christian Tismer > wrote: > > Howdy friends, > > according to PEP 404, there will never be an official Python 2.8. > The migration path is from 2.7 to 3.x. > > I agree with this strategy in almost all consequences but this one: > > Many customers are forced to stick with Python 2.X because of other > products, but they require a Python 2.X version which can be compiled > using Visual Studio 2010 or better. > This is considered an improvement and not a bug fix, where I disagree. > > So I see many Python 2.7 repositories on BB/GH which are hacked to > support > VS 2010, but they are not converted with the same quality in mind > that fits our standards. > > > My question > ----------- > > I have created a very clean Python 2.7.6+ based CPython with the > Stackless > additions, that compiles with VS 2010, using the adapted project > structure > of Python 3.3.X, and I want to publish that on the Stackless > website as the > official "Stackless Python 2.8". If you consider Stackless as > official ;-) . > > This compiler change is currently the only deviation from CPython 2.7, > but we may support a few easy back-ports on customer demand. We > don't add > any random stuff, of course. > > My question is if that is safe to do: > Can I rely on PEP 404 that the "Python 2.8" namespace never will clash > with CPython? > > And if not, what do you suggest then? > It will be submitted by end of November, thanks for your quick > responses! > > all the best -- Chris > > -- > Christian Tismer :^) > Software Consulting : Have a break! Take a ride on Python's > Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ > 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de > phone +49 173 24 18 776 fax +49 (30) 700143-0023 > PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 > whom do you want to sponsor today? http://www.stackless.com/ > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/tismer%40stackless.com -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Wed Nov 20 23:15:58 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 20 Nov 2013 22:15:58 +0000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528D31EC.5030908@stackless.com> References: <528D20FA.8040902@stackless.com> <528D31EC.5030908@stackless.com> Message-ID: On 20 November 2013 22:04, Christian Tismer wrote: > My question is not answered at all, sorry Joao! 
> I did not ask a teacher for his opinion on Stackless, but the community > about the > validity of PEP 404. > > I don't want a Python 2.7 that does not install correctly, because people > don't read instructions. And exactly that will happen if I submit a modified > Python 2.7 to PyPI. > > This is a topic on Stackless Python, and I am asking python-dev before I do > it. > But people know this has its limits. PEP 404 says there will be no Python 2.8. In my view, if you release something named (Stackless) Python 2.8, that seems to me to be pretty unfriendly given python-dev's clear intentions. Call it Stackless Python 2.7 plus for Visual Studio 2010 if you want, but using the version number 2.8 is bound to confuse at least some newcomers about the status of the Python 2.x line, and that's what PEP 404 was intended to avoid. Paul From solipsis at pitrou.net Wed Nov 20 23:25:01 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 20 Nov 2013 23:25:01 +0100 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: References: <20131119220400.695a8bcd@fsol> Message-ID: <20131120232501.28613181@fsol> On Wed, 20 Nov 2013 13:42:42 -0800 Guido van Rossum wrote: > On Tue, Nov 19, 2013 at 1:04 PM, Antoine Pitrou wrote: > > > Guido has told me that he was ready to approve PEP 428 (pathlib) in its > > latest amended form. Here is the last call for any comments or > > arguments against approval, before Guido marks the PEP accepted (or > > changes his mind :-)). > > > > Congrats Antoine! I've approved your PEP. Go ahead with the integration. > > Everybody: pathlib is an immediately useful standard way for manipulating > file paths, both platform-specific and platform-independent (so you can > work with Windows paths on a POSIX system and vice versa). > > pathlib won't replace os.path immediately (the current status is > "provisional", which means that the API may still be tweaked after 3.4 is > released), but in the long term I expect os.path to die a quiet death (as > well as the host of 3rd party path modules that preceded pathlib). > > Great job Antoine! I've posted the implementation at http://bugs.python.org/issue19673 I know you've already more or less reviewed it, so I'm gonna commit in the coming days anyway :) Regards Antoine. From barry at python.org Wed Nov 20 23:30:44 2013 From: barry at python.org (Barry Warsaw) Date: Wed, 20 Nov 2013 17:30:44 -0500 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528D20FA.8040902@stackless.com> References: <528D20FA.8040902@stackless.com> Message-ID: <20131120173044.77f6fcef@anarchist> On Nov 20, 2013, at 09:52 PM, Christian Tismer wrote: >Many customers are forced to stick with Python 2.X because of other products, >but they require a Python 2.X version which can be compiled using Visual >Studio 2010 or better. This is considered an improvement and not a bug fix, >where I disagree. I'm not so sure about that. Python 2.7 can still get patches to help extend its useful life by allowing it to be built with newer compiler suites. I believe this has already been done for various Linux compilers. I see no non-technical reason why Python 2.7 can't be taught how to build with VS 2010 or newer. Details are subject to RM approval, IMHO. >I have created a very clean Python 2.7.6+ based CPython with the Stackless >additions, that compiles with VS 2010, using the adapted project structure >of Python 3.3.X, and I want to publish that on the Stackless website as the >official "Stackless Python 2.8". 
If you consider Stackless as official ;-) . > >This compiler change is currently the only deviation from CPython 2.7, >but we may support a few easy back-ports on customer demand. We don't add >any random stuff, of course. I think you're just going to confuse everyone if you call it "Stackless Python 2.8" and it will do more harm than good. Cheers, -Barry From tim.peters at gmail.com Wed Nov 20 23:37:28 2013 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 20 Nov 2013 16:37:28 -0600 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <528CD8D5.4010702@v.loewis.de> References: <20131116191526.16f9b79c@fsol> <20131118171421.2dc8e9a2@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <528BC93E.9000702@v.loewis.de> <20131119212839.266b3d67@fsol> <528BE093.7020902@v.loewis.de> <20131119235015.2fb70088@fsol> <528BFA9D.8010400@v.loewis.de> <528CD8D5.4010702@v.loewis.de> Message-ID: [Tim] >> BTW, I'm not a web guy: in what way is HTTP chunked transfer mode >> viewed as being flawed? Everything I ever read about it seemed to >> think it was A Good Idea. [Martin] > It just didn't work for some time, see e.g. > > http://bugs.python.org/issue1486335 > http://bugs.python.org/issue1966 > http://bugs.python.org/issue1312980 > http://bugs.python.org/issue3761 > > It's not that the protocol was underspecified - just the implementation > was "brittle" (if I understand that word correctly). "Easily broken in catastrophic ways" is close, like a chunk of peanut brittle can shatter into a gazillion pieces if you drop it on the floor. http://en.wikipedia.org/wiki/Brittle_(food) Or like the infinite loops in some of the bug reports, "just because" some server screwed up the protocol a little at EOF. But for pickling there are a lot fewer picklers than HTTP transport creators ;-) So I'm not much worried about that. Another of the bug reports amounted just to that urllib, at first, didn't support chunked transfer mode at all. > And I believe (and > agree with you) that the cause of this "difficult > to implement" property is putting the framing "in the middle" > of the stack (i.e. not really *below* pickle itself, but into pickle, > below the opcodes - just like HTTP chunked transfer is "in" HTTP, > but below the content encoding). It's certainly messy that way. But doable, and I expect the people working on it are more than capable enough to get it right, by at latest the 4th try ;-) From breamoreboy at yahoo.co.uk Wed Nov 20 23:41:54 2013 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Wed, 20 Nov 2013 22:41:54 +0000 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: <20131120230148.7e42d073@fsol> References: <20131119220400.695a8bcd@fsol> <20131120230148.7e42d073@fsol> Message-ID: On 20/11/2013 22:01, Antoine Pitrou wrote: > > pathlib imports many modules at startup, so for scripts for which > startup time is critical using os.path may still be the best option. > Will there be or is there a note to this effect in the docs? -- Python is the second best programming language in the world. But the best has yet to be invented. 
Christian Tismer Mark Lawrence From chris.barker at noaa.gov Wed Nov 20 23:52:23 2013 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 20 Nov 2013 14:52:23 -0800 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528D20FA.8040902@stackless.com> References: <528D20FA.8040902@stackless.com> Message-ID: On Wed, Nov 20, 2013 at 12:52 PM, Christian Tismer wrote: > according to PEP 404, there will never be an official Python 2.8. > The migration path is from 2.7 to 3.x. > > I agree with this strategy in almost all consequences but this one: > > Many customers are forced to stick with Python 2.X because of other > products, but they require a Python 2.X version which can be compiled > using Visual Studio 2010 or better. > This is considered an improvement and not a bug fix, where I disagree. > maybe we need to continue that discussion then -- and call this a bug fix. It seems clear to me that the Python X.Y version number specifies a set of features and their implementation, not a particular build with a particular compiler. So a Python 2.7 build with VS2010 (or any other compiler for that matter) is still Python 2.7. And for what it's worth, Stackless aside I'd love to see a Python 2.7 binary built with a newer MS compiler -- for instance, the latest Python plugin for Visual Studio lets you debug Python and C extensions side by side -- very cool, and only available with VS2012. Sorry I'm out of the loop, but are there patches required to use newer VS compilers to build 2.7? Or is this really just a matter of what the official python.org builds are built with? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From mcepl at redhat.com Thu Nov 21 00:08:30 2013 From: mcepl at redhat.com (Matej Cepl) Date: Thu, 21 Nov 2013 00:08:30 +0100 Subject: [Python-Dev] [issue19494] Problems with HTTPBasicAuthHandler Message-ID: <20131120230829.GA9778@wycliff.ceplovi.cz> Hello, I am trying to work on fixing issue 19494 (HTTPBasicAuthHandler doesn't work with GitHub and other websites which require a prior Authorization header in the first request). I have created this testing script calling the GitHub v3 API using new "authentication handlers" (they are subclasses of HTTPSHandler, not HTTPBasicAuthHandler) and hoped that once I manage to make this script work, I could rewrite it into a patch against the CPython code.
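What these handlers are meant to produce is essentially the following -- a minimal sketch using only the stdlib (note that the base64 result is bytes and has to be decoded to str before it goes into the header):

import base64
import urllib.request

# Attach Basic credentials to the *first* request, without waiting
# for a 401 challenge from the server.
creds = base64.standard_b64encode(b'user:password').decode('ascii')
req = urllib.request.Request('https://api.github.com/user')
req.add_header('Authorization', 'Basic ' + creds)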
However, when I run this script I get this error (using python3-3.3.2-8.el7.x86_64)

matej at wycliff: $ python3 test_create_1.py
DEBUG::gh_url = https://api.github.com/repos/mcepl/odt2rst/issues/
DEBUG::req = 
DEBUG::req.type = https
DEBUG:http_request:type = https
Traceback (most recent call last):
  File "test_create_1.py", line 80, in <module>
    handler = opener.open(req, json.dumps(create_data).encode('utf8'))
  File "/usr/lib64/python3.3/urllib/request.py", line 475, in open
    response = meth(req, response)
  File "/usr/lib64/python3.3/urllib/request.py", line 587, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib64/python3.3/urllib/request.py", line 513, in error
    return self._call_chain(*args)
  File "/usr/lib64/python3.3/urllib/request.py", line 447, in _call_chain
    result = func(*args)
  File "/usr/lib64/python3.3/urllib/request.py", line 595, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 401: Unauthorized
matej at wycliff: $

Could anybody suggest what I am doing wrong? I guess I need to somehow include the original HTTPBasicAuthHandler in the opener (for the error handlers). Any ideas how to do it? Best, Matěj Cepl -- http://www.ceplovi.cz/matej/, Jabber: mceplceplovi.cz GPG Finger: 89EF 4BC6 288A BF43 1BAB 25C3 E09F EF25 D964 84AC - Do you think of yourself as a Christian artist? - I'm an artist who is a Christian. I'm not a Christian artist. -- Johnny Cash in his last interview =======================================================================

import base64
import http.client
import urllib.request
import json
import os
import logging
from configparser import ConfigParser

logging.basicConfig(format='%(levelname)s:%(funcName)s:%(message)s',
                    level=logging.DEBUG)


class HTTPBasicPriorAuthHandler(urllib.request.HTTPHandler):
    handler_order = 400

    def __init__(self, *args, **kwargs):
        passwd_mgr = kwargs.pop('password_mgr', None)
        urllib.request.HTTPHandler.__init__(self, *args, **kwargs)
        self.passwd = passwd_mgr

    def http_request(self, req):
        logging.debug("type = {}".format(req.type))
        if not req.has_header('Authorization'):
            user, passwd = self.passwd.find_user_password(
                None, req.get_full_url())
            auth_str = base64.standard_b64encode(
                '{}:{}'.format(user, passwd).encode('ascii'))
            req.add_header('Authorization', 'Basic {}'.format(auth_str))
        return urllib.request.AbstractHTTPHandler.do_request_(self, req)


if hasattr(http.client, 'ssl'):
    handler_order = 400

    class HTTPSBasicPriorAuthHandler(urllib.request.HTTPSHandler):
        def __init__(self, *args, **kwargs):
            passwd_mgr = kwargs.pop('password_mgr', None)
            urllib.request.HTTPSHandler.__init__(self, *args, **kwargs)
            self.passwd = passwd_mgr

        https_request = HTTPBasicPriorAuthHandler.http_request
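# (Illustrative aside, not part of the original script: handler_order = 400
# sorts these handlers ahead of the stock urllib handlers, whose default
# BaseHandler.handler_order is 500, so the http_request/https_request
# pre-processors above run before the request is actually sent.)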
{0}".format(gh_url)) req = urllib.request.Request(gh_url) logging.debug('req = {}'.format(req)) logging.debug('req.type = {}'.format(req.type)) create_data = { 'title': 'Testing bug summary', 'body': '''This is a testing bug. I am writing here this stuff, just to have some text for body.''', 'labels': ['BEbug'] } handler = opener.open(req, json.dumps(create_data).encode('utf8')) print(handler.getcode()) print(handler) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 190 bytes Desc: not available URL: From tismer at stackless.com Thu Nov 21 00:36:46 2013 From: tismer at stackless.com (Christian Tismer) Date: Thu, 21 Nov 2013 00:36:46 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131120173044.77f6fcef@anarchist> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> Message-ID: <528D478E.1040105@stackless.com> Hey Barry, On 20.11.13 23:30, Barry Warsaw wrote: > On Nov 20, 2013, at 09:52 PM, Christian Tismer wrote: > >> Many customers are forced to stick with Python 2.X because of other products, >> but they require a Python 2.X version which can be compiled using Visual >> Studio 2010 or better. This is considered an improvement and not a bug fix, >> where I disagree. > I'm not so sure about that. Python 2.7 can still get patches to help extend > its useful life by allowing it to be built with newer compiler suites. I > believe this has already been done for various Linux compilers. I see no > non-technical reason why Python 2.7 can't be taught how to build with VS 2010 > or newer. Details are subject to RM approval, IMHO. > >> I have created a very clean Python 2.7.6+ based CPython with the Stackless >> additions, that compiles with VS 2010, using the adapted project structure >> of Python 3.3.X, and I want to publish that on the Stackless website as the >> official "Stackless Python 2.8". If you consider Stackless as official ;-) . >> >> This compiler change is currently the only deviation from CPython 2.7, >> but we may support a few easy back-ports on customer demand. We don'd add >> any random stuff, of course. > I think you're just going to confuse everyone if you call it "Stackless Python > 2.8" and it will do more harm than good. Barry, that's a good thing! This way I have a chance to get my build in at all. And that's the question, after re-thinking: Where can I check my change in, if it is going to be accepted as a valid 2.7 bug fix (concerning VS 2008 as a bug, that is is)? I was intending to do this since a year but was stopped by MVL's influence. Maybe this needs to be re-thought, since I think different. What I do not know: Should I simply check this in to a python 2.7.x-VS2010 branch? Or what do you suggest, then? In any case, my question still stands, and I will do something with the Stackless branch by end of November. Please influence me ASAP, I don't intend to do any harm, but that problem is caused simply by my existence, and I want to stick with that for another few decades. If I think something must be done, then I have my way to do it. If you have good reasons to prevent this, then you should speak up in the next 10 days, or it will happen. Is that ok with you? Hugs -- your friend Chris -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 
121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tjreedy at udel.edu Thu Nov 21 01:23:30 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 20 Nov 2013 19:23:30 -0500 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131120173044.77f6fcef@anarchist> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> Message-ID: On 11/20/2013 5:30 PM, Barry Warsaw wrote: > On Nov 20, 2013, at 09:52 PM, Christian Tismer wrote: > >> Many customers are forced to stick with Python 2.X because of other products, >> but they require a Python 2.X version which can be compiled using Visual >> Studio 2010 or better. This is considered an improvement and not a bug fix, >> where I disagree. > > I'm not so sure about that. Python 2.7 can still get patches to help extend > its useful life by allowing it to be built with newer compiler suites. I > believe this has already been done for various Linux compilers. I see no > non-technical reason why Python 2.7 can't be taught how to build with VS 2010 > or newer. Details are subject to RM approval, IMHO. With the availability of the free 2008 express edition being problematical, I would really like to see this. >> I have created a very clean Python 2.7.6+ based CPython with the Stackless >> additions, that compiles with VS 2010, using the adapted project structure >> of Python 3.3.X, and I want to publish that on the Stackless website as the >> official "Stackless Python 2.8". If you consider Stackless as official ;-) . A compiler change is not a language change and does not merit a new language version number. Anyway I believe that using 'Python 2.8' would be illegal without PSF approval. From http://www.python.org/psf/trademarks/ '"Python" is a registered trademark of the PSF.' (PSF Trademarks Committee, psf-trademarks at python.org) The core developers determine the meaning of 'Python x.y' and we have determined that there shall be no 'Python 2.8'. Or if you prefer, we have defined it to be the empty language ;-). >> This compiler change is currently the only deviation from CPython 2.7, It is not a deviation from 'Python 2.7'. The compiler used by the PSF *CPython* distribution is an implementation detail. For all I know, other distributors, such as ActiveState, already distribute 2.7 compiled with other compilers. Your new Stackless would still run Python 2.7. >> but we may support a few easy back-ports on customer demand. We don't add >> any random stuff, of course. I suspect that you (or your customers) will not be the only people who want selected 3.x features without the 3.0 shift to unicode text and who do not care about the 3.3 improvement of the unicode implementation. But such hybrid mixtures are not 'Python x.y'. If you make such mixtures on a private customer-by-customer basis, then no public name is needed. > I think you're just going to confuse everyone if you call it "Stackless Python > 2.8" and it will do more harm than good. Avoiding confusion is the purpose of registering trademarks. 
-- Terry Jan Reedy From solipsis at pitrou.net Thu Nov 21 01:31:20 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Nov 2013 01:31:20 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> Message-ID: <20131121013120.4c827da8@fsol> On Wed, 20 Nov 2013 17:30:44 -0500 Barry Warsaw wrote: > On Nov 20, 2013, at 09:52 PM, Christian Tismer wrote: > > >Many customers are forced to stick with Python 2.X because of other products, > >but they require a Python 2.X version which can be compiled using Visual > >Studio 2010 or better. This is considered an improvement and not a bug fix, > >where I disagree. > > I'm not so sure about that. Python 2.7 can still get patches to help extend > its useful life by allowing it to be built with newer compiler suites. I > believe this has already been done for various Linux compilers. I see no > non-technical reason why Python 2.7 can't be taught how to build with VS 2010 > or newer. Details are subject to RM approval, IMHO. I think it isn't only about teaching it to build with VS 2010, but providing binaries compatible with the VS 2010 runtime. Otherwise, AFAIU, if extensions are built with VS 2010 but loaded with a VS 2008-compiled Python, there will be ABI problems. Regards Antoine. From tismer at stackless.com Thu Nov 21 01:36:55 2013 From: tismer at stackless.com (Christian Tismer) Date: Thu, 21 Nov 2013 01:36:55 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <528D31EC.5030908@stackless.com> Message-ID: <528D55A7.2020305@stackless.com> Yes Paul, On 20.11.13 23:15, Paul Moore wrote: > On 20 November 2013 22:04, Christian Tismer wrote: >> My question is not answered at all, sorry Joao! >> I did not ask a teacher for his opinion on Stackless, but the community >> about the >> validity of PEP 404. >> >> I don't want a Python 2.7 that does not install correctly, because people >> don't read instructions. And exactly that will happen if I submit a modified >> Python 2.7 to PyPI. >> >> This is a topic on Stackless Python, and I am asking python-dev before I do >> it. >> But people know this has its limits. > PEP 404 says there will be no Python 2.8. In my view, if you release > something named (Stackless) Python 2.8, that seems to me to be pretty > unfriendly given python-dev's clear intentions. Call it Stackless > Python 2.7 plus for Visual Studio 2010 if you want, but using the > version number 2.8 is bound to confuse at least some newcomers about > the status of the Python 2.x line, and that's what PEP 404 was > intended to avoid. I see what you intend, but I am not convinced what's best. Building a version that is numbered the same as existing versions, but binary incompatible is IMHO much more confusing than a version 2.8 which clearly says "if you are not 2.8, then you are not compatible". I am addressing Windows users, and they usually click a version, and if it doesn't work, they complain. The reason to go this way _was_ simplicity, moving to an impossible version, that justifies itself by that impossibility. They would have to install a whole series of packages, which they all would get from my site, and it works. And to repeat: Stackless Python is a slightly different, very compatible version, that has never got approval about being official or compatible. The reason that I am asking is to minimize the friction that I had to envision, anyway. 
It is your chance to minimize that friction by giving me good reasons to re-think my decision. But it will happen, soon, in one way or the other. So please don't think I am begging for something - I am offering something, but it will happen. Best regards -- Chris -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From solipsis at pitrou.net Thu Nov 21 01:35:43 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Nov 2013 01:35:43 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? References: <20131116191526.16f9b79c@fsol> Message-ID: <20131121013543.6e93ce37@fsol> Hello, I have made two last-minute changes to the PEP: - addition of the FRAME opcode, as discussed with Tim, and keeping a fixed 8-byte frame size - addition of the MEMOIZE opcode, courtesy of Alexandre, which replaces PUT opcodes in protocol 4 and helps shrink the size of pickles If there's no further opposition, I'd like to mark this PEP accepted (or let someone else do it) in 24 hours, so that the implementation can be integrated before Sunday. Regards Antoine. On Sat, 16 Nov 2013 19:15:26 +0100 Antoine Pitrou wrote: > > Hello, > > Alexandre Vassalotti (thanks a lot!) has recently finalized his work on > the PEP 3154 implementation - pickle protocol 4. > > I think it would be good to get the PEP and the implementation accepted > for 3.4. As far as I can say, this has been a low-controversy proposal, > and it brings fairly obvious improvements to the table (which table?). > I still need some kind of BDFL or BDFL delegate to do that, though -- > unless I am allowed to mark my own PEP accepted :-) > > (I've asked Tim, specifically, for comments, since he contributed a lot > to previous versions of the pickle protocol.) > > The PEP is at http://www.python.org/dev/peps/pep-3154/ (should be > rebuilt soon by the server, I'd say) > > Alexandre's implementation is tracked at > http://bugs.python.org/issue17810 > > Regards > > Antoine. > > From tim.peters at gmail.com Thu Nov 21 01:45:53 2013 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 20 Nov 2013 18:45:53 -0600 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <20131121013543.6e93ce37@fsol> References: <20131116191526.16f9b79c@fsol> <20131121013543.6e93ce37@fsol> Message-ID: [Antoine] > I have made two last-minute changes to the PEP: > > - addition of the FRAME opcode, as discussed with Tim, and keeping a > fixed 8-byte frame size Cool! > - addition of the MEMOIZE opcode, courtesy of Alexandre, which replaces > PUT opcodes in protocol 4 and helps shrink the size of pickles Long overdue - clever idea! > If there's no further opposition, I'd like to mark this PEP accepted > (or let someone else do it) in 24 hours, so that the implementation can > be integrated before Sunday. I think Guido already spoke on this - but, if he didn't, I will. Accepted :-) From chris.barker at noaa.gov Wed Nov 20 23:43:26 2013 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 20 Nov 2013 14:43:26 -0800 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: References: <20131119220400.695a8bcd@fsol> Message-ID: On Wed, Nov 20, 2013 at 1:39 PM, Giampaolo Rodola' wrote: > Isn't this redundant? 
> > >>> Path.cwd() > PosixPath('/home/antoine/pathlib') > > Probably this is just personal taste but I'd prefer the more explicit: > > >>> Path(os.getcwd()) > PosixPath('/home/antoine/pathlib') > > I understand all the os.* replication (Path.rename, Path.stat etc.) but > all these methods assume you're dealing with an instantiated Path instance > whereas Path.cwd is the only one which doesn't. > That's a standard pattern for using a classmethod as an alternate constructor -- similar to: datetime.now() dict.fromkeys() etc.... and it can really reduce namespace clutter (do you really want to import both os and path for this? (OK, you may well be importing os anyway, but...) It also allows you to get that functionality in a subclass. Anyway, it's not that different, and I like it ;-) Other than that the module looks absolutely awesome and a big improvement > over os.path! > indeed -- thanks for moving this forward, it is a long time coming! By the way, for us dinosaurs is this going to exactly match the pathlib implementation that can be used with py2? -Chris > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Nov 21 01:51:59 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Nov 2013 01:51:59 +0100 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval References: <20131119220400.695a8bcd@fsol> Message-ID: <20131121015159.3a534c9b@fsol> On Wed, 20 Nov 2013 14:43:26 -0800 Chris Barker wrote: > > By the way, for us dinosaurs is this going to exactly match the > pathlib implementation that can be used with py2? pathlib up to 0.8 (on PyPI) has a different API - since there were so many changes done as part of the release process. When pathlib-in-the-stdlib stabilizes, I plan to release a pathlib 1.0 on PyPI that will integrate the PEP's API. In the meantime, if you don't mind installing from VCS, you can clone the Mercurial repo (https://bitbucket.org/pitrou/pathlib/) and then check out the "pep428" branch. It's 2.7-compatible. Regards Antoine. From solipsis at pitrou.net Thu Nov 21 01:54:40 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Nov 2013 01:54:40 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131121013543.6e93ce37@fsol> Message-ID: <20131121015440.19887df2@fsol> On Wed, 20 Nov 2013 18:45:53 -0600 Tim Peters wrote: > [Antoine] > > I have made two last-minute changes to the PEP: > > > > - addition of the FRAME opcode, as discussed with Tim, and keeping a > > fixed 8-byte frame size > > Cool! > > > > - addition of the MEMOIZE opcode, courtesy of Alexandre, which replaces > > PUT opcodes in protocol 4 and helps shrink the size of pickles > > Long overdue - clever idea! > > > > If there's no further opposition, I'd like to mark this PEP accepted > > (or let someone else do it) in 24 hours, so that the implementation can > > be integrated before Sunday. > > I think Guido already spoke on this - but, if he didn't, I will. Accepted :-) Thank you! I have marked it accepted then. (with a final nit - EMPTY_FROZENSET isn't necessary and is gone) Regards Antoine. 
From solipsis at pitrou.net Thu Nov 21 01:55:03 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Nov 2013 01:55:03 +0100 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval References: <20131119220400.695a8bcd@fsol> <20131121015159.3a534c9b@fsol> Message-ID: <20131121015503.548bd851@fsol> On Thu, 21 Nov 2013 01:51:59 +0100 Antoine Pitrou wrote: > On Wed, 20 Nov 2013 14:43:26 -0800 > Chris Barker wrote: > > > > By the way, for us dinosaurs is this going to exactly match the > > pathlib implementation that can be used with py2? > > pathlib up to 0.8 (on PyPI) has a different API - since there were so > many changes done as part of the release process. Make that "PEP process", sorry. Regards Antoine. From guido at python.org Thu Nov 21 02:02:08 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Nov 2013 17:02:08 -0800 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <20131121015440.19887df2@fsol> References: <20131116191526.16f9b79c@fsol> <20131121013543.6e93ce37@fsol> <20131121015440.19887df2@fsol> Message-ID: Yup. Agreed. Ship it! On Wed, Nov 20, 2013 at 4:54 PM, Antoine Pitrou wrote: > On Wed, 20 Nov 2013 18:45:53 -0600 > Tim Peters wrote: > > [Antoine] > > > I have made two last-minute changes to the PEP: > > > > > > - addition of the FRAME opcode, as discussed with Tim, and keeping a > > > fixed 8-byte frame size > > > > Cool! > > > > > > > - addition of the MEMOIZE opcode, courtesy of Alexandre, which replaces > > > PUT opcodes in protocol 4 and helps shrink the size of pickles > > > > Long overdue - clever idea! > > > > > > > If there's no further opposition, I'd like to mark this PEP accepted > > > (or let someone else do it) in 24 hours, so that the implementation can > > > be integrated before Sunday. > > > > I think Guido already spoke on this - but, if he didn't, I will. > Accepted :-) > > Thank you! I have marked it accepted then. > (with a final nit - EMPTY_FROZENSET isn't necessary and is gone) > > Regards > > Antoine. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Thu Nov 21 02:23:18 2013 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 20 Nov 2013 19:23:18 -0600 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <20131119222814.45cacd30@fsol> <20131119225414.0604316e@fsol> <20131119230940.35e0c912@fsol> Message-ID: [Alexandre Vassalotti] > Looking at the different options available to us: > > 1A. Mandatory framing > (+) Allows the internal buffering layer of the Unpickler to rely > on the presence of framing to simplify its implementation. > (-) Forces all implementations of pickle to include support for > framing if they want to use the new protocol. > (-) Cannot be removed from future versions of the Unpickler > without breaking protocols which mandate framing. > 1B. Optional framing > (+) Could allow optimizations to disable framing if beneficial > (e.g., when pickling to and unpickling from a string). 
Or to slash the size of small pickles (an 8-byte size field can be more than half the total pickle size). > 2A. With explicit FRAME opcode > (+) Makes optional framing simpler to implement. > (+) Makes variable-length encoding of the frame size simpler > to implement. > (+) Makes framing visible to pickletools. (+) Adds (a little) redundancy for sanity checking. > (-) Adds an extra byte of overhead to each frame. > 2B. No opcode > > 3A. With fixed 8-byte headers > (+) Is simple to implement > (-) Adds overhead to small pickles. > 3B. With variable-length headers > (-) Requires Pickler implementation to do extra data copies when > pickling to strings. > > 4A. Framing baked into the pickle protocol > (+) Enables faster implementations > 4B. Framing through a specialized I/O buffering layer > (+) Could be reused by other modules > > I may change my mind as I work on the implementation, but at least for now, > I think the combination of 1B, 2A, 3A, 4A will be a reasonable compromise > here. At this time I'd make the same choices, so don't expect an argument from me ;-) Thank you! From chris.barker at noaa.gov Thu Nov 21 03:13:25 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 20 Nov 2013 18:13:25 -0800 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: <20131121015159.3a534c9b@fsol> References: <20131119220400.695a8bcd@fsol> <20131121015159.3a534c9b@fsol> Message-ID: <-8336249909486859261@unknownmsgid> On Nov 20, 2013, at 4:53 PM, Antoine Pitrou wrote: > > When pathlib-in-the-stdlib stabilizes, I plan to release a pathlib 1.0 > on PyPI that will integrate the PEP's API. Great, thanks! Chris > In the meantime, if you don't mind installing from VCS, you can clone the > Mercurial repo (https://bitbucket.org/pitrou/pathlib/) and then > check out the "pep428" branch. It's 2.7-compatible. > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/chris.barker%40noaa.gov From python at mrabarnett.plus.com Thu Nov 21 03:35:17 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 21 Nov 2013 02:35:17 +0000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528D478E.1040105@stackless.com> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> Message-ID: <528D7165.10606@mrabarnett.plus.com> On 20/11/2013 23:36, Christian Tismer wrote: > Hey Barry, > > On 20.11.13 23:30, Barry Warsaw wrote: >> On Nov 20, 2013, at 09:52 PM, Christian Tismer wrote: >> >>> Many customers are forced to stick with Python 2.X because of other products, >>> but they require a Python 2.X version which can be compiled using Visual >>> Studio 2010 or better. This is considered an improvement and not a bug fix, >>> where I disagree. >> I'm not so sure about that. Python 2.7 can still get patches to help extend >> its useful life by allowing it to be built with newer compiler suites. I >> believe this has already been done for various Linux compilers. I see no >> non-technical reason why Python 2.7 can't be taught how to build with VS 2010
> > >>> Path.cwd() > PosixPath('/home/antoine/pathlib') > > Probably this is just personal taste but I'd prefer the more explicit: > > >>> Path(os.getcwd()) > PosixPath('/home/antoine/pathlib') > > I understand all the os.* replication (Path.rename, Path.stat etc.) but > all these methods assume you're dealing with an instantiated Path instance > whereas Path.cwd is the only one which doesn't. > That's a standard pattern for using a classmethod as an alternate constructor -- similar to: datetime.now() dict.fromkeys() etc.... and it can really reduce namespace clutter (do you really want to import both os and path for this? (OK, you way well be importing os anyway, but...) It also allows you to get that functionality in a subclass. Anyway, it's not that different, and I like it ;-) Other than that the module looks absolutely awesome and a big improvement > over os.path! > indeed -- thanks for moving this forward, it is a long time coming! By the way, for us dinosaurs is this going to exactly match the pathlib implementation that can be used with py2? -Chris > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Nov 21 01:51:59 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Nov 2013 01:51:59 +0100 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval References: <20131119220400.695a8bcd@fsol> Message-ID: <20131121015159.3a534c9b@fsol> On Wed, 20 Nov 2013 14:43:26 -0800 Chris Barker wrote: > > By the way, for us dinosaurs is this going to exactly match the > pathlib implementation that can be used with py2? pathlib up to 0.8 (on PyPI) has a different API - since there were so many changes done as part of the release process. When pathlib-in-the-stdlib stabilizes, I plan to release a pathlib 1.0 on PyPI that will integrate the PEP's API. In the meantime, if you don't mind installing from VCS, you clone the Mercurial repo (https://bitbucket.org/pitrou/pathlib/) and then checkout branch "pep428". It's 2.7-compatible. Regards Antoine. From solipsis at pitrou.net Thu Nov 21 01:54:40 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Nov 2013 01:54:40 +0100 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131121013543.6e93ce37@fsol> Message-ID: <20131121015440.19887df2@fsol> On Wed, 20 Nov 2013 18:45:53 -0600 Tim Peters wrote: > [Antoine] > > I have made two last-minute changes to the PEP: > > > > - addition of the FRAME opcode, as discussed with Tim, and keeping a > > fixed 8-byte frame size > > Cool! > > > > - addition of the MEMOIZE opcode, courtesy of Alexandre, which replaces > > PUT opcodes in protocol 4 and helps shrink the size of pickles > > Long overdue - clever idea! > > > > If there's no further opposition, I'd like to mark this PEP accepted > > (or let someone else do it) in 24 hours, so that the implementation can > > be integrated before Sunday. > > I think Guido already spoke on this - but, if he didn't, I will. Accepted :-) Thank you! I have marked it accepted then. (with a final nit - EMPTY_FROZENSET isn't necessary and is gone) Regards Antoine. 
From solipsis at pitrou.net Thu Nov 21 01:55:03 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Nov 2013 01:55:03 +0100 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval References: <20131119220400.695a8bcd@fsol> <20131121015159.3a534c9b@fsol> Message-ID: <20131121015503.548bd851@fsol> On Thu, 21 Nov 2013 01:51:59 +0100 Antoine Pitrou wrote: > On Wed, 20 Nov 2013 14:43:26 -0800 > Chris Barker wrote: > > > > By the way, for us dinosaurs is this going to exactly match the > > pathlib implementation that can be used with py2? > > pathlib up to 0.8 (on PyPI) has a different API - since there were so > many changes done as part of the release process. Make that "PEP process", sorry. Regards Antoine. From guido at python.org Thu Nov 21 02:02:08 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 20 Nov 2013 17:02:08 -0800 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: <20131121015440.19887df2@fsol> References: <20131116191526.16f9b79c@fsol> <20131121013543.6e93ce37@fsol> <20131121015440.19887df2@fsol> Message-ID: Yup. Agreed. Ship it! On Wed, Nov 20, 2013 at 4:54 PM, Antoine Pitrou wrote: > On Wed, 20 Nov 2013 18:45:53 -0600 > Tim Peters wrote: > > [Antoine] > > > I have made two last-minute changes to the PEP: > > > > > > - addition of the FRAME opcode, as discussed with Tim, and keeping a > > > fixed 8-byte frame size > > > > Cool! > > > > > > > - addition of the MEMOIZE opcode, courtesy of Alexandre, which replaces > > > PUT opcodes in protocol 4 and helps shrink the size of pickles > > > > Long overdue - clever idea! > > > > > > > If there's no further opposition, I'd like to mark this PEP accepted > > > (or let someone else do it) in 24 hours, so that the implementation can > > > be integrated before Sunday. > > > > I think Guido already spoke on this - but, if he didn't, I will. > Accepted :-) > > Thank you! I have marked it accepted then. > (with a final nit - EMPTY_FROZENSET isn't necessary and is gone) > > Regards > > Antoine. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Thu Nov 21 02:23:18 2013 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 20 Nov 2013 19:23:18 -0600 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> <20131118233134.02bb4edb@fsol> <20131119005708.1c638214@fsol> <20131119195110.3904ba25@fsol> <20131119195735.036cb16c@fsol> <20131119205920.26b9eb90@fsol> <20131119222814.45cacd30@fsol> <20131119225414.0604316e@fsol> <20131119230940.35e0c912@fsol> Message-ID: [Alexandre Vassalotti] > Looking at the different options available to us: > > 1A. Mandatory framing > (+) Allows the internal buffering layer of the Unpickler to rely > on the presence of framing to simplify its implementation. > (-) Forces all implementations of pickle to include support for > framing if they want to use the new protocol. > (-) Cannot be removed from future versions of the Unpickler > without breaking protocols which mandate framing. > 1B. Optional framing > (+) Could allow optimizations to disable framing if beneficial > (e.g., when pickling to and unpickling from a string).
Or to slash the size of small pickles (an 8-byte size field can be more than half the total pickle size). > 2A. With explicit FRAME opcode > (+) Makes optional framing simpler to implement. > (+) Makes variable-length encoding of the frame size simpler > to implement. > (+) Makes framing visible to pickletools. (+) Adds (a little) redundancy for sanity checking. > (-) Adds an extra byte of overhead to each frame. > 2B. No opcode > > 3A. With fixed 8-bytes headers > (+) Is simple to implement > (-) Adds overhead to small pickles. > 3B. With variable-length headers > (-) Requires Pickler implementation to do extra data copies when > pickling to strings. > > 4A. Framing baked-in the pickle protocol > (+) Enables faster implementations > 4B. Framing through a specialized I/O buffering layer > (+) Could be reused by other modules > > I may change my mind as I work on the implementation, but at least for now, > I think the combination of 1B, 2A, 3A, 4A will be a reasonable compromise here. At this time I'd make the same choices, so don't expect an argument from me ;-) Thank you! From chris.barker at noaa.gov Thu Nov 21 03:13:25 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 20 Nov 2013 18:13:25 -0800 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: <20131121015159.3a534c9b@fsol> References: <20131119220400.695a8bcd@fsol> <20131121015159.3a534c9b@fsol> Message-ID: <-8336249909486859261@unknownmsgid> On Nov 20, 2013, at 4:53 PM, Antoine Pitrou wrote: > > When pathlib-in-the-stdlib stabilizes, I plan to release a pathlib 1.0 > on PyPI that will integrate the PEP's API. Great, thanks! Chris > In the meantime, if you don't mind installing from VCS, you can clone the > Mercurial repo (https://bitbucket.org/pitrou/pathlib/) and then > checkout branch "pep428". It's 2.7-compatible. > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/chris.barker%40noaa.gov From python at mrabarnett.plus.com Thu Nov 21 03:35:17 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 21 Nov 2013 02:35:17 +0000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528D478E.1040105@stackless.com> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> Message-ID: <528D7165.10606@mrabarnett.plus.com> On 20/11/2013 23:36, Christian Tismer wrote: > Hey Barry, > > On 20.11.13 23:30, Barry Warsaw wrote: >> On Nov 20, 2013, at 09:52 PM, Christian Tismer wrote: >> >>> Many customers are forced to stick with Python 2.X because of other products, >>> but they require a Python 2.X version which can be compiled using Visual >>> Studio 2010 or better. This is considered an improvement and not a bug fix, >>> where I disagree. >> I'm not so sure about that. Python 2.7 can still get patches to help extend >> its useful life by allowing it to be built with newer compiler suites. I >> believe this has already been done for various Linux compilers. I see no >> non-technical reason why Python 2.7 can't be taught how to build with VS 2010 >> or newer. Details are subject to RM approval, IMHO.
>> >>> I have created a very clean Python 2.7.6+ based CPython with the Stackless >>> additions, that compiles with VS 2010, using the adapted project structure >>> of Python 3.3.X, and I want to publish that on the Stackless website as the >>> official "Stackless Python 2.8". If you consider Stackless as official ;-) . >>> >>> This compiler change is currently the only deviation from CPython 2.7, >>> but we may support a few easy back-ports on customer demand. We don't add >>> any random stuff, of course. >> I think you're just going to confuse everyone if you call it "Stackless Python >> 2.8" and it will do more harm than good. > Barry, that's a good thing! This way I have a chance to get my build in > at all. > And that's the question, after re-thinking: > > Where can I check my change in, if it is going to be accepted as a valid > 2.7 bug fix (concerning VS 2008 as a bug, that is)? > > I had been intending to do this for a year but was stopped by MVL's influence. > Maybe this needs to be re-thought, since I think differently. > > What I do not know: > Should I simply check this in to a python 2.7.x-VS2010 branch? > Or what do you suggest, then? > > In any case, my question still stands, and I will do something with the > Stackless branch by end of November. Please influence me ASAP, I don't > intend to do any harm, but that problem is caused simply by my existence, > and I want to stick with that for another few decades. > > If I think something must be done, then I have my way to do it. > If you have good reasons to prevent this, then you should speak up in the > next 10 days, or it will happen. Is that ok with you? > > Hugs -- your friend Chris > You could call it "Spython" instead! From brian at python.org Thu Nov 21 04:37:39 2013 From: brian at python.org (Brian Curtin) Date: Wed, 20 Nov 2013 21:37:39 -0600 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528D478E.1040105@stackless.com> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> Message-ID: On Wed, Nov 20, 2013 at 5:36 PM, Christian Tismer wrote: > Hey Barry, > > > On 20.11.13 23:30, Barry Warsaw wrote: >> >> On Nov 20, 2013, at 09:52 PM, Christian Tismer wrote: >> >>> Many customers are forced to stick with Python 2.X because of other >>> products, >>> but they require a Python 2.X version which can be compiled using Visual >>> Studio 2010 or better. This is considered an improvement and not a bug >>> fix, >>> where I disagree. >> >> I'm not so sure about that. Python 2.7 can still get patches to help >> extend >> its useful life by allowing it to be built with newer compiler suites. I >> believe this has already been done for various Linux compilers. I see no >> non-technical reason why Python 2.7 can't be taught how to build with VS >> 2010 >> or newer. Details are subject to RM approval, IMHO. >> >>> I have created a very clean Python 2.7.6+ based CPython with the >>> Stackless >>> additions, that compiles with VS 2010, using the adapted project >>> structure >>> of Python 3.3.X, and I want to publish that on the Stackless website as >>> the >>> official "Stackless Python 2.8". If you consider Stackless as official >>> ;-) . >>> >>> This compiler change is currently the only deviation from CPython 2.7, >>> but we may support a few easy back-ports on customer demand. We don't add >>> any random stuff, of course. >> >> I think you're just going to confuse everyone if you call it "Stackless >> Python >> 2.8" and it will do more harm than good.
> > Barry, that's a good thing! This way I have a chance to get my build in at > all. > And that's the question, after re-thinking: > > Where can I check my change in, if it is going to be accepted as a valid > 2.7 bug fix (concerning VS 2008 as a bug, that is)? If you do end up checking something in, I think it should be a backport of the 3.x VS2010 work, rather than contributing your own patch starting from 2.7. Otherwise any differences in the way you did things could cause pain while merging changes between the branches. From ncoghlan at gmail.com Thu Nov 21 05:09:38 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 21 Nov 2013 14:09:38 +1000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131121013120.4c827da8@fsol> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121013120.4c827da8@fsol> Message-ID: On 21 Nov 2013 10:33, "Antoine Pitrou" wrote: > > On Wed, 20 Nov 2013 17:30:44 -0500 > Barry Warsaw wrote: > > On Nov 20, 2013, at 09:52 PM, Christian Tismer wrote: > > > > >Many customers are forced to stick with Python 2.X because of other products, > > >but they require a Python 2.X version which can be compiled using Visual > > >Studio 2010 or better. This is considered an improvement and not a bug fix, > > >where I disagree. > > > > I'm not so sure about that. Python 2.7 can still get patches to help extend > > its useful life by allowing it to be built with newer compiler suites. I > > believe this has already been done for various Linux compilers. I see no > > non-technical reason why Python 2.7 can't be taught how to build with VS 2010 > > or newer. Details are subject to RM approval, IMHO. > > I think it isn't only about teaching it to build with VS 2010, but > providing binaries compatible with the VS 2010 runtime. > Otherwise, AFAIU, if extensions are built with VS 2010 but loaded with > a VS 2008-compiled Python, there will be ABI problems. Right, this is the problem - building with newer compilers isn't an issue, changing the Windows ABI *is* (which is the reason Christian is proposing a version bump to denote the ABI incompatibility). If Visual Studio Express 2010 and later can be told to build against the VS2008 C runtime, then that is something that can (and probably should) be enabled in the next CPython 2.7 maintenance release (for both the main build process and for distutils extension building). Doing a 2.8 release *just* to change the Windows ABI to a new version of the MS C runtime, on the other hand, seems impossible to do without thoroughly confusing people (regardless of whether it's CPython or Stackless making such a change). I'd certainly want to explore all our alternatives with the Microsoft folks for getting our preferred option (new compiler, old C runtime) working in an open source friendly way before we went down the path of a 2.x ABI bump. I've cc'ed Steve Dower directly, as he's the best person I know to ask about this kind of problem. Cheers, Nick. > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ncoghlan at gmail.com Thu Nov 21 05:14:55 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 21 Nov 2013 14:14:55 +1000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121013120.4c827da8@fsol> Message-ID: Another alternative I'd prefer to an ABI version bump: backporting the "C runtime independence" aspects of the stable ABI to Python 2.7. There are only a relatively small number of APIs that lead to the requirement for consistent C runtimes, so allowing those to be excluded at compile time would make it possible to ensure extensions can be built safely under VS2010 with the newer C runtime. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kristjan at ccpgames.com Thu Nov 21 10:19:27 2013 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=) Date: Thu, 21 Nov 2013 09:19:27 +0000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528D478E.1040105@stackless.com> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> Message-ID: > -----Original Message----- > From: Python-Dev [mailto:python-dev- > bounces+kristjan=ccpgames.com at python.org] On Behalf Of Christian Tismer > Sent: 20. n?vember 2013 23:37 > To: Barry Warsaw; python-dev at python.org > Subject: Re: [Python-Dev] PEP 0404 and VS 2010 > > Hey Barry, > > In any case, my question still stands, and I will do something with the > Stackless branch by end of November. Please influence me ASAP, I don't > intend to do any harm, but that problem is caused simply by my existence, > and I want to stick with that for another few decades. > > If I think something must be done, then I have my way to do it. > If you have good reasons to prevent this, then you should speak up in the > next 10 days, or it will happen. Is that ok with you? > > Hugs -- your friend Chris I'd like to add here that at PyCon 2011 (if memory serves me right) I got a verbal agreement from many of you that there would be no objection to me creating an _unofficial_ "2.8" fork of python. It could even be hosted on hg.python.org. I forget if we decided on a name for it, but I remember advocating it as "Python classic." For reasons of work and others, I never got round to creating that branch but recently Stackless development has picked up the pace to the point where we feel it necessary to break with strict 2.7 conformance. Cheers, Kristj?n From dimaqq at gmail.com Thu Nov 21 11:53:09 2013 From: dimaqq at gmail.com (Dima Tisnek) Date: Thu, 21 Nov 2013 11:53:09 +0100 Subject: [Python-Dev] custom thread scheduler to test user synchronisation code Message-ID: Hi, First, in case such a project already exists, could someone point me towards this? It occurs to me that in some cases it may be possible to run an exhaustive test on user-implemented synchronisation code. let's say, in a simple case, we've got 2 threads, some user code using threading.* primitives to synchronise these threads and a unit test that ensures the logical result of synchronisation is sane. now, cpython relies on OS to schedule these 2 threads, however, if this scheduler was under my control, I could force this test case into many different, distinct thread interleaves. Granted, the total number of interleaves is exponential and even unbounded in presence of loops. Yet for small user code and unit test, this could even be exhaustive. In a brute-force case, a new interleave should be branched off after every bytecode evaluation (and in case of extensions GIL acquisition/release). A smarter tester could possibly track primitive and data accesses by threads and only create an interleave branch when there's potential of something being used by both threads. How would I go about doing this? That is, how would I go about implementing my own thread scheduler? Thanks, d.
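One crude way to explore different interleavings today, without a custom scheduler, is to re-run the test under many different thread switch intervals. sys.setswitchinterval() is a real CPython 3.2+ API; the racy counter below is a made-up example, not code from this thread, and the exploration is probabilistic rather than exhaustive:

    import random
    import sys
    import threading

    def stress(test, runs=200):
        # Varying the switch interval moves where thread switches land.
        old = sys.getswitchinterval()
        try:
            for _ in range(runs):
                sys.setswitchinterval(random.choice([1e-6, 1e-5, 1e-4]))
                test()
        finally:
            sys.setswitchinterval(old)

    def unsynchronised_test():
        counter = [0]
        def bump():
            for _ in range(1000):
                counter[0] += 1   # read/add/store: a switch can land in between
        threads = [threading.Thread(target=bump) for _ in range(2)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        assert counter[0] == 2000   # can fail under an unlucky interleaving

    stress(unsynchronised_test)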
From martin at v.loewis.de Thu Nov 21 12:15:36 2013 From: martin at v.loewis.de (martin at v.loewis.de) Date: Thu, 21 Nov 2013 12:15:36 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131120173044.77f6fcef@anarchist> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> Message-ID: <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> Quoting Barry Warsaw : > On Nov 20, 2013, at 09:52 PM, Christian Tismer wrote: > >> Many customers are forced to stick with Python 2.X because of other >> products, >> but they require a Python 2.X version which can be compiled using Visual >> Studio 2010 or better. This is considered an improvement and not a bug fix, >> where I disagree. > > I'm not so sure about that. Python 2.7 can still get patches to help extend > its useful life by allowing it to be built with newer compiler suites. I > believe this has already been done for various Linux compilers. I see no > non-technical reason why Python 2.7 can't be taught how to build with VS 2010 > or newer. Unfortunately, there are severe *technical* reasons that make this a challenge. If they didn't exist, we would have done this long ago. The problem is in the ABI, and the question of which msvcrt python27.dll links with. With official support for VS 2010, there would be two ABI-incompatible releases of Python on Windows; entirely different from supporting a new version of gcc (say), as newer gccs will typically be binary-compatible with older releases (I admit I don't understand the ABI compatibility on OSX). So adding VS 2010 build files to the 2.7 tree would be fine with me, but I doubt users are helped with that - they want a binary release that uses them. Releasing this as "Python 2.7", with a "python27.dll" would lead right to DLL hell. I could imagine having a separate release that is called "Python 2.7 for Visual Studio 2010" (and then another one called "Python 2.7 for Visual Studio 2013"). They could give different names to the Python DLL, e.g. py27cr100.dll and py27cr110.dll. Whether this would be a good idea or not, I don't know. It would create separate ecosystems for different releases of Python 2.7 for different CRTs. Package authors would have to create multiple binary releases of the same modules for Windows, and upload them to PyPI. pip would have to learn to download the right one, depending on what build of Python 2.7 is running. I'm skeptical that using "Python 2.8" as the release name would help in the long run. Assuming this is done for VS 2010, then "Python 2.9" will have to be released for binary compatibility with VS 2013, and "Python 2.10" for VS 2016, assuming Python 2.7 is then still around by that time.
Regards, Martin From martin at v.loewis.de Thu Nov 21 12:31:30 2013 From: martin at v.loewis.de (martin at v.loewis.de) Date: Thu, 21 Nov 2013 12:31:30 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121013120.4c827da8@fsol> Message-ID: <20131121123130.Horde.x2onCIlm_fzNfKIe79EgrA3@webmail.df.eu> Quoting Nick Coghlan : > Another alternative I'd prefer to an ABI version bump: backporting the "C > runtime independence" aspects of the stable ABI to Python 2.7. That sounds doable. If we provided a "python2.dll", we could make the header files using the "restricted API" by default if Python is compiled with VS 2010. Extension builders could then regularly compile their extensions with VS 2010, or VS 2013, despite Python itself being linked with the VS 2008 CRT. Anybody releasing such binaries would either have to target Python 2.7.7, or distribute "python2.dll" along with the binary. It should actually be possible to provide a "python27vs2010.msi" upgrade package that only deploys the python2.dll and the updated header files, on top of an existing Python 2.7 installation. I had originally planned to support the "stable ABI" in Python 2, but the PEP lagged, and I couldn't finish it in time for the 2.7 release. If Chris could contribute to make this happen, it would be much appreciated. Regards, Martin P.S. Thinking about this, there are some issues. The "restricted API" hides the object layout of all objects, in particular of type objects. Adding the PEP 384 API (PyType_FromSpec) might be a bit heavy for 2.7. So it might be better to provide a "py27compat.dll" instead which does not hide the structures (as they won't change during the remaining life of 2.7), but only hides any APIs and macros that directly expose CRT functionality. From p.f.moore at gmail.com Thu Nov 21 12:39:21 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 21 Nov 2013 11:39:21 +0000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> Message-ID: On 21 November 2013 11:15, wrote: > Whether this would be a good idea or not, I don't know. It would create > separate ecosystems for different releases of Python 2.7 for different > CRTs. Package authors would have to create multiple binary releases of > the same modules for Windows, and upload them to PyPI. pip would have > to learn to download the right one, depending on what build of Python > 2.7 is running. It would also mean that the wheel compatibility tags for Windows would no longer work, as currently the code makes no provision for multiple ABIs on Windows within a single Python minor version. So it would not be possible to reliably release binary wheels which could be verified as compatible with the Python version you're installing them to (a key benefit of wheels over earlier binary formats, which did not consider ABI compatibility at all). Paul
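To make the point concrete: on Windows, the platform portion of a PEP 425 wheel tag is derived from distutils.util.get_platform(), which encodes only the architecture. A quick sketch of the status quo (the output shown is for a 64-bit build; nothing in it distinguishes a VS 2008 build from a hypothetical VS 2010 build):

    >>> import distutils.util
    >>> distutils.util.get_platform()
    'win-amd64'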
From solipsis at pitrou.net Thu Nov 21 13:06:01 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Nov 2013 13:06:01 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> Message-ID: <20131121130601.42c09c1a@fsol> On Thu, 21 Nov 2013 09:19:27 +0000 Kristj?n Valur J?nsson wrote: > > For reasons of work and others, I never got round to creating that branch but > recently Stackless development has picked up the pace to the point where we > feel it necessary to break with strict 2.7 conformance. Why is that? Stackless can't switch to 3.x? Regards Antoine. From ncoghlan at gmail.com Thu Nov 21 13:45:37 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 21 Nov 2013 22:45:37 +1000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131121123130.Horde.x2onCIlm_fzNfKIe79EgrA3@webmail.df.eu> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121013120.4c827da8@fsol> <20131121123130.Horde.x2onCIlm_fzNfKIe79EgrA3@webmail.df.eu> Message-ID: On 21 November 2013 21:31, wrote: > > Quoting Nick Coghlan : > >> Another alternative I'd prefer to an ABI version bump: backporting the "C >> runtime independence" aspects of the stable ABI to Python 2.7. > > P.S. Thinking about this, there are some issues. The "restricted API" > hides the object layout of all objects, in particular of type objects. > Adding the PEP 384 API (PyType_FromSpec) might be a bit heavy for 2.7. > > So it might be better to provide a "py27compat.dll" instead which does > not hide the structures (as they won't change during the remaining life > of 2.7), but only hides any APIs and macros that directly expose CRT > functionality. Yep, that's what I meant by backporting just the "C runtime independence" aspects - there's no reason to backport the version independence features, since we're not planning to do another Python 2.x ABI bump anyway. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From martin at v.loewis.de Thu Nov 21 13:50:25 2013 From: martin at v.loewis.de (martin at v.loewis.de) Date: Thu, 21 Nov 2013 13:50:25 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528D20FA.8040902@stackless.com> References: <528D20FA.8040902@stackless.com> Message-ID: <20131121135025.Horde.9KuHayzZSBkhdU0yz1oRyQ1@webmail.df.eu> Quoting Christian Tismer : > Can I rely on PEP 404 that the "Python 2.8" namespace never will clash > with CPython? This question still hasn't been answered (AFAICT). So let me try to answer it, and apologies upfront for being picky. First, I don't understand the question: What is the "Python 2.8" *namespace*? Let me try to rephrase the question: # May I rely on PEP 404 that CPython will never make a Python 2.8 release? To that, the clear answer is YES. A natural follow-up question then is # May I use the name "Python 2.8" for my own products? To that, the answer is "ONLY IF YOU GET PERMISSION". This is really off-topic for python-dev, but you would have to ask the PSF trademark committee for permission to use that name as a product name. IIUC, the current policy is that you may use "Python" if it a) is related to the Python programming language, and b) is somehow qualified (e.g. "Stackless Python"). To use just "Python", you need permission (and I suspect that this might not be granted).
HTH, Martin From christian at python.org Thu Nov 21 13:51:45 2013 From: christian at python.org (Christian Heimes) Date: Thu, 21 Nov 2013 13:51:45 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131121123130.Horde.x2onCIlm_fzNfKIe79EgrA3@webmail.df.eu> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121013120.4c827da8@fsol> <20131121123130.Horde.x2onCIlm_fzNfKIe79EgrA3@webmail.df.eu> Message-ID: <528E01E1.1070305@python.org> Am 21.11.2013 12:31, schrieb martin at v.loewis.de: > That sounds doable. If we provided a "python2.dll", we could make the > header files using the "restricted API" by default if Python is compiled > with VS 2010. Extension builders could then regularly compile their > extensions with VS 2010, or VS 2013, despite Python itself being linked > with the VS 2008 CRT. What about the C API functions like PyFile_FromFile() and PyFile_AsFile() that take a FILE* as argument? Or functions like PyErr_SetFromErrno() that use the errno thread local variable? These functions will either crash or not work properly with mixed CRTs. Every CRT has its own errno TLS, so Python won't see the errno of a CRT100 function like open(2), see http://bugs.python.org/issue15883 . I don't understand how a stable Python ABI solves the issue of unstable CRT ABIs. Are you planning to address these issues? Christian From kristjan at ccpgames.com Thu Nov 21 13:36:48 2013 From: kristjan at ccpgames.com (=?utf-8?B?S3Jpc3Rqw6FuIFZhbHVyIErDs25zc29u?=) Date: Thu, 21 Nov 2013 12:36:48 +0000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131121130601.42c09c1a@fsol> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> Message-ID: > -----Original Message----- > From: Python-Dev [mailto:python-dev- > bounces+kristjan=ccpgames.com at python.org] On Behalf Of Antoine Pitrou > Sent: 21. n?vember 2013 12:06 > To: python-dev at python.org > Subject: Re: [Python-Dev] PEP 0404 and VS 2010 > > On Thu, 21 Nov 2013 09:19:27 +0000 > Kristj?n Valur J?nsson wrote: > > > > For reasons of work and others, I never got round to creating that > > branch but recently Stackless development has picked up the pace to > > the point where we feel it necessary to break with strict 2.7 conformance. > > Why is that? Stackless can't switch to 3.x? > Yes, we have stackless 3.3 But there is desire to have a 2.X version, with added fixes from 3.x, e.g. certain improvements in the standard library etc. It's the old argument: moving to 3.x is not an option for some users, but there are known improvements that can be applied to current 2.7. Why not both have our cake and eat it? cPython had probably two driving concerns for not making a 2.8: 1) Focussing development on one branch 2) encouraging (forcing) users to take the leap to 3 come hell or high water. For Stackless, neither argument applies because 2.8 work would be done by us and stackless has no particular allegiance towards either version.
K From solipsis at pitrou.net Thu Nov 21 13:59:41 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Nov 2013 13:59:41 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> Message-ID: <20131121135941.7c5b5015@fsol> On Thu, 21 Nov 2013 12:36:48 +0000 Kristj?n Valur J?nsson wrote: > Yes, we have stackless 3.3 > But there is desire to have a 2.X version, with added fixes from 3.x, e.g. certain improvements in the > standard library etc. > It's the old argument: moving to 3.x is not an option for some users, but there are known improvements that > can be applied to current 2.7. Why not both have our cake and eat it? > cPython had probably two driving concerns for not making a 2.8: > 1) Focussing development on one branch > 2) encouraging (forcing) users to take the leap to 3 come hell or high water. > > For Stackless, neither argument applies because 2.8 work would be done > by us and stackless has no particular allegiance towards either version. Stackless can release their own Stackless 2.8 if they want, but I don't get why CPython would have a 2.8 too. Regards Antoine. From martin at v.loewis.de Thu Nov 21 14:03:55 2013 From: martin at v.loewis.de (martin at v.loewis.de) Date: Thu, 21 Nov 2013 14:03:55 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528E01E1.1070305@python.org> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121013120.4c827da8@fsol> <20131121123130.Horde.x2onCIlm_fzNfKIe79EgrA3@webmail.df.eu> <528E01E1.1070305@python.org> Message-ID: <20131121140355.Horde.4CTJ5uidFFdoq1B14lRkVA2@webmail.df.eu> Quoting Christian Heimes : > What about the C API functions like PyFile_FromFile() and PyFile_AsFile() > that take a FILE* as argument? They are unavailable in the stable ABI, and would be unavailable in py27compat.dll. Modules using them would have to be rewritten to not use them anymore, or continue to be compiled with VS 2008. Regards, Martin From ncoghlan at gmail.com Thu Nov 21 15:01:24 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 22 Nov 2013 00:01:24 +1000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528E01E1.1070305@python.org> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121013120.4c827da8@fsol> <20131121123130.Horde.x2onCIlm_fzNfKIe79EgrA3@webmail.df.eu> <528E01E1.1070305@python.org> Message-ID: On 21 November 2013 22:51, Christian Heimes wrote: > Am 21.11.2013 12:31, schrieb martin at v.loewis.de: >> That sounds doable. If we provided a "python2.dll", we could make the >> header files using the "restricted API" by default if Python is compiled >> with VS 2010. Extension builders could then regularly compile their >> extensions with VS 2010, or VS 2013, despite Python itself being linked >> with the VS 2008 CRT. > > What about the C API functions like PyFile_FromFile() and PyFile_AsFile() > that take a FILE* as argument? Or functions like PyErr_SetFromErrno() > that use the errno thread local variable? These functions will either > crash or not work properly with mixed CRTs. Every CRT has its own errno > TLS, so Python won't see the errno of a CRT100 function like open(2), > see http://bugs.python.org/issue15883 . > > I don't understand how a stable Python ABI solves the issue of unstable > CRT ABIs. Are you planning to address these issues?
The stable ABI excludes all the CRT dependent APIs - that is exactly the subset of the stable ABI restrictions that I am suggesting we should backport to 2.7.7 rather than Stackless making a 2.8 release just to update to a backwards incompatible C runtime on Windows. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From kristjan at ccpgames.com Thu Nov 21 15:16:12 2013 From: kristjan at ccpgames.com (=?utf-8?B?S3Jpc3Rqw6FuIFZhbHVyIErDs25zc29u?=) Date: Thu, 21 Nov 2013 14:16:12 +0000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131121135941.7c5b5015@fsol> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> <20131121135941.7c5b5015@fsol> Message-ID: > > For Stackless, neither argument applies because 2.8 work would be done > > by us and stackless has no particular allegiance towards either version. > > Stackless can release their own Stackless 2.8 if they want, but I don't get why > CPython would have a 2.8 too. Oh, this is the misunderstanding. No one is trying to get permission for "CPython 2.8", only "Stackless Python 2.8". The "namespace" question from Christian has to do with a "python28.dll" which would be built using VS2010, and with ensuring that this would never clash with a CPython version built the same way. Such clashes would be very unfortunate. Of course, we could even make a full break, if there will never be a CPython 2.8 (which there won't be) and call the dll slpython28.dll. Cheers, K From ncoghlan at gmail.com Thu Nov 21 15:53:07 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 22 Nov 2013 00:53:07 +1000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> <20131121135941.7c5b5015@fsol> Message-ID: On 22 November 2013 00:16, Kristj?n Valur J?nsson wrote: > >> > For Stackless, neither argument applies because 2.8 work would be done >> > by us and stackless has no particular allegiance towards either version. >> >> Stackless can release their own Stackless 2.8 if they want, but I don't get why >> CPython would have a 2.8 too. > > Oh, this is the misunderstanding. No one is trying to get permission for "CPython 2.8", > only "Stackless Python 2.8". > > The "namespace" question from Christian has to do with a "python28.dll" which would be > built using VS2010, and with ensuring that this would never clash with a CPython version built the > same way. Such clashes would be very unfortunate. > > Of course, we could even make a full break, if there will never be a CPython 2.8 (which there won't be) > and call the dll slpython28.dll. OK, rereading Christian's original post, I think we may need to understand the problem you are trying to solve a little better. If I have understood correctly, 1. Currently CPython 2.7 only officially supports building with Visual Studio 2008. This is due to the ABI instability of the Windows CRT and the fact that elements of the CRT ABI are republished as part of the Python ABI. 2. (CCP?) customers are wanting to build CPython *from source*, and don't care about compatibility with binary extensions built by others in the normal way against the regular CPython release. If they need extensions, they will build them from source themselves, thus avoiding any CRT ABI incompatibility issues.
If this interpretation is correct, then Martin's suggestion of a different DLL name entirely *should work* for your use case. Something like: slpy27cr100.dll ("Stackless Python 2.7 for Visual Studio 2010") slpy27cr110.dll ("Stackless Python 2.7 for Visual Studio 2013") If there's no need for a compatible ecosystem of C extensions (e.g. it's being embedded as part of a large application which takes care of linking everything properly), then the problem becomes much simpler, and there's a much higher chance it can be handled upstream in CPython itself. The only thing that *can't* happen is to build a python27.dll that links against a newer C runtime than the Visual Studio 2008 one. I believe this approach would be substantially less confusing for the broader community than trying to explain that a 2.8 release of Stackless Python was really just a version bump for the Windows CRT ABI. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From barry at python.org Thu Nov 21 16:12:51 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 21 Nov 2013 10:12:51 -0500 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> <20131121135941.7c5b5015@fsol> Message-ID: <20131121101251.796872e9@anarchist> On Nov 21, 2013, at 02:16 PM, Kristj?n Valur J?nsson wrote: >Oh, this is the misunderstanding. No one is trying to get permission for >"CPython 2.8", only "Stackless Python 2.8". I think this is a very bad idea. We've worked hard to send the message that the migration path is to Python 3 and while I understand that many people don't want to or cannot switch to Python 3, backpedaling on this message from the Python community will be bad PR and open the floodgates for pressure to continue the Python 2 lineage. Stackless can of course do whatever it wants, but I would strongly oppose the use of the word "Python" in that. I don't want to be antagonistic, but I think the PSF should also oppose allowing the use of the trademark for any association with an unofficial 2.8. -Barry From christian at python.org Thu Nov 21 16:31:13 2013 From: christian at python.org (Christian Heimes) Date: Thu, 21 Nov 2013 16:31:13 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131121101251.796872e9@anarchist> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> <20131121135941.7c5b5015@fsol> <20131121101251.796872e9@anarchist> Message-ID: Am 21.11.2013 16:12, schrieb Barry Warsaw: > On Nov 21, 2013, at 02:16 PM, Kristj?n Valur J?nsson wrote: > >> Oh, this is the misunderstanding. No one is trying to get permission for >> "CPython 2.8", only "Stackless Python 2.8". > > I think this is a very bad idea. We've worked hard to send the message that > the migration path is to Python 3 and while I understand that many people > don't want to or cannot switch to Python 3, backpedaling on this message from > the Python community will be bad PR and open the floodgates for pressure to > continue the Python 2 lineage. > > Stackless can of course do whatever it wants, but I would strongly oppose the > use of the word "Python" in that. I don't want to be antagonistic, but I > think the PSF should also oppose allowing the use of the trademark for > any association with an unofficial 2.8. Yes, please don't use a name that contains both the strings "Python" and "2.8".
It's going to create lots and lots of confusion. I strongly urge you to call it "Stackless 2.8" or something similar. Christian From brett at python.org Thu Nov 21 16:46:48 2013 From: brett at python.org (Brett Cannon) Date: Thu, 21 Nov 2013 10:46:48 -0500 Subject: [Python-Dev] PEP 428 - pathlib - ready for approval In-Reply-To: References: <20131119220400.695a8bcd@fsol> <20131120230148.7e42d073@fsol> Message-ID: On Wed, Nov 20, 2013 at 5:41 PM, Mark Lawrence wrote: > On 20/11/2013 22:01, Antoine Pitrou wrote: > >> >> pathlib imports many modules at startup, so for scripts for which >> startup time is critical using os.path may still be the best option. >> >> > Will there be or is there a note to this effect in the docs? Could be a comment in the source but it should not be in the docs. Only python-dev would care about this and actual users shouldn't over-optimize and avoid pathlib because it imports stuff. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Thu Nov 21 16:57:56 2013 From: brett at python.org (Brett Cannon) Date: Thu, 21 Nov 2013 10:57:56 -0500 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> <20131121135941.7c5b5015@fsol> <20131121101251.796872e9@anarchist> Message-ID: On Thu, Nov 21, 2013 at 10:31 AM, Christian Heimes wrote: > Am 21.11.2013 16:12, schrieb Barry Warsaw: > > On Nov 21, 2013, at 02:16 PM, Kristj?n Valur J?nsson wrote: > > > >> Oh, this is the misunderstanding. No one is trying to get permission > for > >> "CPython 2.8", only "Stackless Python 2.8". > > > > I think this is a very bad idea. We've worked hard to send the message > that > > the migration path is to Python 3 and while I understand that many people > > don't want to or cannot switch to Python 3, backpedaling on this message > from > > the Python community will be bad PR and open the floodgates for pressure > to > > continue the Python 2 lineage. > > > > Stackless can of course do whatever it wants, but I would strongly > oppose the > > use of the word "Python" in that. I don't want to be antagonistic, but I > > think the PSF should also oppose allowing the use of the trademark for > > any association with an unofficial 2.8. > > Yes, please don't use a name that contains both the strings "Python" and > "2.8". It's going to create lots of confusion. I strongly urge > you to call it "Stackless 2.8" or something similar. > Agreed. PyPy has their own version number so there is precedent of keeping a version number that is not directly tied to the version of Python that one is compatible with. If you're really worried about confusion then go the OS X route and call it "Stackless 10" and start way up the version number chain where there is no possible clash for several decades. =) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Nov 21 17:01:26 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 21 Nov 2013 08:01:26 -0800 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> Message-ID: <-4665162615556836517@unknownmsgid> with older releases (I admit I don't understand the ABI compatibility on OSX).
Well, with OS-X, it's not exactly the C lib in the same way, but there are different SDKs for different OS versions, and you can add to that PPC vs Intel processors and 32 vs 64 bit. So we have for years had two builds for OS-X, and you can add to that Macports, Homebrew, and what have you. And the idea that this isn't an issue for gcc makes no sense-- it's such a big issue for Linux, in fact, that python.org doesn't even try to build binaries for Linux, and pypi has disabled binary wheels for Linux. So adding VS 2010 build files to the 2.7 tree would be fine with me, I think this part is a no brainer. I also think having a 2.8 out there that is exactly the same as 2.7, except that it was built with a different version of a compiler on one particular platform is a very very bad idea. So how CAN this be addressed? This is really a distribution issue, and the distutils-sig has been hard at work building the infrastructure to support things like this. Right now, if you build binary wheels with the two different "official" OS-X builds, you will get two different names, and pip can find the correct one. (last I checked, pip would happily install the wrong one if you asked it to, though I think that is slated to be fixed) So if a different build of 2.7 for Windows is put out there, we need to make sure it reports the platform in a way that wheel and pip can make the distinction. It might be nice to patch the win_inst too--IIRC it's still not very smart about even 32 vs 64 bit builds. As for stackless--just to be clear--can you use extensions built with the "regular" python with a stackless binary? If so, I understand the concern. If not, then it seems stackless is a separate ecosystem anyway. Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Thu Nov 21 17:37:57 2013 From: christian at python.org (Christian Heimes) Date: Thu, 21 Nov 2013 17:37:57 +0100 Subject: [Python-Dev] flaky tests caused by repr() sort order Message-ID: Hi, the buildbots are flaky because two repr() tests for userdict and functools.partial fail every now and then. The test cases depend on a fixed order of keyword arguments in the representation of userdict and partial instances. The improved hash randomization of PEP 456 shows its power. I haven't seen the issue before because it happens rarely and mostly on 32 bit platforms. http://bugs.python.org/issue19681 http://bugs.python.org/issue19664 I'm not sure about the best approach for the issues. Either we need to change the test case and make it more resilient or the code for repr() must sort its dict keys. Christian
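The failure mode is easy to reproduce by hand: functools.partial stores its keywords in a dict, and its repr writes them out in dict iteration order, which varies with the hash seed. A minimal sketch (the keywords here are arbitrary):

    >>> import functools
    >>> repr(functools.partial(sorted, key=len, reverse=True))
    "functools.partial(<built-in function sorted>, key=<built-in function len>, reverse=True)"

Run the same line under different PYTHONHASHSEED values and key= and reverse= can trade places, which is exactly what trips a test that compares against one fixed string.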
From tim.peters at gmail.com Thu Nov 21 18:57:11 2013 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 21 Nov 2013 11:57:11 -0600 Subject: [Python-Dev] flaky tests caused by repr() sort order In-Reply-To: References: Message-ID: [Christian Heimes] > the buildbots are flaky because two repr() tests for userdict and > functools.partial fail every now and then. The test cases depend on a > fixed order of keyword arguments in the representation of userdict and > partial instances. The improved hash randomization of PEP 456 shows its > power. I haven't seen the issue before because it happens rarely and > mostly on 32 bit platforms. > > http://bugs.python.org/issue19681 > http://bugs.python.org/issue19664 > > I'm not sure about the best approach for the issues. Either we need to > change the test case and make it more resilient or the code for repr() > must sort its dict keys. Best to change the failing tests. For example, _they_ can sort the dict keys if they rely on a fixed order. Sorting in general is a dubious idea because it can be a major expense with no real benefit for most uses. From christian at python.org Thu Nov 21 19:10:29 2013 From: christian at python.org (Christian Heimes) Date: Thu, 21 Nov 2013 19:10:29 +0100 Subject: [Python-Dev] flaky tests caused by repr() sort order In-Reply-To: References: Message-ID: <528E4C95.4040000@python.org> Am 21.11.2013 18:57, schrieb Tim Peters: > Best to change the failing tests. For example, _they_ can sort the > dict keys if they rely on a fixed order. Sorting in general is a > dubious idea because it can be a major expense with no real benefit > for most uses. I don't consider repr() a performance-critical function. It's mostly used for debugging. From dholth at gmail.com Thu Nov 21 19:14:45 2013 From: dholth at gmail.com (Daniel Holth) Date: Thu, 21 Nov 2013 13:14:45 -0500 Subject: [Python-Dev] flaky tests caused by repr() sort order In-Reply-To: <528E4C95.4040000@python.org> References: <528E4C95.4040000@python.org> Message-ID: +1 on unsorted repr(). It makes it obvious that the collection is not sorted. On Thu, Nov 21, 2013 at 1:10 PM, Christian Heimes wrote: > Am 21.11.2013 18:57, schrieb Tim Peters: >> Best to change the failing tests. For example, _they_ can sort the >> dict keys if they rely on a fixed order. Sorting in general is a >> dubious idea because it can be a major expense with no real benefit >> for most uses. > > I don't consider repr() a performance-critical function. It's mostly > used for debugging. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/dholth%40gmail.com From guido at python.org Thu Nov 21 19:18:12 2013 From: guido at python.org (Guido van Rossum) Date: Thu, 21 Nov 2013 10:18:12 -0800 Subject: [Python-Dev] flaky tests caused by repr() sort order In-Reply-To: References: <528E4C95.4040000@python.org> Message-ID: Correct. On Nov 21, 2013 10:15 AM, "Daniel Holth" wrote: > +1 on unsorted repr(). It makes it obvious that the collection is not > sorted. > > On Thu, Nov 21, 2013 at 1:10 PM, Christian Heimes > wrote: > > Am 21.11.2013 18:57, schrieb Tim Peters: > >> Best to change the failing tests. For example, _they_ can sort the > >> dict keys if they rely on a fixed order. Sorting in general is a > >> dubious idea because it can be a major expense with no real benefit > >> for most uses. > > > > I don't consider repr() a performance-critical function. It's mostly > > used for debugging. > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > https://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/dholth%40gmail.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org >
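In test terms, the resolution Tim suggested amounts to asserting on something order-insensitive rather than on one fixed string. A sketch of such a check (not the actual patch applied on either tracker issue):

    import functools, re

    p = functools.partial(sorted, key=len, reverse=True)
    r = repr(p)
    assert r.startswith('functools.partial(<built-in function sorted>')
    # Compare the keyword names as a set, so dict order cannot matter:
    assert set(re.findall(r'(\w+)=', r)) == {'key', 'reverse'}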
-------------- next part -------------- An HTML attachment was scrubbed... URL: From tismer at stackless.com Thu Nov 21 19:53:22 2013 From: tismer at stackless.com (Christian Tismer) Date: Thu, 21 Nov 2013 19:53:22 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <-4665162615556836517@unknownmsgid> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> Message-ID: <528E56A2.7020905@stackless.com> ... > I also think having a 2.8 out there that is exactly the same as 2.7, > except that it was built with a different version of a compiler on one > particular platform is a very very bad idea. > This was not my proposal. I was seeking a way to make a version that produces no collisions with DLLs, and it was a question about Stackless, not Python. But I admit that I was not aware of a license problem. > So how CAN this be addressed? This is really a distribution issue, and > the distutils-sig has been hard at work building the infrastructure to > support things like this. > > Right now, if you build binary wheels with the two different > "official" OS-X builds, you will get two different names, and pip can > find the correct one. (last I checked, pip would happily install the > wrong one if you asked it to, though I think that is slated to be fixed) > > So if a different build of 2.7 for Windows is put out there, we need > to make sure it reports the platform in a way that wheel and pip can > make the distinction. > That was the reason to change the number. If other approaches like a different name for certain things are easy to do, I am fine with that, too. > It might be nice to patch the win_inst too--IIRC it's still not very > smart about even 32 vs 64 bit builds. > > As for stackless--just to be clear--can you use extensions built with > the "regular" python with a stackless binary? If so, I understand the > concern. If not, then it seems stackless is a separate ecosystem anyway. Good question, and this _is_ a problem: Minus a few things, Stackless is almost completely binary compatible with extensions, and people _will_ want to try Stackless for some things or might want to go back and use CPython, be it only to remove concerns of a customer. People do want to install binary packages which are not built for Stackless, and we always did our best to support that. That means: basically we want to improve the situation for Stackless-based projects in the first place, but make it easy to go back to CPython. We have a compiler switch that can completely remove the stackless additions and create a 100 % binary identical CPython version! That implies in the end that extension modules which work with Stackless will also be runnable on CPython 2.7.x VS2010 or whatever name it is. The naming problem then comes in again through the back-door, and we cannot shut our eyes and pretend "hey it is Stackless", because that is admittedly close to a fraud. So even if VS2010 exists only in the stackless branch, it is very likely to get used as CPython VS 2010, and I again have the naming problem ... cheers - Chris -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today?
http://www.stackless.com/ From ethan at stoneleaf.us Thu Nov 21 19:59:51 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 21 Nov 2013 10:59:51 -0800 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528E56A2.7020905@stackless.com> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> <528E56A2.7020905@stackless.com> Message-ID: <528E5827.1010008@stoneleaf.us> On 11/21/2013 10:53 AM, Christian Tismer wrote: > > So even if VS2010 exists only in the stackless branch, it is very likely > to get used as CPython VS 2010, and I again have the naming problem ... What's wrong with calling it CPython VS 2010? And Stackless VS 2010? -- ~Ethan~ From chris.barker at noaa.gov Thu Nov 21 20:46:08 2013 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 21 Nov 2013 11:46:08 -0800 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528E56A2.7020905@stackless.com> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> <528E56A2.7020905@stackless.com> Message-ID: On Thu, Nov 21, 2013 at 10:53 AM, Christian Tismer wrote: > I also think having a 2.8 out there that is exactly the same as 2.7, >> except that it was built with a different version of a compiler on one >> particular platform is a very very bad idea. >> > This was not my proposal. I was seeking a way to make a version that > produces no collisions with DLLs, and it was a question about Stackless, > not Python. But I admit that I was not aware of a license problem. > > well, as you said below, you want to keep binary compatibility between stackless and cPython, right down to the same dll name, so yes, it is about Python. And since we are talking about it -- it actually would be nice to be able to have builds of python for Windows (any version) that are not restricted to one compiler version per python version. (Like a VS2010 py2.7, for example, but not JUST that) >> So if a different build of 2.7 for Windows is put out there, we need to >> make sure it reports the platform in a way that wheel and pip can make the >> distinction. >> >> > That was the reason to change the number. If other approaches like a > different > name for certain things are easy to do, I am fine with that, too. well, there is precedent for that with the OS-X builds -- so reasonable enough. However, that really only helps for binary wheels and pip -- which haven't been widely adopted yet (to put it mildly!). So maybe a new dll name makes sense -- honestly while some of how Windows handles dlls makes "dll hell" inevitable, it's made worse by really short dll names, and re-using names even when versions change -- we're so far past the 8.3 world, I have no idea why that's still done. so a python27_VS_2010.dll or something might make sense. Throw some info in there about 64 vs 32 bit too? or is it enough that the linker will let you know if there is a problem? It might be nice to patch the win_inst too--IIRC it's still not very smart >> about even 32 vs 64 bit builds. >> >> As for stackless--just to be clear--can you use extensions built with the >> "regular" python with a stackless binary? If so, I understand the concern. >> If not, then it seems stackless is a separate ecosystem anyway.
>> > > Good question, and this _is_ a problem: > Minus a few things, Stackless is almost completely binary > compatible with extensions, and people _will_ want to try Stackless for > some > things or might want to go back and use CPython, be it only to remove > concerns of > a customer. > People do want to install binary packages which are not built for > Stackless, > and we have always made our best effort to support that. > > That means: basically we want to improve the situation for Stackless-based > projects in the first place, but make it easy to go back to CPython. > We have a compiler switch that can completely remove the stackless > additions and create a 100% binary-identical CPython version! > > That implies in the end that extension modules which work with Stackless > will also be runnable on CPython 2.7.x VS2010 or whatever name it is. > The naming problem then comes in again through the back door, > and we cannot shut our eyes and pretend "hey, it is Stackless", > because that is admittedly close to a fraud. > > So even if VS2010 exists only in the stackless branch, it is very likely > to get used as CPython VS 2010, and I again have the naming problem ... > Just to be clear here: You want to be able to create a non-stackless, regular old CPython binary built with VS2010. (which is also compatible with the stackless build) OK, now: Do you want people to be able to use extensions built by third parties for python.org CPython with your binary builds? If so, then there will need to be a python.org binary built with VS2010, and a way that makes it hard for people to try to use extensions built for a different VS-version-build. If not, then the only problem is that users of your VS2010-built binary will go off willy-nilly and try to use extensions built for python.org builds, and they may appear to work at first glance, but may do weird things or crash unexpectedly. I'd say the issue there is education as much as anything. Or couldn't you simply install your binary somewhere other than C:\python27? (and use different registry settings or whatever so the windows installers will not get confused?) The other potential route for error is a pip install -- you don't want pip to find a binary that is incompatible with your build -- so you should ensure that whatever pip/wheel uses to identify the build is set differently in your build (see the relevant PEPs). Note that for OS-X we have some of the same issues -- what with Homebrew and Macports, and Apple, and ... there are a lot of potentially binary incompatible builds of PythonX.Y out there. I don't think the issue really is safely resolved, but at a policy level, I THINK the conclusion on the distutils list was to declare that we only support binary wheels on PyPI for the python.org builds. Users of other builds should use their build system to get binaries (much like Linux). That policy is more or less in place already for Windows, though pretty much de facto, as there aren't any other widely used Windows binaries out there. (there is Cygwin though, and I think it reports itself as a different platform). -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed...
URL: From tismer at stackless.com Thu Nov 21 21:23:00 2013 From: tismer at stackless.com (Christian Tismer) Date: Thu, 21 Nov 2013 21:23:00 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528E5827.1010008@stoneleaf.us> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> <528E56A2.7020905@stackless.com> <528E5827.1010008@stoneleaf.us> Message-ID: <528E6BA4.8000909@stackless.com> On 21/11/13 19:59, Ethan Furman wrote: > On 11/21/2013 10:53 AM, Christian Tismer wrote: >> >> So even if VS2010 exists only in the stackless branch, it is very likely >> to get used as CPython VS 2010, and I again have the naming problem ... > > What's wrong with calling it CPython VS 2010? And Stackless VS 2010? Sounds tempting in the first place, but then I need to change policy a bit. With the compiler switch, we have always produced the identical Python version that Stackless is based upon. Well, this version string is not a problem. You mean I would stick with the latest CPython version numbers, produce maybe different dll names (maybe cpy27vs2010.dll and spy27vs2010.dll) which could even live together? Maybe I would generate a cpython and spython exe and support them both in the same distribution? What would happen with binary extensions on PyPI? Would they install and then fail, or is the distinction based upon the name "python"? If the latter is true, I think I could live with that, and build binary packages for all relevant extensions. At first sight I was about to complain, at second sight I am about to fall in the trap of being happy. I do not want to lose our relation to CPython and do any harm. But if there is such an easy solution, I might well go for it. In the end, I want to go the path that MvL/Nick proposed. But as a quick solution for now, I would love to have an easy first solution. Please do not take me too seriously, I need to sleep over this to see if the idea stands my nightmares. cheers & thanks -- Chris -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From greg.ewing at canterbury.ac.nz Thu Nov 21 22:02:29 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 22 Nov 2013 10:02:29 +1300 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> Message-ID: <528E74E5.7050702@canterbury.ac.nz> martin at v.loewis.de wrote: > Package authors would have to create multiple binary releases of > the same modules for Windows, and upload them to PyPI. pip would have > to learn to download the right one, depending on what build of Python > 2.7 is running. Is that much different from package authors having to release binaries for different versions of Python, if they want to support older versions? Having multiple binaries for the same x.y version is different from what's been done before, but it seems to me an unavoidable consequence of supporting one x.y version for longer than usual.
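To illustrate the point with file names: none of today's binary formats says anything about the compiler or C runtime behind a build, only the Python minor version and platform (a sketch; the exact spellings here are illustrative, from memory):

    # What the current binary distribution formats encode in their names:
    names = [
        'Foo-1.0.win32-py2.7.exe',      # bdist_wininst installer
        'Foo-1.0-py2.7-win32.egg',      # egg
        'Foo-1.0-cp27-none-win32.whl',  # wheel
    ]
    for name in names:
        print(name)  # only 'py2.7'/'cp27' and 'win32' -- nothing about the CRT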
If we *don't* provide a version for the current MS compiler, we will end up with 2.7 becoming effectively unsupported due to the actions of MS. Do we want to allow that to happen, or do we want 2.7 to remain viable for a while longer? -- Greg From greg.ewing at canterbury.ac.nz Thu Nov 21 22:09:38 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 22 Nov 2013 10:09:38 +1300 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> Message-ID: <528E7692.3080006@canterbury.ac.nz> Concerning the version number, I thought the intention of PEP 404 was simply to say that the PSF would not be releasing anything called Python 2.8, not to forbid anyone *else* from doing so. Or am I wrong about that? If I'm right, there's nothing stopping Christian from releasing Stackless Python 2.8 with whatever improvements he wants. If it includes other improvements beyond the change of compiler, it may even deserve the higher version number. -- Greg From p.f.moore at gmail.com Thu Nov 21 22:12:35 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 21 Nov 2013 21:12:35 +0000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528E74E5.7050702@canterbury.ac.nz> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <528E74E5.7050702@canterbury.ac.nz> Message-ID: On 21 November 2013 21:02, Greg Ewing wrote: > Is that much different from package authors having to > release binaries for different versions of Python, > if they want to support older versions? > > Having multiple binaries for the same x.y version > is different from what's been done before, but it > seems to me an unavoidable consequence of supporting > one x.y version for longer than usual. None of the currently available binary distribution formats distinguish Windows binaries by anything other than minor version. For wheels (and I think eggs), this is a showstopper, as the name is essential metadata (compatibility tags); for the other formats (wininst and msi) the name is merely informational - packagers could rename, but (a) they will forget, and (b) the users won't know if they have or not. Before we can cleanly support multiple ABIs for a single minor version on Windows, we need to have a resolution of this dilemma (which may be nothing more than "only binaries for the python.org builds are allowed on PyPI"...) Paul From v+python at g.nevcal.com Thu Nov 21 22:13:03 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 21 Nov 2013 13:13:03 -0800 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528E6BA4.8000909@stackless.com> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> <528E56A2.7020905@stackless.com> <528E5827.1010008@stoneleaf.us> <528E6BA4.8000909@stackless.com> Message-ID: <528E775F.7040800@g.nevcal.com> On 11/21/2013 12:23 PM, Christian Tismer wrote: > Maybe I would generate a cpython and spython exe and support them > both in the same distribution? That sounds cool, if possible. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From chris.barker at noaa.gov Thu Nov 21 22:17:24 2013 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 21 Nov 2013 13:17:24 -0800 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528E7692.3080006@canterbury.ac.nz> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> <528E7692.3080006@canterbury.ac.nz> Message-ID: On Thu, Nov 21, 2013 at 1:09 PM, Greg Ewing wrote: > Concerning the version number, I thought the intention of > PEP 404 was simply to say that the PSF would not be releasing > anything called Python 2.8, not to forbid anyone *else* > from doing so. > > Or am I wrong about that? > > If I'm right, there's nothing stopping Christian from > releasing Stackless Python 2.8 with whatever improvements > he wants. If it includes other improvements beyond the > change of compiler, it may even deserve the higher > version number. If so, I still think it's a very bad idea to use a version number bump for a compiler change -- that's not at all what the version number means. That version number specifies a version of the code, and a version of the language (and library), not a particular ABI -- after all, how many different binary versions of python2.7 are out there now? Multiple platforms, multiple bit-depths, etc. -Chris > > -- > Greg > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ > chris.barker%40noaa.gov > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Thu Nov 21 22:23:34 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 22 Nov 2013 10:23:34 +1300 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> <20131121135941.7c5b5015@fsol> Message-ID: <528E79D6.20506@canterbury.ac.nz> Kristján Valur Jónsson wrote: > The "namespace" question from Christian has to do with a "python28.dll" which would be > built using VS2010, so that this would never clash with a CPython version built the > same way. However, it *would* clash with someone else who did the same thing, e.g. Fred Bloggs releases BloggPython 2.8 and calls his dll python28.dll. So it would be a bit rude to claim a bare pythonxy.dll name for something that's not an official PSF version. -- Greg From chris.barker at noaa.gov Thu Nov 21 22:27:42 2013 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 21 Nov 2013 13:27:42 -0800 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <528E74E5.7050702@canterbury.ac.nz> Message-ID: On Thu, Nov 21, 2013 at 1:12 PM, Paul Moore wrote: > None of the currently available binary distribution formats > distinguish Windows binaries by anything other than minor version.
For > wheels (and I think eggs), this is a showstopper, as the name is > essential metadata (compatibility tags); for the other formats (wininst > and msi) the name is merely informational - packagers could rename, > but (a) they will forget, and (b) the users won't know if they have or > not. > exactly. > Before we can cleanly support multiple ABIs for a single minor version > on Windows, we need to have a resolution of this dilemma (which may be > nothing more than "only binaries for the python.org builds are allowed > on PyPI"...) That's already the unstated case. But besides stackless, some of us are advocating that there be python.org-provided binaries built with a newer compiler (eventually, anyway). Also, I haven't gotten a reply, but I get the impression that Christian would like stackless-users not to have an easy way to get this all messed up. The wheel naming scheme is defined by PEP 425. The bit in play here is: """ The platform tag is simply distutils.util.get_platform() with all hyphens - and periods . replaced with underscore _. win32 linux_i386 linux_x86_64 """ I suspect that now we have only win32 and win64 for platform_tags for the python.org Windows builds. But I'm also pretty sure that, for instance, cygwin builds use a different tag. And the "official" python.org OS-X builds have two different platform tags for the two builds. So the precedent is there -- and it's easy enough to keep "win32" as the VS2008 version, and then have a "win32_VS_2010" or whatever for a newer build. That wouldn't take much to do, and it would allow pip and binary wheels to "just work". It would be nice if the msi installers could be similarly patched, but I have no idea what that would take. -Chris > Paul > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/chris.barker%40noaa.gov > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From Steve.Dower at microsoft.com Thu Nov 21 22:30:55 2013 From: Steve.Dower at microsoft.com (Steve Dower) Date: Thu, 21 Nov 2013 21:30:55 +0000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121013120.4c827da8@fsol> Message-ID: <6cf842157d9c47fc95456268ec835bb5@BLUPR03MB293.namprd03.prod.outlook.com> Nick Coghlan wrote: > On 21 Nov 2013 10:33, "Antoine Pitrou" wrote: >> I think it isn't only about teaching it to build with VS 2010, but >> providing binaries compatible with the VS 2010 runtime. >> Otherwise, AFAIU, if extensions are built with VS 2010 but loaded with >> a VS 2008-compiled Python, there will be ABI problems. > > Right, this is the problem - building with newer compilers isn't an issue, > changing the Windows ABI *is* (which is the reason Christian is proposing a > version bump to denote the ABI incompatibility). > > If Visual Studio Express 2010 and later can be told to build against the VS2008 > C runtime, then that is something that can (and probably should) be enabled in > the next CPython 2.7 maintenance release (for both the main build process and > for distutils extension building). Unfortunately, this is not the case.
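For what it's worth, the running interpreter already reveals which MSVC produced it -- a minimal sketch (the MSC-to-VS mapping here is an assumption from memory, not an official table):

    import re
    import sys
    import platform

    # On Windows builds, sys.version embeds the compiler version, e.g.
    # '2.7.6 (...) [MSC v.1500 32 bit (Intel)]'
    print(platform.python_compiler())

    msc_to_vs = {1500: 'VS2008', 1600: 'VS2010',
                 1700: 'VS2012', 1800: 'VS2013'}  # assumed mapping
    match = re.search(r'MSC v\.(\d+)', sys.version)
    if match:
        print(msc_to_vs.get(int(match.group(1)), 'unknown MSC version'))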
The compiler has intimate knowledge of the associated CRT and can't build against other versions (officially; if there's an unofficial way I genuinely don't know what it is). It is possible to use the older compiler from newer versions of Visual Studio, but that doesn't really solve any issues. > Doing a 2.8 release *just* to change the Windows ABI to a new version of the MS > C runtime, on the other hand, seems impossible to do without thoroughly > confusing people (regardless of whether it's CPython or Stackless making such a > change). I agree, this is a bad idea. We'd also need to update distutils to detect the right version, and if we can do this then it's easier to update it to detect VC9 when installed from http://www.microsoft.com/en-us/download/details.aspx?id=3138 - currently it only finds VS2008, which installs things slightly differently, despite being an identical compiler. I've had one go at making the change (http://bugs.python.org/issue7511), though that (a) had a bug and (b) broke people who monkey patch distutils to return a different vcvarsall.bat (including the distutils tests - my fix removed the dependency on vcvarsall.bat completely). Happy to revise it if there's interest in taking the change. > I'd certainly want to explore all our alternatives with the Microsoft folks for > getting our preferred option (new compiler, old C runtime) working in an open > source friendly way before we went down the path of a 2.x ABI bump. In my experience, extensions built with later compilers work fine, provided you stay away from the FILE* APIs. One of my side projects is C++11 bindings for hosting/extending Python, and I run my tests with VC12 (Visual Studio 2013) against python.org's 2.6, 2.7, 3.2 and 3.3 binaries without issue. What may be best is to detect MSVC>9.0 in the headers and simply block the dangerous APIs when building sdists. Then extensions can be built with any available version of MSVC, provided they are "safe", and the build will fail if they aren't. There are still some ways the differing CRTs can cause issues - alloc/free pairing, for example - so some macros may also have to become exported functions. After that I don't think there are any ways to accidentally cause issues; you'd really have to be doing strange things (such as passing pointers to CRT types between extensions via capsules). > I've cc'ed Steve Dower directly, as he's the best person I know to ask about > this kind of problem. Cheers, Steve From dholth at gmail.com Thu Nov 21 22:31:01 2013 From: dholth at gmail.com (Daniel Holth) Date: Thu, 21 Nov 2013 16:31:01 -0500 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <528E74E5.7050702@canterbury.ac.nz> Message-ID: On Thu, Nov 21, 2013 at 4:12 PM, Paul Moore wrote: > On 21 November 2013 21:02, Greg Ewing wrote: >> Is that much different from package authors having to >> release binaries for different versions of Python, >> if they want to support older versions? >> >> Having multiple binaries for the same x.y version >> is different from what's been done before, but it >> seems to me an unavoidable consequence of supporting >> one x.y version for longer than usual. > > None of the currently available binary distribution formats > distinguish Windows binaries by anything other than minor version. 
For > wheels (and I think eggs), this is a showstopper as the name is > essential metadata (compatibility tags) for the other formats (wininst > and msi) the name is merely informational - packagers could rename, > but (a) they will forget, and (b) the users won't know if they have or > not. > > Before we can cleanly support multiple ABIs for a single minor version > on Windows, we need to have a resolution of this dilemma (which may be > nothing more than "only binaries for the python.org builds are allowed > on PyPI"...) > > Paul As for wheel it actually does support an ABI tag that is separate from the Python version and the architecture. It's the second one pyversion-abi-architecture as in py27-none-any or py27-cp27-linux_i386 (spelling?). The build tool and installer would have to be modified to be aware of any newly defined ABI tags. From tismer at stackless.com Thu Nov 21 22:32:55 2013 From: tismer at stackless.com (Christian Tismer) Date: Thu, 21 Nov 2013 22:32:55 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> <528E56A2.7020905@stackless.com> Message-ID: <528E7C07.8080405@stackless.com> On 21/11/13 20:46, Chris Barker wrote: > well, as you said below, you want to keep binary compatibility between > stackless and cPython, right down to the same dll name, so yes, it is > about Python. And since we are talking about it -- it actually would > be nice to be able to have a builds of python for Windows (any > version) that are not restricted to one compiler version per python > version. (Like a VS2010 py2.7, for example, but not JUST that) That is a great target that I cannot address right now, but would love to work on that, when I have understood all the API/ABI miracles. I was not aware of those things that are already there. > > well, there is precedent for that with the OS-X builds -- so > reasonable enough. However, that really only helps for binary wheels > and pip -- which haven't been widely adopted yet (to put it mildly!). > So maybe a new dll name makes sense -- honestly while some of how > Windows handles dlls makes "dll hell" inevitable, it's made worse by > really short dll names, and re-using names even when versions change > -- we're so far past the 8.3 world, I have no idea why that's still done. > > so a python27_VS_2010.dll or something might make sense. > I am converted to an OS X developer since 2006, but never had ABI problems, because I use homebrew, but instead of being set free on Windows after 30 years of pain, I now have the same mess in my Parallels VMs. Customers are so cruel, aren't they? > Throw some info in there about 64 vs 32 bit too? or is it enough that > the linker will let you know if there is a problem? > > It might be nice to patch the win_inst too--IIRC it's still > not very smart about even 32 vs 64 bit builds. > > As for stackless--just to be clear--can you use extensions > built with the "regular" python with a stack less binary? If > so, I understand the concern. If not, then it seems stackless > is a separate ecosystem anyway. > > > Good question, and this _is_ a problem: > Minus a few number of things, Stackless is almost completely binary > compatible with extensions, and people _will_ want to try > Stackless for some > things or might want to go back and use CPython, be it only to > remove concerns of > a customer. 
> People do want to install binary packages which are not built for > Stackless, > and we have always made our best effort to support that. > > That means: basically we want to improve the situation for Stackless-based > projects in the first place, but make it easy to go back to CPython. > We have a compiler switch that can completely remove the stackless > additions and create a 100% binary-identical CPython version! > > That implies in the end that extension modules which work with Stackless > will also be runnable on CPython 2.7.x VS2010 or whatever name it is. > The naming problem then comes in again through the back door, > and we cannot shut our eyes and pretend "hey, it is Stackless", > because that is admittedly close to a fraud. > > So even if VS2010 exists only in the stackless branch, it is very likely > to get used as CPython VS 2010, and I again have the naming problem ... > > > Just to be clear here: > > You want to be able to create a non-stackless, regular old CPython > binary built with VS2010. (which is also compatible with the stackless build) Yes, this is the idea, to some contorted definition of 'idea'. > > OK, now: > > Do you want people to be able to use extensions built by third parties > for python.org CPython with your binary builds? Would be great, but I would not mind creating their extensions on > stackless.com, instead. > > If so, then there will need to be a python.org > binary built with VS2010, and a way that makes it hard for people to > try to use extensions built for a different VS-version-build. > > If not, then the only problem is that users of your VS2010-built > binary will go off willy-nilly and try to use extensions built for > python.org builds, and they may appear to work at > first glance, but may do weird things or crash unexpectedly. Yes, in the end it is much better to get some changes into CPython. But as I read the input from Nick and Martin, I am afraid this is over my tops, at least for the timeline I have set for myself. (And yeah, I was pushy, as I always was with the Stackless project - forgive me). > > I'd say the issue there is education as much as anything. > > Or couldn't you simply install your binary somewhere other than > C:\python27? (and use different registry settings or whatever so the > windows installers will not get confused?) > Yes I can, and I'm pretty much considering it. Seeking an improvement right now, not a controversial path or whatnot... > The other potential route for error is a pip install -- you don't want > pip to find a binary that is incompatible with your build -- so you > should ensure that whatever pip/wheel uses to identify the build is > set differently in your build (see the relevant PEPs). Yes, I want to make pip work with it, want to make it very simple to install whatnot, and let people use that stuff. So if you can, please teach me what I need to do or avoid. I don't want to intrude anywhere, I just want to make the Stackless site a useful site where people can try extensions and additions without getting into that DLL hell where I was for ages. Conclusion: ---------- I do not want to do anything bad. I do not want to solve hard-to-solve ABI problems in a week. I do not want to drive python-dev crazy right now for just that. What I want is a workable CPython path for some customer (!=CCP) to use for the next (maybe 5) years, and I want to build that now, for good. I think you have helped me incredibly much, and we need to talk in private.
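As far as I have understood it so far, this is the identity that would have to differ -- a minimal sketch of the tag triple that pip matches against (applying the PEP 425 rules; hard-coding the ABI tag to 'none' is a simplification, since Python 2 has no SOABI):

    import sys
    import distutils.util

    python_tag = 'cp%d%d' % sys.version_info[:2]   # e.g. 'cp27'
    abi_tag = 'none'                               # no SOABI on Python 2
    # PEP 425: the platform tag is get_platform() with '-' and '.' -> '_'
    platform_tag = distutils.util.get_platform().replace('-', '_').replace('.', '_')

    print('-'.join([python_tag, abi_tag, platform_tag]))  # e.g. 'cp27-none-win32'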
Cheers -- Chris -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From tismer at stackless.com Thu Nov 21 22:41:10 2013 From: tismer at stackless.com (Christian Tismer) Date: Thu, 21 Nov 2013 22:41:10 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528E775F.7040800@g.nevcal.com> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> <528E56A2.7020905@stackless.com> <528E5827.1010008@stoneleaf.us> <528E6BA4.8000909@stackless.com> <528E775F.7040800@g.nevcal.com> Message-ID: <528E7DF6.7070408@stackless.com> On 21/11/13 22:13, Glenn Linderman wrote: > On 11/21/2013 12:23 PM, Christian Tismer wrote: >> Maybe I would generate a cpython and spython exe and support them >> both in the same distribution? > That sounds cool, if possible. Hooka Hooka! Let's see if the nightmares agree :-) -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From cf.natali at gmail.com Thu Nov 21 22:41:17 2013 From: cf.natali at gmail.com (Charles-François Natali) Date: Thu, 21 Nov 2013 22:41:17 +0100 Subject: [Python-Dev] PEP 454 - tracemalloc - accepted Message-ID: Hi, I'm happy to officially accept PEP 454 aka tracemalloc. The API has substantially improved over the past weeks, and is now both easy to use and suitable as a foundation for high-level tools for memory profiling. Thanks to Victor for his work! Charles-François From brett at python.org Thu Nov 21 22:43:10 2013 From: brett at python.org (Brett Cannon) Date: Thu, 21 Nov 2013 16:43:10 -0500 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528E7692.3080006@canterbury.ac.nz> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> <528E7692.3080006@canterbury.ac.nz> Message-ID: On Thu, Nov 21, 2013 at 4:09 PM, Greg Ewing wrote: > Concerning the version number, I thought the intention of > PEP 404 was simply to say that the PSF would not be releasing > anything called Python 2.8, not to forbid anyone *else* > from doing so. > > Or am I wrong about that? > Well, it's essentially BSD open source. We can't stop anyone from doing anything when it comes to code. > > If I'm right, there's nothing stopping Christian from > releasing Stackless Python 2.8 with whatever improvements > he wants. You're right, there is nothing legally stopping him. > If it includes other improvements beyond the > change of compiler, it may even deserve the higher > version number. I disagree with that. Python-dev will not be releasing Python 2.8, ever (that's what PEP 404 states).
If I decided to start making decisions about what constituted Python 2.8 outside of python-dev, then it really isn't a new feature release and thus does not deserve the number, as it isn't a release from the collective decision-making of the core developers of Python. That's why so many people leapt up initially when Christian said "Stackless Python 2.8"; it isn't a true successor and no one wants to confuse users into thinking that. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Nov 21 22:14:18 2013 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 21 Nov 2013 13:14:18 -0800 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528E74E5.7050702@canterbury.ac.nz> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <528E74E5.7050702@canterbury.ac.nz> Message-ID: On Thu, Nov 21, 2013 at 1:02 PM, Greg Ewing wrote: > martin at v.loewis.de wrote: > >> Package authors would have to create multiple binary releases of >> the same modules for Windows, and upload them to PyPI. pip would have >> to learn to download the right one, depending on what build of Python >> 2.7 is running. >> > > Is that much different from package authors having to > release binaries for different versions of Python, > if they want to support older versions? > or a better analogy -- both 32 and 64 bit versions -- same python version, different binary compatibility, and really exactly like the OS-X situation: one for OS-X version 10.3+ (ppc and intel32) and one for 10.6+ (intel32+64). In both cases, users need to know what they have installed, and be careful that they use a matching binary extension when they grab one. And we are in the process of having PyPI and pip do the right thing with wheels. I think the "only" added complication is that a 64-bit cpython built with VS2008 and one built with VS2010 would look exactly the same to both the user and the linker (i.e. python27.dll would be different, but still link). So it would be helpful if we could: - Have a file naming scheme so people know what they are getting for both the python binary and extension binaries. - Have a wheel naming and identifying scheme that also makes that distinction. - Probably have a dll naming scheme that makes the distinction as well, so if the above two checks fail, users will get a clear failure at runtime, rather than a buggy success. -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Thu Nov 21 23:09:49 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 21 Nov 2013 23:09:49 +0100 Subject: [Python-Dev] PEP 454 - tracemalloc - accepted In-Reply-To: References: Message-ID: 2013/11/21 Charles-François Natali : > I'm happy to officially accept PEP 454 aka tracemalloc. > The API has substantially improved over the past weeks, and is now > both easy to use and suitable as a foundation for high-level tools for > memory profiling. > > Thanks to Victor for his work! Thanks Charles-François for your help!
Victor From martin at v.loewis.de Thu Nov 21 23:13:28 2013 From: martin at v.loewis.de (martin at v.loewis.de) Date: Thu, 21 Nov 2013 23:13:28 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528E7692.3080006@canterbury.ac.nz> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> <528E7692.3080006@canterbury.ac.nz> Message-ID: <20131121231328.Horde.nrfQw2ZUCpk6e2yXHIZrlg1@webmail.df.eu> Quoting Greg Ewing : > Concerning the version number, I thought the intention of > PEP 404 was simply to say that the PSF would not be releasing > anything called Python 2.8, not to forbid anyone *else* > from doing so. > > Or am I wrong about that? That's correct. > If I'm right, there's nothing stopping Christian from > releasing Stackless Python 2.8 with whatever improvements > he wants. "Nothing stopping" is exactly right. People still don't need to like it. Barry wishes there was something stopping him, and be it a lawyer invoking trademark law. If "2.8" was just a version number of Stackless Python not related to Python version (like PyPy's version number currently being 2.2), protest would be much less. People fear that releasing Stackless Python 2.8 would create the impression that even CPython has continued development, and it might require significant PR efforts to educate people. Regards, Martin From ncoghlan at gmail.com Thu Nov 21 23:22:41 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 22 Nov 2013 08:22:41 +1000 Subject: [Python-Dev] flaky tests caused by repr() sort order In-Reply-To: <528E4C95.4040000@python.org> References: <528E4C95.4040000@python.org> Message-ID: On 22 Nov 2013 04:12, "Christian Heimes" wrote: > > Am 21.11.2013 18:57, schrieb Tim Peters: > > Best to change the failing tests. For example, _they_ can sort the > > dict keys if they rely on a fixed order. Sorting in general is a > > dubious idea because it can be a major expense with no real benefit > > for most uses. > > I don't consider repr() as a performance critical function. It's mostly > used for debugging. +1 from me for fixing the tests (likely by checking the expected sort order rather than assuming it) rather than changing the repr() implementations. Cheers, Nick. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Nov 21 23:24:13 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 22 Nov 2013 08:24:13 +1000 Subject: [Python-Dev] PEP 454 - tracemalloc - accepted In-Reply-To: References: Message-ID: On 22 Nov 2013 07:43, "Charles-Fran?ois Natali" wrote: > > Hi, > > I'm happy to officially accept PEP 454 aka tracemalloc. > The API has substantially improved over the past weeks, and is now > both easy to use and suitable as a fundation for high-level tools for > memory-profiling. > > Thanks to Victor for his work! Huzzah! Thanks to you both for getting this ready for inclusion :) Cheers, Nick. 
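For anyone who wants to take it for a spin, a minimal usage sketch of the accepted API:

    import tracemalloc

    tracemalloc.start()

    data = [dict(index=i) for i in range(100000)]  # allocate something to measure

    snapshot = tracemalloc.take_snapshot()
    for stat in snapshot.statistics('lineno')[:5]:
        print(stat)  # file and line, block count, total size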
> > Charles-François > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Nov 21 23:33:49 2013 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 21 Nov 2013 14:33:49 -0800 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528E7C07.8080405@stackless.com> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> <528E56A2.7020905@stackless.com> <528E7C07.8080405@stackless.com> Message-ID: On Thu, Nov 21, 2013 at 1:32 PM, Christian Tismer wrote: > I am converted to an OS X developer since 2006, but never had ABI > problems, > because I use homebrew, > Right, different story -- you are supposed to compile everything on the target system, so everything stays compatible. but instead of being set free on Windows after 30 years > of pain, I now have the same mess in my Parallels VMs. > Customers are so cruel, aren't they? > business would be so much easier without them [but less successful ;-) ] Do you want people to be able to use extensions built by third parties for > python.org CPython with your binary builds? > > Would be great, but I would not mind creating their extensions on > stackless.com, instead. > The other potential route for error is a pip install -- you don't want pip > to find a binary that is incompatible with your build -- so you should > ensure that whatever pip/wheel uses to identify the build is > set differently in your build (see the relevant PEPs). > > > Yes, I want to make pip work with it, want to make it very simple to > install > whatnot, and let people use that stuff. So if you can, please teach me > what I need to do or avoid. > I think it's as "simple" as making sure that the names your binary generates for pip and wheel are different from the python.org VS2008 build. That way, if someone does "pip install something" with your binary, it won't accidentally download a binary wheel, but rather will try to install from source. And if their compiler is set up right, that should work (ignore dependency hell for now). And you could build wheels and host them on the stackless site for things that aren't easy to build. Folks could still get in trouble trying to install msi installers built for python.org python -- but it sounds like you could fix that with a dll name change. What I want is a workable CPython path for some customer (!=CCP) to use > for the next (maybe 5) years, and I want to build that now, for good. > That's the trick -- I don't think it's only me that thinks we should have the option, at some point in the future, to put out a VS_something_else binary of Python2.7, or a future version. So it would be nice to come to a consensus on the pip/wheel/dll naming scheme before you put anything out. On Thu, Nov 21, 2013 at 1:31 PM, Daniel Holth wrote: > As for wheel it actually does support an ABI tag that is separate from > the Python version and the architecture. It's the second one > pyversion-abi-architecture as in py27-none-any or py27-cp27-linux_i386 > (spelling?). The build tool and installer would have to be modified to > be aware of any newly defined ABI tags. I'm still confused a bit about ABI vs. Platform.
I had thought that ABI meant the ABI defined by the python source code, rather than the ABI defined by the compiler -- the examples in the doc imply that. As for "platform", I'm not sure how to define that, but it seems reasonable to consider the run-time core C lib as part of the "platform". In any case, that's what the Mac builds are doing. But it really doesn't matter which you use, as long as there is some consistency. Maybe this part is a topic for the distutils-sig. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Nov 21 23:53:37 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 22 Nov 2013 08:53:37 +1000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <-4665162615556836517@unknownmsgid> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> Message-ID: On 22 Nov 2013 02:03, "Chris Barker - NOAA Federal" wrote: > > >> >> with older releases (I admit I don't understand the ABI compatibility on OSX). > > > Well, with OS-X, it's not exactly the C lib in the same way, but there are different SDKs for different OS versions, and you can add to that PPC vs Intel processors and 32 vs 64 bit. > > So we have for years had two builds for OS-X, and you can add to that Macports, Homebrew, and what have you. > > And the idea that this isn't an issue for gcc makes no sense-- it's such a big issue for Linux, in fact that python.org doesn't even try to build binaries for Linux, and PyPI has disabled binary wheels for Linux. Indeed :) For 2.7.7, I think some combination of the two following ideas would be worth pursuing: - a C runtime independent API flag (set by default on Windows when building with a compiler other than VS2008). This would largely be a backport of some of the stable ABI work from Python 3. - getting Windows closer to the current Mac OS X situation by ensuring that the C runtime used directly affects the ABI flags and shared library names. PyPI would apply the Mac OS X guideline where extensions are expected to be compatible with the python.org binaries. This would be the biggest change pushed through under the "make builds work" policy for the extended 2.7 lifecycle, but Microsoft's aggressive approach to deprecating old compilers and C runtimes means I think we don't have much choice. In the near term, if Stackless builds to a different DLL name under VS2010 and makes it clear to their users that extension compatibility issues are possible (or even likely) if they aren't rebuilt from source, then I think that would be compatible with the above proposal for a way forward. Then we'd just need some volunteers to write and implement a PEP or two :) (Note, similar to the Mac OS X situation, I think we should do this without hosting any new interpreter variants on python.org - VS2010 and VS2013 source builds would become separate build-from-source ecosystems for extensions, using sdists on PyPI as the default distribution mechanism) Cheers, Nick. > > >> >> So adding VS 2010 build files to the 2.7 tree would be fine with me, > > I think this part is a no-brainer.
> > I also think having a 2.8 out there that is exactly the same as 2.7, except that it was built with a different version of a compiler on one particular platform is a very very bad idea. > > So how CAN this be addressed? This is really a distribution issue, and the distutils-sig has been hard at work building the infrastructure to support things like this. > > Right now, if you build binary wheels with the two different "official" OS-X builds, you will get two different names, and pip can find the correct one. (last I checked, pip would happily install the wrong one if you asked it to, though I think that is slated to be fixed) > > So if a different build of 2.7 for Windows is put out there, we need to make sure it reports the platform in a way that wheel and pip can make the distinction. > > It might be nice to patch the win_inst too--IIRC it's still not very smart about even 32 vs 64 bit builds. > > As for stackless--just to be clear--can you use extensions built with the "regular" python with a stackless binary? If so, I understand the concern. If not, then it seems stackless is a separate ecosystem anyway. > > Chris > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Fri Nov 22 00:02:02 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 22 Nov 2013 00:02:02 +0100 Subject: [Python-Dev] PEP 454 - tracemalloc - accepted In-Reply-To: References: Message-ID: 2013/11/21 Nick Coghlan : > Huzzah! Thanks to you both for getting this ready for inclusion :) I now hope that someone will use it :-) By the way, collections.namedtuple has a private _source attribute. This attribute uses something like 676.2 kB in the Python test suite, making it the 5th biggest user of memory. Would you mind if I simply remove it? I'm asking because this *private* attribute is documented, which is surprising for something private. I don't understand the use case. Is it to debug the implementation of the namedtuple? Or I imagined a different use case: dump the _source into a file to speed up the startup time. But this optimization is not used in CPython, even though we care about startup time. Maybe Raymond can explain the use case for this attribute? If there is a real use case, I would prefer a real function to get the source code used to define a new namedtuple type. I already opened an issue for that: http://bugs.python.org/issue19640 -- The implementation of the namedtuple constructor ("factory" ?) looks inefficient. Why do we need to build source code and then parse that source code? Why not have a base namedtuple class and then inherit from this class? It might reduce the memory footprint, allow checking whether a type is a namedtuple, etc. These are two different concerns; I prefer to focus only on the removal of the _source attribute. But it is also worth saying that the _source attribute is very specific to the current (inefficient?) implementation of the namedtuple constructor.
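To make concrete what is being paid for -- each generated class keeps one such string alive (a quick sketch):

    from collections import namedtuple

    Point = namedtuple('Point', ['x', 'y'])

    # _source holds the complete class definition that namedtuple() built
    # as a string and exec()ed to create the Point class
    print(len(Point._source))             # size in characters, retained per type
    print(Point._source.splitlines()[0])  # first line of the generated code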
Victor From p.f.moore at gmail.com Fri Nov 22 00:07:05 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 21 Nov 2013 23:07:05 +0000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <528E74E5.7050702@canterbury.ac.nz> Message-ID: On 21 November 2013 21:27, Chris Barker wrote: > That's already the unstated case. But besides stackless, it some of us are > advocating that there be python.org-provided binaries built with a newer > compiler (eventually, anyway). I see no problem with python.org producing and distributing a VS2010-compiled version of Python 2.7 (subject to python-dev being willing to do so of course). But in order to do so, it would be necessary to address the issue of having multiple ABIs within one minor version on Windows. As you say, that's just a matter of doing things like enabling the ABI tag in wheel, making the necessary distutils changes (don't forget that distutils validates the MSVC version in use) and whatever other things I've forgotten about. It's by no means impossible, but nor is it trivial to do it "right". On the other hand, as Monty Python would say, "2.8 is right out" :-) Paul From ncoghlan at gmail.com Fri Nov 22 00:17:14 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 22 Nov 2013 09:17:14 +1000 Subject: [Python-Dev] PEP 454 - tracemalloc - accepted In-Reply-To: References: Message-ID: On 22 Nov 2013 09:02, "Victor Stinner" wrote: > > 2013/11/21 Nick Coghlan : > > Huzzah! Thanks to you both for getting this ready for inclusion :) > > I now hope that someone will use it :-) > > > By the way, collections.namedtuple has a private _source attribute. > This attributes uses something like 676.2 kB in the Python test suite, > it the 5th biggest user of memory. > > Woud you mind if I simply remove it? > > I'm asking because this *private* attribute is documented, which is > surprising for something private. I don't understand the use case. Is > it to debug the implementation of the namedtuple? Or I imagined a > different usecase: dump the _source into a file to speedup the startup > time. But this optimization is not used in CPython whereas we care of > the startup time. namedtuple deviates from normal naming conventions to avoid name clashes between field names and the attributes and methods of the object itself (this is covered in the docs for namedtuple). Skipping saving _source under -OO would probably be a good thing, but otherwise it's a public API with the usual backwards compatibility guarantees. Cheers, Nick. > > Can maybe Raymond explain the use case for this attribute? > > If there is a real use case, I would prefer a real function to get the > source code of a function defining a new namedtuple type. > > I already opened an issue for that: > http://bugs.python.org/issue19640 > > -- > > The implementation of namedtuple constructor ("factory" ?) looks > inefficient. Why do we need to build source code to then parse the > code source? Why not having a base namedtuple class and then inherit > from this class? It may reduce the memory footprint, allow to check if > a type is namedtuple, etc. > > There are two different concerns, I prefer to only focus on the > removal of the _source attribute. But it's also to say the the _source > attribute is very specific to the current (inefficient?) implement of > the namedtuple constructor. 
> > Victor -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Fri Nov 22 00:36:57 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 21 Nov 2013 18:36:57 -0500 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131121231328.Horde.nrfQw2ZUCpk6e2yXHIZrlg1@webmail.df.eu> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> <528E7692.3080006@canterbury.ac.nz> <20131121231328.Horde.nrfQw2ZUCpk6e2yXHIZrlg1@webmail.df.eu> Message-ID: On 11/21/2013 5:13 PM, martin at v.loewis.de wrote: > > Quoting Greg Ewing : > >> Concerning the version number, I thought the intention of >> PEP 404 was simply to say that the PSF would not be releasing >> anything called Python 2.8, not to forbid anyone *else* >> from doing so. >> >> Or am I wrong about that? > > That's correct. > >> If I'm right, there's nothing stopping Christian from >> releasing Stackless Python 2.8 with whatever improvements >> he wants. > > "Nothing stopping" is exactly right. People still don't need > to like it. Barry wishes there was something stopping him, > and be it a lawyer invoking trademark law. My lay knowledge of US Trademark law and case history and reading of http://www.python.org/psf/trademarks/ suggests that 'nothing stopping' is exactly wrong. I believe the trademark has also been registered in Europe. As usual, 'I am not a lawyer', but if Christian wants to push forward with using 'Python 2.8', I suggest that he consult the PSF Trademark Committee and lawyer first. > If "2.8" was just a version number of Stackless Python not > related to Python version (like PyPy's version number currently > being 2.2), protest would be much less. But it is *not* unrelated. > People fear that > releasing Stackless Python 2.8 would create the impression that > even CPython has continued development, and it might require > significant PR efforts to educate people. Yep, I do. 'Stackless 10' would have no issue. --- I think two unrelated issues are being mixed together in this thread that really should be two separate thread. 1. Compiling Windows CPython x.y with more than one compiler. This is not specifically a Stackless issue or even a 2.7 issue. If 3.4 is released compiled with VS2010, there will be people wanting it compiled with VS2013(12?). And vice versa. 2. Public releases of new Python versions based on 2.7 that are not 3.x. How they are named is an issue regardless of what Windows compiler is used, if indeed a release even uses one. In my view, either no one should be allowed to call something 'X Python 2.8' (or any unofficial x.y), or everyone should. The latter might mean that we see Stackless Python 2.8, Ubuntu Python 2.8, RedHat Python 2.8, ActiveState Python 2.8, Enthought Python 2.8, etc, all different. I prefer no one. 
-- Terry Jan Reedy From barry at python.org Fri Nov 22 00:43:37 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 21 Nov 2013 18:43:37 -0500 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> <528E7692.3080006@canterbury.ac.nz> <20131121231328.Horde.nrfQw2ZUCpk6e2yXHIZrlg1@webmail.df.eu> Message-ID: <20131121184337.0cf4f001@anarchist> On Nov 21, 2013, at 06:36 PM, Terry Reedy wrote: >As usual, 'I am not a lawyer', but if Christian wants to push forward with >using 'Python 2.8', I suggest that he consult the PSF Trademark Committee and >lawyer first. Just to make clear, I'm definitely *not* suggesting this particular case ever escalate to such a level. Chris is our friend and a nice guy. :) I'm just saying that I think we should avoid any possible confusion on the part of ours (and Stackless's) users. And based on the discussions here, there are plenty of good alternatives. Cheers, -Barry From solipsis at pitrou.net Fri Nov 22 00:53:20 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Nov 2013 00:53:20 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> <528E7692.3080006@canterbury.ac.nz> <20131121231328.Horde.nrfQw2ZUCpk6e2yXHIZrlg1@webmail.df.eu> <20131121184337.0cf4f001@anarchist> Message-ID: <20131122005320.4c584dad@fsol> On Thu, 21 Nov 2013 18:43:37 -0500 Barry Warsaw wrote: > On Nov 21, 2013, at 06:36 PM, Terry Reedy wrote: > > >As usual, 'I am not a lawyer', but if Christian wants to push forward with > >using 'Python 2.8', I suggest that he consult the PSF Trademark Committee and > >lawyer first. > > Just to make clear, I'm definitely *not* suggesting this particular case ever > escalate to such a level. Chris is our friend and a nice guy. :) > > I'm just saying that I think we should avoid any possible confusion on the > part of ours (and Stackless's) users. And based on the discussions here, > there are plenty of good alternatives. +1 from me :-) Regards Antoine. From tismer at stackless.com Fri Nov 22 01:00:37 2013 From: tismer at stackless.com (Christian Tismer) Date: Fri, 22 Nov 2013 01:00:37 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131122005320.4c584dad@fsol> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <528D478E.1040105@stackless.com> <20131121130601.42c09c1a@fsol> <528E7692.3080006@canterbury.ac.nz> <20131121231328.Horde.nrfQw2ZUCpk6e2yXHIZrlg1@webmail.df.eu> <20131121184337.0cf4f001@anarchist> <20131122005320.4c584dad@fsol> Message-ID: <528E9EA5.9030607@stackless.com> On 22.11.13 00:53, Antoine Pitrou wrote: > On Thu, 21 Nov 2013 18:43:37 -0500 > Barry Warsaw wrote: > >> On Nov 21, 2013, at 06:36 PM, Terry Reedy wrote: >> >>> As usual, 'I am not a lawyer', but if Christian wants to push forward with >>> using 'Python 2.8', I suggest that he consult the PSF Trademark Committee and >>> lawyer first. >> Just to make clear, I'm definitely *not* suggesting this particular case ever >> escalate to such a level. Chris is our friend and a nice guy. :) >> >> I'm just saying that I think we should avoid any possible confusion on the >> part of ours (and Stackless's) users. And based on the discussions here, >> there are plenty of good alternatives. > +1 from me :-) Thank you for that input! 
It was important and urgent, as I saw myself jumping into the wrong wagon, again. -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From chris.barker at noaa.gov Fri Nov 22 01:22:51 2013 From: chris.barker at noaa.gov (Chris Barker) Date: Thu, 21 Nov 2013 16:22:51 -0800 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> Message-ID: On Thu, Nov 21, 2013 at 2:53 PM, Nick Coghlan wrote: > For 2.7.7, I think some combination of the two following ideas would be > worth pursuing: > > - a C runtime independent API flag (set by default on Windows when > building with a compiler other than VS2008). This would largely be a > backport of some of the stable ABI work from Python 3. > very cool! > - getting Windows closer to the current Mac OS X situation by ensuring > that the C runtime used directly affects the ABI flags and shared library > names. PyPI would apply the Mac OS X guideline where extensions are > expected to be compatible with the python.org binaries. > sounds good. > (Note, similar to the Mac OS X situation, I think we should do this > without hosting any new interpreter variants on python.org - > but we do host different variants for OS-X on python.org. It's complicated by the universal binary thing, but there are two variants hosted. There are also two Windows variants, 32 and 64 bit -- yes, you can think of those as different platforms, but as you can run the 32 bit build on a 64 bit Windows, things do get confused -- I've seen a lot of questions on mailing lists that have to do with those two getting mixed up (the msi installer really could be smarter...) I'm not really proposing that python.org host another variant at this point, but I don't think it should be off the table. > VS2010 and VS2013 source builds would become separate build-from-source > ecosystems for extensions, using sdists on PyPI as the default distribution > mechanism) > A lot of Windows users need binaries -- it would be really nice if we could make that easy. But the key thing is that people who do a "pip install something" should not get an incompatible binary. And the way to do that is to make sure the wheel naming scheme used by binary builds in the wild doesn't match the naming scheme of python.org builds if they are not compatible with them. As it sounds like Stackless is going to do this soon, it would be nice to have some consensus as to what the standard should be. By the way, I'm still not sure if it should go in the ABI tag or the Platform tag. From the examples in the PEP, it doesn't look like it should be the ABI tag, and in the case of the OS-X builds, it's in the platform tag. Also, on my OS-X box, with a somewhat hacked python.org build, I get: pyGnome-alpha-cp27-none-macosx_10_6_i386.whl when I build a binary wheel. Note that the "abi" tag is "none" -- not sure why that is, this is clearly cpython2.7.* -- shouldn't that be cp27 for the ABI tag? 
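Here's a rough sketch of where I *think* those pieces come from, guessing from PEP 425 and poking at sysconfig -- the real computation lives in the wheel/pip code, so take the details with a grain of salt:

import sys
import sysconfig
from distutils.util import get_platform

# Python tag: implementation prefix plus version, e.g. "cp27" for CPython 2.7
python_tag = "cp%d%d" % sys.version_info[:2]

# ABI tag: apparently derived from the SOABI config var; a stock 2.7 build
# reports None here, which may be why the tag falls back to "none"
soabi = sysconfig.get_config_var("SOABI")  # e.g. "cpython-34m" on 3.x

# Platform tag: PEP 425 says this is get_platform() with "-" and "."
# replaced by "_", e.g. "macosx-10.6-i386" -> "macosx_10_6_i386"
platform_tag = get_platform().replace("-", "_").replace(".", "_")

print("%s-%s-%s" % (python_tag, soabi or "none", platform_tag))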
The wheel docs are kind of sparse, so I have no idea where that abi tag is supposed to come from -- ah, I found something in PEP 425: """Why is the ABI tag (the second tag) sometimes "none" in the reference implementation? Since Python 2 does not have an easy way to get to the SOABI (the concept comes from newer versions of Python 3) the reference implementation at the time of writing guesses "none". Ideally it would detect "py27(d|m|u)" analogous to newer versions of Python, but in the meantime "none" is a good enough way to say "don't know". """ I don't know what SOABI is, but it sounds like that defines what should be in the abi tag... But the platform tag is: macosx_10_6_i386.whl macosx -- natch' 10_6 -- built for the 10.6 SDK (I'm running 10.7...) i386 -- I've hacked my build to be 32 bit only Does this belong on the distutils list instead? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Fri Nov 22 01:38:22 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Nov 2013 01:38:22 +0100 Subject: [Python-Dev] PEP 454 - tracemalloc - accepted References: Message-ID: <20131122013822.11520a87@fsol> On Fri, 22 Nov 2013 09:17:14 +1000 Nick Coghlan wrote: > > Skipping saving _source under -OO would probably be a good thing, but > otherwise it's a public API with the usual backwards compatibility > guarantees. I think skipping saving _source under -OO should be a bugfix. It's terribly weird and suboptimal to remove docstrings but keep the source code of namedtuple instances in some obscure attributes. Regards Antoine. From ericsnowcurrently at gmail.com Fri Nov 22 01:40:48 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 21 Nov 2013 17:40:48 -0700 Subject: [Python-Dev] PEP 454 - tracemalloc - accepted In-Reply-To: References: Message-ID: On Thu, Nov 21, 2013 at 4:17 PM, Nick Coghlan wrote: > Skipping saving _source under -OO would probably be a good thing, but > otherwise it's a public API with the usual backwards compatibility > guarantees. One alternative might be to make it a property that re-generates the source (you just need self.__class__.__name__ and self._fields). -eric From victor.stinner at gmail.com Fri Nov 22 02:04:04 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 22 Nov 2013 02:04:04 +0100 Subject: [Python-Dev] [Python-checkins] peps: Update list of accepted but not yet implemented or merged peps. In-Reply-To: <3dQcGM6M5pz7M8F@mail.python.org> References: <3dQcGM6M5pz7M8F@mail.python.org> Message-ID: Hum, I don't think the regex module will enter Python 3.4 before this weekend; there is no PEP. For the "Introspection information for builtins", I think PEP 436 has been accepted. The code has been merged, but the PEP status is still draft. Victor 2013/11/22 barry.warsaw : > http://hg.python.org/peps/rev/ea82fe7aa1ff > changeset: 5311:ea82fe7aa1ff > user: Barry Warsaw > date: Thu Nov 21 18:21:06 2013 -0500 > summary: > Update list of accepted but not yet implemented or merged peps. 
> > files: > pep-0429.txt | 14 +++++++++----- > 1 files changed, 9 insertions(+), 5 deletions(-) > > > diff --git a/pep-0429.txt b/pep-0429.txt > --- a/pep-0429.txt > +++ b/pep-0429.txt > @@ -74,6 +74,15 @@ > * PEP 446, make newly created file descriptors not inherited by default > * PEP 450, basic statistics module for the standard library > > +Accepted but not yet implemented/merged: > + > +* PEP 428, the pathlib module -- object-oriented filesystem paths > +* PEP 451, a ModuleSpec Type for the Import System > +* PEP 453, pip bootstrapping/bundling with CPython > +* PEP 454, the tracemalloc module for tracing Python memory allocations > +* PEP 3154, Pickle protocol revision 4 > +* PEP 3156, improved asynchronous IO support > + > Other final large-scale changes: > > * None so far > @@ -85,12 +94,7 @@ > * PEP 441, improved Python zip application support > * PEP 447, support for __locallookup__ metaclass method > * PEP 448, additional unpacking generalizations > -* PEP 451, making custom import hooks easier to implement correctly > -* PEP 453, pip bootstrapping/bundling with CPython > -* PEP 454, PEP 445 based malloc tracing support > * PEP 455, key transforming dictionary > -* PEP 3154, Pickle protocol revision 4 > -* PEP 3156, improved asynchronous IO support > > Other proposed large-scale changes: > > > -- > Repository URL: http://hg.python.org/peps > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > https://mail.python.org/mailman/listinfo/python-checkins > From Steve.Dower at microsoft.com Fri Nov 22 01:58:41 2013 From: Steve.Dower at microsoft.com (Steve Dower) Date: Fri, 22 Nov 2013 00:58:41 +0000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> Message-ID: <8d63cc3435f94110aca925330245321e@BLUPR03MB293.namprd03.prod.outlook.com> Nick Coghlan wrote: > For 2.7.7, I think some combination of the two following ideas would be worth > pursuing: > - a C runtime independent API flag (set by default on Windows when building with > a compiler other than VS2008). This would largely be a backport of some of the > stable ABI work from Python 3. > - getting Windows closer to the current Mac OS X situation by ensuring that the > C runtime used directly affects the ABI flags and shared library names. PyPI > would apply the Mac OS X guideline where extensions are expected to be > compatible with the python.org binaries. I don't really think either of these are necessary. With some changes to Python's headers and some extra exports, it should be possible to future-proof Python 2.7.7 against any new compilers, at least on Windows. What I have in mind is basically detecting the MSVC version in the headers (there are preprocessor variables for this) and, if it isn't VC9, substituting a different function for those that require FILE*. This function/macro could call _get_osfhandle() and pass it to an API (built into python27.dll) that calls _open_osfhandle() and forwards it to the usual API. This should let any compiler be used for building extensions or hosting python27.dll without affecting existing code or requiring changes to the packages. 
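The round-trip itself can be sketched from Python, since the msvcrt module already exposes the two CRT calls involved (illustration only -- the actual fix would be macros in Python's headers plus a small helper exported from python27.dll, not Python code):

import os
import msvcrt

# Any readable file will do; the name here is arbitrary.
fd = os.open("example.txt", os.O_RDONLY)

# "Extension" side: a CRT file descriptor is private to the C runtime
# that created it, but the underlying OS handle is valid process-wide.
handle = msvcrt.get_osfhandle(fd)

# "python27.dll" side: re-wrap the same OS handle in the receiving CRT,
# yielding a descriptor (and from there a FILE*) that runtime can use.
fd2 = msvcrt.open_osfhandle(handle, os.O_RDONLY)

print(os.read(fd2, 16))

Inside a single interpreter both calls naturally hit the same CRT, so this only shows the shape of the hand-off, not the cross-runtime case.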
> This would be the biggest change pushed through under the "make builds work" > policy for the extended 2.7 lifecycle, but Microsoft's aggressive approach to > deprecating old compilers and C runtimes means I think we don't have much > choice. Ultimately, compilers are probably going to be deprecated more quickly now that we're on a faster release cadence, which makes it more important that Python 2.7 is prepared for an unknown future. > In the near term, if Stackless build to a different DLL name under VS2010 and > make it clear to their users that extension compatibility issues are possible > (or even likely) if they aren't rebuilt from source, then I think that would be > compatible with the above proposal for a way forward. > Then we'd just need some volunteers to write and implement a PEP or two :) I'm happy to work on a PEP and changes for what I described above, if there's enough interest? I can also update distutils to detect and build with any available compiler, though this may be more of a feature than we'd want for 2.7 at this point. Cheers, Steve > (Note, similar to the Mac OS X situation, I think we should do this without > hosting any new interpreter variants on python.org - VS2010 and VS2013 source > builds would become separate build-from-source ecosystems for extensions, using > sdists on PyPI as the default distribution mechanism) From barry at python.org Fri Nov 22 03:04:14 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 21 Nov 2013 21:04:14 -0500 Subject: [Python-Dev] [Python-checkins] peps: Update list of accepted but not yet implemented or merged peps. In-Reply-To: References: <3dQcGM6M5pz7M8F@mail.python.org> Message-ID: <20131121210414.3232e357@limelight.wooz.org> On Nov 22, 2013, at 02:04 AM, Victor Stinner wrote: >Hum, I don't think that regex module will enter Python 3.4 before this >week-end, there is no PEP. Okay thanks, I'll remove this from PEP 429. >For the "Introspection information for builtins", I think the PEP 436 >has been accepted. The code has been merged, but the PEP status is >still draft. This is an odd one. This is the only relevant message I could find. Guido only gives his approval to merge Argument Clinic in early: https://mail.python.org/pipermail/python-dev/2013-October/129591.html But this is not acceptance of PEP 436, which has still not gotten formal approval afaict. We'll need that in order to change the status on the PEP, with a Resolution header. Larry, is the PEP ready to be pronounced upon? -Barry From ncoghlan at gmail.com Fri Nov 22 10:01:03 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 22 Nov 2013 19:01:03 +1000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <8d63cc3435f94110aca925330245321e@BLUPR03MB293.namprd03.prod.outlook.com> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> <8d63cc3435f94110aca925330245321e@BLUPR03MB293.namprd03.prod.outlook.com> Message-ID: On 22 Nov 2013 10:58, "Steve Dower" wrote: > > Nick Coghlan wrote: > > For 2.7.7, I think some combination of the two following ideas would be worth > > pursuing: > > - a C runtime independent API flag (set by default on Windows when building with > > a compiler other than VS2008). This would largely be a backport of some of the > > stable ABI work from Python 3. 
> > - getting Windows closer to the current Mac OS X situation by ensuring that the > > C runtime used directly affects the ABI flags and shared library names. PyPI > > would apply the Mac OS X guideline where extensions are expected to be > > compatible with the python.org binaries. > > I don't really think either of these are necessary. With some changes to Python's headers and some extra exports, it should be possible to future-proof Python 2.7.7 against any new compilers, at least on Windows. > > What I have in mind is basically detecting the MSVC version in the headers (there are preprocessor variables for this) and, if it isn't VC9, substituting a different function for those that require FILE*. This function/macro could call _get_osfhandle() and pass it to an API (built into python27.dll) that calls _open_osfhandle() and forwards it to the usual API. > > This should let any compiler be used for building extensions or hosting python27.dll without affecting existing code or requiring changes to the packages. > > > This would be the biggest change pushed through under the "make builds work" > > policy for the extended 2.7 lifecycle, but Microsoft's aggressive approach to > > deprecating old compilers and C runtimes means I think we don't have much > > choice. > > Ultimately, compilers are probably going to be deprecated more quickly now that we're on a faster release cadence, which makes it more important that Python 2.7 is prepared for an unknown future. > > > In the near term, if Stackless build to a different DLL name under VS2010 and > > make it clear to their users that extension compatibility issues are possible > > (or even likely) if they aren't rebuilt from source, then I think that would be > > compatible with the above proposal for a way forward. > > Then we'd just need some volunteers to write and implement a PEP or two :) > > I'm happy to work on a PEP and changes for what I described above, if there's enough interest? I can also update distutils to detect and build with any available compiler, though this may be more of a feature than we'd want for 2.7 at this point. That's part of what a PEP can help us decide, though, so if you're willing to put one together, that would be great :) Cheers, Nick. > > Cheers, > Steve > > > (Note, similar to the Mac OS X situation, I think we should do this without > > hosting any new interpreter variants on python.org - VS2010 and VS2013 source > > builds would become separate build-from-source ecosystems for extensions, using > > sdists on PyPI as the default distribution mechanism) -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.m.tew at gmail.com Fri Nov 22 10:00:54 2013 From: richard.m.tew at gmail.com (Richard Tew) Date: Fri, 22 Nov 2013 22:00:54 +1300 Subject: [Python-Dev] PEP 0404 and VS 2010 Message-ID: Hi, I've been following this thread and the specific issue of the strong negative reaction to the name "Stackless Python 2.8", and I'm a bit bothered by the whole experience. Was there any concern about the name "Stackless Python 0.x" when Christian released his original versions? More importantly, has it caused any real problems over the years? https://mail.python.org/pipermail/python-announce-list/1999-July/000117.html Was there any concern about the dozens of "Stackless Python 2.x" and "Stackless Python 3.x" versions that I have released over the years being a cause for confusion? Or more importantly, any actual problems experienced? 
Yet, suddenly, the chance that we may release a "Stackless Python 2.8", is a concern. https://mail.python.org/pipermail/python-dev/2013-November/130429.html https://mail.python.org/pipermail/python-dev/2013-November/130466.html https://mail.python.org/pipermail/python-dev/2013-November/130467.html https://mail.python.org/pipermail/python-dev/2013-November/130495.html We've been releasing a variant of Python with related version numbers for over 14 years, under the name "Stackless Python x.x". Now because of a different version number, not only is there strong opposition to us continuing to use this name, but people go so far as to talk about trademarks as a way to prevent us using "Python" in there. https://mail.python.org/pipermail/python-dev/2013-November/130435.html We're not talking about releasing a Python 2.8 against anyone's wishes here. Or in _this specific case_, doing anything other than naming something in a way we've been naming it for over a decade. Yet it's reached the point where people are alluding to ways to forcibly stop us. That there are people who would consider using the trademark to force us to change the name we've been using without harm for 14 years, worries me. It's one thing to perhaps use it to stop someone scamming Python users, and another to suggest using it as a weapon against us for this? Really? Cheers, Richard. From martin at v.loewis.de Fri Nov 22 12:21:45 2013 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 22 Nov 2013 12:21:45 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <8d63cc3435f94110aca925330245321e@BLUPR03MB293.namprd03.prod.outlook.com> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> <8d63cc3435f94110aca925330245321e@BLUPR03MB293.namprd03.prod.outlook.com> Message-ID: <528F3E49.9030800@v.loewis.de> Am 22.11.13 01:58, schrieb Steve Dower: > I'm happy to work on a PEP and changes for what I described above, if > there's enough interest? I can also update distutils to detect and > build with any available compiler, though this may be more of a > feature than we'd want for 2.7 at this point. I don't think a PEP is necessary - Guido essentially pre-approved changes needed to make Python 2.7 work with newer tools and operating systems. A patch for this would be appreciated - perhaps you would want to put it into your sandbox on hg.python.org. Regards, Martin From ncoghlan at gmail.com Fri Nov 22 12:32:48 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 22 Nov 2013 21:32:48 +1000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: Message-ID: On 22 November 2013 19:00, Richard Tew wrote: > We're not talking about releasing a Python 2.8 against anyone's wishes > here. Or in _this specific case_, doing anything other than naming > something in a way we've been naming it for over a decade. Yet it's > reached the point where people are alluding to ways to forcibly stop > us. Every previous case was a simple matter of "this is the Stackless variant of Python X.Y". The difference in this case is that there is *no* Python 2.8 (and a PEP stating explicitly that python-dev have no intention of creating one) to create a variant, so this would have meant the Stackless community taking the novel step of creating a new feature release of Python 2.x, and hence taking up the burden of defining the feature set of that release. 
A few folks overreacted in their concern about the community confusion such a move would inevitably create - *anything* called "Python 2.8" is going to give the impression that we've changed our mind about 2.7 being the last feature release in the 2.x series, even if it has the word Stackless in front of it. We can almost certainly resolve the Windows ABI issue without the Stackless community needing to take such a drastic step, though, so I for one greatly appreciate Christian raising the question here and giving us the opportunity to help out. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rosuav at gmail.com Fri Nov 22 12:56:20 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Nov 2013 22:56:20 +1100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: Message-ID: On Fri, Nov 22, 2013 at 10:32 PM, Nick Coghlan wrote: > A few folks overreacted in their concern about the community confusion > such a move would inevitably create - *anything* called "Python 2.8" > is going to give the impression that we've changed our mind about 2.7 > being the last feature release in the 2.x series, even if it has the > word Stackless in front of it. From what I understand of the discussion, the problem is only because it's the specific value "2.8", which could plausibly be a Python version number. Calling it "Stackless Python 10" wouldn't cause confusion (don't remember who suggested that), as it's clearly not part of the regular Python versioning system - you could say "Stackless Python 10 is code compatible with Python 2.7" and there'd be no confusion. But if, say, Red Hat release something called "Python 2.4.6-rhel", I would expect it to be entirely compatible with a python.org Python 2.4 release, possibly with some additional bugfixes; it says "2.4" in it, so I'm not going to go dig through its docs to figure out what my code will run on. I certainly wouldn't expect it to be "2.2 built with a new compiler", or "2.7 with additional libraries added", or anything. It's all about how it reads at first glance. In my personal opinion, I would very much like to see the 2.x line *not* be supported indefinitely. Why are so many people still using it? Because the impetus to shift is weaker than the inertia to stay. How will that be changed? By strengthening the need to move on. If anyone pledges to support a 2.x variant for the next couple of decades, then python-list will have to deal with people not knowing (and not saying) whether they're on 2.x or 3.x for probably another 30 years or more. I'd rather be able to stop warning people about input() and non-Unicode strings and from __future__ import division. ChrisA From stefan at bytereef.org Fri Nov 22 13:03:18 2013 From: stefan at bytereef.org (Stefan Krah) Date: Fri, 22 Nov 2013 13:03:18 +0100 Subject: [Python-Dev] PEP 454 - tracemalloc - accepted In-Reply-To: References: Message-ID: <20131122120318.GA12967@sleipnir.bytereef.org> Victor Stinner wrote: > 2013/11/21 Nick Coghlan : > > Huzzah! Thanks to you both for getting this ready for inclusion :) > > I now hope that someone will use it :-) Congratulations! I hope pyfailmalloc can go into 3.4, too. Stefan Krah From techtonik at gmail.com Fri Nov 22 12:21:49 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 22 Nov 2013 14:21:49 +0300 Subject: [Python-Dev] Assign(expr* targets, expr value) - why targetS? 
In-Reply-To: References: Message-ID: On Fri, Nov 15, 2013 at 5:43 PM, Benjamin Peterson wrote: > 2013/11/15 anatoly techtonik : >> On Tue, Nov 12, 2013 at 5:08 PM, Benjamin Peterson wrote: >>> 2013/11/12 anatoly techtonik : >>>> On Sun, Nov 10, 2013 at 8:34 AM, Benjamin Peterson wrote: >>>>> 2013/11/10 anatoly techtonik : >>>>>> http://hg.python.org/cpython/file/1ee45eb6aab9/Parser/Python.asdl >>>>>> >>>>>> In Assign(expr* targets, expr value), why the first argument is a list? >>>>> >>>>> x = y = 42 >>>> >>>> Thanks. >>>> >>>> Speaking of this ASDL. `expr* targets` means that multiple entities of >>>> `expr` under the name 'targets' can be passed to Assign statement. >>>> Assign uses them as left value. But `expr` definition contains things >>>> that can not be used as left side assignment targets: >>>> >>>> expr = BoolOp(boolop op, expr* values) >>>> | BinOp(expr left, operator op, expr right) >>>> ... >>>> | Str(string s) -- need to specify raw, unicode, etc? >>>> | Bytes(bytes s) >>>> | NameConstant(singleton value) >>>> | Ellipsis >>>> >>>> -- the following expression can appear in assignment context >>>> | Attribute(expr value, identifier attr, expr_context ctx) >>>> | Subscript(expr value, slice slice, expr_context ctx) >>>> | Starred(expr value, expr_context ctx) >>>> | Name(identifier id, expr_context ctx) >>>> | List(expr* elts, expr_context ctx) >>>> | Tuple(expr* elts, expr_context ctx) >>>> >>>> If I understand correctly, this is compiled into C struct definitions >>>> (Python-ast.c), and there is a code to traverse the structure, but >>>> where is code that validates that the structure is correct? Is it done >>>> on the first level - text file parsing, before ASDL is built? If so, >>>> then what is the role of this ADSL exactly that the first step is >>>> unable to solve? >>> >>> Only valid expression targets are allowed during AST construction. See >>> set_expr_context in ast.c. >> >> Oh my. Now there is also CST in addition to AST. This stuff - >> http://docs.python.org/devguide/ - badly needs diagrams about data >> transformation toolchain from Python source code to machine >> execution instructions. I'd like some pretty stuff, but raw blogdiag >> hack will do the job http://blockdiag.com/en/blockdiag/index.html >> >> There is no set_expr_context in my copy of CPython code, which >> seems to be some alpha of Python 3.4 > > It's actually called set_context. Ok. So what is the process? SOURCE --> TOKEN STREAM --> SENTENCE STREAM --> CST --> --> AST --> BYTECODE Is that right? >>>> Is it possible to fix ADSL to move `expr` that are allowed in Assign >>>> into `expr` subset? What effect will it achieve? I mean - will ADSL >>>> compiler complain about wrong stuff on the left side, or it will still >>>> be a role of some other component. Which one? >>> >>> I'm not sure what you mean by an `expr` subset. >> >> Transform this: >> >> expr = BoolOp(boolop op, expr* values) >> | BinOp(expr left, operator op, expr right) >> ... >> | Str(string s) -- need to specify raw, unicode, etc? 
>> | Bytes(bytes s) >> | NameConstant(singleton value) >> | Ellipsis >> >> -- the following expression can appear in assignment context >> | Attribute(expr value, identifier attr, expr_context ctx) >> | Subscript(expr value, slice slice, expr_context ctx) >> | Starred(expr value, expr_context ctx) >> | Name(identifier id, expr_context ctx) >> | List(expr* elts, expr_context ctx) >> | Tuple(expr* elts, expr_context ctx) >> >> to this: >> >> expr = BoolOp(boolop op, expr* values) >> | BinOp(expr left, operator op, expr right) >> ... >> | Str(string s) -- need to specify raw, unicode, etc? >> | Bytes(bytes s) >> | NameConstant(singleton value) >> | Ellipsis >> >> -- the following expression can appear in assignment context >> | expr_asgn >> >> expr_asgn = >> Attribute(expr value, identifier attr, expr_context ctx) >> | Subscript(expr value, slice slice, expr_context ctx) >> | Starred(expr value, expr_context ctx) >> | Name(identifier id, expr_context ctx) >> | List(expr* elts, expr_context ctx) >> | Tuple(expr* elts, expr_context ctx) > > I doubt ASDL will let you do that. asdl.py is plain broken - wrong number of arguments passed to output function asdl_c.py worked ok with fixed ASDL and generated - diff attached. I don't know what to check further - on Windows without Visual Studio. -- anatoly t. -------------- next part -------------- --- C:/__py/_pydotorg/devinabox/cpython/Parser/Python-ast.c Fri Nov 22 14:19:30 2013 +++ C:/__py/_pydotorg/devinabox/cpython/Parser/Python-ast.c-patched Fri Nov 22 14:19:11 2013 @@ -161,10 +161,6 @@ static PyTypeObject *Break_type; static PyTypeObject *Continue_type; static PyTypeObject *expr_type; -static char *expr_attributes[] = { - "lineno", - "col_offset", -}; static PyObject* ast2obj_expr(void*); static PyTypeObject *BoolOp_type; _Py_IDENTIFIER(values); @@ -276,6 +272,13 @@ "value", }; static PyTypeObject *Ellipsis_type; +static PyTypeObject *expr_asgn_type; +static PyTypeObject *expr_asgn_type; +static char *expr_asgn_attributes[] = { + "lineno", + "col_offset", +}; +static PyObject* ast2obj_expr_asgn(void*); static PyTypeObject *Attribute_type; _Py_IDENTIFIER(attr); _Py_IDENTIFIER(ctx); @@ -853,7 +856,7 @@ if (!Continue_type) return 0; expr_type = make_type("expr", &AST_type, NULL, 0); if (!expr_type) return 0; - if (!add_attributes(expr_type, expr_attributes, 2)) return 0; + if (!add_attributes(expr_type, NULL, 0)) return 0; BoolOp_type = make_type("BoolOp", expr_type, BoolOp_fields, 2); if (!BoolOp_type) return 0; BinOp_type = make_type("BinOp", expr_type, BinOp_fields, 3); @@ -896,17 +899,24 @@ if (!NameConstant_type) return 0; Ellipsis_type = make_type("Ellipsis", expr_type, NULL, 0); if (!Ellipsis_type) return 0; - Attribute_type = make_type("Attribute", expr_type, Attribute_fields, 3); + expr_asgn_type = make_type("expr_asgn", expr_type, NULL, 0); + if (!expr_asgn_type) return 0; + expr_asgn_type = make_type("expr_asgn", &AST_type, NULL, 0); + if (!expr_asgn_type) return 0; + if (!add_attributes(expr_asgn_type, expr_asgn_attributes, 2)) return 0; + Attribute_type = make_type("Attribute", expr_asgn_type, Attribute_fields, + 3); if (!Attribute_type) return 0; - Subscript_type = make_type("Subscript", expr_type, Subscript_fields, 3); + Subscript_type = make_type("Subscript", expr_asgn_type, Subscript_fields, + 3); if (!Subscript_type) return 0; - Starred_type = make_type("Starred", expr_type, Starred_fields, 2); + Starred_type = make_type("Starred", expr_asgn_type, Starred_fields, 2); if (!Starred_type) return 0; - Name_type = make_type("Name", 
expr_type, Name_fields, 2); + Name_type = make_type("Name", expr_asgn_type, Name_fields, 2); if (!Name_type) return 0; - List_type = make_type("List", expr_type, List_fields, 2); + List_type = make_type("List", expr_asgn_type, List_fields, 2); if (!List_type) return 0; - Tuple_type = make_type("Tuple", expr_type, Tuple_fields, 2); + Tuple_type = make_type("Tuple", expr_asgn_type, Tuple_fields, 2); if (!Tuple_type) return 0; expr_context_type = make_type("expr_context", &AST_type, NULL, 0); if (!expr_context_type) return 0; @@ -1101,6 +1111,7 @@ static int obj2ast_mod(PyObject* obj, mod_ty* out, PyArena* arena); static int obj2ast_stmt(PyObject* obj, stmt_ty* out, PyArena* arena); static int obj2ast_expr(PyObject* obj, expr_ty* out, PyArena* arena); +static int obj2ast_expr_asgn(PyObject* obj, expr_asgn_ty* out, PyArena* arena); static int obj2ast_expr_context(PyObject* obj, expr_context_ty* out, PyArena* arena); static int obj2ast_slice(PyObject* obj, slice_ty* out, PyArena* arena); @@ -1568,8 +1579,7 @@ } expr_ty -BoolOp(boolop_ty op, asdl_seq * values, int lineno, int col_offset, PyArena - *arena) +BoolOp(boolop_ty op, asdl_seq * values, PyArena *arena) { expr_ty p; if (!op) { @@ -1583,14 +1593,11 @@ p->kind = BoolOp_kind; p->v.BoolOp.op = op; p->v.BoolOp.values = values; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -BinOp(expr_ty left, operator_ty op, expr_ty right, int lineno, int col_offset, - PyArena *arena) +BinOp(expr_ty left, operator_ty op, expr_ty right, PyArena *arena) { expr_ty p; if (!left) { @@ -1615,14 +1622,11 @@ p->v.BinOp.left = left; p->v.BinOp.op = op; p->v.BinOp.right = right; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -UnaryOp(unaryop_ty op, expr_ty operand, int lineno, int col_offset, PyArena - *arena) +UnaryOp(unaryop_ty op, expr_ty operand, PyArena *arena) { expr_ty p; if (!op) { @@ -1641,14 +1645,11 @@ p->kind = UnaryOp_kind; p->v.UnaryOp.op = op; p->v.UnaryOp.operand = operand; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -Lambda(arguments_ty args, expr_ty body, int lineno, int col_offset, PyArena - *arena) +Lambda(arguments_ty args, expr_ty body, PyArena *arena) { expr_ty p; if (!args) { @@ -1667,14 +1668,11 @@ p->kind = Lambda_kind; p->v.Lambda.args = args; p->v.Lambda.body = body; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -IfExp(expr_ty test, expr_ty body, expr_ty orelse, int lineno, int col_offset, - PyArena *arena) +IfExp(expr_ty test, expr_ty body, expr_ty orelse, PyArena *arena) { expr_ty p; if (!test) { @@ -1699,14 +1697,11 @@ p->v.IfExp.test = test; p->v.IfExp.body = body; p->v.IfExp.orelse = orelse; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -Dict(asdl_seq * keys, asdl_seq * values, int lineno, int col_offset, PyArena - *arena) +Dict(asdl_seq * keys, asdl_seq * values, PyArena *arena) { expr_ty p; p = (expr_ty)PyArena_Malloc(arena, sizeof(*p)); @@ -1715,13 +1710,11 @@ p->kind = Dict_kind; p->v.Dict.keys = keys; p->v.Dict.values = values; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -Set(asdl_seq * elts, int lineno, int col_offset, PyArena *arena) +Set(asdl_seq * elts, PyArena *arena) { expr_ty p; p = (expr_ty)PyArena_Malloc(arena, sizeof(*p)); @@ -1729,14 +1722,11 @@ return NULL; p->kind = Set_kind; p->v.Set.elts = elts; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -ListComp(expr_ty elt, asdl_seq * generators, int lineno, int col_offset, - PyArena *arena) 
+ListComp(expr_ty elt, asdl_seq * generators, PyArena *arena) { expr_ty p; if (!elt) { @@ -1750,14 +1740,11 @@ p->kind = ListComp_kind; p->v.ListComp.elt = elt; p->v.ListComp.generators = generators; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -SetComp(expr_ty elt, asdl_seq * generators, int lineno, int col_offset, PyArena - *arena) +SetComp(expr_ty elt, asdl_seq * generators, PyArena *arena) { expr_ty p; if (!elt) { @@ -1771,14 +1758,11 @@ p->kind = SetComp_kind; p->v.SetComp.elt = elt; p->v.SetComp.generators = generators; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -DictComp(expr_ty key, expr_ty value, asdl_seq * generators, int lineno, int - col_offset, PyArena *arena) +DictComp(expr_ty key, expr_ty value, asdl_seq * generators, PyArena *arena) { expr_ty p; if (!key) { @@ -1798,14 +1782,11 @@ p->v.DictComp.key = key; p->v.DictComp.value = value; p->v.DictComp.generators = generators; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -GeneratorExp(expr_ty elt, asdl_seq * generators, int lineno, int col_offset, - PyArena *arena) +GeneratorExp(expr_ty elt, asdl_seq * generators, PyArena *arena) { expr_ty p; if (!elt) { @@ -1819,13 +1800,11 @@ p->kind = GeneratorExp_kind; p->v.GeneratorExp.elt = elt; p->v.GeneratorExp.generators = generators; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -Yield(expr_ty value, int lineno, int col_offset, PyArena *arena) +Yield(expr_ty value, PyArena *arena) { expr_ty p; p = (expr_ty)PyArena_Malloc(arena, sizeof(*p)); @@ -1833,13 +1812,11 @@ return NULL; p->kind = Yield_kind; p->v.Yield.value = value; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -YieldFrom(expr_ty value, int lineno, int col_offset, PyArena *arena) +YieldFrom(expr_ty value, PyArena *arena) { expr_ty p; if (!value) { @@ -1852,14 +1829,12 @@ return NULL; p->kind = YieldFrom_kind; p->v.YieldFrom.value = value; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -Compare(expr_ty left, asdl_int_seq * ops, asdl_seq * comparators, int lineno, - int col_offset, PyArena *arena) +Compare(expr_ty left, asdl_int_seq * ops, asdl_seq * comparators, PyArena + *arena) { expr_ty p; if (!left) { @@ -1874,14 +1849,12 @@ p->v.Compare.left = left; p->v.Compare.ops = ops; p->v.Compare.comparators = comparators; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty Call(expr_ty func, asdl_seq * args, asdl_seq * keywords, expr_ty starargs, - expr_ty kwargs, int lineno, int col_offset, PyArena *arena) + expr_ty kwargs, PyArena *arena) { expr_ty p; if (!func) { @@ -1898,13 +1871,11 @@ p->v.Call.keywords = keywords; p->v.Call.starargs = starargs; p->v.Call.kwargs = kwargs; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -Num(object n, int lineno, int col_offset, PyArena *arena) +Num(object n, PyArena *arena) { expr_ty p; if (!n) { @@ -1917,13 +1888,11 @@ return NULL; p->kind = Num_kind; p->v.Num.n = n; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -Str(string s, int lineno, int col_offset, PyArena *arena) +Str(string s, PyArena *arena) { expr_ty p; if (!s) { @@ -1936,13 +1905,11 @@ return NULL; p->kind = Str_kind; p->v.Str.s = s; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -Bytes(bytes s, int lineno, int col_offset, PyArena *arena) +Bytes(bytes s, PyArena *arena) { expr_ty p; if (!s) { @@ -1955,13 +1922,11 @@ return NULL; p->kind = Bytes_kind; p->v.Bytes.s = s; - p->lineno = 
lineno; - p->col_offset = col_offset; return p; } expr_ty -NameConstant(singleton value, int lineno, int col_offset, PyArena *arena) +NameConstant(singleton value, PyArena *arena) { expr_ty p; if (!value) { @@ -1974,29 +1939,36 @@ return NULL; p->kind = NameConstant_kind; p->v.NameConstant.value = value; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty -Ellipsis(int lineno, int col_offset, PyArena *arena) +Ellipsis(PyArena *arena) { expr_ty p; p = (expr_ty)PyArena_Malloc(arena, sizeof(*p)); if (!p) return NULL; p->kind = Ellipsis_kind; - p->lineno = lineno; - p->col_offset = col_offset; return p; } expr_ty +expr_asgn(PyArena *arena) +{ + expr_ty p; + p = (expr_ty)PyArena_Malloc(arena, sizeof(*p)); + if (!p) + return NULL; + p->kind = expr_asgn_kind; + return p; +} + +expr_asgn_ty Attribute(expr_ty value, identifier attr, expr_context_ty ctx, int lineno, int col_offset, PyArena *arena) { - expr_ty p; + expr_asgn_ty p; if (!value) { PyErr_SetString(PyExc_ValueError, "field value is required for Attribute"); @@ -2012,7 +1984,7 @@ "field ctx is required for Attribute"); return NULL; } - p = (expr_ty)PyArena_Malloc(arena, sizeof(*p)); + p = (expr_asgn_ty)PyArena_Malloc(arena, sizeof(*p)); if (!p) return NULL; p->kind = Attribute_kind; @@ -2024,11 +1996,11 @@ return p; } -expr_ty +expr_asgn_ty Subscript(expr_ty value, slice_ty slice, expr_context_ty ctx, int lineno, int col_offset, PyArena *arena) { - expr_ty p; + expr_asgn_ty p; if (!value) { PyErr_SetString(PyExc_ValueError, "field value is required for Subscript"); @@ -2044,7 +2016,7 @@ "field ctx is required for Subscript"); return NULL; } - p = (expr_ty)PyArena_Malloc(arena, sizeof(*p)); + p = (expr_asgn_ty)PyArena_Malloc(arena, sizeof(*p)); if (!p) return NULL; p->kind = Subscript_kind; @@ -2056,11 +2028,11 @@ return p; } -expr_ty +expr_asgn_ty Starred(expr_ty value, expr_context_ty ctx, int lineno, int col_offset, PyArena *arena) { - expr_ty p; + expr_asgn_ty p; if (!value) { PyErr_SetString(PyExc_ValueError, "field value is required for Starred"); @@ -2071,7 +2043,7 @@ "field ctx is required for Starred"); return NULL; } - p = (expr_ty)PyArena_Malloc(arena, sizeof(*p)); + p = (expr_asgn_ty)PyArena_Malloc(arena, sizeof(*p)); if (!p) return NULL; p->kind = Starred_kind; @@ -2082,11 +2054,11 @@ return p; } -expr_ty +expr_asgn_ty Name(identifier id, expr_context_ty ctx, int lineno, int col_offset, PyArena *arena) { - expr_ty p; + expr_asgn_ty p; if (!id) { PyErr_SetString(PyExc_ValueError, "field id is required for Name"); @@ -2097,7 +2069,7 @@ "field ctx is required for Name"); return NULL; } - p = (expr_ty)PyArena_Malloc(arena, sizeof(*p)); + p = (expr_asgn_ty)PyArena_Malloc(arena, sizeof(*p)); if (!p) return NULL; p->kind = Name_kind; @@ -2108,17 +2080,17 @@ return p; } -expr_ty +expr_asgn_ty List(asdl_seq * elts, expr_context_ty ctx, int lineno, int col_offset, PyArena *arena) { - expr_ty p; + expr_asgn_ty p; if (!ctx) { PyErr_SetString(PyExc_ValueError, "field ctx is required for List"); return NULL; } - p = (expr_ty)PyArena_Malloc(arena, sizeof(*p)); + p = (expr_asgn_ty)PyArena_Malloc(arena, sizeof(*p)); if (!p) return NULL; p->kind = List_kind; @@ -2129,17 +2101,17 @@ return p; } -expr_ty +expr_asgn_ty Tuple(asdl_seq * elts, expr_context_ty ctx, int lineno, int col_offset, PyArena *arena) { - expr_ty p; + expr_asgn_ty p; if (!ctx) { PyErr_SetString(PyExc_ValueError, "field ctx is required for Tuple"); return NULL; } - p = (expr_ty)PyArena_Malloc(arena, sizeof(*p)); + p = (expr_asgn_ty)PyArena_Malloc(arena, 
sizeof(*p)); if (!p) return NULL; p->kind = Tuple_kind; @@ -2479,7 +2451,7 @@ case Assign_kind: result = PyType_GenericNew(Assign_type, NULL, NULL); if (!result) goto failed; - value = ast2obj_list(o->v.Assign.targets, ast2obj_expr); + value = ast2obj_list(o->v.Assign.targets, ast2obj_expr_asgn); if (!value) goto failed; if (_PyObject_SetAttrId(result, &PyId_targets, value) == -1) goto failed; @@ -3010,6 +2982,29 @@ result = PyType_GenericNew(Ellipsis_type, NULL, NULL); if (!result) goto failed; break; + case expr_asgn_kind: + result = PyType_GenericNew(expr_asgn_type, NULL, NULL); + if (!result) goto failed; + break; + } + return result; +failed: + Py_XDECREF(value); + Py_XDECREF(result); + return NULL; +} + +PyObject* +ast2obj_expr_asgn(void* _o) +{ + expr_asgn_ty o = (expr_asgn_ty)_o; + PyObject *result = NULL, *value = NULL; + if (!o) { + Py_INCREF(Py_None); + return Py_None; + } + + switch (o->kind) { case Attribute_kind: result = PyType_GenericNew(Attribute_type, NULL, NULL); if (!result) goto failed; @@ -4082,8 +4077,8 @@ targets = asdl_seq_new(len, arena); if (targets == NULL) goto failed; for (i = 0; i < len; i++) { - expr_ty value; - res = obj2ast_expr(PyList_GET_ITEM(tmp, i), &value, arena); + expr_asgn_ty value; + res = obj2ast_expr_asgn(PyList_GET_ITEM(tmp, i), &value, arena); if (res != 0) goto failed; asdl_seq_SET(targets, i, value); } @@ -4844,35 +4839,11 @@ int isinstance; PyObject *tmp = NULL; - int lineno; - int col_offset; if (obj == Py_None) { *out = NULL; return 0; } - if (_PyObject_HasAttrId(obj, &PyId_lineno)) { - int res; - tmp = _PyObject_GetAttrId(obj, &PyId_lineno); - if (tmp == NULL) goto failed; - res = obj2ast_int(tmp, &lineno, arena); - if (res != 0) goto failed; - Py_CLEAR(tmp); - } else { - PyErr_SetString(PyExc_TypeError, "required field \"lineno\" missing from expr"); - return 1; - } - if (_PyObject_HasAttrId(obj, &PyId_col_offset)) { - int res; - tmp = _PyObject_GetAttrId(obj, &PyId_col_offset); - if (tmp == NULL) goto failed; - res = obj2ast_int(tmp, &col_offset, arena); - if (res != 0) goto failed; - Py_CLEAR(tmp); - } else { - PyErr_SetString(PyExc_TypeError, "required field \"col_offset\" missing from expr"); - return 1; - } isinstance = PyObject_IsInstance(obj, (PyObject*)BoolOp_type); if (isinstance == -1) { return 1; @@ -4916,7 +4887,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"values\" missing from BoolOp"); return 1; } - *out = BoolOp(op, values, lineno, col_offset, arena); + *out = BoolOp(op, values, arena); if (*out == NULL) goto failed; return 0; } @@ -4962,7 +4933,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"right\" missing from BinOp"); return 1; } - *out = BinOp(left, op, right, lineno, col_offset, arena); + *out = BinOp(left, op, right, arena); if (*out == NULL) goto failed; return 0; } @@ -4996,7 +4967,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"operand\" missing from UnaryOp"); return 1; } - *out = UnaryOp(op, operand, lineno, col_offset, arena); + *out = UnaryOp(op, operand, arena); if (*out == NULL) goto failed; return 0; } @@ -5030,7 +5001,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"body\" missing from Lambda"); return 1; } - *out = Lambda(args, body, lineno, col_offset, arena); + *out = Lambda(args, body, arena); if (*out == NULL) goto failed; return 0; } @@ -5076,7 +5047,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"orelse\" missing from IfExp"); return 1; } - *out = IfExp(test, body, orelse, lineno, col_offset, arena); + *out = IfExp(test, body, orelse, arena); if (*out == 
NULL) goto failed; return 0; } @@ -5136,7 +5107,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"values\" missing from Dict"); return 1; } - *out = Dict(keys, values, lineno, col_offset, arena); + *out = Dict(keys, values, arena); if (*out == NULL) goto failed; return 0; } @@ -5171,7 +5142,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"elts\" missing from Set"); return 1; } - *out = Set(elts, lineno, col_offset, arena); + *out = Set(elts, arena); if (*out == NULL) goto failed; return 0; } @@ -5218,7 +5189,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"generators\" missing from ListComp"); return 1; } - *out = ListComp(elt, generators, lineno, col_offset, arena); + *out = ListComp(elt, generators, arena); if (*out == NULL) goto failed; return 0; } @@ -5265,7 +5236,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"generators\" missing from SetComp"); return 1; } - *out = SetComp(elt, generators, lineno, col_offset, arena); + *out = SetComp(elt, generators, arena); if (*out == NULL) goto failed; return 0; } @@ -5324,7 +5295,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"generators\" missing from DictComp"); return 1; } - *out = DictComp(key, value, generators, lineno, col_offset, arena); + *out = DictComp(key, value, generators, arena); if (*out == NULL) goto failed; return 0; } @@ -5371,7 +5342,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"generators\" missing from GeneratorExp"); return 1; } - *out = GeneratorExp(elt, generators, lineno, col_offset, arena); + *out = GeneratorExp(elt, generators, arena); if (*out == NULL) goto failed; return 0; } @@ -5392,7 +5363,7 @@ } else { value = NULL; } - *out = Yield(value, lineno, col_offset, arena); + *out = Yield(value, arena); if (*out == NULL) goto failed; return 0; } @@ -5414,7 +5385,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"value\" missing from YieldFrom"); return 1; } - *out = YieldFrom(value, lineno, col_offset, arena); + *out = YieldFrom(value, arena); if (*out == NULL) goto failed; return 0; } @@ -5486,7 +5457,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"comparators\" missing from Compare"); return 1; } - *out = Compare(left, ops, comparators, lineno, col_offset, arena); + *out = Compare(left, ops, comparators, arena); if (*out == NULL) goto failed; return 0; } @@ -5580,8 +5551,7 @@ } else { kwargs = NULL; } - *out = Call(func, args, keywords, starargs, kwargs, lineno, col_offset, - arena); + *out = Call(func, args, keywords, starargs, kwargs, arena); if (*out == NULL) goto failed; return 0; } @@ -5603,7 +5573,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"n\" missing from Num"); return 1; } - *out = Num(n, lineno, col_offset, arena); + *out = Num(n, arena); if (*out == NULL) goto failed; return 0; } @@ -5625,7 +5595,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"s\" missing from Str"); return 1; } - *out = Str(s, lineno, col_offset, arena); + *out = Str(s, arena); if (*out == NULL) goto failed; return 0; } @@ -5647,7 +5617,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"s\" missing from Bytes"); return 1; } - *out = Bytes(s, lineno, col_offset, arena); + *out = Bytes(s, arena); if (*out == NULL) goto failed; return 0; } @@ -5669,7 +5639,7 @@ PyErr_SetString(PyExc_TypeError, "required field \"value\" missing from NameConstant"); return 1; } - *out = NameConstant(value, lineno, col_offset, arena); + *out = NameConstant(value, arena); if (*out == NULL) goto failed; return 0; } @@ -5679,10 +5649,62 @@ } if (isinstance) { - *out = 
Ellipsis(lineno, col_offset, arena); + *out = Ellipsis(arena); + if (*out == NULL) goto failed; + return 0; + } + isinstance = PyObject_IsInstance(obj, (PyObject*)expr_asgn_type); + if (isinstance == -1) { + return 1; + } + if (isinstance) { + + *out = expr_asgn(arena); if (*out == NULL) goto failed; return 0; } + + PyErr_Format(PyExc_TypeError, "expected some sort of expr, but got %R", obj); + failed: + Py_XDECREF(tmp); + return 1; +} + +int +obj2ast_expr_asgn(PyObject* obj, expr_asgn_ty* out, PyArena* arena) +{ + int isinstance; + + PyObject *tmp = NULL; + int lineno; + int col_offset; + + if (obj == Py_None) { + *out = NULL; + return 0; + } + if (_PyObject_HasAttrId(obj, &PyId_lineno)) { + int res; + tmp = _PyObject_GetAttrId(obj, &PyId_lineno); + if (tmp == NULL) goto failed; + res = obj2ast_int(tmp, &lineno, arena); + if (res != 0) goto failed; + Py_CLEAR(tmp); + } else { + PyErr_SetString(PyExc_TypeError, "required field \"lineno\" missing from expr_asgn"); + return 1; + } + if (_PyObject_HasAttrId(obj, &PyId_col_offset)) { + int res; + tmp = _PyObject_GetAttrId(obj, &PyId_col_offset); + if (tmp == NULL) goto failed; + res = obj2ast_int(tmp, &col_offset, arena); + if (res != 0) goto failed; + Py_CLEAR(tmp); + } else { + PyErr_SetString(PyExc_TypeError, "required field \"col_offset\" missing from expr_asgn"); + return 1; + } isinstance = PyObject_IsInstance(obj, (PyObject*)Attribute_type); if (isinstance == -1) { return 1; @@ -5938,7 +5960,7 @@ return 0; } - PyErr_Format(PyExc_TypeError, "expected some sort of expr, but got %R", obj); + PyErr_Format(PyExc_TypeError, "expected some sort of expr_asgn, but got %R", obj); failed: Py_XDECREF(tmp); return 1; @@ -6916,6 +6938,10 @@ if (PyDict_SetItemString(d, "NameConstant", (PyObject*)NameConstant_type) < 0) return NULL; if (PyDict_SetItemString(d, "Ellipsis", (PyObject*)Ellipsis_type) < 0) + return NULL; + if (PyDict_SetItemString(d, "expr_asgn", (PyObject*)expr_asgn_type) < 0) + return NULL; + if (PyDict_SetItemString(d, "expr_asgn", (PyObject*)expr_asgn_type) < 0) return NULL; if (PyDict_SetItemString(d, "Attribute", (PyObject*)Attribute_type) < 0) return NULL; From benjamin at python.org Fri Nov 22 13:29:47 2013 From: benjamin at python.org (Benjamin Peterson) Date: Fri, 22 Nov 2013 07:29:47 -0500 Subject: [Python-Dev] Assign(expr* targets, expr value) - why targetS? In-Reply-To: References: Message-ID: 2013/11/22 anatoly techtonik : > On Fri, Nov 15, 2013 at 5:43 PM, Benjamin Peterson wrote: >> 2013/11/15 anatoly techtonik : >>> On Tue, Nov 12, 2013 at 5:08 PM, Benjamin Peterson wrote: >>>> 2013/11/12 anatoly techtonik : >>>>> On Sun, Nov 10, 2013 at 8:34 AM, Benjamin Peterson wrote: >>>>>> 2013/11/10 anatoly techtonik : >>>>>>> http://hg.python.org/cpython/file/1ee45eb6aab9/Parser/Python.asdl >>>>>>> >>>>>>> In Assign(expr* targets, expr value), why the first argument is a list? >>>>>> >>>>>> x = y = 42 >>>>> >>>>> Thanks. >>>>> >>>>> Speaking of this ASDL. `expr* targets` means that multiple entities of >>>>> `expr` under the name 'targets' can be passed to Assign statement. >>>>> Assign uses them as left value. But `expr` definition contains things >>>>> that can not be used as left side assignment targets: >>>>> >>>>> expr = BoolOp(boolop op, expr* values) >>>>> | BinOp(expr left, operator op, expr right) >>>>> ... >>>>> | Str(string s) -- need to specify raw, unicode, etc? 
>>>>> | Bytes(bytes s) >>>>> | NameConstant(singleton value) >>>>> | Ellipsis >>>>> >>>>> -- the following expression can appear in assignment context >>>>> | Attribute(expr value, identifier attr, expr_context ctx) >>>>> | Subscript(expr value, slice slice, expr_context ctx) >>>>> | Starred(expr value, expr_context ctx) >>>>> | Name(identifier id, expr_context ctx) >>>>> | List(expr* elts, expr_context ctx) >>>>> | Tuple(expr* elts, expr_context ctx) >>>>> >>>>> If I understand correctly, this is compiled into C struct definitions >>>>> (Python-ast.c), and there is a code to traverse the structure, but >>>>> where is code that validates that the structure is correct? Is it done >>>>> on the first level - text file parsing, before ASDL is built? If so, >>>>> then what is the role of this ADSL exactly that the first step is >>>>> unable to solve? >>>> >>>> Only valid expression targets are allowed during AST construction. See >>>> set_expr_context in ast.c. >>> >>> Oh my. Now there is also CST in addition to AST. This stuff - >>> http://docs.python.org/devguide/ - badly needs diagrams about data >>> transformation toolchain from Python source code to machine >>> execution instructions. I'd like some pretty stuff, but raw blogdiag >>> hack will do the job http://blockdiag.com/en/blockdiag/index.html >>> >>> There is no set_expr_context in my copy of CPython code, which >>> seems to be some alpha of Python 3.4 >> >> It's actually called set_context. > > Ok. So what is the process? > > SOURCE --> TOKEN STREAM --> SENTENCE STREAM --> CST --> > --> AST --> BYTECODE > > Is that right? I don't know what sentence stream is, but otherwise looks right. -- Regards, Benjamin From martin at v.loewis.de Fri Nov 22 13:47:11 2013 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 22 Nov 2013 13:47:11 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: Message-ID: <528F524F.8080807@v.loewis.de> Am 22.11.13 10:00, schrieb Richard Tew: > That there are people who would consider using the trademark to force > us to change the name we've been using without harm for 14 years, > worries me. It's one thing to perhaps use it to stop someone scamming > Python users, and another to suggest using it as a weapon against us > for this? Really? Unfortunately, Christian's original question was unclear. If his plan had been to release "Python 2.8" (as his original question suggested), then he would have faced quite some opposition (and, in a sense, releasing a "python28.dll" is quite close to that). IMO, calling it "Stackless Python 2.8" is fine, and I also believe that this complies with the current trademark policy. The objective of this policy is to give Python companies equal choices in naming their products, i.e. nobody would be allowed to call something "Python" without further qualification (except for the traditional CPython release). Regards, Martin From brett at python.org Fri Nov 22 15:01:32 2013 From: brett at python.org (Brett Cannon) Date: Fri, 22 Nov 2013 09:01:32 -0500 Subject: [Python-Dev] Assign(expr* targets, expr value) - why targetS? 
In-Reply-To: References: Message-ID: On Fri, Nov 22, 2013 at 7:29 AM, Benjamin Peterson wrote: > 2013/11/22 anatoly techtonik : > > On Fri, Nov 15, 2013 at 5:43 PM, Benjamin Peterson > wrote: > >> 2013/11/15 anatoly techtonik : > >>> On Tue, Nov 12, 2013 at 5:08 PM, Benjamin Peterson < > benjamin at python.org> wrote: > >>>> 2013/11/12 anatoly techtonik : > >>>>> On Sun, Nov 10, 2013 at 8:34 AM, Benjamin Peterson < > benjamin at python.org> wrote: > >>>>>> 2013/11/10 anatoly techtonik : > >>>>>>> http://hg.python.org/cpython/file/1ee45eb6aab9/Parser/Python.asdl > >>>>>>> > >>>>>>> In Assign(expr* targets, expr value), why the first argument is a > list? > >>>>>> > >>>>>> x = y = 42 > >>>>> > >>>>> Thanks. > >>>>> > >>>>> Speaking of this ASDL. `expr* targets` means that multiple entities > of > >>>>> `expr` under the name 'targets' can be passed to Assign statement. > >>>>> Assign uses them as left value. But `expr` definition contains things > >>>>> that can not be used as left side assignment targets: > >>>>> > >>>>> expr = BoolOp(boolop op, expr* values) > >>>>> | BinOp(expr left, operator op, expr right) > >>>>> ... > >>>>> | Str(string s) -- need to specify raw, unicode, etc? > >>>>> | Bytes(bytes s) > >>>>> | NameConstant(singleton value) > >>>>> | Ellipsis > >>>>> > >>>>> -- the following expression can appear in assignment context > >>>>> | Attribute(expr value, identifier attr, expr_context ctx) > >>>>> | Subscript(expr value, slice slice, expr_context ctx) > >>>>> | Starred(expr value, expr_context ctx) > >>>>> | Name(identifier id, expr_context ctx) > >>>>> | List(expr* elts, expr_context ctx) > >>>>> | Tuple(expr* elts, expr_context ctx) > >>>>> > >>>>> If I understand correctly, this is compiled into C struct definitions > >>>>> (Python-ast.c), and there is a code to traverse the structure, but > >>>>> where is code that validates that the structure is correct? Is it > done > >>>>> on the first level - text file parsing, before ASDL is built? If so, > >>>>> then what is the role of this ADSL exactly that the first step is > >>>>> unable to solve? > >>>> > >>>> Only valid expression targets are allowed during AST construction. See > >>>> set_expr_context in ast.c. > >>> > >>> Oh my. Now there is also CST in addition to AST. This stuff - > >>> http://docs.python.org/devguide/ - badly needs diagrams about data > >>> transformation toolchain from Python source code to machine > >>> execution instructions. I'd like some pretty stuff, but raw blogdiag > >>> hack will do the job http://blockdiag.com/en/blockdiag/index.html > >>> > >>> There is no set_expr_context in my copy of CPython code, which > >>> seems to be some alpha of Python 3.4 > >> > >> It's actually called set_context. > > > > Ok. So what is the process? > > > > SOURCE --> TOKEN STREAM --> SENTENCE STREAM --> CST --> > > --> AST --> BYTECODE > > > > Is that right? > > I don't know what sentence stream is, but otherwise looks right. If you want more of an overview: http://pyvideo.org/video/2331/from-source-to-code-how-cpythons-compiler-works -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Fri Nov 22 15:01:52 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 22 Nov 2013 17:01:52 +0300 Subject: [Python-Dev] Assign(expr* targets, expr value) - why targetS? 
In-Reply-To: References: Message-ID: On Fri, Nov 22, 2013 at 3:29 PM, Benjamin Peterson wrote: > 2013/11/22 anatoly techtonik : >> On Fri, Nov 15, 2013 at 5:43 PM, Benjamin Peterson wrote: >>> 2013/11/15 anatoly techtonik : >>>> On Tue, Nov 12, 2013 at 5:08 PM, Benjamin Peterson wrote: >>>>> 2013/11/12 anatoly techtonik : >>>>>> On Sun, Nov 10, 2013 at 8:34 AM, Benjamin Peterson wrote: >>>>>>> 2013/11/10 anatoly techtonik : >>>>>>>> http://hg.python.org/cpython/file/1ee45eb6aab9/Parser/Python.asdl >>>>>>>> >>>>>>>> In Assign(expr* targets, expr value), why the first argument is a list? >>>>>>> >>>>>>> x = y = 42 >>>>>> >>>>>> Thanks. >>>>>> >>>>>> Speaking of this ASDL. `expr* targets` means that multiple entities of >>>>>> `expr` under the name 'targets' can be passed to Assign statement. >>>>>> Assign uses them as left value. But `expr` definition contains things >>>>>> that can not be used as left side assignment targets: >>>>>> >>>>>> expr = BoolOp(boolop op, expr* values) >>>>>> | BinOp(expr left, operator op, expr right) >>>>>> ... >>>>>> | Str(string s) -- need to specify raw, unicode, etc? >>>>>> | Bytes(bytes s) >>>>>> | NameConstant(singleton value) >>>>>> | Ellipsis >>>>>> >>>>>> -- the following expression can appear in assignment context >>>>>> | Attribute(expr value, identifier attr, expr_context ctx) >>>>>> | Subscript(expr value, slice slice, expr_context ctx) >>>>>> | Starred(expr value, expr_context ctx) >>>>>> | Name(identifier id, expr_context ctx) >>>>>> | List(expr* elts, expr_context ctx) >>>>>> | Tuple(expr* elts, expr_context ctx) >>>>>> >>>>>> If I understand correctly, this is compiled into C struct definitions >>>>>> (Python-ast.c), and there is a code to traverse the structure, but >>>>>> where is code that validates that the structure is correct? Is it done >>>>>> on the first level - text file parsing, before ASDL is built? If so, >>>>>> then what is the role of this ADSL exactly that the first step is >>>>>> unable to solve? >>>>> >>>>> Only valid expression targets are allowed during AST construction. See >>>>> set_expr_context in ast.c. >>>> >>>> Oh my. Now there is also CST in addition to AST. This stuff - >>>> http://docs.python.org/devguide/ - badly needs diagrams about data >>>> transformation toolchain from Python source code to machine >>>> execution instructions. I'd like some pretty stuff, but raw blogdiag >>>> hack will do the job http://blockdiag.com/en/blockdiag/index.html >>>> >>>> There is no set_expr_context in my copy of CPython code, which >>>> seems to be some alpha of Python 3.4 >>> >>> It's actually called set_context. >> >> Ok. So what is the process? >> >> SOURCE --> TOKEN STREAM --> SENTENCE STREAM --> CST --> >> --> AST --> BYTECODE >> >> Is that right? > > I don't know what sentence stream is, but otherwise looks right. I mean that when you have TOKENS, you need to validate that their order is valid and throw errors to console if it is not. Like every sentence should have certain word order to make sense, you won't be able to construct CST from tokens that are placed in wrong order. Right? Or is CST itself a tree of tokens, which position is validated when the tree is transformed to AST? FWIW, I've modified my astdump project to make it easy to explore Python AST and experiment with it. 
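For anyone who wants to try the same kind of exploration with nothing but the standard library, here is a minimal sketch (an illustrative aside, not part of astdump). Parsing a multi-target assignment shows the `expr* targets` list and the Store context discussed above, and compiling the tree exercises the tail of the pipeline:

    import ast, dis

    tree = ast.parse("x = y = 42")
    print(ast.dump(tree))
    # Module(body=[Assign(targets=[Name(id='x', ctx=Store()),
    #                              Name(id='y', ctx=Store())],
    #              value=Num(n=42))])

    code = compile(tree, "<example>", "exec")  # AST -> code object
    dis.dis(code)                              # show the resulting bytecode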
This dataset shows how to create coverage for AST transformations (visitors) https://bitbucket.org/techtonik/astdump/src/tip/dataset/?at=default From ethan at stoneleaf.us Fri Nov 22 15:16:27 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 22 Nov 2013 06:16:27 -0800 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: Message-ID: <528F673B.3070505@stoneleaf.us> On 11/22/2013 01:00 AM, Richard Tew wrote: > > [snip] > > Yet, suddenly, the chance that we may release a "Stackless Python > 2.8", is a concern. Because there is no CPython 2.8 to mirror, and we've said there will not be one. It certainly didn't help that this was presented one week before the feature freeze deadline for 3.4 and Christian stated [1] his "2.8" release was going to happen in one week unless we talked him out of it. > We're not talking about releasing a Python 2.8 against anyone's wishes > here. Having "Python 2.8" in the name is going to be a PR nightmare. Other names have been suggested. > Or in _this specific case_, doing anything other than naming > something in a way we've been naming it for over a decade. When has Stackless created a version that didn't mirror the CPython version? If you haven't, then you are doing something different. > That there are people who would consider using the trademark to force > us to change the name we've been using without harm for 14 years, > worries me. This concern came about because of the misapprehension that the name would be "Python 2.8" with no "Stackless" in front. -- ~Ethan~ [1] https://mail.python.org/pipermail/python-dev/2013-November/130437.html From solipsis at pitrou.net Fri Nov 22 15:55:06 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Nov 2013 15:55:06 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 References: Message-ID: <20131122155506.2abd8a4c@fsol> On Fri, 22 Nov 2013 22:00:54 +1300 Richard Tew wrote: > > Was there any concern about the dozens of "Stackless Python 2.x" and > "Stackless Python 3.x" versions that I have released over the years > being a cause for confusion? Or more importantly, any actual problems > experienced? > > Yet, suddenly, the chance that we may release a "Stackless Python > 2.8", is a concern. It is, as long as there is "Python 2.8" in it: it will breed confusion amongst the 99% of non-experts our community is composed of. To be clear: I don't think anyone would go after you for doing so (perhaps Barry will play a bass solo? :-)), but it would still be a shame IMHO. Regards Antoine. From barry at python.org Fri Nov 22 16:12:13 2013 From: barry at python.org (Barry Warsaw) Date: Fri, 22 Nov 2013 10:12:13 -0500 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <20131122155506.2abd8a4c@fsol> References: <20131122155506.2abd8a4c@fsol> Message-ID: <20131122101213.494922da@limelight.wooz.org> On Nov 22, 2013, at 03:55 PM, Antoine Pitrou wrote: >(perhaps Barry will play a bass solo? :-)) Now, now, we don't want to give Chris incentive, do we? :) -Barry From ericsnowcurrently at gmail.com Fri Nov 22 17:26:32 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 22 Nov 2013 09:26:32 -0700 Subject: [Python-Dev] Buildbot categories Message-ID: Is there a listing of buildbot categories (3.x.stable, custom-stable, etc.)? The best I could find was http://python.org/dev/buildbot/. I couldn't find it at http://buildbot.python.org/ or in the devguide, both places where I expected to find such a list. Thanks! 
-eric From ericsnowcurrently at gmail.com Fri Nov 22 17:36:47 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 22 Nov 2013 09:36:47 -0700 Subject: [Python-Dev] PEP 451 (ModuleSpec) has landed Message-ID: The bulk of PEP 451 (all but the "fluffy" parts as Brett called them ) is now committed to default. We are tracking the remaining items as dependencies of http://bugs.python.org/issue18864. http://hg.python.org/cpython/rev/07229c6104b1 changeset: 87347:07229c6104b1 user: Eric Snow date: Fri Nov 22 09:05:39 2013 -0700 summary: Implement PEP 451 (ModuleSpec). -eric From ericsnowcurrently at gmail.com Fri Nov 22 17:37:52 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 22 Nov 2013 09:37:52 -0700 Subject: [Python-Dev] PEP 451 (ModuleSpec) has landed In-Reply-To: References: Message-ID: On Fri, Nov 22, 2013 at 9:36 AM, Eric Snow wrote: > The bulk of PEP 451 (all but the "fluffy" parts as Brett called them > ) is now committed to default. I'm looking into buildbot failures (on Windows of course). -eric From solipsis at pitrou.net Fri Nov 22 17:44:55 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Nov 2013 17:44:55 +0100 Subject: [Python-Dev] PEP 428 (pathlib) now committed Message-ID: <20131122174455.2004b3ce@fsol> Hello, I've pushed pathlib to the repository. I'm hopeful there won't be new buildbot failures because of it, but still, there may be some platform-specific issues I'm unaware of. I hope our Release Manager is doing ok :-) Regards Antoine. From tismer at stackless.com Fri Nov 22 17:54:46 2013 From: tismer at stackless.com (Christian Tismer) Date: Fri, 22 Nov 2013 17:54:46 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528F524F.8080807@v.loewis.de> References: <528F524F.8080807@v.loewis.de> Message-ID: <528F8C56.3060001@stackless.com> On 22/11/13 13:47, "Martin v. L?wis" wrote: > Am 22.11.13 10:00, schrieb Richard Tew: >> That there are people who would consider using the trademark to force >> us to change the name we've been using without harm for 14 years, >> worries me. It's one thing to perhaps use it to stop someone scamming >> Python users, and another to suggest using it as a weapon against us >> for this? Really? > Unfortunately, Christian's original question was unclear. If his plan > had been to release "Python 2.8" (as his original question suggested), > then he would have faced quite some opposition (and, in a sense, > releasing a "python28.dll" is quite close to that). The discussion is over, but I cannot let this comment go through without citing my original question, again: > My question > ----------- > > I have created a very clean Python 2.7.6+ based CPython with the > Stackless > additions, that compiles with VS 2010, using the adapted project > structure > of Python 3.3.X, and I want to publish that on the Stackless website > as the > official "Stackless Python 2.8". If you consider Stackless as official > ;-) . Can it be that these sentences have morphed into something different when going out to the mailing list? Or maybe there is a filter in the brains? If one removes the word "Stackless" everywhere, the above text reads still almost syntactic correctly, but changes it's meaning a lot. -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 
121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From victor.stinner at gmail.com Fri Nov 22 17:56:37 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 22 Nov 2013 17:56:37 +0100 Subject: [Python-Dev] PEP 428 (pathlib) now committed In-Reply-To: <20131122174455.2004b3ce@fsol> References: <20131122174455.2004b3ce@fsol> Message-ID: 2013/11/22 Antoine Pitrou : > I've pushed pathlib to the repository. I'm hopeful there won't be > new buildbot failures because of it, but still, there may be some > platform-specific issues I'm unaware of. A PEP wouldn't be successful if it doesn't break any buildbot. PEP 451 was successful, as yours :-D http://buildbot.python.org/all/builders/AMD64%20FreeBSD%2010.0%203.x/builds/1000 Victor From eliben at gmail.com Fri Nov 22 18:05:28 2013 From: eliben at gmail.com (Eli Bendersky) Date: Fri, 22 Nov 2013 09:05:28 -0800 Subject: [Python-Dev] Assign(expr* targets, expr value) - why targetS? In-Reply-To: References: Message-ID: On Fri, Nov 22, 2013 at 3:21 AM, anatoly techtonik wrote: > On Fri, Nov 15, 2013 at 5:43 PM, Benjamin Peterson > wrote: > > 2013/11/15 anatoly techtonik : > >> On Tue, Nov 12, 2013 at 5:08 PM, Benjamin Peterson > wrote: > >>> 2013/11/12 anatoly techtonik : > >>>> On Sun, Nov 10, 2013 at 8:34 AM, Benjamin Peterson < > benjamin at python.org> wrote: > >>>>> 2013/11/10 anatoly techtonik : > >>>>>> http://hg.python.org/cpython/file/1ee45eb6aab9/Parser/Python.asdl > >>>>>> > >>>>>> In Assign(expr* targets, expr value), why the first argument is a > list? > >>>>> > >>>>> x = y = 42 > >>>> > >>>> Thanks. > >>>> > >>>> Speaking of this ASDL. `expr* targets` means that multiple entities of > >>>> `expr` under the name 'targets' can be passed to Assign statement. > >>>> Assign uses them as left value. But `expr` definition contains things > >>>> that can not be used as left side assignment targets: > >>>> > >>>> expr = BoolOp(boolop op, expr* values) > >>>> | BinOp(expr left, operator op, expr right) > >>>> ... > >>>> | Str(string s) -- need to specify raw, unicode, etc? > >>>> | Bytes(bytes s) > >>>> | NameConstant(singleton value) > >>>> | Ellipsis > >>>> > >>>> -- the following expression can appear in assignment context > >>>> | Attribute(expr value, identifier attr, expr_context ctx) > >>>> | Subscript(expr value, slice slice, expr_context ctx) > >>>> | Starred(expr value, expr_context ctx) > >>>> | Name(identifier id, expr_context ctx) > >>>> | List(expr* elts, expr_context ctx) > >>>> | Tuple(expr* elts, expr_context ctx) > >>>> > >>>> If I understand correctly, this is compiled into C struct definitions > >>>> (Python-ast.c), and there is a code to traverse the structure, but > >>>> where is code that validates that the structure is correct? Is it done > >>>> on the first level - text file parsing, before ASDL is built? If so, > >>>> then what is the role of this ADSL exactly that the first step is > >>>> unable to solve? > >>> > >>> Only valid expression targets are allowed during AST construction. See > >>> set_expr_context in ast.c. > >> > >> Oh my. Now there is also CST in addition to AST. This stuff - > >> http://docs.python.org/devguide/ - badly needs diagrams about data > >> transformation toolchain from Python source code to machine > >> execution instructions. 
I'd like some pretty stuff, but raw blogdiag > >> hack will do the job http://blockdiag.com/en/blockdiag/index.html > >> > >> There is no set_expr_context in my copy of CPython code, which > >> seems to be some alpha of Python 3.4 > > > > It's actually called set_context. > > Ok. So what is the process? > > SOURCE --> TOKEN STREAM --> SENTENCE STREAM --> CST --> > --> AST --> BYTECODE > > Is that right? > > >>>> Is it possible to fix ADSL to move `expr` that are allowed in Assign > >>>> into `expr` subset? What effect will it achieve? I mean - will ADSL > >>>> compiler complain about wrong stuff on the left side, or it will still > >>>> be a role of some other component. Which one? > >>> > >>> I'm not sure what you mean by an `expr` subset. > >> > >> Transform this: > >> > >> expr = BoolOp(boolop op, expr* values) > >> | BinOp(expr left, operator op, expr right) > >> ... > >> | Str(string s) -- need to specify raw, unicode, etc? > >> | Bytes(bytes s) > >> | NameConstant(singleton value) > >> | Ellipsis > >> > >> -- the following expression can appear in assignment context > >> | Attribute(expr value, identifier attr, expr_context ctx) > >> | Subscript(expr value, slice slice, expr_context ctx) > >> | Starred(expr value, expr_context ctx) > >> | Name(identifier id, expr_context ctx) > >> | List(expr* elts, expr_context ctx) > >> | Tuple(expr* elts, expr_context ctx) > >> > >> to this: > >> > >> expr = BoolOp(boolop op, expr* values) > >> | BinOp(expr left, operator op, expr right) > >> ... > >> | Str(string s) -- need to specify raw, unicode, etc? > >> | Bytes(bytes s) > >> | NameConstant(singleton value) > >> | Ellipsis > >> > >> -- the following expression can appear in assignment context > >> | expr_asgn > >> > >> expr_asgn = > >> Attribute(expr value, identifier attr, expr_context ctx) > >> | Subscript(expr value, slice slice, expr_context ctx) > >> | Starred(expr value, expr_context ctx) > >> | Name(identifier id, expr_context ctx) > >> | List(expr* elts, expr_context ctx) > >> | Tuple(expr* elts, expr_context ctx) > > > > I doubt ASDL will let you do that. > > asdl.py is plain broken - wrong number of arguments passed to > output function > asdl_c.py worked ok with fixed ASDL and generated - diff attached. > I don't know what to check further - on Windows without Visual Studio. > -- > asdl.py is about to be replaced - please see http://bugs.python.org/issue19655 - I'm waiting for after the 3.4 branch to move forward with that. Let's discuss any needed fixes in that issue. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From status at bugs.python.org Fri Nov 22 18:07:42 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 22 Nov 2013 18:07:42 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20131122170742.C59D356A40@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-11-15 - 2013-11-22) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. 
Issues counts and deltas: open 4280 (+15) closed 27206 (+87) total 31486 (+102) Open issues with patches: 1972 Issues opened (81) ================== #17134: Use Windows' certificate store for CA certs http://bugs.python.org/issue17134 reopened by haypo #19613: test_nntplib: sporadic failures, test_article_head_body() http://bugs.python.org/issue19613 opened by haypo #19614: support.temp_cwd should use support.rmtree http://bugs.python.org/issue19614 opened by r.david.murray #19615: "ImportError: dynamic module does not define init function" wh http://bugs.python.org/issue19615 opened by ecatmur #19616: Fix test.test_importlib.util.test_both() to set __module__ http://bugs.python.org/issue19616 opened by brett.cannon #19618: test_sysconfig_module fails on Ubuntu 12.04 http://bugs.python.org/issue19618 opened by berker.peksag #19619: Blacklist base64, hex, ... codecs from bytes.decode() and str. http://bugs.python.org/issue19619 opened by haypo #19620: tokenize documentation contains typos (argment instead of argu http://bugs.python.org/issue19620 opened by cjwelborn #19622: Default buffering for input and output pipes in subprocess mod http://bugs.python.org/issue19622 opened by vadmium #19623: Support for writing aifc to unseekable file http://bugs.python.org/issue19623 opened by serhiy.storchaka #19624: Switch constants in the errno module to IntEnum http://bugs.python.org/issue19624 opened by serhiy.storchaka #19626: test_email and Lib/email/_policybase.py failures with -OO http://bugs.python.org/issue19626 opened by ezio.melotti #19627: python open built-in function - "updating" is not defined http://bugs.python.org/issue19627 opened by Bulwersator #19628: maxlevels -1 on compileall for unlimited recursion http://bugs.python.org/issue19628 opened by Sworddragon #19629: support.rmtree fails on symlinks under Windows http://bugs.python.org/issue19629 opened by pitrou #19630: marshal.dump cannot write to temporary file http://bugs.python.org/issue19630 opened by lm1 #19631: "exec" BNF in simple statements http://bugs.python.org/issue19631 opened by grubert #19632: doc build warning http://bugs.python.org/issue19632 opened by pitrou #19635: asyncio should not depend on concurrent.futures, it is not alw http://bugs.python.org/issue19635 opened by haypo #19636: Fix usage of MAX_PATH in posixmodule.c http://bugs.python.org/issue19636 opened by haypo #19638: dtoa: conversion from '__int64' to 'int', possible loss of dat http://bugs.python.org/issue19638 opened by christian.heimes #19639: Improve re match objects display http://bugs.python.org/issue19639 opened by Claudiu.Popa #19640: Drop _source attribute of namedtuple http://bugs.python.org/issue19640 opened by haypo #19641: Add audioop.byteswap() http://bugs.python.org/issue19641 opened by serhiy.storchaka #19642: shutil to support equivalent of: rm -f /dir/* http://bugs.python.org/issue19642 opened by ivan.radic #19643: shutil rmtree fails on readonly files in Windows http://bugs.python.org/issue19643 opened by ivan.radic #19645: decouple unittest assertions from the TestCase class http://bugs.python.org/issue19645 opened by Gregory.Salvan #19648: Empty tests in pickletester need to be implemented or removed http://bugs.python.org/issue19648 opened by zach.ware #19650: test_multiprocessing_spawn.test_mymanager_context() crashed wi http://bugs.python.org/issue19650 opened by haypo #19652: test_asyncio: test_subprocess_send_signal() hangs on buildbot http://bugs.python.org/issue19652 opened by haypo #19654: test_tkinter sporadic failures on 
"x86 Tiger 3.x" buildbot http://bugs.python.org/issue19654 opened by haypo #19655: Replace the ASDL parser carried with CPython http://bugs.python.org/issue19655 opened by eli.bendersky #19656: Add Py3k warning for non-ascii bytes literals http://bugs.python.org/issue19656 opened by serhiy.storchaka #19658: inspect.getsource weird case http://bugs.python.org/issue19658 opened by Ronny.Pfannschmidt #19659: Document Argument Clinic? http://bugs.python.org/issue19659 opened by haypo #19660: decorator syntax: allow testlist instead of just dotted_name http://bugs.python.org/issue19660 opened by james #19661: AIX: Python: RuntimeError "invalid slot offset when importing http://bugs.python.org/issue19661 opened by dellair.jie #19662: smtpd.py should not decode utf-8 http://bugs.python.org/issue19662 opened by lpolzer #19663: Not so correct error message when initializing defaultdict http://bugs.python.org/issue19663 opened by vajrasky #19665: test_logging fails with SMTPHandler timeout http://bugs.python.org/issue19665 opened by canisdirus #19666: Format string for ASCII unicode or bytes-like object as readon http://bugs.python.org/issue19666 opened by christian.heimes #19667: Add the "htmlcharrefreplace" error handler http://bugs.python.org/issue19667 opened by serhiy.storchaka #19668: Add support of the cp1125 encoding http://bugs.python.org/issue19668 opened by serhiy.storchaka #19670: SimpleCookie Generates Non-RFC6265-Compliant Cookies http://bugs.python.org/issue19670 opened by pdbogen #19671: Option to select the optimization level on compileall http://bugs.python.org/issue19671 opened by Sworddragon #19674: Add introspection information for builtins http://bugs.python.org/issue19674 opened by larry #19675: Pool dies with excessive workers, but does not cleanup http://bugs.python.org/issue19675 opened by dsoprea #19676: Add the "namereplace" error handler http://bugs.python.org/issue19676 opened by serhiy.storchaka #19677: IDLE displaying a CoreAnimation warning http://bugs.python.org/issue19677 opened by rhettinger #19678: smtpd.py: channel should be passed to process_message http://bugs.python.org/issue19678 opened by lpolzer #19679: smtpd.py (SMTPChannel): get rid of "conn" attribute http://bugs.python.org/issue19679 opened by lpolzer #19680: Help missing for exec and print http://bugs.python.org/issue19680 opened by bkabrda #19681: test_repr (test.test_functools.TestPartialC) failures http://bugs.python.org/issue19681 opened by christian.heimes #19683: test_minidom has many empty tests http://bugs.python.org/issue19683 opened by zach.ware #19687: Fixes for elementtree integer overflow http://bugs.python.org/issue19687 opened by christian.heimes #19689: ssl.create_default_context() http://bugs.python.org/issue19689 opened by christian.heimes #19690: test_logging test_race failed with PermissionError http://bugs.python.org/issue19690 opened by ned.deily #19691: Weird wording in RuntimeError doc http://bugs.python.org/issue19691 opened by pitrou #19692: Rename Py_SAFE_DOWNCAST http://bugs.python.org/issue19692 opened by pitrou #19693: "make altinstall && make install" behaviour differs from "make http://bugs.python.org/issue19693 opened by ncoghlan #19694: test_venv failing on one of the Ubuntu buildbots http://bugs.python.org/issue19694 opened by ncoghlan #19695: Clarify how to use various import-related locks http://bugs.python.org/issue19695 opened by brett.cannon #19696: Merge all (non-syntactic) import-related tests into test_impor http://bugs.python.org/issue19696 opened by 
brett.cannon #19697: refactor pythonrun.c to make use of specs (__main__.__spec__) http://bugs.python.org/issue19697 opened by brett.cannon #19698: Implement _imp.exec_builtin and exec_dynamic http://bugs.python.org/issue19698 opened by brett.cannon #19699: Update zipimport for PEP 451 http://bugs.python.org/issue19699 opened by brett.cannon #19700: Update runpy for PEP 451 http://bugs.python.org/issue19700 opened by brett.cannon #19701: Update multiprocessing for PEP 451 http://bugs.python.org/issue19701 opened by brett.cannon #19702: Update pickle to PEP 451 http://bugs.python.org/issue19702 opened by brett.cannon #19703: Upate pydoc to PEP 451 http://bugs.python.org/issue19703 opened by brett.cannon #19704: Update test.test_threaded_import to PEP 451 http://bugs.python.org/issue19704 opened by brett.cannon #19705: Update test.test_namespace_pkgs to PEP 451 http://bugs.python.org/issue19705 opened by brett.cannon #19706: Check if inspect needs updating for PEP 451 http://bugs.python.org/issue19706 opened by brett.cannon #19707: Check if unittest.mock needs updating for PEP 451 http://bugs.python.org/issue19707 opened by brett.cannon #19708: Check pkgutil for anything missing for PEP 451 http://bugs.python.org/issue19708 opened by brett.cannon #19709: Check pythonrun.c is fully using PEP 451 http://bugs.python.org/issue19709 opened by brett.cannon #19710: Make sure documentation for PEP 451 is finished http://bugs.python.org/issue19710 opened by brett.cannon #19711: add test for changed portions after reloading a namespace pack http://bugs.python.org/issue19711 opened by brett.cannon #19712: Make sure there are exec_module tests for _LoaderBasics subcla http://bugs.python.org/issue19712 opened by brett.cannon #19713: Deprecate various things in importlib thanks to PEP 451 http://bugs.python.org/issue19713 opened by brett.cannon #19714: Add tests for importlib.machinery.WindowsRegistryFinder http://bugs.python.org/issue19714 opened by brett.cannon Most recent 15 issues with no replies (15) ========================================== #19714: Add tests for importlib.machinery.WindowsRegistryFinder http://bugs.python.org/issue19714 #19712: Make sure there are exec_module tests for _LoaderBasics subcla http://bugs.python.org/issue19712 #19711: add test for changed portions after reloading a namespace pack http://bugs.python.org/issue19711 #19710: Make sure documentation for PEP 451 is finished http://bugs.python.org/issue19710 #19709: Check pythonrun.c is fully using PEP 451 http://bugs.python.org/issue19709 #19708: Check pkgutil for anything missing for PEP 451 http://bugs.python.org/issue19708 #19707: Check if unittest.mock needs updating for PEP 451 http://bugs.python.org/issue19707 #19706: Check if inspect needs updating for PEP 451 http://bugs.python.org/issue19706 #19705: Update test.test_namespace_pkgs to PEP 451 http://bugs.python.org/issue19705 #19704: Update test.test_threaded_import to PEP 451 http://bugs.python.org/issue19704 #19703: Upate pydoc to PEP 451 http://bugs.python.org/issue19703 #19702: Update pickle to PEP 451 http://bugs.python.org/issue19702 #19701: Update multiprocessing for PEP 451 http://bugs.python.org/issue19701 #19700: Update runpy for PEP 451 http://bugs.python.org/issue19700 #19699: Update zipimport for PEP 451 http://bugs.python.org/issue19699 Most recent 15 issues waiting for review (15) ============================================= #19689: ssl.create_default_context() http://bugs.python.org/issue19689 #19687: Fixes for elementtree integer overflow 
http://bugs.python.org/issue19687 #19681: test_repr (test.test_functools.TestPartialC) failures http://bugs.python.org/issue19681 #19676: Add the "namereplace" error handler http://bugs.python.org/issue19676 #19675: Pool dies with excessive workers, but does not cleanup http://bugs.python.org/issue19675 #19674: Add introspection information for builtins http://bugs.python.org/issue19674 #19668: Add support of the cp1125 encoding http://bugs.python.org/issue19668 #19667: Add the "htmlcharrefreplace" error handler http://bugs.python.org/issue19667 #19665: test_logging fails with SMTPHandler timeout http://bugs.python.org/issue19665 #19663: Not so correct error message when initializing defaultdict http://bugs.python.org/issue19663 #19662: smtpd.py should not decode utf-8 http://bugs.python.org/issue19662 #19660: decorator syntax: allow testlist instead of just dotted_name http://bugs.python.org/issue19660 #19656: Add Py3k warning for non-ascii bytes literals http://bugs.python.org/issue19656 #19655: Replace the ASDL parser carried with CPython http://bugs.python.org/issue19655 #19654: test_tkinter sporadic failures on "x86 Tiger 3.x" buildbot http://bugs.python.org/issue19654 Top 10 most discussed issues (10) ================================= #19619: Blacklist base64, hex, ... codecs from bytes.decode() and str. http://bugs.python.org/issue19619 41 msgs #18874: Add a new tracemalloc module to trace memory allocations http://bugs.python.org/issue18874 16 msgs #8813: SSLContext doesn't support loading a CRL http://bugs.python.org/issue8813 13 msgs #19518: Add new PyRun_xxx() functions to not encode the filename http://bugs.python.org/issue19518 12 msgs #19681: test_repr (test.test_functools.TestPartialC) failures http://bugs.python.org/issue19681 11 msgs #19674: Add introspection information for builtins http://bugs.python.org/issue19674 10 msgs #14455: plistlib unable to read json and binary plist files http://bugs.python.org/issue14455 9 msgs #19575: subprocess: on Windows, unwanted file handles are inherited by http://bugs.python.org/issue19575 9 msgs #19660: decorator syntax: allow testlist instead of just dotted_name http://bugs.python.org/issue19660 9 msgs #7105: weak dict iterators are fragile because of unpredictable GC ru http://bugs.python.org/issue7105 8 msgs Issues closed (79) ================== #5202: wave.py cannot write wave files into a shell pipeline http://bugs.python.org/issue5202 closed by serhiy.storchaka #8311: wave module sets data subchunk size incorrectly when writing w http://bugs.python.org/issue8311 closed by serhiy.storchaka #10158: BadInternalCall running test_multiprocessing http://bugs.python.org/issue10158 closed by haypo #12853: global name 'r' is not defined in upload.py http://bugs.python.org/issue12853 closed by jason.coombs #12892: UTF-16 and UTF-32 codecs should reject (lone) surrogates http://bugs.python.org/issue12892 closed by serhiy.storchaka #16158: sporadic test_multiprocessing failure http://bugs.python.org/issue16158 closed by haypo #16596: Skip stack unwinding when "next", "until" and "return" pdb co http://bugs.python.org/issue16596 closed by gvanrossum #16632: Enable DEP and ASLR http://bugs.python.org/issue16632 closed by christian.heimes #16998: Lost updates with multiprocessing.Value http://bugs.python.org/issue16998 closed by sbt #17276: HMAC: deprecate default hash http://bugs.python.org/issue17276 closed by christian.heimes #17618: base85 encoding http://bugs.python.org/issue17618 closed by pitrou #17791: PC/pyconfig.h defines PREFIX 
macro http://bugs.python.org/issue17791 closed by christian.heimes #17806: Add keyword args support to str/bytes.expandtabs() http://bugs.python.org/issue17806 closed by ezio.melotti #17893: Refactor reduce protocol implementation http://bugs.python.org/issue17893 closed by alexandre.vassalotti #17916: Provide dis.Bytecode based equivalent of dis.distb http://bugs.python.org/issue17916 closed by python-dev #18294: zlib module is not completly 64-bit safe http://bugs.python.org/issue18294 closed by python-dev #18528: Possible fd leak in socketmodule http://bugs.python.org/issue18528 closed by christian.heimes #18775: name attribute for HMAC http://bugs.python.org/issue18775 closed by christian.heimes #19183: PEP 456 Secure and interchangeable hash algorithm http://bugs.python.org/issue19183 closed by christian.heimes #19282: dbm.open should be a context manager http://bugs.python.org/issue19282 closed by python-dev #19338: multiprocessing: sys.exit() from a child with a non-int exit c http://bugs.python.org/issue19338 closed by sbt #19371: timing test too tight http://bugs.python.org/issue19371 closed by ezio.melotti #19386: selectors test_interrupted_retry is flaky http://bugs.python.org/issue19386 closed by r.david.murray #19448: SSL: add OID / NID lookup http://bugs.python.org/issue19448 closed by christian.heimes #19449: csv.DictWriter can't handle extra non-string fields http://bugs.python.org/issue19449 closed by r.david.murray #19474: Argument Clinic causing compiler warnings about uninitialized http://bugs.python.org/issue19474 closed by larry #19507: ssl.wrap_socket() with server_hostname should imply match_host http://bugs.python.org/issue19507 closed by christian.heimes #19512: Avoid temporary Unicode strings, use identifiers to only creat http://bugs.python.org/issue19512 closed by haypo #19513: Use PyUnicodeWriter instead of PyAccu in repr(tuple) and repr( http://bugs.python.org/issue19513 closed by haypo #19520: Win32 compiler warning in _sha3 http://bugs.python.org/issue19520 closed by zach.ware #19523: logging.FileHandler - using of delay argument case handle leak http://bugs.python.org/issue19523 closed by python-dev #19544: Port distutils as found in Python 2.7 to Python 3.x. 
http://bugs.python.org/issue19544 closed by jason.coombs #19550: PEP 453: Windows installer integration http://bugs.python.org/issue19550 closed by loewis #19552: PEP 453: venv module and pyvenv integration http://bugs.python.org/issue19552 closed by python-dev #19553: PEP 453: "make install" and "make altinstall" integration http://bugs.python.org/issue19553 closed by ned.deily #19555: "SO" config var not getting set http://bugs.python.org/issue19555 closed by barry #19565: test_multiprocessing_spawn: RuntimeError and assertion error o http://bugs.python.org/issue19565 closed by haypo #19568: bytearray_setslice_linear() leaves the bytearray in an inconsi http://bugs.python.org/issue19568 closed by python-dev #19578: del list[a:b:c] doesn't handle correctly list_resize() failure http://bugs.python.org/issue19578 closed by python-dev #19581: PyUnicodeWriter: change the overallocation factor for Windows http://bugs.python.org/issue19581 closed by haypo #19586: distutils: Remove assertEquals and assert_ deprecation warning http://bugs.python.org/issue19586 closed by jason.coombs #19590: Use specific asserts in test_email http://bugs.python.org/issue19590 closed by serhiy.storchaka #19591: Use specific asserts in ctype tests http://bugs.python.org/issue19591 closed by serhiy.storchaka #19594: Use specific asserts in unittest tests http://bugs.python.org/issue19594 closed by ezio.melotti #19596: Silently skipped tests in test_importlib http://bugs.python.org/issue19596 closed by zach.ware #19597: Add decorator for tests not yet implemented http://bugs.python.org/issue19597 closed by zach.ware #19599: Failure of test_async_timeout() of test_multiprocessing_spawn: http://bugs.python.org/issue19599 closed by sbt #19600: Use specific asserts in distutils tests http://bugs.python.org/issue19600 closed by ezio.melotti #19601: Use specific asserts in sqlite3 tests http://bugs.python.org/issue19601 closed by serhiy.storchaka #19602: Use specific asserts in tkinter tests http://bugs.python.org/issue19602 closed by serhiy.storchaka #19603: Use specific asserts in test_decr http://bugs.python.org/issue19603 closed by serhiy.storchaka #19604: Use specific asserts in array tests http://bugs.python.org/issue19604 closed by serhiy.storchaka #19605: Use specific asserts in datetime tests http://bugs.python.org/issue19605 closed by serhiy.storchaka #19606: Use specific asserts in http.cookiejar tests http://bugs.python.org/issue19606 closed by serhiy.storchaka #19607: Use specific asserts in weakref tests http://bugs.python.org/issue19607 closed by serhiy.storchaka #19617: Fix usage of Py_ssize_t type in Python/compile.c http://bugs.python.org/issue19617 closed by haypo #19621: Reimporting this and str.translate() http://bugs.python.org/issue19621 closed by eric.snow #19625: IDLE should use UTF-8 as it's default encoding http://bugs.python.org/issue19625 closed by serhiy.storchaka #19633: test_wave: failures on PPC64 buildbot http://bugs.python.org/issue19633 closed by haypo #19634: test_strftime.test_y_before_1900_nonwin() fails on AIX http://bugs.python.org/issue19634 closed by haypo #19637: test_subprocess.test_undecodable_env() failure on AIX http://bugs.python.org/issue19637 closed by haypo #19644: Failing of interpreter when slicing a string in MacOS X Maveri http://bugs.python.org/issue19644 closed by ezio.melotti #19646: Use PyUnicodeWriter in repr(dict) http://bugs.python.org/issue19646 closed by haypo #19647: unittest.TestSuite consumes tests http://bugs.python.org/issue19647 closed by michael.foord 
#19649: Clean up OS X framework and universal bin directories http://bugs.python.org/issue19649 closed by ned.deily #19651: Mac OSX 10.9 segmentation fault 11 with Python 3.3.3 http://bugs.python.org/issue19651 closed by johndobson #19653: Generalize usage of _PyUnicodeWriter for repr(obj): add _PyObj http://bugs.python.org/issue19653 closed by haypo #19657: List comprehension conditional evaluation order seems backward http://bugs.python.org/issue19657 closed by r.david.murray #19664: UserDict test assumes ordered dict repr http://bugs.python.org/issue19664 closed by christian.heimes #19669: remove mention of the old LaTeX docs http://bugs.python.org/issue19669 closed by pitrou #19672: Listing of all exceptions for every function http://bugs.python.org/issue19672 closed by r.david.murray #19673: PEP 428 implementation http://bugs.python.org/issue19673 closed by pitrou #19682: _ssl won't compile with OSX 10.9 SDK http://bugs.python.org/issue19682 closed by ronaldoussoren #19684: IDLE on OS X 10.6.8 crashes when opening a file http://bugs.python.org/issue19684 closed by ned.deily #19685: pip: open() uses the locale encoding to parse Python script, i http://bugs.python.org/issue19685 closed by ncoghlan #19686: possible unnecessary memoryview copies? http://bugs.python.org/issue19686 closed by skrah #19688: pip install now fails with 'HTMLParser' object has no attribut http://bugs.python.org/issue19688 closed by ezio.melotti #1744456: Patch for feat. 1713877 Expose callbackAPI in readline module http://bugs.python.org/issue1744456 closed by christian.heimes #513840: entity unescape for sgml/htmllib http://bugs.python.org/issue513840 closed by ezio.melotti From techtonik at gmail.com Fri Nov 22 18:21:02 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 22 Nov 2013 20:21:02 +0300 Subject: [Python-Dev] PEP 428 (pathlib) now committed In-Reply-To: <20131122174455.2004b3ce@fsol> References: <20131122174455.2004b3ce@fsol> Message-ID: It was too fast. I didn't have a chance to send my comments. -- anatoly t. On Fri, Nov 22, 2013 at 7:44 PM, Antoine Pitrou wrote: > > Hello, > > I've pushed pathlib to the repository. I'm hopeful there won't be > new buildbot failures because of it, but still, there may be some > platform-specific issues I'm unaware of. > > I hope our Release Manager is doing ok :-) > > Regards > > Antoine. From martin at v.loewis.de Fri Nov 22 18:25:21 2013 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 22 Nov 2013 18:25:21 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528F8C56.3060001@stackless.com> References: <528F524F.8080807@v.loewis.de> <528F8C56.3060001@stackless.com> Message-ID: <528F9381.70907@v.loewis.de> Am 22.11.13 17:54, schrieb Christian Tismer: > The discussion is over, but I cannot let this comment go through without > citing my original question, again: >> My question >> ----------- >> >> I have created a very clean Python 2.7.6+ based CPython with the >> Stackless >> additions, that compiles with VS 2010, using the adapted project >> structure >> of Python 3.3.X, and I want to publish that on the Stackless website >> as the >> official "Stackless Python 2.8". If you consider Stackless as official >> ;-) .
> > Can it be that these sentences have morphed into something different > when going out to the mailing list? Or maybe there is a filter in the > brains? > If one removes the word "Stackless" everywhere, the above text reads > still almost syntactic correctly, but changes it's meaning a lot. Hmm. This isn't the original *question*. Instead, the question read # Can I rely on PEP 404 that the "Python 2.8" namespace never will clash # with CPython? So while the word "Stackless" wasn't removed everywhere, it certainly wasn't in the question. Regards, Martin From brett at python.org Fri Nov 22 18:31:06 2013 From: brett at python.org (Brett Cannon) Date: Fri, 22 Nov 2013 12:31:06 -0500 Subject: [Python-Dev] PEP 451 (ModuleSpec) has landed In-Reply-To: References: Message-ID: On Fri, Nov 22, 2013 at 11:37 AM, Eric Snow wrote: > On Fri, Nov 22, 2013 at 9:36 AM, Eric Snow > wrote: > > The bulk of PEP 451 (all but the "fluffy" parts as Brett called them > > ) is now committed to default. > > I'm looking into buildbot failures (on Windows of course). .. which have now been fixed. -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Fri Nov 22 19:09:07 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Nov 2013 19:09:07 +0100 Subject: [Python-Dev] peps: 3156 has also been committed. References: <3dR4Xm4QfLz7LmQ@mail.python.org> Message-ID: <20131122190907.26498847@fsol> On Fri, 22 Nov 2013 18:35:20 +0100 (CET) barry.warsaw wrote: > http://hg.python.org/peps/rev/706fe4b8b148 > changeset: 5315:706fe4b8b148 > user: Barry Warsaw > date: Fri Nov 22 12:35:10 2013 -0500 > summary: > 3156 has also been committed. The reason I haven't set "final" here is that AFAIK, Guido still has to check the PEP conforms to the implementation. I don't know if that warrants a different label. Regards Antoine. From Steve.Dower at microsoft.com Fri Nov 22 19:10:53 2013 From: Steve.Dower at microsoft.com (Steve Dower) Date: Fri, 22 Nov 2013 18:10:53 +0000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528F3E49.9030800@v.loewis.de> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> <8d63cc3435f94110aca925330245321e@BLUPR03MB293.namprd03.prod.outlook.com> <528F3E49.9030800@v.loewis.de> Message-ID: <01b4d348507f4895a2b6b7e96c19b796@BLUPR03MB293.namprd03.prod.outlook.com> Martin v. L?wis wrote: > Am 22.11.13 01:58, schrieb Steve Dower: > >> I'm happy to work on a PEP and changes for what I described above, if >> there's enough interest? I can also update distutils to detect and >> build with any available compiler, though this may be more of a >> feature than we'd want for 2.7 at this point. > > I don't think a PEP is necessary - Guido essentially pre-approved changes needed > to make Python 2.7 work with newer tools and operating systems. I'd really want to update distutils.msvc9compiler to detect later versions as well, since that would make 'pip install' work properly for a large (majority?) of users for a large (majority?) of packages with extension modules. Some may consider this PEP-worthy (or at least worth arguing about), though I'm happy to just contribute a patch. (Not referring to my existing patch for this - I have a far more compatible approach in mind.) There's probably also value in making the same changes to Python 3.4. 
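For concreteness, detection along those lines might look something like the following sketch. find_vcvarsall() is an existing distutils helper; the version-probing loop is purely illustrative, an assumption about the shape of such a patch rather than the patch itself:

    from distutils import msvc9compiler

    def find_available_vc(candidates=(9.0, 10.0, 11.0, 12.0)):
        # Probe for the first Visual C++ installation whose
        # vcvarsall.bat can be located, oldest first.
        for version in candidates:
            path = msvc9compiler.find_vcvarsall(version)
            if path:
                return version, path
        return None, None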
The stable ABI is solving a different problem, though it also made it safer to I'm also getting in touch with my colleague who currently owns MSVCRT to figure out the full scope of what may happen once we start allowing mismatched versions in the same process. Hopefully there isn't much, but it will certainly be worth writing up somewhere - PEP or developer docs, doesn't really bother me. We may also want to fail builds that use functionality known to conflict - setlocale() comes to mind. > A patch for this would be appreciated - perhaps you would want to put it into > your sandbox on hg.python.org. I don't have a sandbox - how can I get one? Cheers, Steve > Regards, > Martin From solipsis at pitrou.net Fri Nov 22 19:52:10 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Nov 2013 19:52:10 +0100 Subject: [Python-Dev] PEP 428 (pathlib) now committed References: <20131122174455.2004b3ce@fsol> Message-ID: <20131122195210.5b7fae8d@fsol> On Fri, 22 Nov 2013 17:44:55 +0100 Antoine Pitrou wrote: > > I've pushed pathlib to the repository. I'm hopeful there won't be > new buildbot failures because of it, but still, there may be some > platform-specific issues I'm unaware of. Actually, there turn out to be two platform-specific issues: - test_glob() failure under POSIX case-insensitive filesystems (in other words, OS X :-)): http://bugs.python.org/issue19718 - test_touch_common() failure under the Windows buildbot: http://bugs.python.org/issue19715 I'd be glad to have help or insight from people experts with those platforms on the issues above! Regards Antoine. From martin at v.loewis.de Fri Nov 22 20:12:38 2013 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 22 Nov 2013 20:12:38 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <01b4d348507f4895a2b6b7e96c19b796@BLUPR03MB293.namprd03.prod.outlook.com> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> <8d63cc3435f94110aca925330245321e@BLUPR03MB293.namprd03.prod.outlook.com> <528F3E49.9030800@v.loewis.de> <01b4d348507f4895a2b6b7e96c19b796@BLUPR03MB293.namprd03.prod.outlook.com> Message-ID: <528FACA6.1060504@v.loewis.de> Am 22.11.13 19:10, schrieb Steve Dower: > I'd really want to update distutils.msvc9compiler to detect later > versions as well, since that would make 'pip install' work properly > for a large (majority?) of users for a large (majority?) of packages > with extension modules. Some may consider this PEP-worthy (or at > least worth arguing about), though I'm happy to just contribute a > patch. (Not referring to my existing patch for this - I have a far > more compatible approach in mind.) A PEP on 2.7 seems questionable - if this would really need a PEP, it would be right out (IMO). A PEP would ask for community input, weighing possibly different design choices. Instead, I think this needs explicit RM approval, such as any other borderline bugfix patch. I'd personally support it, including any distutils change (assuming the changes "look" backwards-compatible) - but it still would be for Benjamin to rule about it. > There's probably also value in making the same changes to Python 3.4. Perhaps. However, Python 3.4 is likely being replaced before VS 2010 ends its life, and people will be more quick to forward-port to 3.5. 
> I'm also getting in touch with my colleague who currently owns MSVCRT > to figure out the full scope of what may happen once we start > allowing mismatched versions in the same process. There used to be an MSDN article about it, but I think it was incomplete. It mentioned (IIRC) a) locale, b) malloc, c) struct FILE. Not sure whether it mentioned CRT file handles, and I'm fairly sure that it didn't mention errno. I also don't think that timezone issues were mentioned (although there used to be a separate article about CRT timezones). So if you can get somebody to compile a complete list, publishing it as a KB article would certainly be appreciated beyond the Python project. >> A patch for this would be appreciated - perhaps you would want to >> put it into your sandbox on hg.python.org. > > I don't have a sandbox - how can I get one? You are not a Python committer yet, are you? If you are, go to hg.python.org/cpython, and invoke the server-side clone. If you are not - does your company agree if you would become one? In any case, patches or a clone on bitbucket would work just as well. Regards, Martin From ethan at stoneleaf.us Fri Nov 22 19:59:31 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 22 Nov 2013 10:59:31 -0800 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: References: <528F673B.3070505@stoneleaf.us> Message-ID: <528FA993.4090102@stoneleaf.us> [Redirecting to Python Dev] On 11/22/2013 10:25 AM, Richard Tew wrote: > On Sat, Nov 23, 2013 at 3:16 AM, Ethan Furman wrote: >> On 11/22/2013 01:00 AM, Richard Tew wrote: >>> Yet, suddenly, the chance that we may release a "Stackless Python >>> 2.8", is a concern. >> >> Because there is no CPython 2.8 to mirror, and we've said there will not be >> one. >> >> It certainly didn't help that this was presented one week before the feature >> freeze deadline for 3.4 and Christian stated [1] his "2.8" release was going >> to happen in one week unless we talked him out of it. >> >> >>> We're not talking about releasing a Python 2.8 against anyone's wishes >>> here. >> >> Having "Python 2.8" in the name is going to be a PR nightmare. Other names >> have been suggested. > > But how is it going to be a PR nightmare? You and others can say it, > but is it reasonable to do so? Apparently some of us think so. ;) There isn't supposed to be a Python 2.8. If you create one, people will find it. There are other distributions that package and present Python x.y, so it would be reasonable to think that a Stackless Python 2.8 meant a regular Python 2.8.0 > No-one has ever mistaken us for Python. We're not, and never will be > high enough in the search results. Our web site is not, and has never > been misrepresentative in a way where people could assume it is the > proper Python web site. In our history of releasing versions with > matching numbers, no-one has yet mistaken us. I suspect you would find many more hits if you were the only ones to have a Python 2.8. See above for why that would be confusing. > Yet, we're told we should adopt wacky version numbers like 10, or > rename our project from Stackless Python. Sometimes we have to do wacky stuff to avoid unnecessary confusion. 
-- ~Ethan~ From solipsis at pitrou.net Fri Nov 22 21:06:23 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Nov 2013 21:06:23 +0100 Subject: [Python-Dev] PEP 0404 and VS 2010 References: <528F673B.3070505@stoneleaf.us> <528FA993.4090102@stoneleaf.us> Message-ID: <20131122210623.557088a5@fsol> On Fri, 22 Nov 2013 10:59:31 -0800 Ethan Furman wrote: > > > Yet, we're told we should adopt wacky version numbers like 10, or > > rename our project from Stackless Python. > > Sometimes we have to do wacky stuff to avoid unnecessary confusion. Or it can be "Stackless Python 2.7 Extended Life" or something. Regards Antoine. From solipsis at pitrou.net Fri Nov 22 21:07:02 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Nov 2013 21:07:02 +0100 Subject: [Python-Dev] cpython: Add note to asyncore/asynchat recommending asyncio for new code. References: <3dR7pG39RKz7LkT@mail.python.org> Message-ID: <20131122210702.51832a6f@fsol> On Fri, 22 Nov 2013 21:02:14 +0100 (CET) guido.van.rossum wrote: > http://hg.python.org/cpython/rev/db6ae01a5f7f > changeset: 87364:db6ae01a5f7f > user: Guido van Rossum > date: Fri Nov 22 11:57:35 2013 -0800 > summary: > Add note to asyncore/asynchat recommending asyncio for new code. > > files: > Doc/library/asynchat.rst | 3 +++ > Doc/library/asyncore.rst | 3 +++ > 2 files changed, 6 insertions(+), 0 deletions(-) > > > diff --git a/Doc/library/asynchat.rst b/Doc/library/asynchat.rst > --- a/Doc/library/asynchat.rst > +++ b/Doc/library/asynchat.rst > @@ -10,6 +10,9 @@ > > -------------- > > +Note: This module exists for backwards compatibility only. For new code we > +recommend using :module:`asyncio`. > + Notes should use the ".. note::" syntax. Regards Antoine. From solipsis at pitrou.net Fri Nov 22 21:47:57 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Nov 2013 21:47:57 +0100 Subject: [Python-Dev] cpython: Wording changes to pathlib docs. References: <3dR8lx362Kz7LjM@mail.python.org> Message-ID: <20131122214757.6a8417c2@fsol> On Fri, 22 Nov 2013 21:45:17 +0100 (CET) andrew.kuchling wrote: > http://hg.python.org/cpython/rev/cce14bc9b675 > changeset: 87371:cce14bc9b675 > user: Andrew Kuchling > date: Fri Nov 22 15:45:02 2013 -0500 > summary: > Wording changes to pathlib docs. > > Only possibly-controversial change: joinpath() was described as: > > "Calling this method is equivalent to indexing the path with each of > the *other* arguments in turn." > > 'Indexing' is an odd word to use, because you can't subscript Path or > PurePath objects, so I changed it to "combining". You're right, "indexing" dates back to when pathlib used subscripting to combine paths together (e.g. you would write my_path['Lib']['test'] or my_path['Lib/test'] to access the 'Lib/test' subpath of my_path), but now pathlib uses the '/' operator. Regards Antoine. 
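To illustrate the current spelling against the just-merged module, a minimal sketch (pure paths don't touch the filesystem, so this runs anywhere; the file name is made up for the example):

    from pathlib import PurePosixPath

    p = PurePosixPath('Lib')
    print(p / 'test')                         # Lib/test -- combined with '/'
    print(p.joinpath('test', 'test_foo.py'))  # the equivalent method spelling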
From Steve.Dower at microsoft.com Fri Nov 22 21:39:51 2013 From: Steve.Dower at microsoft.com (Steve Dower) Date: Fri, 22 Nov 2013 20:39:51 +0000 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528FACA6.1060504@v.loewis.de> References: <528D20FA.8040902@stackless.com> <20131120173044.77f6fcef@anarchist> <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu> <-4665162615556836517@unknownmsgid> <8d63cc3435f94110aca925330245321e@BLUPR03MB293.namprd03.prod.outlook.com> <528F3E49.9030800@v.loewis.de> <01b4d348507f4895a2b6b7e96c19b796@BLUPR03MB293.namprd03.prod.outlook.com> <528FACA6.1060504@v.loewis.de> Message-ID: <89e3e3ebad0a4a25bb055649a96d78de@BLUPR03MB293.namprd03.prod.outlook.com> > Martin v. Löwis wrote: > Am 22.11.13 19:10, schrieb Steve Dower: >> I'd really want to update distutils.msvc9compiler to detect later >> versions as well, since that would make 'pip install' work properly >> for a large (majority?) of users for a large (majority?) of packages >> with extension modules. Some may consider this PEP-worthy (or at least >> worth arguing about), though I'm happy to just contribute a patch. >> (Not referring to my existing patch for this - I have a far more >> compatible approach in mind.) > > A PEP on 2.7 seems questionable - if this would really need a PEP, it would be > right out (IMO). A PEP would ask for community input, weighing possibly > different design choices. Good point. Let's keep this as a patch :) > Instead, I think this needs explicit RM approval, such as any other borderline > bugfix patch. I'd personally support it, including any distutils change > (assuming the changes "look" backwards-compatible) - but it still would be for > Benjamin to rule about it. > >> There's probably also value in making the same changes to Python 3.4. > > Perhaps. However, Python 3.4 is likely being replaced before VS 2010 ends its > life, and people will be more quick to forward-port to 3.5. True, though people will still need to match VC versions precisely. I guess we can look at the changes to 2.7 and see how easily they can be ported. >> I'm also getting in touch with my colleague who currently owns MSVCRT >> to figure out the full scope of what may happen once we start allowing >> mismatched versions in the same process. > > There used to be an MSDN article about it, but I think it was incomplete. It > mentioned (IIRC) a) locale, b) malloc, c) struct FILE. > Not sure whether it mentioned CRT file handles, and I'm fairly sure that it > didn't mention errno. I also don't think that timezone issues were mentioned > (although there used to be a separate article about CRT timezones). That's basically the complete list. All the other concerns relate to mixing C++ and C, which doesn't apply here. The advice I've been given on FILE* is that there's probably no way to make it work correctly due to its internal buffering. Unfortunately, there are more places where this leaks through than just the APIs using them - extensions that call os.dup(fd), PyNumber_AsSsize_t() and pass the result to _fdopen() (for example, numpy) are simply going to break with mismatched fd's and there's no way to detect it at compile time. It's hard to tell how wide-ranging this sort of issue is going to be, but it certainly has the potential to break badly... > So if you can get somebody to compile a complete list, publishing it as a KB > article would certainly be appreciated beyond the Python project.
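A small runnable illustration of the os.dup() pattern in question -- the actual breakage only manifests inside an extension module built against a different CRT, so that step is left as a comment:

    import os

    with open('data.bin', 'wb') as f:
        f.write(b'example')

    with open('data.bin', 'rb') as f:
        fd = os.dup(f.fileno())  # a new fd in the CRT the interpreter links against
        # Handing 'fd' to an extension compiled against a *different* CRT --
        # which then calls _fdopen()/_read() on it -- is the failure mode
        # described above: each CRT keeps its own private fd table, so the
        # number is meaningless (or refers to something else) in the other CRT.
        os.close(fd)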
> >>> A patch for this would be appreciated - perhaps you would want to put >>> it into your sandbox on hg.python.org. >> >> I don't have a sandbox - how can I get one? > > You are not a Python committer yet, are you? If you are, go to > hg.python.org/cpython, and invoke the server-side clone. If you are not - does > your company agree if you would become one? Not yet, though I've signed a contributor agreement already. And I have explicit permission to make contributions along these lines. It seems I can make a sandbox clone without any authentication, which is very generous :), but I can't push yet. Cheers, Steve > In any case, patches or a clone on bitbucket would work just as well. > > Regards, > Martin From seotaewong40 at yahoo.co.kr Fri Nov 22 23:24:29 2013 From: seotaewong40 at yahoo.co.kr (TaeWong) Date: Sat, 23 Nov 2013 06:24:29 +0800 (SGT) Subject: [Python-Dev] Change Shtull-Trauring's name in Misc/ACKS Message-ID: <1385159069.79406.YahooMailNeo@web193101.mail.sg3.yahoo.com> Due to Shtull-Trauring's marriage, his name is now changed to Turner-Trauring. Please accept the suggestion to change his name in Misc/ACKS. From greg.ewing at canterbury.ac.nz Sat Nov 23 00:14:44 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 23 Nov 2013 12:14:44 +1300 Subject: [Python-Dev] PEP 0404 and VS 2010 In-Reply-To: <528FA993.4090102@stoneleaf.us> References: <528F673B.3070505@stoneleaf.us> <528FA993.4090102@stoneleaf.us> Message-ID: <528FE564.8070904@canterbury.ac.nz> Ethan Furman wrote: > [Redirecting to Python Dev] > > On 11/22/2013 10:25 AM, Richard Tew wrote: > >> Yet, we're told we should adopt wacky version numbers like 10, or >> rename our project from Stackless Python. > > Sometimes we have to do wacky stuff to avoid unnecessary confusion. Maybe imaginary version numbers could be used? Python 2.8j has a nice air of unreality about it. :-) -- Greg From larry at hastings.org Sat Nov 23 03:29:14 2013 From: larry at hastings.org (Larry Hastings) Date: Fri, 22 Nov 2013 18:29:14 -0800 Subject: [Python-Dev] Reminder: Less than 24 hours until 3.4 feature freeze Message-ID: <529012FA.20809@hastings.org> I hope to tag 3.4 in less than 24 hours from now. Last call, folks. Here's hoping the buildbots all get green soon, //arry/ From victor.stinner at gmail.com Sat Nov 23 13:02:37 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Sat, 23 Nov 2013 13:02:37 +0100 Subject: [Python-Dev] PEP 454 - tracemalloc - accepted In-Reply-To: References: Message-ID: Hi, 2013/11/21 Charles-François Natali : > I'm happy to officially accept PEP 454 aka tracemalloc. > The API has substantially improved over the past weeks, and is now > both easy to use and suitable as a foundation for high-level tools for > memory-profiling. I pushed the implementation of PEP 454 (tracemalloc): http://hg.python.org/cpython/rev/6e2089dbc5ad Just after that, I pushed a change in the API to fix a bug: http://hg.python.org/cpython/rev/66db0c66a6ee "Issue #18874: Remove tracemalloc.set_traceback_limit()" Changing the traceback limit while tracemalloc is tracing can lead to bugs or weird behaviour. So I added an optional nframe parameter to start() instead. I'm sorry, there are pending comments in the issue. I got many reviews in the last few days and didn't have time to address them yet.
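For reference, a minimal usage sketch of the module as committed, showing the optional nframe argument mentioned above:

    import tracemalloc

    tracemalloc.start(10)  # nframe=10: store up to 10 frames per traceback

    data = [bytes(1000) for _ in range(100)]  # allocate something to trace

    snapshot = tracemalloc.take_snapshot()
    for stat in snapshot.statistics('lineno')[:3]:
        print(stat)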
I preferred to push the code before beta1 because that release marks the
feature freeze, and I expect more feedback if the module is included in the
release. I consider that the comments don't change the public API (except the
removal of set_traceback_limit(), which is already done) and so can be
addressed later.
http://bugs.python.org/issue18874

Anyway, thanks to everyone who helped me to build a better tracemalloc
module! And don't hesitate to continue to review the code and test the
module.

I will try to write an article for the Python Insider blog to show some
examples of usage of the tracemalloc module. It would be nice to have similar
articles for the other PEPs implemented in Python 3.4, before the release of
Python 3.4!
http://blog.python.org/

Victor

From storchaka at gmail.com Sat Nov 23 14:32:58 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 23 Nov 2013 15:32:58 +0200
Subject: [Python-Dev] PEP 428 (pathlib) now committed
In-Reply-To: <20131122174455.2004b3ce@fsol>
References: <20131122174455.2004b3ce@fsol>
Message-ID:

On 22.11.13 18:44, Antoine Pitrou wrote:
> I've pushed pathlib to the repository. I'm hopeful there won't be
> new buildbot failures because of it, but still, there may be some
> platform-specific issues I'm unaware of.

Congratulations, Antoine!

Does it mean that issues #11344 (Add os.path.splitpath(path) function) [1]
and #13968 (Support recursive globs) [2] have no chance? Both are ready for
commit and have been waiting for reviews for almost a year. Are the os.path
and glob modules deprecated now?

[1] http://bugs.python.org/issue11344
[2] http://bugs.python.org/issue13968

From techtonik at gmail.com Sat Nov 23 16:15:08 2013
From: techtonik at gmail.com (anatoly techtonik)
Date: Sat, 23 Nov 2013 18:15:08 +0300
Subject: [Python-Dev] PEP 428 (pathlib) now committed
In-Reply-To: References: <20131122174455.2004b3ce@fsol>
Message-ID:

I'd vote for a different perspective on path handling. For me, pathlib is not
the right way to go, especially when it copies the ill behaviour of the old
os.path functions. We definitely need a "task force page" dedicated to
"working with paths in Python" where people can collaborate; a mailing list
plus a PEP with privileged write access is ineffective.

While I like the problem decomposition that I see in pathlib, I don't think
it supports the greatest power of Python: that it allows you not to think
about cross-platform differences -
http://clanmills.com/files/dist/doc/cross_platform.html#python-batteries-included
I've got the impression that while using pathlib you still have to care about
those.

I am +1 for iterating a few more times to find the best compromise between
the simplest possible and the most user-friendly (wart-free) interface. But
every iteration should be historically accessible, like a blog report, not
like a Mercurial PEP diff. There is still the problem of dedicating time,
though.
--
anatoly t.

On Sat, Nov 23, 2013 at 4:32 PM, Serhiy Storchaka wrote:
> On 22.11.13 18:44, Antoine Pitrou wrote:
>> I've pushed pathlib to the repository. I'm hopeful there won't be
>> new buildbot failures because of it, but still, there may be some
>> platform-specific issues I'm unaware of.
>
> Congratulations, Antoine!
>
> Does it mean that issues #11344 (Add os.path.splitpath(path) function) [1]
> and #13968 (Support recursive globs) [2] have no chance? Both are ready for
> commit and have been waiting for reviews for almost a year. Are the os.path
> and glob modules deprecated now?
> [1] http://bugs.python.org/issue11344
> [2] http://bugs.python.org/issue13968
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/techtonik%40gmail.com

From solipsis at pitrou.net Sat Nov 23 16:19:17 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 23 Nov 2013 16:19:17 +0100
Subject: [Python-Dev] PEP 428 (pathlib) now committed
References: <20131122174455.2004b3ce@fsol>
Message-ID: <20131123161917.0c912e06@fsol>

On Sat, 23 Nov 2013 15:32:58 +0200
Serhiy Storchaka wrote:
> On 22.11.13 18:44, Antoine Pitrou wrote:
> > I've pushed pathlib to the repository. I'm hopeful there won't be
> > new buildbot failures because of it, but still, there may be some
> > platform-specific issues I'm unaware of.
>
> Congratulations, Antoine!
>
> Does it mean that issues #11344 (Add os.path.splitpath(path) function)
> [1] and #13968 (Support recursive globs) [2] have no chance? Both are
> ready for commit and have been waiting for reviews for almost a year.
> Are the os.path and glob modules deprecated now?

They are not deprecated, no. I am not terribly interested in reviewing those
patches, personally, but other people may be :-)

Regards

Antoine.

From solipsis at pitrou.net Sat Nov 23 19:08:57 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 23 Nov 2013 19:08:57 +0100
Subject: [Python-Dev] Accepting PEP 3154 for 3.4?
References: <20131116191526.16f9b79c@fsol>
Message-ID: <20131123190857.2cb7b44b@fsol>

Hello,

I've now pushed Alexandre's implementation, and the PEP is marked final.

Regards

Antoine.

On Sat, 16 Nov 2013 19:15:26 +0100
Antoine Pitrou wrote:
>
> Hello,
>
> Alexandre Vassalotti (thanks a lot!) has recently finalized his work on
> the PEP 3154 implementation - pickle protocol 4.
>
> I think it would be good to get the PEP and the implementation accepted
> for 3.4. As far as I can say, this has been a low-controversy proposal,
> and it brings fairly obvious improvements to the table (which table?).
> I still need some kind of BDFL or BDFL delegate to do that, though --
> unless I am allowed to mark my own PEP accepted :-)
>
> (I've asked Tim, specifically, for comments, since he contributed a lot
> to previous versions of the pickle protocol.)
>
> The PEP is at http://www.python.org/dev/peps/pep-3154/ (should be
> rebuilt soon by the server, I'd say)
>
> Alexandre's implementation is tracked at
> http://bugs.python.org/issue17810
>
> Regards
>
> Antoine.

From greg at krypto.org Sat Nov 23 19:34:24 2013
From: greg at krypto.org (Gregory P. Smith)
Date: Sat, 23 Nov 2013 10:34:24 -0800
Subject: [Python-Dev] can someone create a buildbot slave name & password for me?
Message-ID:

I've got a nice ARM box to add to the pool (it should be a lot faster and
more reliable than warsaw-ubuntu-arm; hopefully making its way to stable).

proposed slave name: gps-ubuntu-exynos5-armv7l
description: Ubuntu ARM - Dual Cortex A15 2GiB (i.e. a repurposed Samsung ARM
Chromebook)

thanks,
-gps
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at python.org Sat Nov 23 21:31:53 2013
From: barry at python.org (Barry Warsaw)
Date: Sat, 23 Nov 2013 15:31:53 -0500
Subject: [Python-Dev] can someone create a buildbot slave name & password for me?
In-Reply-To: References: Message-ID: <20131123153153.49df2da2@limelight.wooz.org>

On Nov 23, 2013, at 10:34 AM, Gregory P. Smith wrote:

>I've got a nice ARM box to add to the pool (it should be a lot faster and
>more reliable than warsaw-ubuntu-arm; hopefully making its way to stable).

Once that's up, let me know. I'd like to eventually retire this box. It's
unsupported hardware and barely limping along.

-Barry

From tim.peters at gmail.com Sun Nov 24 00:18:37 2013
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 23 Nov 2013 17:18:37 -0600
Subject: [Python-Dev] [Python-checkins] cpython: asyncio: Change bounded semaphore into a subclass, like
In-Reply-To: <3dRqvn17KPz7Lq7@mail.python.org>
References: <3dRqvn17KPz7Lq7@mail.python.org>
Message-ID:

[guido]
> http://hg.python.org/cpython/rev/6bee0fdcba39
> changeset: 87468:6bee0fdcba39
> user: Guido van Rossum
> date: Sat Nov 23 15:09:16 2013 -0800
> summary:
> asyncio: Change bounded semaphore into a subclass, like threading.[Bounded]Semaphore.
>
> files:
> Lib/asyncio/locks.py | 36 ++++++++--------
> Lib/test/test_asyncio/test_locks.py | 2 +-
> 2 files changed, 20 insertions(+), 18 deletions(-)
>
>
> diff --git a/Lib/asyncio/locks.py b/Lib/asyncio/locks.py
> --- a/Lib/asyncio/locks.py
> +++ b/Lib/asyncio/locks.py
> @@ -336,22 +336,15 @@ ...
> +class BoundedSemaphore(Semaphore): ...
> + def release(self):
> + if self._value >= self._bound_value:
> + raise ValueError('BoundedSemaphore released too many times')
> + super().release()

If there's a lock and parallelism involved, this release() implementation is
vulnerable to races: any number of threads can see "self._value <
self._bound_value" before one of them manages to call the superclass
release(), and so self._value can become arbitrarily larger than
self._bound_value. I fixed the same bug in threading's similar bounded
semaphore class a few weeks ago.

From greg at krypto.org Sun Nov 24 08:01:10 2013
From: greg at krypto.org (Gregory P. Smith)
Date: Sat, 23 Nov 2013 23:01:10 -0800
Subject: [Python-Dev] can someone create a buildbot slave name & password for me?
In-Reply-To: <20131123153153.49df2da2@limelight.wooz.org>
References: <20131123153153.49df2da2@limelight.wooz.org>
Message-ID:

On Sat, Nov 23, 2013 at 12:31 PM, Barry Warsaw wrote:

> On Nov 23, 2013, at 10:34 AM, Gregory P. Smith wrote:
>
> >I've got a nice ARM box to add to the pool (it should be a lot faster and
> >more reliable than warsaw-ubuntu-arm; hopefully making its way to stable).
>
> Once that's up, let me know. I'd like to eventually retire this box. It's
> unsupported hardware and barely limping along.
>

It's up and running:
http://buildbot.python.org/all/buildslaves/gps-ubuntu-exynos5-armv7l
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From greg at krypto.org Sun Nov 24 08:13:32 2013
From: greg at krypto.org (Gregory P. Smith)
Date: Sat, 23 Nov 2013 23:13:32 -0800
Subject: [Python-Dev] buildbot's are needlessly compiling -O0
Message-ID:

our buildbots are set up to configure --with-pydebug, which also
unfortunately causes them to compile with -O0... this results in a python
binary that is excruciatingly slow and makes the entire test suite run take
a long time.

given that nobody is ever going to run gdb or another debugger on the
buildbot-generated transient binaries themselves, how about we speed all of
the buildbots up by adding CFLAGS=-O2 to the configure command line?
Sure, the compile step will take a bit longer but that is dwarfed by the test
time as it is:

http://buildbot.python.org/all/builders/AMD64%20Ubuntu%20LTS%203.x/builds/3224
http://buildbot.python.org/all/builders/ARMv7%203.x/builds/7
http://buildbot.python.org/all/builders/AMD64%20Snow%20Leop%203.x/builds/639

It should dramatically decrease the turnaround latency for buildbot results.

-gps
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com Sun Nov 24 10:39:46 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 24 Nov 2013 19:39:46 +1000
Subject: [Python-Dev] PEP 428 (pathlib) now committed
In-Reply-To: <20131123161917.0c912e06@fsol>
References: <20131122174455.2004b3ce@fsol> <20131123161917.0c912e06@fsol>
Message-ID:

On 24 Nov 2013 01:21, "Antoine Pitrou" wrote:
>
> On Sat, 23 Nov 2013 15:32:58 +0200
> Serhiy Storchaka wrote:
> > On 22.11.13 18:44, Antoine Pitrou wrote:
> > > I've pushed pathlib to the repository. I'm hopeful there won't be
> > > new buildbot failures because of it, but still, there may be some
> > > platform-specific issues I'm unaware of.
> >
> > Congratulations, Antoine!
> >
> > Does it mean that issues #11344 (Add os.path.splitpath(path) function)
> > [1] and #13968 (Support recursive globs) [2] have no chance? Both are
> > ready for commit and have been waiting for reviews for almost a year.
> > Are the os.path and glob modules deprecated now?
>
> They are not deprecated, no. I am not terribly interested in reviewing
> those patches, personally, but other people may be :-)

Right, pathlib is an abstraction layer on top of the lower level
implementation APIs, rather than a replacement for them.

Cheers,
Nick.

>
> Regards
>
> Antoine.
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com Sun Nov 24 10:43:18 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 24 Nov 2013 19:43:18 +1000
Subject: [Python-Dev] buildbot's are needlessly compiling -O0
In-Reply-To: References: Message-ID:

On 24 Nov 2013 17:15, "Gregory P. Smith" wrote:
>
> our buildbots are set up to configure --with-pydebug, which also
> unfortunately causes them to compile with -O0... this results in a python
> binary that is excruciatingly slow and makes the entire test suite run
> take a long time.
>
> given that nobody is ever going to run gdb or another debugger on the
> buildbot-generated transient binaries themselves, how about we speed all
> of the buildbots up by adding CFLAGS=-O2 to the configure command line?

The main problem is that doing so would disable test_gdb. Humans don't run
gdb on those binaries, but the test suite does.

I agree it would be nice to figure out a way to run most of the tests on an
optimised build, though.

Cheers,
Nick.

>
> Sure, the compile step will take a bit longer but that is dwarfed by the
> test time as it is:
>
> http://buildbot.python.org/all/builders/AMD64%20Ubuntu%20LTS%203.x/builds/3224
> http://buildbot.python.org/all/builders/ARMv7%203.x/builds/7
> http://buildbot.python.org/all/builders/AMD64%20Snow%20Leop%203.x/builds/639
>
> It should dramatically decrease the turnaround latency for buildbot results.
>
> -gps
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From solipsis at pitrou.net Sun Nov 24 13:39:49 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 24 Nov 2013 13:39:49 +0100
Subject: [Python-Dev] cpython: Disable annoying tests which doesn't work optimized pickles.
References: <3dRzfS13cMz7Ll8@mail.python.org>
Message-ID: <20131124133949.0f9f66c9@fsol>

On Sun, 24 Nov 2013 05:58:24 +0100 (CET)
alexandre.vassalotti wrote:
> http://hg.python.org/cpython/rev/a68c303eb8dc
> changeset: 87486:a68c303eb8dc
> user: Alexandre Vassalotti
> date: Sat Nov 23 20:58:24 2013 -0800
> summary:
> Disable annoying tests which doesn't work optimized pickles.

We should probably disable them only on optimized pickles, then :-)

Regards

Antoine.

From techtonik at gmail.com Sun Nov 24 15:12:48 2013
From: techtonik at gmail.com (anatoly techtonik)
Date: Sun, 24 Nov 2013 17:12:48 +0300
Subject: [Python-Dev] buildbot's are needlessly compiling -O0
In-Reply-To: References: Message-ID:

On Sun, Nov 24, 2013 at 12:43 PM, Nick Coghlan wrote:
>
> On 24 Nov 2013 17:15, "Gregory P. Smith" wrote:
>>
>> our buildbots are set up to configure --with-pydebug, which also
>> unfortunately causes them to compile with -O0... this results in a python
>> binary that is excruciatingly slow and makes the entire test suite run
>> take a long time.
>>
>> given that nobody is ever going to run gdb or another debugger on the
>> buildbot-generated transient binaries themselves, how about we speed all
>> of the buildbots up by adding CFLAGS=-O2 to the configure command line?
>
> The main problem is that doing so would disable test_gdb. Humans don't run
> gdb on those binaries, but the test suite does.

Is there a danger that the code tested under GDB is not tested in its
"natural environment" for Python?
--
anatoly t.

From eliben at gmail.com Sun Nov 24 15:19:14 2013
From: eliben at gmail.com (Eli Bendersky)
Date: Sun, 24 Nov 2013 06:19:14 -0800
Subject: [Python-Dev] buildbot's are needlessly compiling -O0
In-Reply-To: References: Message-ID:

On Sun, Nov 24, 2013 at 6:12 AM, anatoly techtonik wrote:

>> The main problem is that doing so would disable test_gdb. Humans don't
>> run gdb on those binaries, but the test suite does.
>
> Is there a danger that the code tested under GDB is not tested in its
> "natural environment" for Python?

What are you talking about? Have you actually looked at test_gdb before
writing this email?

Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From larry at hastings.org Sun Nov 24 16:38:21 2013
From: larry at hastings.org (Larry Hastings)
Date: Sun, 24 Nov 2013 07:38:21 -0800
Subject: [Python-Dev] Python 3.4.0b1 is now tagged, feature-freeze is now in effect
Message-ID: <52921D6D.4050502@hastings.org>

Please refrain from checking in any new features to Python trunk until after
the 3.4 release branch is created (which will be in a few months). Instead,
let's concentrate our efforts on polishing Python 3.4 until it's the best
and most-defect-free release yet!

Thanks,

//arry/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at python.org Sun Nov 24 17:45:14 2013
From: barry at python.org (Barry Warsaw)
Date: Sun, 24 Nov 2013 11:45:14 -0500
Subject: [Python-Dev] buildbot's are needlessly compiling -O0
In-Reply-To: References: Message-ID: <20131124114514.6f3a91da@limelight.wooz.org>

On Nov 23, 2013, at 11:13 PM, Gregory P. Smith wrote:

>our buildbots are set up to configure --with-pydebug, which also
>unfortunately causes them to compile with -O0... this results in a python
>binary that is excruciatingly slow and makes the entire test suite run take
>a long time.

It would be fine(-ish) to add this for improved buildbot performance, but
please do not change this for default --with-pydebug builds. When you're
debugging Python, -O0 just makes so much more sense.

-Barry

From barry at python.org Sun Nov 24 17:47:42 2013
From: barry at python.org (Barry Warsaw)
Date: Sun, 24 Nov 2013 11:47:42 -0500
Subject: [Python-Dev] can someone create a buildbot slave name & password for me?
In-Reply-To: References: <20131123153153.49df2da2@limelight.wooz.org>
Message-ID: <20131124114742.41d9a663@limelight.wooz.org>

On Nov 23, 2013, at 11:01 PM, Gregory P. Smith wrote:

>http://buildbot.python.org/all/buildslaves/gps-ubuntu-exynos5-armv7l

Cool, thanks. Antoine, do you still want or need my buildbot, or can I take
it off-line? (FWIW, because the hardware is no longer supported, it's pretty
much stuck at Ubuntu 12.10.)

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:

From solipsis at pitrou.net Sun Nov 24 18:02:49 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 24 Nov 2013 18:02:49 +0100
Subject: [Python-Dev] can someone create a buildbot slave name & password for me?
References: <20131123153153.49df2da2@limelight.wooz.org>
 <20131124114742.41d9a663@limelight.wooz.org>
Message-ID: <20131124180249.23fda67a@fsol>

On Sun, 24 Nov 2013 11:47:42 -0500
Barry Warsaw wrote:
> On Nov 23, 2013, at 11:01 PM, Gregory P. Smith wrote:
>
> >http://buildbot.python.org/all/buildslaves/gps-ubuntu-exynos5-armv7l
>
> Cool, thanks. Antoine, do you still want or need my buildbot, or can I
> take it off-line? (FWIW, because the hardware is no longer supported, it's
> pretty much stuck at Ubuntu 12.10.)

Well, your buildbot has already been off-line for something like a month :-)
http://buildbot.python.org/all/buildslaves/warsaw-ubuntu-arm

If the hardware is not supported anymore, and since the machine was rather
slow, I agree it's ok to let it go.

Regards

Antoine.

From barry at python.org Sun Nov 24 22:13:06 2013
From: barry at python.org (Barry Warsaw)
Date: Sun, 24 Nov 2013 16:13:06 -0500
Subject: [Python-Dev] can someone create a buildbot slave name & password for me?
In-Reply-To: <20131124180249.23fda67a@fsol>
References: <20131123153153.49df2da2@limelight.wooz.org>
 <20131124114742.41d9a663@limelight.wooz.org> <20131124180249.23fda67a@fsol>
Message-ID: <20131124161306.47558fff@limelight.wooz.org>

On Nov 24, 2013, at 06:02 PM, Antoine Pitrou wrote:

>If the hardware is not supported anymore, and since the machine was
>rather slow, I agree it's ok to let it go.

Done! (The machine's name was 'hope', so now we're hope-less :).

-Barry

From benhoyt at gmail.com Sun Nov 24 23:00:09 2013
From: benhoyt at gmail.com (Ben Hoyt)
Date: Mon, 25 Nov 2013 11:00:09 +1300
Subject: [Python-Dev] PEP 428 - pathlib API questions
Message-ID:

PEP 428 looks nice. Thanks, Antoine!

I have a couple of questions about the module name and API. I think I've
read through most of the previous discussion, but may have missed some, so
please point me to the right place if there have already been discussions
about these things.

1) Someone on reddit.com/r/Python asked "Is the import going to be
'pathlib'? I thought the renaming going on of std lib things with the
transition to Python 3 sought to remove the spurious usage of appending
'lib' to libs?" I wondered about this too. Has this been discussed/answered?

2) I think the operation of "suffix" and "suffixes" is good, but not so much
the name. I saw Ben Finney's original suggestion about multiple extensions
etc (https://mail.python.org/pipermail/python-ideas/2012-October/016437.html).

However, it seems there was no further discussion about why not "extension"
and "extensions"? I have never heard a filename extension being called a
"suffix". I know it is a suffix in the sense of the English word, but I've
never heard it called that in this context, and I think context is
important. Put another way, "extension" is obvious and guessable, "suffix"
isn't.

3) Obviously pathlib isn't going in the stdlib in Python 2.x, but I'm
wondering about writing portable code when you want the string version of
the path. In Python 3.x you'll call str(path_obj), but in Python 2.x that
will fail if the path has unicode chars in it, and you'll need to use
unicode(path_obj), which of course doesn't work in 3.x. Is this just a fact
of life, or would .str() or .as_string() help for 2.x/3.x portability?

4) Is path_obj.glob() recursive? In the PEP it looks like it is if the
pattern starts with '**', but in the pep428 branch of the code there are
both glob() and rglob() functions. I've never seen the ** syntax before
(though admittedly I'm a Windows dev), and much prefer the explicitness of
having two functions, or maybe even better, path_obj.glob('*.py',
recursive=True).

Seems much more Pythonic to provide an actual argument (or different
function) for this change in behaviour, rather than stuffing the "recursive
flag" inside the pattern string. Has this ship already sailed with
http://bugs.python.org/issue13968? Which I think should also be
rglob(pattern) or glob(pattern, recursive=True). Of course, if this ship has
already sailed, it's definitely better for pathlib's glob to match glob.glob.

Thanks,
Ben

From larry at hastings.org Sun Nov 24 23:00:16 2013
From: larry at hastings.org (Larry Hastings)
Date: Sun, 24 Nov 2013 14:00:16 -0800
Subject: [Python-Dev] [RELEASED] Python 3.4.0b1
Message-ID: <529276F0.3040105@hastings.org>

On behalf of the Python development team, it's my privilege to announce the
first beta release of Python 3.4.

This is a preview release, and its use is not recommended for production
settings.
Python 3.4 includes a range of improvements of the 3.x series, including hundreds of small improvements and bug fixes. Major new features and changes in the 3.4 release series include: * PEP 435, a standardized "enum" module * PEP 436, a build enhancement that will help generate introspection information for builtins * PEP 442, improved semantics for object finalization * PEP 443, adding single-dispatch generic functions to the standard library * PEP 445, a new C API for implementing custom memory allocators * PEP 446, changing file descriptors to not be inherited by default in subprocesses * PEP 450, a new "statistics" module * PEP 453, a bundled installer for the *pip* package manager * PEP 456, a new hash algorithm for Python strings and binary data * PEP 3154, a new and improved protocol for pickled objects * PEP 3156, a new "asyncio" module, a new framework for asynchronous I/O Python 3.4 is now in "feature freeze", meaning that no new features will be added. The final release is projected for late February 2014. To download Python 3.4.0b1 visit: http://www.python.org/download/releases/3.4.0/ Please consider trying Python 3.4.0b1 with your code and reporting any new issues you notice to: http://bugs.python.org/ Enjoy! -- Larry Hastings, Release Manager larry at hastings.org (on behalf of the entire python-dev team and 3.4's contributors) From benhoyt at gmail.com Sun Nov 24 23:20:08 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Mon, 25 Nov 2013 11:20:08 +1300 Subject: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info) Message-ID: Hi folks, I decided to start another thread for my thoughts on the interaction between pathlib (Antoine's new PEP 428), issue 11406 (proposal for a directory iterator returning stat-like info), and my own scandir library, which implements something along the lines of issue 11406. My scandir library (https://github.com/benhoyt/scandir) is something I've been working on for a while -- it provides a scandir() function which uses the OS's directory iterator functions to expose as much stat-like information as possible (readdir and FindFirstFile etc). This way functions like os.walk() can use the info (particularly "is_dir()") and not require tons of extra calls to os.stat(). This provides a huge speed boost for os.walk() in many cases: I've seen 3-4x on Linux, and up to 20x on Windows. (It depends on various things, not least of which is Windows' weird stat caching -- if I run my scandir benchmark "fresh", I get os.walk() running 8-9 times as fast as the built-in one. But if I run it after an un-hibernate, suddenly it runs 18-20 times as fast as the built-in one. Either way, huge gains, especially on Windows.) scandir.scandir() returns a DirEntry object, which has .isdir(), .isfile(), .islink(), and .lstat() attributes. Look familiar? When I was reading PEP 428 and saw .is_file(), .is_dir(), and .stat(), I thought -- surely I can merge this with pathlib and Path objects. The first thing I can do to scandir is rename my isdir() type attributes to match PEP 428's, so that DirEntry quacks like a Path object where it can. However, I'm wondering if I can change scandir to return actual Path objects. Or better, because Path already helpfully provides iterdir() which yields Path objects, and Path objects have .is_dir() etc, can scandir()-like behaviour simply work out-of-the-box? This mainly depends on how Path is going to cache stat information. If it caches it, then this will just work. 
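To make the potential win concrete, a scandir-based walk looks roughly like
this (a sketch only -- using the method names my library currently has, and
assuming entry.name holds the filename; symlink handling is omitted):

    import os
    import scandir  # https://github.com/benhoyt/scandir

    def walk(top):
        dirs, nondirs = [], []
        for entry in scandir.scandir(top):
            # isdir() is answered from the readdir()/FindFirstFile() data
            # the OS already returned -- no extra os.stat() call needed.
            if entry.isdir():
                dirs.append(entry.name)
            else:
                nondirs.append(entry.name)
        yield top, dirs, nondirs
        for name in dirs:
            for result in walk(os.path.join(top, name)):
                yield result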
Sounds like Guido's opinion was that both cached and uncached use cases are
important, but that it should be very clear which one you're getting. I
personally like the .stat() and .restat() idea.

The other related thing is that DirEntry only provides .lstat(), because
it's providing stat-like info without following links.

Note in this context that it's not just "network filesystems" on which
stat() is slow (https://mail.python.org/pipermail/python-dev/2013-May/125805.html).
It's quite slow on Windows under various conditions too.

See also Nick Coghlan's post about a DirEntry-style object on the issue
11406 thread: https://mail.python.org/pipermail/python-dev/2013-May/126148.html

Thoughts and suggestions for how to merge scandir with pathlib's approach?
It's important to me that pathlib's API doesn't cut itself off from a more
efficient implementation of the ideas from issue 11406 and scandir...

Thanks,
Ben.

From larry at hastings.org Sun Nov 24 23:28:31 2013
From: larry at hastings.org (Larry Hastings)
Date: Sun, 24 Nov 2013 14:28:31 -0800
Subject: [Python-Dev] [RELEASED] Python 3.4.0b1
In-Reply-To: <529276F0.3040105@hastings.org>
References: <529276F0.3040105@hastings.org>
Message-ID: <52927D8F.9000506@hastings.org>

On 11/24/2013 02:00 PM, Larry Hastings wrote:
> Python 3.4 includes a range of improvements of the 3.x series, including
> hundreds of small improvements and bug fixes. Major new features and
> changes in the 3.4 release series include:

Whoops, sorry, I missed a couple of PEPs there:

* PEP 428, a "pathlib" module providing object-oriented filesystem paths
* PEP 451, standardizing module metadata for Python's module import system
* PEP 454, a new "tracemalloc" module for tracing Python memory allocations

They're on the web site already, and they'll be in the next announcement.
Sorry for the oversight!

//arry/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From solipsis at pitrou.net Sun Nov 24 23:29:21 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 24 Nov 2013 23:29:21 +0100
Subject: [Python-Dev] PEP 428 - pathlib API questions
References: Message-ID: <20131124232921.44f7865a@fsol>

Hello,

On Mon, 25 Nov 2013 11:00:09 +1300
Ben Hoyt wrote:
>
> 1) Someone on reddit.com/r/Python asked "Is the import going to be
> 'pathlib'? I thought the renaming going on of std lib things with the
> transition to Python 3 sought to remove the spurious usage of
> appending 'lib' to libs?" I wondered about this too. Has this been
> discussed/answered?

Well, "path" is much too common already, and it's an obvious variable name
for a filesystem path, so "pathlib" is better to avoid name clashes.

> 2) I think the operation of "suffix" and "suffixes" is good, but not
> so much the name. I saw Ben Finney's original suggestion about
> multiple extensions etc
> (https://mail.python.org/pipermail/python-ideas/2012-October/016437.html).
>
> However, it seems there was no further discussion about why not
> "extension" and "extensions"? I have never heard a filename extension
> being called a "suffix". I know it is a suffix in the sense of the
> English word, but I've never heard it called that in this context, and
> I think context is important. Put another way, "extension" is obvious
> and guessable, "suffix" isn't.

Well, perhaps :-), but nobody opposed suffix and suffixes at the time. Note
the API is provisional, so we can still make it change, but obviously the
barrier for changes is higher now that the PEP is accepted and the beta has
been cut.
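For reference, this is what the provisional API currently gives you:

    >>> from pathlib import PurePath
    >>> p = PurePath('archive.tar.gz')
    >>> p.suffix
    '.gz'
    >>> p.suffixes
    ['.tar', '.gz']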
> 3) Obviously pathlib isn't going in the stdlib in Python 2.x, but I'm
> wondering about writing portable code when you want the string version
> of the path. In Python 3.x you'll call str(path_obj), but in Python
> 2.x that will fail if the path has unicode chars in it, and you'll
> need to use unicode(path_obj), which of course doesn't work in 3.x.

The behaviour of unicode paths in Python 2 is erratic (system-dependent).
pathlib can't really fix it: Python 2 doesn't know about a well-defined
filesystem encoding.

> 4) Is path_obj.glob() recursive?

This is documented:
http://docs.python.org/dev/library/pathlib.html#pathlib.Path.glob
http://docs.python.org/dev/library/pathlib.html#pathlib.Path.rglob

> Seems much more Pythonic to provide an actual argument (or different
> function) for this change in behaviour, rather than stuffing the
> "recursive flag" inside the pattern string.

It's not a flag, it's a different wildcard. This allows e.g. a library
function to call glob() and users to pass a recursive or non-recursive
pattern as they wish.

> Has this ship already sailed with http://bugs.python.org/issue13968?

This issue is still open, so no :-)

Regards

Antoine.

From solipsis at pitrou.net Sun Nov 24 23:44:34 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 24 Nov 2013 23:44:34 +0100
Subject: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)
References: Message-ID: <20131124234434.37390cb2@fsol>

On Mon, 25 Nov 2013 11:20:08 +1300
Ben Hoyt wrote:
>
> This mainly depends on how Path is going to cache stat information. If
> it caches it, then this will just work.

Right now, pathlib doesn't cache. Guido decided it was safer to start off
like that, and perhaps later we can add some optional caching.

One reason caching didn't go in is that it's not clear which API is best.
Working on plugging scandir() into pathlib would actually help choosing a
stat-caching API.

(or, rather, lstat-caching...)

> The other related thing is that DirEntry only provides .lstat(),
> because it's providing stat-like info without following links.

Path.is_dir() and friends use stat(), i.e. they inform you about whether a
symlink's target is a directory (not the symlink itself). Of course, if the
DirEntry says the path is a symlink, Path.is_dir() could then run stat() to
find out about the target.

Do you plan to propose scandir() for inclusion in the stdlib?

Regards

Antoine.

From benhoyt at gmail.com Mon Nov 25 00:03:38 2013
From: benhoyt at gmail.com (Ben Hoyt)
Date: Mon, 25 Nov 2013 12:03:38 +1300
Subject: [Python-Dev] PEP 428 - pathlib API questions
In-Reply-To: <20131124232921.44f7865a@fsol>
References: <20131124232921.44f7865a@fsol>
Message-ID:

> Well, "path" is much too common already, and it's an obvious variable
> name for a filesystem path, so "pathlib" is better to avoid name
> clashes.

Yep, that makes total sense, thanks.

>> However, it seems there was no further discussion about why not
>> "extension" and "extensions"? I have never heard a filename extension
>> being called a "suffix". I know it is a suffix in the sense of the
>> English word, but I've never heard it called that in this context, and
>> I think context is important. Put another way, "extension" is obvious
>> and guessable, "suffix" isn't.
>
> Well, perhaps :-), but nobody opposed suffix and suffixes at the time.
> Note the API is provisional, so we can still make it change, but
> obviously the barrier for changes is higher now that the PEP is
> accepted and the beta has been cut.

Okay. I won't push hard :-) as "suffix" isn't terrible, but has anyone else
never (or rarely) heard the term "suffix" applied to filename extensions?

>> 3) Obviously pathlib isn't going in the stdlib in Python 2.x, but I'm
>> wondering about writing portable code when you want the string version
>> of the path. In Python 3.x you'll call str(path_obj), but in Python
>> 2.x that will fail if the path has unicode chars in it, and you'll
>> need to use unicode(path_obj), which of course doesn't work in 3.x.
>
> The behaviour of unicode paths in Python 2 is erratic
> (system-dependent). pathlib can't really fix it: Python 2 doesn't know
> about a well-defined filesystem encoding.

Fair enough.

>> 4) Is path_obj.glob() recursive?
>
> This is documented:
> http://docs.python.org/dev/library/pathlib.html#pathlib.Path.glob
> http://docs.python.org/dev/library/pathlib.html#pathlib.Path.rglob
>
>> Seems much more Pythonic to provide an actual argument (or different
>> function) for this change in behaviour, rather than stuffing the
>> "recursive flag" inside the pattern string.
>
> It's not a flag, it's a different wildcard. This allows e.g. a library
> function to call glob() and users to pass a recursive or non-recursive
> pattern as they wish.

Okay, just saw those docs now -- thanks. Fair enough re "it's a different
wildcard".

At the least I don't think there should be two ways to do it -- in other
words, either rglob() or glob('**'); having both seems very un-PEP 20 to me.
My preference is rglob(), but glob(recursive=True) would be fine too.

>> Has this ship already sailed with http://bugs.python.org/issue13968?
>
> This issue is still open, so no :-)

Same goes for this issue -- there should be OOWTDI, and my preference is
rglob() or glob(recursive=True). But maybe issue 13968's behaviour can be
determined by pathlib's now that pathlib is the one getting done first.

Thanks,
Ben.

From benhoyt at gmail.com Mon Nov 25 00:04:28 2013
From: benhoyt at gmail.com (Ben Hoyt)
Date: Mon, 25 Nov 2013 12:04:28 +1300
Subject: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)
In-Reply-To: <20131124234434.37390cb2@fsol>
References: <20131124234434.37390cb2@fsol>
Message-ID:

> Right now, pathlib doesn't cache. Guido decided it was safer to start
> off like that, and perhaps later we can add some optional caching.
>
> One reason caching didn't go in is that it's not clear which API is
> best. Working on plugging scandir() into pathlib would actually help
> choosing a stat-caching API.
>
> (or, rather, lstat-caching...)
>
>> The other related thing is that DirEntry only provides .lstat(),
>> because it's providing stat-like info without following links.
>
> Path.is_dir() and friends use stat(), i.e. they inform you about
> whether a symlink's target is a directory (not the symlink itself). Of
> course, if the DirEntry says the path is a symlink, Path.is_dir() could
> then run stat() to find out about the target.
>
> Do you plan to propose scandir() for inclusion in the stdlib?

Yes, I was hoping to propose adding "os.scandir() -> yields DirEntry
objects" for inclusion into the stdlib, and also speed up os.walk() as a
result.
However, pathlib's API with .is_dir() and .lstat() etc are so close to DirEntry, I'd be much keener to roll up the scandir functionality into pathlib's iterdir(), as that's already going in the standard library, and iterdir() already returns Path objects. I'm just not sure it's possible or useful without stat caching. We could do Path.lstat(cached=True), but we'd also really want is_dir(cached=True), so that API kinda sucks. Alternatively you could have iterdir(cached=True) return PathWithCachedStat style objects -- probably better, but kinda messy. For these reasons, I would much prefer stat caching on by default in Path -- in my experience, the cached behaviour is desired much much more often than the non-cached. I've written directory walkers more often than I can count, whereas I've maybe only once written a long-running process that needs to re-stat, and if it's clearly documented as cached, then it's super easy to call restat(), or create a new Path instance to get new stat info. This would allow iterdir() to take advantage of the huge performance improvements you can get when walking directories. Guido, are you at all open to reconsidering the uncached-by-default in light of this? -Ben From greg.ewing at canterbury.ac.nz Mon Nov 25 00:06:59 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 25 Nov 2013 12:06:59 +1300 Subject: [Python-Dev] PEP 428 - pathlib API questions In-Reply-To: References: Message-ID: <52928693.9020803@canterbury.ac.nz> Ben Hoyt wrote: > However, it seems there was no further discussion about why not > "extension" and "extensions"? I have never heard a filename extension > being called a "suffix". You can't have read many unix man pages, then! I just searched for "suffix" in the gcc man page, and found this: For any given input file, the file name suffix determines what kind of compilation is done: > I know it is a suffix in the sense of the > English word, but I've never heard it called that in this context, and > I think context is important. This probably depends on your background. In my experience, the term "extension" arose in OSes where it was a formal part of the filename syntax, often highly constrained. E.g. RT11, CP/M, early MS-DOS. Unix has never had a formal notion of extensions like that, only informal conventions, and has called them suffixes at least some of the time for as long as I can remember. > 4) Is path_obj.glob() recursive? In the PEP it looks like it is if the > pattern starts with '**', I don't think it has to *start* with **. Rather, the ** is a pattern that can span directory separators. It's not a flag that applies to the whole thing -- a pattern could have a * in one place and a ** in another. -- Greg From benhoyt at gmail.com Mon Nov 25 00:12:51 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Mon, 25 Nov 2013 12:12:51 +1300 Subject: [Python-Dev] PEP 428 - pathlib API questions In-Reply-To: <52928693.9020803@canterbury.ac.nz> References: <52928693.9020803@canterbury.ac.nz> Message-ID: >> However, it seems there was no further discussion about why not >> "extension" and "extensions"? I have never heard a filename extension >> being called a "suffix". > > > You can't have read many unix man pages, then! Huh, no I haven't! Certainly not regularly, as I'm almost exclusively a Windows user. :-) > This probably depends on your background. In my experience, > the term "extension" arose in OSes where it was a formal > part of the filename syntax, often highly constrained. > E.g. RT11, CP/M, early MS-DOS. 
> Unix has never had a formal notion of extensions like that,
> only informal conventions, and has called them suffixes at
> least some of the time for as long as I can remember.

Yes, seems like it definitely is background-dependent. I'm Windows-centric.
I stand corrected, and recant my position on "suffix". :-)

>> 4) Is path_obj.glob() recursive? In the PEP it looks like it is if the
>> pattern starts with '**',
>
> I don't think it has to *start* with **. Rather, the ** is
> a pattern that can span directory separators. It's not a
> flag that applies to the whole thing -- a pattern could have
> a * in one place and a ** in another.

Oh okay, that makes more sense. It definitely needs more thorough
documentation in that case. I would still prefer the simpler and more
explicit rglob() / recursive=True rather than new pattern syntax, but I
don't feel as strongly anymore.

-Ben

From solipsis at pitrou.net Mon Nov 25 00:13:05 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 25 Nov 2013 00:13:05 +0100
Subject: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)
In-Reply-To: References: <20131124234434.37390cb2@fsol>
Message-ID: <20131125001305.4ac3ab77@fsol>

On Mon, 25 Nov 2013 12:04:28 +1300
Ben Hoyt wrote:
> > Right now, pathlib doesn't cache. Guido decided it was safer to start
> > off like that, and perhaps later we can add some optional caching.
> >
> > One reason caching didn't go in is that it's not clear which API is
> > best. Working on plugging scandir() into pathlib would actually help
> > choosing a stat-caching API.
> >
> > (or, rather, lstat-caching...)
> >
> >> The other related thing is that DirEntry only provides .lstat(),
> >> because it's providing stat-like info without following links.
> >
> > Path.is_dir() and friends use stat(), i.e. they inform you about
> > whether a symlink's target is a directory (not the symlink itself). Of
> > course, if the DirEntry says the path is a symlink, Path.is_dir() could
> > then run stat() to find out about the target.
> >
> > Do you plan to propose scandir() for inclusion in the stdlib?
>
> Yes, I was hoping to propose adding "os.scandir() -> yields DirEntry
> objects" for inclusion into the stdlib, and also speed up os.walk() as
> a result.
>
> However, pathlib's API with .is_dir() and .lstat() etc are so close to
> DirEntry, I'd be much keener to roll up the scandir functionality into
> pathlib's iterdir(), as that's already going in the standard library,
> and iterdir() already returns Path objects.

We could still expose scandir() as a low-level API, *and* call it in pathlib
for optimizations.

> We could do Path.lstat(cached=True), but we'd also really want
> is_dir(cached=True), so that API kinda sucks. Alternatively you could
> have iterdir(cached=True) return PathWithCachedStat style objects --
> probably better, but kinda messy.

Perhaps Path.enable_caching()? It would enable caching not only on this path
object, but on all objects constructed from it.

Regards

Antoine.

From nad at acm.org Mon Nov 25 00:14:28 2013
From: nad at acm.org (Ned Deily)
Date: Sun, 24 Nov 2013 15:14:28 -0800
Subject: [Python-Dev] [RELEASED] Python 3.4.0b1
References: <529276F0.3040105@hastings.org>
Message-ID:

In article <529276F0.3040105 at hastings.org>, Larry Hastings wrote:
> On behalf of the Python development team, it's my privilege to announce
> the first beta release of Python 3.4.
>
> This is a preview release, and its use is not recommended for
> production settings.
Note to users of the python.org Mac OS X binary installers: if you have installed earlier preview releases of Python 3.4, be aware that the batteries-included built-in Tcl/Tk library support introduced in 3.4.0a2 has been reverted in 3.4.0b1 because it was found to break some third-party packages. As is the case with earlier releases of Python, if you use the python.org 64-bit installer for OS X, you will again need to have a compatible third-party copy of Tcl/Tk 8.5 installed, such as ActiveTcl 8.5.15.1, to avoid the problematic system versions shipped in OS X 10.6+. See http://www.python.org/download/mac/tcltk/ for more information. -- Ned Deily, nad at acm.org From ncoghlan at gmail.com Mon Nov 25 00:18:56 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 25 Nov 2013 09:18:56 +1000 Subject: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info) In-Reply-To: References: <20131124234434.37390cb2@fsol> Message-ID: On 25 Nov 2013 09:07, "Ben Hoyt" wrote: > > > Right now, pathlib doesn't cache. Guido decided it was safer to start > > off like that, and perhaps later we can add some optional caching. > > > > One reason caching didn't go in is that it's not clear which API is > > best. Working on pluggin scandir() into pathlib would actually help > > choosing a stat-caching API. > > > > (or, rather, lstat-caching...) > > > >> The other related thing is that DirEntry only provides .lstat(), > >> because it's providing stat-like info without following links. > > > > Path.is_dir() and friends use stat(), i.e. they inform you about > > whether a symlink's target is a directory (not the symlink itself). Of > > course, if the DirEntry says the path is a symlink, Path.is_dir() could > > then run stat() to find out about the target. > > > > Do you plan to propose scandir() for inclusion in the stdlib? > > Yes, I was hoping to propose adding "os.scandir() -> yields DirEntry > objects" for inclusion into the stdlib, and also speed up os.walk() as > a result. > > However, pathlib's API with .is_dir() and .lstat() etc are so close to > DirEntry, I'd be much keener to roll up the scandir functionality into > pathlib's iterdir(), as that's already going in the standard library, > and iterdir() already returns Path objects. > > I'm just not sure it's possible or useful without stat caching. > > We could do Path.lstat(cached=True), but we'd also really want > is_dir(cached=True), so that API kinda sucks. Alternatively you could > have iterdir(cached=True) return PathWithCachedStat style objects -- > probably better, but kinda messy. > > For these reasons, I would much prefer stat caching on by default in > Path -- in my experience, the cached behaviour is desired much much > more often than the non-cached. I've written directory walkers more > often than I can count, whereas I've maybe only once written a > long-running process that needs to re-stat, and if it's clearly > documented as cached, then it's super easy to call restat(), or create > a new Path instance to get new stat info. > > This would allow iterdir() to take advantage of the huge performance > improvements you can get when walking directories. > > Guido, are you at all open to reconsidering the uncached-by-default in > light of this? No, caching on the object is dangerously unintuitive - it means two Path objects can compare equal, but give different answers for stat-dependent queries. 
A global string (or Path) keyed cache (rather than a per-object cache) would actually be a safer option, since it would ensure distinct path objects always gave the same answer. That's the approach I will likely pursue at some point in walkdir. It's also quite likely the "rich stat object" API will be pursued for 3.5, which is a much safer approach to stat result caching than trying to embed it directly in pathlib.Path objects. That's why we decided to punt on the caching question until 3.5 - it's better to provide a predictable building block that doesn't provide caching, and then work out how to provide a sensible caching layer on top of that, rather than trying to rush a potentially flawed caching design that leads to inconsistent behaviour. Cheers, Nick. > > -Ben > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Nov 25 00:22:19 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 24 Nov 2013 15:22:19 -0800 Subject: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info) In-Reply-To: References: <20131124234434.37390cb2@fsol> Message-ID: On Sun, Nov 24, 2013 at 3:04 PM, Ben Hoyt wrote: > > Right now, pathlib doesn't cache. Guido decided it was safer to start > > off like that, and perhaps later we can add some optional caching. > > > > One reason caching didn't go in is that it's not clear which API is > > best. Working on pluggin scandir() into pathlib would actually help > > choosing a stat-caching API. > > > > (or, rather, lstat-caching...) > > > >> The other related thing is that DirEntry only provides .lstat(), > >> because it's providing stat-like info without following links. > > > > Path.is_dir() and friends use stat(), i.e. they inform you about > > whether a symlink's target is a directory (not the symlink itself). Of > > course, if the DirEntry says the path is a symlink, Path.is_dir() could > > then run stat() to find out about the target. > > > > Do you plan to propose scandir() for inclusion in the stdlib? > > Yes, I was hoping to propose adding "os.scandir() -> yields DirEntry > objects" for inclusion into the stdlib, and also speed up os.walk() as > a result. > > However, pathlib's API with .is_dir() and .lstat() etc are so close to > DirEntry, I'd be much keener to roll up the scandir functionality into > pathlib's iterdir(), as that's already going in the standard library, > and iterdir() already returns Path objects. > > I'm just not sure it's possible or useful without stat caching. > > We could do Path.lstat(cached=True), but we'd also really want > is_dir(cached=True), so that API kinda sucks. Alternatively you could > have iterdir(cached=True) return PathWithCachedStat style objects -- > probably better, but kinda messy. > > For these reasons, I would much prefer stat caching on by default in > Path -- in my experience, the cached behaviour is desired much much > more often than the non-cached. I've written directory walkers more > often than I can count, whereas I've maybe only once written a > long-running process that needs to re-stat, and if it's clearly > documented as cached, then it's super easy to call restat(), or create > a new Path instance to get new stat info. 
> This would allow iterdir() to take advantage of the huge performance
> improvements you can get when walking directories.
>
> Guido, are you at all open to reconsidering the uncached-by-default in
> light of this?

I think we should think hard and deep about all the consequences. I was
initially in favor of stat caching, but during offline review of PEP 428
Nick pointed out that there are too many different ways to do stat caching,
and convinced me that it would be wrong to rush it. Now that beta 1 is out I
really don't want to reconsider this -- we really need to stick to the plan.

The ship has likewise sailed for adding scandir() (whether to os or
pathlib). By all means experiment and get it ready for consideration for
3.5, but I don't want to add it to 3.4.

In general I think there are some tough choices regarding stat caching. You
already brought up stat vs. lstat -- there's also the issue of what to do if
[l]stat fails -- do we cache the exception?

IMO, the current incarnation is for convenience, correctness and
cross-platform semantics -- three C's. The next incarnation can add a fourth
C, caching.

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From benhoyt at gmail.com Mon Nov 25 00:31:36 2013
From: benhoyt at gmail.com (Ben Hoyt)
Date: Mon, 25 Nov 2013 12:31:36 +1300
Subject: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)
In-Reply-To: References: <20131124234434.37390cb2@fsol>
Message-ID:

Antoine's class-global flag seems like a bad idea.

> A global string (or Path) keyed cache (rather than a per-object cache) would
> actually be a safer option, since it would ensure distinct path objects
> always gave the same answer. That's the approach I will likely pursue at
> some point in walkdir.

Interesting approach. This wouldn't really solve the problem for scandir /
DirEntry / performance issues, but it's a fair idea in general.

> It's also quite likely the "rich stat object" API will be pursued for 3.5,
> which is a much safer approach to stat result caching than trying to embed
> it directly in pathlib.Path objects.

As a Windows dev, I'm not sure I love the "rich stat object" idea, because
stat_result objects are sooo Posixy. On Windows, (some of) the file
attribute info is stuffed into a stat_result struct. Which kinda works, but
I like how Path exposes the higher-level, cross-platform stuff like
.is_dir() so that most of the time you don't need to worry about stat. (You
still need to worry about caching, though.)

> That's why we decided to punt on the caching question until 3.5 - it's
> better to provide a predictable building block that doesn't provide caching,
> and then work out how to provide a sensible caching layer on top of that,
> rather than trying to rush a potentially flawed caching design that leads to
> inconsistent behaviour.

Yep, agreed about rushing in a potentially flawed caching design. But I also
don't want to "rush in" a design that prohibits scandir()-style performance
optimizations -- though I guess it can still go in there one way or the
other.

"Worst case", we can add os.scandir() separately, which returns DirEntry,
"path-like" objects.
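For what it's worth, here's how I picture the external, path-keyed cache
Nick described -- purely a hypothetical sketch with made-up names, nothing
that exists today:

    import os

    class StatCache:
        # Results are keyed by the path string, so two equal paths always
        # give the same answer -- unlike per-object caching.
        def __init__(self):
            self._results = {}

        def stat(self, path):
            key = str(path)
            try:
                return self._results[key]
            except KeyError:
                result = self._results[key] = os.stat(key)
                return result

        def clear(self):
            self._results.clear()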
-Ben From benhoyt at gmail.com Mon Nov 25 00:34:01 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Mon, 25 Nov 2013 12:34:01 +1300 Subject: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info) In-Reply-To: References: <20131124234434.37390cb2@fsol> Message-ID: > I think we should think hard and deep about all the consequences. I was > initially in favor of stat caching, but during offline review of PEP 428 > Nick pointed out that there are too many different ways to do stat caching, > and convinced me that it would be wrong to rush it. Now that beta 1 is out I > really don't want to reconsider this -- we really need to stick to the plan. Fair call, and thanks for the response. > The ship has likewise sailed for adding scandir() (whether to os or > pathlib). By all means experiment and get it ready for consideration for > 3.5, but I don't want to add it to 3.4. Yes, I was definitely thinking about 3.5 at this stage. :-) What would be the next step for getting something like os.scandir() added for 3.5 -- a PEP referencing the various issues? > In general I think there are some tough choices regarding stat caching. You > already brought up stat vs. lstat -- there's also the issue of what to do if > [l]stat fails -- do we cache the exception? > > IMO, the current incarnation is for convenience, correctness and > cross-platform semantics -- three C's. The next incarnation can add a fourth > C, caching. Three/four C's, I like it! -Ben From ncoghlan at gmail.com Mon Nov 25 00:35:29 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 25 Nov 2013 09:35:29 +1000 Subject: [Python-Dev] PEP 428 - pathlib API questions In-Reply-To: References: <52928693.9020803@canterbury.ac.nz> Message-ID: On 25 Nov 2013 09:14, "Ben Hoyt" wrote: > > >> 4) Is path_obj.glob() recursive? In the PEP it looks like it is if the > >> pattern starts with '**', > > > > > > I don't think it has to *start* with **. Rather, the ** is > > a pattern that can span directory separators. It's not a > > flag that applies to the whole thing -- a pattern could have > > a * in one place and a ** in another. > > Oh okay, that makes more sense. It definitely needs more thorough > documentation in that case. I would still prefer the simpler and more > explicit rglob() / recursive=True rather than pattern new syntax, but > I don't feel as strongly anymore. Using "**" for directory spanning globs is also another case of us borrowing a reasonably common idiom from *nix systems that may not be familiar to Windows users. Cheers, Nick. > > -Ben > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From benhoyt at gmail.com Mon Nov 25 00:42:08 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Mon, 25 Nov 2013 12:42:08 +1300 Subject: [Python-Dev] PEP 428 - pathlib API questions In-Reply-To: References: <52928693.9020803@canterbury.ac.nz> Message-ID: > Using "**" for directory spanning globs is also another case of us borrowing > a reasonably common idiom from *nix systems that may not be familiar to > Windows users. Okay, *nix wins then. :-) Python's stdlib is already fairly *nix-oriented (even when it's being cross-platform), so I guess it's not a big deal. 
My only remaining concern then is that there shouldn't be more than one way to do recursive globbing in a new API like this. Why does rglob() exist when the documentation simply says "like calling glob() but with '**' added in front of the pattern"? http://docs.python.org/dev/library/pathlib.html#pathlib.Path.rglob -Ben From ncoghlan at gmail.com Mon Nov 25 00:53:44 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 25 Nov 2013 09:53:44 +1000 Subject: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info) In-Reply-To: References: <20131124234434.37390cb2@fsol> Message-ID: On 25 Nov 2013 09:31, "Ben Hoyt" wrote: > > > It's also quite likely the "rich stat object" API will be pursued for 3.5, > > which is a much safer approach to stat result caching than trying to embed > > it directly in pathlib.Path objects. > > As a Windows dev, I'm not sure I love the "rich stat object idea", > because stat_result objects are sooo Posixy. On Windows, (some of) the > file attribute info is stuffed into a stat_result struct. Which kinda > works, but I like how Path exposes the higher-level, cross-platform > stuff like .is_dir() so that most of the time you don't need to worry > about stat. (You still need to worry about caching, though.) The idea of the rich stat result object is that has all that info prepopulated, based on an initial stat call. "Caching" it amounts to "keep a reference to it". It is suggested that it would be a subset of the pathlib.Path API: http://bugs.python.org/issue19725 If it's also a superset of the existing stat object API, then at least Path.stat and Path.lstat (and perhaps the lower level APIs) can be updated to return it in 3.5. > > That's why we decided to punt on the caching question until 3.5 - it's > > better to provide a predictable building block that doesn't provide caching, > > and then work out how to provide a sensible caching layer on top of that, > > rather than trying to rush a potentially flawed caching design that leads to > > inconsistent behaviour. > > Yep, agreed about rushing in a potentially flawed caching design. But > I also don't want to "rush in" a design that prohibits scandir()-style > performance optimizations -- though I guess it can still go in there > one way or the other. Yeah, the realisation that an initial non-caching approach didn't lock us out of external caching may not have been well communicated to the list. I was discussing the walkdir integration possibilities with Antoine and Guido and realised I would likely still need an external cache, even if pathlib had its own internal caching. At that point, it seemed highly desirable to duck the caching question entirely. > "Worst case", we can add os.scandir() separately, which return > DirEntry, "path-like" objects. Indeed, we may still want such an object API, since dirent doesn't provide full stat info. A PEP reviewing all this for 3.5 and proposing a specific os.scandir API would be a good thing. Cheers, Nick. > > -Ben -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Mon Nov 25 04:15:57 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 25 Nov 2013 13:15:57 +1000 Subject: [Python-Dev] PEP 428 - pathlib API questions In-Reply-To: References: <52928693.9020803@canterbury.ac.nz> Message-ID: On 25 Nov 2013 09:42, "Ben Hoyt" wrote: > > > Using "**" for directory spanning globs is also another case of us borrowing > > a reasonably common idiom from *nix systems that may not be familiar to > > Windows users. > > Okay, *nix wins then. :-) Python's stdlib is already fairly > *nix-oriented (even when it's being cross-platform), so I guess it's > not a big deal. > > My only remaining concern then is that there shouldn't be more than > one way to do recursive globbing in a new API like this. Why does > rglob() exist when the documentation simply says "like calling glob() > but with '**' added in front of the pattern"? > > http://docs.python.org/dev/library/pathlib.html#pathlib.Path.rglob Because it's a layered API - embedding ** in the pattern is a strictly more powerful interface, but can be a little tricky to get your head around (especially if you don't use a shell that has the feature). rglob() is simpler, but not as flexible. We offer that kind of multi-level API fairly often. For example, subprocess.call() and friends are simpler interfaces for particular ways of using the powerful-but-complex subprocess.Popen API. The metaprogramming stack (functions, classes, decorators, descriptors, metaclasses) similarly offers the ability to trade increased complexity for increases in power and flexibility. In these cases, the "obvious way" is to use the simplest API that covers the use case, and only reach for the more complex API when you genuinely need it. Cheers, Nick. > > -Ben -------------- next part -------------- An HTML attachment was scrubbed... URL: From benhoyt at gmail.com Mon Nov 25 04:18:47 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Mon, 25 Nov 2013 16:18:47 +1300 Subject: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info) In-Reply-To: References: <20131124234434.37390cb2@fsol> Message-ID: > The idea of the rich stat result object is that has all that info > prepopulated, based on an initial stat call. "Caching" it amounts to "keep a > reference to it". > > It is suggested that it would be a subset of the pathlib.Path API: > http://bugs.python.org/issue19725 > > If it's also a superset of the existing stat object API, then at least > Path.stat and Path.lstat (and perhaps the lower level APIs) can be updated > to return it in 3.5. Got it. >> "Worst case", we can add os.scandir() separately, which return >> DirEntry, "path-like" objects. > > Indeed, we may still want such an object API, since dirent doesn't provide > full stat info. I'm not quite sure what you're suggesting here. In any case, I'm going to modify my scandir() so its DirEntry objects are closer to pathlib.Path, particularly: * isdir() -> is_dir() * isfile() -> is_file() * islink() -> is_symlink() * add is_socket(), is_fifo(), is_block_device(), and is_char_device() I'm considering removing DirEntry's .dirent attribute entirely. The above is_* functions cover everything in .dirent.d_type in a much more Pythonic and cross-platform way, and the only other info in .dirent is d_ino -- can a non-Windows dev tell me how or when d_ino would be useful? If it's useful, is it useful in a higher-level, cross-platform API such as scandir()? Hmmm, I wonder about this "rich stat object" idea in light of the above. 
Do the methods on pathlib.Path basically supersede the need for
this? Because otherwise folks will always be wondering whether to say
"path.is_dir()" or "path.stat().is_dir" ... two ways to do it, right
next to each other. So I'd prefer to add the "rich" stuff on the
higher-level Path instead of the lower-level stat.

> A PEP reviewing all this for 3.5 and proposing a specific os.scandir API
> would be a good thing.

Thanks, I'll definitely consider writing a PEP.

-Ben

From storchaka at gmail.com  Mon Nov 25 08:51:02 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 25 Nov 2013 09:51:02 +0200
Subject: [Python-Dev] PEP 428 - pathlib API questions
In-Reply-To:
References: <52928693.9020803@canterbury.ac.nz>
Message-ID:

On 25.11.13 01:35, Nick Coghlan wrote:
> Using "**" for directory spanning globs is also another case of us
> borrowing a reasonably common idiom from *nix systems that may not be
> familiar to Windows users.

Rather from the Java world.

From p.f.moore at gmail.com  Mon Nov 25 08:58:02 2013
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 25 Nov 2013 07:58:02 +0000
Subject: [Python-Dev] pathlib and issue 11406 (a directory iterator
 returning stat-like info)
In-Reply-To:
References: <20131124234434.37390cb2@fsol>
Message-ID:

On 25 November 2013 03:18, Ben Hoyt wrote:
> d_ino -- can a non-Windows dev tell me how or when d_ino would be
> useful? If it's useful, is it useful in a higher-level, cross-platform
> API such as scandir()?

OK, so I'm a Windows dev, but my understanding is that d_ino is useful
to tell if two files are identical - hard links to the same physical
file have the same d_ino value. I don't believe it's possible to do
this on Windows at all.

I've seen it used in tools like diff, to short-circuit doing the
actual diff if you know from a stat that the 2 files are the same.

Paul

From cf.natali at gmail.com  Mon Nov 25 09:12:37 2013
From: cf.natali at gmail.com (Charles-François Natali)
Date: Mon, 25 Nov 2013 09:12:37 +0100
Subject: [Python-Dev] PEP 428 - pathlib API questions
In-Reply-To: <52928693.9020803@canterbury.ac.nz>
References: <52928693.9020803@canterbury.ac.nz>
Message-ID:

2013/11/25 Greg Ewing :
> Ben Hoyt wrote:
>>
>> However, it seems there was no further discussion about why not
>> "extension" and "extensions"? I have never heard a filename extension
>> being called a "suffix".
>
> You can't have read many unix man pages, then! I just
> searched for "suffix" in the gcc man page, and found
> this:
>
>     For any given input file, the file name suffix determines what kind of
>     compilation is done:
>
>> I know it is a suffix in the sense of the
>> English word, but I've never heard it called that in this context, and
>> I think context is important.
>
> This probably depends on your background. In my experience,
> the term "extension" arose in OSes where it was a formal
> part of the filename syntax, often highly constrained.
> E.g. RT11, CP/M, early MS-DOS.
>
> Unix has never had a formal notion of extensions like that,
> only informal conventions, and has called them suffixes at
> least some of the time for as long as I can remember.

Indeed. Just for reference, here's an extract of the POSIX basename(1)
man page [1]:

"""
SYNOPSIS
    basename string [suffix]

DESCRIPTION
    The string operand shall be treated as a pathname, as defined in XBD
    Pathname. The string string shall be converted to the filename
    corresponding to the last pathname component in string and then the
    suffix string suffix, if present, shall be removed.
""" [1] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/basename.html cf From ncoghlan at gmail.com Mon Nov 25 09:33:01 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 25 Nov 2013 18:33:01 +1000 Subject: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info) In-Reply-To: References: <20131124234434.37390cb2@fsol> Message-ID: On 25 Nov 2013 13:18, "Ben Hoyt" wrote: > > > The idea of the rich stat result object is that has all that info > > prepopulated, based on an initial stat call. "Caching" it amounts to "keep a > > reference to it". > > > > It is suggested that it would be a subset of the pathlib.Path API: > > http://bugs.python.org/issue19725 > > > > If it's also a superset of the existing stat object API, then at least > > Path.stat and Path.lstat (and perhaps the lower level APIs) can be updated > > to return it in 3.5. > > Got it. > > >> "Worst case", we can add os.scandir() separately, which return > >> DirEntry, "path-like" objects. > > > > Indeed, we may still want such an object API, since dirent doesn't provide > > full stat info. > > I'm not quite sure what you're suggesting here. > > In any case, I'm going to modify my scandir() so its DirEntry objects > are closer to pathlib.Path, particularly: > > * isdir() -> is_dir() > * isfile() -> is_file() > * islink() -> is_symlink() > * add is_socket(), is_fifo(), is_block_device(), and is_char_device() > > I'm considering removing DirEntry's .dirent attribute entirely. The > above is_* functions cover everything in .dirent.d_type in a much more > Pythonic and cross-platform way, and the only other info in .dirent is > d_ino -- can a non-Windows dev tell me how or when d_ino would be > useful? If it's useful, is it useful in a higher-level, cross-platform > API such as scandir()? > > Hmmm, I wonder about this "rich stat object" idea in light of the > above. Do the methods on pathlib.Path basically supercede the need for > this? Because otherwise folks will always be wondering whether to say > "path.is_dir()" or "path.stat().is_dir" ... two ways to do it, right > next to each other. So I'd prefer to add the "rich" stuff on the > higher-level Path instead of the lower-level stat. The rich stat API proposal exists precisely to provide a clean way to do stat result caching - path objects always give immediate data, stat objects give cached answers. The direct APIs on Path would just become a trivial shortcut once a rich stat APIs existed - you could use the long form if you wanted to, but it would be pointless to do so. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Nov 25 09:46:26 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 25 Nov 2013 18:46:26 +1000 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Fix test_fcntl to run properly on systems that do not support the flags In-Reply-To: <3dSbKV48lqz7Ljx@mail.python.org> References: <3dSbKV48lqz7Ljx@mail.python.org> Message-ID: On 25 Nov 2013 14:46, "gregory.p.smith" wrote: > > http://hg.python.org/cpython/rev/cac7319c5972 > changeset: 87537:cac7319c5972 > branch: 2.7 > parent: 87534:3981e57a7bdc > user: Gregory P. Smith > date: Mon Nov 25 04:45:27 2013 +0000 > summary: > Fix test_fcntl to run properly on systems that do not support the flags > used in the "does the value get passed in properly" test. 
> > files: > Lib/test/test_fcntl.py | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > > diff --git a/Lib/test/test_fcntl.py b/Lib/test/test_fcntl.py > --- a/Lib/test/test_fcntl.py > +++ b/Lib/test/test_fcntl.py > @@ -113,7 +113,10 @@ > self.skipTest("F_NOTIFY or DN_MULTISHOT unavailable") > fd = os.open(os.path.dirname(os.path.abspath(TESTFN)), os.O_RDONLY) > try: > + # This will raise OverflowError if issue1309352 is present. > fcntl.fcntl(fd, cmd, flags) > + except IOError: > + pass # Running on a system that doesn't support these flags. > finally: > os.close(fd) Raising a skip is generally preferred to marking a test that can't be executed as passing. Cheers, Nick. > > > -- > Repository URL: http://hg.python.org/cpython > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > https://mail.python.org/mailman/listinfo/python-checkins > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimjjewett at gmail.com Mon Nov 25 16:04:21 2013 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 25 Nov 2013 10:04:21 -0500 Subject: [Python-Dev] [Python-checkins] cpython: Close #19762: Fix name of _get_traces() and _get_object_traceback() function In-Reply-To: <3dShPg2yw6z7LmL@mail.python.org> References: <3dShPg2yw6z7LmL@mail.python.org> Message-ID: Why are these functions (get_traces and get_object_traceback) private? (1) Is the whole module provisional? At one point, I had thought so, but I don't see that in the PEP or implementation. (I'm not sure that it should be provisional, but I want to be sure that the decision is intentional.) (2) This implementation does lock in certain choices about the nature of traces. (What data to include for analysis vs excluding to save memory; which events are tracked separately and which combined into a single total; organizing the data that is saved in a hash by certain keys; etc) While I would prefer more flexibility, the existing code provides a reasonable default, and I can't forsee changing traces so much that these functions *can't* be reasonably supported unless the rest of the module API changes too. (3) get_object_traceback is the killer app that justifies the specific data-collection choices Victor made; if it isn't public, the implementation starts to look overbuilt. (4) get_traces is about the only way to get at even the all the data that *is* stored, prior to additional summarization. If it isn't public, those default summarization options become even more locked in.. -jJ On Mon, Nov 25, 2013 at 3:34 AM, victor.stinner wrote: > http://hg.python.org/cpython/rev/2e2ec595dc58 > changeset: 87551:2e2ec595dc58 > user: Victor Stinner > date: Mon Nov 25 09:33:18 2013 +0100 > summary: > Close #19762: Fix name of _get_traces() and _get_object_traceback() > function > name in their docstring. Patch written by Vajrasky Kok. 
> > files: > Modules/_tracemalloc.c | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > > diff --git a/Modules/_tracemalloc.c b/Modules/_tracemalloc.c > --- a/Modules/_tracemalloc.c > +++ b/Modules/_tracemalloc.c > @@ -1018,7 +1018,7 @@ > } > > PyDoc_STRVAR(tracemalloc_get_traces_doc, > - "get_traces() -> list\n" > + "_get_traces() -> list\n" > "\n" > "Get traces of all memory blocks allocated by Python.\n" > "Return a list of (size: int, traceback: tuple) tuples.\n" > @@ -1083,7 +1083,7 @@ > } > > PyDoc_STRVAR(tracemalloc_get_object_traceback_doc, > - "get_object_traceback(obj)\n" > + "_get_object_traceback(obj)\n" > "\n" > "Get the traceback where the Python object obj was allocated.\n" > "Return a tuple of (filename: str, lineno: int) tuples.\n" > > -- > Repository URL: http://hg.python.org/cpython > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > https://mail.python.org/mailman/listinfo/python-checkins > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From victor.stinner at gmail.com Mon Nov 25 16:23:22 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 25 Nov 2013 16:23:22 +0100 Subject: [Python-Dev] [Python-checkins] cpython: Close #19762: Fix name of _get_traces() and _get_object_traceback() function In-Reply-To: References: <3dShPg2yw6z7LmL@mail.python.org> Message-ID: 2013/11/25 Jim Jewett : > Why are these functions (get_traces and get_object_traceback) private? _get_object_traceback() is wrapped to get a nice Python object: http://hg.python.org/cpython/file/6ec6facb69ca/Lib/tracemalloc.py#l208 _get_traces() is private, it is used internally by take_snapshot(): http://hg.python.org/cpython/file/6ec6facb69ca/Lib/tracemalloc.py#l455 So it's possible to modify the low-level (private) structure used in the C module without touching the high-level (public) Python API. > (1) Is the whole module provisional? At one point, I had thought so, but I > don't see that in the PEP or implementation. (I'm not sure that it should > be provisional, but I want to be sure that the decision is intentional.) I don't know. > (2) This implementation does lock in certain choices about the nature of > traces. (What data to include for analysis vs excluding to save memory; > which events are tracked separately and which combined into a single total; > organizing the data that is saved in a hash by certain keys; etc) > > While I would prefer more flexibility, the existing code provides a > reasonable default, and I can't forsee changing traces so much that these > functions *can't* be reasonably supported unless the rest of the module API > changes too. Sorry, I don't see which kind of information is "excluded" to save memory. Maybe you read an old version of the PEP? About "events": tracemalloc doesn't store functions calls as event, there is no timestamp. I didn't try to implement that, and I doesn't want to. If you develop it (maybe on top of tracemalloc, I mean by modify _tracemalloc.c), I would be interested to see your code and test it :-) > (3) get_object_traceback is the killer app that justifies the specific > data-collection choices Victor made; if it isn't public, the implementation > starts to look overbuilt. 
It is public -- see the doc and the PEP:
http://docs.python.org/dev/library/tracemalloc.html#tracemalloc.get_object_traceback
http://www.python.org/dev/peps/pep-0454/

> (4) get_traces is about the only way to get at even the all the data that
> *is* stored, prior to additional summarization. If it isn't public, those
> default summarization options become even more locked in..

Snapshot.traces contains exactly the same data as _get_traces(). In a
previous version of the PEP/code, statistics were computed while traces
were collected. That's no longer the case. Now the API only collects raw
data, and then you call Snapshot.statistics() or Snapshot.compare_to().

Victor

From benhoyt at gmail.com  Mon Nov 25 20:45:07 2013
From: benhoyt at gmail.com (Ben Hoyt)
Date: Tue, 26 Nov 2013 08:45:07 +1300
Subject: [Python-Dev] pathlib and issue 11406 (a directory iterator
 returning stat-like info)
In-Reply-To:
References: <20131124234434.37390cb2@fsol>
Message-ID:

> OK, so I'm a Windows dev, but my understanding is that d_ino is useful
> to tell if two files are identical - hard links to the same physical
> file have the same d_ino value. I don't believe it's possible to do
> this on Windows at all.
>
> I've seen it used in tools like diff, to short-circuit doing the
> actual diff if you know from a stat that the 2 files are the same.

Okay, that helps -- thanks. So the inode number is probably not all
that useful in this context at all. Because it doesn't come with the
device, you don't know whether it's unique (from the posixpath.samestat
source, it looks like a file is only unique if the inode and device
numbers are both equal).

So I think I'm going to drop .dirent entirely, and just expose the
d_type information via the is_* functions.

I'm not sure about is_socket(), is_fifo(), is_block_device(),
is_char_device(). I'm tempted to just leave them off, as I think
they'll basically never be used ... their stat counterparts are
exceedingly rare in the stdlib, so if you really want that, just use
.lstat().

-Ben

From Steve.Dower at microsoft.com  Mon Nov 25 22:11:26 2013
From: Steve.Dower at microsoft.com (Steve Dower)
Date: Mon, 25 Nov 2013 21:11:26 +0000
Subject: [Python-Dev] PEP 0404 and VS 2010
References: <528D20FA.8040902@stackless.com>
 <20131120173044.77f6fcef@anarchist>
 <20131121121536.Horde.HqxpYYJwWGy4QGMBkK0Fvg1@webmail.df.eu>
 <-4665162615556836517@unknownmsgid>
 <8d63cc3435f94110aca925330245321e@BLUPR03MB293.namprd03.prod.outlook.com>
 <528F3E49.9030800@v.loewis.de>
 <01b4d348507f4895a2b6b7e96c19b796@BLUPR03MB293.namprd03.prod.outlook.com>
 <528FACA6.1060504@v.loewis.de>
Message-ID: <2f49d1fa6b8343a99fcef2cff7fa2ee6@BLUPR03MB293.namprd03.prod.outlook.com>

Steve Dower wrote:
> The advice I've been given on FILE* is that there's probably no way to make it
> work correctly due to its internal buffering. Unfortunately, there are more
> places where this leaks through than just the APIs using them - extensions that
> call os.dup(fd), PyNumber_AsSsize_t() and pass the result to _fdopen() (for
> example, numpy) are simply going to break with mismatched fd's and there's no
> way to detect it at compile time. It's hard to tell how wide-ranging this sort
> of issue is going to be, but it certainly has the potential to break badly...

After thinking about this and looking into it, I think the breakage
caused by this sort of code is so bad that we should be discouraging it.
The internal buffering, especially on stdin/stdout/stderr, will wreak havoc on any extensions that use them, and code that casts fds to ints within Python will simply crash. The loss of confidence here may be irrecoverable - I don't think we should be making it easy for people to get into this situation. We could make it opt-in for extension modules, but I think that situation is worse than the current one. The best solution is always going to be for users to install VS 2008, or at least VC9 (I'm working on making this easier, but it requires getting the attention of a lot of busy people...). Any thoughts? Cheers, Steve From greg at krypto.org Tue Nov 26 08:07:44 2013 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 25 Nov 2013 23:07:44 -0800 Subject: [Python-Dev] [Python-checkins] cpython (2.7): Fix test_fcntl to run properly on systems that do not support the flags In-Reply-To: References: <3dSbKV48lqz7Ljx@mail.python.org> Message-ID: On Mon, Nov 25, 2013 at 12:46 AM, Nick Coghlan wrote: > > On 25 Nov 2013 14:46, "gregory.p.smith" > wrote: > > > > http://hg.python.org/cpython/rev/cac7319c5972 > > changeset: 87537:cac7319c5972 > > branch: 2.7 > > parent: 87534:3981e57a7bdc > > user: Gregory P. Smith > > date: Mon Nov 25 04:45:27 2013 +0000 > > summary: > > Fix test_fcntl to run properly on systems that do not support the flags > > used in the "does the value get passed in properly" test. > > > > files: > > Lib/test/test_fcntl.py | 3 +++ > > 1 files changed, 3 insertions(+), 0 deletions(-) > > > > > > diff --git a/Lib/test/test_fcntl.py b/Lib/test/test_fcntl.py > > --- a/Lib/test/test_fcntl.py > > +++ b/Lib/test/test_fcntl.py > > @@ -113,7 +113,10 @@ > > self.skipTest("F_NOTIFY or DN_MULTISHOT unavailable") > > fd = os.open(os.path.dirname(os.path.abspath(TESTFN)), > os.O_RDONLY) > > try: > > + # This will raise OverflowError if issue1309352 is present. > > fcntl.fcntl(fd, cmd, flags) > > + except IOError: > > + pass # Running on a system that doesn't support these > flags. > > finally: > > os.close(fd) > > Raising a skip is generally preferred to marking a test that can't be > executed as passing. > For this test this is still a pass because no OverflowError was raised. -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Tue Nov 26 18:30:10 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 26 Nov 2013 09:30:10 -0800 Subject: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching Message-ID: Hello, argparse does prefix matching as long as there are no conflicts. For example: argparser = argparse.ArgumentParser() argparser.add_argument('--sync-foo', action='store_true') args = argparser.parse_args() If I pass "--sync" to this script, it recognizes it as "--sync-foo". This behavior is quite surprising although I can see the motivation for it. At the very least it should be much more explicitly documented (AFAICS it's barely mentioned in the docs). If there's another argument registered, say "--sync-bar" the above will fail due to a conflict. Now comes the nasty part. When using "parse_known_args" instead of "parse_args", the above happens too - --sync is recognized for --sync-foo and captured by the parser. But this is wrong! The whole idea of parse_known_args is to parse the known args, leaving unknowns alone. 
This prefix matching harms more than it helps here because maybe the program we're actually acting as a front-end for (and hence using parse_known_args) knows about --sync and wants to get it. Unless I'm missing something, this is a bug. But I'm also not sure whether we can do anything about it at this point, as existing code *may* be relying on it. The right thing to do would be to disable this prefix matching when parse_known_args is called. Again, at the very least this should be documented (for parse_known_args not less than a warning box, IMHO). Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Nov 26 18:38:40 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 26 Nov 2013 09:38:40 -0800 Subject: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching In-Reply-To: References: Message-ID: I think matching on the shortest unique prefix is common for command line parsers in general, not just argparse. I believe optparse did this too, and even the venerable getopt does! I think all this originated in the original (non-Python) GNU standard for long option parsing. All that probably explains why the docs hardly touch upon it. As to why parse_known_args also does this, I can see the reasoning behind this behavior: to the end user, "--sync" is a valid option, so it would be surprising if it didn't get recognized under certain conditions. I suppose you were badly bitten by this recently? Can you tell us more about what happened? On Tue, Nov 26, 2013 at 9:30 AM, Eli Bendersky wrote: > Hello, > > argparse does prefix matching as long as there are no conflicts. For > example: > > argparser = argparse.ArgumentParser() > argparser.add_argument('--sync-foo', action='store_true') > args = argparser.parse_args() > > If I pass "--sync" to this script, it recognizes it as "--sync-foo". This > behavior is quite surprising although I can see the motivation for it. At > the very least it should be much more explicitly documented (AFAICS it's > barely mentioned in the docs). > > If there's another argument registered, say "--sync-bar" the above will > fail due to a conflict. > > Now comes the nasty part. When using "parse_known_args" instead of > "parse_args", the above happens too - --sync is recognized for --sync-foo > and captured by the parser. But this is wrong! The whole idea of > parse_known_args is to parse the known args, leaving unknowns alone. This > prefix matching harms more than it helps here because maybe the program > we're actually acting as a front-end for (and hence using parse_known_args) > knows about --sync and wants to get it. > > Unless I'm missing something, this is a bug. But I'm also not sure whether > we can do anything about it at this point, as existing code *may* be > relying on it. The right thing to do would be to disable this prefix > matching when parse_known_args is called. > > Again, at the very least this should be documented (for parse_known_args > not less than a warning box, IMHO). > > Eli > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdmurray at bitdance.com Tue Nov 26 18:46:09 2013 From: rdmurray at bitdance.com (R. 
David Murray) Date: Tue, 26 Nov 2013 12:46:09 -0500 Subject: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching In-Reply-To: References: Message-ID: <20131126174610.6851A250848@webabinitio.net> On Tue, 26 Nov 2013 09:30:10 -0800, Eli Bendersky wrote: > Hello, > > argparse does prefix matching as long as there are no conflicts. For > example: > > argparser = argparse.ArgumentParser() > argparser.add_argument('--sync-foo', action='store_true') > args = argparser.parse_args() > > If I pass "--sync" to this script, it recognizes it as "--sync-foo". This > behavior is quite surprising although I can see the motivation for it. At > the very least it should be much more explicitly documented (AFAICS it's > barely mentioned in the docs). > > If there's another argument registered, say "--sync-bar" the above will > fail due to a conflict. > > Now comes the nasty part. When using "parse_known_args" instead of > "parse_args", the above happens too - --sync is recognized for --sync-foo > and captured by the parser. But this is wrong! The whole idea of > parse_known_args is to parse the known args, leaving unknowns alone. This > prefix matching harms more than it helps here because maybe the program > we're actually acting as a front-end for (and hence using parse_known_args) > knows about --sync and wants to get it. > > Unless I'm missing something, this is a bug. But I'm also not sure whether > we can do anything about it at this point, as existing code *may* be > relying on it. The right thing to do would be to disable this prefix > matching when parse_known_args is called. > > Again, at the very least this should be documented (for parse_known_args > not less than a warning box, IMHO). I'm not sure whether or not it is a bug, but there are a number of bugs involving argument prefix matching in the tracker. Including one to be able to disable it, that has a patch but didn't get into 3.4 (issue 14910). This topic seems more suited to the bug tracker than python-dev. Stephen hasn't been very active lately. There is at least one non-committer who has been, but no committers have made time to review the patches and take action (they are on my list, but there is a lot of stuff ahead of them). So if you'd like to help out in that regard, that would be great :) --David From eliben at gmail.com Tue Nov 26 18:46:21 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 26 Nov 2013 09:46:21 -0800 Subject: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching In-Reply-To: References: Message-ID: On Tue, Nov 26, 2013 at 9:38 AM, Guido van Rossum wrote: > I think matching on the shortest unique prefix is common for command line > parsers in general, not just argparse. I believe optparse did this too, and > even the venerable getopt does! I think all this originated in the original > (non-Python) GNU standard for long option parsing. All that probably > explains why the docs hardly touch upon it. > > As to why parse_known_args also does this, I can see the reasoning behind > this behavior: to the end user, "--sync" is a valid option, so it would be > surprising if it didn't get recognized under certain conditions. > > I suppose you were badly bitten by this recently? Can you tell us more > about what happened? > Sure. We have a Python script that serves as a gateway to another program. That other program has a "--sync" option. The gateway script has a "--sync-foo" option. 
When the gateway script is invoked with "--sync", we'd expect it to pass it to the program; instead, it matches it to its own "--sync-foo" and consumes the option. Practically, this means a big caveat on exactly the use case parse_known_args was designed for: whenever I have a Python script using argparse and passing unknown arguments to other programs, I have to manually make sure there are no common prefixes between any commands to avoid this problem. Frankly I don't see how the current behavior can be seen as the intended one. Eli > > > On Tue, Nov 26, 2013 at 9:30 AM, Eli Bendersky wrote: > >> Hello, >> >> argparse does prefix matching as long as there are no conflicts. For >> example: >> >> argparser = argparse.ArgumentParser() >> argparser.add_argument('--sync-foo', action='store_true') >> args = argparser.parse_args() >> >> If I pass "--sync" to this script, it recognizes it as "--sync-foo". This >> behavior is quite surprising although I can see the motivation for it. At >> the very least it should be much more explicitly documented (AFAICS it's >> barely mentioned in the docs). >> >> If there's another argument registered, say "--sync-bar" the above will >> fail due to a conflict. >> >> Now comes the nasty part. When using "parse_known_args" instead of >> "parse_args", the above happens too - --sync is recognized for --sync-foo >> and captured by the parser. But this is wrong! The whole idea of >> parse_known_args is to parse the known args, leaving unknowns alone. This >> prefix matching harms more than it helps here because maybe the program >> we're actually acting as a front-end for (and hence using parse_known_args) >> knows about --sync and wants to get it. >> >> Unless I'm missing something, this is a bug. But I'm also not sure >> whether we can do anything about it at this point, as existing code *may* >> be relying on it. The right thing to do would be to disable this prefix >> matching when parse_known_args is called. >> >> Again, at the very least this should be documented (for parse_known_args >> not less than a warning box, IMHO). >> >> Eli >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> https://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> https://mail.python.org/mailman/options/python-dev/guido%40python.org >> >> > > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Tue Nov 26 18:55:07 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 26 Nov 2013 09:55:07 -0800 Subject: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching In-Reply-To: <20131126174610.6851A250848@webabinitio.net> References: <20131126174610.6851A250848@webabinitio.net> Message-ID: On Tue, Nov 26, 2013 at 9:46 AM, R. David Murray wrote: > On Tue, 26 Nov 2013 09:30:10 -0800, Eli Bendersky > wrote: > > Hello, > > > > argparse does prefix matching as long as there are no conflicts. For > > example: > > > > argparser = argparse.ArgumentParser() > > argparser.add_argument('--sync-foo', action='store_true') > > args = argparser.parse_args() > > > > If I pass "--sync" to this script, it recognizes it as "--sync-foo". This > > behavior is quite surprising although I can see the motivation for it. At > > the very least it should be much more explicitly documented (AFAICS it's > > barely mentioned in the docs). 
> > > > If there's another argument registered, say "--sync-bar" the above will > > fail due to a conflict. > > > > Now comes the nasty part. When using "parse_known_args" instead of > > "parse_args", the above happens too - --sync is recognized for --sync-foo > > and captured by the parser. But this is wrong! The whole idea of > > parse_known_args is to parse the known args, leaving unknowns alone. This > > prefix matching harms more than it helps here because maybe the program > > we're actually acting as a front-end for (and hence using > parse_known_args) > > knows about --sync and wants to get it. > > > > Unless I'm missing something, this is a bug. But I'm also not sure > whether > > we can do anything about it at this point, as existing code *may* be > > relying on it. The right thing to do would be to disable this prefix > > matching when parse_known_args is called. > > > > Again, at the very least this should be documented (for parse_known_args > > not less than a warning box, IMHO). > > I'm not sure whether or not it is a bug, but there are a number of > bugs involving argument prefix matching in the tracker. Including one > to be able to disable it, that has a patch but didn't get into 3.4 > (issue 14910). > > This topic seems more suited to the bug tracker than python-dev. > > Stephen hasn't been very active lately. There is at least one > non-committer who has been, but no committers have made time to review > the patches and take action (they are on my list, but there is a lot > of stuff ahead of them). So if you'd like to help out in that regard, > that would be great :) > Being able to disable prefix matching is certainly a good idea. I'll nosy myself on Issue 14910 and will try to find some time for it. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From v+python at g.nevcal.com Tue Nov 26 18:53:07 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Tue, 26 Nov 2013 09:53:07 -0800 Subject: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching In-Reply-To: References: Message-ID: <5294E003.4070803@g.nevcal.com> Eli did give his use case... a front end for a program that has a parameter "--sync", and a front end preprocessor of some sort was trying to use "--sync-foo" as an argument, and wanted "--sync" to be left in the parameters to send on to the back end program. Design of the front-end might better be aware of back end parameters and not conflict, but the documentation could be improved, likely. It might also be possible to add a setting to disable the prefix matching feature, for code that prefers it not be done. Whether that is better done as a global setting, or a per-parameter setting I haven't thought through. But both the constructor and the parameter definitions already accept a variable number of named parameters, so I would think it would be possible to add another, and retain backward compatibility via an appropriate default. On 11/26/2013 9:38 AM, Guido van Rossum wrote: > I think matching on the shortest unique prefix is common for command > line parsers in general, not just argparse. I believe optparse did > this too, and even the venerable getopt does! I think all this > originated in the original (non-Python) GNU standard for long option > parsing. All that probably explains why the docs hardly touch upon it. 
> > As to why parse_known_args also does this, I can see the reasoning > behind this behavior: to the end user, "--sync" is a valid option, so > it would be surprising if it didn't get recognized under certain > conditions. > > I suppose you were badly bitten by this recently? Can you tell us more > about what happened? > > > On Tue, Nov 26, 2013 at 9:30 AM, Eli Bendersky > wrote: > > Hello, > > argparse does prefix matching as long as there are no conflicts. > For example: > > argparser = argparse.ArgumentParser() > argparser.add_argument('--sync-foo', action='store_true') > args = argparser.parse_args() > > If I pass "--sync" to this script, it recognizes it as > "--sync-foo". This behavior is quite surprising although I can see > the motivation for it. At the very least it should be much more > explicitly documented (AFAICS it's barely mentioned in the docs). > > If there's another argument registered, say "--sync-bar" the above > will fail due to a conflict. > > Now comes the nasty part. When using "parse_known_args" instead of > "parse_args", the above happens too - --sync is recognized for > --sync-foo and captured by the parser. But this is wrong! The > whole idea of parse_known_args is to parse the known args, leaving > unknowns alone. This prefix matching harms more than it helps here > because maybe the program we're actually acting as a front-end for > (and hence using parse_known_args) knows about --sync and wants to > get it. > > Unless I'm missing something, this is a bug. But I'm also not sure > whether we can do anything about it at this point, as existing > code *may* be relying on it. The right thing to do would be to > disable this prefix matching when parse_known_args is called. > > Again, at the very least this should be documented (for > parse_known_args not less than a warning box, IMHO). > > Eli > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Tue Nov 26 19:16:56 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 26 Nov 2013 10:16:56 -0800 Subject: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching In-Reply-To: <5294E003.4070803@g.nevcal.com> References: <5294E003.4070803@g.nevcal.com> Message-ID: On Tue, Nov 26, 2013 at 9:53 AM, Glenn Linderman wrote: > Eli did give his use case... a front end for a program that has a > parameter "--sync", and a front end preprocessor of some sort was trying to > use "--sync-foo" as an argument, and wanted "--sync" to be left in the > parameters to send on to the back end program. > > Design of the front-end might better be aware of back end parameters and > not conflict, but the documentation could be improved, likely. > > It might also be possible to add a setting to disable the prefix matching > feature, for code that prefers it not be done. Whether that is better done > as a global setting, or a per-parameter setting I haven't thought through. > But both the constructor and the parameter definitions already accept a > variable number of named parameters, so I would think it would be possible > to add another, and retain backward compatibility via an appropriate > default. > > FWIW I'm not advocating a breaking behavior change here - I fully realize the ship has sailed. I'm interested in mitigation actions, though. Making the documentation explain this explicitly + adding an option to disable prefix matching (in 3.5 since we're past the 3.4 features point) will go a long way for alleviating this gotcha. 
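To make the failure mode concrete, here is a minimal reproduction using
the option names from this thread (the switch to turn prefix matching
off is still hypothetical -- that's exactly what issue 14910 proposes):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--sync-foo', action='store_true')

# The unknown '--sync' is swallowed as an abbreviation of '--sync-foo'
# instead of being left in the leftovers for the wrapped program:
args, rest = parser.parse_known_args(['--sync'])
print(args.sync_foo, rest)  # -> True []   (not False ['--sync'])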
Eli > > On 11/26/2013 9:38 AM, Guido van Rossum wrote: > > I think matching on the shortest unique prefix is common for command > line parsers in general, not just argparse. I believe optparse did this > too, and even the venerable getopt does! I think all this originated in the > original (non-Python) GNU standard for long option parsing. All that > probably explains why the docs hardly touch upon it. > > As to why parse_known_args also does this, I can see the reasoning behind > this behavior: to the end user, "--sync" is a valid option, so it would be > surprising if it didn't get recognized under certain conditions. > > I suppose you were badly bitten by this recently? Can you tell us more > about what happened? > > > On Tue, Nov 26, 2013 at 9:30 AM, Eli Bendersky wrote: > >> Hello, >> >> argparse does prefix matching as long as there are no conflicts. For >> example: >> >> argparser = argparse.ArgumentParser() >> argparser.add_argument('--sync-foo', action='store_true') >> args = argparser.parse_args() >> >> If I pass "--sync" to this script, it recognizes it as "--sync-foo". >> This behavior is quite surprising although I can see the motivation for it. >> At the very least it should be much more explicitly documented (AFAICS it's >> barely mentioned in the docs). >> >> If there's another argument registered, say "--sync-bar" the above will >> fail due to a conflict. >> >> Now comes the nasty part. When using "parse_known_args" instead of >> "parse_args", the above happens too - --sync is recognized for --sync-foo >> and captured by the parser. But this is wrong! The whole idea of >> parse_known_args is to parse the known args, leaving unknowns alone. This >> prefix matching harms more than it helps here because maybe the program >> we're actually acting as a front-end for (and hence using parse_known_args) >> knows about --sync and wants to get it. >> >> Unless I'm missing something, this is a bug. But I'm also not sure >> whether we can do anything about it at this point, as existing code *may* >> be relying on it. The right thing to do would be to disable this prefix >> matching when parse_known_args is called. >> >> Again, at the very least this should be documented (for parse_known_args >> not less than a warning box, IMHO). >> >> Eli >> > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/eliben%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Tue Nov 26 19:42:32 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 26 Nov 2013 18:42:32 +0000 Subject: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching In-Reply-To: References: <5294E003.4070803@g.nevcal.com> Message-ID: On 26 November 2013 18:16, Eli Bendersky wrote: > FWIW I'm not advocating a breaking behavior change here - I fully realize > the ship has sailed. I'm interested in mitigation actions, though. Making > the documentation explain this explicitly + adding an option to disable > prefix matching (in 3.5 since we're past the 3.4 features point) will go a > long way for alleviating this gotcha. Prefix matching is pretty much the norm in my experience, so I don't think it should surprise people in general. That's not to say that being more explicit in the docs wouldn't be a good thing anyway. 
For the "passing unknown parameters through" case, I think having a means to disable prefix matching is good (and it's a reasonably important use case). It *does* seem to me to be obvious from the name that an alternative method named "parse_known_args" is the way to disable prefix matching. So I think it needs to be clearly documented that it *isn't* and precisely what purpose it is intended to serve. And it definitely doesn't do that at the moment. My feeling is that the following changes would be good: 1. Clarify in the documentation of parse_known_args that the function does not behave any differently from parse_args in terms of what arguments it accepts (and explicitly give an example showing that prefix arguments *are* consumed). This can be done for 3.4 as it doesn't change functionality. 2. Add a keyword argument to parse_known_args that allows the user to disable prefix matching (I'd suggest match_prefix with a default of True). This would be for 3.5. Paul From robert.kern at gmail.com Tue Nov 26 22:15:56 2013 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 26 Nov 2013 21:15:56 +0000 Subject: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching In-Reply-To: References: <5294E003.4070803@g.nevcal.com> Message-ID: On 2013-11-26 18:16, Eli Bendersky wrote: > FWIW I'm not advocating a breaking behavior change here - I fully realize the > ship has sailed. I'm interested in mitigation actions, though. Making the > documentation explain this explicitly + adding an option to disable prefix > matching (in 3.5 since we're past the 3.4 features point) will go a long way for > alleviating this gotcha. There is already the One Obvious Way to handle situations like yours: the user uses "--" to mark that the remaining arguments are pure arguments and not --options. parent-script --verbose -- ./child_script.py --no-verbose --etc This is a common idiom across many argument processors. parse_known_args() is not a good solution for this situation even if you mitigate the prefix issue. Exact matches of --options are still possible. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From larry at hastings.org Tue Nov 26 23:39:32 2013 From: larry at hastings.org (Larry Hastings) Date: Tue, 26 Nov 2013 14:39:32 -0800 Subject: [Python-Dev] Accepting PEP 3154 for 3.4? In-Reply-To: References: <20131116191526.16f9b79c@fsol> Message-ID: <52952324.7000608@hastings.org> On 11/18/2013 07:48 AM, Guido van Rossum wrote: > Clearly the framing is the weakest point of the PEP (== elicits the > most bikeshedding). Indeed--it is still ongoing: http://bugs.python.org/issue19780 //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Wed Nov 27 15:35:13 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 27 Nov 2013 06:35:13 -0800 Subject: [Python-Dev] potential argparse problem: bad mix of parse_known_args and prefix matching In-Reply-To: References: Message-ID: On Tue, Nov 26, 2013 at 9:30 AM, Eli Bendersky wrote: > Hello, > > argparse does prefix matching as long as there are no conflicts. For > example: > > argparser = argparse.ArgumentParser() > argparser.add_argument('--sync-foo', action='store_true') > args = argparser.parse_args() > > If I pass "--sync" to this script, it recognizes it as "--sync-foo". 
This
> behavior is quite surprising although I can see the motivation for it. At
> the very least it should be much more explicitly documented (AFAICS it's
> barely mentioned in the docs).
>
> If there's another argument registered, say "--sync-bar" the above will
> fail due to a conflict.
>
> Now comes the nasty part. When using "parse_known_args" instead of
> "parse_args", the above happens too - --sync is recognized for --sync-foo
> and captured by the parser. But this is wrong! The whole idea of
> parse_known_args is to parse the known args, leaving unknowns alone. This
> prefix matching harms more than it helps here because maybe the program
> we're actually acting as a front-end for (and hence using parse_known_args)
> knows about --sync and wants to get it.
>
> Unless I'm missing something, this is a bug. But I'm also not sure whether
> we can do anything about it at this point, as existing code *may* be
> relying on it. The right thing to do would be to disable this prefix
> matching when parse_known_args is called.
>
> Again, at the very least this should be documented (for parse_known_args
> not less than a warning box, IMHO).

I created http://bugs.python.org/issue19814 for the documentation patch.
http://bugs.python.org/issue14910 deals with making prefix matching
optional, but that will have to be deferred to 3.5.

Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From victor.stinner at gmail.com  Wed Nov 27 23:57:25 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Wed, 27 Nov 2013 23:57:25 +0100
Subject: [Python-Dev] [Python-checkins] cpython: asyncio: Change write
 buffer use to avoid O(N**2). Make write()/sendto() accept
In-Reply-To: <3dVGSl4Tl3zRNM@mail.python.org>
References: <3dVGSl4Tl3zRNM@mail.python.org>
Message-ID:

2013/11/27 guido.van.rossum :
> http://hg.python.org/cpython/rev/80e0040d910c
> changeset:   87617:80e0040d910c
> user:        Guido van Rossum
> date:        Wed Nov 27 14:12:48 2013 -0800
> summary:
>   asyncio: Change write buffer use to avoid O(N**2). Make write()/sendto()
> accept bytearray/memoryview too. Change some asserts with proper exceptions.

Was this change discussed somewhere? Antoine told me on IRC that it was
discussed on the tulip mailing list. It would be nice to mention that in
the commit message (for the next change ;-).

> diff --git a/Lib/asyncio/selector_events.py b/Lib/asyncio/selector_events.py
> --- a/Lib/asyncio/selector_events.py
> +++ b/Lib/asyncio/selector_events.py
> @@ -340,6 +340,8 @@
>
>      max_size = 256 * 1024  # Buffer size passed to recv().
>
> +    _buffer_factory = bytearray  # Constructs initial value for self._buffer.
> +

Because the buffer type is now configurable, it would be nice to explain
in a short comment why bytearray fits this use case better. (I like the
bytearray object, so I like your whole changeset!)

> @@ -762,6 +776,8 @@
>
>  class _SelectorDatagramTransport(_SelectorTransport):
>
> +    _buffer_factory = collections.deque
> +
>      def __init__(self, loop, sock, protocol, address=None, extra=None):
>          super().__init__(loop, sock, protocol, extra)
>          self._address = address

Why is collections.deque preferred here? Could you also please add a
comment?
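My guess at the trade-off, for what it's worth (a sketch of my reading
of the change; the comments are my assumptions, not the patch author's
wording):

import collections

# Stream transports: one ordered byte stream, and send() may accept
# only part of the buffer.  A bytearray allows cheap in-place append
# and removal of the sent prefix, instead of rebuilding an immutable
# buffer on every write (the old O(N**2) pattern).
buf = bytearray()
buf.extend(b'first write')
buf.extend(b'second write')
n = 5          # pretend sock.send(buf) returned 5
del buf[:n]    # one memmove of the remaining bytes

# Datagram transports: sendto() is all-or-nothing per packet and must
# preserve packet boundaries, so a deque of (data, addr) pairs fits
# better -- popleft() hands back whole datagrams in O(1).
dgrams = collections.deque()
dgrams.append((b'payload', ('127.0.0.1', 9999)))
data, addr = dgrams.popleft()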
Victor

From nagendra.vs at hp.com  Thu Nov 28 07:14:22 2013
From: nagendra.vs at hp.com (V S, Nagendra (Nonstop Filesystems Team))
Date: Thu, 28 Nov 2013 06:14:22 +0000
Subject: [Python-Dev] test_uuid.py on HP NonStop Python-2.7.5 fails (test
 case: testIssue8621)
Message-ID:

Hi,

We are porting Python to HP NonStop & to a large extent we have been
successful. While running the unit tests we happened to hit upon the
problem reported in issue8621 (http://bugs.python.org/issue8621), i.e.
uuid4 sequences on both parent & child were the same. On NonStop we lack
support for both hardware randomness & the uuid_generate family of
calls, so on NonStop "uuid.py" falls through to the random.randrange()
call to generate the random number which in turn is used by the UUID
class. Below is the code snippet that can be used to reproduce the
problem...

#!/usr/bin/python
import os
import sys
import uuid
import random

# Code taken from Lib/uuid.py
def myuuid():
    bytes = [chr(random.randrange(256)) for i in range(16)]
    val = uuid.UUID(bytes=bytes, version=4)
    return val

# Code taken from Lib/test/test_uuid.py
#
# On at least some versions of OSX myuuid generates
# the same sequence of UUIDs in the parent and any
# children started using fork.
for i in range(10):
    fds = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(fds[0])
        value = myuuid()
        os.write(fds[1], value.hex)
        os._exit(0)
    else:
        os.close(fds[1])
        parent_value = myuuid().hex
        os.waitpid(pid, 0)
        child_value = os.read(fds[0], 100)
        os.close(fds[0])
        print parent_value, child_value, (parent_value == child_value)

We also ran the above test on Linux & were seeing the same results as
on NonStop, i.e. uuid4 sequences on both parent & child were the same.
Output snippet.....

[vsnag@ Python-2.7.5]$ uname -a
Linux 2.6.32-71.el6.x86_64 #1 SMP Wed Sep 1 01:33:01 EDT 2010 x86_64
x86_64 x86_64 GNU/Linux
[vsnag@ Python-2.7.5]$ ./python test.py
2b512ef47c9f471a829148ca4e779dff 2b512ef47c9f471a829148ca4e779dff True
e39b605476e04ca2b992b5d055278c3b e39b605476e04ca2b992b5d055278c3b True
8249fb9f6a2045c084da549c9e393a51 8249fb9f6a2045c084da549c9e393a51 True
e1d9e84b7f134930abf5fb2760f07585 e1d9e84b7f134930abf5fb2760f07585 True
068f9612f0744eeb92dbca8950028a3c 068f9612f0744eeb92dbca8950028a3c True
e85df457614f47d5b6300186bdd1c798 e85df457614f47d5b6300186bdd1c798 True
a8628a0817b843778e9565b835b8cfac a8628a0817b843778e9565b835b8cfac True
478d8e9e5090498e9fad5356c5ae9568 478d8e9e5090498e9fad5356c5ae9568 True
151d7f1c8f8b42099a89a0f442e98dc7 151d7f1c8f8b42099a89a0f442e98dc7 True
838571b03a5b46f8a06398e020647d89 838571b03a5b46f8a06398e020647d89 True
[vsnag@ Python-2.7.5]$

Any thoughts on what could be going wrong?

Thanks & Regards
Nagendra.V.S

From arigo at tunes.org  Thu Nov 28 10:53:51 2013
From: arigo at tunes.org (Armin Rigo)
Date: Thu, 28 Nov 2013 10:53:51 +0100
Subject: [Python-Dev] test_uuid.py on HP NonStop Python-2.7.5 fails (test
 case: testIssue8621)
In-Reply-To:
References:
Message-ID:

Hi,

On Thu, Nov 28, 2013 at 7:14 AM, V S, Nagendra (Nonstop Filesystems Team)
wrote:
> on NonStop "uuid.py" falls through to the random.randrange() call to
> generate the random number

And indeed, the random module gets identical results in both parent
and child after a fork(). This seems to mean that the random module
is not safe to use in uuid without taking special care of this fact
(e.g. feeding back os.getpid() into the seed).

A bientôt,

Armin.
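A minimal sketch of that kind of fork-aware reseeding (the helper and
the module-level bookkeeping are invented for illustration; this is not
what uuid.py currently does):

import os
import random
import time
import uuid

_seeded_pid = None

def _reseed_after_fork():
    # Reseed the shared generator the first time it is used in a new
    # process, so a fork()ed child stops mirroring its parent.
    global _seeded_pid
    pid = os.getpid()
    if pid != _seeded_pid:
        random.seed('%s-%s' % (pid, time.time()))
        _seeded_pid = pid

def uuid4_fallback():
    # Same fallback path as in the report above, Python 2 style.
    _reseed_after_fork()
    randchars = [chr(random.randrange(256)) for i in range(16)]
    return uuid.UUID(bytes=''.join(randchars), version=4)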
From christian at python.org Thu Nov 28 12:30:49 2013 From: christian at python.org (Christian Heimes) Date: Thu, 28 Nov 2013 12:30:49 +0100 Subject: [Python-Dev] test_uuid.py on HP NonStop Python-2.7.5 fails (test case: testIssue8621) In-Reply-To: References: Message-ID: <52972969.9010702@python.org>

On 28.11.2013 10:53, Armin Rigo wrote:
> Hi,
>
> On Thu, Nov 28, 2013 at 7:14 AM, V S, Nagendra (Nonstop Filesystems Team) wrote:
>> on NonStop "uuid.py" falls through to the random.randrange() call to generate
>> the random number
>
> And indeed, the random module gets identical results in both parent
> and child after a fork(). This seems to mean that the random module
> is not safe to use in uuid without taking special care of this fact
> (e.g. feeding back os.getpid() into the seed).

Yes, there is an issue with the fallback to random; see my 17-month-old bug report: http://bugs.python.org/issue15206 The uuid module needs a fork()-aware random instance, like the tempfile module.

Christian

From nagendra.vs at hp.com Thu Nov 28 15:25:29 2013 From: nagendra.vs at hp.com (V S, Nagendra (Nonstop Filesystems Team)) Date: Thu, 28 Nov 2013 14:25:29 +0000 Subject: [Python-Dev] test_uuid.py on HP NonStop Python-2.7.5 fails (test case: testIssue8621) In-Reply-To: <52972969.9010702@python.org> References: <52972969.9010702@python.org> Message-ID:

> > Hi,
> >
> > On Thu, Nov 28, 2013 at 7:14 AM, V S, Nagendra (Nonstop Filesystems
> > Team) wrote:
> >> on NonStop "uuid.py" falls through to the random.randrange() call to
> >> generate the random number
> >
> > And indeed, the random module gets identical results in both parent
> > and child after a fork(). This seems to mean that the random module
> > is not safe to use in uuid without taking special care of this fact
> > (e.g. feeding back os.getpid() into the seed).
>
> Yes, there is an issue with the fallback to random; see my 17-month-old bug
> report: http://bugs.python.org/issue15206 The uuid module needs a
> fork()-aware random instance, like the tempfile module.
>
> Christian

Thanks all for your responses. I tried something similar to what Armin was suggesting. Here is what I have done:

def myuuid():
    random.seed(time())  # or it could just be random.seed()  <------------- new addition
    bytes = [chr(random.randrange(256)) for i in range(16)]
    val = uuid.UUID(bytes=bytes, version=4)
    return val

With this I was able to get unique sequences in both parent & child after fork(). I have tested this function by calling it a million times & every time there has been a unique sequence in both parent & child. Do you think this can be a fair enough workaround for this problem?

Thanks & Regards
Nagendra.V.S
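One caveat about the workaround above (offered as a hedge, not a verdict on what NonStop requires): seeding from time() alone can still collide when two processes seed within the clock's resolution, and a forked child often does exactly that; reseeding on every call is also risky, since two calls within one clock tick would restart the generator from the same state. Mixing in os.getpid(), which is guaranteed to differ between parent and child, and reseeding once per process closes both holes. A sketch in the thread's Python 2 style:

    import os
    import random
    import uuid
    from time import time

    def reseed():
        # Call once per process, right after fork(): the pid differs
        # between parent and child even when time() does not.
        random.seed("%s-%s" % (os.getpid(), time()))

    def myuuid():
        bytes = [chr(random.randrange(256)) for i in range(16)]
        return uuid.UUID(bytes=bytes, version=4)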
From doko at ubuntu.com Thu Nov 28 17:24:59 2013 From: doko at ubuntu.com (Matthias Klose) Date: Thu, 28 Nov 2013 17:24:59 +0100 Subject: [Python-Dev] buildbot's are needlessly compiling -O0 In-Reply-To: References: Message-ID: <52976E5B.5030903@ubuntu.com>

On 24.11.2013 08:13, Gregory P. Smith wrote:
> our buildbots are set up to configure --with-pydebug, which also
> unfortunately causes them to compile with -O0... this results in a python
> binary that is excruciatingly slow and makes the entire test suite run take
> a long time.
>
> given that nobody is ever going to run gdb or another debugger on the
> buildbot-generated transient binaries themselves, how about we speed all of
> the buildbots up by adding CFLAGS=-O2 to the configure command line?
>
> Sure, the compile step will take a bit longer but that is dwarfed by the
> test time as it is:
>
> http://buildbot.python.org/all/builders/AMD64%20Ubuntu%20LTS%203.x/builds/3224
> http://buildbot.python.org/all/builders/ARMv7%203.x/builds/7
> http://buildbot.python.org/all/builders/AMD64%20Snow%20Leop%203.x/builds/639
>
> It should dramatically decrease the turnaround latency for buildbot results.

Maybe check if the compiler understands -Og (GCC 4.8), and use that? OTOH, it would be good to check with -O2 or -O3 too, common values used when building Python for production use.

Matthias

From techtonik at gmail.com Thu Nov 28 18:45:56 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 28 Nov 2013 20:45:56 +0300 Subject: [Python-Dev] PEP process entry point and ill fated initiatives Message-ID:

I wanted to help people who are trying to find out more about the PEP submission process by providing relevant info (or a pointer) in the README.rst that is located at the root of the PEPs repository. You can see it here.

https://bitbucket.org/rirror/peps

I filed this issue on b.p.o:

http://bugs.python.org/issue19822

However, as you may see, I met some resistance, which suggested I discuss the matter here. So, I'd like to discuss two things:

1. is my README.rst enhancement request wrong?
2. what makes people close tickets without providing arguments, but using words like "should"?

It is not the first thing that I personally find discouraging, which may be attributed to my reputation, but what's the point in doing this?

The only logical explanation I can find for the resistance in this particular case is that people in core Python development don't take usability and user experience matters seriously. Most of core development is about systems, not about people, so everybody is scratching their own itches. I wish that usability and UX were given at least some coverage in Python development regulations.

--
anatoly t.

From guido at python.org Thu Nov 28 19:39:57 2013 From: guido at python.org (Guido van Rossum) Date: Thu, 28 Nov 2013 10:39:57 -0800 Subject: [Python-Dev] PEP process entry point and ill fated initiatives In-Reply-To: References: Message-ID:

Anatoly, the Python community is a lot more diverse than you think. "Pull requests" (whatever that means) are not the way to start a PEP. You should start by focusing on the contents, and the mechanics of editing it and getting it formatted properly are secondary. The process is explained in PEP 1. Your bug report would have gotten a much better response if you had simply asked "what is the process, I can't figure it out from the repo's README" rather than (again) complaining that the core developers don't care. Offending people is not the way to get them to help you.

On Thu, Nov 28, 2013 at 9:45 AM, anatoly techtonik wrote:
> I wanted to help people who are trying to find out more
> about the PEP submission process by providing relevant
> info (or a pointer) in the README.rst that is located at the
> root of the PEPs repository. You can see it here.
>
> https://bitbucket.org/rirror/peps
>
> I filed this issue on b.p.o:
>
> http://bugs.python.org/issue19822
>
> However, as you may see, I met some resistance,
> which suggested I discuss the matter here. So, I'd like
> to discuss two things:
>
> 1. is my README.rst enhancement request wrong?
> 2. what makes people close tickets without providing
> arguments, but using words like "should"?
> It is not the first thing that I personally find discouraging,
> which may be attributed to my reputation, but what's the
> point in doing this?
>
> The only logical explanation I can find for the resistance in this
> particular case is that people in core Python
> development don't take usability and user experience
> matters seriously. Most of core development is about
> systems, not about people, so everybody is scratching
> their own itches. I wish that usability and UX were given at
> least some coverage in Python development regulations.
> --
> anatoly t.

--
--Guido van Rossum (python.org/~guido)

From timothy.c.delaney at gmail.com Thu Nov 28 22:04:24 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Fri, 29 Nov 2013 08:04:24 +1100 Subject: [Python-Dev] py.ini documentation improvement Message-ID:

I was just getting Jython working with py.exe, and I think the documentation can be made a bit more friendly. In particular, I think we can make it easier for people to determine the correct folder by changing this line in 3.4.4.1 Customization via INI files:

Two .ini files will be searched by the launcher - py.ini in the current user's "application data" directory (i.e. the directory returned by calling the Windows function SHGetFolderPath with CSIDL_LOCAL_APPDATA) and py.ini in the same directory as the launcher.

to

Two .ini files will be searched by the launcher - py.ini in the current user's "application data" directory (i.e. the directory returned by executing `echo %LOCALAPPDATA%` in a command window) and py.ini in the same directory as the launcher.

%LOCALAPPDATA% should always contain the same value as would be returned from SHGetFolderPath with CSIDL_LOCAL_APPDATA.

Tim Delaney

From v+python at g.nevcal.com Thu Nov 28 22:34:07 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 28 Nov 2013 13:34:07 -0800 Subject: [Python-Dev] py.ini documentation improvement In-Reply-To: References: Message-ID: <5297B6CF.4090809@g.nevcal.com>

On 11/28/2013 1:04 PM, Tim Delaney wrote:
> %LOCALAPPDATA% should always contain the same value as would be
> returned from SHGetFolderPath with CSIDL_LOCAL_APPDATA.

Except when it gets changed. Documentation should reflect the actual use in the code. If it uses SHGetFolderPath, that is what should be documented; if it uses the environment variable, that is what should be documented. Users can easily change the environment variable (which might be a reason to use it, instead of SHGetFolderPath).
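A small Windows-only sketch of Glenn's point, showing both lookups side by side (this assumes ctypes on Windows; CSIDL_LOCAL_APPDATA is the documented constant 28). The two normally agree, but only the API call is immune to a user re-pointing the environment variable, so the docs should describe whichever one the launcher actually uses:

    import os
    import ctypes
    import ctypes.wintypes

    CSIDL_LOCAL_APPDATA = 28

    # Ask the shell API directly...
    buf = ctypes.create_unicode_buffer(ctypes.wintypes.MAX_PATH)
    ctypes.windll.shell32.SHGetFolderPathW(None, CSIDL_LOCAL_APPDATA,
                                           None, 0, buf)
    print("SHGetFolderPath: %s" % buf.value)

    # ...and compare with the environment variable.
    print("LOCALAPPDATA:    %s" % os.environ.get("LOCALAPPDATA"))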
From timothy.c.delaney at gmail.com Thu Nov 28 22:45:42 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Fri, 29 Nov 2013 08:45:42 +1100 Subject: [Python-Dev] py.ini documentation improvement In-Reply-To: <5297B6CF.4090809@g.nevcal.com> References: <5297B6CF.4090809@g.nevcal.com> Message-ID:

On 29 November 2013 08:34, Glenn Linderman wrote:
> On 11/28/2013 1:04 PM, Tim Delaney wrote:
> > %LOCALAPPDATA% should always contain the same value as would be returned
> > from SHGetFolderPath with CSIDL_LOCAL_APPDATA.
>
> Except when it gets changed. Documentation should reflect the actual use
> in the code. If it uses SHGetFolderPath, that is what should be
> documented; if it uses the environment variable, that is what should be
> documented. Users can easily change the environment variable (which might
> be a reason to use it, instead of SHGetFolderPath).

Didn't think of that - good point.

Tim Delaney

From tjreedy at udel.edu Thu Nov 28 23:25:17 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 28 Nov 2013 17:25:17 -0500 Subject: [Python-Dev] py.ini documentation improvement In-Reply-To: References: Message-ID:

On 11/28/2013 4:04 PM, Tim Delaney wrote:
> I was just getting Jython working with py.exe, and I think the
> documentation can be made a bit more friendly. In particular, I think we
> can make it easier for people to determine the correct folder by
> changing this line in 3.4.4.1 Customization via INI files:
>
> Two .ini files will be searched by the launcher - py.ini in the current
> user's "application data" directory (i.e. the directory returned by
> calling the Windows function SHGetFolderPath with CSIDL_LOCAL_APPDATA)
> and py.ini in the same directory as the launcher.
>
> to
>
> Two .ini files will be searched by the launcher - py.ini in the current
> user's "application data" directory (i.e. the directory returned by
> executing `echo %LOCALAPPDATA%` in a command window) and py.ini in the
> same directory as the launcher.

'Two .ini files will be searched by the launcher' sort of implies to me that the files exist. On my Win7 machine, echo %LOCALAPPDATA% returns C:\Users\Terry\AppData\Local. If I go to Users/Terry with Explorer, there is no AppData. (Same with admin user.) C:\Windows contains py(w).exe, but no py.ini. Clearer would be:

The launcher will search for py.ini in two possible places -- first in the same directory as the launcher; second in the current user's "application data" directory.

I am pretty sure that order is important and that my version, not the current version, is correct.

--
Terry Jan Reedy

From martin at v.loewis.de Thu Nov 28 23:35:14 2013 From: martin at v.loewis.de (martin at v.loewis.de) Date: Thu, 28 Nov 2013 23:35:14 +0100 Subject: [Python-Dev] py.ini documentation improvement In-Reply-To: References: Message-ID: <20131128233514.Horde.kS_pe_iXCxQhzKV14U66WQ3@webmail.df.eu>

Quoting Terry Reedy :
> 'Two .ini files will be searched by the launcher' sort of implies to
> me that the files exist. On my Win7 machine, echo %LOCALAPPDATA%
> returns C:\Users\Terry\AppData\Local. If I go to Users/Terry with
> Explorer, there is no AppData. (Same with admin user.)

Are you sure about this? What happens if you type "%LOCALAPPDATA%" (without quotes) into Explorer's address bar?

Regards,
Martin

From tjreedy at udel.edu Fri Nov 29 00:59:14 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 28 Nov 2013 18:59:14 -0500 Subject: [Python-Dev] py.ini documentation improvement In-Reply-To: <20131128233514.Horde.kS_pe_iXCxQhzKV14U66WQ3@webmail.df.eu> References: <20131128233514.Horde.kS_pe_iXCxQhzKV14U66WQ3@webmail.df.eu> Message-ID:

On 11/28/2013 5:35 PM, martin at v.loewis.de wrote:
> Quoting Terry Reedy :
>> 'Two .ini files will be searched by the launcher' sort of implies to
>> me that the files exist. On my Win7 machine, echo %LOCALAPPDATA%
>> returns C:\Users\Terry\AppData\Local. If I go to Users/Terry with
>> Explorer, there is no AppData. (Same with admin user.)
I initially intended to write "there is no AppData *visible*". Then I switched to my Win7 admin account and there was still no AppData visible, for any user. This is unlike XP, where things not visible to 'users' become visible as admin.

> What happens if you type "%LOCALAPPDATA%"
> (without quotes) into Explorer's address bar?

It displays C:\Users\Terry\AppData\Local in the bar and the contents in the box below. The py.ini doc might mention this as a way to get there. Thanks.

--
Terry Jan Reedy

From timothy.c.delaney at gmail.com Fri Nov 29 01:06:24 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Fri, 29 Nov 2013 11:06:24 +1100 Subject: [Python-Dev] py.ini documentation improvement In-Reply-To: References: <20131128233514.Horde.kS_pe_iXCxQhzKV14U66WQ3@webmail.df.eu> Message-ID:

On 29 November 2013 10:59, Terry Reedy wrote:
> On 11/28/2013 5:35 PM, martin at v.loewis.de wrote:
>> Quoting Terry Reedy :
>>> 'Two .ini files will be searched by the launcher' sort of implies to
>>> me that the files exist. On my Win7 machine, echo %LOCALAPPDATA%
>>> returns C:\Users\Terry\AppData\Local. If I go to Users/Terry with
>>> Explorer, there is no AppData. (Same with admin user.)
>
> I initially intended to write "there is no AppData *visible*". Then I
> switched to my Win7 admin account and there was still no AppData visible,
> for any user. This is unlike XP, where things not visible to 'users'
> become visible as admin.

By default in Win7 AppData is a hidden folder - you need to go to Tools | Folder Options | View | Show hidden files, folders and drives to see it in Explorer (no matter what user you're logged in as).

If the py.ini location does become defined in terms of %LOCALAPPDATA% then suggesting to use that value in the Explorer address bar would probably be the easiest way for people to get to the correct directory.

Tim Delaney

From tjreedy at udel.edu Fri Nov 29 03:17:43 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 28 Nov 2013 21:17:43 -0500 Subject: [Python-Dev] py.ini documentation improvement In-Reply-To: References: <20131128233514.Horde.kS_pe_iXCxQhzKV14U66WQ3@webmail.df.eu> Message-ID:

On 11/28/2013 7:06 PM, Tim Delaney wrote:
> By default in Win7 AppData is a hidden folder - you need to go to Tools

On my system, that is Control Panel, not Tools.

> | Folder Options | View | Show hidden files, folders and drives to see
> it in Explorer (no matter what user you're logged in as).

Thanks, done (and another desired change made also).

> If the py.ini location does become defined in terms of %LOCALAPPDATA%
> then suggesting to use that value in the Explorer address bar would
> probably be the easiest way for people to get to the correct directory.

--
Terry Jan Reedy

From timothy.c.delaney at gmail.com Fri Nov 29 03:27:57 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Fri, 29 Nov 2013 13:27:57 +1100 Subject: [Python-Dev] py.ini documentation improvement In-Reply-To: References: <20131128233514.Horde.kS_pe_iXCxQhzKV14U66WQ3@webmail.df.eu> Message-ID:

On 29 November 2013 13:17, Terry Reedy wrote:
> On 11/28/2013 7:06 PM, Tim Delaney wrote:
>> By default in Win7 AppData is a hidden folder - you need to go to Tools
>
> On my system, that is Control Panel, not Tools.

Sorry - was referring to the Explorer "Tools" menu, which is likewise hidden by default ...
so many things that I change from the defaults when setting up a Windows machine - it's become so automatic that I forget what is hidden from most users.

Tim Delaney

From ncoghlan at gmail.com Fri Nov 29 03:38:23 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 29 Nov 2013 12:38:23 +1000 Subject: [Python-Dev] py.ini documentation improvement In-Reply-To: References: <20131128233514.Horde.kS_pe_iXCxQhzKV14U66WQ3@webmail.df.eu> Message-ID:

On 29 November 2013 12:27, Tim Delaney wrote:
> On 29 November 2013 13:17, Terry Reedy wrote:
>> On 11/28/2013 7:06 PM, Tim Delaney wrote:
>>> By default in Win7 AppData is a hidden folder - you need to go to Tools
>>
>> On my system, that is Control Panel, not Tools.
>
> Sorry - was referring to the Explorer "Tools" menu, which is likewise hidden
> by default ... so many things that I change from the defaults when setting
> up a Windows machine - it's become so automatic that I forget what is hidden
> from most users.

Turning my Windows gaming machine back to the default settings and trying to bootstrap pip was one of the key things that got me motivated to work on PEP 453. It still astounds me that one company can both create Visual Studio, *and* make it difficult to configure a machine to be suitable for software development, but I guess that's enterprise companies for you :)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From stephen at xemacs.org Fri Nov 29 04:55:54 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 29 Nov 2013 12:55:54 +0900 Subject: [Python-Dev] py.ini documentation improvement In-Reply-To: References: <20131128233514.Horde.kS_pe_iXCxQhzKV14U66WQ3@webmail.df.eu> Message-ID: <878uw7dfp1.fsf@uwakimon.sk.tsukuba.ac.jp>

Nick Coghlan writes:
> Turning my Windows gaming machine back to the default settings and
> trying to bootstrap pip was one of the key things that got me
> motivated to work on PEP 453. It still astounds me that one company
> can both create Visual Studio, *and* make it difficult to configure a
> machine to be suitable for software development, but I guess that's
> enterprise companies for you :)

Those of us who study management for a living are definitely not surprised. I recommend Robert Townsend's "Call Yourself Up" from _Up the Organization_: he advises CEOs to ring up the company's toll-free number and find out what the customers deal with. Urk!

This is the "same s---, different dept", that's all. And it's a lot harder to avoid than most folks think. Kudos to Nick for "calling himself up", to python-dev for creating a language and development ecology with so few unnecessary barriers, and to Guido for setting the tone!

From kristjan at ccpgames.com Fri Nov 29 10:16:38 2013 From: kristjan at ccpgames.com (Kristján Valur Jónsson) Date: Fri, 29 Nov 2013 09:16:38 +0000 Subject: [Python-Dev] PEP process entry point and ill fated initiatives In-Reply-To: References: Message-ID:

Reading the defect, I find people being unnecessarily obstructive.

Closing the ticket, twice, is a rude and unnecessary action. How about acknowledging that these waters are dark and murky and helping to make things better?

Surely, documenting processes can only be an improvement?

A lot has changed in open source development in the last 10 years. The processes we have are starting to have the air of a cathedral around them.
K

From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] On Behalf Of Guido van Rossum
Sent: 28. nóvember 2013 18:40
To: anatoly techtonik
Cc: python-dev
Subject: Re: [Python-Dev] PEP process entry point and ill fated initiatives

Anatoly, the Python community is a lot more diverse than you think. "Pull requests" (whatever that means) are not the way to start a PEP. You should start by focusing on the contents, and the mechanics of editing it and getting it formatted properly are secondary. The process is explained in PEP 1. Your bug report would have gotten a much better response if you had simply asked "what is the process, I can't figure it out from the repo's README" rather than (again) complaining that the core developers don't care. Offending people is not the way to get them to help you.

From techtonik at gmail.com Fri Nov 29 12:08:31 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 29 Nov 2013 14:08:31 +0300 Subject: [Python-Dev] PEP process entry point and ill fated initiatives In-Reply-To: References: Message-ID:

On Thu, Nov 28, 2013 at 9:39 PM, Guido van Rossum wrote:
> Anatoly, the Python community is a lot more diverse than you think. "Pull
> requests" (whatever that means) are not the way to start a PEP. You should
> start by focusing on the contents, and the mechanics of editing it and
> getting it formatted properly are secondary. The process is explained in
> PEP 1. Your bug report would have gotten a much better response if you had
> simply asked "what is the process, I can't figure it out from the repo's
> README" rather than (again) complaining that the core developers don't
> care. Offending people is not the way to get them to help you.

Sorry, I didn't mean to offend anyone. That was never my goal. Everybody knows that I personally can easily find information about how to start a new PEP - like this one - http://www.python.org/dev/peps/pep-0001/ I am not sure it is _the entry point_ I am looking for, because there may be a more up-to-date chapter about this in the devguide. That's why I didn't paste any links - PEP list subscribers know the process a lot better than I do. But the issue is: if my point of entry is the PEP repo, and the first thing I do is read the README, I expect it to contain a link to the PEP process, and that link is missing. It is not that I can't figure something out; it is that people need more time than necessary to find the required info, and time is the most valuable resource.

Now there is a huge part about "community process vision" - complaints, analysis, a request to resume the CLArification work, and generic remarks about experience. It is not obligatory reading.

About ill fated initiatives. I don't like it when people prematurely close tickets without waiting for mutual agreement that the problem is solved. Perhaps trackers should have personal "agree/disagree/meh" flags to help other people state their opinions, but until then the lack of time will always lead to the same communication problems.

I value the contributions people have made to Python; even if I am not expressive about it, I do realize that without all these people it wouldn't be possible. I can only regret that people can't sustain the feeling that they are important on their own, and take offence instead. Perhaps it is this offence that makes them close tickets without waiting for a reply.
Reply is not be required to close ticket for valid reason, but the point of conflict is that I don't like when tickets are prematurely closed under the reason of "you. do it right" instead of "let's make it better". Both "right" and "better" are subjective, but there is a big difference. "better" can be discussed, and "right" is not. I don't think it's ok. I understand that people don't have time to waste in chit chatting, but so do I, and everybody else out there. I am disrespectful to policies, that's right. I believe that policies need to be revised every couple of years, but nobody do this. Rules that work for folks that were present at the time the consensus is reached need to be explained to those who came later, and still you can't expect them to comply. Just because they don't have choice, they may choose to ignore, and community loses another contributor. I again complain, because I see many things that don't move and it doesn't make me happy. I can not be polite if I am not happy. From one side it's my own problem and I know about. Another aspect is that I won't have motivation to write if I am unhappy and polite - it is depressive. From the other side there is also a point of conflict that can not be ignored. People who signed CLA do not understand my position that CLA is harmful. They think that if I didn't sign it then I am just trolling and exclude me. I won't send patches until CLA is clear for every contributor and not like "common, just sign it, I am not a lawyer, but they know it is good". This probably make people upset, and they try to "help" you, so you have to maintain a sane amount of "dignity" to resist the pressure. People afraid to accept patches even when I *explicitly* give my permission to use them. I tired of writing patches that can not be accepted even with my explicit permission - permission in which I understand what I am giving out. But apparently there is something wrong with my permission - it is not incomplete, because if something was missing - people would tell me about it. But people don't understand what is missing. They say "you, do it right", "CLA is the right" way. I ask why, and there is silence. No, I've been given a lot of complicated books about lawyership, stuff that was written before I was born. I don't get it. That's my ill fated initiative to clarify the CLA and make Python contribution process free from FUD. People "do it right" and force other people to "do it right" as I see it. That's a natural process, but it can be constructive if the meaning of what is "right" is discussed and changed if necessary. Otherwise Python development becomes an exclusive privilege for those who comply or doesn't care about anything else. I am not using Python because of Python. I am using it, because it solves a whole heck of usability problems and ux issues with programming languages in this world, and I want Python development process to reflect on that matter. It may be that we are all missing something much bigger than we thought. I won't be surprised to know that Python 3 with all its internal beauty will never be successful, because of "artificial engineering" approach instead of "ux research" https://caremad.io/blog/a-look-at-pypi-downloads/ People are unreliable beings. While many praise Kenneth Rietz for inventing requests, I value him for fallibilism - is the word I learnt from his web site. 
I can throw in a simple idea that the "print() function changed the perception of the language" and now Python 3 is not the same simple universal tool I used to hack in my free time, and it might be true. But only in combination with other issues and features like opening text files not in UTF-8, constant headache about putting data in and out of Python with unnatural encode/decode paradigm, more complicated argparse, bigger docs. Everything may matter. My friend, biologist, who started from zero, said he finds Python 3 more understandable than Python 2. I can't share this feeling, but I accept it. Many Python folks can not accept that Python 3 may not be better. It just "don't right". I started to avoid using the word "wart", but I really want to see all these "gotchas" compiled in one place to iterate over them in community process. The process that can show that we can not rely on that somebody of us is "right". PyPy and IPython are more "right" is you ask me, but I still want mainstream Python to be better, and I want people who can make it better, to be able to do so. I am not the one who can meddle with internals, so I don't even try, but I can polish things on the outside, and this is what I try to do. I don't need Python as the best language in the world, I need it as universally understood one. Simple and generic, and perhaps bad and harmful for certain things. Python 3 is not simple, but it may be an intermediate step in forging a common vision of what is wrong with all that. This can become possible if developer expectations could be openly matched against user expectations and user experience of both. "git is complicated, because you don't do it right" - this kind of argument is very common and when majority of people start to think this way - you get "The Emperor's New Clothes" effect - mice prickle and cry, but eat cactus without saying a word. git could be better, but that attitude doesn't allow community to concentrate. That's was a weak attempt to define a vision for community process. I am not sure it is appropriate for this kind of mail, but there is no other place for it as I can see. Some points may be useful anyway. Community process may sound to abstract, so let's get back to the conflict. I am not sending patches, because most of the time I can not create them (that's irrelevant to the CLA). So I am trying to solve different problem - why people who can fix things, don't do this? And this thing is directly connected to their (user) experience of interacting with community. Here comes the CLA with different reasons why it is required. People say CLA is required, because of historical heritage and license stack. It is said that CRNI, BeOpen and other guys refuse to give up their "rights" on Python. Which "rights"? 1. Do they really want to control Python? 2. Or they just want to earn some points in Public Relations? 3. Or do they want to be respected as a vital part of Python history? These three are conflicting. I doubt that either CRNI or BeOpen choose (1). When why do they insist on license? They want PR points (2). But if stubborn and unclear license makes people less likely to contribute, it only makes reputation worse, not better. If they want (3) then we need to give them that. They don't need to enforce it with license terms. Enforcing doesn't earn respect either. This can be solved. People say CLA is required to protect PSF and Python, and you don't need to understand the text to sign it. I disagree and we won't reach any agreement here. 
I am radical in that I believe that "security by obscurity" is a bad approach to earn trust and respect between people. I demand human like CLArification. If Google and other parties managed to reduce lawyerish clutter, PSF can try better too. Sorry for being disrespectful to the community of lawyers that granted us all this "fun". People say CLA is required to contribute to Python documentation, and that's something that was discussed with no result. This is a lowest point and the most simple problem that can be addressed for now if somebody feel we should take an action and make something constructive out of this "otherwise likely to become ill fated" thread. https://mail.python.org/pipermail/python-legal-sig/2013-August/thread.html -- anatoly t. From rosuav at gmail.com Fri Nov 29 12:37:50 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 29 Nov 2013 22:37:50 +1100 Subject: [Python-Dev] PEP process entry point and ill fated initiatives In-Reply-To: References: Message-ID: On Fri, Nov 29, 2013 at 10:08 PM, anatoly techtonik wrote: > About ill fated initiatives. I don't like when people prematurely close tickets > without waiting for the mutual agreement that the problem is solved. Perhaps > trackers should have personal "agree/disagree/meh" flags to help other > people state their opinions... > > Perhaps this offence is that makes them closing tickets without > waiting for reply. > Reply is not be required to close ticket for valid reason, but the > point of conflict > is that I don't like when tickets are prematurely closed under the reason of > "you. do it right" instead of "let's make it better"... > > I am disrespectful to policies, that's right. I believe that policies need to be > revised every couple of years, but nobody do this. Rules that work for > folks that > were present at the time the consensus is reached need to be explained > to those who came later, and still you can't expect them to comply. Why do you feel that Python is (or ought to be) democratic? It isn't, and it should not be. Your opinion is, I am sure, given weight when decisions are made, but ultimately those decisions are not a matter of "consensus". Why should the policies be revised? Because some people are coming in who don't like them? Poor reason. Being impolite will tend to reduce the weight your words are given. Demanding change is usually unsuccessful unless you can demonstrate a good reason for that change to happen, and I don't just mean "I hate your policies and I think it's time you reviewed them". ChrisA From victor.stinner at gmail.com Fri Nov 29 13:12:26 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Fri, 29 Nov 2013 13:12:26 +0100 Subject: [Python-Dev] Track ResourceWarning warnings with tracemalloc Message-ID: Hi, I'm trying to write an example of usage of the new tracemalloc.get_object_traceback() function. Last month I proposed to use it to give the traceback where a file/socket was allocated on ResourceWarning: https://mail.python.org/pipermail/python-dev/2013-October/129923.html I found a working solution using monkey patching of the socket, _pyio and builtins modules: https://bitbucket.org/haypo/misc/src/tip/python/res_warn.py This hack uses tracemalloc to retrieve the traceback where the file/socket was allocated, but you can do the same using traceback.extract_stack() for older Python version. So it's just an example to show how tracemalloc can be used. 
Example with test_poplib (without res_warn.py):

[1/1] test_poplib /home/haypo/prog/python/default/Lib/test/support/__init__.py:1331: ResourceWarning: unclosed gc.collect()

OK, now you know that the socket was destroyed by the garbage collector... but which test created this socket???

Example of res_warn.py with test_poplib:

[1/1] test_poplib /home/haypo/prog/python/default/Lib/test/support/__init__.py:1331: ResourceWarning: unclosed gc.collect()
Allocation traceback (most recent first):
File '/home/haypo/prog/python/default/Lib/ssl.py', line 340 _context=self)
File '/home/haypo/prog/python/default/Lib/poplib.py', line 390 self.sock = context.wrap_socket(self.sock)
File '/home/haypo/prog/python/default/Lib/test/test_poplib.py', line 407 self.client.stls()
File '/home/haypo/prog/python/default/Lib/unittest/case.py', line 567 self.setUp()
File '/home/haypo/prog/python/default/Lib/unittest/case.py', line 610 return self.run(*args, **kwds)
(...)

You can see that the socket was created by test_poplib at line 407 (TestPOP3_TLSClass.setUp). It's more useful than the previous warning :-)

I found two issues in the _pyio and tracemalloc modules. These issues should be fixed to use res_warn.py on text or buffered files and to use it during Python shutdown:
http://bugs.python.org/issue19831
http://bugs.python.org/issue19829

Victor

From solipsis at pitrou.net Fri Nov 29 15:58:19 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 29 Nov 2013 15:58:19 +0100 Subject: [Python-Dev] PEP process entry point and ill fated initiatives References: Message-ID: <20131129155819.7bf44bda@fsol>

On Fri, 29 Nov 2013 09:16:38 +0000 Kristján Valur Jónsson wrote:
> Closing the ticket, twice, is a rude and unnecessary action.

Closing the ticket means "we don't believe there is an issue, or we don't think it would be reasonable to fix it". If that's our judgement on the issue, how is it rude to close it?

> How about acknowledging that these waters are dark and murky and helping
> to make things better?

Well, how about? If Anatoly has a concrete proposal, surely he can propose a patch to make things better. If you want to contribute, you know how to do it!

Regards
Antoine.

From storchaka at gmail.com Fri Nov 29 16:33:14 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Fri, 29 Nov 2013 17:33:14 +0200 Subject: [Python-Dev] Track ResourceWarning warnings with tracemalloc In-Reply-To: References: Message-ID:

On 29.11.13 14:12, Victor Stinner wrote:
> You can see that the socket was created by test_poplib at line 407
> (TestPOP3_TLSClass.setUp). It's more useful than the previous warning
> :-)

Great!

From kristjan at ccpgames.com Fri Nov 29 16:25:14 2013 From: kristjan at ccpgames.com (Kristján Valur Jónsson) Date: Fri, 29 Nov 2013 15:25:14 +0000 Subject: [Python-Dev] PEP process entry point and ill fated initiatives In-Reply-To: <20131129155819.7bf44bda@fsol> References: <20131129155819.7bf44bda@fsol> Message-ID:

> -----Original Message-----
> From: Python-Dev [mailto:python-dev-
> bounces+kristjan=ccpgames.com at python.org] On Behalf Of Antoine Pitrou
> Sent: 29. nóvember 2013 14:58
> To: python-dev at python.org
> Subject: Re: [Python-Dev] PEP process entry point and ill fated initiatives
>
> On Fri, 29 Nov 2013 09:16:38 +0000
> Kristján Valur Jónsson wrote:
> > Closing the ticket, twice, is a rude and unnecessary action.
>
> Closing the ticket means "we don't believe there is an issue, or we don't
> think it would be reasonable to fix it".
If that's our judgement on the issue,
> how is it rude to close it?

Also, this attitude. Who are the "we" in this case? And why send messages to people by shutting doors?

> > How about acknowledging that these waters are dark and murky and
> > helping to make things better?
>
> Well, how about? If Anatoly has a concrete proposal, surely he can propose a
> patch to make things better.

Which is what he did. And instead of helpful ideas on how to improve his patch, the issue was closed. The acolytes have spoken.

I know that Anatoly himself has a long history here, but I myself have felt a lessening affinity to the dev community in recent years. It feels like it is increasingly shutting itself in.

Cheers,
K

From solipsis at pitrou.net Fri Nov 29 17:40:45 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 29 Nov 2013 17:40:45 +0100 Subject: [Python-Dev] PEP process entry point and ill fated initiatives In-Reply-To: References: <20131129155819.7bf44bda@fsol> Message-ID: <20131129174045.62a78a9a@fsol>

On Fri, 29 Nov 2013 15:25:14 +0000 Kristján Valur Jónsson wrote:
> > -----Original Message-----
> > From: Python-Dev [mailto:python-dev-
> > bounces+kristjan=ccpgames.com at python.org] On Behalf Of Antoine Pitrou
> > Sent: 29. nóvember 2013 14:58
> > To: python-dev at python.org
> > Subject: Re: [Python-Dev] PEP process entry point and ill fated initiatives
> >
> > On Fri, 29 Nov 2013 09:16:38 +0000
> > Kristján Valur Jónsson wrote:
> > > Closing the ticket, twice, is a rude and unnecessary action.
> >
> > Closing the ticket means "we don't believe there is an issue, or we don't
> > think it would be reasonable to fix it". If that's our judgement on the issue,
> > how is it rude to close it?
> Also, this attitude. Who are the "we" in this case?

"We" = core developers who are active on the tracker.

Regards
Antoine.

From status at bugs.python.org Fri Nov 29 18:07:46 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 29 Nov 2013 18:07:46 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20131129170746.E76B3568E4@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2013-11-22 - 2013-11-29) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message.

Issues counts and deltas: open 4317 (+37) closed 27289 (+83) total 31606 (+120) Open issues with patches: 1981

Issues opened (78) ================== #19352: unittest loader barfs on symlinks http://bugs.python.org/issue19352 reopened by doko #19579: test_asyncio: test__run_once timings should be relaxed http://bugs.python.org/issue19579 reopened by haypo #19638: dtoa: conversion from '__int64' to 'int', possible loss of dat http://bugs.python.org/issue19638 reopened by mark.dickinson #19715: test_touch_common failure under Windows http://bugs.python.org/issue19715 opened by pitrou #19717: resolve() fails when the path doesn't exist http://bugs.python.org/issue19717 opened by pitrou #19719: add importlib.abc.SpecLoader and SpecFinder http://bugs.python.org/issue19719 opened by brett.cannon #19720: suppress context for some exceptions in importlib?
http://bugs.python.org/issue19720 opened by eric.snow #19721: Move all test_importlib utility code into test_importlib.util http://bugs.python.org/issue19721 opened by brett.cannon #19723: Argument Clinic should add markers for humans http://bugs.python.org/issue19723 opened by pitrou #19725: Richer stat object http://bugs.python.org/issue19725 opened by pjenvey #19726: BaseProtocol is not an ABC http://bugs.python.org/issue19726 opened by pitrou #19728: PEP 453: enable pip by default in the Windows binary installer http://bugs.python.org/issue19728 opened by ncoghlan #19731: Fix copyright footer http://bugs.python.org/issue19731 opened by pitrou #19732: python fails to build when configured with --with-system-libmp http://bugs.python.org/issue19732 opened by doko #19733: Setting image parameter of a button crashes with Cocoa Tk http://bugs.python.org/issue19733 opened by serhiy.storchaka #19734: venv and ensurepip are affected by pip environment variable se http://bugs.python.org/issue19734 opened by haypo #19736: posixmodule.c: Add flags for statvfs.f_flag to constant list http://bugs.python.org/issue19736 opened by doko #19737: Documentation of globals() and locals() should be improved http://bugs.python.org/issue19737 opened by Zahari.Dim #19738: pytime.c loses precision under Windows http://bugs.python.org/issue19738 opened by pitrou #19741: tracemalloc: tracemalloc_log_alloc() doesn't check _Py_HASHTAB http://bugs.python.org/issue19741 opened by haypo #19743: test_gdb failures http://bugs.python.org/issue19743 opened by larry #19744: ensurepip should refuse to install pip if SSL/TLS is not avail http://bugs.python.org/issue19744 opened by tim.peters #19745: TEST_DATA_DIR for out-of-tree builds http://bugs.python.org/issue19745 opened by christian.heimes #19746: No introspective way to detect ModuleImportFailure in unittest http://bugs.python.org/issue19746 opened by rbcollins #19748: test_time failures on AIX http://bugs.python.org/issue19748 opened by haypo #19749: test_venv failure on AIX: 'module' object has no attribute 'O_ http://bugs.python.org/issue19749 opened by haypo #19750: test_asyncio.test_unix_events constructor failures on AIX http://bugs.python.org/issue19750 opened by haypo #19751: test_sys: sys.hash_info.algorithm failure on SPARC Solaris bui http://bugs.python.org/issue19751 opened by haypo #19753: test_gdb failure on SystemZ buildbot http://bugs.python.org/issue19753 opened by haypo #19754: pickletools.optimize doesn't reframe correctly http://bugs.python.org/issue19754 opened by pitrou #19756: test_nntplib: sporadic failures, network isses? server down? 
http://bugs.python.org/issue19756 opened by haypo #19757: _tracemalloc.c: compiler warning with gil_state http://bugs.python.org/issue19757 opened by haypo #19758: Warnings in tests http://bugs.python.org/issue19758 opened by serhiy.storchaka #19761: test_tk fails on OS X with multiple test case failures with bo http://bugs.python.org/issue19761 opened by ned.deily #19763: Make it easier to backport statistics to 2.7 http://bugs.python.org/issue19763 opened by christian.heimes #19764: subprocess: use PROC_THREAD_ATTRIBUTE_HANDLE_LIST with STARTUP http://bugs.python.org/issue19764 opened by haypo #19766: test_venv: test_with_pip() failed on "AMD64 Fedora without thr http://bugs.python.org/issue19766 opened by haypo #19767: pathlib: iterfiles() and iterdirs() http://bugs.python.org/issue19767 opened by jonash #19768: Not so correct error message when giving incorrect type to max http://bugs.python.org/issue19768 opened by vajrasky #19769: test_venv: test_with_pip() failure on "AMD64 Windows Server 20 http://bugs.python.org/issue19769 opened by haypo #19770: NNTP.post broken http://bugs.python.org/issue19770 opened by sobczyk #19771: runpy should check ImportError.name before wrapping it http://bugs.python.org/issue19771 opened by ncoghlan #19772: str serialization of Message object may mutate the payload and http://bugs.python.org/issue19772 opened by r.david.murray #19775: Provide samefile() on Path objects http://bugs.python.org/issue19775 opened by pitrou #19776: Provide expanduser() on Path objects http://bugs.python.org/issue19776 opened by pitrou #19777: Provide a home() classmethod on Path objects http://bugs.python.org/issue19777 opened by pitrou #19780: Pickle 4 frame headers optimization http://bugs.python.org/issue19780 opened by serhiy.storchaka #19781: No SSL match_hostname() in ftplib http://bugs.python.org/issue19781 opened by christian.heimes #19782: No SSL match_hostname() in imaplib http://bugs.python.org/issue19782 opened by christian.heimes #19783: No SSL match_hostname() in nntplib http://bugs.python.org/issue19783 opened by christian.heimes #19784: No SSL match_hostname() in poplib http://bugs.python.org/issue19784 opened by christian.heimes #19785: No SSL match_hostname() in smtplib http://bugs.python.org/issue19785 opened by christian.heimes #19787: tracemalloc: set_reentrant() should not have to call PyThread_ http://bugs.python.org/issue19787 opened by haypo #19789: Improve wording of how to "undo" a call to logging.disable(lvl http://bugs.python.org/issue19789 opened by simonmweber #19791: test_pathlib should use can_symlink or skip_unless_symlink fro http://bugs.python.org/issue19791 opened by vajrasky #19792: pathlib does not support symlink in Windows XP http://bugs.python.org/issue19792 opened by vajrasky #19795: Formatting of True/False in docs http://bugs.python.org/issue19795 opened by serhiy.storchaka #19796: urllib2.HTTPError.reason is not documented as "Added in 2.7" http://bugs.python.org/issue19796 opened by moreati #19800: Write more detailed framing tests http://bugs.python.org/issue19800 opened by pitrou #19806: smtpd crashes when a multi-byte UTF-8 sequence is split betwee http://bugs.python.org/issue19806 opened by petri.lehtinen #19808: IDLE applies syntax highlighting to user input in its shell http://bugs.python.org/issue19808 opened by peter.otten #19809: Doc: subprocess should warn uses on race conditions when multi http://bugs.python.org/issue19809 opened by owenlin #19814: Document prefix matching behavior of argparse, particularly fo 
http://bugs.python.org/issue19814 opened by eli.bendersky #19816: test.regrtest: use tracemalloc to detect memory leaks? http://bugs.python.org/issue19816 opened by haypo #19817: tracemalloc add a memory limit feature http://bugs.python.org/issue19817 opened by haypo #19818: tracemalloc: comments on the doc http://bugs.python.org/issue19818 opened by haypo #19820: docs are missing info about module attributes http://bugs.python.org/issue19820 opened by eric.snow #19821: pydoc.ispackage() could be more accurate http://bugs.python.org/issue19821 opened by eric.snow #19824: string.Template: Add PHP-style variable expansion example http://bugs.python.org/issue19824 opened by techtonik #19826: Document that bug reporting by email is possible http://bugs.python.org/issue19826 opened by techtonik #19827: Optimize socket.settimeout() and socket.setblocking(): avoid s http://bugs.python.org/issue19827 opened by haypo #19828: test_site fails with -S flag http://bugs.python.org/issue19828 opened by vajrasky #19829: _pyio.BufferedReader and _pyio.TextIOWrapper destructor don't http://bugs.python.org/issue19829 opened by haypo #19830: test_poplib emits resource warning http://bugs.python.org/issue19830 opened by vajrasky #19831: tracemalloc: stop the module later at Python shutdown http://bugs.python.org/issue19831 opened by haypo #19832: XML version is ignored http://bugs.python.org/issue19832 opened by tkuhn #19833: asyncio: patches to document EventLoop and add more examples http://bugs.python.org/issue19833 opened by haypo #19834: Unpickling exceptions pickled by Python 2 http://bugs.python.org/issue19834 opened by doerwalter Most recent 15 issues with no replies (15) ========================================== #19834: Unpickling exceptions pickled by Python 2 http://bugs.python.org/issue19834 #19830: test_poplib emits resource warning http://bugs.python.org/issue19830 #19828: test_site fails with -S flag http://bugs.python.org/issue19828 #19827: Optimize socket.settimeout() and socket.setblocking(): avoid s http://bugs.python.org/issue19827 #19820: docs are missing info about module attributes http://bugs.python.org/issue19820 #19817: tracemalloc add a memory limit feature http://bugs.python.org/issue19817 #19816: test.regrtest: use tracemalloc to detect memory leaks? 
http://bugs.python.org/issue19816 #19808: IDLE applies syntax highlighting to user input in its shell http://bugs.python.org/issue19808 #19800: Write more detailed framing tests http://bugs.python.org/issue19800 #19796: urllib2.HTTPError.reason is not documented as "Added in 2.7" http://bugs.python.org/issue19796 #19789: Improve wording of how to "undo" a call to logging.disable(lvl http://bugs.python.org/issue19789 #19785: No SSL match_hostname() in smtplib http://bugs.python.org/issue19785 #19784: No SSL match_hostname() in poplib http://bugs.python.org/issue19784 #19783: No SSL match_hostname() in nntplib http://bugs.python.org/issue19783 #19782: No SSL match_hostname() in imaplib http://bugs.python.org/issue19782 Most recent 15 issues waiting for review (15) ============================================= #19833: asyncio: patches to document EventLoop and add more examples http://bugs.python.org/issue19833 #19831: tracemalloc: stop the module later at Python shutdown http://bugs.python.org/issue19831 #19830: test_poplib emits resource warning http://bugs.python.org/issue19830 #19828: test_site fails with -S flag http://bugs.python.org/issue19828 #19827: Optimize socket.settimeout() and socket.setblocking(): avoid s http://bugs.python.org/issue19827 #19817: tracemalloc add a memory limit feature http://bugs.python.org/issue19817 #19814: Document prefix matching behavior of argparse, particularly fo http://bugs.python.org/issue19814 #19806: smtpd crashes when a multi-byte UTF-8 sequence is split betwee http://bugs.python.org/issue19806 #19796: urllib2.HTTPError.reason is not documented as "Added in 2.7" http://bugs.python.org/issue19796 #19795: Formatting of True/False in docs http://bugs.python.org/issue19795 #19792: pathlib does not support symlink in Windows XP http://bugs.python.org/issue19792 #19791: test_pathlib should use can_symlink or skip_unless_symlink fro http://bugs.python.org/issue19791 #19787: tracemalloc: set_reentrant() should not have to call PyThread_ http://bugs.python.org/issue19787 #19785: No SSL match_hostname() in smtplib http://bugs.python.org/issue19785 #19784: No SSL match_hostname() in poplib http://bugs.python.org/issue19784 Top 10 most discussed issues (10) ================================= #19715: test_touch_common failure under Windows http://bugs.python.org/issue19715 40 msgs #19638: dtoa: conversion from '__int64' to 'int', possible loss of dat http://bugs.python.org/issue19638 23 msgs #19734: venv and ensurepip are affected by pip environment variable se http://bugs.python.org/issue19734 18 msgs #19743: test_gdb failures http://bugs.python.org/issue19743 15 msgs #19787: tracemalloc: set_reentrant() should not have to call PyThread_ http://bugs.python.org/issue19787 15 msgs #19509: No SSL match_hostname() in ftp, imap, nntp, pop, smtp modules http://bugs.python.org/issue19509 14 msgs #19780: Pickle 4 frame headers optimization http://bugs.python.org/issue19780 14 msgs #19744: ensurepip should refuse to install pip if SSL/TLS is not avail http://bugs.python.org/issue19744 12 msgs #19624: Switch constants in the errno module to IntEnum http://bugs.python.org/issue19624 10 msgs #19655: Replace the ASDL parser carried with CPython http://bugs.python.org/issue19655 10 msgs Issues closed (81) ================== #3158: Doctest fails to find doctests in extension modules http://bugs.python.org/issue3158 closed by zach.ware #11357: Add support for PEP 381 Mirror Authenticity http://bugs.python.org/issue11357 closed by eric.araujo #11489: json.dumps not parsable by 
json.loads (on Linux only) http://bugs.python.org/issue11489 closed by serhiy.storchaka #11508: Virtual Interfaces cause uuid._find_mac to raise a ValueError http://bugs.python.org/issue11508 closed by serhiy.storchaka #13633: Automatically convert character references in HTMLParser http://bugs.python.org/issue13633 closed by ezio.melotti #15204: Deprecate the 'U' open mode http://bugs.python.org/issue15204 closed by serhiy.storchaka #15397: Unbinding of methods http://bugs.python.org/issue15397 closed by alexandre.vassalotti #17201: Use allowZip64=True by default http://bugs.python.org/issue17201 closed by serhiy.storchaka #17523: Additional tests for the os module. http://bugs.python.org/issue17523 closed by willweaver #17747: Deprecate pickle fast mode http://bugs.python.org/issue17747 closed by alexandre.vassalotti #17787: Optimize pickling function dispatch in hot loops. http://bugs.python.org/issue17787 closed by alexandre.vassalotti #17810: Implement PEP 3154 (pickle protocol 4) http://bugs.python.org/issue17810 closed by alexandre.vassalotti #18326: Mention 'keyword only' for list.sort, improve glossary. http://bugs.python.org/issue18326 closed by zach.ware #18874: Add a new tracemalloc module to trace memory allocations http://bugs.python.org/issue18874 closed by haypo #19146: Improvements to traceback module http://bugs.python.org/issue19146 closed by gvanrossum #19308: Tools/gdb/libpython.py does not support GDB linked against Pyt http://bugs.python.org/issue19308 closed by pitrou #19358: Integrate "Argument Clinic" into CPython build http://bugs.python.org/issue19358 closed by larry #19467: asyncore documentation: redirect users to the new asyncio modu http://bugs.python.org/issue19467 closed by gvanrossum #19486: Bump up version of Tcl/Tk in building Python in Windows platfo http://bugs.python.org/issue19486 closed by zach.ware #19545: time.strptime exception context http://bugs.python.org/issue19545 closed by serhiy.storchaka #19551: PEP 453: Mac OS X installer integration http://bugs.python.org/issue19551 closed by ned.deily #19588: Silently skipped test in test_random http://bugs.python.org/issue19588 closed by zach.ware #19616: Fix test.test_importlib.util.test_both() to set __module__ http://bugs.python.org/issue19616 closed by brett.cannon #19620: tokenize documentation contains typos (argment instead of argu http://bugs.python.org/issue19620 closed by ezio.melotti #19622: Default buffering for input and output pipes in subprocess mod http://bugs.python.org/issue19622 closed by python-dev #19631: "exec" BNF in simple statements http://bugs.python.org/issue19631 closed by georg.brandl #19632: doc build warning http://bugs.python.org/issue19632 closed by georg.brandl #19639: Improve re match objects display http://bugs.python.org/issue19639 closed by ezio.melotti #19641: Add audioop.byteswap() http://bugs.python.org/issue19641 closed by serhiy.storchaka #19668: Add support of the cp1125 encoding http://bugs.python.org/issue19668 closed by serhiy.storchaka #19671: Option to select the optimization level on compileall http://bugs.python.org/issue19671 closed by Sworddragon #19674: Add introspection information for builtins http://bugs.python.org/issue19674 closed by larry #19691: Weird wording in RuntimeError doc http://bugs.python.org/issue19691 closed by pitrou #19694: test_venv failing on one of the Ubuntu buildbots http://bugs.python.org/issue19694 closed by python-dev #19706: Check if inspect needs updating for PEP 451 http://bugs.python.org/issue19706 closed by 
#19709: Check pythonrun.c is fully using PEP 451
 http://bugs.python.org/issue19709  closed by brett.cannon
#19716: test that touch doesn't change file contents
 http://bugs.python.org/issue19716  closed by pitrou
#19718: Path.glob() on case-insensitive Posix filesystems
 http://bugs.python.org/issue19718  closed by pitrou
#19722: Expose stack effect calculator to Python
 http://bugs.python.org/issue19722  closed by larry
#19724: test_pkgutil buildbot failure (related to PEP 451)
 http://bugs.python.org/issue19724  closed by eric.snow
#19727: os.utime(..., None) has poor resolution on Windows
 http://bugs.python.org/issue19727  closed by pitrou
#19729: [regression] str.format sublevel format parsing broken in Pyth
 http://bugs.python.org/issue19729  closed by python-dev
#19730: Clinic fixes: add converters with length, version directive, "
 http://bugs.python.org/issue19730  closed by larry
#19735: ssl._create_stdlib_context
 http://bugs.python.org/issue19735  closed by christian.heimes
#19739: Legit compiler warnings in new pickle code on 32-bit Windows
 http://bugs.python.org/issue19739  closed by pitrou
#19740: test_asyncio problems on 32-bit Windows
 http://bugs.python.org/issue19740  closed by gvanrossum
#19742: pathlib group unittests can fail
 http://bugs.python.org/issue19742  closed by pitrou
#19747: New failures in test_pickletools on 32-bit Windows Vista
 http://bugs.python.org/issue19747  closed by tim.peters
#19752: os.openpty() failure on Solaris 10: PermissionError: [Errno 13
 http://bugs.python.org/issue19752  closed by haypo
#19755: PEP 397 listed as accepted instead of finished.
 http://bugs.python.org/issue19755  closed by python-dev
#19759: Deprecation warnings in test_hmac
 http://bugs.python.org/issue19759  closed by serhiy.storchaka
#19760: Deprecation warnings in test_sysconfig and test_distutils
 http://bugs.python.org/issue19760  closed by serhiy.storchaka
#19762: Incorrect function documentation of _get_object_traceback and
 http://bugs.python.org/issue19762  closed by python-dev
#19765: test_asyncio: test_create_server() failed on "x86 Windows Serv
 http://bugs.python.org/issue19765  closed by haypo
#19773: Failed to build python 2.6.9 Solaris 10
 http://bugs.python.org/issue19773  closed by christian.heimes
#19774: strptime incorrect for weekday '0' when using week number form
 http://bugs.python.org/issue19774  closed by r.david.murray
#19778: Change re.compile display
 http://bugs.python.org/issue19778  closed by ezio.melotti
#19779: test_concurrent_futures crashes on 32-bit Windows Vista
 http://bugs.python.org/issue19779  closed by tim.peters
#19786: tracemalloc: remove arbitrary limit of 100 frames
 http://bugs.python.org/issue19786  closed by python-dev
#19788: Always run kill_python(_d).exe when (re-)building on Windows
 http://bugs.python.org/issue19788  closed by zach.ware
#19790: ctypes can't load a c++ dll if the dll calls COM on Windows 8
 http://bugs.python.org/issue19790  closed by yinkaisheng
#19793: Formatting of True/False in pathlib docs
 http://bugs.python.org/issue19793  closed by serhiy.storchaka
#19794: Formatting of True/False in decimal docs
 http://bugs.python.org/issue19794  closed by serhiy.storchaka
#19797: Solaris 10. mod_python build failed.
 http://bugs.python.org/issue19797  closed by christian.heimes
#19798: tracemalloc: rename "max_size" to "peak_size" in get_traced_me
 http://bugs.python.org/issue19798  closed by python-dev
#19799: Clarifications for the documentation of pathlib
 http://bugs.python.org/issue19799  closed by eli.bendersky
#19801: Concatenating bytes is much slower than concatenating strings
 http://bugs.python.org/issue19801  closed by benjamin.peterson
#19802: socket.SO_PRIORITY is missing
 http://bugs.python.org/issue19802  closed by python-dev
#19803: memoryview complain ctypes byte array are not native single ch
 http://bugs.python.org/issue19803  closed by skrah
#19804: test_uuid failing on 32-bit Windows Vista
 http://bugs.python.org/issue19804  closed by serhiy.storchaka
#19805: Revise FAQ Dictionary: consistent key order item
 http://bugs.python.org/issue19805  closed by python-dev
#19807: calculation of standard math returns incorrect value
 http://bugs.python.org/issue19807  closed by mark.dickinson
#19810: Display function signature in TypeErrors resulting from argume
 http://bugs.python.org/issue19810  closed by benjamin.peterson
#19811: test_pathlib: "The directory is not empty" error on os.rmdir()
 http://bugs.python.org/issue19811  closed by pitrou
#19812: reload() by symbol name
 http://bugs.python.org/issue19812  closed by brett.cannon
#19813: Add a new optional timeout parameter to socket.socket() constr
 http://bugs.python.org/issue19813  closed by gvanrossum
#19815: ElementTree segmentation fault in expat_start_ns_handler
 http://bugs.python.org/issue19815  closed by eli.bendersky
#19819: reversing a Unicode ligature doesn't work
 http://bugs.python.org/issue19819  closed by haypo
#19822: PEP process entrypoint
 http://bugs.python.org/issue19822  closed by christian.heimes
#19823: for-each on list aborts earlier than expected
 http://bugs.python.org/issue19823  closed by cost6
#19825: test b.p.o email interface
 http://bugs.python.org/issue19825  closed by techtonik

From breamoreboy at yahoo.co.uk Fri Nov 29 18:21:25 2013
From: breamoreboy at yahoo.co.uk (Mark Lawrence)
Date: Fri, 29 Nov 2013 17:21:25 +0000
Subject: [Python-Dev] PEP process entry point and ill fated initiatives
In-Reply-To: 
References: <20131129155819.7bf44bda@fsol>
Message-ID: 

On 29/11/2013 15:25, Kristján Valur Jónsson wrote:
>
> I know that Anatoly himself is a subject of long history here, but I myself
> have felt lessening affinity to the dev community in recent years. It feels like
> it is increasingly shutting itself in.
>

I entirely agree: the development community is getting to the stage where it stinks. Consider Issue 19755. It was a low priority, admittedly, but it took them 17 minutes and 47 seconds to action it. What is the world coming to? :)

-- 
Python is the second best programming language in the world.
But the best has yet to be invented. Christian Tismer

Mark Lawrence

From g.brandl at gmx.net Fri Nov 29 19:08:24 2013
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 29 Nov 2013 19:08:24 +0100
Subject: [Python-Dev] PEP process entry point and ill fated initiatives
In-Reply-To: 
References: 
Message-ID: 

Am 29.11.2013 10:16, schrieb Kristján Valur Jónsson:
> Reading the defect, I find people being unnecessarily obstructive.
>
> Closing the ticket, twice, is a rude and unnecessary action. How about
> acknowledging that these waters are dark and murky and helping to make
> things better?
>
> Surely, documenting processes can only be an improvement?
> A lot has changed in open source development in the last 10 years. The
> processes we have are starting to have the air of a cathedral around them.

First, please note that many of Anatoly's tickets are not closed out of hand, because they represent valid issues. In this specific case, the fact is simply that the PEP process has hardly anything at all to do with the peps repository, and therefore looking for PEP process info in its README.txt is irrelevant.

Yes, documenting processes is good, and that's what the devguide is for.

The PEP process is all about discussion. New developers who propose a PEP typically do not submit a PEP directly but are encouraged to discuss their idea first on python-ideas or python-dev. There is no "entry point" problem at all since they are then pointed to mail their draft to peps at python.org. This is also why it does not make sense to open pull requests against the repository (this may be what you allude to with your remark about changes in the last 10 years). The PEP repo is handled by the PEP editors (and other core committers). New PEPs are checked for format and readability by the editors.

In short, this issue tries to address a problem where none exists, and that is why it was closed. I cannot see why you would call that unnecessary, because I hope you don't think that the more open issues languish in the tracker, the better :)

cheers,
Georg

From brett at python.org Fri Nov 29 20:39:19 2013
From: brett at python.org (Brett Cannon)
Date: Fri, 29 Nov 2013 14:39:19 -0500
Subject: [Python-Dev] PEP process entry point and ill fated initiatives
In-Reply-To: 
References: <20131129155819.7bf44bda@fsol>
Message-ID: 

On Fri, Nov 29, 2013 at 10:25 AM, Kristján Valur Jónsson <kristjan at ccpgames.com> wrote:

> > -----Original Message-----
> > From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] On Behalf Of Antoine Pitrou
> > Sent: 29. nóvember 2013 14:58
> > To: python-dev at python.org
> > Subject: Re: [Python-Dev] PEP process entry point and ill fated initiatives
> >
> > On Fri, 29 Nov 2013 09:16:38 +0000
> > Kristján Valur Jónsson wrote:
> > > Closing the ticket, twice, is a rude and unnecessary action.
> >
> > Closing the ticket means "we don't believe there is an issue, or we don't
> > think it would be reasonable to fix it". If that's our judgement on the issue,
> > how is it rude to close it?
>
> Also, this attitude. Who are the "we" in this case? And why send
> messages to people by shutting doors?
>
> > > How about acknowledging that these waters are dark and murky and
> > > helping to make things better?
> >
> > Well, how about? If Anatoly has a concrete proposal, surely he can propose a
> > patch to make things better.
>
> Which is what he did. And instead of helpful ideas on how to improve his patch,
> the issue is closed. The acolytes have spoken.

But we can't accept his patches because he actively and vocally refuses to sign the CLA. It's a form of protest, and he knows full well that we can't accept the patches without him signing the CLA or a legal change in how the license is handled, neither of which is about to happen.
From mal at egenix.com Fri Nov 29 21:16:42 2013
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 29 Nov 2013 21:16:42 +0100
Subject: [Python-Dev] PEP process entry point and ill fated initiatives
In-Reply-To: 
References: <20131129155819.7bf44bda@fsol>
Message-ID: <5298F62A.8020904@egenix.com>

On 29.11.2013 16:25, Kristján Valur Jónsson wrote:
>>> How about acknowledging that these waters are dark and murky and
>>> helping to make things better?
>>
>> Well, how about? If Anatoly has a concrete proposal, surely he can propose a
>> patch to make things better.
>
> Which is what he did. And instead of helpful ideas on how to improve his patch,
> the issue is closed. The acolytes have spoken.

I'm not sure we're talking about the same issue. Anatoly did not present a patch which could have been improved: http://bugs.python.org/issue19822

Anatoly refuses to sign the Python CLA, so it's not surprising that he didn't provide a patch. His only contribution was a step-by-step process he put in the ticket description which doesn't represent the PEP process. Christian pointed him to the correct process, but he unfortunately just ignored this.

I'm afraid there's not much we can do to bridge the gap.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Nov 29 2013)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...       http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::::: Try our mxODBC.Connect Python Database Interface for free ! ::::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From steve at pearwood.info Sat Nov 30 00:00:59 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 30 Nov 2013 10:00:59 +1100
Subject: [Python-Dev] PEP process entry point and ill fated initiatives
In-Reply-To: 
References: <20131129155819.7bf44bda@fsol>
Message-ID: <20131129230058.GG2085@ando>

On Fri, Nov 29, 2013 at 03:25:14PM +0000, Kristján Valur Jónsson wrote about the PEP process:

> > > How about acknowledging that these waters are dark and murky and
> > > helping to make things better?
> >
> > Well, how about? If Anatoly has a concrete proposal, surely he can propose a
> > patch to make things better.
>
> Which is what he did. And instead of helpful ideas on how to improve his patch,
> the issue is closed. The acolytes have spoken.

As the first-time author of a PEP (450), I did not find that the waters were dark and murky, and I certainly did not feel that the "acolytes" tried to shut me out, or shut themselves in as you suggest. Nobody held my hand through the process of writing the PEP, but people pointed me in the right direction and were tolerant of my newbie mistakes.

The PEP process is quite well documented in the PEPs themselves, for somebody who is willing to do a bit of reading. Could the process be more clear? Could the existing docs be improved? Of course the answers to these questions are Yes; nothing is perfect. But overall, I found the PEP process to be less painful than I had feared.
-- 
Steven

From ncoghlan at gmail.com Sat Nov 30 03:54:30 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 30 Nov 2013 12:54:30 +1000
Subject: [Python-Dev] Track ResourceWarning warnings with tracemalloc
In-Reply-To: 
References: 
Message-ID: 

On 29 November 2013 22:12, Victor Stinner wrote:
> You can see that the socket was created by test_poplib at line 407
> (TestPOP3_TLSClass.setUp). It's more useful than the previous warning
> :-)

Excellent! I was hoping tracemalloc would make it practical to track some of those down :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
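For reference, the kind of report Victor describes can be produced directly with the new tracemalloc API (a rough sketch only; it assumes tracing was started, with enough frames recorded, before the leaked object was allocated):

import socket
import tracemalloc

tracemalloc.start(25)  # record allocation tracebacks, up to 25 frames each

sock = socket.socket()  # stand-in for a socket that is never closed

# look up where the object's memory was allocated
tb = tracemalloc.get_object_traceback(sock)
if tb is not None:
    for line in tb.format():
        print(line)  # prints the file, line number and source line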
From ncoghlan at gmail.com Sat Nov 30 04:39:11 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 30 Nov 2013 13:39:11 +1000
Subject: [Python-Dev] PEP process entry point and ill fated initiatives
In-Reply-To: 
References: <20131129155819.7bf44bda@fsol>
Message-ID: 

On 30 November 2013 01:25, Kristján Valur Jónsson wrote:
> I know that Anatoly himself is a subject of long history here, but I myself
> have felt lessening affinity to the dev community in recent years. It feels like
> it is increasingly shutting itself in.

Are you sure it isn't just that the focus of development has shifted to matters that aren't of interest or relevance to you? Many (perhaps even most) problems in Python don't require changes at the language or standard library level. We have cycle times measured in months, and impact times measured in years (especially since core development switched to Python 3 only mode for feature development). That's not typically something that is useful in day-to-day development tasks - it's only ever relevant in strategic terms.

One thing that has changed for me personally is that I've become far more blunt about refusing to respect those that explicitly (and vocally) refuse to respect us, yet seem to want to participate in core development anyway, and that's directly caused by Anatoly. He's still the only person who has been proposed for a permanent ban from all python.org controlled communication channels. That was averted after he voluntarily stopped annoying people for a while, but now he's back and I think the matter needs to be reconsidered.

He refuses to sign the CLA that would allow him to contribute directly, yet still wants people to fix things for him. He refuses to read design documentation, yet still wants people to listen to his ideas. He refuses to care about other people's use cases, yet still wants people to care about his.

As a case in point, Anatoly recently suggested that more diagrams in the documentation would be a good thing (http://bugs.python.org/issue19608). That's not an objectively bad idea, but producing and maintaining good diagrams is a high overhead activity, so we generally don't bother. When I suggested drawing some and sending a patch (I had forgotten about the CLA problem), Anatoly's response was that he's not a designer. So I countered with a suggestion that he explore what would be involved in adding the seqdiag and blockdiag sphinx extensions to our docs build process, since having those available would drastically lower the barrier to including and maintaining reasonable diagrams in the documentation, increasing the chance of such diagrams being included in the future. Silence.

"Hey some diagrams would be helpful!" is not a useful contribution; it's stating the bleeding obvious. Even nominating some *specific* parts of the guide where a diagram would have helped Anatoly personally would have been useful. The technical change I suggested about figuring out what we'd need to change to enable those extensions would *definitely* have been useful.

Another couple of incidents recently occurred on distutils-sig, where Anatoly started second-guessing the decision to work on PyPI 2 as a test-driven-development-from-day-one incrementally developed and released system, rather than trying to update the existing fragile PyPI code base directly, as well as complaining about the not-accessible-to-end-users design docs for the proposed end-to-end security model for PyPI. It would be one thing if he was voicing those concerns on his own blog (it's a free internet, he can do what he likes anywhere else). It's a problem when he's doing it on distutils-sig and the project issue trackers.

This isn't a matter of a naive newcomer that doesn't know any better. This is someone who has had PSF board members sit down with them at PyCon US to explain the CLA and why it is the way it is, who has had core developers offer them direct advice on how to propose suggestions in a way that is more likely to get people to listen, and when major issues have occurred in the past, we've even gone hunting for people to talk to him in his native language to make sure it wasn't a language barrier that was the root cause of the problem. *None* of it has resulted in any significant improvement in his behaviour.

Contributor time and emotional energy are the most precious resources an open source project has, and Anatoly is recklessly wasteful of both. We've spent years trying to coach him on being an effective collaborator and contributor, and it hasn't worked. This isn't a democracy, and neither is it a place for arbitrary people to get therapy on their inability to have any empathy for another person's point of view - in the end, passion for the language isn't enough; people have to demonstrate an ability to learn and be respectful of other people's time and energy, and Anatoly has well and truly proven he doesn't have either of those.

Anatoly has the entire rest of the internet to play in, we shouldn't have to put up with his disruptions when we're actually trying to get stuff done.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From breamoreboy at yahoo.co.uk Sat Nov 30 09:00:36 2013
From: breamoreboy at yahoo.co.uk (Mark Lawrence)
Date: Sat, 30 Nov 2013 08:00:36 +0000
Subject: [Python-Dev] PEP process entry point and ill fated initiatives
In-Reply-To: 
References: <20131129155819.7bf44bda@fsol>
Message-ID: 

On 30/11/2013 03:39, Nick Coghlan wrote:
> On 30 November 2013 01:25, Kristján Valur Jónsson wrote:
>> I know that Anatoly himself is a subject of long history here, but I myself
>> have felt lessening affinity to the dev community in recent years. It feels like
>> it is increasingly shutting itself in.
>
> Are you sure it isn't just that the focus of development has shifted
> to matters that aren't of interest or relevance to you?
>
> [...]
>
> Anatoly has the entire rest of the internet to play in, we shouldn't
> have to put up with his disruptions when we're actually trying to get
> stuff done.
FWIW I agree entirely with your sentiments. Several times recently I've seen him in action on the bug tracker. I've been sorely tempted to wear my XXXL size Asperger hat and state what I really think of his behaviour, but somehow managed to control myself and say nothing. Thankfully, with your words above, this seems to have paid off, as nobody has been sidetracked any further. And yes, I'm aware that at times I'm no angel, although I believe I can at least offer mitigating circumstances.

-- 
Python is the second best programming language in the world.
But the best has yet to be invented. Christian Tismer

Mark Lawrence

From kristjan at ccpgames.com Sat Nov 30 15:40:14 2013
From: kristjan at ccpgames.com (Kristján Valur Jónsson)
Date: Sat, 30 Nov 2013 14:40:14 +0000
Subject: [Python-Dev] PEP process entry point and ill fated initiatives
In-Reply-To: 
References: <20131129155819.7bf44bda@fsol>
Message-ID: 

Thanks for this long explanation, Nick. For someone who is not a compulsive reader of python-dev, it certainly helps by putting things in perspective. I think the problem you describe is a singular one that needs to be dealt with using singular methods. My own personal complaints have other causes, I hope, and I see now that bringing the two up as being somehow related is both incorrect and unwise. I'm sorry for stirring things up, I'll try to show more restraint in the future :)

Cheers,
Kristján

-----Original Message-----
From: Nick Coghlan [mailto:ncoghlan at gmail.com]
Sent: 30. nóvember 2013 03:39
To: Kristján Valur Jónsson
Cc: Antoine Pitrou; python-dev at python.org
Subject: Re: [Python-Dev] PEP process entry point and ill fated initiatives

[Nick's message, quoted in full above, snipped]
From christian at python.org Sat Nov 30 19:29:37 2013
From: christian at python.org (Christian Heimes)
Date: Sat, 30 Nov 2013 19:29:37 +0100
Subject: [Python-Dev] Verification of SSL cert and hostname made easy
Message-ID: 

Hi,

Larry has granted me a special pardon to add an outstanding fix for SSL, http://bugs.python.org/issue19509 . Right now most stdlib modules (ftplib, imaplib, nntplib, poplib, smtplib) neither support server name indication (SNI) nor check the subject name of the peer's certificate properly. The second issue is a major loophole because it allows man-in-the-middle attacks despite CERT_REQUIRED.

With CERT_REQUIRED OpenSSL verifies that the peer's certificate is directly or indirectly signed by a trusted root certification authority. With Python 3.4 the ssl module is able to use/load the system's trusted root certs on all major systems (Linux, Mac, BSD, Windows). On Linux and BSD it requires a properly configured system OpenSSL to locate the root certs. This usually works out of the box. On Mac, Apple's OpenSSL build is able to use the keychain API of OS X. I have added code for Windows' system store.

SSL socket code usually looks like this:

context = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
context.verify_mode = ssl.CERT_REQUIRED
# new, by default it loads certs trusted for Purpose.SERVER_AUTH
context.load_default_certs()

sock = socket.create_connection(("example.net", 443))
sslsock = context.wrap_socket(sock)

SSLContext.wrap_socket() wraps an ordinary socket into an SSLSocket. With verify_mode = CERT_REQUIRED OpenSSL ensures that the peer's SSL certificate is signed by a trusted root CA. In this example one very important step is missing. The peer may return *ANY* signed certificate for *ANY* hostname. These lines do NOT check that the certificate's information matches "example.net". An attacker can use any arbitrary certificate (e.g. for "www.evil.net"), get it signed and abuse it for MitM attacks on "mail.python.org". http://docs.python.org/3/library/ssl.html#ssl.match_hostname must be used to verify the cert. It's easy to forget it...

I have thought about multiple ways to fix the issue. At first I added a new argument "check_hostname" to all affected modules and implemented the check manually. For every module I had to modify several places for SSL and STARTTLS and add / change about 10 lines. The extra lines are required to properly shut down and close the connection when the cert doesn't match the hostname. I don't like the solution because it's tedious. Every 3rd party author has to copy the same code, too.
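In practice every call site ends up repeating a pattern roughly like this (just a sketch of the manual approach, reusing the context from the example above; it is not the actual patch):

import socket
import ssl

sock = socket.create_connection(("example.net", 443))
sslsock = context.wrap_socket(sock)
try:
    # the step that is easy to forget, repeated in every module
    ssl.match_hostname(sslsock.getpeercert(), "example.net")
except ssl.CertificateError:
    # shut down and close the connection before re-raising
    sslsock.shutdown(socket.SHUT_RDWR)
    sslsock.close()
    raise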
Then I came up with a better solution:

context = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
context.verify_mode = ssl.CERT_REQUIRED
context.load_default_certs()
context.check_hostname = True  # <-- NEW

sock = socket.create_connection(("example.net", 443))
# server_hostname is already used for SNI
sslsock = context.wrap_socket(sock, server_hostname="example.net")

This fix requires only a new SSLContext attribute and a small modification to SSLSocket.do_handshake():

if self.context.check_hostname:
    try:
        match_hostname(self.getpeercert(), self.server_hostname)
    except Exception:
        self.shutdown(_SHUT_RDWR)
        self.close()
        raise

Pros:

* match_hostname() is done in one central place
* the cert is matched as early as possible
* no extra arguments for APIs, a context object is enough
* library developers just have to add server_hostname to get SNI and hostname checks at the same time
* users of libraries can configure cert verification and checking on the same object
* missing checks will not pass silently

Cons:

* Doesn't work with OpenSSL < 0.9.8f (released 2007) because older versions lack SNI support. The ssl module raises an exception for server_hostname if SNI is not supported.

The default settings for all stdlib modules will still be verify_mode = CERT_NONE and check_hostname = False for maximum backward compatibility. Python 3.4 comes with a new function ssl.create_default_context() that returns a new context with best practice settings and loaded root CA certs. The settings are TLS 1.0, no weak and insecure ciphers (no MD5, no RC4), no compression (CRIME attack), CERT_REQUIRED and check_hostname = True (for client side only).
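With those pieces in place, a correct client needs only something like this (again just a sketch; create_default_context() is the new helper described above, which loads the root CAs and enables CERT_REQUIRED plus check_hostname):

import socket
import ssl

# best-practice context: trusted roots loaded, CERT_REQUIRED,
# check_hostname = True
context = ssl.create_default_context()

sock = socket.create_connection(("example.net", 443))
# a single argument now provides SNI and the hostname check together
sslsock = context.wrap_socket(sock, server_hostname="example.net")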
The peer may return *ANY* signed certificate > for *ANY* hostname. These lines do NOT check that the certificate's > information match "example.net". An attacker can use any arbitrary > certificate (e.g. for "www.evil.net"), get it signed and abuse it for > MitM attacks on "mail.python.org". > http://docs.python.org/3/library/ssl.html#ssl.match_hostname must be > used to verify the cert. It's easy to forget it... > > > I have thought about multiple ways to fix the issue. At first I added a > new argument "check_hostname" to all affected modules and implemented > the check manually. For every module I had to modify several places for > SSL and STARTTLS and add / change about 10 lines. The extra lines are > required to properly shutdown and close the connection when the cert > doesn't match the hostname. I don't like the solution because it's > tedious. Every 3rd party author has to copy the same code, too. > > Then I came up with a better solution: > > context = ssl.SSLContext(ssl.PROTOCOL_TLSv1) > context.verify_mode = ssl.CERT_REQUIRED > context.load_default_certs() > context.check_hostname = True # <-- NEW > > sock = socket.create_connection(("example.net", 443)) > # server_hostname is already used for SNI > sslsock = context.wrap_socket(sock, server_hostname="example.net") > > > This fix requires only a new SSLContext attribute and a small > modification to SSLSocket.do_handshake(): > > if self.context.check_hostname: > try: > match_hostname(self.getpeercert(), self.server_hostname) > except Exception: > self.shutdown(_SHUT_RDWR) > self.close() > raise > > > Pros: > > * match_hostname() is done in one central place > * the cert is matched as early as possible > * no extra arguments for APIs, a context object is enough > * library developers just have to add server_hostname to get SNI and > hostname checks at the same time > * users of libraries can configure cert verification and checking on the > same object > * missing checks will not pass silently > > Cons: > > * Doesn't work with OpenSSL < 0.9.8f (released 2007) because older > versions lack SNI support. The ssl module raises an exception for > server_hostname if SNI is not supported. > > > The default settings for all stdlib modules will still be verify_mode = > CERT_NONE and check_hostname = False for maximum backward compatibility. > Python 3.4 comes with a new function ssl.create_default_context() that > returns a new context with best practice settings and loaded root CA > certs. The settings are TLS 1.0, no weak and insecure ciphers (no MD5, > no RC4), no compression (CRIME attack), CERT_REQUIRED and check_hostname > = True (for client side only). > > http://bugs.python.org/issue19509 has a working patch for ftplib. > > Comments? If Larry is OK with it as RM (and it sounds like he is), +1 from me as well. Cheers, Nick. > > Christian > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sat Nov 30 23:16:05 2013 From: guido at python.org (Guido van Rossum) Date: Sat, 30 Nov 2013 14:16:05 -0800 Subject: [Python-Dev] Verification of SSL cert and hostname made easy In-Reply-To: References: Message-ID: Sounds good. Is another change for asyncio needed? 
On Sat, Nov 30, 2013 at 1:54 PM, Nick Coghlan wrote: > > On 1 Dec 2013 04:32, "Christian Heimes" wrote: > > > > Hi, > > > > Larry has granted me a special pardon to add an outstanding fix for SSL, > > http://bugs.python.org/issue19509 . Right now most stdlib modules > > (ftplib, imaplib, nntplib, poplib, smtplib) neither support server name > > indication (SNI) nor check the subject name of the peer's certificate > > properly. The second issue is a major loop-hole because it allows > > man-in-the-middle attack despite CERT_REQUIRED. > > > > With CERT_REQUIRED OpenSSL verifies that the peer's certificate is > > directly or indirectly signed by a trusted root certification authority. > > With Python 3.4 the ssl module is able to use/load the system's trusted > > root certs on all major systems (Linux, Mac, BSD, Windows). On Linux and > > BSD it requires a properly configured system openssl to locate the root > > certs. This usually works out of the box. On Mac Apple's openssl build > > is able to use the keychain API of OSX. I have added code for Windows' > > system store. > > > > SSL socket code usually looks like this: > > > > context = ssl.SSLContext(ssl.PROTOCOL_TLSv1) > > context.verify_mode = ssl.CERT_REQUIRED > > # new, by default it loads certs trusted for Purpose.SERVER_AUTH > > context.load_default_certs() > > > > sock = socket.create_connection(("example.net", 443)) > > sslsock = context.wrap_socket(sock) > > > > SSLContext.wrap_socket() wraps an ordinary socket into a SSLSocket. With > > verify_mode = CERT_REQUIRED OpenSSL ensures that the peer's SSL > > certificate is signed by a trusted root CA. In this example one very > > important step is missing. The peer may return *ANY* signed certificate > > for *ANY* hostname. These lines do NOT check that the certificate's > > information match "example.net". An attacker can use any arbitrary > > certificate (e.g. for "www.evil.net"), get it signed and abuse it for > > MitM attacks on "mail.python.org". > > http://docs.python.org/3/library/ssl.html#ssl.match_hostname must be > > used to verify the cert. It's easy to forget it... > > > > > > I have thought about multiple ways to fix the issue. At first I added a > > new argument "check_hostname" to all affected modules and implemented > > the check manually. For every module I had to modify several places for > > SSL and STARTTLS and add / change about 10 lines. The extra lines are > > required to properly shutdown and close the connection when the cert > > doesn't match the hostname. I don't like the solution because it's > > tedious. Every 3rd party author has to copy the same code, too. 
> > > > Then I came up with a better solution: > > > > context = ssl.SSLContext(ssl.PROTOCOL_TLSv1) > > context.verify_mode = ssl.CERT_REQUIRED > > context.load_default_certs() > > context.check_hostname = True # <-- NEW > > > > sock = socket.create_connection(("example.net", 443)) > > # server_hostname is already used for SNI > > sslsock = context.wrap_socket(sock, server_hostname="example.net") > > > > > > This fix requires only a new SSLContext attribute and a small > > modification to SSLSocket.do_handshake(): > > > > if self.context.check_hostname: > > try: > > match_hostname(self.getpeercert(), self.server_hostname) > > except Exception: > > self.shutdown(_SHUT_RDWR) > > self.close() > > raise > > > > > > Pros: > > > > * match_hostname() is done in one central place > > * the cert is matched as early as possible > > * no extra arguments for APIs, a context object is enough > > * library developers just have to add server_hostname to get SNI and > > hostname checks at the same time > > * users of libraries can configure cert verification and checking on the > > same object > > * missing checks will not pass silently > > > > Cons: > > > > * Doesn't work with OpenSSL < 0.9.8f (released 2007) because older > > versions lack SNI support. The ssl module raises an exception for > > server_hostname if SNI is not supported. > > > > > > The default settings for all stdlib modules will still be verify_mode = > > CERT_NONE and check_hostname = False for maximum backward compatibility. > > Python 3.4 comes with a new function ssl.create_default_context() that > > returns a new context with best practice settings and loaded root CA > > certs. The settings are TLS 1.0, no weak and insecure ciphers (no MD5, > > no RC4), no compression (CRIME attack), CERT_REQUIRED and check_hostname > > = True (for client side only). > > > > http://bugs.python.org/issue19509 has a working patch for ftplib. > > > > Comments? > > If Larry is OK with it as RM (and it sounds like he is), +1 from me as > well. > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sat Nov 30 23:51:17 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 30 Nov 2013 23:51:17 +0100 Subject: [Python-Dev] Verification of SSL cert and hostname made easy References: Message-ID: <20131130235117.622994e5@fsol> On Sat, 30 Nov 2013 19:29:37 +0100 Christian Heimes wrote: > This fix requires only a new SSLContext attribute and a small > modification to SSLSocket.do_handshake(): > > if self.context.check_hostname: > try: > match_hostname(self.getpeercert(), self.server_hostname) > except Exception: > self.shutdown(_SHUT_RDWR) > self.close() > raise Small nit: what happens if the server_hostname is None (i.e. wasn't passed to context.wrap_socket())? > The default settings for all stdlib modules will still be verify_mode = > CERT_NONE and check_hostname = False for maximum backward compatibility. > Python 3.4 comes with a new function ssl.create_default_context() that > returns a new context with best practice settings and loaded root CA > certs. The settings are TLS 1.0, no weak and insecure ciphers (no MD5, > no RC4), no compression (CRIME attack), CERT_REQUIRED and check_hostname > = True (for client side only). Sounds fine to me, thanks. Regards Antoine.