From ezio.melotti at gmail.com Sat Dec 1 01:04:09 2012 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Sat, 1 Dec 2012 02:04:09 +0200 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: References: <20121130170723.D44591C98E@psf.upfronthosting.co.za> <20121130210748.3679B250117@webabinitio.net> Message-ID: Hi, On Fri, Nov 30, 2012 at 11:52 PM, Brett Cannon wrote: > On Fri, Nov 30, 2012 at 4:07 PM, R. David Murray wrote: > >> On Fri, 30 Nov 2012 14:38:12 -0500, Brett Cannon >> wrote: >> > Do we have a graph of the historical trend of the number of bugs (or at >> > least the historical details stored somewhere)? I think we have had a >> net >> >> Not really.  Ezio made one by hand once, but there is nothing automated. >> > The one I made can be found here: https://docs.google.com/spreadsheet/ccc?key=0AplyAWXqkvHUdFF0SkVrT3VKcnRBZXNrR1hleHowWnc I've now updated it with the latest data. On Sheet 2 you can find additional graphs that show the releases of Python together with the data. Only final releases are included; alphas, betas, and rcs are not. The spreadsheet is a bit messy because I was experimenting with different kinds of graphs and trying to work around some limitations of Google Docs, but it should be good enough. > >> The historical details are stored only in the mailing list archives, as >> far as I know.  In theory I think you could re-calculate them from the >> Roundup DB, but for various reasons the numbers would probably come out >> slightly different.  Still, getting the data from the DB would be better >> than parsing the emails, since for one reason and another there are >> missing Friday reports, and reports that were issued on non-Friday >> dates. >> > One option I was considering is having the weekly report script append the results to a file and make it available on bugs.python.org, or even use it to generate graphs directly. This is something I have considered and planned to implement for a long time, but haven't done yet. >> > decrease in open bugs the last couple of weeks and it would be neat to >> see >> > an absolute and relative graph of the overall trend since Python 3.3.0 >> was >> > released. Also might make a nice motivator to try to close issues >> faster. =) >> > >> > Otherwise is the code public for this somewhere? I assume it's making an >> >> Yes.  It is in the software repository for our roundup instances: >> >> >> http://hg.python.org/tracker/python-dev/file/default/scripts/roundup-summary >> >> (Be warned that that isn't the location from which the script is >> executed, so it is possible for what is actually running to get out of >> sync with what is checked in at that location.) >> >> > XML-RPC call or something every week to get the results, but if I >> decide to >> >> Nope, it talks directly to the DB.  And as you will see, it is more >> than a bit gnarly. >> >> > I think I could also download the csv file and parse that to get whatever > data I wanted. > To figure out when an issue was closed you need to access its history, and that's not available through XML-RPC/csv IIRC. You should be able to figure out when an issue got created, though. Anyway, it's probably easier to implement something like what I mentioned earlier. > >> > do a little App Engine app to store historical data and do a graph I >> would >> > rather not have to figure all of this out from scratch. =) Although I >> could >> > I guess also parse the email if I wanted to ignore all other emails.
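As an aside, the append-to-a-file idea Ezio describes above needs very little code. A rough sketch, with a hypothetical CSV layout and placeholder counts -- the real numbers would have to come from Roundup, and this is not the tracker's actual script:

    import csv
    import datetime

    import matplotlib.pyplot as plt

    def append_week(path, opened, closed, total_open):
        # Append one row per weekly summary; the counts are placeholders here.
        row = [datetime.date.today().isoformat(), opened, closed, total_open]
        with open(path, "a", newline="") as f:
            csv.writer(f).writerow(row)

    def plot_trend(path):
        # Plot the absolute number of open issues over time.
        with open(path) as f:
            rows = list(csv.reader(f))
        dates = [datetime.datetime.strptime(r[0], "%Y-%m-%d") for r in rows]
        plt.plot(dates, [int(r[3]) for r in rows])
        plt.ylabel("open issues")
        plt.show()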
>> >> I'm not sure how one would go about integrating the above with an App >> Engine app.  I suspect that not quite enough information is available >> through the XML-RPC interface to replicate that script, but maybe you >> could manage just the open-close counting part of it.  I haven't >> looked at what it would take. >> > > It really depends on what statistics I cared about (e.g. there are fewer > than 4000 bugs while there are fewer than 25,000 closed bugs). If I just did > high-level statistics it wouldn't be bad, but if I try to track every issue > independently that might be annoying (and actually cost money for me, > although I already personally pay for py3ksupport.appspot.com so I can > probably piggyback off of that app's quota). We will see if this ever goes > anywhere. =) > > Another somewhat related project/experiment I've been working on is collecting stats about the patches available on the tracker. I put together a temporary page that allows you to enter the name of a module (or any file/path) and get a list of issues with patches that affect the specified module(s): http://wolfprojects.altervista.org/issues.html FTR this is based on the work done by anatoly (see links on the page). I'm planning to eventually integrate this into the tracker too, but lately I haven't had much time, so there's no ETA. Best Regards, Ezio Melotti From ncoghlan at gmail.com Sun Dec 2 08:08:28 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 2 Dec 2012 17:08:28 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Issue #16592: stringlib_bytes_join doesn't raise MemoryError on allocation In-Reply-To: <3YDg9w52qKzNGR@mail.python.org> References: <3YDg9w52qKzNGR@mail.python.org> Message-ID: On Sun, Dec 2, 2012 at 4:56 PM, christian.heimes wrote: > http://hg.python.org/cpython/rev/9af5a2611202 > changeset: 80672:9af5a2611202 > user: Christian Heimes > date: Sun Dec 02 07:56:42 2012 +0100 > summary: > Issue #16592: stringlib_bytes_join doesn't raise MemoryError on > allocation failure > > files: > Misc/NEWS | 3 +++ > Objects/stringlib/join.h | 1 + > 2 files changed, 4 insertions(+), 0 deletions(-) > > > diff --git a/Misc/NEWS b/Misc/NEWS > --- a/Misc/NEWS > +++ b/Misc/NEWS > @@ -10,6 +10,9 @@ > Core and Builtins > ----------------- > > +- Issue #16592: stringlib_bytes_join doesn't raise MemoryError on > allocation > + failure. > Please don't write NEWS entries in past tense like this - they're annoyingly ambiguous, as it isn't clear whether the entry is describing the reported problem or the fix for the problem. Describing just the new behaviour or the original problem and the fix is much easier to follow. For example: - Issue #16592: stringlib_bytes_join now correctly raises MemoryError on allocation failure. - Issue #16592: stringlib_bytes_join was triggering SystemError on allocation failure. It now correctly raises MemoryError. Issue titles for actual bugs generally don't make good NEWS entries, as they're typically a summary of the problem rather than the solution (RFEs are different, as there the issue title is often a good summary of the proposed change). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ncoghlan at gmail.com Sun Dec 2 12:58:16 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 2 Dec 2012 21:58:16 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Issue #16592: stringlib_bytes_join doesn't raise MemoryError on allocation In-Reply-To: <20121202105921.22ddce60@pitrou.net> References: <3YDg9w52qKzNGR@mail.python.org> <20121202105921.22ddce60@pitrou.net> Message-ID: On Sun, Dec 2, 2012 at 7:59 PM, Antoine Pitrou wrote: > Do you mean present tense? > Ah, you're right - my main objection is to describing the old broken behaviour, without describing the *new* behaviour. Any use of present tense should relate to the new behaviour after the commit. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From "ja...py" at farowl.co.uk Sun Dec 2 09:19:09 2012 From: "ja...py" at farowl.co.uk (Jeff Allen) Date: Sun, 02 Dec 2012 08:19:09 +0000 Subject: [Python-Dev] [Python-checkins] cpython: Issue #16592: stringlib_bytes_join doesn't raise MemoryError on allocation In-Reply-To: References: <3YDg9w52qKzNGR@mail.python.org> Message-ID: <50BB0EFD.9030905@farowl.co.uk> On 02/12/2012 07:08, Nick Coghlan wrote: > On Sun, Dec 2, 2012 at 4:56 PM, christian.heimes > > wrote: > > ... > diff --git a/Misc/NEWS b/Misc/NEWS > ... > +- Issue #16592: stringlib_bytes_join doesn't raise MemoryError on > allocation > + failure. > > > Please don't write NEWS entries in past tense like this - they're > annoyingly ambiguous, as it isn't clear whether the entry is > describing the reported problem or the fix for the problem. Describing > just the new behaviour or the original problem and the fix is much > easier to follow. For example: > > - Issue #16592: stringlib_bytes_join now correctly raises > MemoryError on allocation failure. > - Issue #16592: stringlib_bytes_join was triggering SystemError on > allocation failure. It now correctly raises MemoryError. > > Issue titles for actual bugs generally don't make good NEWS entries, > as they're typically a summary of the problem rather than the solution > (RFE's are different, as there the issue title is often a good summary > of the proposed change) > You mean please do (re-)write such statements in the past tense, when the news is that the statement is no longer true. I agree about the ambiguity that arises here, but there's a simple alternative to re-writing. Surely all that has been forgotten here is an enclosing "The following issues have been resolved:"? I think there's a lot to be said for cut and paste of actual titles on grounds of accuracy and speed (and perhaps scriptability). E.g. http://hg.python.org/jython/file/661a6baa10da/NEWS Jeff Allen -------------- next part -------------- An HTML attachment was scrubbed... URL: From petri at digip.org Mon Dec 3 09:13:58 2012 From: petri at digip.org (Petri Lehtinen) Date: Mon, 3 Dec 2012 10:13:58 +0200 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: References: <20121130170723.D44591C98E@psf.upfronthosting.co.za> Message-ID: <20121203081358.GS19614@p29> Brett Cannon wrote: > Do we have a graph of the historical trend of the number of bugs (or at least > the historical details stored somewhere)? I think we have had a net decrease in > open bugs the last couple of weeks and it would be neat to see an absolute and > relative graph of the overall trend since Python 3.3.0 was released. Also might > make a nice motivator to try to close issues faster. 
=) > > Otherwise is the code public for this somewhere? I assume it's making an > XML-RPC call or something every week to get the results, but if I decide to do > a little App Engine app to store historical data and do a graph I would rather > not have to figure all of this out from scratch. =) Although I could I guess > also parse the email if I wanted to ignore all other emails. A few months ago I made a script that downloads all python-dev mailman archives, scans them to find the summary messages, parses the messages and creates a graph using matplotlib. The script is available at https://gist.github.com/2723809. From ncoghlan at gmail.com Mon Dec 3 12:39:19 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 3 Dec 2012 21:39:19 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Issue #16592: stringlib_bytes_join doesn't raise MemoryError on allocation In-Reply-To: <50BB0EFD.9030905@farowl.co.uk> References: <3YDg9w52qKzNGR@mail.python.org> <50BB0EFD.9030905@farowl.co.uk> Message-ID: On Sun, Dec 2, 2012 at 6:19 PM, Jeff Allen <"ja...py"@farowl.co.uk> wrote: > On 02/12/2012 07:08, Nick Coghlan wrote: > > On Sun, Dec 2, 2012 at 4:56 PM, christian.heimes < > python-checkins at python.org> wrote: > >> ... >> diff --git a/Misc/NEWS b/Misc/NEWS >> ... >> >> +- Issue #16592: stringlib_bytes_join doesn't raise MemoryError on >> allocation >> + failure. >> > > Please don't write NEWS entries in past tense like this - they're > annoyingly ambiguous, as it isn't clear whether the entry is describing the > reported problem or the fix for the problem. Describing just the new > behaviour or the original problem and the fix is much easier to follow. For > example: > > - Issue #16592: stringlib_bytes_join now correctly raises MemoryError on > allocation failure. > - Issue #16592: stringlib_bytes_join was triggering SystemError on > allocation failure. It now correctly raises MemoryError. > > Issue titles for actual bugs generally don't make good NEWS entries, as > they're typically a summary of the problem rather than the solution (RFE's > are different, as there the issue title is often a good summary of the > proposed change) > > You mean please do (re-)write such statements in the past tense, when > the news is that the statement is no longer true. > > I agree about the ambiguity that arises here, but there's a simple > alternative to re-writing. Surely all that has been forgotten here is an > enclosing "The following issues have been resolved:"? I think there's a lot > to be said for cut and paste of actual titles on grounds of accuracy and > speed (and perhaps scriptability). > Readability matters - ambiguous release notes don't help anyone, and, like code, release notes are read by many more people than write them. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Mon Dec 3 20:43:48 2012 From: dholth at gmail.com (Daniel Holth) Date: Mon, 3 Dec 2012 14:43:48 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: How to use Obsoletes: The author of B decides A is obsolete. A releases an empty version of itself that Requires: B B Obsoletes: A The package manager says "These packages are obsolete: A". Would you like to remove them? 
User says "OK". On Wed, Nov 21, 2012 at 2:54 AM, Stephen J. Turnbull wrote: > PJ Eby writes: > > On Wed, Nov 21, 2012 at 12:00 AM, Stephen J. Turnbull < > stephen at xemacs.org>wrote: > > > > What I care about is when I'm using Gorgon, and there's something > > > "better" (or worse, "correct") to use in my application. > > > > Hence my suggestion for an Obsoleted-By field, in which Gorgon would be > > able to suggest alternatives. > > My bad, my precise intention was to follow up on your idea (which, > credit where credit is due, I had *not* hit upon independently). I > should have made that clear. > > (I really shouldn't be answering English email at a Japanese-speaking > conference, my brain thinks it knows what it's doing but shirazuni ? > ????????....) > > > > It might be a good idea to have a just-like-Amazon > > > > > > While-This-Package-Is-Great-You-Might-Also-Consider: > > > > > > field. > > > > Yeah, that's basically what Obsoleted-By is for. > > Well, Obsoleted-By is pretty strong language for suggesting possible > alternatives. But I suspect that few projects would really want to be > suggesting competitors' products *or* their own oldie-but-still-goodie > that they'd really like to obsolete ASAP (put an Obsoleted-By line in > every Python 2 distribution, anyone? :-) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/dholth%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Mon Dec 3 23:29:35 2012 From: larry at hastings.org (Larry Hastings) Date: Mon, 03 Dec 2012 14:29:35 -0800 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython Message-ID: <50BD27CF.1070303@hastings.org> Say there, the Python core development community! Have I got a question for you! *ahem* Which of the following four options do you dislike least? ;-) 1) CPython continues to provide no "function signature" objects (PEP 362) or inspect.getfullargspec() information for any function implemented in C. 2) We add new hand-coded data structures representing the metadata necessary for function signatures for builtins. Which means that, when defining arguments to functions in C, we'd need to repeat ourselves *even more* than we already do. 3) Builtin function arguments are defined using some seriously uncomfortable and impenetrable C preprocessor macros, which produce all the various types of output we need (argument processing code, function signature metadata, possibly the docstrings too). 4) Builtin function arguments are defined in a small DSL; these are expanded to code and data using a custom compile-time preprocessor step. All the core devs I've asked said "given all that, I'd prefer the hairy preprocessor macros". But by the end of the conversation they'd changed their minds to prefer the custom DSL. Maybe I'll make a believer out of you too--read on! I've named this DSL preprocessor "Argument Clinic", or Clinic for short**. Clinic works similarly to Ned Batchelder's brilliant "Cog" tool: http://nedbatchelder.com/code/cog/ You embed the input to Clinic in a comment in your C file, and the output is written out immediately after that comment. The output's overwritten every time the preprocessor is run. In short it looks something like this: /*[clinic] input to the DSL [clinic]*/ ... 
output from the DSL, overwritten every time ... /*[clinic end:]*/ The input to the DSL includes all the metadata about the function that we need for the function signature: * the name of the function, * the return annotation (if any), * each parameter to the function, including * its name, * its type (in C), * its default value, * and a per-parameter docstring; * and the docstring for the function as a whole. The resulting output contains: * the docstring for the function, * declarations for all your parameters, * C code handling all argument processing for you, * and a #define'd methoddef structure for adding the function to the module. I discussed this with Mark "HotPy" Shannon, and he suggested we break our existing C functions into two. We put the argument processing into its own function, generated entirely by Clinic, and have the implementation in a second function called from the first. I like this approach simply because it makes the code cleaner. (Note that this approach should not cause any overhead with a modern compiler, as both functions will be "static".) But it also provides an optimization opportunity for HotPy: it could read the metadata, and when generating the JIT'd code it could skip building the PyObjects and argument tuple (and possibly keyword argument dict), and the subsequent unpacking/decoding, and just call the implementation function directly, giving it a likely-measurable speed boost. And we can go further! If we add a new extension type API allowing you to register both functions, and external modules start using it, sophisticated Python implementations like PyPy might be able to skip building the tuple for extension type function calls--speeding those up too! Another plausible benefit: alternate implementations of Python could read the metadata--or parse the input to Clinic themselves--to ensure their reimplementations of the Python standard library conform to the same API! Clinic can also run general-purpose Python code ("/*[python]"). All output from "print" is redirected into the output section after the Python code. As you've no doubt already guessed, I've made a prototype of Argument Clinic. You can see it--and some sample conversions of builtins using it for argument processing--at this BitBucket repo: https://bitbucket.org/larry/python-clinic I don't claim that it's fabulous, production-ready code. But it's a definite start! To save you a little time, here's a preview of using Clinic for dbm.open(). The stuff at the same indent as a declaration are options; see the "clinic.txt" in the repo above for full documentation. /*[clinic] dbm.open -> mapping basename=dbmopen const char *filename; The filename to open. const char *flags="r"; How to open the file. "r" for reading, "w" for writing, etc. int mode=0666; default=0o666 If creating a new file, the mode bits for the new file (e.g. os.O_RDWR). Returns a database object. [clinic]*/ PyDoc_STRVAR(dbmopen__doc__, "dbm.open(filename[, flags=\'r\'[, mode=0o666]]) -> mapping\n" "\n" " filename\n" " The filename to open.\n" "\n" " flags\n" " How to open the file. \"r\" for reading, \"w\" for writing, etc.\n" "\n" " mode\n" " If creating a new file, the mode bits for the new file\n" " (e.g. 
os.O_RDWR).\n" "\n" "Returns a database object.\n" "\n"); #define DBMOPEN_METHODDEF \ {"open", (PyCFunction)dbmopen, METH_VARARGS | METH_KEYWORDS, dbmopen__doc__} static PyObject * dbmopen_impl(PyObject *self, const char *filename, const char *flags, int mode); static PyObject * dbmopen(PyObject *self, PyObject *args, PyObject *kwargs) { const char *filename; const char *flags = "r"; int mode = 0666; static char *_keywords[] = {"filename", "flags", "mode", NULL}; if (!PyArg_ParseTupleAndKeywords(args, kwargs, "s|si", _keywords, &filename, &flags, &mode)) return NULL; return dbmopen_impl(self, filename, flags, mode); } static PyObject * dbmopen_impl(PyObject *self, const char *filename, const char *flags, int mode) /*[clinic end:eddc886e542945d959b44b483258bf038acf8872]*/ As of this writing, I also have sample conversions in the following files available for your perusal: Modules/_cursesmodule.c Modules/_dbmmodule.c Modules/posixmodule.c Modules/zlibmodule.c Just search in C files for '[clinic]' and you'll find everything soon enough. As you can see, Clinic has already survived some contact with the enemy. I've already converted some tricky functions--for example, os.stat() and curses.window.addch(). The latter required adding a new positional-only processing mode for functions using a legacy argument processing approach. (See "clinic.txt" for more.) If you can suggest additional tricky functions to support, please do! Big unresolved questions: * How would we convert all the builtins to use Clinic? I fear any solution will involve some work by hand. Even if we can automate big chunks of it, fully automating it would require parsing arbitrary C. This seems like overkill for a one-shot conversion. (Mark Shannon says he has some ideas.) * How do we create the Signature objects? My current favorite idea: Clinic also generates a new, TBD C structure defining all the information necessary for the signature, which is also passed in to the new registration API (you remember, the one that takes both the argument-processing function and the implementation function). This is secreted away in some new part of the C function object. At runtime this is converted on-demand into a Signature object. Default values for arguments are represented in C as strings; the conversion process attempts eval() on the string, and if that works it uses the result, otherwise it simply passes through the string. * Right now Clinic paves over the PyArg_ParseTuple API for you. If we convert CPython to use Clinic everywhere, theoretically we could replace the parsing API with something cleaner and/or faster. Does anyone have good ideas (and time, and energy) here? * There's actually a fifth option, proposed by Brett Cannon. We constrain the format of docstrings for builtin functions to make them machine-readable, then generate the function signature objects from that. But consider: generating *everything* in the signature object may get a bit tricky (e.g. Parameter.POSITIONAL_ONLY), and this might gunk up the docstring. But the biggest unresolved question... is this all actually a terrible idea? //arry/ ** "Is this the right room for an argument?" "I've told you once...!" From greg at krypto.org Tue Dec 4 00:42:06 2012 From: greg at krypto.org (Gregory P. 
Smith) Date: Mon, 3 Dec 2012 15:42:06 -0800 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <50BD27CF.1070303@hastings.org> References: <50BD27CF.1070303@hastings.org> Message-ID: On Mon, Dec 3, 2012 at 2:29 PM, Larry Hastings wrote: > > Say there, the Python core development community! Have I got > a question for you! > > *ahem* > > Which of the following four options do you dislike least? ;-) > > 1) CPython continues to provide no "function signature" > objects (PEP 362) or inspect.getfullargspec() information > for any function implemented in C. > > yuck on #1, though this is what happens by default if we don't do anything nice. > 2) We add new hand-coded data structures representing the > metadata necessary for function signatures for builtins. > Which means that, when defining arguments to functions in C, > we'd need to repeat ourselves *even more* than we already do. > > yuck on #2. > 3) Builtin function arguments are defined using some seriously > uncomfortable and impenetrable C preprocessor macros, which > produce all the various types of output we need (argument > processing code, function signature metadata, possibly > the docstrings too). > Likely painful to maintain. C++ templates would likely be easier. > > 4) Builtin function arguments are defined in a small DSL; these > are expanded to code and data using a custom compile-time > preprocessor step. > > All the core devs I've asked said "given all that, I'd prefer the > hairy preprocessor macros". But by the end of the conversation > they'd changed their minds to prefer the custom DSL. Maybe I'll > make a believer out of you too--read on! > It always strikes me that C++ could be such a DSL that could likely be used for this purpose rather than defining and maintaining our own "yet another C preprocessor" step. But I don't have suggestions and we're not allowing C++ so... nevermind. :) > I've named this DSL preprocessor "Argument Clinic", or Clinic > for short**. Clinic works similarly to Ned Batchelder's brilliant > "Cog" tool: > http://nedbatchelder.com/code/**cog/ > > You embed the input to Clinic in a comment in your C file, > and the output is written out immediately after that comment. > The output's overwritten every time the preprocessor is run. > In short it looks something like this: > > /*[clinic] > input to the DSL > [clinic]*/ > > ... output from the DSL, overwritten every time ... > > /*[clinic end:]*/ > > The input to the DSL includes all the metadata about the > function that we need for the function signature: > > * the name of the function, > * the return annotation (if any), > * each parameter to the function, including > * its name, > * its type (in C), > * its default value, > * and a per-parameter docstring; > * and the docstring for the function as a whole. > > The resulting output contains: > > * the docstring for the function, > * declarations for all your parameters, > * C code handling all argument processing for you, > * and a #define'd methoddef structure for adding the > function to the module. > > > I discussed this with Mark "HotPy" Shannon, and he suggested we break > our existing C functions into two. We put the argument processing > into its own function, generated entirely by Clinic, and have the > implementation in a second function called from the first. I like > this approach simply because it makes the code cleaner. 
(Note that > this approach should not cause any overhead with a modern compiler, > as both functions will be "static".) > > But it also provides an optimization opportunity for HotPy: it could > read the metadata, and when generating the JIT'd code it could skip > building the PyObjects and argument tuple (and possibly keyword > argument dict), and the subsequent unpacking/decoding, and just call > the implementation function directly, giving it a likely-measurable > speed boost. > > And we can go further! If we add a new extension type API allowing > you to register both functions, and external modules start using it, > sophisticated Python implementations like PyPy might be able to skip > building the tuple for extension type function calls--speeding those > up too! > > Another plausible benefit: alternate implementations of Python could > read the metadata--or parse the input to Clinic themselves--to ensure > their reimplementations of the Python standard library conform to the > same API! > > > Clinic can also run general-purpose Python code ("/*[python]"). > All output from "print" is redirected into the output section > after the Python code. > > > As you've no doubt already guessed, I've made a prototype of > Argument Clinic. You can see it--and some sample conversions of > builtins using it for argument processing--at this BitBucket repo: > > https://bitbucket.org/larry/**python-clinic > > I don't claim that it's fabulous, production-ready code. But it's > a definite start! > > > To save you a little time, here's a preview of using Clinic for > dbm.open(). The stuff at the same indent as a declaration are > options; see the "clinic.txt" in the repo above for full documentation. > > /*[clinic] > dbm.open -> mapping > basename=dbmopen > > const char *filename; > The filename to open. > > const char *flags="r"; > How to open the file. "r" for reading, "w" for writing, etc. > > int mode=0666; > default=0o666 > If creating a new file, the mode bits for the new file > (e.g. os.O_RDWR). > > Returns a database object. > > [clinic]*/ > > PyDoc_STRVAR(dbmopen__doc__, > "dbm.open(filename[, flags=\'r\'[, mode=0o666]]) -> mapping\n" > "\n" > " filename\n" > " The filename to open.\n" > "\n" > " flags\n" > " How to open the file. \"r\" for reading, \"w\" for writing, > etc.\n" > "\n" > " mode\n" > " If creating a new file, the mode bits for the new file\n" > " (e.g. os.O_RDWR).\n" > "\n" > "Returns a database object.\n" > "\n"); > > #define DBMOPEN_METHODDEF \ > {"open", (PyCFunction)dbmopen, METH_VARARGS | METH_KEYWORDS, > dbmopen__doc__} > > static PyObject * > dbmopen_impl(PyObject *self, const char *filename, const char *flags, > int mode); > > static PyObject * > dbmopen(PyObject *self, PyObject *args, PyObject *kwargs) > { > const char *filename; > const char *flags = "r"; > int mode = 0666; > static char *_keywords[] = {"filename", "flags", "mode", NULL}; > > if (!PyArg_ParseTupleAndKeywords(**args, kwargs, > "s|si", _keywords, > &filename, &flags, &mode)) > return NULL; > > return dbmopen_impl(self, filename, flags, mode); > } > > static PyObject * > dbmopen_impl(PyObject *self, const char *filename, const char *flags, > int mode) > /*[clinic end:**eddc886e542945d959b44b483258bf**038acf8872]*/ > > > As of this writing, I also have sample conversions in the following files > available for your perusal: > Modules/_cursesmodule.c > Modules/_dbmmodule.c > Modules/posixmodule.c > Modules/zlibmodule.c > Just search in C files for '[clinic]' and you'll find everything soon > enough. 
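To make concrete how little machinery those parameter lines need: here is a rough sketch, in Python, of parsing declarations like const char *flags="r"; into (type, name, default) triples. This is only an illustration of the general approach, not Clinic's actual code:

    import re

    # Matches e.g. 'const char *flags="r";' -> type, name, optional default.
    PARAM_RE = re.compile(
        r'(?P<type>[A-Za-z_][\w\s*]*?)\s*'
        r'(?P<name>\w+)\s*'
        r'(?:=\s*(?P<default>[^;]+))?;')

    def parse_param(line):
        m = PARAM_RE.match(line.strip())
        if m is None:
            raise ValueError("not a parameter declaration: %r" % line)
        return m.group("type").strip(), m.group("name"), m.group("default")

    print(parse_param('const char *flags="r";'))
    # ('const char *', 'flags', '"r"')
    print(parse_param('int mode=0666;'))
    # ('int', 'mode', '0666')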
> > As you can see, Clinic has already survived some contact with the > enemy. I've already converted some tricky functions--for example, > os.stat() and curses.window.addch(). The latter required adding a > new positional-only processing mode for functions using a legacy > argument processing approach. (See "clinic.txt" for more.) If you > can suggest additional tricky functions to support, please do! > > > Big unresolved questions: > > * How would we convert all the builtins to use Clinic? I fear any > solution will involve some work by hand. Even if we can automate > big chunks of it, fully automating it would require parsing arbitrary > C. This seems like overkill for a one-shot conversion. > (Mark Shannon says he has some ideas.) > A lot of hand work. Sprints at pycon. etc. Automating nice chunks of it could be partially done for some easy cases such as things that only use ParseTuple today. > > * How do we create the Signature objects? My current favorite idea: > Clinic also generates a new, TBD C structure defining all the > information necessary for the signature, which is also passed in to > the new registration API (you remember, the one that takes both the > argument-processing function and the implementation function). This > is secreted away in some new part of the C function object. At > runtime this is converted on-demand into a Signature object. Default > values for arguments are represented in C as strings; the conversion > process attempts eval() on the string, and if that works it uses the > result, otherwise it simply passes through the string. > I think passing on the string if that doesn't work is wrong. It could lead to a behavior change not realized until runtime due to some other possibly unrelated thing causing the eval to fail. A failure to eval() one of these strings should result in an ImportError from the extension module's init or a fatal failure if it is a builtin. (I'm assuming these would be done at extension module import time at or after the end of the module init function) > > * Right now Clinic paves over the PyArg_ParseTuple API for you. > If we convert CPython to use Clinic everywhere, theoretically we > could replace the parsing API with something cleaner and/or faster. > Does anyone have good ideas (and time, and energy) here? > By "paves over" do you mean that Clinic is currently using the ParseTuple API in its generated code? Yes, we should do better. But don't hold Clinic up on that. In fact allowing a version of Clinic to work stand alone as a PyPI project and generate Python 2.7 and 3.2/3.3 extension module boilerplate could would increase its adoption and improve the quality of some existing extension modules that choose to use it. My first take on this would be to do the obvious and expand the code within the case/switch statement in the loop that ParseTuple ends up in directly so that we're just generating raw parameter validation and acceptance code based on the clinic definition. I've never liked things in C that parse a string at runtime to determine behavior. (please don't misinterpret that to suggest I don't like Python ;) > * There's actually a fifth option, proposed by Brett Cannon. We > constrain the format of docstrings for builtin functions to make > them machine-readable, then generate the function signature objects > from that. But consider: generating *everything* in the signature > object may get a bit tricky (e.g. Parameter.POSITIONAL_ONLY), and > this might gunk up the docstring. > > > But the biggest unresolved question... 
is this all actually a terrible > idea? > No it is not. I like it. I don't _like_ adding another C preprocessor but I think if we keep it very limited it is a perfectly reasonable thing to do as part of our build process. > > //arry/ > > > ** "Is this the right room for an argument?" > "I've told you once...!" > ______________________________**_________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/**mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/**mailman/options/python-dev/** > greg%40krypto.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Dec 4 00:57:13 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 4 Dec 2012 09:57:13 +1000 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: References: <50BD27CF.1070303@hastings.org> Message-ID: +1 to what Greg said. -- Sent from my phone, thus the relative brevity :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From v+python at g.nevcal.com Tue Dec 4 01:16:47 2012 From: v+python at g.nevcal.com (Glenn Linderman) Date: Mon, 03 Dec 2012 16:16:47 -0800 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: References: <50BD27CF.1070303@hastings.org> Message-ID: <50BD40EF.1070302@g.nevcal.com> On 12/3/2012 3:42 PM, Gregory P. Smith wrote: > > All the core devs I've asked said "given all that, I'd prefer the > hairy preprocessor macros". But by the end of the conversation > they'd changed their minds to prefer the custom DSL. Maybe I'll > make a believer out of you too--read on! > > > It always strikes me that C++ could be such a DSL that could likely be > used for this purpose rather than defining and maintaining our own > "yet another C preprocessor" step. But I don't have suggestions and > we're not allowing C++ so... nevermind. :) C++ has enough power to delude many (including me) into thinking that it could be used this way.... but in my experience, it isn't quite there. There isn't quite enough distinction between various integral types to achieve the goals I once had, anyway... and that was some 15 years ago... but for compatibility reasons, I doubt it has improved in that area. Glenn -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Tue Dec 4 02:21:40 2012 From: larry at hastings.org (Larry Hastings) Date: Mon, 03 Dec 2012 17:21:40 -0800 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: References: <50BD27CF.1070303@hastings.org> Message-ID: <50BD5024.6040505@hastings.org> On 12/03/2012 03:42 PM, Gregory P. Smith wrote: > On Mon, Dec 3, 2012 at 2:29 PM, Larry Hastings > wrote: > > Default > values for arguments are represented in C as strings; the conversion > process attempts eval() on the string, and if that works it uses the > result, otherwise it simply passes through the string. > > > I think passing on the string if that doesn't work is wrong. It could > lead to a behavior change not realized until runtime due to some other > possibly unrelated thing causing the eval to fail. Good point. I amend my proposal to say: we make this explicit rather than implicit. We declare an additional per-parameter flag that says "don't eval this, just pass through the string". 
In absence of this flag, the struct-to-Signature-izer runs eval on the string and complains noisily if it fails. > * Right now Clinic paves over the PyArg_ParseTuple API for you. > If we convert CPython to use Clinic everywhere, theoretically we > could replace the parsing API with something cleaner and/or faster. > Does anyone have good ideas (and time, and energy) here? > > > By "paves over" do you mean that Clinic is currently using the > ParseTuple API in its generated code? Yes. Specifically, it uses ParseTuple for "positional-only" argument processing, and ParseTupleAndKeywords for all others. You can see the latter in the sample output in my original email. > Yes, we should do better. But don't hold Clinic up on that. As I have not! > But the biggest unresolved question... is this all actually a terrible > idea? > > > No it is not. I like it. \o/ //arry/ From barry at python.org Mon Dec 3 23:37:13 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 3 Dec 2012 17:37:13 -0500 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <50BD27CF.1070303@hastings.org> References: <50BD27CF.1070303@hastings.org> Message-ID: <20121203173713.40ec5af2@resist.wooz.org> On Dec 03, 2012, at 02:29 PM, Larry Hastings wrote: >4) Builtin function arguments are defined in a small DSL; these > are expanded to code and data using a custom compile-time > preprocessor step. > >All the core devs I've asked said "given all that, I'd prefer the >hairy preprocessor macros". But by the end of the conversation >they'd changed their minds to prefer the custom DSL. Maybe I'll >make a believer out of you too--read on! The biggest question with generated code is always the effect on debugging. How horrible will it be when I have to step through argument parsing to figure out what's going wrong? -Barry From dholth at gmail.com Tue Dec 4 02:44:52 2012 From: dholth at gmail.com (Daniel Holth) Date: Mon, 3 Dec 2012 20:44:52 -0500 Subject: [Python-Dev] Accept just PEP-0426 In-Reply-To: References: <98BC16BE-3507-4D54-9AFC-8B1A0983B6F0@gmail.com> <4AB62E69A53048589648EA9A94CAC02A@gmail.com> <50ABF128.4020605@g.nevcal.com> Message-ID: On Tue, Nov 20, 2012 at 11:01 PM, Nick Coghlan wrote: > On Wed, Nov 21, 2012 at 1:20 PM, Nick Coghlan wrote: > >> On Wed, Nov 21, 2012 at 1:10 PM, PJ Eby wrote: >> >>> Conversely, if you have already installed a package that says it >>> "Obsoletes" another package, this does *not* tell you that the obsolete >>> package shouldn't still be installed! A replacement project doesn't >>> necessarily share the same API, and may exist in a different package >>> namespace altogether. >>> >> >> Then that's a bug in the metadata of the project misusing "Obsoletes", >> and should be reported as such. If the new package is not a drop-in >> replacement, then it has no business claiming to obsolete the other package. >> >> I think one of the big reasons this kind of use is rare in the Python >> community is that project name changes are almost always accompanied by >> *package* name changes, and as soon as you change the package name, you're >> changing the public API, and thus it is no longer appropriate to use >> Provides or Obsoletes, as the renamed project is no longer a drop-in >> replacement for the original. >> > > I realised that my comments above are more about the appropriate use of > "Provides", rather than "Obsoletes". 
For a practically useful "Obsoletes", > I think I'm inclined to agree with you, as "Obsoleted-By" provides a way > for a maintainer to explicitly declare that a project is no longer > receiving updates, and users should migrate to the replacement project if > they want to continue to receive fixes and improvements. The current > version of "Obsoletes" is, as Daniel describes, really only useful as > documentation. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > A few more changes to try to address some of the confusion about Requires-Dist: without re-designing the entire requirements system. PEP-426 was written only to add extras support to the format. The other changes, re-writing much of the PEP, have been an unfortunate side-effect. The file format's keys are case-insensitive. The version number should be in PEP 386 form. There are too many non-PEP-386 versions now and in the future to make it a must. Distribution (requirement) names are noted as being distinct from ``import x`` module names. Parenthetical explanation has balanced parens. "bundled" has been struck from the PEP. diff -r 55c706023fa2 -r 026aebf2265d pep-0426.txt --- a/pep-0426.txt Sun Nov 18 19:55:10 2012 +0200 +++ b/pep-0426.txt Mon Dec 03 20:36:13 2012 -0500 @@ -34,9 +34,9 @@ The syntax defined in this PEP is for use with Python distribution metadata files. The file format is a simple UTF-8 encoded Key: value -format with no maximum line length, followed by a blank line and an -arbitrary payload. The keys are case-insensitive. It is parseable by -the ``email`` module with an appropriate ``email.policy.Policy()``. +format with case-insensitive keys and no maximum line length, followed by +a blank line and an arbitrary payload. It is parseable by the ``email`` +module with an appropriate ``email.policy.Policy()``. When ``metadata`` is a Unicode string, ```email.parser.Parser().parsestr(metadata)`` is a serviceable parser. @@ -94,7 +94,7 @@ ::::::: A string containing the distribution's version number. This -field must be in the format specified in PEP 386. +field should be in the format specified in PEP 386. Example:: @@ -283,12 +283,13 @@ Each entry contains a string naming some other distutils project required by this distribution. -The format of a requirement string is identical to that of a -distutils project name (e.g., as found in the ``Name:`` field. -optionally followed by a version declaration within parentheses. +The format of a requirement string is identical to that of a distribution +name (e.g., as found in the ``Name:`` field) optionally followed by a +version declaration within parentheses. -The distutils project names should correspond to names as found -on the `Python Package Index`_. +The distribution names should correspond to names as found on the `Python +Package Index`_; often the same as, but distinct from, the module names +as accessed with ``import x``. Version declarations must follow the rules described in `Version Specifiers`_ @@ -305,7 +306,8 @@ Like Requires-Dist, but names dependencies needed while the distributions's distutils / packaging `setup.py` / `setup.cfg` is run. -Commonly used to generate a manifest from version control. +Commonly used to bring in extra compiler support or a package needed +to generate a manifest from version control. Examples:: @@ -318,17 +320,19 @@ Provides-Dist (multiple use) :::::::::::::::::::::::::::: -Each entry contains a string naming a Distutils project which -is contained within this distribution. 
This field *must* include -the project identified in the ``Name`` field, followed by the -version : Name (Version). +Each entry contains a string naming a requirement that is satisfied by +installing this distribution. This field *must* include the project +identified in the ``Name`` field, optionally followed by the version: +Name (Version). A distribution may provide additional names, e.g. to indicate that -multiple projects have been bundled together. For instance, source -distributions of the ``ZODB`` project have historically included -the ``transaction`` project, which is now available as a separate -distribution. Installing such a source distribution satisfies -requirements for both ``ZODB`` and ``transaction``. +multiple projects have been merged into and replaced by a single +distribution or to indicate that this project is a substitute for another. +For instance distribute (a fork of setuptools) could ``Provides-Dist`` +setuptools to prevent the conflicting package from being downloaded and +installed when distribute is already installed. A distribution that has +been merged with another might ``Provides-Dist`` the obsolete name(s) +to satisfy any projects that require the obsolete distribution's name. A distribution may also provide a "virtual" project name, which does not correspond to any separately-distributed project: such a name @@ -359,10 +363,9 @@ Version declarations can be supplied. Version numbers must be in the format specified in `Version Specifiers`_. -The most common use of this field will be in case a project name -changes, e.g. Gorgon 2.3 gets subsumed into Torqued Python 1.0. -When you install Torqued Python, the Gorgon distribution should be -removed. +The most common use of this field will be in case a project name changes, +e.g. Gorgon 2.3 gets renamed to Torqued Python 1.0. When you install +Torqued Python, the Gorgon distribution should be removed. Examples:: -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Tue Dec 4 02:54:37 2012 From: larry at hastings.org (Larry Hastings) Date: Mon, 03 Dec 2012 17:54:37 -0800 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <20121203173713.40ec5af2@resist.wooz.org> References: <50BD27CF.1070303@hastings.org> <20121203173713.40ec5af2@resist.wooz.org> Message-ID: <50BD57DD.3080209@hastings.org> On 12/03/2012 02:37 PM, Barry Warsaw wrote: > The biggest question with generated code is always the effect on debugging. > How horrible will it be when I have to step through argument parsing to figure > out what's going wrong? Right now, it's exactly like the existing solution. The generated function looks more or less like the top paragraph of the old code did; it declares variables, with defaults where appropriate, it calls PyArg_ParseMumbleMumble, if that fails it returns NULL, and otherwise it calls the impl function. There *was* an example of generated code in my original email; I encourage you to go back and take a look. For more you can look at the bitbucket repo; the output of the DSL is checked in there, as would be policy if we went with Clinic. TBH I think debuggability is one of the strengths of this approach. Unlike C macros, here all the code is laid out in front of you, formatted for easy reading. And it's not terribly complicated code. If we change the argument parsing code to use some new API, one hopes we will have the wisdom to make it /easier/ to read than PyArg_*. 
//arry/ From ncoghlan at gmail.com Tue Dec 4 02:57:33 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 4 Dec 2012 11:57:33 +1000 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <20121203173713.40ec5af2@resist.wooz.org> References: <50BD27CF.1070303@hastings.org> <20121203173713.40ec5af2@resist.wooz.org> Message-ID: On Tue, Dec 4, 2012 at 8:37 AM, Barry Warsaw wrote: > On Dec 03, 2012, at 02:29 PM, Larry Hastings wrote: > > >4) Builtin function arguments are defined in a small DSL; these > > are expanded to code and data using a custom compile-time > > preprocessor step. > > > >All the core devs I've asked said "given all that, I'd prefer the > >hairy preprocessor macros". But by the end of the conversation > >they'd changed their minds to prefer the custom DSL. Maybe I'll > >make a believer out of you too--read on! > > The biggest question with generated code is always the effect on debugging. > How horrible will it be when I have to step through argument parsing to > figure > out what's going wrong? > That's the advantage of the Cog-style approach that modifies the C source files in place and records checksums so the generator can easily tell when the code needs to be regenerated, either because it was changed via hand editing or because the definition changed. Yes, it violates the guideline of "don't check in generated code", but it makes debugging sane. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From hs at ox.cx Tue Dec 4 09:22:54 2012 From: hs at ox.cx (Hynek Schlawack) Date: Tue, 4 Dec 2012 09:22:54 +0100 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: References: <50BD27CF.1070303@hastings.org> Message-ID: <3C347236-EF4A-4C74-B422-A543797078D7@ox.cx> On 04.12.2012 at 00:42, Gregory P. Smith wrote: > * How would we convert all the builtins to use Clinic? I fear any > solution will involve some work by hand. Even if we can automate > big chunks of it, fully automating it would require parsing arbitrary > C. This seems like overkill for a one-shot conversion. > (Mark Shannon says he has some ideas.) > > A lot of hand work. Sprints at pycon. etc. Automating nice chunks of it could be partially done for some easy cases such as things that only use ParseTuple today. I don't see this as a big problem. There are always lots of people who want to get into Python hacking and don't know where to start. These are easily digestible pieces that can be *reviewed in a timely manner*, thus ideal. We could even do some (virtual) sprint just on that. As for Larry: great approach, I'm impressed!
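The checksum Nick mentions above is visible in Larry's sample output as the /*[clinic end:...]*/ line. A rough sketch of how such a guard might work -- the exact hashed payload is an assumption, not necessarily what Clinic records:

    import hashlib
    import re

    END_RE = re.compile(r'/\*\[clinic end:(?P<digest>[0-9a-f]*)\]\*/')

    def check_generated(section_text, end_line):
        # Refuse to regenerate over hand edits: compare the recorded
        # checksum with one recomputed from the generated section.
        recorded = END_RE.search(end_line).group('digest')
        actual = hashlib.sha1(section_text.encode('utf-8')).hexdigest()
        if recorded and recorded != actual:
            raise SystemExit('generated code was edited by hand; '
                             'refusing to overwrite')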
From victor.stinner at gmail.com Tue Dec 4 09:32:35 2012 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 4 Dec 2012 09:32:35 +0100 Subject: [Python-Dev] cpython: Issue #16455: On FreeBSD and Solaris, if the locale is C, the In-Reply-To: <50BDA73D.5090206@python.org> References: <3YFn1558hdzQFW@mail.python.org> <50BDA73D.5090206@python.org> Message-ID: Hi, 2012/12/4 Christian Heimes : > On 04.12.2012 03:23, victor.stinner wrote: >> http://hg.python.org/cpython/rev/c25635b137cc >> changeset: 80718:c25635b137cc >> parent: 80716:b845901cf702 >> user: Victor Stinner >> date: Tue Dec 04 01:34:47 2012 +0100 >> summary: >> Issue #16455: On FreeBSD and Solaris, if the locale is C, the >> ASCII/surrogateescape codec is now used, instead of the locale encoding, to >> decode the command line arguments. This change fixes inconsistencies with >> os.fsencode() and os.fsdecode() because these operating systems announces an >> ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice. >> >> files: >> Include/unicodeobject.h | 2 +- >> Lib/test/test_cmd_line_script.py | 9 +- >> Misc/NEWS | 6 + >> Objects/unicodeobject.c | 24 +- >> Python/fileutils.c | 240 +++++++++++++++++- >> 5 files changed, 241 insertions(+), 40 deletions(-) > > ... > >> @@ -3110,7 +3110,8 @@ >> *surrogateescape = 0; >> return 0; >> } >> - if (strcmp(errors, "surrogateescape") == 0) { >> + if (errors == "surrogateescape" >> + || strcmp(errors, "surrogateescape") == 0) { >> *surrogateescape = 1; >> return 0; >> } > > Victor, That doesn't look right. :) GCC is complaining about the code: > > Objects/unicodeobject.c: In function 'locale_error_handler': > Objects/unicodeobject.c:3113:16: warning: comparison with string literal > results in unspecified behavior [-Waddress] Oh, I forgot to commit this change in a separate commit. It's a micro-optimization. PyUnicode_EncodeFSDefault() calls PyUnicode_EncodeLocale(unicode, "surrogateescape"), and PyUnicode_DecodeFSDefaultAndSize() calls PyUnicode_DecodeLocaleAndSize(s, size, "surrogateescape"). I chose to compare the addresses because I expected GCC to generate the same address for "surrogateescape" in PyUnicode_EncodeFSDefault() and in locale_error_handler(); comparing pointers is faster than comparing the string contents. I've removed this micro-optimization. The code path is only used during Python startup, and I don't expect any real speedup. > I'm also getting additional warnings in PyUnicode_Format(). > > Objects/unicodeobject.c: In function 'PyUnicode_Format': > Objects/unicodeobject.c:13782:8: warning: 'arg.sign' may be used > uninitialized in this function [-Wmaybe-uninitialized] > Objects/unicodeobject.c:13893:33: note: 'arg.sign' was declared here > Objects/unicodeobject.c:13779:12: warning: 'str' may be used > uninitialized in this function [-Wmaybe-uninitialized] > Objects/unicodeobject.c:13894:15: note: 'str' was declared here These members *are* initialized, but it's hard even for me (the author of this code) to check them. I rewrote how these members are initialized to make the warnings quiet and also to simplify the code. Thanks for the review! Victor PS: I hope that I really fixed the FreeBSD/Solaris issue :-p
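For readers following the thread, a small illustration of the mechanism Victor's change relies on: with the ASCII codec plus the surrogateescape error handler, bytes that don't decode still survive a round trip. The sample bytes here are made up:

    # A byte an ISO-8859-1 C locale might hand over on the command line.
    raw = b'caf\xe9'

    # Undecodable bytes are smuggled through as lone surrogates...
    text = raw.decode('ascii', 'surrogateescape')
    print(ascii(text))  # 'caf\udce9'

    # ...and are restored unchanged when encoding back.
    assert text.encode('ascii', 'surrogateescape') == raw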
From solipsis at pitrou.net Tue Dec 4 10:08:51 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 4 Dec 2012 10:08:51 +0100 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython References: <50BD27CF.1070303@hastings.org> Message-ID: <20121204100851.193751c5@pitrou.net> On Mon, 03 Dec 2012 14:29:35 -0800, Larry Hastings wrote: > > /*[clinic] > dbm.open -> mapping > basename=dbmopen > > const char *filename; > The filename to open. So how does it handle the fact that filename can either be a unicode string or a fsencoding-encoded bytestring? And how does it do the right encoding/decoding dance, possibly platform-specific? > static char *_keywords[] = {"filename", "flags", "mode", NULL}; > > if (!PyArg_ParseTupleAndKeywords(args, kwargs, > "s|si", _keywords, > &filename, &flags, &mode)) > return NULL; I see, it doesn't :-) > But the biggest unresolved question... is this all actually a terrible > idea? I like the idea, but it needs more polishing. I don't think the various "duck types" accepted by Python can be expressed fully in plain C types (e.g. you must distinguish between taking all kinds of numbers or only an __index__-providing number). Regards Antoine. From ulrich.eckhardt at dominolaser.com Tue Dec 4 13:10:01 2012 From: ulrich.eckhardt at dominolaser.com (Ulrich Eckhardt) Date: Tue, 04 Dec 2012 13:10:01 +0100 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <50BD27CF.1070303@hastings.org> References: <50BD27CF.1070303@hastings.org> Message-ID: <50BDE819.8070404@dominolaser.com> On 03.12.2012 23:29, Larry Hastings wrote: [...autogen some code from special comment strings...] > /*[clinic] > dbm.open -> mapping > basename=dbmopen > > const char *filename; > The filename to open. > > const char *flags="r"; > How to open the file. "r" for reading, "w" for writing, etc. > > int mode=0666; > default=0o666 > If creating a new file, the mode bits for the new file > (e.g. os.O_RDWR). > > Returns a database object. > > [clinic]*/ Firstly, I like the idea. Even though this "autogenerate in-place" seems a bit strange at first, I don't think it really hurts in practice. Also, thanks for introducing me to the 'cog' tool; I think I'll use it now and then! This also brings me to a single question I have for your proposal: Why did you create another DSL instead of using Python, i.e. instead of using cog directly? Looking at the above, I could imagine this being written like this instead: /*[[[cog import pycognize with pycognize.function('dbmopen') as f: f.add_param('self') f.add_kwparam('filename', doc='The filename to open', c_type='char*') f.add_kwparam('flags', doc='How to open the file.', c_type='char*', default='r') f.set_result('mapping') ]]]*/ //[[[end]]] Cheers! Uli
Cheers! Uli From larry at hastings.org Tue Dec 4 16:05:32 2012 From: larry at hastings.org (Larry Hastings) Date: Tue, 04 Dec 2012 07:05:32 -0800 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <50BDE819.8070404@dominolaser.com> References: <50BD27CF.1070303@hastings.org> <50BDE819.8070404@dominolaser.com> Message-ID: <50BE113C.5050409@hastings.org> On 12/04/2012 04:10 AM, Ulrich Eckhardt wrote: > This also brings me to a single question I have for your proposal: Why > did you create another DSL instead of using Python, i.e. instead of > using cog directly? Looking at the above, I could imagine this being > written like this instead: Actually my original prototype was written using Cog. When I showed it to Guido at EuroPython, he suggested a DSL instead, as writing raw Python code for every single function would be far too wordy. I agree. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan_ml at behnel.de Tue Dec 4 16:36:06 2012 From: stefan_ml at behnel.de (Stefan Behnel) Date: Tue, 04 Dec 2012 16:36:06 +0100 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <50BD27CF.1070303@hastings.org> References: <50BD27CF.1070303@hastings.org> Message-ID: Larry Hastings, 03.12.2012 23:29: > Say there, the Python core development community! Have I got > a question for you! > > *ahem* > > Which of the following four options do you dislike least? ;-) > > 1) CPython continues to provide no "function signature" > objects (PEP 362) or inspect.getfullargspec() information > for any function implemented in C. I would love to see Cython generated functions look and behave completely like normal Python functions at some point, so this is the option I dislike most. > 2) We add new hand-coded data structures representing the > metadata necessary for function signatures for builtins. > Which means that, when defining arguments to functions in C, > we'd need to repeat ourselves *even more* than we already do. > > 3) Builtin function arguments are defined using some seriously > uncomfortable and impenetrable C preprocessor macros, which > produce all the various types of output we need (argument > processing code, function signature metadata, possibly > the docstrings too). > > 4) Builtin function arguments are defined in a small DSL; these > are expanded to code and data using a custom compile-time > preprocessor step. > [...] > * There's actually a fifth option, proposed by Brett Cannon. We > constrain the format of docstrings for builtin functions to make > them machine-readable, then generate the function signature objects > from that. But consider: generating *everything* in the signature > object may get a bit tricky (e.g. Parameter.POSITIONAL_ONLY), and > this might gunk up the docstring. Why not provide a constructor for signature objects that parses the signature from a string?
For a signature like

    def func(int arg1, float arg2, ExtType arg3, *, object arg4=None) -> ExtType2:
        ...

you'd just pass in this string:

    (arg1 : int, arg2 : float, arg3 : ExtType, *, arg4=None) -> ExtType2

or maybe prefixed by the function name, don't care. Might make it easier to pass it into the normal parser. For more than one alternative input type, use a tuple of types. For builtin types that are shadowed by C type names, pass "builtins.int" etc. Stefan From dmalcolm at redhat.com Tue Dec 4 17:47:23 2012 From: dmalcolm at redhat.com (David Malcolm) Date: Tue, 04 Dec 2012 11:47:23 -0500 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <50BD27CF.1070303@hastings.org> References: <50BD27CF.1070303@hastings.org> Message-ID: <1354639643.1829.78.camel@surprise> On Mon, 2012-12-03 at 14:29 -0800, Larry Hastings wrote: [...snip compelling sales pitch...] I like the idea. As noted elsewhere, sane generated C code is much easier to step through in the debugger than preprocessor macros (though "sane" in that sentence is begging the question, I guess, but the examples you post look good to me). It also seems cleaner to split the argument handling from the implementation of the function (iirc Cython already has an analogous split and can use this to bypass arg tuple creation). The proposal potentially also eliminates a source of bugs: mismatches between the format strings in PyArg_Parse* vs the underlying C types passed as varargs (which are a major pain for bigendian CPUs where int vs long screwups can really bite you). I got worried that this could introduce a bootstrapping issue (given that the clinic is implemented using Python itself), but since the generated code is checked in as part of the C source file, you always have the source you need to regenerate the interpreter. Presumably 3rd party extension modules could use this also, in which case the clinic tool could be something installed/packaged as part of Python 3.4? [...snip...] > Big unresolved questions: > > * How would we convert all the builtins to use Clinic? I fear any > solution will involve some work by hand. Even if we can automate > big chunks of it, fully automating it would require parsing arbitrary > C. This seems like overkill for a one-shot conversion. > (Mark Shannon says he has some ideas.) Potentially my gcc python plugin could be used to autogenerate things. FWIW I already have Python code running inside gcc that can parse the PyArg_* APIs: http://git.fedorahosted.org/cgit/gcc-python-plugin.git/tree/libcpychecker/PyArg_ParseTuple.py Though my plugin runs after the C preprocessor has been run, so it may be fiddly to use this to autogenerate patches. Hope this is helpful Dave From larry at hastings.org Tue Dec 4 20:04:09 2012 From: larry at hastings.org (Larry Hastings) Date: Tue, 04 Dec 2012 11:04:09 -0800 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <20121204100851.193751c5@pitrou.net> References: <50BD27CF.1070303@hastings.org> <20121204100851.193751c5@pitrou.net> Message-ID: <50BE4929.60309@hastings.org> On 12/04/2012 01:08 AM, Antoine Pitrou wrote: > Le Mon, 03 Dec 2012 14:29:35 -0800, > Larry Hastings a écrit : >> /*[clinic] >> dbm.open -> mapping >> basename=dbmopen >> >> const char *filename; >> The filename to open. > So how does it handle the fact that filename can either be a unicode > string or a fsencoding-encoded bytestring?
And how does it do the right > encoding/decoding dance, possibly platform-specific? > > [...] > I see, it doesn't :-) If you compare the Clinic-generated code to the current implementation of dbm.open (and all the other functions I've touched) you'll find the "format units" specified to PyArg_Parse* are identical. Thus I assert the replacement argument parsing is no worse (and no better) than what's currently shipping in Python. Separately, I contributed code that handles unicode vs bytes for filenames in a reasonably cross-platform way; see "path_converter" in Modules/posixmodule.c. (This shipped in Python 3.3.) And indeed, I have examples of using "path_converter" with Clinic in my branch. Along these lines, I've been contemplating proposing that Clinic specifically understand "path" arguments, distinctly from other string arguments, as they are both common and rarely handled correctly. My main fear is that I probably don't understand all their complexities either ;-) Anyway, this is certainly something we can consider *improving* for Python 3.4. But for now I'm trying to make Clinic an indistinguishable drop-in replacement. > I like the idea, but it needs more polishing. I don't think the various > "duck types" accepted by Python can be expressed fully in plain C types > (e.g. you must distinguish between taking all kinds of numbers or only > an __index__-providing number). Naturally I agree Clinic needs more polishing. But the problem you fear is already solved. Clinic allows precisely expressing any existing PyArg_ "format unit"** through a combination of the type of the parameter and its "flags". The flags only become necessary for types used by multiple format units; for example, s, z, es, et, es#, et#, y, and y# all map to char *, so it's necessary to disambiguate by using the "flags". The specific case you cite ("__index__-providing number") is already unambiguous; that's n, mapped to Py_ssize_t. There aren't any other format units that map to a Py_ssize_t, so we're done. ** Well, any format unit except w*. I don't handle it just because I wasn't sure how best to do so. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue Dec 4 20:21:39 2012 From: brett at python.org (Brett Cannon) Date: Tue, 4 Dec 2012 14:21:39 -0500 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <50BD27CF.1070303@hastings.org> References: <50BD27CF.1070303@hastings.org> Message-ID: On Mon, Dec 3, 2012 at 5:29 PM, Larry Hastings wrote: > > Say there, the Python core development community! Have I got > a question for you! > > *ahem* > > Which of the following four options do you dislike least? ;-) > > 1) CPython continues to provide no "function signature" > objects (PEP 362) or inspect.getfullargspec() information > for any function implemented in C. > > 2) We add new hand-coded data structures representing the > metadata necessary for function signatures for builtins. > Which means that, when defining arguments to functions in C, > we'd need to repeat ourselves *even more* than we already do. > > 3) Builtin function arguments are defined using some seriously > uncomfortable and impenetrable C preprocessor macros, which > produce all the various types of output we need (argument > processing code, function signature metadata, possibly > the docstrings too). 
> > 4) Builtin function arguments are defined in a small DSL; these > are expanded to code and data using a custom compile-time > preprocessor step. > > > All the core devs I've asked said "given all that, I'd prefer the > hairy preprocessor macros". But by the end of the conversation > they'd changed their minds to prefer the custom DSL. Maybe I'll > make a believer out of you too--read on! > > [snip] > * There's actually a fifth option, proposed by Brett Cannon. We > constrain the format of docstrings for builtin functions to make > them machine-readable, then generate the function signature objects > from that. But consider: generating *everything* in the signature > object may get a bit tricky (e.g. Parameter.POSITIONAL_ONLY), and > this might gunk up the docstring. > I should mention that I was one of the people Larry pitched this to and this fifth option was before I fully understood the extent the DSL supported the various crazy options needed to support all current use-cases in CPython. Regardless I fully support what Larry is proposing. -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Tue Dec 4 20:34:45 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 4 Dec 2012 14:34:45 -0500 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython References: <50BD27CF.1070303@hastings.org> <1354639643.1829.78.camel@surprise> Message-ID: <20121204143445.1b6289ef@resist.wooz.org> On Dec 04, 2012, at 11:47 AM, David Malcolm wrote: >As noted elsewhere, sane generated C code is much easier to step through >in the debugger than preprocessor macros (though "sane" in that sentence >is begging the question, I guess, but the examples you post look good to >me). And to me too. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From solipsis at pitrou.net Tue Dec 4 20:35:28 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 4 Dec 2012 20:35:28 +0100 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython References: <50BD27CF.1070303@hastings.org> <20121204100851.193751c5@pitrou.net> <50BE4929.60309@hastings.org> Message-ID: <20121204203528.727d1c4e@pitrou.net> On Tue, 04 Dec 2012 11:04:09 -0800 Larry Hastings wrote: > > Along these lines, I've been contemplating proposing that Clinic > specifically understand "path" arguments, distinctly from other string > arguments, as they are both common and rarely handled correctly. My > main fear is that I probably don't understand all their complexities > either ;-) > > Anyway, this is certainly something we can consider *improving* for > Python 3.4. But for now I'm trying to make Clinic an indistinguishable > drop-in replacement. > [...] > > Naturally I agree Clinic needs more polishing. But the problem you fear > is already solved. Clinic allows precisely expressing any existing > PyArg_ "format unit"** through a combination of the type of the > parameter and its "flags". Very nice then! Your work is promising, and I hope we'll see a version of it some day in Python 3.4 (or 3.4+k). Regards Antoine. 
From guido at python.org Tue Dec 4 22:17:02 2012 From: guido at python.org (Guido van Rossum) Date: Tue, 4 Dec 2012 13:17:02 -0800 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <20121204203528.727d1c4e@pitrou.net> References: <50BD27CF.1070303@hastings.org> <20121204100851.193751c5@pitrou.net> <50BE4929.60309@hastings.org> <20121204203528.727d1c4e@pitrou.net> Message-ID: On Tue, Dec 4, 2012 at 11:35 AM, Antoine Pitrou wrote: > On Tue, 04 Dec 2012 11:04:09 -0800 > Larry Hastings wrote: >> >> Along these lines, I've been contemplating proposing that Clinic >> specifically understand "path" arguments, distinctly from other string >> arguments, as they are both common and rarely handled correctly. My >> main fear is that I probably don't understand all their complexities >> either ;-) >> >> Anyway, this is certainly something we can consider *improving* for >> Python 3.4. But for now I'm trying to make Clinic an indistinguishable >> drop-in replacement. >> > [...] >> >> Naturally I agree Clinic needs more polishing. But the problem you fear >> is already solved. Clinic allows precisely expressing any existing >> PyArg_ "format unit"** through a combination of the type of the >> parameter and its "flags". > > Very nice then! Your work is promising, and I hope we'll see a version > of it some day in Python 3.4 (or 3.4+k). +1 for getting this into 3.4. Does it need a PEP, or just a bug tracker item + code review? I think the latter is fine -- it's probably better not to do too much bikeshedding but just to let Larry propose a patch, have it reviewed and submitted, and then iterate. It's also okay if it is initially used for only a subset of extension modules (and even if some functions/methods can't be expressed using it yet). -- --Guido van Rossum (python.org/~guido) From arigo at tunes.org Tue Dec 4 22:27:26 2012 From: arigo at tunes.org (Armin Rigo) Date: Tue, 4 Dec 2012 13:27:26 -0800 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: References: <50BD27CF.1070303@hastings.org> Message-ID: Hi, On Mon, Dec 3, 2012 at 3:42 PM, Gregory P. Smith wrote: > In fact allowing a version of Clinic to work stand alone as a > PyPI project and generate Python 2.7 and 3.2/3.3 extension module > boilerplate would increase its adoption and improve the quality of > some existing extension modules that choose to use it. I agree: the same idea applies equally well to all existing 3rd-party extension modules, and does not depend on new CPython C API functions (so far), so Clinic should be released as a PyPI project too. A bientôt, Armin. From brett at python.org Tue Dec 4 22:45:54 2012 From: brett at python.org (Brett Cannon) Date: Tue, 4 Dec 2012 16:45:54 -0500 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: References: <50BD27CF.1070303@hastings.org> <20121204100851.193751c5@pitrou.net> <50BE4929.60309@hastings.org> <20121204203528.727d1c4e@pitrou.net> Message-ID: On Tue, Dec 4, 2012 at 4:17 PM, Guido van Rossum wrote: > On Tue, Dec 4, 2012 at 11:35 AM, Antoine Pitrou > wrote: > > On Tue, 04 Dec 2012 11:04:09 -0800 > > Larry Hastings wrote: > >> > >> Along these lines, I've been contemplating proposing that Clinic > >> specifically understand "path" arguments, distinctly from other string > >> arguments, as they are both common and rarely handled correctly.
My > >> main fear is that I probably don't understand all their complexities > >> either ;-) > >> > >> Anyway, this is certainly something we can consider *improving* for > >> Python 3.4. But for now I'm trying to make Clinic an indistinguishable > >> drop-in replacement. > >> > > [...] > >> > >> Naturally I agree Clinic needs more polishing. But the problem you fear > >> is already solved. Clinic allows precisely expressing any existing > >> PyArg_ "format unit"** through a combination of the type of the > >> parameter and its "flags". > > > > Very nice then! Your work is promising, and I hope we'll see a version > > of it some day in Python 3.4 (or 3.4+k). > > +1 for getting this into 3.4. Does it need a PEP, or just a bug > tracker item + code review? I think the latter is fine -- it's > probably better not to do too much bikeshedding but just to let Larry > propose a patch, have it reviewed and submitted, and then iterate. > It's also okay if it is initially used for only a subset of extension > modules (and even if some functions/methods can't be expressed using > it yet). > I don't see a need for a PEP either; code review should be plenty since this doesn't change how the outside world views public APIs. And we can convert code iteratively so that shouldn't hold things up either. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Tue Dec 4 22:49:07 2012 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 5 Dec 2012 08:49:07 +1100 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <50BD27CF.1070303@hastings.org> References: <50BD27CF.1070303@hastings.org> Message-ID: On Tue, Dec 4, 2012 at 9:29 AM, Larry Hastings wrote: > To save you a little time, here's a preview of using Clinic for > dbm.open(). The stuff at the same indent as a declaration are > options; see the "clinic.txt" in the repo above for full documentation. > > /*[clinic] >... hand-written content ... > [clinic]*/ > > ... generated content ... > /*[clinic end:eddc886e542945d959b44b483258bf038acf8872]*/ > One thing I'm not entirely clear on. Do you run Clinic on a source file and it edits that file, or is it a step in the build process? Your description of a preprocessor makes me think the latter, but the style of code (eg the checksum) suggests the former. ChrisA From solipsis at pitrou.net Tue Dec 4 22:48:17 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 4 Dec 2012 22:48:17 +0100 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: References: <50BD27CF.1070303@hastings.org> <20121204100851.193751c5@pitrou.net> <50BE4929.60309@hastings.org> <20121204203528.727d1c4e@pitrou.net> Message-ID: <20121204224817.16c39931@pitrou.net> On Tue, 4 Dec 2012 16:45:54 -0500 Brett Cannon wrote: > > > > +1 for getting this into 3.4. Does it need a PEP, or just a bug > > tracker item + code review? I think the latter is fine -- it's > > probably better not to do too much bikeshedding but just to let Larry > > propose a patch, have it reviewed and submitted, and then iterate. > > It's also okay if it is initially used for only a subset of extension > > modules (and even if some functions/methods can't be expressed using > > it yet). > > > > I don't see a need for a PEP either; code review should be plenty since > this doesn't change how the outside world views public APIs. 
And we can > convert code iteratively so that shouldn't hold things up either. I think the DSL itself does warrant public exposure. It will be an element of the CPython coding style, if its use becomes widespread. Regards Antoine. From brett at python.org Tue Dec 4 22:54:27 2012 From: brett at python.org (Brett Cannon) Date: Tue, 4 Dec 2012 16:54:27 -0500 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <20121204224817.16c39931@pitrou.net> References: <50BD27CF.1070303@hastings.org> <20121204100851.193751c5@pitrou.net> <50BE4929.60309@hastings.org> <20121204203528.727d1c4e@pitrou.net> <20121204224817.16c39931@pitrou.net> Message-ID: On Tue, Dec 4, 2012 at 4:48 PM, Antoine Pitrou wrote: > On Tue, 4 Dec 2012 16:45:54 -0500 > Brett Cannon wrote: > > > > > > +1 for getting this into 3.4. Does it need a PEP, or just a bug > > > tracker item + code review? I think the latter is fine -- it's > > > probably better not to do too much bikeshedding but just to let Larry > > > propose a patch, have it reviewed and submitted, and then iterate. > > > It's also okay if it is initially used for only a subset of extension > > > modules (and even if some functions/methods can't be expressed using > > > it yet). > > > > > > > I don't see a need for a PEP either; code review should be plenty since > > this doesn't change how the outside world views public APIs. And we can > > convert code iteratively so that shouldn't hold things up either. > > I think the DSL itself does warrant public exposure. It will be an > element of the CPython coding style, if its use becomes widespread. > That's what the issue will tease out, so this isn't going in without some public scrutiny. But going through python-ideas for this I think is a bit much. I mean we don't clear every change to PEP 7 or 8 with the public and that directly affects people as well in terms of coding style. -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Dec 4 23:07:03 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 4 Dec 2012 23:07:03 +0100 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: References: <50BD27CF.1070303@hastings.org> <20121204100851.193751c5@pitrou.net> <50BE4929.60309@hastings.org> <20121204203528.727d1c4e@pitrou.net> <20121204224817.16c39931@pitrou.net> Message-ID: <20121204230703.7c6be047@pitrou.net> On Tue, 4 Dec 2012 16:54:27 -0500 Brett Cannon wrote: > On Tue, Dec 4, 2012 at 4:48 PM, Antoine Pitrou wrote: > > > On Tue, 4 Dec 2012 16:45:54 -0500 > > Brett Cannon wrote: > > > > > > > > +1 for getting this into 3.4. Does it need a PEP, or just a bug > > > > tracker item + code review? I think the latter is fine -- it's > > > > probably better not to do too much bikeshedding but just to let Larry > > > > propose a patch, have it reviewed and submitted, and then iterate. > > > > It's also okay if it is initially used for only a subset of extension > > > > modules (and even if some functions/methods can't be expressed using > > > > it yet). > > > > > > > > > > I don't see a need for a PEP either; code review should be plenty since > > > this doesn't change how the outside world views public APIs. And we can > > > convert code iteratively so that shouldn't hold things up either. > > > > I think the DSL itself does warrant public exposure. 
It will be an > > element of the CPython coding style, if its use becomes widespread. > > > > That's what the issue will tease out, so this isn't going in without some > public scrutiny. But going through python-ideas for this I think is a bit > much. I mean we don't clear every change to PEP 7 or 8 with the public and > that directly affects people as well in terms of coding style. Not necessarily python-ideas, but python-dev. (I hope we don't need a separate clinic-dev mailing-list, although it certainly sounds funny) Regards Antoine. From barry at python.org Tue Dec 4 23:09:47 2012 From: barry at python.org (Barry Warsaw) Date: Tue, 4 Dec 2012 17:09:47 -0500 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <20121204224817.16c39931@pitrou.net> References: <50BD27CF.1070303@hastings.org> <20121204100851.193751c5@pitrou.net> <50BE4929.60309@hastings.org> <20121204203528.727d1c4e@pitrou.net> <20121204224817.16c39931@pitrou.net> Message-ID: <20121204170947.1f8fa375@resist.wooz.org> On Dec 04, 2012, at 10:48 PM, Antoine Pitrou wrote: >I think the DSL itself does warrant public exposure. It will be an >element of the CPython coding style, if its use becomes widespread. We do have PEP 7 after all. No matter what, this stuff has to eventually be well documented outside of the tracker. -Barry From brian at python.org Tue Dec 4 23:10:54 2012 From: brian at python.org (Brian Curtin) Date: Tue, 4 Dec 2012 16:10:54 -0600 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: References: <50BD27CF.1070303@hastings.org> <20121204100851.193751c5@pitrou.net> <50BE4929.60309@hastings.org> <20121204203528.727d1c4e@pitrou.net> <20121204224817.16c39931@pitrou.net> Message-ID: On Tue, Dec 4, 2012 at 3:54 PM, Brett Cannon wrote: > But going through python-ideas for this I think is a bit much. It would never end. I think an issue on roundup could work just fine. From larry at hastings.org Tue Dec 4 23:17:09 2012 From: larry at hastings.org (Larry Hastings) Date: Tue, 04 Dec 2012 14:17:09 -0800 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: References: <50BD27CF.1070303@hastings.org> Message-ID: <50BE7665.6010404@hastings.org> On 12/04/2012 01:49 PM, Chris Angelico wrote: > One thing I'm not entirely clear on. Do you run Clinic on a source > file and it edits that file, or is it a step in the build process? > Your description of a preprocessor makes me think the latter, but the > style of code (eg the checksum) suggests the former. You run Clinic on a source file and it edits that file in-place (unless you use -o). It's not currently integrated into the build process. At what time Clinic gets run--manually or automatically--is TBD. Here's my blue-sky probably-overengineered proposal: we (and when I say "we" I mean "I") write a cross-platform C program that could be harmlessly but usefully integrated into the build process. First, we add a checksum for the *input* into the Clinic output. Next, when you run this program, you give it a C file as an argument. First it tries to find a working Python on your path. If it finds one, it uses that Python to run Clinic on the file, propagating any error code outward. If it doesn't find one, it understands enough of the Clinic format to scan the C file looking for Clinic blocks. If it finds one where the checksum doesn't match (for input or output!) 
it complains loudly and exits with an error code, hopefully bringing the build to a screeching halt. This would integrate Clinic into the build process without making the build reliant on having a Python interpreter available. I get the sneaking suspicion that I'm going to rewrite Clinic to run under either Python 2.7 or 3, //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Tue Dec 4 23:19:23 2012 From: larry at hastings.org (Larry Hastings) Date: Tue, 04 Dec 2012 14:19:23 -0800 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: References: <50BD27CF.1070303@hastings.org> <20121204100851.193751c5@pitrou.net> <50BE4929.60309@hastings.org> <20121204203528.727d1c4e@pitrou.net> <20121204224817.16c39931@pitrou.net> Message-ID: <50BE76EB.9070103@hastings.org> On 12/04/2012 02:10 PM, Brian Curtin wrote: > I think an issue on roundup could work just fine. http://bugs.python.org/issue16612 Cheers, //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Wed Dec 5 00:01:35 2012 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 5 Dec 2012 10:01:35 +1100 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <50BE7665.6010404@hastings.org> References: <50BD27CF.1070303@hastings.org> <50BE7665.6010404@hastings.org> Message-ID: On Wed, Dec 5, 2012 at 9:17 AM, Larry Hastings wrote: > Here's my blue-sky probably-overengineered proposal: we (and when I say "we" > I mean "I") write a cross-platform C program that could be harmlessly but > usefully integrated into the build process. First, we add a checksum for > the *input* into the Clinic output. Next, when you run this program, you > give it a C file as an argument. First it tries to find a working Python on > your path. If it finds one, it uses that Python to run Clinic on the file, > propagating any error code outward. If it doesn't find one, it understands > enough of the Clinic format to scan the C file looking for Clinic blocks. > If it finds one where the checksum doesn't match (for input or output!) it > complains loudly and exits with an error code, hopefully bringing the build > to a screeching halt. This would integrate Clinic into the build process > without making the build reliant on having a Python interpreter available. That would probably work, but it implies having two places that understand Clinic blocks (the main Python script, and the C binary), with the potential for one of them to have a bug. Is it possible, instead, to divide the build process in half, and actually use the newly-built Python to run all Clinic code? That would put some (maybe a lot of) restrictions on what functionality the Clinic parser is allowed to use, but if it can work, it'd be clean. (The main code of Clinic could still demand a fully-working Python if that's easier; I'm just suggesting making the "check the checksums" part of the same Python script as does the real work.) 
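To illustrate what I'm suggesting, here's a pure sketch of the "check the checksums" step. I'm assuming the recorded checksum is a hex SHA-1 over the generated text between the block markers, which may well not be what Larry's tool actually does:

    # Sketch: report stale Clinic blocks in the given files.
    # The block grammar and the SHA-1 assumption are guesses, not
    # Clinic's actual definition.
    import hashlib
    import re
    import sys

    BLOCK = re.compile(r'\[clinic\]\*/(.*?)/\*\[clinic end:([0-9a-f]{40})\]\*/',
                       re.DOTALL)

    def stale_blocks(path):
        text = open(path, encoding='utf-8').read()
        return [m.start() for m in BLOCK.finditer(text)
                if hashlib.sha1(m.group(1).encode('utf-8')).hexdigest()
                   != m.group(2)]

    if __name__ == '__main__':
        bad = {p: stale_blocks(p) for p in sys.argv[1:]}
        for path, offsets in bad.items():
            for offset in offsets:
                print('%s: stale clinic block at offset %d' % (path, offset))
        sys.exit(1 if any(bad.values()) else 0)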
ChrisA From ncoghlan at gmail.com Wed Dec 5 07:25:46 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 5 Dec 2012 16:25:46 +1000 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <50BE7665.6010404@hastings.org> References: <50BD27CF.1070303@hastings.org> <50BE7665.6010404@hastings.org> Message-ID: On Wed, Dec 5, 2012 at 8:17 AM, Larry Hastings wrote: > I get the sneaking suspicion that I'm going to rewrite Clinic to run under > either Python 2.7 or 3, > For bootstrapping purposes, isn't it enough to just ignore the checksums if there's no Python interpreter already built? We can have a commit hook that rejects a checkin if the checksums don't match so you can't push a change if you've modified the headers without regenerating them. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From pje at telecommunity.com Wed Dec 5 08:13:36 2012 From: pje at telecommunity.com (PJ Eby) Date: Wed, 5 Dec 2012 02:13:36 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Mon, Dec 3, 2012 at 2:43 PM, Daniel Holth wrote: > How to use Obsoletes: > > The author of B decides A is obsolete. > > A releases an empty version of itself that Requires: B > > B Obsoletes: A > > The package manager says "These packages are obsolete: A". Would you like to > remove them? > > User says "OK". > Um, no. Even if the authors of A and B are the same person, you can't remove A if there are other things on the user's system using it. The above scenario does not work *at all*, ever, except in the case where B is simply an updated version of A (i.e. identical API) -- in which case, why bother? To change the project name? (Then it should be "Formerly-named" or something like that, not "Obsoletes".) Please, *please* see the previous Catalog-SIG discussion I linked: this is only one of multiple metadata fields that were thoroughly debunked in that discussion as completely useless for automated dependency management. From donald.stufft at gmail.com Wed Dec 5 08:46:11 2012 From: donald.stufft at gmail.com (Donald Stufft) Date: Wed, 5 Dec 2012 02:46:11 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wednesday, December 5, 2012 at 2:13 AM, PJ Eby wrote: > On Mon, Dec 3, 2012 at 2:43 PM, Daniel Holth wrote: > > How to use Obsoletes: > > > > The author of B decides A is obsolete. > > > > A releases an empty version of itself that Requires: B > > > > B Obsoletes: A > > > > The package manager says "These packages are obsolete: A". Would you like to > > remove them? > > > > User says "OK". > > Um, no. Even if the authors of A and B are the same person, you > can't remove A if there are other things on the user's system using > it. The above scenario does not work *at all*, ever, except in the > case where B is simply an updated version of A (i.e. identical API) -- > in which case, why bother? To change the project name? (Then it > should be "Formerly-named" or something like that, not "Obsoletes".)
You can automatically uninstall A from B in an automatic dependency management system. I *think* RPM does this, at the very least I believe it refuses to install B if A is already there (and the reverse as well).* There's nothing preventing an installer from, during its attempt to install B, seeing that it Obsoletes A, looking at what depends on A, and warning the user about what is going to happen. I think Obsoletes, as-is, is an alright bit of information. I think the biggest flaw with Obsoletes isn't in Obsoletes itself, but in the lack of a Conflicts tag that has the same functionality (minimally a refusal to install both, possibly uninstalling the previous one after a prompt to the user). Obsoletes has the semantics of a logical successor (typically renames) while Conflicts should have the semantics of a competitor. distribute would Conflict with setuptools, foo2 would Obsolete foo. * I could be wrong about RPM's treatment of Obsoletes > Please, *please* see the previous Catalog-SIG discussion I linked: > this is only one of multiple metadata fields that were thoroughly > debunked in that discussion as completely useless for automated > dependency management. I don't see this in this thread, could you link it again? > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org (mailto:Python-Dev at python.org) > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/donald.stufft%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.badger at gmail.com Wed Dec 5 16:42:48 2012 From: a.badger at gmail.com (Toshio Kuratomi) Date: Wed, 5 Dec 2012 07:42:48 -0800 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20121205154247.GB2613@unaka.lan> On Wed, Dec 05, 2012 at 02:46:11AM -0500, Donald Stufft wrote: > You can automatically uninstall A from B in an automatic dependency > management system. I *think* RPM does this, at the very least This is correct. > I believe it refuses to install B if A is already there (and the reverse > as well).* I'd have to test this but I believe you are correct about the first. Not sure about the reverse. > There's nothing preventing an installer from, during its attempt to > install B, seeing that it Obsoletes A, looking at what depends on A, > and warning the user about what is going to happen. In rpm-land, if something depended on A and nothing besides the actual A package provided A, rpm will refuse to install B. But rpm is meant to be used unattended, so different package managers could certainly choose to prompt. For package renames, package B would have both an Obsoletes: A <= $OLD_VERSION and a Provides: A = NEW_VERSION
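If it helps, here's roughly what that logic looks like from the installer's side. This is only a sketch: the metadata accessors are invented, and rpm's real resolver handles many more cases:

    # Sketch: find installed dists that a new dist's Obsoletes clauses
    # cover. 'installed' maps names to metadata dicts with a 'Version'
    # key; version_lt is whatever version comparison the tool uses.
    def obsoleted_installed(new_meta, installed, version_lt):
        for clause in new_meta.get('Obsoletes', []):
            name, sep, bound = clause.partition(' <= ')
            old = installed.get(name.strip())
            if old is not None and (not sep or version_lt(old['Version'], bound)):
                # Replacing old is only safe because new_meta also carries
                # Provides: for the old name, keeping dependents satisfied.
                yield old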
-Toshio -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From chris at simplistix.co.uk Wed Dec 5 17:08:46 2012 From: chris at simplistix.co.uk (Chris Withers) Date: Wed, 05 Dec 2012 16:08:46 +0000 Subject: [Python-Dev] slightly misleading Popen.poll() docs Message-ID: <50BF718E.5080604@simplistix.co.uk> Hi All, Would anyone object to me making a change to the docs for 2.6, 2.7 and 3.x to clarify the following: http://docs.python.org/3/library/subprocess.html#subprocess.Popen.poll A couple of my colleagues have ended up writing code like this:

    proc = Popen(['some', 'thing'])
    code = proc.poll()
    if code:
        raise Exception('An error happened: %s' % code)

...on the back of the fact that it appears to work if your process terminates *really* quickly, *and* that the docs say the returncode is set by poll() (*sigh*).
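For the record, what they should have written (or so I'd argue) is the blocking form, where returncode is guaranteed to be set:

    # wait() blocks until the child exits and returns the real return
    # code, so there's no None case to trip over.
    from subprocess import Popen

    proc = Popen(['some', 'thing'])
    code = proc.wait()
    if code:
        raise Exception('An error happened: %s' % code)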
The correct answer is > to use the wait() method (or communicate()), which is described two > lines below poll(). I agree, however, I also: - don't see any harm in the change I propose - do see a slight improvement for the comprehending impaired ;-) > May I suggest your colleagues didn't read the doc at all? One of them quoted the docs at me at proof that his code must be correct ;-) Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From steve at pearwood.info Wed Dec 5 18:15:08 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 06 Dec 2012 04:15:08 +1100 Subject: [Python-Dev] slightly misleading Popen.poll() docs In-Reply-To: <50BF718E.5080604@simplistix.co.uk> References: <50BF718E.5080604@simplistix.co.uk> Message-ID: <50BF811C.4070708@pearwood.info> On 06/12/12 03:08, Chris Withers wrote: > I'd like to change the docs for poll() to say: > > """ > Check if child process has terminated. > If it has, the returncode attribute will be set and that value will be returned. > If it has not, None will be returned and the returncode attribute will remain None. > """ > > Any objections? Possibly because it is 4am here, I had to read this three times to understand it. How is this instead? """ Check if child process has terminated. Returns None while the child is still running, any non-None value means that the child has terminated. In either case, the return value is also available from the instance's returncode attribute. """ -- Steven From solipsis at pitrou.net Wed Dec 5 19:55:47 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 5 Dec 2012 19:55:47 +0100 Subject: [Python-Dev] slightly misleading Popen.poll() docs References: <50BF718E.5080604@simplistix.co.uk> <50BF811C.4070708@pearwood.info> Message-ID: <20121205195547.27740dfb@pitrou.net> On Thu, 06 Dec 2012 04:15:08 +1100 Steven D'Aprano wrote: > On 06/12/12 03:08, Chris Withers wrote: > > > I'd like to change the docs for poll() to say: > > > > """ > > Check if child process has terminated. > > If it has, the returncode attribute will be set and that value will be returned. > > If it has not, None will be returned and the returncode attribute will remain None. > > """ > > > > Any objections? > > Possibly because it is 4am here, I had to read this three times to understand it. > How is this instead? > > """ > Check if child process has terminated. Returns None while the child is still running, > any non-None value means that the child has terminated. In either case, the return > value is also available from the instance's returncode attribute. > """ I like this wording. Regards Antoine. From pje at telecommunity.com Wed Dec 5 22:10:14 2012 From: pje at telecommunity.com (PJ Eby) Date: Wed, 5 Dec 2012 16:10:14 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Dec 5, 2012 at 2:46 AM, Donald Stufft wrote: > There's nothing preventing an installer from, during it's attempt to > install B, see it Obsoletes A, looking at what depends on A and > warning the user what is going to happen and prompt it. Unless the user wrote those things that depend on A, they aren't going to be in a position to do anything about it. 
(Contrast with a distro, where dependencies are indirect - the other package will depend on an abstraction provided by both A and B, rather than directly depending on A *or* B.) (Also note that all the user knows at this point is that the author of B *claims* to obsolete A, not that the authority managing the repository as a whole has decreed B to obsolete A.) > You can automatically uninstall A from B in an automatic dependency management system My point is that this can only work if the "obsoleting" is effectively just a rename, in which case the field should be "renames", or better still, "renamed-to" on the originating package. As I've mentioned repeatedly, Obsoleted-By handles more use cases than Obsoletes, and has at least one practical automated use case (notifying a developer that their project is depending on something that's obsolete). Also, the example given as a use case in the PEP (Gorgon to Torqued) is not just wrong, it's *actively misleading*. Gorgon and Torqued are transparent renames of Medusa and Twisted, which do not share a common API and thus cannot be used as the subject of any automated processing (in the case of Obsoletes) without doing some kind of PyPI metadata search for every package installed every time a package is installed. > I think Obsoletes as is an alright bit of information. 1. It cannot be used to prevent the installation of an obsolete package without a PyPI metadata search, since you must examine every *other* package on PyPI to find out whether some package obsoletes the one you're trying to install. 2. Unlike RPM, where metadata is provided by a trusted third party, Obsoletes can be specified by any random forker (no pun intended), which makes this information a mere advertisement... and an advertisement to the wrong audience at that, because they must have *already* found B in order to discover that it replaces A! 3. Nobody has yet supplied a use case where Obsoletes would not be strictly improved upon by Obsoleted-By. (Note that "the author of package X no longer maintains it" does not equal "package Y is entitled to name itself the successor and enforce this upon all users" -- this can work in RPM only because it is a third party Z who declares Y the successor to X, and there is no such party Z in the Python world.) > I don't see this in this thread, could you link it again? http://mail.python.org/pipermail/catalog-sig/2010-October/003368.html http://mail.python.org/pipermail/catalog-sig/2010-October/003364.html These posts also address why a "Conflicts" field is *also* unlikely to be particularly useful in practice, in part for reasons that relate to differences between RPM-land and Python-land. (For example, RPMs can conflict over things besides files, due to runtime and configuration issues that are out-of-scope for a Python installer tool.) While it's certainly desirable to not invent wheels, it's important to understand that the Python community does not work the same way as a Linux distribution. We are not a single organization shipping a fully-functional and configured machine, we are hundreds of individual authors shipping our own stuff. Conflict resolution and package replacement (and even deciding what it is that things "provide" or "require") are primarily *human* processes, not technical ones. Relationship and support "contracts", IOW, rather than software contracts. 
That's why, in the distro world, a package manager can use simple fields to carry out the will of the human organization that made those support and compatibility decisions. For Python, the situation is a bit more complicated, which is why clear thinking is needed. Simply copying fields blindly from other packaging systems just isn't going to cut it. Now, if the will of the community is to turn PyPI into a distro-style repository, that's fine... but even if you completely ignore the human issues, there are still technical ones. Generally, distro-style repositories work by downloading the full metadata set (or at least an index) to a user's machine. And that's the sort of architecture you'd need in order for these type of fields to be technically feasible (e.g., doing an index search for Obsoletes), without grinding the PyPI servers into dust. From dholth at gmail.com Wed Dec 5 23:30:37 2012 From: dholth at gmail.com (Daniel Holth) Date: Wed, 5 Dec 2012 17:30:37 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Dec 5, 2012 at 4:10 PM, PJ Eby wrote: > On Wed, Dec 5, 2012 at 2:46 AM, Donald Stufft > wrote: > > There's nothing preventing an installer from, during it's attempt to > > install B, see it Obsoletes A, looking at what depends on A and > > warning the user what is going to happen and prompt it. > > Unless the user wrote those things that depend on A, they aren't going > to be in a position to do anything about it. (Contrast with a distro, > where dependencies are indirect - the other package will depend on an > abstraction provided by both A and B, rather than directly depending > on A *or* B.) > > (Also note that all the user knows at this point is that the author of > B *claims* to obsolete A, not that the authority managing the > repository as a whole has decreed B to obsolete A.) > > > > You can automatically uninstall A from B in an automatic dependency > management system > > My point is that this can only work if the "obsoleting" is effectively > just a rename, in which case the field should be "renames", or better > still, "renamed-to" on the originating package. > > As I've mentioned repeatedly, Obsoleted-By handles more use cases than > Obsoletes, and has at least one practical automated use case > (notifying a developer that their project is depending on something > that's obsolete). > > Also, the example given as a use case in the PEP (Gorgon to Torqued) > is not just wrong, it's *actively misleading*. Gorgon and Torqued are > transparent renames of Medusa and Twisted, which do not share a common > API and thus cannot be used as the subject of any automated processing > (in the case of Obsoletes) without doing some kind of PyPI metadata > search for every package installed every time a package is installed. > > > > I think Obsoletes as is an alright bit of information. > > 1. It cannot be used to prevent the installation of an obsolete > package without a PyPI metadata search, since you must examine every > *other* package on PyPI to find out whether some package obsoletes the > one you're trying to install. > > 2. Unlike RPM, where metadata is provided by a trusted third party, > Obsoletes can be specified by any random forker (no pun intended), > which makes this information a mere advertisement... 
and an > advertisement to the wrong audience at that, because they must have > *already* found B in order to discover that it replaces A! > > 3. Nobody has yet supplied a use case where Obsoletes would not be > strictly improved upon by Obsoleted-By. (Note that "the author of > package X no longer maintains it" does not equal "package Y is > entitled to name itself the successor and enforce this upon all users" > -- this can work in RPM only because it is a third party Z who > declares Y the successor to X, and there is no such party Z in the > Python world.) > > > > I don't see this in this thread, could you link it again? > > http://mail.python.org/pipermail/catalog-sig/2010-October/003368.html > http://mail.python.org/pipermail/catalog-sig/2010-October/003364.html > > These posts also address why a "Conflicts" field is *also* unlikely to > be particularly useful in practice, in part for reasons that relate to > differences between RPM-land and Python-land. (For example, RPMs can > conflict over things besides files, due to runtime and configuration > issues that are out-of-scope for a Python installer tool.) > > While it's certainly desirable to not invent wheels, it's important to > My desire is to invent the useful "wheel" binary package format in a reasonable and limited amount of time by making changes to Metadata 1.2 and implementing the new metadata format and wheel in distribute and pip. Help me out by allowing useless but unchanged fields to remain in this version of the PEP. I am done with the PEP and submit that it is not worse than its predecessor. I can participate in a discussion about any of the following:

Summary of Differences From PEP 345

- Metadata-Version is now 1.3.
- Values are now expected to be UTF-8.
- A payload (containing the description) may appear after the headers.
- Added extra to environment markers.
- Most fields are now optional.
- Changed fields:
  - Description
  - Project-URL
  - Requires-Dist
- Added fields:
  - Extension
  - Provides-Extra
  - Setup-Requires-Dist
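Since the format stays RFC 822-style headers plus an optional payload, the stdlib can already read it. A quick illustration (all field values here are invented):

    # Metadata 1.3 keeps RFC 822-style headers, so email.parser can read
    # it; the message body after the headers carries the description.
    import textwrap
    from email.parser import Parser

    SAMPLE = textwrap.dedent("""\
        Metadata-Version: 1.3
        Name: example-dist
        Version: 0.1
        Provides-Extra: http
        Requires-Dist: requests; extra == 'http'

        The payload after the headers is the long description.
        """)

    msg = Parser().parsestr(SAMPLE)
    print(msg['Metadata-Version'])       # -> 1.3
    print(msg.get_all('Requires-Dist'))  # -> ["requests; extra == 'http'"]
    print(msg.get_payload().strip())     # -> the description text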
-------------- next part -------------- An HTML attachment was scrubbed... URL: From donald.stufft at gmail.com Thu Dec 6 00:07:41 2012 From: donald.stufft at gmail.com (Donald Stufft) Date: Wed, 5 Dec 2012 18:07:41 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <90224964F19542FFA2B4AE2D115B54C1@gmail.com> On Wednesday, December 5, 2012 at 4:10 PM, PJ Eby wrote: > My point is that this can only work if the "obsoleting" is effectively > just a rename, in which case the field should be "renames", or better > still, "renamed-to" on the originating package. Arguing over Obsoletes vs Renames is a massive bikeshedding argument. > As I've mentioned repeatedly, Obsoleted-By handles more use cases than > Obsoletes, and has at least one practical automated use case > (notifying a developer that their project is depending on something > that's obsolete). > > Also, the example given as a use case in the PEP (Gorgon to Torqued) > is not just wrong, it's *actively misleading*. Gorgon and Torqued are > transparent renames of Medusa and Twisted, which do not share a common > API and thus cannot be used as the subject of any automated processing > (in the case of Obsoletes) without doing some kind of PyPI metadata > search for every package installed every time a package is installed. > > So it's a bad example. Hardly an argument against it. > 1. It cannot be used to prevent the installation of an obsolete > package without a PyPI metadata search, since you must examine every > *other* package on PyPI to find out whether some package obsoletes the > one you're trying to install. > > Will require support from PyPI but this ultimately isn't a big deal. > > 2. Unlike RPM, where metadata is provided by a trusted third party, > Obsoletes can be specified by any random forker (no pun intended), > which makes this information a mere advertisement... and an > advertisement to the wrong audience at that, because they must have > *already* found B in order to discover that it replaces A! > > If you're installing B you've granted trust to that author. If you don't trust the author then why are you installing (and then executing) code they wrote? > > 3. Nobody has yet supplied a use case where Obsoletes would not be > strictly improved upon by Obsoleted-By. (Note that "the author of > package X no longer maintains it" does not equal "package Y is > entitled to name itself the successor and enforce this upon all users" > -- this can work in RPM only because it is a third party Z who > declares Y the successor to X, and there is no such party Z in the > Python world.) > > Very convenient to declare that one of the major use cases for Obsoletes over Obsoleted-By is not valid because of your own personal opinions. Like I said above, if you're installing a package that someone has uploaded you've implicitly granted them trust. There are far worse things that a bad Python citizen can do during, and after, an install than what is allowed by Obsoletes. > > > > I don't see this in this thread, could you link it again? > > http://mail.python.org/pipermail/catalog-sig/2010-October/003368.html > http://mail.python.org/pipermail/catalog-sig/2010-October/003364.html > > These posts also address why a "Conflicts" field is *also* unlikely to > be particularly useful in practice, in part for reasons that relate to > differences between RPM-land and Python-land. (For example, RPMs can > conflict over things besides files, due to runtime and configuration > issues that are out-of-scope for a Python installer tool.) > > I don't think Conflicts is something that every single package is going to require. As you said, the tools themselves are going to handle the obvious cases for the bulk of situations. Unless you think there are no cases where two packages can conflict in more than which files are going to be installed, there are cases where it would be helpful, and merely having the ability to use it when it is the best tool for the job isn't going to cause any great issue. > > While it's certainly desirable to not invent wheels, it's important to > understand that the Python community does not work the same way as a > Linux distribution. We are not a single organization shipping a > fully-functional and configured machine, we are hundreds of individual > authors shipping our own stuff. Conflict resolution and package > replacement (and even deciding what it is that things "provide" or > "require") are primarily *human* processes, not technical ones. > Relationship and support "contracts", IOW, rather than software > contracts. > > End systems often times do not have a singular organization controlling every package in their system. The best example is Ubuntu and their PPA's. > > Now, if the will of the community is to turn PyPI into a distro-style
> but even if you completely ignore the human
> issues, there are still technical ones. Generally, distro-style
> repositories work by downloading the full metadata set (or at least an
> index) to a user's machine. And that's the sort of architecture you'd
> need in order for these type of fields to be technically feasible
> (e.g., doing an index search for Obsoletes), without grinding the PyPI
> servers into dust.

This is insane. A fairly simple database query is going to "grind the PyPI
servers into dust"? You're going to need to back up this FUD or please
refrain from spouting it.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From barry at python.org  Thu Dec  6 00:11:23 2012
From: barry at python.org (Barry Warsaw)
Date: Wed, 5 Dec 2012 18:11:23 -0500
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: 
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com>
 <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20121205181123.73d95e68@limelight.wooz.org>

On Dec 05, 2012, at 04:10 PM, PJ Eby wrote:

>While it's certainly desirable to not invent wheels, it's important to
>understand that the Python community does not work the same way as a
>Linux distribution. We are not a single organization shipping a
>fully-functional and configured machine, we are hundreds of individual
>authors shipping our own stuff. Conflict resolution and package
>replacement (and even deciding what it is that things "provide" or
>"require") are primarily *human* processes, not technical ones.
>Relationship and support "contracts", IOW, rather than software
>contracts.
>
>That's why, in the distro world, a package manager can use simple
>fields to carry out the will of the human organization that made those
>support and compatibility decisions. For Python, the situation is a
>bit more complicated, which is why clear thinking is needed. Simply
>copying fields blindly from other packaging systems just isn't going
>to cut it.

+1

>Now, if the will of the community is to turn PyPI into a distro-style
>repository, that's fine...

Please no! -1 :)

-Barry

From barry at python.org  Thu Dec  6 00:18:12 2012
From: barry at python.org (Barry Warsaw)
Date: Wed, 5 Dec 2012 18:18:12 -0500
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: <90224964F19542FFA2B4AE2D115B54C1@gmail.com>
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com>
 <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp>
 <90224964F19542FFA2B4AE2D115B54C1@gmail.com>
Message-ID: <20121205181812.33008ee0@limelight.wooz.org>

On Dec 05, 2012, at 06:07 PM, Donald Stufft wrote:

>If you're installing B you've prescribed trust to that author. If you don't
>trust the author then why are you installing (and then executing) code
>they wrote.

What if you installed Z, but B got installed because it was a dependency
three levels down?

>Very convenient to declare that one of the major use cases for
>Obsoletes over Obsoleted-By is not valid because of your own
>personal opinions. Like I said above, if you're installing a package
>that someone has uploaded you've implicitly granted them trust. There
>are far worse things that a bad Python citizen can do during, and after,
>an install than what is allowed by Obsoletes.

Well, basically never installing anything from PyPI except into a virtualenv
is probably a good recommendation (maybe even now).
>End systems often times do not have a singular organization controlling
>every package in their system. The best example is Ubuntu and their PPA's.

Well, PPAs are awesome, but have known and well-publicized trust issues. I
wouldn't enable a PPA into my running system without really knowing who the
owner is and why I'm using their PPA. Or doing a lot of testing in a chroot
first, and probably pinning the package set to just the one(s) from the PPA I
care about.

Cheers,
-Barry

From donald.stufft at gmail.com  Thu Dec  6 00:30:41 2012
From: donald.stufft at gmail.com (Donald Stufft)
Date: Wed, 5 Dec 2012 18:30:41 -0500
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: <20121205181812.33008ee0@limelight.wooz.org>
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com>
 <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp>
 <90224964F19542FFA2B4AE2D115B54C1@gmail.com>
 <20121205181812.33008ee0@limelight.wooz.org>
Message-ID: <14AE5411296B4F55A7B5E55C026A4C40@gmail.com>

On Wednesday, December 5, 2012 at 6:18 PM, Barry Warsaw wrote:
> On Dec 05, 2012, at 06:07 PM, Donald Stufft wrote:
>
> > If you're installing B you've prescribed trust to that author. If you don't
> > trust the author then why are you installing (and then executing) code
> > they wrote.
>
> What if you installed Z, but B got installed because it was a dependency
> three levels down?
>

Sure, you granted trust to Z, Z granted trust to Y, and Y granted trust to
B. Like with SSL certificates, there is a chain of trust. If you don't
trust Z then don't install their package.

> > Very convenient to declare that one of the major use cases for
> > Obsoletes over Obsoleted-By is not valid because of your own
> > personal opinions. Like I said above, if you're installing a package
> > that someone has uploaded you've implicitly granted them trust. There
> > are far worse things that a bad Python citizen can do during, and after,
> > an install than what is allowed by Obsoletes.
>
> Well, basically never installing anything from PyPI except into a virtualenv
> is probably a good recommendation (maybe even now).
>

A virtualenv only protects you from well-behaved packages. There is no way
to prevent a package author from doing very nasty things to you if they
wish. Providing more power in the metadata doesn't make this situation
better or worse, it just creates more standard paths for the cases where
you do need to do it.

> > End systems often times do not have a singular organization controlling
> > every package in their system. The best example is Ubuntu and their PPA's.
>
> Well, PPAs are awesome, but have known and well-publicized trust issues. I
> wouldn't enable a PPA into my running system without really knowing who the
> owner is and why I'm using their PPA. Or doing a lot of testing in a chroot
> first, and probably pinning the package set to just the one(s) from the PPA I
> care about.
>

Basically the same thing can be said about packages on PyPI. All the same
trust issues exist there. Simply installing a Python package is already
granting far more trust than Obsoletes requires, since installing a package
means executing someone else's Python code on your system. Even if you
remove setup.py you're still going to be executing their code on your
system. If you do not trust the author of the packages you are installing,
you do not install their packages.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pje at telecommunity.com  Thu Dec  6 01:18:15 2012
From: pje at telecommunity.com (PJ Eby)
Date: Wed, 5 Dec 2012 19:18:15 -0500
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: 
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com>
 <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: 

On Wed, Dec 5, 2012 at 5:30 PM, Daniel Holth wrote:
> My desire is to invent the useful "wheel" binary package format in a
> reasonable and limited amount of time by making changes to Metadata 1.2 and
> implementing the new metadata format and wheel in distribute and pip. Help
> me out by allowing useless but un-changed fields to remain in this version
> of the PEP. I am done with the PEP and submit that it is not worse than its
> predecessor.

You could just mark those fields as deprecated and state that they should
not be used to delete packages or block packages from installation.
Justification: nobody has managed to make them work in an automated tool
yet, and their use in same is controversial, so they are downgraded to
human-informational only.

Please, let's not have yet *another* metadata spec that advertises these
attractive nuisance[1] fields. I do not want us to be having this same
conversation AGAIN the next time any metadata changes are being
considered. We've had it too many times already. PEPs are supposed to
summarize these discussions for that very reason.

---

[1] For non-native speakers, an attractive nuisance is a dangerous thing
that entices unsuspecting persons to play with it;
http://en.wikipedia.org/wiki/Attractive_nuisance_doctrine has more
details.

From pje at telecommunity.com  Thu Dec  6 01:34:41 2012
From: pje at telecommunity.com (PJ Eby)
Date: Wed, 5 Dec 2012 19:34:41 -0500
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: <90224964F19542FFA2B4AE2D115B54C1@gmail.com>
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com>
 <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp>
 <90224964F19542FFA2B4AE2D115B54C1@gmail.com>
Message-ID: 

On Wed, Dec 5, 2012 at 6:07 PM, Donald Stufft wrote:
> Arguing over Obsoletes vs Renames is a massive bikeshedding argument.

And is entirely beside the point. The substantive question is whether
it's Obsoletes or Obsoleted-By - i.e., which side it is declared on.

> > So it's a bad example.
>
> Hardly an argument against it.

Nobody has actually proposed a better one, outside of package renaming
-- and that example featured an author who could just as easily have
used an obsoleted-by field.

> Will require support from PyPI but this ultimately isn't a big deal.

...and every PyPI clone. And of course the performance issues.

> If you're installing B you've prescribed trust to that author. If you don't
> trust the author then why are you installing (and then executing) code
> they wrote.

Trusting their code is one thing; trusting whether they understood a PEP
(and its interactions with various installation tools) well enough to not
accidentally delete *somebody else's code* out of my system is another
thing altogether.

OTOH, trusting an author to tell me (in an automated fashion), "hey, you
should switch to this other thing as soon as you can" is a FAR smaller
amount of required trust. Arguing that because I have to trust one thing,
I must therefore trust another, is a "Fallacy of Gray" argument.
> Very convenient to declare that one of the major use cases for > Obsoletes over Obsoleted-By is not valid because of your own > personal opinions. I didn't say it was invalid, I said: """Note that "the author of package X no longer maintains it" does not equal "package Y is entitled to name itself the successor and enforce this upon all users""" These things are not equal. AFAIK, well-managed Linux distros do not allow random forkers to declare themselves the official successor to a defunct package, so any analogy between this use case in the Python world and the distro world is strained at *best*. > Unless you think there are > no cases where two packages can conflict in more than what files > are going to be installed The rationale for that is laid out in the posts I linked. > then there are cases where it would be helpful Please, present a *real-life instance* where it would have been helpful to you. > and merely having the ability to use it when it is the best tool for the job > isn't going to cause any great issue. One of the posts I linked presents an instance where it would have actually *harmed* things to specify it, and it's quite easy to see how the same problem would arise if used for non-file-related conflicts... And the problem present is *directly* tied to the lack of a third-party Z who decides whether X and Y, as configured for release Q of distro P, "conflict". This is not a problem that is solvable even in *principle* for an automated tool in the absence of party Z, which means that any such field's actual function is limited to a heads-up to a human user. > This is insane. A fairly simple database query is going to "grind the PyPI > servers into dust"? You're going to need to back up this FUD or please > refrain from spouting it. I take it you're not familiar with PyPI's history of performance and scaling problems over the last several years, then. The statically cached "/simple" index was developed precisely to stop *today's* class of installation tools from killing the servers... and then mirroring PyPI was still required to scale. Any proposal that calls for encouraging tools to query a metadata field *every time* a package is installed (or even just downloaded) almost certainly needs to be vetted with the PyPI admin team. From stephen at xemacs.org Thu Dec 6 03:12:36 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 06 Dec 2012 11:12:36 +0900 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: <14AE5411296B4F55A7B5E55C026A4C40@gmail.com> References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <20121205181812.33008ee0@limelight.wooz.org> <14AE5411296B4F55A7B5E55C026A4C40@gmail.com> Message-ID: <87fw3k2ajv.fsf@uwakimon.sk.tsukuba.ac.jp> I understand the PEP author's frustration with continued discussion, but I think this subthread on Obsoletes vs. Obsoleted-By is not mere bikeshedding on names. It matters *which package* presents the information. Donald Stufft writes: > On Wednesday, December 5, 2012 at 6:18 PM, Barry Warsaw wrote: > > On Dec 05, 2012, at 06:07 PM, Donald Stufft wrote: > > > > > If you're installing B you've prescribed trust to that > > > author. If you don't trust the author then why are you > > > installing (and then executing) code they wrote. The author may be a genius when it comes to writing code, and an idiot when it comes to distributing it. 
Distribution is much harder than it
looks, as you know. Trusting the author's *content* and trusting the
author's *metadata* are not equivalent!

As far as I can see, the semantics of putting "Obsoletes: A" into B
without changing A are the same as the semantics of putting "Provides:
A" into B (without changing A).[1] Only if A includes "Obsoleted-By: B"
can a user be confident that B is a true successor to A.

Furthermore, as has been pointed out, the presence of "Obsoleted-By"
in A has the huge advantage of informing users and developers of
dependent packages alike that A is obsolete when they try to update A.
If A is not changed, then an attempted update will tell them exactly
that, and they may never find out about B. But if A is modified in
this trivial way, the package system can automatically inform them.
This is also trivial, requiring no database queries.

"Simple is better than complex."

Footnotes:
[1] A trustworthy author of B wouldn't use "Provides" unless he
thought B was indeed a drop-in, and presumably superior, replacement
for A. And that's all that "Obsoletes" can tell you!

From python at mrabarnett.plus.com  Thu Dec  6 03:42:59 2012
From: python at mrabarnett.plus.com (MRAB)
Date: Thu, 06 Dec 2012 02:42:59 +0000
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: <87fw3k2ajv.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com>
 <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp>
 <90224964F19542FFA2B4AE2D115B54C1@gmail.com>
 <20121205181812.33008ee0@limelight.wooz.org>
 <14AE5411296B4F55A7B5E55C026A4C40@gmail.com>
 <87fw3k2ajv.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <50C00633.7010600@mrabarnett.plus.com>

On 2012-12-06 02:12, Stephen J. Turnbull wrote:
> I understand the PEP author's frustration with continued discussion,
> but I think this subthread on Obsoletes vs. Obsoleted-By is not mere
> bikeshedding on names. It matters *which package* presents the
> information.
>
>
> Donald Stufft writes:
> > On Wednesday, December 5, 2012 at 6:18 PM, Barry Warsaw wrote:
> > > On Dec 05, 2012, at 06:07 PM, Donald Stufft wrote:
> > >
> > > > If you're installing B you've prescribed trust to that
> > > > author. If you don't trust the author then why are you
> > > > installing (and then executing) code they wrote.
>
> The author may be a genius when it comes to writing code, and an idiot
> when it comes to distributing it. Distribution is much harder than it
> looks, as you know. Trusting the author's *content* and trusting the
> author's *metadata* are not equivalent!
>
> As far as I can see, the semantics of putting "Obsoletes: A" into B
> without changing A are the same as the semantics of putting "Provides:
> A" into B (without changing A).[1] Only if A includes "Obsoleted-By: B"
> can a user be confident that B is a true successor to A.
>
> Furthermore, as has been pointed out, the presence of "Obsoleted-By"
> in A has the huge advantage of informing users and developers of
> dependent packages alike that A is obsolete when they try to update A.
> If A is not changed, then an attempted update will tell them exactly
> that, and they may never find out about B. But if A is modified in
> this trivial way, the package system can automatically inform them.
> This is also trivial, requiring no database queries.
>
> "Simple is better than complex."
>
That makes sense.
In summary, someone using B won't care that it has replaced A, but someone using A needs to be told that it has been replaced by B. From dholth at gmail.com Thu Dec 6 04:12:26 2012 From: dholth at gmail.com (Daniel Holth) Date: Wed, 5 Dec 2012 22:12:26 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: <50C00633.7010600@mrabarnett.plus.com> References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <20121205181812.33008ee0@limelight.wooz.org> <14AE5411296B4F55A7B5E55C026A4C40@gmail.com> <87fw3k2ajv.fsf@uwakimon.sk.tsukuba.ac.jp> <50C00633.7010600@mrabarnett.plus.com> Message-ID: Makes sense. How about calling it Replacement. 0 or 1? Replacement (optional) :::::::::::::::::::::: Indicates that this project is no longer being developed. The named project provides a drop-in replacement. A version declaration may be supplied and must follow the rules described in `Version Specifiers`_. The most common use of this field will be in case a project name changes. Examples:: Name: BadName Replacement: AcceptableName Replacement: AcceptableName (>=4.0.0) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Dec 6 05:26:36 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Dec 2012 14:26:36 +1000 Subject: [Python-Dev] slightly misleading Popen.poll() docs In-Reply-To: <20121205195547.27740dfb@pitrou.net> References: <50BF718E.5080604@simplistix.co.uk> <50BF811C.4070708@pearwood.info> <20121205195547.27740dfb@pitrou.net> Message-ID: On Thu, Dec 6, 2012 at 4:55 AM, Antoine Pitrou wrote: > On Thu, 06 Dec 2012 04:15:08 +1100 > Steven D'Aprano wrote: > > Possibly because it is 4am here, I had to read this three times to > understand it. > > How is this instead? > > > > """ > > Check if child process has terminated. Returns None while the child is > still running, > > any non-None value means that the child has terminated. In either case, > the return > > value is also available from the instance's returncode attribute. > > """ > > I like this wording. > Steven's proposed wording sounds good to me, too. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Dec 6 05:35:24 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Dec 2012 14:35:24 +1000 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Thu, Dec 6, 2012 at 8:30 AM, Daniel Holth wrote: > My desire is to invent the useful "wheel" binary package format in a > reasonable and limited amount of time by making changes to Metadata 1.2 and > implementing the new metadata format and wheel in distribute and pip. Help > me out by allowing useless but un-changed fields to remain in this version > of the PEP. I am done with the PEP and submit that it is not worse than its > predecessor. > Agreed. PJE's arguments sound reasonable (especially since Obsoletes doesn't get used much in RPM-land either - Provides & Conflicts are both far more common), but they're orthogonal to the current aims of the metadata 1.3 update. 
If another author wanted to create a subsequent 1.4 update that was focused
on replacing Obsoletes with Obsoleted-By, that would be fine (alternatively,
a patch to the current PEP draft may be acceptable, but accepting such a
change would be up to Daniel as the PEP author).

> > I can participate in a discussion about any of the following:
> Summary of Differences From PEP 345
>
> - Metadata-Version is now 1.3.
> - Values are now expected to be UTF-8.
> - A payload (containing the description) may appear after the headers.
> - Added extra to environment markers.
> - Most fields are now optional.
> - Changed fields:
> - Description
> - Project-URL
> - Requires-Dist
> - Added fields:
> - Extension
> - Provides-Extra
> - Setup-Requires-Dist
>

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From tjreedy at udel.edu  Thu Dec  6 05:44:01 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 05 Dec 2012 23:44:01 -0500
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: 
References: <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp>
 <90224964F19542FFA2B4AE2D115B54C1@gmail.com>
 <20121205181812.33008ee0@limelight.wooz.org>
 <14AE5411296B4F55A7B5E55C026A4C40@gmail.com>
 <87fw3k2ajv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <50C00633.7010600@mrabarnett.plus.com>
Message-ID: 

On 12/5/2012 10:12 PM, Daniel Holth wrote:
> Makes sense. How about calling it Replacement. 0 or 1?
>
> Replacement (optional)
> ::::::::::::::::::::::
>
> Indicates that this project is no longer being developed. The named
> project provides a drop-in replacement.
>
> A version declaration may be supplied and must follow the rules described
> in `Version Specifiers`_.
>
> The most common use of this field will be in case a project name changes.
>
> Examples::
>
> Name: BadName
> Replacement: AcceptableName
>
> Replacement: AcceptableName (>=4.0.0)

I like it. 'Replacement' is broader in meaning, more neutral, and less
awkward than 'Obsoleted-by'. And I agree that A users have much more need
to know about B than vice-versa. It is much the same situation with Py 2
and Py 3 (although the latter is *not* a drop-in replacement).

-- 
Terry Jan Reedy

From ncoghlan at gmail.com  Thu Dec  6 05:54:45 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 6 Dec 2012 14:54:45 +1000
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: 
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com>
 <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp>
 <90224964F19542FFA2B4AE2D115B54C1@gmail.com>
 <20121205181812.33008ee0@limelight.wooz.org>
 <14AE5411296B4F55A7B5E55C026A4C40@gmail.com>
 <87fw3k2ajv.fsf@uwakimon.sk.tsukuba.ac.jp>
 <50C00633.7010600@mrabarnett.plus.com>
Message-ID: 

On Thu, Dec 6, 2012 at 1:12 PM, Daniel Holth wrote:

> Makes sense. How about calling it Replacement. 0 or 1?
>

Hah, you'd think I'd have learned by now to finish reading a thread before
replying. It will be nice to get this addressed along with the other
changes :)

(FWIW, Conflicts and Obsoletes are messy in RPM as well, and especially
troublesome as soon as you start enabling multiple upstream repos from
different providers.
The metadata problem is handled by prebuilding indices when the repo changes, but that's still more work for the server, and more work for clients) > Replacement (optional) > :::::::::::::::::::::: > I like verb forms like Obsoleted-By or Replaced-By, as the noun form is ambiguous about the direction of the change. Since the field being replaced is Obsoletes, Obsoleted-By makes sense. > > Indicates that this project is no longer being developed. The named > project provides a drop-in replacement. > Typically, the new version *won't* be a drop-in replacement (e.g. you'll likely at least have to import from a different top level package). Instead, the field would more often be used as an explicit indicator that the project is no longer receiving updates, as the *development team* has moved on, so users may want to consider either migrating, taking over development (if the former developers are amenable) or forking. If the replacing project *is* a drop-in replacement for the old project, then it should also advertise a Provides-Dist for the original project. Automated tools can then easily detect the two cases: A Obsoleted-By-Dist B and B Provides-Dist A = A is defunct, and B should be a drop-in replacement for A A Obsoleted-By-Dist B (without a Provides-Dist on B) = A is defunct, B is a replacement for A, but some porting will be needed Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Dec 6 05:56:38 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Dec 2012 14:56:38 +1000 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <20121205181812.33008ee0@limelight.wooz.org> <14AE5411296B4F55A7B5E55C026A4C40@gmail.com> <87fw3k2ajv.fsf@uwakimon.sk.tsukuba.ac.jp> <50C00633.7010600@mrabarnett.plus.com> Message-ID: On Thu, Dec 6, 2012 at 2:54 PM, Nick Coghlan wrote: > On Thu, Dec 6, 2012 at 1:12 PM, Daniel Holth wrote: > >> Makes sense. How about calling it Replacement. 0 or 1? >> > > Hah, you'd think I'd have learned by now to finish reading a thread before > replying. It will be nice to get this addressed along with the other > changes :) > > (FWIW, Conflicts and Obsoletes are messy in RPM as well, and especially > troublesome as soon as you start enabling multiple upstream repos from > different providers. The metadata problem is handled by prebuilding indices > when the repo changes, but that's still more work for the server, and more > work for clients) > > >> Replacement (optional) >> :::::::::::::::::::::: >> > > > I like verb forms like Obsoleted-By or Replaced-By, as the noun form is > ambiguous about the direction of the change. Since the field being replaced > is Obsoletes, Obsoleted-By makes sense. > Although Replaced-By would be fine as well - it's certainly much easier to say than the mouthful that is Obsoleted-By. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From a.badger at gmail.com  Thu Dec  6 07:49:25 2012
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Wed, 5 Dec 2012 22:49:25 -0800
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: 
References: <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp>
 <90224964F19542FFA2B4AE2D115B54C1@gmail.com>
Message-ID: <20121206064925.GC2613@unaka.lan>

On Wed, Dec 05, 2012 at 07:34:41PM -0500, PJ Eby wrote:
> On Wed, Dec 5, 2012 at 6:07 PM, Donald Stufft wrote:
>
> Nobody has actually proposed a better one, outside of package renaming
> -- and that example featured an author who could just as easily have
> used an obsoleted-by field.
>
How about pexpect and pexpect-u as a better example?

> > Very convenient to declare that one of the major use cases for
> > Obsoletes over Obsoleted-By is not valid because of your own
> > personal opinions.
>
> I didn't say it was invalid, I said:
>
> """Note that "the author of package X no longer maintains it" does not
> equal "package Y is entitled to name itself the successor and enforce
> this upon all users"""
>
> These things are not equal. AFAIK, well-managed Linux distros do not
> allow random forkers to declare themselves the official successor to a
> defunct package, so any analogy between this use case in the Python
> world and the distro world is strained at *best*.
>
Note that although well-managed Linux distros attempt to control random
forking internally, the distro package managers don't prevent people from
installing from third parties. So Ubuntu PPAs, upstreams that provide their
own rpms/debs, and major third party repos (for instance, rpmfusion as
an add-on repo to Fedora) all have and sometimes (mis)use the ability to
Obsolete packages in the base repository. So Donald isn't stretching the
relationship quite as far as you make it out. The ecosystem of packages
for a distro carries uncontrolled packages just as much as pypi.

> > and merely having the ability to use it when it is the best tool for the job
> > isn't going to cause any great issue.
>
> One of the posts I linked presents an instance where it would have
> actually *harmed* things to specify it, and it's quite easy to see how
> the same problem would arise if used for non-file-related conflicts...
>
> And the problem present is *directly* tied to the lack of a
> third-party Z who decides whether X and Y, as configured for release Q
> of distro P, "conflict".
>
> This is not a problem that is solvable even in *principle* for an
> automated tool in the absence of party Z, which means that any such
> field's actual function is limited to a heads-up to a human user.
>
And the same for Provides. (ie: latest foo is 0.6c; bar Provides: foo-0.6d.
an automated tool that finds both foo and bar in its dep tree can choose to
install bar and not foo.)

The ability for this class of fields to cause harm is not, to me,
a compelling argument not to include them. It could be an argument to
explicitly tell implementers of install tools that they all have caveats
when used with pypi and similar unpoliced community package repositories.
The install tools can then choose how they wish to deal with those caveats.
Some example strategies: choose to prompt the user as to which to install,
choose to always treat the fields as human-informational only, mark some
repositories as being trusted to contain packages where these fields are
active and other repositories where the fields are ignored.

-Toshio
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
URL: 

From vinay_sajip at yahoo.co.uk  Thu Dec  6 12:28:31 2012
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Thu, 6 Dec 2012 11:28:31 +0000 (UTC)
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com>
 <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp>
 <90224964F19542FFA2B4AE2D115B54C1@gmail.com>
Message-ID: 

Donald Stufft gmail.com> writes:

> This is insane. A fairly simple database query is going to "grind the PyPI
> servers into dust"? You're going to need to back up this FUD or please
> refrain from spouting it.

Never mind the "Obsoletes" information - even the more useful "Requires-Dist"
information is not exposed via PyPI, even though it appears to be stored in the
database. (Or if it is, please point me to where - I must have missed it.)

Even if this were to be made available, it's presumably obtained from PKG-INFO.
As I understand, this data is not considered reliable - for example, pip runs
egg_info on downloaded packages to get updated information when determining
dependencies to be downloaded. If the Requires-Dist info in PKG-INFO can't be
relied on, surely less critical information such as Obsoletes can't be relied on,
either?

Regards,

Vinay Sajip

From donald.stufft at gmail.com  Thu Dec  6 12:33:42 2012
From: donald.stufft at gmail.com (Donald Stufft)
Date: Thu, 6 Dec 2012 06:33:42 -0500
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: 
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com>
 <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp>
 <90224964F19542FFA2B4AE2D115B54C1@gmail.com>
Message-ID: <2FEACDF3DA3048CE920F97832F24E584@gmail.com>

On Thursday, December 6, 2012 at 6:28 AM, Vinay Sajip wrote:
> Donald Stufft gmail.com (http://gmail.com)> writes:
>
> Never mind the "Obsoletes" information - even the more useful "Requires-Dist"
> information is not exposed via PyPI, even though it appears to be stored in the
> database. (Or if it is, please point me to where - I must have missed it.)

Requires-Dist doesn't exist for more than a handful of packages. But PyPI
exposes it via the XMLRPC API, possibly the JSON API as well.

> Even if this were to be made available, it's presumably obtained from PKG-INFO.
> As I understand, this data is not considered reliable - for example, pip runs
> egg_info on downloaded packages to get updated information when determining
> dependencies to be downloaded. If the Requires-Dist info in PKG-INFO can't be
> relied on, surely less critical information such as Obsoletes can't be relied on,
> either?

pip runs egg_info because setuptools does not write out to PKG-INFO what
the dependencies are (it does write it out to a different text file
though). But IIRC that text file is not guaranteed to exist in the
distribution. There's also the
-Toshio -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From vinay_sajip at yahoo.co.uk Thu Dec 6 12:28:31 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Thu, 6 Dec 2012 11:28:31 +0000 (UTC) Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> Message-ID: Donald Stufft gmail.com> writes: > This is insane. A fairly simple database query is going to "grind the PyPI > servers?into dust"? ?You're going to need to back up this FUD or please > refrain from spouting it. Never mind the "Obsoletes" information - even the more useful "Requires-Dist" information is not exposed via PyPI, even though it appears to be stored in the database. (Or if it is, please point me to where - I must have missed it.) Even if this were to be made available, it's presumably obtained from PKG-INFO. As I understand, this data is not considered reliable - for example, pip runs egg_info on downloaded packages to get updated information when determining dependencies to be downloaded. If the Requires-Dist info in PKG-INFO can't be relied on, surely less critical information such as Obsoletes can't be relied on, either? Regards, Vinay Sajip From donald.stufft at gmail.com Thu Dec 6 12:33:42 2012 From: donald.stufft at gmail.com (Donald Stufft) Date: Thu, 6 Dec 2012 06:33:42 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> Message-ID: <2FEACDF3DA3048CE920F97832F24E584@gmail.com> On Thursday, December 6, 2012 at 6:28 AM, Vinay Sajip wrote: > Donald Stufft gmail.com (http://gmail.com)> writes: > > Never mind the "Obsoletes" information - even the more useful "Requires-Dist" > information is not exposed via PyPI, even though it appears to be stored in the > database. (Or if it is, please point me to where - I must have missed it.) > > Requires-Dist doesn't exist for more than a handful of packages. But PyPI exposes it via the XMLRPC API, possibly the JSON api as well. > > Even if this were to be made available, it's presumably obtained from PKG-INFO. > As I understand, this data is not considered reliable - for example, pip runs > egg_info on downloaded packages to get updated information when determining > dependencies to be downloaded. If the Requires-Dist info in PKG-INFO can't be > relied on, surely less critical information such as Obsoletes can't be relied on, > either? > > pip runs egg_info because setuptools does not write out to PKG-INFO what the dependencies are (it does write it out to a different text file though). But IIRC that text file is not guaranteed to exist in the distribution. There's also the history where pip was trying to preserve as much backwards compat with easy_install as it could, and if you used the file that egg_info writes out then you'll only get the requirements for the system that the distribution was packaged on. Any if statements that affect the dependencies won't be in effect. 
> > Regards,
> > Vinay Sajip
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dholth at gmail.com  Thu Dec  6 14:39:56 2012
From: dholth at gmail.com (Daniel Holth)
Date: Thu, 6 Dec 2012 08:39:56 -0500
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: <2FEACDF3DA3048CE920F97832F24E584@gmail.com>
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com>
 <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp>
 <90224964F19542FFA2B4AE2D115B54C1@gmail.com>
 <2FEACDF3DA3048CE920F97832F24E584@gmail.com>
Message-ID: 

On Thu, Dec 6, 2012 at 6:33 AM, Donald Stufft wrote:
> On Thursday, December 6, 2012 at 6:28 AM, Vinay Sajip wrote:
>
> Donald Stufft gmail.com> writes:
>
> Never mind the "Obsoletes" information - even the more useful
> "Requires-Dist"
> information is not exposed via PyPI, even though it appears to be stored
> in the
> database. (Or if it is, please point me to where - I must have missed it.)
>
> Requires-Dist doesn't exist for more than a handful of packages. But PyPI
> exposes
> it via the XMLRPC API, possibly the JSON API as well.
>
> Even if this were to be made available, it's presumably obtained from
> PKG-INFO.
> As I understand, this data is not considered reliable - for example, pip
> runs
> egg_info on downloaded packages to get updated information when determining
> dependencies to be downloaded. If the Requires-Dist info in PKG-INFO can't
> be
> relied on, surely less critical information such as Obsoletes can't be
> relied on, either?
>
> pip runs egg_info because setuptools does not write out to PKG-INFO what
> the dependencies are (it does write it out to a different text file
> though). But IIRC
> that text file is not guaranteed to exist in the distribution. There's
> also the
> history where pip was trying to preserve as much backwards compat with
> easy_install as it could, and if you used the file that egg_info writes out
> then you'll only get the requirements for the system that the distribution
> was
> packaged on. Any if statements that affect the dependencies won't be
> in effect.

It will be Obsoleted-By:. The "drop in replacement" requirement will be
removed. The package manager will say "you are using these obsolete
packages; check out these non-obsolete ones" but will not automatically
pull the replacement without a Requires tag.

I will probably add the unambiguous Conflicts: tag "uninstall this other
package if I am installed".

Many packages (IIRC more than half) have the pre-Metadata-1.2 equivalent
of Requires-Dist: which is the very easy to parse requires.txt. This
information is not reliable because it could depend on conditions in
setup.py. Someone should write a setup.py compiler that determines whether
a package's requirements are conditional or not.

Environment markers (limited Python expressions at the end of
Requires-Dist lines) attempt to make Requires-Dist reliable. You can
execute them safely in your environment to determine whether a requirement
is right for you:

Requires-Dist: pywin32 (>1.0); sys.platform == 'win32'

The wheel implementation makes sure all the metadata (the .dist-info
directory) is at the end of the .zip archive.
It's possible to read the metadata with a single HTTP partial request for the end of the archive without downloading the entire archive. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vinay_sajip at yahoo.co.uk Thu Dec 6 15:58:39 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Thu, 6 Dec 2012 14:58:39 +0000 (UTC) Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> Message-ID: Daniel Holth gmail.com> writes: > The wheel implementation makes sure all the metadata (the .dist-info directory) > is at the end of the .zip archive. It's possible to read the metadata with a > single HTTP partial request for the end of the archive without downloading the > entire archive. Sounds good, but can you point to any example code which does this? As I understand it, for .zip files you have to read the last part of the file to get a pointer to the directory, then read that to find where each file in the archive is, then seek to a specific position to read the file contents. Regards, Vinay Sajip From dholth at gmail.com Thu Dec 6 16:23:20 2012 From: dholth at gmail.com (Daniel Holth) Date: Thu, 6 Dec 2012 10:23:20 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> Message-ID: On Thu, Dec 6, 2012 at 9:58 AM, Vinay Sajip wrote: > Daniel Holth gmail.com> writes: > > > The wheel implementation makes sure all the metadata (the .dist-info > directory) > > is at the end of the .zip archive. It's possible to read the metadata > with a > > single HTTP partial request for the end of the archive without > downloading the > > entire archive. > > Sounds good, but can you point to any example code which does this? As I > understand it, for .zip files you have to read the last part of the file > to get a > pointer to the directory, then read that to find where each file in the > archive > is, then seek to a specific position to read the file contents. You have to make a maximum of 3 requests: one for the directory pointer, one for the directory, and one for the file you want. It's not particularly difficult to make an HTTP-backed seekable file object to pass to ZipFile() for this purpose but I don't have an example. Normally the last few k of the file will contain all 3 pieces. 8k or 16k would be a good guess. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ronaldoussoren at mac.com Thu Dec 6 16:21:30 2012 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 06 Dec 2012 16:21:30 +0100 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> Message-ID: On 6 Dec, 2012, at 15:58, Vinay Sajip wrote: > Daniel Holth gmail.com> writes: > >> The wheel implementation makes sure all the metadata (the .dist-info directory) >> is at the end of the .zip archive. 
It's possible to read the metadata with a >> single HTTP partial request for the end of the archive without downloading the >> entire archive. > > Sounds good, but can you point to any example code which does this? As I > understand it, for .zip files you have to read the last part of the file to get a > pointer to the directory, then read that to find where each file in the archive > is, then seek to a specific position to read the file contents. Because zipfiles can be appended to other files (for example when creating a self-extracting archive) the zipfile module maintains the file offset of the start of a zipfile. The code in the stdlib doesn't appear to test that the zipfile is at a positive offset in the file, therefore with some luck the following will work: * Download the last 10K of the archive (adjust the size to taste, it should be large enough to contain the zipfile directory and the file you are trying to read) * Create a zipfile.ZipFile * Read the zipfile member. If that doesn't work you'll have to create a temporary file of the right size and place the downloaded bit at the end of that file. BTW. Another (more hacky) alternative is to place the interesting bits of dist-info at the start of the zipfile, then you only need to download the first bit of the archive and can then extract the bits you need by parsing the local file headers (zipfiles contain both a directory at the end of the zipfile and a local header stored just before the file data). Ronald > > Regards, > > Vinay Sajip > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ronaldoussoren%40mac.com From vinay_sajip at yahoo.co.uk Thu Dec 6 17:30:35 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Thu, 6 Dec 2012 16:30:35 +0000 (UTC) Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> Message-ID: Daniel Holth gmail.com> writes: > You have to make a maximum of 3 requests: one for the directory pointer, one > for the directory, and one for the file you want. It's not particularly > difficult to make an HTTP-backed seekable file object to pass to ZipFile() for > this purpose but I don't have an example. Normally the last few k of the file > will contain all 3 pieces. 8k or 16k would be a good guess. I don't need an example for doing it with multiple HTTP requests. I only asked for an example because you said one could read the metadata "with a single HTTP partial request", and I couldn't see how it could always be done with a single request. PEP 427 is mute on the subject of zip file comments in a .whl, but perhaps it shouldn't be. IIUC, the directory of the zip file *could* be further from the end of the file by more than 16K, due to the possible presence of a pathologically large comment in the end record. 
Regards, Vinay Sajip From dholth at gmail.com Thu Dec 6 18:19:24 2012 From: dholth at gmail.com (Daniel Holth) Date: Thu, 6 Dec 2012 12:19:24 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> Message-ID: On Thu, Dec 6, 2012 at 11:30 AM, Vinay Sajip wrote: > Daniel Holth gmail.com> writes: > > > You have to make a maximum of 3 requests: one for the directory pointer, > one > > for the directory, and one for the file you want. It's not particularly > > difficult to make an HTTP-backed seekable file object to pass to > ZipFile() for > > this purpose but I don't have an example. Normally the last few k of the > file > > will contain all 3 pieces. 8k or 16k would be a good guess. > > I don't need an example for doing it with multiple HTTP requests. I only > asked > for an example because you said one could read the metadata "with a single > HTTP partial request", and I couldn't see how it could always be done with > a > single request. > > PEP 427 is mute on the subject of zip file comments in a .whl, but perhaps > it > shouldn't be. IIUC, the directory of the zip file *could* be further from > the end > of the file by more than 16K, due to the possible presence of a > pathologically > large comment in the end record. It's just a "usually works" optimization that might be fun when bandwidth is more important than round trip times. The distance between the directory and the end of the file depends on the size of the directory. Django's is an extreme case at nearly half a meg; most are much smaller. On many filesystems it is cheap to create a sparse file the size of the entire archive and write the partial requests into it. The OS doesn't actually store all the 0's. The other reason wheel puts the metadata at the end is so the metadata can be re-written efficiently without re-writing the entire zipfile. The wheel project implements ZipFile.pop() which truncates the last file from a (normal) zip archive. This is especially useful when the last file is the attached digital signature. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pje at telecommunity.com Fri Dec 7 06:47:25 2012 From: pje at telecommunity.com (PJ Eby) Date: Fri, 7 Dec 2012 00:47:25 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> Message-ID: On Thu, Dec 6, 2012 at 8:39 AM, Daniel Holth wrote: > It will be Obsoleted-By:. The "drop in replacement" requirement will be > removed. The package manager will say "you are using these obsolete > packages; check out these non-obsolete ones" but will not automatically pull > the replacement without a Requires tag. Sounds fine to me. > I will probably add the unambiguous Conflicts: tag "uninstall this other > package if I am installed". Please don't. See my lengthy posts from the previous PEP 345 retread discussion for why, or ask MRAB to succinctly summarize them as he did so brilliantly with the obsoletes/obsoleted-by issue. 
;-) I'll take a stab at a short version, though: a conflict (other than filename conflict) is not an installation-time property of a single project, but rather a *runtime* property of an overall system to which the projects are being installed, including configuration that is out of scope for a Python-specific installation tool to manage. In addition, even declaring overall conflicts as a *mere shorthand* for an existing file conflict creates the possibility of stale conflict information! For example, RuleDispatch vs. PyDispatcher: at one time both provided a "dispatch" package, but if RuleDispatch declared PyDispatcher conflicting, the declaration would quickly have become outdated: PyDispatcher soon renamed its provided package to resolve the conflict. A file-based system can both detect and resolve this conflict (or lack thereof) automatically, whereas a manual "Conflicts" notation must be maintained by the author(s) of one or both packages and removed when out of date. In effect, a "conflicts" field actually *creates* conflicts and maintenance burdens where they did not previously exist, because even after the conflict no longer really existed, an automated tool would have prevented PyDispatch from being installed, or, per your suggestion above, unnecessarily *uninstalled* it after a user installed RuleDispatch. And unlike the Obsoletes->Obsoleted-By change, I do not know of any similar way to salvage the idea of a Conflicts field, without reference to some mediating authority that manages the information on behalf of an overall system into which the projects are being fitted. But in that case, neither of the projects really owns the declaration - it's more like Zope (say) would need a list of plugins that conflict with each other, or they could declare that they conflict when activated in the same instance. A generic Python installer, however, that doesn't know about Zope instances or Apache vhosts or Django apps or any other "environment of conflict", can't assume that *mere installation* constitutes a conflict! It doesn't know, for example, whether code from two simultaneously-installed packages will ever even be *imported* in the same process, let alone whether their specific conflicting features will be used in that process. This effectively ensures that in general, Python installation tools can *only* rely on file-based conflicts as being denotable by project metadata -- and even then, it's better to stick with *actual* file conflicts rather than predicted ones, to avoid the type of logjam described above. P.S. Sorry once again to drag you through all this at the last minute; I just sort of assumed you picked up where Alexis left off on the previous attempt at an update to PEP 345 and didn't pay close enough attention to earlier drafts. From pje at telecommunity.com Fri Dec 7 06:49:27 2012 From: pje at telecommunity.com (PJ Eby) Date: Fri, 7 Dec 2012 00:49:27 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> Message-ID: On Thu, Dec 6, 2012 at 9:58 AM, Vinay Sajip wrote: > Daniel Holth gmail.com> writes: > >> The wheel implementation makes sure all the metadata (the .dist-info directory) >> is at the end of the .zip archive. 
>> It's possible to read the metadata with a
>> single HTTP partial request for the end of the archive without downloading the
>> entire archive.
>
> Sounds good, but can you point to any example code which does this? As I
> understand it, for .zip files you have to read the last part of the file to get a
> pointer to the directory, then read that to find where each file in the archive
> is, then seek to a specific position to read the file contents.

ISTR that this is especially true for zipimport: I think it depends on a
zipfile signature being present at the *end* of the file. Certainly, the
standard for .exe and shell wrappers for zipfiles is to place them at the
beginning of the file, rather than the end.

From pje at telecommunity.com  Fri Dec  7 07:18:40 2012
From: pje at telecommunity.com (PJ Eby)
Date: Fri, 7 Dec 2012 01:18:40 -0500
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: <20121206064925.GC2613@unaka.lan>
References: <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp>
 <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp>
 <90224964F19542FFA2B4AE2D115B54C1@gmail.com>
 <20121206064925.GC2613@unaka.lan>
Message-ID: 

On Thu, Dec 6, 2012 at 1:49 AM, Toshio Kuratomi wrote:
> On Wed, Dec 05, 2012 at 07:34:41PM -0500, PJ Eby wrote:
>> On Wed, Dec 5, 2012 at 6:07 PM, Donald Stufft wrote:
>>
>> Nobody has actually proposed a better one, outside of package renaming
>> -- and that example featured an author who could just as easily have
>> used an obsoleted-by field.
>>
> How about pexpect and pexpect-u as a better example?

Perhaps you could explain? I'm not familiar with those projects.
> It could be an argument to > explicitly tell implementers of install tools that they all have caveats > when used with pypi and similar unpoliced community package repositories. AFAIK, there are only a handful of curated repositories: Scipy, Enthought, and ActiveState come to mind. These are essentially "python distros", and they might certainly have reason to build policy into their metadata. I expect, however, that they would not want the *package* authors declaring their own conflicts or obsolescence, so I'm not sure how the metadata spec will help them. Has anyone asked for their input or experience? It seems pointless to speculate on what they might or might not need for curated distribution. (I'm pretty sure Enthought has their own install tools, not sure about the other two.) > The install tools can then choose how they wish to deal with those caveats. > Some example strategies: choose to prompt the user as to which to install, > choose to always treat the fields as human-informational only, mark some > repositories as being trusted to contain packages where these fields are > active and other repositories where the fields are ignored. A peculiar phenomenon: every defense of these fields seems to refer almost exclusively to how the problems could be fixed or why the problems aren't that bad, rather than *how useful the fields would be* in real-world scenarios. In some cases, the argument for the fields' safety actually runs *counter* to their usefulness, e.g., the fields aren't that bad because we could make them have a limited function or no function at all. Isn't lack of usefulness generally considered an argument for *not* including a feature? ;-) From a.badger at gmail.com Fri Dec 7 18:01:34 2012 From: a.badger at gmail.com (Toshio Kuratomi) Date: Fri, 7 Dec 2012 09:01:34 -0800 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <20121206064925.GC2613@unaka.lan> Message-ID: <20121207170134.GD2613@unaka.lan> On Fri, Dec 07, 2012 at 01:18:40AM -0500, PJ Eby wrote: > On Thu, Dec 6, 2012 at 1:49 AM, Toshio Kuratomi wrote: > > On Wed, Dec 05, 2012 at 07:34:41PM -0500, PJ Eby wrote: > >> On Wed, Dec 5, 2012 at 6:07 PM, Donald Stufft wrote: > >> > >> Nobody has actually proposed a better one, outside of package renaming > >> -- and that example featured an author who could just as easily have > >> used an obsoleted-by field. > >> > > How about pexpect and pextpect-u as a better example? > > Perhaps you could explain? I'm not familiar with those projects. > pexepect was last released in 2008. Upstream went silent with unanswered bugs in its tracker and no mailing list. A fork of pexpect was created that addressed the issue of unicode type in python2, a python3 port, and has slowly evolvd since then. I see that the original upstream has made some commits to their source repository since the fork was created although there has still been no new release. > > Note that although well-managed Linux distros attempt to control random > > forking internally, the distro package managers don't prevent people from > > installing from third parties. So Ubuntu PPAs, upstreams that provide their > > own rpms/debs, and major third party repos (for instance, rpmfusion as > > an add-on repo to Fedora) all have and sometimes (mis)use the ability to > > Obsolete packages in the base repository. 
> But in each of these cases, the packages are being defined *with
> reference to* some underlying vision of what the distro (or even "a
> distro") is. An Ubuntu PPA, if I understand correctly, is still
> *building an Ubuntu system*. Python packaging as a whole lacks such
> frames of reference. A forked distro is still a distro, and it's a
> fork *of something*. Rpmfusion is defining an enhanced Fedora, not
> slinging random unrelated packages about.
>
Uhm.... that's both true and false, as with any complex system. rpm and deb are just packaging formats. So:

*) Not all packages are built on top of that system. There are rpm
packages provided by upstreams that users attempt (to greater and lesser
degrees of success) to install on SuSE, RHEL, Fedora, Mandriva, etc. There
are debs built for Ubuntu that people attempt to install onto Debian.

*) PPAs and rpmfusion may both build on top of an existing system but they
can change the underlying structure, replacing components that other pieces
of the base system depend on. You talk about the setuptools and distribute
problem on pypi.... there's absolutely nothing that prevents someone from
building a PPA or a package in a third-party rpm repository that packages
a setuptools that Obsoletes: distribute or a distribute package that
Obsoletes: setuptools.

> The install tools can then choose how they wish to deal with those caveats.
> Some example strategies: choose to prompt the user as to which to install,
> choose to always treat the fields as human-informational only, mark some
> repositories as being trusted to contain packages where these fields are
> active and other repositories where the fields are ignored.

> A peculiar phenomenon: every defense of these fields seems to refer
> almost exclusively to how the problems could be fixed or why the
> problems aren't that bad, rather than *how useful the fields would be*
> in real-world scenarios. In some cases, the argument for the fields'
> safety actually runs *counter* to their usefulness, e.g., the fields
> aren't that bad because we could make them have a limited function or
> no function at all. Isn't lack of usefulness generally considered an
> argument for *not* including a feature? ;-)

If you constantly forget why the fields are useful, then I suppose you'll always believe that :-)

-Toshio
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
URL: 

From status at bugs.python.org Fri Dec 7 18:07:23 2012
From: status at bugs.python.org (Python tracker)
Date: Fri, 7 Dec 2012 18:07:23 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20121207170723.06DF11C9B9@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2012-11-30 - 2012-12-07)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.
Issues counts and deltas:
  open    3816 (+24)
  closed 24597 (+31)
  total  28413 (+55)

Open issues with patches: 1671

Issues opened (36)
==================

#16582: Tkinter calls SystemExit with string
       http://bugs.python.org/issue16582  opened by Abraham Karplus
#16584: unhandled IOError filecmp.cmpfiles() if file not readable
       http://bugs.python.org/issue16584  opened by till
#16587: Py_Initialize breaks wprintf on Windows
       http://bugs.python.org/issue16587  opened by makegho
#16591: RUNSHARED wrong for OSX no framework
       http://bugs.python.org/issue16591  opened by grobian
#16594: SocketServer should set SO_REUSEPORT along with SO_REUSEADDR w
       http://bugs.python.org/issue16594  opened by Andy.Zeldis
#16595: Add resource.prlimit
       http://bugs.python.org/issue16595  opened by christian.heimes
#16596: Skip stack unwinding when "next", "until" and "return" pdb co
       http://bugs.python.org/issue16596  opened by asvetlov
#16597: file descriptor not being closed with context manager on IOErr
       http://bugs.python.org/issue16597  opened by udoprog
#16598: Docs: double newlines printed in some file iteration examples
       http://bugs.python.org/issue16598  opened by lost-theory
#16599: unittest: Access test result from tearDown
       http://bugs.python.org/issue16599  opened by techtonik
#16601: Restarting iteration over tarfile continues from where it left
       http://bugs.python.org/issue16601  opened by mbirtwell
#16602: weakref can return an object with 0 refcount
       http://bugs.python.org/issue16602  opened by eltoder
#16603: Sporadic test_socket failures: testFDPassCMSG_SPACE on Mac OS
       http://bugs.python.org/issue16603  opened by haypo
#16608: immutable subclass constructor call error does not show subcla
       http://bugs.python.org/issue16608  opened by LambertDW
#16609: float loses precision when passed to str()
       http://bugs.python.org/issue16609  opened by sleepycal
#16611: multiple problems with Cookie.py
       http://bugs.python.org/issue16611  opened by jdennis
#16612: Integrate "Argument Clinic" specialized preprocessor into CPyt
       http://bugs.python.org/issue16612  opened by larry
#16613: ChainMap.new_child could use improvement
       http://bugs.python.org/issue16613  opened by vinay.sajip
#16614: argparse should have an option to require un-abbreviated optio
       http://bugs.python.org/issue16614  opened by Michael.Edwards
#16615: gcc 4.7 unused-but-set warnings
       http://bugs.python.org/issue16615  opened by jcea
#16616: test_poll.PollTests.poll_unit_tests() is dead code
       http://bugs.python.org/issue16616  opened by sbt
#16618: Different glob() results for strings and bytes
       http://bugs.python.org/issue16618  opened by serhiy.storchaka
#16620: Avoid using private function glob.glob1() in msi module and to
       http://bugs.python.org/issue16620  opened by serhiy.storchaka
#16621: sched module enhancement request
       http://bugs.python.org/issue16621  opened by carlosmf.pt
#16623: argparse help formatter does not honor non-breaking space
       http://bugs.python.org/issue16623  opened by roysmith
#16624: subprocess.check_output should allow specifying stdin as a str
       http://bugs.python.org/issue16624  opened by zwol
#16626: Infinite recursion in glob.glob('*:') on Windows
       http://bugs.python.org/issue16626  opened by serhiy.storchaka
#16628: leak in ctypes.resize()
       http://bugs.python.org/issue16628  opened by pitrou
#16629: IDLE: Calltips test fails due to int docstring change
       http://bugs.python.org/issue16629  opened by serwy
#16630: IDLE: Calltip fails if __getattr__ raises exception
       http://bugs.python.org/issue16630  opened by serwy
#16631: tarfile.extractall() doesn't extract everything if .next() was
       http://bugs.python.org/issue16631  opened by techtonik
#16632: Enable DEP and ASLR
       http://bugs.python.org/issue16632  opened by christian.heimes
#16633: os.environ updates only one copy of env vars under Windows (Ge
       http://bugs.python.org/issue16633  opened by eudoxos
#16634: urllib.error.HTTPError.reason is not documented
       http://bugs.python.org/issue16634  opened by berker.peksag
#16635: Interpreter not closing stdout/stderr on exit
       http://bugs.python.org/issue16635  opened by filip.zyzniewski
#16636: codecs: readline() followed by readlines() returns trunkated r
       http://bugs.python.org/issue16636  opened by laurynas

Most recent 15 issues with no replies (15)
==========================================

#16636: codecs: readline() followed by readlines() returns trunkated r
       http://bugs.python.org/issue16636
#16634: urllib.error.HTTPError.reason is not documented
       http://bugs.python.org/issue16634
#16628: leak in ctypes.resize()
       http://bugs.python.org/issue16628
#16626: Infinite recursion in glob.glob('*:') on Windows
       http://bugs.python.org/issue16626
#16623: argparse help formatter does not honor non-breaking space
       http://bugs.python.org/issue16623
#16620: Avoid using private function glob.glob1() in msi module and to
       http://bugs.python.org/issue16620
#16616: test_poll.PollTests.poll_unit_tests() is dead code
       http://bugs.python.org/issue16616
#16613: ChainMap.new_child could use improvement
       http://bugs.python.org/issue16613
#16612: Integrate "Argument Clinic" specialized preprocessor into CPyt
       http://bugs.python.org/issue16612
#16611: multiple problems with Cookie.py
       http://bugs.python.org/issue16611
#16603: Sporadic test_socket failures: testFDPassCMSG_SPACE on Mac OS
       http://bugs.python.org/issue16603
#16598: Docs: double newlines printed in some file iteration examples
       http://bugs.python.org/issue16598
#16594: SocketServer should set SO_REUSEPORT along with SO_REUSEADDR w
       http://bugs.python.org/issue16594
#16591: RUNSHARED wrong for OSX no framework
       http://bugs.python.org/issue16591
#16584: unhandled IOError filecmp.cmpfiles() if file not readable
       http://bugs.python.org/issue16584

Most recent 15 issues waiting for review (15)
=============================================

#16634: urllib.error.HTTPError.reason is not documented
       http://bugs.python.org/issue16634
#16632: Enable DEP and ASLR
       http://bugs.python.org/issue16632
#16631: tarfile.extractall() doesn't extract everything if .next() was
       http://bugs.python.org/issue16631
#16630: IDLE: Calltip fails if __getattr__ raises exception
       http://bugs.python.org/issue16630
#16629: IDLE: Calltips test fails due to int docstring change
       http://bugs.python.org/issue16629
#16628: leak in ctypes.resize()
       http://bugs.python.org/issue16628
#16626: Infinite recursion in glob.glob('*:') on Windows
       http://bugs.python.org/issue16626
#16624: subprocess.check_output should allow specifying stdin as a str
       http://bugs.python.org/issue16624
#16618: Different glob() results for strings and bytes
       http://bugs.python.org/issue16618
#16601: Restarting iteration over tarfile continues from where it left
       http://bugs.python.org/issue16601
#16598: Docs: double newlines printed in some file iteration examples
       http://bugs.python.org/issue16598
#16597: file descriptor not being closed with context manager on IOErr
       http://bugs.python.org/issue16597
#16596: Skip stack unwinding when "next", "until" and "return" pdb co
       http://bugs.python.org/issue16596
#16595: Add resource.prlimit
       http://bugs.python.org/issue16595
#16591: RUNSHARED wrong for OSX no framework
       http://bugs.python.org/issue16591

Top 10 most discussed issues (10)
=================================

#14621: Hash function is not randomized properly
       http://bugs.python.org/issue14621  12 msgs
#16581: define "PEP editor" in PEP 1
       http://bugs.python.org/issue16581  10 msgs
#16609: float loses precision when passed to str()
       http://bugs.python.org/issue16609  10 msgs
#16569: Preventing errors of simultaneous access in zipfile
       http://bugs.python.org/issue16569  7 msgs
#16599: unittest: Access test result from tearDown
       http://bugs.python.org/issue16599  7 msgs
#14099: ZipFile.open() should not reopen the underlying file
       http://bugs.python.org/issue14099  6 msgs
#15207: mimetypes.read_windows_registry() uses the wrong regkey, creat
       http://bugs.python.org/issue15207  6 msgs
#16596: Skip stack unwinding when "next", "until" and "return" pdb co
       http://bugs.python.org/issue16596  6 msgs
#16631: tarfile.extractall() doesn't extract everything if .next() was
       http://bugs.python.org/issue16631  6 msgs
#16608: immutable subclass constructor call error does not show subcla
       http://bugs.python.org/issue16608  5 msgs

Issues closed (31)
==================

#5701: ZipFile returns compressed data stream when encountering unsup
       http://bugs.python.org/issue5701  closed by serhiy.storchaka
#6036: Clean up test_posixpath.py
       http://bugs.python.org/issue6036  closed by python-dev
#6744: calling kevent repr raises a TypeError
       http://bugs.python.org/issue6744  closed by pitrou
#10052: Python/dtoa.c:158: #error "Failed to find an exact-width 32-bi
       http://bugs.python.org/issue10052  closed by mark.dickinson
#10182: match_start truncates large values
       http://bugs.python.org/issue10182  closed by pitrou
#10589: I/O ABC docs should specify which methods have implementations
       http://bugs.python.org/issue10589  closed by asvetlov
#12457: type() returns incorrect type for nested classes
       http://bugs.python.org/issue12457  closed by pitrou
#13120: Default nosigint option to pdb.Pdb() prevents use in non-main
       http://bugs.python.org/issue13120  closed by asvetlov
#16416: Mac OS X: don't use the locale encoding but UTF-8 to encode an
       http://bugs.python.org/issue16416  closed by haypo
#16444: Use support.TESTFN_UNDECODABLE on UNIX
       http://bugs.python.org/issue16444  closed by haypo
#16562: Optimize dict equality test
       http://bugs.python.org/issue16562  closed by pitrou
#16579: .pyw disturb multiprocessing behavior
       http://bugs.python.org/issue16579  closed by Alex.stein
#16583: Tkinter nested SystemExit
       http://bugs.python.org/issue16583  closed by asvetlov
#16585: surrogateescape broken w/ multibytecodecs' encode
       http://bugs.python.org/issue16585  closed by python-dev
#16586: json library can't parse large (> 2^31) strings
       http://bugs.python.org/issue16586  closed by pitrou
#16588: gcc 4.7 unused-but-set warnings on Python/thread_pthread.h
       http://bugs.python.org/issue16588  closed by christian.heimes
#16589: PrettyPrinter docs is incomplete
       http://bugs.python.org/issue16589  closed by ezio.melotti
#16590: Drop <2.6 support from _json.c
       http://bugs.python.org/issue16590  closed by pitrou
#16592: stringlib_bytes_join doesn't raise MemoryError on allocation f
       http://bugs.python.org/issue16592  closed by christian.heimes
#16593: Have BSD 'make -s' DTRT
       http://bugs.python.org/issue16593  closed by christian.heimes
#16600: small py3k issue in rlcompleter
       http://bugs.python.org/issue16600  closed by rmcgibbo
#16604: Sporadic .test_threaded_import failure: test_parallel_meta_pat
       http://bugs.python.org/issue16604  closed by pitrou
#16605: test_posix.test_fs_holes() fails on
FreeBSD 9.0
       http://bugs.python.org/issue16605  closed by jcea
#16606: hashlib memory leak
       http://bugs.python.org/issue16606  closed by eric.snow
#16607: Bad examples in documentation
       http://bugs.python.org/issue16607  closed by georg.brandl
#16610: Silent StopIteration exc when raised from generator inside of
       http://bugs.python.org/issue16610  closed by r.david.murray
#16617: mimetypes.py UnicodeDecodeError: 'ascii' codec can't decode by
       http://bugs.python.org/issue16617  closed by amaury.forgeotdarc
#16619: LOAD_GLOBAL used to load `None` under certain circumstances
       http://bugs.python.org/issue16619  closed by python-dev
#16622: IDLE crashes on parentheses
       http://bugs.python.org/issue16622  closed by serwy
#16625: Exception on mode 'br'
       http://bugs.python.org/issue16625  closed by Sworddragon
#16627: comparison problem of two 'dict' objects
       http://bugs.python.org/issue16627  closed by ezio.melotti

From pje at telecommunity.com Fri Dec 7 23:02:26 2012
From: pje at telecommunity.com (PJ Eby)
Date: Fri, 7 Dec 2012 17:02:26 -0500
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: <20121207170134.GD2613@unaka.lan>
References: <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <20121206064925.GC2613@unaka.lan> <20121207170134.GD2613@unaka.lan>
Message-ID: 

On Fri, Dec 7, 2012 at 12:01 PM, Toshio Kuratomi wrote:
> On Fri, Dec 07, 2012 at 01:18:40AM -0500, PJ Eby wrote:
>> On Thu, Dec 6, 2012 at 1:49 AM, Toshio Kuratomi wrote:
>> > On Wed, Dec 05, 2012 at 07:34:41PM -0500, PJ Eby wrote:
>> >> Nobody has actually proposed a better one, outside of package renaming
>> >> -- and that example featured an author who could just as easily have
>> >> used an obsoleted-by field.
>> >>
>> > How about pexpect and pexpect-u as a better example?
>>
>> Perhaps you could explain? I'm not familiar with those projects.
>>
> pexpect was last released in 2008. Upstream went silent with unanswered
> bugs in its tracker and no mailing list. A fork of pexpect was created that
> addressed the issue of the unicode type in python2, added a python3 port,
> and has slowly evolved since then.
>
> I see that the original upstream has made some commits to their source
> repository since the fork was created although there has still been no new
> release.

And what problem are you saying these fields would have solved (or which benefits they would have provided), for whom?

If the packages have files in conflict, they won't be both installed. If they don't have files in conflict, there's nothing important to be informed of.

If one is installing pexpect-u, then one does not need to discover that it is a successor of pexpect. If one is installing pexpect, it might be useful to know that pexpect-u exists, but one can't simply discover that from an Obsoletes field on pexpect-u. However, even if one did discover it, this would merely constitute an *advertisement* of pexpect-u's existence, not a *requirement* that it be used in its place. A tool cannot know, without other affirmative user action, that it is actually a good assumption to use the advertised replacement.

In the distro world, a user has *already* taken this affirmative action by choosing which repository to source packages from, on an implicit contract that this source is up to the job of managing his needs across multiple packages. Or, if they choose to source an off-brand or upstream package, they are taking affirmative action to risk it.
In the Python world, there is no notion of a "repository", aside from a handful of managed Python distros, which have their own, distinct packaging methods and distribution tools. So there is no affirmative contract of trust regarding *inter-project* relationships. It is precisely this lack that is why the metadata spec has gone mostly unused since its inception about a decade ago. Nobody really knows what to "provide" or "require", or in what context they would actually be "obsoleting" anything that isn't their own package, or a package they've forked. But if you live mainly in the distro world, this concept seems absurd, and the fields *obviously* useful. But that's because you're swimming in an ocean of context that doesn't exist on dry land. You're saying that *of course* swimming fins are useful... if you live in the ocean. And I, living on dry land, am saying that *sure* they are... but only in a swimming pool or a pond, and we don't have very many of those here in dry Python-land. And the people who run the swimming pools have thoughtfully already provided their own. Do we need to standardize swim fin sizes for people who mostly live on dry land? The flip side of this, btw, is that there's an implicit contract in the Python world that there is generally only "the" package - not "the package as patched and re-packaged by vendors X, Y, and Z". If I install python project foo, version 1.2, I expect it to be the *same* foo-1.2, with the *same metadata*, *no matter where I got it from*. And so, this assumption is our "air" to your "water". We know that pools and ponds (curated Python distros) are different, as an exception to this rule, just as you know that reefs and islands (uncurated repositories, search engines, and upstream-built packages) are different, as an exception to your assumption that "the package I get is intended to play well with everything else in my system." (This of course is why many distro managers are suspicious of language-specific or other sorts of vertical package management tools - they seem as pointless as wheels in the water, solving problems you don't have, and creating new problems for you at the same time. Unfortunately, people on land will keep inventing them, because they have a different set of problems to solve -- some of which are actually created by the ocean-oriented tools. For example, virtualenv and its predecessors were developed to solve the "problem" of a single integrated environment, even though that integrated environment is the "solution" from a distro perspective.) > *) Not all packages built build on top of that system. There are rpm > packages provided by upstreams that users attempt (to greater and lesser > degrees of success) to install on SuSE, RHEL, Fedora, Mandriva, etc. There > are debs built for Ubuntu that people attempt to install onto Debian. Sure. But the reference points still exist, and there is a layer of indirection between "packager" and "developer", even in the case where the packager and developer are the same person or organization. In the Python case, there is usually no such indirection, outside of curated systems like SciPy et al. (Even there, most of what third-party packaging is about in the Python world is taking care of binary builds.) Again, it's islands in the ocean vs. pools on land. > *) PPAs and rpmfusion may both build on top of an existing system but they > can change the underlying structure, replacing components that other pieces > of the base system depend on. 
You talk about the setuptools and distribute
> problem on pypi.... there's absolutely nothing that prevents someone from
> building a PPA or a package in a third-party rpm repository that packages
> a setuptools that Obsoletes: distribute or a distribute package that
> Obsoletes: setuptools.

At the *same time*? That is, are you saying that there are repositories that contain *self-contained* "Obsoletes"-cycles? (Presumably, there are no end-user sites containing such cycles, if the install tool responds by refusing to install one or by removing the other.)

> If you constantly forget why the fields are useful, then I suppose you'll
> always believe that :-)

I've stated many times that they're useful... in the context of a larger system. Within the distro packaging ecosystem, a package "conflicts", "obsoletes", or "provides" things *relative* to some notion of an installation -- however vague -- that has been selected by an explicit user action (such as choice of basic distro, package manager, and repository).

So, despite their framing as binary relationships -- e.g. Obsoletes(predecessor, successor) -- the *actual* relationship is three-valued: Obsoletes(predecessor, successor, integration-context). The third player in the relationship is whoever *packaged* the project(s) in question... and in the Python world (outside of curated repositories), that packager is *always the original author*.

Now, in the case where the packager and author are different, we can talk about such relationships in the same way: binary relationships with an implied third. For example, if SciPy decided at some point to replace NumPy with NumPyPy, it would be more than reasonable to state that Obsoletes(NumPy, NumPyPy, SciPy), even as at the same time, perhaps Enthought has already tried this and decided to go the other way, so that Obsoletes(NumPyPy, NumPy, EnthoughtPD). They use different tools and repositories and thus can imply the third position. In neither case, however, is SciPy or Enthought (nor the authors of NumPy or NumPyPy) entitled to declare an Obsoletes relationship with a *true* wildcard for the third position.

And so the key distinction between PyPI and the distro world is that *PyPI is not an integration context*. Packages provided by authors do not usually include this type of metadata, unless the author of the package has a specific integration context in mind. So the burden falls to either the repository manager or the user to define these higher-level relationships *within their intended integration context*. (Or to put it another way, *somebody* has to be the "packager", not just the "developer".)

Currently, Python distribution tools, culture, and methodology do not have any precedent for the metadata spec contents to be overridden by a third-party packager, curator or repository manager, in the way that is normal and common in the distro world. (Try to imagine a Linux distro where this kind of information was *always* put in "upstream", because *there is no such thing* as "downstream". That's what it's like "on land".)

This is why I keep saying that blind copying is an invitation to trouble, and that clear thinking about the actual requirements is needed. I would not object to explicitly three-way versions of these fields (requires, provides, conflicts, obsoletes) that define a specific integration context in which the statement applies. (Although defining how to name integration contexts would present a *new* challenge for discussion!)
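To make the shape of that three-way relationship concrete, a minimal sketch; the record shape and field names here are purely illustrative, not a spec proposal:

    # Illustrative only: the relationship is really three-valued,
    # Obsoletes(predecessor, successor, integration-context).
    obsoletes = {
        "predecessor": "NumPy",
        "successor": "NumPyPy",
        "context": "SciPy",     # the integration context asserting it
    }
    # The same pair can legitimately point the other way in another
    # context, e.g. predecessor="NumPyPy", successor="NumPy",
    # context="EnthoughtPD".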
Likewise, I would not object to discussion of how to manage metadata for *repackaging* of Python projects by third-party curators (e.g. SciPy et al), and ways to keep that separate from the author's declarations. Or discussion of what should constitute a "repository" in the Python world, as opposed to what we have now (which, apart from curated distributions, consists mainly of indexes, not true repositories in the distro sense).

Today, however, there is no separation in the metadata spec (or tools) between "packaging" (in the sense understood by distros) and "distributing" (in the sense normally applied to Python packages distributed via PyPI and similar channels). And "packaging" in the distro sense is all about *integrating* packages, not merely making them *available* for others to integrate. That's the critical difference between the two, and in the resulting use cases for the metadata spec.

From techtonik at gmail.com Sat Dec 8 02:10:21 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Sat, 8 Dec 2012 04:10:21 +0300
Subject: [Python-Dev] PEP 3145 (With Contents)
In-Reply-To: <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain>
References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain>
Message-ID: 

On Tue, Sep 15, 2009 at 9:24 PM, wrote:
> On 04:25 pm, eric.pruitt at gmail.com wrote:
>> I'm bumping this PEP again in hopes of getting some feedback.
>
This is useful, indeed. The ActiveState recipe for this has 10 votes, which is high for ActiveState (and for such a hardcore topic, FWIW).

> On Tue, Sep 8, 2009 at 23:52, Eric Pruitt wrote:
>>> PEP: 3145
>>> Title: Asynchronous I/O For subprocess.Popen
>>> Author: (James) Eric Pruitt, Charles R. McCreary, Josiah Carlson
>>> Type: Standards Track
>>> Content-Type: text/plain
>>> Created: 04-Aug-2009
>>> Python-Version: 3.2
>>>
>>> Abstract:
>>>
>>> In its present form, the subprocess.Popen implementation is prone to
>>> dead-locking and blocking of the parent Python script while waiting
>>> on data from the child process.
>>>
>>> Motivation:
>>>
>>> A search for "python asynchronous subprocess" will turn up numerous
>>> accounts of people wanting to execute a child process and communicate
>>> with it from time to time reading only the data that is available
>>> instead of blocking to wait for the program to produce data [1] [2] [3].
>>> The current behavior of the subprocess module is that when a user sends
>>> or receives data via the stdin, stderr and stdout file objects, dead
>>> locks are common and documented [4] [5]. While communicate can be used
>>> to alleviate some of the buffering issues, it will still cause the
>>> parent process to block while attempting to read data when none is
>>> available to be read from the child process.
>>>
>>> Rationale:
>>>
>>> There is a documented need for asynchronous, non-blocking functionality
>>> in subprocess.Popen [6] [7] [2] [3]. Inclusion of the code would improve
>>> the utility of the Python standard library that can be used on Unix
>>> based and Windows builds of Python. Practically every I/O object in
>>> Python has a file-like wrapper of some sort. Sockets already act as
>>> such and for strings there is StringIO.
>>> Popen can be made to act like a file by simply
>>> using the methods attached to the subprocess.Popen.stderr, stdout and
>>> stdin file-like objects. But when using the read and write methods of
>>> those options, you do not have the benefit of asynchronous I/O. In the
>>> proposed solution the wrapper wraps the asynchronous methods to mimic
>>> a file object.
>>>
>>> Reference Implementation:
>>>
>>> I have been maintaining a Google Code repository that contains all of
>>> my changes including tests and documentation [9] as well as a blog
>>> detailing the problems I have come across in the development process [10].
>>>
>>> I have been working on implementing non-blocking asynchronous I/O in
>>> the subprocess.Popen module as well as a wrapper class for
>>> subprocess.Popen that makes it so that an executed process can take the
>>> place of a file by duplicating all of the methods and attributes that
>>> file objects have.
>
> "Non-blocking" and "asynchronous" are actually two different things. From
> the rest of this PEP, I think only a non-blocking API is being introduced.
> I haven't looked beyond the PEP, though, so I might be missing something.

I suggest renaming http://www.python.org/dev/peps/pep-3145/ to 'Non-blocking I/O for subprocess' and continuing from there. IMHO this is the stage where examples of the deadlocks that occur with the current subprocess implementation are badly needed.

>>> There are two base functions that have been added to the
>>> subprocess.Popen class: Popen.send and Popen._recv, each with two
>>> separate implementations, one for Windows and one for Unix based
>>> systems. The Windows implementation uses ctypes to access the functions
>>> needed to control pipes in the kernel 32 DLL in an asynchronous manner.
>>> On Unix based systems, the Python interface for file control serves the
>>> same purpose. The different implementations of Popen.send and
>>> Popen._recv have identical arguments to make code that uses these
>>> functions work across multiple platforms.
>
> Why does the method for non-blocking read from a pipe start with an "_"?
> This is the convention (widely used) for a private API. The name also
> doesn't suggest that this is the non-blocking version of reading.
> Similarly, the name "send" doesn't suggest that this is the non-blocking
> version of writing.

The implementation is based on http://code.activestate.com/recipes/440554/ which more clearly illustrates the integrated functionality. _recv() is a private base function which takes stdout or stderr as a parameter. The corresponding user-level functions to read from stdout and stderr are .recv() and .recv_err().

I thought about renaming the API to .asyncread() and .asyncwrite(), but that may suggest that you call a method and then the result asynchronously starts to fill some buffer, which is not the case here. Then I thought about .check_read() and .check_write(), literally meaning 'check and read' or 'check and return' for non-blocking calls if there is nothing. But then again, the naming convention of subprocess is already poor: it uses .check_output() for a blocking read until the command completes.

Currently, subprocess doesn't have .read and .write methods. It may be the best option:

.write(what) to pipe more stuff into the input buffer of the child process.
.read(from) where `from` is either subprocess.STDOUT or STDERR

Both functions should be marked as non-blocking in the docs and documented to return None if the pipe is closed.
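To pin down those semantics, here is a minimal POSIX-only sketch of what such a non-blocking read has to do under the hood (a hypothetical helper, not the reference implementation; on Windows the same check needs a different mechanism, as mentioned below):

    import os
    import select

    def nonblocking_read(pipe, timeout=0):
        # Return available bytes from a Popen pipe, b'' if nothing is
        # ready yet, or None if the pipe is closed. POSIX only: relies
        # on select(); a Windows version would use PeekNamedPipe.
        if pipe.closed:
            return None
        ready, _, _ = select.select([pipe], [], [], timeout)
        if not ready:
            return b''                       # nothing available right now
        data = os.read(pipe.fileno(), 4096)  # won't block after select()
        return data if data else None        # b'' from os.read() means EOF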
>>> When calling the Popen._recv function, it requires the pipe name be
>>> passed as an argument, so there exists the Popen.recv function that
>>> selects stdout as the pipe for Popen._recv by default. Popen.recv_err
>>> selects stderr as the pipe by default. "Popen.recv" and "Popen.recv_err"
>>> are much easier to read and understand than "Popen._recv('stdout' ..."
>>> and "Popen._recv('stderr' ..." respectively.
>
> What about reading from other file descriptors? subprocess.Popen allows
> arbitrary file descriptors to be used. Is there any provision here for
> reading and writing non-blocking from or to those?

On Windows it is WriteFile/ReadFile and PeekNamedPipe. On Linux it is select. Of course a test is needed, but why should it not just work?

>>> Since the Popen._recv function does not wait on data to be produced
>>> before returning a value, it may return empty bytes. Popen.asyncread
>>> handles this issue by returning all data read over a given time
>>> interval.
>
> Oh. Popen.asyncread? What's that? This is the first time the PEP
> mentions it.

I guess that's for a blocking read with a timeout. That is around number ~500 among the most popular questions about Python:
http://stackoverflow.com/questions/1191374/subprocess-with-timeout

>>> The ProcessIOWrapper class uses the asyncread and asyncwrite functions
>>> to allow a process to act like a file so that there are no blocking
>>> issues that can arise from using the stdout and stdin file objects
>>> produced from a subprocess.Popen call.
>
> What's the ProcessIOWrapper class? And what's the asyncwrite function?
> Again, this is the first time it's mentioned.

Oh. That's a wrapper to access subprocess pipes with a familiar file API. It is interesting:
http://code.google.com/p/subprocdev/source/browse/subprocess.py?name=python3k

> So, to sum up, I think my main comment is that the PEP seems to be missing
> a significant portion of the details of what it's actually proposing. I
> suspect that this information is present in the implementation, which I
> have not looked at, but it probably belongs in the PEP.
>
> Jean-Paul

Writing PEPs is definitely a job, and a hard one for developers. Too bad a good idea *and* implementation (tests needed) are put on hold because there is nobody who can help with that part. IMHO the PEP needs to expand on user stories: even with a significant amount of cited sources, a practical summary and an illustration of the problem by examples are missing.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ncoghlan at gmail.com Sat Dec 8 02:33:45 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 8 Dec 2012 11:33:45 +1000
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: 
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com>
Message-ID: 

On Fri, Dec 7, 2012 at 3:47 PM, PJ Eby wrote:
> In effect, a "conflicts" field actually *creates* conflicts and
> maintenance burdens where they did not previously exist, because even
> after the conflict no longer really existed, an automated tool would
> have prevented PyDispatch from being installed, or, per your
> suggestion above, unnecessarily *uninstalled* it after a user
> installed RuleDispatch.

That's not what a Conflicts field is for.
It's to allow a project to say *they don't support* installing in parallel with another package. It doesn't matter why it's unsupported, it's making a conflict perceived by the project explicit in their metadata.

Such a field is designed to convey information to users about *supported* configurations, regardless of whether or not they happen to work for a given use case. If a user believes a declared conflict is in error, and having the two installed in parallel is important to them, they can:

1. Use virtual environments to keep the two projects isolated from each other
2. Use an installer that ignores Conflicts information (which will be all of them, since that's the status quo)
3. Make their case to the upstream project that the conflict has been resolved, and installing the two in parallel no longer causes issues

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From donald.stufft at gmail.com Sat Dec 8 02:46:57 2012
From: donald.stufft at gmail.com (Donald Stufft)
Date: Fri, 7 Dec 2012 20:46:57 -0500
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: 
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com>
Message-ID: 

On Friday, December 7, 2012 at 8:33 PM, Nick Coghlan wrote:
> That's not what a Conflicts field is for. It's to allow a project to say *they don't support* installing in parallel with another package. It doesn't matter why it's unsupported, it's making a conflict perceived by the project explicit in their metadata.
>
> Such a field is designed to convey information to users about *supported* configurations, regardless of whether or not they happen to work for a given use case. If a user believes a declared conflict is in error, and having the two installed in parallel is important to them, they can:
> 1. Use virtual environments to keep the two projects isolated from each other
> 2. Use an installer that ignores Conflicts information (which will be all of them, since that's the status quo)
> 3. Make their case to the upstream project that the conflict has been resolved, and installing the two in parallel no longer causes issues

4. Use the eventual --force flag that any installer that supported conflicts is likely to include ;)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ncoghlan at gmail.com Sat Dec 8 02:49:29 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 8 Dec 2012 11:49:29 +1000
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: 
References: <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <20121206064925.GC2613@unaka.lan> <20121207170134.GD2613@unaka.lan>
Message-ID: 

On Sat, Dec 8, 2012 at 8:02 AM, PJ Eby wrote:
> > *) Not all packages are built on top of that system. There are rpm
> > packages provided by upstreams that users attempt (to greater and lesser
> > degrees of success) to install on SuSE, RHEL, Fedora, Mandriva, etc. There
> > are debs built for Ubuntu that people attempt to install onto Debian.
>
> Sure. But the reference points still exist, and there is a layer of
> indirection between "packager" and "developer", even in the case where
> the packager and developer are the same person or organization.
> In the Python case, there is usually no such indirection, outside of
> curated systems like SciPy et al. (Even there, most of what
> third-party packaging is about in the Python world is taking care of
> binary builds.)
>
> Again, it's islands in the ocean vs. pools on land.

To strain the analogy, the main value of these fields exists on the beach: at the point where you need to impedance match between the Python community and another packaging community. The ideal is to be able to get to a point where you can point an automated tool at a project on PyPI and say "give me that, only packaged as RPM/deb/whatever, with appropriate distro specific metadata". Such a tool is obviously going to be distro specific, since it is going to have to do some remapping based on file names to pick up existing distro packages, but it *also* needs upstream metadata.

Even in a distro, a "Conflicts:" field often *does* denote runtime conflicts (e.g. over a particular network port), because, as you say, filesystem level conflicts will usually be picked up automatically. The distro philosophy is to say "OK, we simply won't let you install conflicting projects at the same time, so you won't be surprised later by a conflict that only shows up if you happen to run them both at the same time". It's designed to turn a complex, hard to debug, problem into a simple, explicit error at installation time.

People build complex systems (especially web apps) based on the PyPI ecosystem, and the upstream communities *can* assist in flagging potential issues in advance. If people start putting bad metadata in their projects, then that's just a bug to be dealt with like any other.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pje at telecommunity.com Sat Dec 8 07:46:51 2012
From: pje at telecommunity.com (PJ Eby)
Date: Sat, 8 Dec 2012 01:46:51 -0500
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: 
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com>
Message-ID: 

On Fri, Dec 7, 2012 at 8:33 PM, Nick Coghlan wrote:
> That's not what a Conflicts field is for. It's to allow a project to say
> *they don't support* installing in parallel with another package.

If that's the actual intended use case, the PEP needs some revision. In particular, if there's a behavioral recommendation for installer tools, it should be to avoid installing the project that *declares* the conflict, rather than the one that is the object of that declaration. ;-)

In any case, as I said before, I don't have an issue with the fields all being declared as being for informational purposes only. My issue is only with recommendations for automated tool behavior that permit one project's author to exercise authority over another project's installation. If the fields are defined in such a way that an author can only shoot *themselves* in the foot with a bad declaration, that's fine by me.
So if package A includes a "Conflicts: B" declaration, I recommend the following:

* An attempt to install A with B already present refuses to install A without a warning and confirmation
* An attempt to install B informs the user of the conflict, and optionally offers to uninstall A

In this way, any collateral damage to B is avoided, while still making the intended "lack of support" declaration clear.

How does that sound?

From ncoghlan at gmail.com Sat Dec 8 11:06:21 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 8 Dec 2012 20:06:21 +1000
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: 
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com>
Message-ID: 

On Sat, Dec 8, 2012 at 4:46 PM, PJ Eby wrote:
> So if package A includes a "Conflicts: B" declaration, I recommend the
> following:
>
> * An attempt to install A with B already present refuses to install A
> without a warning and confirmation
> * An attempt to install B informs the user of the conflict, and
> optionally offers to uninstall A
>
> In this way, any collateral damage to B is avoided, while still making
> the intended "lack of support" declaration clear.
>
> How does that sound?

No, that's not the way it works. A conflict is always symmetric, no matter who declares it. The beneficiary of these notifications is the aggregator attempting to build a systematically coherent system, rather than one with latent incompatibilities waiting to bite them at run time. It doesn't *matter* if "A conflicts with B" or "B conflicts with A", you cannot have a system with both of them installed that will be supported by the developers of both A *and* B.

Now, this beneficiary *may* be the packagers for a Linux distribution, but it may also be a larger Python distribution (ActiveState, EPD, etc), a web application developer, a desktop application developer, a system integrator for a large-scale distributed system, or anyone else that combines and deploys an integrated set of packages (even those a developer installs on their personal workstation). It's up to the user to decide who they want to believe.

Now, it may be that, for a given use case, the end user doesn't actually care about the potential conflict (e.g. they've done their own research and determined that the conflicting behaviour doesn't affect their system) - that's then a design decision in the installation tools as to whether or not they want to make it easy for users to override the metadata. In the Linux distro case, the installer *and* most of the metadata are largely provided by the same people, so yum/rpm/etc generally *don't* make it easy to install conflicting packages. Python installers are in a different situation though, so forced installs are likely to be an expected feature (in fact, I expect the more likely outcome given the status quo is that the default behaviour will be a warning at installation time with an option to request enforcement of "no conflicts").

Building integrated systems *is hard*. Pretending projects can't conflict just because they're both written in Python isn't sensible, and neither is it sensible to avoid warning users about the potential for latent defects when particular packages are used in combination.

Cheers,
Nick.
-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pje at telecommunity.com Sat Dec 8 21:18:15 2012
From: pje at telecommunity.com (PJ Eby)
Date: Sat, 8 Dec 2012 15:18:15 -0500
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: 
References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com>
Message-ID: 

On Sat, Dec 8, 2012 at 5:06 AM, Nick Coghlan wrote:
> On Sat, Dec 8, 2012 at 4:46 PM, PJ Eby wrote:
>>
>> So if package A includes a "Conflicts: B" declaration, I recommend the
>> following:
>>
>> * An attempt to install A with B already present refuses to install A
>> without a warning and confirmation
>> * An attempt to install B informs the user of the conflict, and
>> optionally offers to uninstall A
>>
>> In this way, any collateral damage to B is avoided, while still making
>> the intended "lack of support" declaration clear.
>>
>> How does that sound?
>
> No, that's not the way it works. A conflict is always symmetric, no matter
> who declares it.

But that *precisely contradicts* what you said in your previous email:

> It's to allow a project to say
> *they don't support* installing in parallel with another package.

Just because A doesn't support being installed next to B, doesn't mean B doesn't support being installed next to A. B might work just fine with A installed, and even be explicitly supported by the author of B. Why should the author of A get to decide what happens to B? Just because I trust A about A, doesn't mean I should have to trust them about B.

Look, I really don't care about the individual fields' definitions that much. I care about only one thing: A shouldn't get to (de facto) dictate what happens to B. If you *really* want the behavior to be symmetrical, then it should *only* be symmetrical if both A and B *agree* they are in conflict (i.e., both refer to the other in their conflict fields); see the sketch after the list below. Otherwise, it should only be a warning.

There are tons of other things that I could argue here about the positions you've laid out. But all I *really* care about is that we not define fields in such a way as to permit or encourage inter-package warfare -- intentional or not. Solutions acceptable to me include (in no particular order):

* Make declarations affect only the declarer (as with Obsoleted-By)
* Make declarations only warn users, not block installation or result in uninstallation
* Have no automated action at all, and document them as intended for downstream repackagers only
* Toss the field entirely
* Make the field include a context (e.g. a distro name), so that only tools explicitly told you're operating in that context pay attention
* Use the new metadata extension vocabularies to define hints for specific downstream packaging tools and systems
* Replace "conflicts" with a specification of resources actually used by the project, so that such conflicts can be automatically detected without needing to target a specific project

And there are probably others I haven't thought of yet.
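A minimal sketch of that mutual-agreement rule; the metadata shape, names, and return values here are hypothetical, purely to pin down the behavior being described:

    # Hypothetical: 'a' and 'b' are dict views of two projects'
    # metadata, e.g. {"name": "A", "conflicts": ["B"]}.
    def conflict_action(a, b):
        a_says = b["name"] in a.get("conflicts", ())
        b_says = a["name"] in b.get("conflicts", ())
        if a_says and b_says:
            return "refuse"   # both projects declare the conflict
        if a_says or b_says:
            return "warn"     # one-sided claim: inform, don't enforce
        return "ok"

So conflict_action({"name": "A", "conflicts": ["B"]}, {"name": "B"}) would merely warn, and would refuse only if B's metadata also listed A.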
If you can be clearer about what it is you want from the Conflicts field *other* than just wanting it to stay as is (or perhaps *why* you would like to have the Python infrastructure side with project A over project B, irrespective of which project is A and which one is B), then perhaps I can come up with others.

From thisismythrowawayaccount at gmail.com Sat Dec 8 22:07:43 2012
From: thisismythrowawayaccount at gmail.com (A G)
Date: Sat, 8 Dec 2012 13:07:43 -0800
Subject: [Python-Dev] python 3.3 module test failures on FreeBSD 9 amd64
Message-ID: 

Hello All,

I am successfully compiling python 3.3 on FreeBSD 9.0 amd64 architecture. When I run the tests, I get these two test failures (I trimmed out all the output from test cases that returned ok):

test_saltedcrypt (test.test_crypt.CryptTestCase) ... FAIL

======================================================================
FAIL: test_saltedcrypt (test.test_crypt.CryptTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/adam/Python-3.3.0/Lib/test/test_crypt.py", line 23, in test_saltedcrypt
    self.assertEqual(len(pw), method.total_size)
AssertionError: 60 != 13

----------------------------------------------------------------------
Ran 4 tests in 0.063s

======================================================================
FAIL: test_setegid (test.test_os.PosixUidGidTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/adam/Python-3.3.0/Lib/test/test_os.py", line 1211, in test_setegid
    self.assertRaises(os.error, os.setegid, 0)
AssertionError: OSError not raised by setegid

======================================================================
FAIL: test_setgid (test.test_os.PosixUidGidTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/adam/Python-3.3.0/Lib/test/test_os.py", line 1199, in test_setgid
    self.assertRaises(os.error, os.setgid, 0)
AssertionError: OSError not raised by setgid

======================================================================
FAIL: test_setregid (test.test_os.PosixUidGidTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/home/adam/Python-3.3.0/Lib/test/test_os.py", line 1231, in test_setregid
    self.assertRaises(os.error, os.setregid, 0, 0)
AssertionError: OSError not raised by setregid

----------------------------------------------------------------------
Ran 124 tests in 2.035s

FAILED (failures=3, skipped=23)
Warning -- threading._dangling was modified by test_os
test test_os failed
2 tests failed: test_crypt test_os

My question is, are these known? Is it safe for me to use this compiled version of python, or will I run into problems if I use those two modules or any code that depends on those? Ideally I'd like to resolve these and submit a port for python3.3 since the most recent FreeBSD port is stuck on 3.2.

Thanks!
Adam
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From python at mrabarnett.plus.com Sat Dec 8 22:14:19 2012
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 08 Dec 2012 21:14:19 +0000
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: 
References: <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com>
Message-ID: <50C3ADAB.4070308@mrabarnett.plus.com>

On 2012-12-08 20:18, PJ Eby wrote:
> On Sat, Dec 8, 2012 at 5:06 AM, Nick Coghlan wrote:
>> On Sat, Dec 8, 2012 at 4:46 PM, PJ Eby wrote:
>>>
>>> So if package A includes a "Conflicts: B" declaration, I recommend the
>>> following:
>>>
>>> * An attempt to install A with B already present refuses to install A
>>> without a warning and confirmation
>>> * An attempt to install B informs the user of the conflict, and
>>> optionally offers to uninstall A
>>>
>>> In this way, any collateral damage to B is avoided, while still making
>>> the intended "lack of support" declaration clear.
>>>
>>> How does that sound?
>>
>> No, that's not the way it works. A conflict is always symmetric, no matter
>> who declares it.
>
> But that *precisely contradicts* what you said in your previous email:
>
>> It's to allow a project to say
>> *they don't support* installing in parallel with another package.
>
> Just because A doesn't support being installed next to B, doesn't mean
> B doesn't support being installed next to A. B might work just fine
> with A installed, and even be explicitly supported by the author of B.
> Why should the author of A get to decide what happens to B? Just
> because I trust A about A, doesn't mean I should have to trust them
> about B.
>
[snip]
If package A says that it conflicts with package B, it may or may not be symmetrical, because it's possible that package B has been updated since the author of package A discovered the conflict, so it's important that the user is told which package is complaining about the conflict, the one that is being installed or the one that is already installed.

It may also be helpful if the package that includes the "Conflicts" declaration specifies which version of the other package it was last tested against: there may be a more recent version of the other package that does not cause the conflict, or, indeed, a more recent version of the declaring package that does not cause it.

From barry at python.org Sat Dec 8 22:51:06 2012
From: barry at python.org (Barry Warsaw)
Date: Sat, 8 Dec 2012 16:51:06 -0500
Subject: [Python-Dev] Emacs users: hg-tools-grep
Message-ID: <20121208165106.725ccabf@resist.wooz.org>

Hark fellow Emacsers. All you unenlightened heathens can stop reading now.

A few years ago, my colleague Jono Lange wrote probably the best little chunk of Emacs lisp ever. `M-x bzr-tools-grep` lets you easily search a Bazaar repository for a case-sensitive string, providing you with a nice *grep* buffer which you can scroll through. When you find a code sample you want to look at, C-c C-c visits the file and plops you right at the matching line. You *only* grep through files under version control, so you get to ignore generated files, and compilation artifacts, etc.

Of course, this doesn't help you for working on the Python code base, because Mercurial. I finally whipped up this straight up rip of Jono's code to work with hg.
I'm actually embarrassed to put a copyright on this thing, and would happily just donate it to Jono, drop it in Python's Misc directory, or slip it like a lump of coal into the xmas stocking of whoever wants to "maintain" it for the next 20 years. But anyway, it's already proven enormously helpful to me, so here it is.

Cheers,
-Barry

P.S. Who wants to abuse Jono and Matthew's copyright again and provide a git version?

;; Copyright (c) 2012 Barry A. Warsaw
;;
;; Permission is hereby granted, free of charge, to any person obtaining
;; a copy of this software and associated documentation files (the
;; "Software"), to deal in the Software without restriction, including
;; without limitation the rights to use, copy, modify, merge, publish,
;; distribute, sublicense, and/or sell copies of the Software, and to
;; permit persons to whom the Software is furnished to do so, subject to
;; the following conditions:
;;
;; The above copyright notice and this permission notice shall be
;; included in all copies or substantial portions of the Software.
;;
;; THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
;; EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
;; MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
;; NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
;; LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
;; OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
;; WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

;; This code is based on bzr-tools.el
;; Copyright (c) 2008-2009 Jonathan Lange, Matthew Lefkowitz

(provide 'hg-tools)

(defconst hg-tools-grep-command
  "hg locate --print0 | xargs -0 grep -In %s"
  "The command used for grepping files using hg. See `hg-tools-grep'.")

;; Run 'code' at the root of the branch which dirname is in.
(defmacro hg-tools-at-branch-root (dirname &rest code)
  `(let ((default-directory (locate-dominating-file
                             (expand-file-name ,dirname) ".hg")))
     ,@code))

(defun hg-tools-grep (expression dirname)
  "Search a branch for `expression'. If there's a C-u prefix, prompt for `dirname'."
  (interactive
   (let* ((string (read-string "Search for: "))
          (dir (if (null current-prefix-arg)
                   default-directory
                 (read-directory-name (format "Search for %s in: " string)))))
     (list string dir)))
  (hg-tools-at-branch-root dirname
    (grep-find (format hg-tools-grep-command
                       (shell-quote-argument expression)))))

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From tseaver at palladion.com Sat Dec 8 23:41:46 2012
From: tseaver at palladion.com (Tres Seaver)
Date: Sat, 08 Dec 2012 17:41:46 -0500
Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426]
In-Reply-To: 
References: <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com>
Message-ID: 

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/08/2012 05:06 AM, Nick Coghlan wrote:
> Building integrated systems *is hard*. Pretending projects can't
> conflict just because they're both written in Python isn't sensible,
> and neither is it sensible to avoid warning users about the
> potential for latent defects when particular packages are used in
> combination.
Building such systems is *too hard* to deletgate to the maintainers of every Python distribution registered on the Cheeseshop: there is too much policy involved for the ha'penn'orth of mechanism we are discussing here (decentralized inter-project metadata) to support. Such metadata *cannot* be useful in the general sense, but only in the context of a "curated" collection of packages, where the *curator* (not the upstream package authors) makes the choices. Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iEYEARECAAYFAlDDwioACgkQ+gerLs4ltQ4rOACghpN5x+k0w0Umn20AG1WOvYkq KQsAnibXQtbTnmbrPaMaVEfLH7W496lk =WAh9 -----END PGP SIGNATURE----- From stefan at bytereef.org Sat Dec 8 23:40:13 2012 From: stefan at bytereef.org (Stefan Krah) Date: Sat, 8 Dec 2012 23:40:13 +0100 Subject: [Python-Dev] python 3.3 module test failures on FreeBSD 9 amd64 In-Reply-To: References: Message-ID: <20121208224013.GA6357@sleipnir.bytereef.org> A G wrote: > ====================================================================== > FAIL: test_saltedcrypt (test.test_crypt.CryptTestCase) > ---------------------------------------------------------------------- > Traceback (most recent call last): > ? File "/usr/home/adam/Python-3.3.0/Lib/test/test_crypt.py", line 23, in > test_saltedcrypt > ??? self.assertEqual(len(pw), method.total_size) > AssertionError: 60 != 13 This isn't known. Probably just a test assumption that is too strict. You can open an issue at http://bugs.python.org/ . > ====================================================================== > FAIL: test_setegid (test.test_os.PosixUidGidTests) > ---------------------------------------------------------------------- > Traceback (most recent call last): > ? File "/usr/home/adam/Python-3.3.0/Lib/test/test_os.py", line 1211, in > test_setegid > ??? self.assertRaises(os.error, os.setegid, 0) > AssertionError: OSError not raised by setegid This is harmless and occurs if the test user is in the wheel group, see: http://bugs.python.org/issue14110 Stefan Krah From steve at pearwood.info Sun Dec 9 02:15:34 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 09 Dec 2012 12:15:34 +1100 Subject: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]] In-Reply-To: <50C3ADAB.4070308@mrabarnett.plus.com> References: <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> <50C3ADAB.4070308@mrabarnett.plus.com> Message-ID: <50C3E636.1020003@pearwood.info> On 09/12/12 08:14, MRAB wrote: > If package A says that it conflicts with package B, it may or may not > be symmetrical, because it's possible that package B has been updated > since the author of package A discovered the conflict, so it's > important that the user is told which package is complaining about the > conflict, the one that is being installed or the one that is already > installed. I must admit than in reading this thread, I'm having a bit of trouble understanding why merely *installing* packages should lead to conflicts. Assuming that two software packages Spam and Ham install into directories Spam and Ham, how can merely having them installed side-by-side lead to a conflict? 
I can see how running or importing Spam and Ham together might lead to problems. And I can see that if package Spam wants to install into directory Ham, that would be bad. But who does that? Have I just demonstrated my naivety when it comes to packaging? Under what circumstances would two well-behaved packages with different names conflict? -- Steven From rosuav at gmail.com Sun Dec 9 02:32:10 2012 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 9 Dec 2012 12:32:10 +1100 Subject: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]] In-Reply-To: <50C3E636.1020003@pearwood.info> References: <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> <50C3ADAB.4070308@mrabarnett.plus.com> <50C3E636.1020003@pearwood.info> Message-ID: On Sun, Dec 9, 2012 at 12:15 PM, Steven D'Aprano wrote: > Assuming that two software packages Spam and Ham install into directories > Spam and Ham, how can merely having them installed side-by-side lead to a > conflict? > > I can see how running or importing Spam and Ham together might lead to > problems. And I can see that if package Spam wants to install into > directory Ham, that would be bad. But who does that? If two packages Spam and Ham both define a module Jam, then the one that gets loaded will depend on the search path. That would be one form of conflict. ChrisA From steve at pearwood.info Sun Dec 9 03:11:34 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 09 Dec 2012 13:11:34 +1100 Subject: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]] In-Reply-To: References: <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> <50C3ADAB.4070308@mrabarnett.plus.com> <50C3E636.1020003@pearwood.info> Message-ID: <50C3F356.8060106@pearwood.info> On 09/12/12 12:32, Chris Angelico wrote: > On Sun, Dec 9, 2012 at 12:15 PM, Steven D'Aprano wrote: >> Assuming that two software packages Spam and Ham install into directories >> Spam and Ham, how can merely having them installed side-by-side lead to a >> conflict? >> >> I can see how running or importing Spam and Ham together might lead to >> problems. And I can see that if package Spam wants to install into >> directory Ham, that would be bad. But who does that? > > If two packages Spam and Ham both define a module Jam, then the one > that gets loaded will depend on the search path. That would be one > form of conflict. import Spam.Jam import Ham.Jam What am I missing? Why would a software package called "Spam" install a top-level module called "Jam" rather than "Spam"? Isn't the whole point of Python packages to solve this namespace problem? -- Steven From donald.stufft at gmail.com Sun Dec 9 03:15:40 2012 From: donald.stufft at gmail.com (Donald Stufft) Date: Sat, 8 Dec 2012 21:15:40 -0500 Subject: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]] In-Reply-To: <50C3F356.8060106@pearwood.info> References: <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> <50C3ADAB.4070308@mrabarnett.plus.com> <50C3E636.1020003@pearwood.info> <50C3F356.8060106@pearwood.info> Message-ID: <23078DF12CA048A0A9B6D649860235F9@gmail.com> On Saturday, December 8, 2012 at 9:11 PM, Steven D'Aprano wrote: > Why would a software package called "Spam" install a top-level module called > "Jam" rather than "Spam"? Isn't the whole point of Python packages to solve > this namespace problem? 
> Conflicts doesn't really solve file-based conflicts; as PJ Eby has pointed out, tools need to detect that circumstance already. But to answer this question: no, there is no required mapping between project names (what your thing is called on PyPI) and Python package names (what you import). Something named Spam on PyPI could provide multiple Python packages, named whatever its author wanted them to be named. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Dec 9 03:51:09 2012 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 9 Dec 2012 13:51:09 +1100 Subject: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]] In-Reply-To: <50C3F356.8060106@pearwood.info> References: <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> <50C3ADAB.4070308@mrabarnett.plus.com> <50C3E636.1020003@pearwood.info> <50C3F356.8060106@pearwood.info> Message-ID: On Sun, Dec 9, 2012 at 1:11 PM, Steven D'Aprano wrote: > On 09/12/12 12:32, Chris Angelico wrote: >> >> On Sun, Dec 9, 2012 at 12:15 PM, Steven D'Aprano >> wrote: >>> >>> Assuming that two software packages Spam and Ham install into directories >>> Spam and Ham, how can merely having them installed side-by-side lead to a >>> conflict? >>> >>> I can see how running or importing Spam and Ham together might lead to >>> problems. And I can see that if package Spam wants to install into >>> directory Ham, that would be bad. But who does that? >> >> >> If two packages Spam and Ham both define a module Jam, then the one >> that gets loaded will depend on the search path. That would be one >> form of conflict. > > > import Spam.Jam > > import Ham.Jam > > What am I missing? > > > Why would a software package called "Spam" install a top-level module called > "Jam" rather than "Spam"? Isn't the whole point of Python packages to solve > this namespace problem? That would require/demand that the software package MUST define a module with its own name, and MUST NOT define any other top-level modules, and also that package names MUST be unique. (RFC 2119 keywords.) That would work, as long as those restrictions are acceptable. ChrisA From python at mrabarnett.plus.com Sun Dec 9 04:22:46 2012 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 09 Dec 2012 03:22:46 +0000 Subject: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]] In-Reply-To: <50C3E636.1020003@pearwood.info> References: <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> <50C3ADAB.4070308@mrabarnett.plus.com> <50C3E636.1020003@pearwood.info> Message-ID: <50C40406.4020301@mrabarnett.plus.com> On 2012-12-09 01:15, Steven D'Aprano wrote: > On 09/12/12 08:14, MRAB wrote: > >> If package A says that it conflicts with package B, it may or may not >> be symmetrical, because it's possible that package B has been updated >> since the author of package A discovered the conflict, so it's >> important that the user is told which package is complaining about the >> conflict, the one that is being installed or the one that is already >> installed. > > I must admit that in reading this thread, I'm having a bit of trouble > understanding why merely *installing* packages should lead to conflicts. > [snip] Personally speaking, I was thinking more about possible problems at runtime due to functional conflicts, but it could apply to any (undefined) conflict.
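To make the search-path shadowing described earlier in the thread concrete, here is a small self-contained sketch (the module name "jam" and its COLOR attribute are invented purely for illustration):

    import os
    import sys
    import tempfile

    # Simulate two installed distributions that both ship a top-level
    # module named "jam", each in its own directory on sys.path.
    for color in ("red", "green"):
        d = tempfile.mkdtemp()
        with open(os.path.join(d, "jam.py"), "w") as f:
            f.write("COLOR = %r\n" % color)
        sys.path.insert(0, d)

    import jam
    print(jam.COLOR)  # prints 'green': the directory prepended last shadows the other

Which copy "import jam" finds depends entirely on sys.path order, which is exactly the kind of undeclared conflict being discussed.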
From trent at snakebite.org Sun Dec 9 05:12:29 2012 From: trent at snakebite.org (Trent Nelson) Date: Sat, 8 Dec 2012 23:12:29 -0500 Subject: [Python-Dev] python 3.3 module test failures on FreeBSD 9 amd64 In-Reply-To: References: Message-ID: <20121209041228.GA18257@snakebite.org> On Sat, Dec 08, 2012 at 01:07:43PM -0800, A G wrote: > Ideally I'd like to resolve these and submit a port for python3.3 since > the most recent FreeBSD port is stuck on 3.2. FWIW, the FreeBSD Python port maintainer, Kubilay Kocak, is active on #python-dev at freenode under the nick 'koobs'. He has been working on the 3.3 port. I'd recommend liaising with him in order to avoid duplicating any effort. Trent. From glyph at twistedmatrix.com Sun Dec 9 05:14:47 2012 From: glyph at twistedmatrix.com (Glyph) Date: Sat, 8 Dec 2012 20:14:47 -0800 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> Message-ID: <5D7E1D12-6E32-41CE-9C58-7D34AD1D37B8@twistedmatrix.com> On Dec 7, 2012, at 5:10 PM, anatoly techtonik wrote: > What about reading from other file descriptors? subprocess.Popen allows arbitrary file descriptors to be used. Is there any provision here for reading and writing non-blocking from or to those? > > On Windows it is WriteFile/ReadFile and PeekNamedPipe. On Linux it is select. Of course a test is needed, but why it should not just work? This is exactly why the provision needs to be made explicitly. On Windows it is WriteFile and ReadFile and PeekNamedPipe - unless the handle is a socket in which case it needs to be WSARecv. Or maybe it's some other weird thing - like, maybe a mailslot - and you need to call a different API. On *nix it really shouldn't be select. select cannot wait upon a file descriptor whose value is greater than FD_SETSIZE, which means it sets a hard (and small) limit on the number of things that a process which wants to use this facility can be doing. On the other hand, if you hard-code another arbitrary limit like this into the stdlib subprocess module, it will just be another great reason why Twisted's spawnProcess is the best and everyone should use it instead, so be my guest ;-). -glyph -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Sun Dec 9 05:17:52 2012 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 8 Dec 2012 20:17:52 -0800 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> Message-ID: I'm really not sure what this PEP is trying to get at given that it contains no examples and sounds from the descriptions to be adding a complicated API on top of something that already, IMNSHO, has too much in it (subprocess.Popen). Regardless, any user can use the stdout/err/in file objects with their own code that handles them asynchronously (yes that can be painful but that is what is required for _any_ socket or pipe I/O you don't want to block on). It *sounds* to me like this entire PEP could be written and released as a third party module on PyPI that offers a subprocess.Popen subclass adding some more convenient non-blocking APIs.
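For instance, a minimal sketch of such a subclass (untested and POSIX-only; the class name and the recv() semantics here are invented, loosely echoing the PEP's naming):

    import fcntl
    import os
    import subprocess

    class NonBlockingPopen(subprocess.Popen):
        """Popen whose stdout can be read without blocking (POSIX sketch)."""

        def __init__(self, *args, **kwargs):
            kwargs.setdefault("stdout", subprocess.PIPE)
            super().__init__(*args, **kwargs)
            # Switch the stdout pipe into non-blocking mode.
            fd = self.stdout.fileno()
            flags = fcntl.fcntl(fd, fcntl.F_GETFL)
            fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

        def recv(self, maxsize=4096):
            """Return available stdout bytes, b'' if none yet, None on EOF."""
            try:
                data = os.read(self.stdout.fileno(), maxsize)
            except BlockingIOError:
                return b""                   # nothing to read right now
            return data if data else None    # os.read() returning b'' means EOF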
That's where I'd start if I were interested in this as a future feature. -gps On Fri, Dec 7, 2012 at 5:10 PM, anatoly techtonik wrote: > On Tue, Sep 15, 2009 at 9:24 PM, wrote: >> On 04:25 pm, eric.pruitt at gmail.com wrote: >>> I'm bumping this PEP again in hopes of getting some feedback. >>> >> > This is useful, indeed. ActiveState recipe for this has 10 votes, which is > high for ActiveState (and such a hardcore topic, FWIW). > > >> On Tue, Sep 8, 2009 at 23:52, Eric Pruitt wrote: >>> >>>> PEP: 3145 >>>> Title: Asynchronous I/O For subprocess.Popen >>>> Author: (James) Eric Pruitt, Charles R. McCreary, Josiah Carlson >>>> Type: Standards Track >>>> Content-Type: text/plain >>>> Created: 04-Aug-2009 >>>> Python-Version: 3.2 >>>> >>>> Abstract: >>>> >>>> In its present form, the subprocess.Popen implementation is prone to >>>> dead-locking and blocking of the parent Python script while waiting on data >>>> from the child process. >>>> >>>> Motivation: >>>> >>>> A search for "python asynchronous subprocess" will turn up numerous >>>> accounts of people wanting to execute a child process and communicate with >>>> it from time to time reading only the data that is available instead of >>>> blocking to wait for the program to produce data [1] [2] [3]. The current >>>> behavior of the subprocess module is that when a user sends or receives >>>> data via the stdin, stderr and stdout file objects, deadlocks are common >>>> and documented [4] [5]. While communicate can be used to alleviate some of >>>> the buffering issues, it will still cause the parent process to block while >>>> attempting to read data when none is available to be read from the child >>>> process. >>>> >>>> Rationale: >>>> >>>> There is a documented need for asynchronous, non-blocking functionality in >>>> subprocess.Popen [6] [7] [2] [3]. Inclusion of the code would improve the >>>> utility of the Python standard library that can be used on Unix based and >>>> Windows builds of Python. Practically every I/O object in Python has a >>>> file-like wrapper of some sort. Sockets already act as such and for >>>> strings there is StringIO. Popen can be made to act like a file by simply >>>> using the methods attached to the subprocess.Popen.stderr, stdout and >>>> stdin file-like objects. But when using the read and write methods of >>>> those options, you do not have the benefit of asynchronous I/O. In the >>>> proposed solution the wrapper wraps the asynchronous methods to mimic a >>>> file object. >>>> >>>> Reference Implementation: >>>> >>>> I have been maintaining a Google Code repository that contains all of my >>>> changes including tests and documentation [9] as well as a blog detailing >>>> the problems I have come across in the development process [10]. >>>> >>>> I have been working on implementing non-blocking asynchronous I/O in the >>>> subprocess.Popen module as well as a wrapper class for subprocess.Popen >>>> that makes it so that an executed process can take the place of a file by >>>> duplicating all of the methods and attributes that file objects have. >>>> >> "Non-blocking" and "asynchronous" are actually two different things. From >> the rest of this PEP, I think only a non-blocking API is being introduced. >> I haven't looked beyond the PEP, though, so I might be missing something.
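For reference, the deadlock pattern the PEP's Motivation describes boils down to something like this (a sketch only; "some_cmd" stands in for any child process that produces copious output):

    import subprocess

    p = subprocess.Popen(["some_cmd"], stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE)
    # If the child fills its stdout pipe while the parent is still writing,
    # the child blocks on write, the parent blocks here, and neither side
    # makes progress -- the classic deadlock communicate() exists to avoid.
    p.stdin.write(b"x" * (1 << 20))
    p.stdin.flush()
    data = p.stdout.read()   # never reached once both sides are blocked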
> > I suggest renaming http://www.python.org/dev/peps/pep-3145/ to 'Non-blocking I/O for subprocess' and continue. IMHO this is the stage where examples of the deadlocks that occur with the current subprocess implementation are badly needed. > > There are two base functions that have been added to the >>>> subprocess.Popen class: Popen.send and Popen._recv, each with two separate >>>> implementations, one for Windows and one for Unix based systems. The Windows >>>> implementation uses ctypes to access the functions needed to control pipes >>>> in the kernel32 DLL in an asynchronous manner. On Unix based systems, >>>> the Python interface for file control serves the same purpose. The >>>> different implementations of Popen.send and Popen._recv have identical >>>> arguments to make code that uses these functions work across multiple >>>> platforms. >>>> >> Why does the method for non-blocking read from a pipe start with an "_"? >> This is the convention (widely used) for a private API. The name also >> doesn't suggest that this is the non-blocking version of reading. >> Similarly, the name "send" doesn't suggest that this is the non-blocking >> version of writing. > > The implementation is based on http://code.activestate.com/recipes/440554/ which more clearly illustrates the integrated functionality. > > _recv() is a private base function, which takes stdout or stderr as a > parameter. Corresponding user-level functions to read from stdout and > stderr are .recv() and .recv_err() > > I thought about renaming the API to .asyncread() and .asyncwrite(), but that > may imply that you call the method and then the result asynchronously starts to fill > some buffer, which is not the case here. > > Then I thought about .check_read() and .check_write(), literally meaning > 'check and read' or 'check and return' for non-blocking calls if there is > nothing. But then again, the naming convention of subprocess is already poor: > .check_output() does a blocking read until the command completes. > > Currently, subprocess doesn't have .read and .write methods. It may be the > best option: > .write(what) to pipe more stuff into the input buffer of the child process. > .read(from) where `from` is either subprocess.STDOUT or STDERR > Both functions should be documented as non-blocking and as returning None > if the pipe is closed. > > When calling the Popen._recv function, it requires the pipe name be >>>> passed as an argument so there exists the Popen.recv function that >>>> selects stdout as the pipe for Popen._recv by default. Popen.recv_err >>>> selects stderr as the pipe by default. "Popen.recv" and "Popen.recv_err" >>>> are much easier to read and understand than "Popen._recv('stdout' ..." and >>>> "Popen._recv('stderr' ..." respectively. >>>> >> What about reading from other file descriptors? subprocess.Popen allows >> arbitrary file descriptors to be used. Is there any provision here for >> reading and writing non-blocking from or to those? > > On Windows it is WriteFile/ReadFile and PeekNamedPipe. On Linux it is > select. Of course a test is needed, but why it should not just work? > >> Since the Popen._recv function does not wait on data to be produced >>>> before returning a value, it may return empty bytes. Popen.asyncread >>>> handles this issue by returning all data read over a given time >>>> interval. >>>> >> Oh. Popen.asyncread? What's that? This is the first time the PEP >> mentions it. > > I guess that's for blocking read with timeout.
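(For the blocking-read-with-timeout case specifically, Python 3.3's subprocess already provides a supported spelling; the POSIX "sleep" command is used here just for illustration:)

    import subprocess

    p = subprocess.Popen(["sleep", "10"], stdout=subprocess.PIPE)
    try:
        out, _ = p.communicate(timeout=2)   # give up after 2 seconds
    except subprocess.TimeoutExpired:
        p.kill()                            # stop the still-running child
        out, _ = p.communicate()            # reap it and drain the pipe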
> Among the most popular questions about Python, it is question number > ~500. > http://stackoverflow.com/questions/1191374/subprocess-with-timeout > > >> The ProcessIOWrapper class uses the asyncread and asyncwrite >>>> functions to allow a process to act like a file so that there are no blocking >>>> issues that can arise from using the stdout and stdin file objects produced >>>> from a subprocess.Popen call. >>>> >> What's the ProcessIOWrapper class? And what's the asyncwrite function? >> Again, this is the first time it's mentioned. >> > > Oh. That's a wrapper to access subprocess pipes with a familiar file API. It > is interesting: > > http://code.google.com/p/subprocdev/source/browse/subprocess.py?name=python3k > > >> So, to sum up, I think my main comment is that the PEP seems to be >> missing a significant portion of the details of what it's actually >> proposing. I suspect that this information is present in the >> implementation, which I have not looked at, but it probably belongs in the >> PEP. >> >> Jean-Paul >> > > Writing PEPs is definitely a job, and a hard one for developers. Too bad a > good idea *and* implementation (tests needed) are put on hold because there > is nobody who can help with that part. > > IMHO the PEP needs to expand on user stories: even if there is a significant > amount of cited sources, a practical summary and an illustration of the > problem by examples are missing. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/greg%40krypto.org > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Sun Dec 9 05:37:37 2012 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 8 Dec 2012 20:37:37 -0800 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: <5D7E1D12-6E32-41CE-9C58-7D34AD1D37B8@twistedmatrix.com> References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> <5D7E1D12-6E32-41CE-9C58-7D34AD1D37B8@twistedmatrix.com> Message-ID: On Sat, Dec 8, 2012 at 8:14 PM, Glyph wrote: > > On Dec 7, 2012, at 5:10 PM, anatoly techtonik wrote: > > What about reading from other file descriptors? subprocess.Popen allows >> arbitrary file descriptors to be used. Is there any provision here for >> reading and writing non-blocking from or to those? > > > On Windows it is WriteFile/ReadFile and PeekNamedPipe. On Linux it is > select. Of course a test is needed, but why it should not just work? > > > This is exactly why the provision needs to be made explicitly. > > On Windows it is WriteFile and ReadFile and PeekNamedPipe - unless the > handle is a socket in which case it needs to be WSARecv. Or maybe it's > some other weird thing - like, maybe a mailslot - and you need to call a > different API. > > On *nix it really shouldn't be select. select cannot wait upon a file > descriptor whose *value* is greater than FD_SETSIZE, which means it sets > a hard (and small) limit on the number of things that a process which wants > to use this facility can be doing. > Nobody should ever touch select() this decade. poll() exists everywhere that matters.
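A minimal self-contained illustration of the poll() approach (POSIX only):

    import os
    import select

    r, w = os.pipe()
    poller = select.poll()
    poller.register(r, select.POLLIN)   # unlike select(), no FD_SETSIZE ceiling
    os.write(w, b"x")
    print(poller.poll(1000))            # e.g. [(r, select.POLLIN)] once readable

Since poll() takes registered descriptors rather than fixed-size bitmasks, it works for any fd value, which is exactly the limitation of select() noted above.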
> On the other hand, if you hard-code another arbitrary limit like this into > the stdlib subprocess module, it will just be another great reason why > Twisted's spawnProcess is the best and everyone should use it instead, so > be my guest ;-). > > Is twisted's spawnProcess thread safe and async signal safe by using restricted C code for everything between the fork() and exec()? I'm not familiar enough with the twisted codebase to find things easily in it but I'm not seeing such an extension module within twisted and the code in http://twistedmatrix.com/trac/browser/trunk/twisted/internet/process.py certainly is not safe. Just sayin'. :) Python >= 3.2, along with the http://pypi.python.org/pypi/subprocess32/ backport for use on 2.x, gets this right. -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From glyph at twistedmatrix.com Sun Dec 9 05:44:44 2012 From: glyph at twistedmatrix.com (Glyph) Date: Sat, 8 Dec 2012 20:44:44 -0800 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> Message-ID: <4DDBF867-B8CA-4B79-88EA-A79AB59BE067@twistedmatrix.com> On Dec 8, 2012, at 8:37 PM, Gregory P. Smith wrote: > Is twisted's spawnProcess thread safe and async signal safe by using restricted C code for everything between the fork() and exec()? I'm not familiar enough with the twisted codebase to find things easily in it but I'm not seeing such an extension module within twisted and the code in http://twistedmatrix.com/trac/browser/trunk/twisted/internet/process.py certainly is not safe. Just sayin'. :) It's on the agenda: . -glyph -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Dec 9 06:35:06 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 9 Dec 2012 15:35:06 +1000 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> Message-ID: On Sun, Dec 9, 2012 at 8:41 AM, Tres Seaver wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 12/08/2012 05:06 AM, Nick Coghlan wrote: > > > Building integrated systems *is hard*. Pretending projects can't > > conflict just because they're both written in Python isn't sensible, > > and neither is it sensible to avoid warning users about the > > potential for latent defects when particular packages are used in > > combination. > > Building such systems is *too hard* to delegate to the maintainers of > every Python distribution registered on the Cheeseshop: there is too > much policy involved for the ha'penn'orth of mechanism we are discussing > here (decentralized inter-project metadata) to support. > > Such metadata *cannot* be useful in the general sense, but only in the > context of a "curated" collection of packages, where the *curator* (not > the upstream package authors) makes the choices. > The authors of major projects are often in a good position to know when they conflict with other high profile projects and thus can't be used reliably in the same system.
Now, *most* of the time, if there's a genuine conflict between two Python packages, it's going to be at install time - two projects attempting to install the same file obviously can't coexist on a single system (distribute and setuptools, for example, conflict at this level - they both want to own the "setuptools" and "easy_install" names). However, Python has plenty of other global state too (the codec registry, the import system, monkeypatching), and there is potential for conflict over underlying OS level resources. So let's look at the case of collections of Python packages that *are* curated. Maybe I'm a Linux distro packager, looking to automate the conversion to distro packages. Maybe I'm a toolsmith for a large corporation trying to build a curated set of packages for internal use (clearly indicating to my internal users which ones don't play nicely with each other and thus shouldn't be used together in the same project). Regardless of the reason, I'm the curator for a collection of Python packages. How shall I express the conflicts I have identified? Shall I go invent my own metadata system? Shall I be forced to choose a particular platform-specific dependency management system? How shall upstream authors communicate to *me* the conflicts that they're already aware of? Or, hey, there's this nice shiny cross-platform dependency management system *right here*. Maybe they'll be nice enough to consider handling *my* use case as well, even if it's a use case *they* don't care about. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From pje at telecommunity.com Sun Dec 9 06:48:45 2012 From: pje at telecommunity.com (PJ Eby) Date: Sun, 9 Dec 2012 00:48:45 -0500 Subject: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]] In-Reply-To: <50C40406.4020301@mrabarnett.plus.com> References: <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> <50C3ADAB.4070308@mrabarnett.plus.com> <50C3E636.1020003@pearwood.info> <50C40406.4020301@mrabarnett.plus.com> Message-ID: On Sat, Dec 8, 2012 at 10:22 PM, MRAB wrote: > On 2012-12-09 01:15, Steven D'Aprano wrote: >> >> On 09/12/12 08:14, MRAB wrote: >> >>> If package A says that it conflicts with package B, it may or may not >>> be symmetrical, because it's possible that package B has been updated >>> since the author of package A discovered the conflict, so it's >>> important that the user is told which package is complaining about the >>> conflict, the one that is being installed or the one that is already >>> installed. >> >> >> I must admit than in reading this thread, I'm having a bit of trouble >> understanding why merely *installing* packages should lead to conflicts. >> > [snip] > Personally speaking, I was thinking more about possible problems at > runtime due to functional conflicts, but it could apply to any > (undefined) conflict. If it's for a runtime functional conflict, there's no need for installation tools to worry about it, except perhaps in the case where a single project C depends on *both* A and B, where A and B conflict with each other. Apart from that piece of information, there is no way to know that the code will ever even be imported at the same time. (And even then, it's just a hint of the possibility, not a guarantee.) 
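To picture that remaining case, a hypothetical metadata sketch (the project names and exact field spellings here are invented; they are not part of any accepted PEP):

    Name: C
    Requires-Dist: A
    Requires-Dist: B

    Name: A
    Conflicts-Dist: B

A tool building C could reasonably warn that C pulls in both A and B even though A declares it does not support running alongside B, which is the one situation where an installer-visible hint seems useful.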
Nick, OTOH, says that the purpose of the field is to declare that mere side-by-side installation invalidates developer support for the configuration. However, the widespread confusion (conflicts?) over what exactly the field is supposed to mean and when it should be used suggests that its charter is not nearly as clear as it should be. It seems perhaps it is suffering from the so-called "Illusion of Transparency", wherein everybody looks at it and thinks that it *obviously* means X, and only a fool could think otherwise... except that everyone has a *different* value of X in mind. That's why I keep asking for specific, concrete use cases. At this point, for the field to make any sense, there needs to be some better idea of what a "runtime" or "undefined" conflict is. Apart from file conflicts, has anybody identified a single PyPI package that would make use of this field? If so, what *is* that example, and what is the nature of the conflict? Do any of the distro folks know of a Python project tagged as conflicting with another for their distro, where the conflict does *not* involve any files in conflict? (And the conflict is not specific to the distro's packaging of that project and the project in conflict? i.e., that it would have actually been possible and/or meaningful for the upstream developer to have flagged the conflict in the project's metadata, given the proposed metadata standard?) From ncoghlan at gmail.com Sun Dec 9 06:54:08 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 9 Dec 2012 15:54:08 +1000 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> Message-ID: On Sun, Dec 9, 2012 at 6:18 AM, PJ Eby wrote: > On Sat, Dec 8, 2012 at 5:06 AM, Nick Coghlan wrote: > > On Sat, Dec 8, 2012 at 4:46 PM, PJ Eby wrote: > >> > >> So if package A includes a "Conflicts: B" declaration, I recommend the > >> following: > >> > >> * An attempt to install A with B already present refuses to install A > >> without a warning and confirmation > >> * An attempt to install B informs the user of the conflict, and > >> optionally offers to uninstall A > >> > >> In this way, any collateral damage to B is avoided, while still making > >> the intended "lack of support" declaration clear. > >> > >> How does that sound? > > > > > > No, that's not the way it works. A conflict is always symmetric, no > matter > > who declares it. > > But that *precisely contradicts* what you said in your previous email: > > > It's to allow a project to say > > *they don't support* installing in parallel with another package. > > Just because A doesn't support being installed next to B, doesn't mean > B doesn't support being installed next to A. B might work just fine > with A installed, and even be explicitly supported by the author of B. > Why should the author of A get to decide what happens to B? Just > because I trust A about A, doesn't mean I should have to trust them > about B. > If I'm installing both A *and* B, I want to know if *either* project doesn't support that configuration. The order in which they get installed should *not* have any impact on my finding out that I am using one of my dependencies in an unsupported way that may cause me unanticipated problems further down the line. 
The author of A *doesn't* get to decide what happens to B, *I* do. They're merely providing a heads up that they believe there are problems when using their project in conjunction with B. My options will be: - use them both anyway (e.g. perhaps after doing some research, I may find out the conflict relates solely to a feature of B that I'm not using, so I simply update my project documentation to say "do not use feature X from project B, as it conflicts with dependency A") - choose to continue using A, find another solution for B - choose to continue using B, find another solution for A As a concrete example, there are projects out there that are known not to work with gevent's socket monkeypatching, but people don't know that until they try it and it blows up in their face. I now agree that *enforcing* a conflicts field at install time in a Python installer doesn't make any sense, since the nature of Python means it will often be easy to sidestep any such issues once you're aware of their existence (e.g. by avoiding gevent's monkeypatching features and using threads to interact with the uncooperative synchronous library, or by splitting your application into multiple processes, some using gevent and others synchronous sockets). I also believe that *any* Conflicts declaration *should* be backed up with an explicit explanation and rationale for that conflict declaration in the project documentation. Making it impossible to document runtime conflicts in metadata doesn't make those conflicts go away - it just means they will continue to be documented in an ad hoc manner on project web sites (if they get documented at all), making the job of package curation unnecessarily more difficult (since there is no standard way to document runtime conflicts). Adding a metadata field doesn't make sure such known conflicts *will* be documented, but it at least makes it possible. So, I still like the idea of including a Conflicts field, but think a few points should be made clear: - the Conflicts field would be for documenting other distributions which have known issues working together in the same process and thus constitute an unsupported configuration - this field would be aimed at package *users*, rather than at installation tools (although it would still be good if the installation tools supported scanning a set of packages for known conflicts) - any use of this field should be backed up with a more detailed explanation in the project documentation Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Dec 9 08:06:25 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 9 Dec 2012 17:06:25 +1000 Subject: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]] In-Reply-To: References: <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> <50C3ADAB.4070308@mrabarnett.plus.com> <50C3E636.1020003@pearwood.info> <50C40406.4020301@mrabarnett.plus.com> Message-ID: On Sun, Dec 9, 2012 at 3:48 PM, PJ Eby wrote: > That's why I keep asking for specific, concrete use cases. At this > point, for the field to make any sense, there needs to be some better > idea of what a "runtime" or "undefined" conflict is. Apart from file > conflicts, has anybody identified a single PyPI package that would > make use of this field?
If so, what *is* that example, and what is the nature of the conflict? > The best current example I know of is whether or not a given package is gevent compatible. At the moment, you have to try it and see, or hope the project developers have a note somewhere saying whether or not it works. "Incompatible" might be a better field name than "Conflicts" for that use case, though. You've persuaded me that any installer-based notification of runtime conflicts should at most be a warning (or even a separate query), since the user has so many options for dealing with it (including the typical case where the two components are simply never used in the same process). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Sun Dec 9 09:07:12 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 09 Dec 2012 09:07:12 +0100 Subject: [Python-Dev] Emacs users: hg-tools-grep In-Reply-To: <20121208165106.725ccabf@resist.wooz.org> References: <20121208165106.725ccabf@resist.wooz.org> Message-ID: On 08.12.2012 22:51, Barry Warsaw wrote: > Hark fellow Emacsers. All you unenlightened heathens can stop reading now. > > A few years ago, my colleague Jono Lange wrote probably the best little chunk > of Emacs lisp ever. `M-x bzr-tools-grep` lets you easily search a Bazaar > repository for a case-sensitive string, providing you with a nice *grep* > buffer which you can scroll through. When you find a code sample you want to > look at, C-c C-c visits the file and plops you right at the matching line. > You *only* grep through files under version control, so you get to ignore > generated files, and compilation artifacts, etc. > > Of course, this doesn't help you for working on the Python code base, because > Mercurial. I finally whipped up this straight up rip of Jono's code to work > with hg. I'm actually embarrassed to put a copyright on this thing, and would > happily just donate it to Jono, drop it in Python's Misc directory, or slip it > like a lump of coal into the xmas stocking of whoever wants to "maintain" it > for the next 20 years. > > But anyway, it's already proven enormously helpful to me, so here it is. Thanks, I'll definitely use this! Georg From g.brandl at gmx.net Sun Dec 9 09:09:58 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 09 Dec 2012 09:09:58 +0100 Subject: [Python-Dev] Proposing "Argument Clinic", a new way of specifying arguments to builtins for CPython In-Reply-To: <20121204203528.727d1c4e@pitrou.net> References: <50BD27CF.1070303@hastings.org> <20121204100851.193751c5@pitrou.net> <50BE4929.60309@hastings.org> <20121204203528.727d1c4e@pitrou.net> Message-ID: On 04.12.2012 20:35, Antoine Pitrou wrote: > On Tue, 04 Dec 2012 11:04:09 -0800 > Larry Hastings wrote: >> >> Along these lines, I've been contemplating proposing that Clinic >> specifically understand "path" arguments, distinctly from other string >> arguments, as they are both common and rarely handled correctly. My >> main fear is that I probably don't understand all their complexities >> either ;-) >> >> Anyway, this is certainly something we can consider *improving* for >> Python 3.4. But for now I'm trying to make Clinic an indistinguishable >> drop-in replacement. >> > [...] >> >> Naturally I agree Clinic needs more polishing. But the problem you fear >> is already solved.
Clinic allows precisely expressing any existing >> PyArg_ "format unit"** through a combination of the type of the >> parameter and its "flags". > Very nice then! Your work is promising, and I hope we'll see a version > of it some day in Python 3.4 (or 3.4+k). Looks good to me too, and as someone who once tried to go the "preprocessor macro" route, this is much saner. One small thing: May I propose to make the "special comments" a little more self-descriptive? Yes, "argument clinic" is a nice name for the whole thing, but if you encounter it in a C file it tells you nothing about what happens there. cheers, Georg From pje at telecommunity.com Sun Dec 9 19:55:04 2012 From: pje at telecommunity.com (PJ Eby) Date: Sun, 9 Dec 2012 13:55:04 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> Message-ID: On Sun, Dec 9, 2012 at 12:54 AM, Nick Coghlan wrote: > On Sun, Dec 9, 2012 at 6:18 AM, PJ Eby wrote: >> >> On Sat, Dec 8, 2012 at 5:06 AM, Nick Coghlan wrote: >> > On Sat, Dec 8, 2012 at 4:46 PM, PJ Eby wrote: >> >> >> >> So if package A includes a "Conflicts: B" declaration, I recommend the >> >> following: >> >> >> >> * An attempt to install A with B already present refuses to install A >> >> without a warning and confirmation >> >> * An attempt to install B informs the user of the conflict, and >> >> optionally offers to uninstall A >> >> >> >> In this way, any collateral damage to B is avoided, while still making >> >> the intended "lack of support" declaration clear. >> >> >> >> How does that sound? >> > >> > >> > No, that's not the way it works. A conflict is always symmetric, no >> > matter >> > who declares it. >> >> But that *precisely contradicts* what you said in your previous email: >> >> > It's to allow a project to say >> > *they don't support* installing in parallel with another package. >> >> Just because A doesn't support being installed next to B, doesn't mean >> B doesn't support being installed next to A. B might work just fine >> with A installed, and even be explicitly supported by the author of B. >> Why should the author of A get to decide what happens to B? Just >> because I trust A about A, doesn't mean I should have to trust them >> about B. > > > If I'm installing both A *and* B, I want to know if *either* project doesn't > support that configuration. The order in which they get installed should > *not* have any impact on my finding out that I am using one of my > dependencies in an unsupported way that may cause me unanticipated problems > further down the line. This is probably moot now, but I didn't propose that installation order matter -- in both scenarios I described, you end up with a warning and A not installed, regardless of whether A or B were installed first. > The author of A *doesn't* get to decide what happens to B, *I* do. The reason I said "(de facto)" is that the default behavior of whatever the next big installation tool is would be what most users got by default. > They're > merely providing a heads up that they believe there are problems when using > their project in conjunction with B. My options will be: > - use them both anyway (e.g.
perhaps after doing some research, I may find > out the conflict relates solely to a feature of B that I'm not using, so I > simply update my project documentation to say "do not use feature X from > project B, as it conflicts with dependency A") > - choose to continue using A, find another solution for B > - choose to continue using B, find another solution for A > > As a concrete example, there are projects out there that are known not to > work with gevent's socket monkeypatching, but people don't know that until > they try it and it blows up in their face. Here's the question, though: who's going to maintain that list? I can see gevent wanting to have a compatibility chart page in their docs, but it seems unlikely they'd want to block installation of non-gevent-compatible projects or vice versa. Similarly, I can't see why any of those other projects would want to block installation of gevent, or vice versa. That being said, I don't object to having the ability for either of them to do so: the utility of the field is *much* enhanced once its connection to installation tools is gone, since a wider variety of issues can be described without inconveniencing users. > I now agree that *enforcing* a conflicts field at install time in a Python > installer doesn't make any sense, since the nature of Python means it will > often be easy to sidestep any such issues once you're aware of their > existence (e.g. by avoiding gevent's monkeypatching features and using > threads to interact with the uncooperative synchronous library, or by > splitting your application into multiple processes, some using gevent and > others synchronous sockets). I also believe that *any* Conflicts declaration > *should* be backed up with an explicit explanation and rationale for that > conflict declaration in the project documentation. Beyond that, I think a reference URL should be included *in the field itself*, e.g. to a bug report, support ticket, or other page that documents the incompatibility and will be updated as the situation changes. The actual usefulness of the field to anyone "downstream" seems greatly reduced if they have to go hunting for the information explaining the compatibility issue(s). This is a good example of what I meant about clear thinking on concrete use cases, vs. simply copying fields from distro tools. In the distro world, these kinds of fields reflect the *results* of research and decision-making about compatibility. Whereas, in our "upstream" world, the purpose of the fields is to provide downstream repackagers and integrators with the source materials for such research. 
> So, I still like the idea of including a Conflicts field, but think a few > points should be made clear: > - the Conflicts field would be for documenting other distributions which > have known issues working together in the same process and thus constitute > an unsupported configuration > - this field would be aimed at package *users*, rather than at installation > tools (although it would still be good if the installation tools supported > scanning a set of packages for known conflicts) > - any use of this field should be backed up with a more detailed explanation > in the project documentation My concrete recommendation based on your comments, then, is: * The field should be called Known-Incompatibilities (to better clarify its purpose and avoid confusion with similarly-named installation-oriented metadata in other tools) * The field should be of the form (though not necessarily this exact syntax): ProjectName==incompatible_version; info=url That is, each entry lists a project name and a specific version that is known to be incompatible, along with a (required) information URL. The URL should be for: * a page that is updated with any change in the situation * that will remain available indefinitely, and * describes the specific reason that particular project is considered incompatible, along with any available workarounds For minor issues, a bug report or support ticket is acceptable; otherwise, a long-lived documentation link should be used. In-page anchor links are acceptable. A simple link to either project's home page or main documentation page is *not* acceptable: the link must be a part of the documentation that directly addresses the nature of the incompatibility. I'm not too picky about the version specification approach, though; the simplest thing is to only allow a single version to be named, but it also seems it could be reasonable to list one or more version ranges that apply, as long as they are not open-ended going forward. That is, saying versions 1.1-2.3 are incompatible is ok, but not "1.1 on". (Because the author of A is not in a position to declare on B's behalf that the incompatibility will *never* be fixable.) (I might be overthinking the versions bit, though, since this is really just about warnings.) I would recommend that tools automatically provide the warning in cases where a project C depends on versions of A and B that are declared incompatible. In this case, while one cannot *prove* the incompatibility to be an issue, it is still a potential issue. (This is more of a package build-time issue, though, as with Replaced-By.) Speaking of Replaced-By, it probably makes sense to require a URL in the field there as well, but that URL can be an unchanging page such as an archived post to a mailing list or blog, announcing the project's renaming or obsolescence, and providing migration help or links thereto. I think it also should be a multi-valued field, just like Known-Incompatibilities. Recently, I came across a Python project "lepl" (a parser combinator library) that just declared its end-of-life, and actually recommended multiple alternatives, each of which would be more appropriate for some uses of a parsing library. (That is, there was no single "does everything" replacement for lepl's full feature set.) Finally, the PEP should document that the audience for both the Replaced-By and Known-Incompatibilities fields is developers and system integrators (such as distro teams).
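To make the Known-Incompatibilities form proposed above concrete, a hypothetical entry (project name, version, and URL all invented for illustration) might read:

    Known-Incompatibilities: spamlib==1.4; info=https://example.org/ourproject/issues/42

where the linked page documents the nature of the incompatibility and any available workarounds, per the requirements above.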
So they are designed to be processed by tools that *build* packages, rather than tools that *install* them. So, if you build a project that depends on something that's replaced, or a pair of things known to be incompatible, that's when you get warnings and such. Tools to check such things on installed projects are also ok, though to avoid unnecessary warnings, it's probably best to only list incompatibilities for co-dependents (and orphaned replaced projects) by default. That is, a checker should probably ignore replacements when there's an installed project depending on the replaced version, and ignore incompatibilities that aren't part of the same requirements subtree (and thus unlikely to be used together). Of course, having options to be more verbose is not an issue, and this isn't really something to legislate anyway -- it's just that listing *every* replaced project or potentially-incompatible pairing in even a moderately-sized installation is likely to be far more noise than signal. From mark at hotpy.org Sun Dec 9 23:22:19 2012 From: mark at hotpy.org (Mark Shannon) Date: Sun, 09 Dec 2012 22:22:19 +0000 Subject: [Python-Dev] Do more at compile time; less at runtime Message-ID: <50C50F1B.2080005@hotpy.org> Hi all, The current CPython bytecode interpreter is rather more complex than it needs to be. A number of bytecodes could be eliminated and a few more simplified by moving the work involved in handling compound statements (loops, try-blocks, etc) from the interpreter to the compiler. The simplest example of this is the while loop...

    while cond:
        body

This is currently compiled as

    start:
    if not cond goto end
    body
    goto start
    end:

but it could be compiled as

    goto test
    start:
    body
    test:
    if cond goto start

which eliminates one instruction per iteration. A more complex example is a return in a try-finally block.

    try:
        part1
        if cond:
            return X
        part2
    finally:
        part3

Currently, handling the return is complex and involves "pseudo exceptions", but if part3 were duplicated by the compiler, then the RETURN bytecode could just perform a simple return. The code above would be compiled thus...

    PUSH_BLOCK try
    part1
    if not cond goto endif
    push X
    POP_BLOCK
    part3         <<< duplicated
    RETURN_VALUE
    endif:
    part2
    POP_BLOCK
    part3         <<< duplicated

The changes I am proposing are:

    Allow negative line deltas in the lnotab array (bytecode deltas would remain non-negative)
    Remove the SETUP_LOOP, BREAK and CONTINUE bytecodes
    Simplify the RETURN bytecode
    Eliminate "pseudo exceptions" from the interpreter
    Simplify (or perhaps eliminate) SETUP_TRY, END_FINALLY, END_WITH.
    Reverse the sense of the FOR_ITER bytecode (i.e. jump on not-exhausted)

The net effect of these changes would be:

    Reduced code size and reduced code complexity.
    A small (1-5%?) increase in speed, due to the simplification of the bytecodes and a very small change in the number of bytecodes executed.
    A small change in the static size of the bytecodes (-2% to +2%?)

Although this is a quite intrusive change, I think it is worthwhile as it simplifies ceval.c considerably. The interpreter has become rather convoluted and any simplification has to be a good thing. I've already implemented negative line deltas and the transformed while loop: https://bitbucket.org/markshannon/cpython-lnotab-signed I'm currently working on the block unwinding. So, Good idea? Bad idea? Should I write a PEP or is the bug tracker sufficient? Cheers, Mark.
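The current compilation pattern Mark describes can be inspected with nothing but the stdlib (a quick sketch; exact opcodes and offsets vary across CPython versions):

    import dis

    def f(cond, body):
        while cond():
            body()

    dis.dis(f)
    # The output shows the SETUP_LOOP / POP_JUMP_IF_FALSE / JUMP_ABSOLUTE
    # shape, i.e. the "if not cond goto end ... goto start" layout above.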
From guido at python.org Sun Dec 9 23:43:15 2012 From: guido at python.org (Guido van Rossum) Date: Sun, 9 Dec 2012 14:43:15 -0800 Subject: [Python-Dev] Do more at compile time; less at runtime In-Reply-To: <50C50F1B.2080005@hotpy.org> References: <50C50F1B.2080005@hotpy.org> Message-ID: Sounds good to me. No PEP needed, just a tracker item, tests, review etc... --Guido van Rossum (sent from Android phone) On Dec 9, 2012 2:24 PM, "Mark Shannon" wrote: > Hi all, > > The current CPython bytecode interpreter is rather more complex than it > needs to be. A number of bytecodes could be eliminated and a few more > simplified by moving the work involved in handling compound statements > (loops, try-blocks, etc) from the interpreter to the compiler. > > The simplest example of this is the while loop... > while cond: > body > > This is currently compiled as > > start: > if not cond goto end > body > goto start > end: > > but it could be compiled as > > goto test > start: > body > test: > if cond goto start > > which eliminates one instruction per iteration. > > A more complex example is a return in a try-finally block. > > try: > part1 > if cond: > return X > part2 > finally: > part3 > > Currently, handling the return is complex and involves "pseudo > exceptions", but if part3 were duplicated by the compiler, then the RETURN > bytecode could just perform a simple return. > The code above would be compiled thus... > > PUSH_BLOCK try > part1 > if not cond goto endif > push X > POP_BLOCK > part3 <<< duplicated > RETURN_VALUE > endif: > part2 > POP_BLOCK > part3 <<< duplicated > > The changes I am proposing are: > > Allow negative line deltas in the lnotab array (bytecode deltas would > remain non-negative) > Remove the SETUP_LOOP, BREAK and CONTINUE bytecodes > Simplify the RETURN bytecode > Eliminate "pseudo exceptions" from the interpreter > Simplify (or perhaps eliminate) SETUP_TRY, END_FINALLY, END_WITH. > Reverse the sense of the FOR_ITER bytecode (i.e. jump on not-exhausted) > > > The net effect of these changes would be: > Reduced code size and reduced code complexity. > A small (1-5%?) increase in speed, due to the simplification of the > bytecodes and a very small change in the number of bytecodes executed. > A small change in the static size of the bytecodes (-2% to +2%?) > > Although this is a quite intrusive change, I think it is worthwhile as it > simplifies ceval.c considerably. > The interpreter has become rather convoluted and any simplification has to > be a good thing. > > I've already implemented negative line deltas and the transformed while > loop: https://bitbucket.org/markshannon/cpython-lnotab-signed > I'm currently working on the block unwinding. > > So, > Good idea? Bad idea? > Should I write a PEP or is the bug tracker sufficient? > > Cheers, > Mark. > > > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ncoghlan at gmail.com Sun Dec 9 23:55:30 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 10 Dec 2012 08:55:30 +1000 Subject: [Python-Dev] Do more at compile time; less at runtime In-Reply-To: <50C50F1B.2080005@hotpy.org> References: <50C50F1B.2080005@hotpy.org> Message-ID: Interesting idea, main challenge is to keep stepping through the code with pdb sane, and being clear on what differences in behaviour will be visible to the runtime execution hooks. -- Sent from my phone, thus the relative brevity :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Mon Dec 10 00:29:16 2012 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 09 Dec 2012 23:29:16 +0000 Subject: [Python-Dev] Do more at compile time; less at runtime In-Reply-To: <50C50F1B.2080005@hotpy.org> References: <50C50F1B.2080005@hotpy.org> Message-ID: <50C51ECC.7000301@mrabarnett.plus.com> On 2012-12-09 22:22, Mark Shannon wrote: > Hi all, > > The current CPython bytecode interpreter is rather more complex than it > needs to be. A number of bytecodes could be eliminated and a few more > simplified by moving the work involved in handling compound statements > (loops, try-blocks, etc) from the interpreter to the compiler. > > This simplest example of this is the while loop... > while cond: > body > > This currently compiled as > > start: > if not cond goto end > body > goto start > end: > > but it could be compiled as > > goto test: > start: > body > if cond goto start > > which eliminates one instruction per iteration. > > A more complex example is a return in a try-finally block. > > try: > part1 > if cond: > return X > part2 > finally: > part3 > > Currently, handling the return is complex and involves "pseudo > exceptions", but if part3 were duplicated by the compiler, then the > RETURN bytecode could just perform a simple return. > The code above would be compiled thus... > > PUSH_BLOCK try > part1 > if not X goto endif > push X > POP_BLOCK > part3 <<< duplicated > RETURN_VALUE > endif: > part2 > POP_BLOCK > part3 <<< duplicated > > The changes I am proposing are: > [snip] Is it necessary to duplicate part3? Is it possible to put it into a subroutine if it's long? (And I do mean a simple cheap subroutine.) From techtonik at gmail.com Mon Dec 10 00:30:23 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 10 Dec 2012 02:30:23 +0300 Subject: [Python-Dev] hg annotate is broken on hg.python.org Message-ID: Just to let you know that annotate in hgweb is broken for Python sources. http://hg.python.org/cpython/annotate/692be1f9fa1d/Lib/distutils/tests/test_register.py -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Mon Dec 10 02:39:47 2012 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 9 Dec 2012 17:39:47 -0800 Subject: [Python-Dev] hg annotate is broken on hg.python.org In-Reply-To: References: Message-ID: On Sun, Dec 9, 2012 at 3:30 PM, anatoly techtonik wrote: > Just to let you know that annotate in hgweb is broken for Python sources. > > http://hg.python.org/cpython/annotate/692be1f9fa1d/Lib/distutils/tests/test_register.py Maybe I'm missing something, but what's broken about it? Also, in my experience it's okay to file issues about hg.python.org on the main tracker if you suspect something isn't right or you think should be improved. 
--Chris

From raymond.hettinger at gmail.com Mon Dec 10 02:44:30 2012
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sun, 9 Dec 2012 20:44:30 -0500
Subject: [Python-Dev] More compact dictionaries with faster iteration
Message-ID: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com>

The current memory layout for dictionaries is unnecessarily inefficient. It has a sparse table of 24-byte entries containing the hash value, key pointer, and value pointer.

Instead, the 24-byte entries should be stored in a dense table referenced by a sparse table of indices.

For example, the dictionary:

d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'}

is currently stored as:

entries = [['--', '--', '--'],
           [-8522787127447073495, 'barry', 'green'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           ['--', '--', '--'],
           [-9092791511155847987, 'timmy', 'red'],
           ['--', '--', '--'],
           [-6480567542315338377, 'guido', 'blue']]

Instead, the data should be organized as follows:

indices = [None, 1, None, None, None, 0, None, 2]
entries = [[-9092791511155847987, 'timmy', 'red'],
           [-8522787127447073495, 'barry', 'green'],
           [-6480567542315338377, 'guido', 'blue']]

Only the data layout needs to change. The hash table algorithms would stay the same. All of the current optimizations would be kept, including key-sharing dicts and custom lookup functions for string-only dicts. There is no change to the hash functions, the table search order, or collision statistics.

The memory savings are significant (from 30% to 95% compression depending on how full the table is). Small dicts (size 0, 1, or 2) get the most benefit.

For a sparse table of size t with n entries, the sizes are:

curr_size = 24 * t
new_size = 24 * n + sizeof(index) * t

In the above timmy/barry/guido example, the current size is 192 bytes (eight 24-byte entries) and the new size is 80 bytes (three 24-byte entries plus eight 1-byte indices). That gives 58% compression.

Note, the sizeof(index) can be as small as a single byte for small dicts, two bytes for bigger dicts and up to sizeof(Py_ssize_t) for huge dicts.

In addition to space savings, the new memory layout makes iteration faster. Currently, keys(), values(), and items() loop over the sparse table, skipping over free slots in the hash table. Now, keys/values/items can loop directly over the dense table, using fewer memory accesses.

Another benefit is that resizing is faster and touches fewer pieces of memory. Currently, every hash/key/value entry is moved or copied during a resize. In the new layout, only the indices are updated. For the most part, the hash/key/value entries never move (except for an occasional swap to fill a hole left by a deletion).

With the reduced memory footprint, we can also expect better cache utilization.

For those wanting to experiment with the design, there is a pure Python proof-of-concept here:

http://code.activestate.com/recipes/578375

YMMV: Keep in mind that the above size statistics assume a build with 64-bit Py_ssize_t and 64-bit pointers. The space savings percentages are a bit different on other builds. Also, note that in many applications, the size of the data dominates the size of the container (i.e. the weight of a bucket of water is mostly the water, not the bucket).

Raymond

From stephen at xemacs.org Mon Dec 10 02:48:02 2012
From: stephen at xemacs.org (Stephen J.
Turnbull) Date: Mon, 10 Dec 2012 10:48:02 +0900 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> Message-ID: <87vcca1xv1.fsf@uwakimon.sk.tsukuba.ac.jp> PJ Eby writes: > That being said, I don't object to having the ability for either of > them to do so: the utility of the field is *much* enhanced once its > connection to installation tools is gone, since a wider variety of > issues can be described without inconveniencing users. +1 to "describing". A metadata format should not specify tool behavior, and should use behavior-neutral nomenclature. Rather, use cases that seem probable or perhaps wrong-headed should inform the design. Nevertheless, actual decisions about behavior should be left to the tool authors. > This is a good example of what I meant about clear thinking on > concrete use cases, vs. simply copying fields from distro tools. In > the distro world, these kinds of fields reflect the *results* of > research and decision-making about compatibility. Whereas, in our > "upstream" world, the purpose of the fields is to provide downstream > repackagers and integrators with the source materials for such > research. I agree with the meaning of the above paragraph, but would like to dissociate myself from the comparison implied by the expression "clear thinking". AFAICS, it's different assumptions about use cases that drives the difference in prescriptions here. From pje at telecommunity.com Mon Dec 10 03:24:12 2012 From: pje at telecommunity.com (PJ Eby) Date: Sun, 9 Dec 2012 21:24:12 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: <87vcca1xv1.fsf@uwakimon.sk.tsukuba.ac.jp> References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> <87vcca1xv1.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Sun, Dec 9, 2012 at 8:48 PM, Stephen J. Turnbull wrote: > PJ Eby writes: > > This is a good example of what I meant about clear thinking on > > concrete use cases, vs. simply copying fields from distro tools. In > > the distro world, these kinds of fields reflect the *results* of > > research and decision-making about compatibility. Whereas, in our > > "upstream" world, the purpose of the fields is to provide downstream > > repackagers and integrators with the source materials for such > > research. > > I agree with the meaning of the above paragraph, but would like to > dissociate myself from the comparison implied by the expression "clear > thinking". What comparison is that? By "clear", I mean "free of prior assumptions". The assumptions that made the discussion difficult weren't just about the use cases themselves, but about the environments, tools, organizations, concepts, etc. surrounding those use cases. Indeed, even the assumption of what should *qualify* as a "use case" was a stumbling block on occasion. ;-) And by "thinking", I mean, "considering alternatives and consequences", as distinct from debating the merits of a specific position. 
Put together, the phrase "clear thinking on concrete use cases" means (at least to me), "dropping all preconceptions of the existing design and starting over from square one, to ask how best the problem may be solved, using specific examples as a guide rather than using generalities." Generalities not rooted in concrete examples have a way of leading to non-terminating discussions. ;-) Starting over a discussion in this fashion isn't easy, but the results are usually worth it. I appreciate Nick and Daniel's patience in particular. From python at mrabarnett.plus.com Mon Dec 10 04:03:37 2012 From: python at mrabarnett.plus.com (MRAB) Date: Mon, 10 Dec 2012 03:03:37 +0000 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> Message-ID: <50C55109.7040107@mrabarnett.plus.com> On 2012-12-10 01:44, Raymond Hettinger wrote: > The current memory layout for dictionaries is > unnecessarily inefficient. It has a sparse table of > 24-byte entries containing the hash value, key pointer, > and value pointer. > > Instead, the 24-byte entries should be stored in a > dense table referenced by a sparse table of indices. > > For example, the dictionary: > > d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'} > > is currently stored as: > > entries = [['--', '--', '--'], > [-8522787127447073495, 'barry', 'green'], > ['--', '--', '--'], > ['--', '--', '--'], > ['--', '--', '--'], > [-9092791511155847987, 'timmy', 'red'], > ['--', '--', '--'], > [-6480567542315338377, 'guido', 'blue']] > > Instead, the data should be organized as follows: > > indices = [None, 1, None, None, None, 0, None, 2] > entries = [[-9092791511155847987, 'timmy', 'red'], > [-8522787127447073495, 'barry', 'green'], > [-6480567542315338377, 'guido', 'blue']] > > Only the data layout needs to change. The hash table > algorithms would stay the same. All of the current > optimizations would be kept, including key-sharing > dicts and custom lookup functions for string-only > dicts. There is no change to the hash functions, the > table search order, or collision statistics. > > The memory savings are significant (from 30% to 95% > compression depending on the how full the table is). > Small dicts (size 0, 1, or 2) get the most benefit. > > For a sparse table of size t with n entries, the sizes are: > > curr_size = 24 * t > new_size = 24 * n + sizeof(index) * t > > In the above timmy/barry/guido example, the current > size is 192 bytes (eight 24-byte entries) and the new > size is 80 bytes (three 24-byte entries plus eight > 1-byte indices). That gives 58% compression. > > Note, the sizeof(index) can be as small as a single > byte for small dicts, two bytes for bigger dicts and > up to sizeof(Py_ssize_t) for huge dict. > > In addition to space savings, the new memory layout > makes iteration faster. Currently, keys(), values, and > items() loop over the sparse table, skipping-over free > slots in the hash table. Now, keys/values/items can > loop directly over the dense table, using fewer memory > accesses. > > Another benefit is that resizing is faster and > touches fewer pieces of memory. Currently, every > hash/key/value entry is moved or copied during a > resize. In the new layout, only the indices are > updated. For the most part, the hash/key/value entries > never move (except for an occasional swap to fill a > hole left by a deletion). 
> > With the reduced memory footprint, we can also expect > better cache utilization. > > For those wanting to experiment with the design, > there is a pure Python proof-of-concept here: > > http://code.activestate.com/recipes/578375 > > YMMV: Keep in mind that the above size statics assume a > build with 64-bit Py_ssize_t and 64-bit pointers. The > space savings percentages are a bit different on other > builds. Also, note that in many applications, the size > of the data dominates the size of the container (i.e. > the weight of a bucket of water is mostly the water, > not the bucket). > What happens when a key is deleted? Will that leave an empty slot in the entry array? If so, will the array be compacted if the proportion of entries grows beyond a certain limit? Adding a key would involve simply appending to the entry array (filling the last empty slot), but if there could be empty slots in other places, would it look for the first? Actually, as I think about it, you could form a linked list (using the 'hash' field) of those empty slots that aren't part of the final contiguous block and fill those preferentially. From raymond.hettinger at gmail.com Mon Dec 10 04:30:22 2012 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 9 Dec 2012 22:30:22 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <50C55109.7040107@mrabarnett.plus.com> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C55109.7040107@mrabarnett.plus.com> Message-ID: <9B40E85E-C809-475B-BF08-4E16274158C5@gmail.com> On Dec 9, 2012, at 10:03 PM, MRAB wrote: > What happens when a key is deleted? Will that leave an empty slot in > the entry array? Yes. See the __delitem__() method in the pure python implemention at http://code.activestate.com/recipes/578375 > If so, will the array be compacted if the proportion > of entries grows beyond a certain limit? Yes. Compaction happens during resizing() which triggers when the dict reaches two-thirds full (including dummy entries). See the __setitem__() method in the pure python implementation. > Adding a key would involve simply appending to the entry array (filling > the last empty slot), but if there could be empty slots in other > places, would it look for the first? Yes. See the _next_open_index() helper method in the pure python implemention. > Actually, as I think about it, you could form a linked list (using the > 'hash' field) of those empty slots that aren't part of the final > contiguous block and fill those preferentially. That's the plan. See the comment above the keylist.index(UNUSED) line in the _next_open_index() method in the pure python implementation. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From amcnabb at mcnabbs.org Mon Dec 10 04:38:08 2012 From: amcnabb at mcnabbs.org (Andrew McNabb) Date: Sun, 9 Dec 2012 20:38:08 -0700 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <20121206064925.GC2613@unaka.lan> <20121207170134.GD2613@unaka.lan> Message-ID: <20121210033808.GG28908@mcnabbs.org> On Fri, Dec 07, 2012 at 05:02:26PM -0500, PJ Eby wrote: > If the packages have files in conflict, they won't be both installed. > If they don't have files in conflict, there's nothing important to be > informed of. If one is installing pexpect-u, then one does not need > to discover that it is a successor of pexpect. 
In the specific case of pexpect and pexpect-u, the files don't actually conflict. The pexpect package includes a "pexpect.py" file, while pexpect-u includes a "pexpect/" directory. These conflict, but not in the easily detectable sense. -- Andrew McNabb http://www.mcnabbs.org/andrew/ PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868 From greg at krypto.org Mon Dec 10 07:38:19 2012 From: greg at krypto.org (Gregory P. Smith) Date: Sun, 9 Dec 2012 22:38:19 -0800 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> Message-ID: On Sun, Dec 9, 2012 at 5:44 PM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > The current memory layout for dictionaries is > unnecessarily inefficient. It has a sparse table of > 24-byte entries containing the hash value, key pointer, > and value pointer. > > Instead, the 24-byte entries should be stored in a > dense table referenced by a sparse table of indices. > > For example, the dictionary: > > d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'} > > is currently stored as: > > entries = [['--', '--', '--'], > [-8522787127447073495, 'barry', 'green'], > ['--', '--', '--'], > ['--', '--', '--'], > ['--', '--', '--'], > [-9092791511155847987, 'timmy', 'red'], > ['--', '--', '--'], > [-6480567542315338377, 'guido', 'blue']] > > Instead, the data should be organized as follows: > > indices = [None, 1, None, None, None, 0, None, 2] > entries = [[-9092791511155847987, 'timmy', 'red'], > [-8522787127447073495, 'barry', 'green'], > [-6480567542315338377, 'guido', 'blue']] > > Only the data layout needs to change. The hash table > algorithms would stay the same. All of the current > optimizations would be kept, including key-sharing > dicts and custom lookup functions for string-only > dicts. There is no change to the hash functions, the > table search order, or collision statistics. > > The memory savings are significant (from 30% to 95% > compression depending on the how full the table is). > Small dicts (size 0, 1, or 2) get the most benefit. > > For a sparse table of size t with n entries, the sizes are: > > curr_size = 24 * t > new_size = 24 * n + sizeof(index) * t > > In the above timmy/barry/guido example, the current > size is 192 bytes (eight 24-byte entries) and the new > size is 80 bytes (three 24-byte entries plus eight > 1-byte indices). That gives 58% compression. > > Note, the sizeof(index) can be as small as a single > byte for small dicts, two bytes for bigger dicts and > up to sizeof(Py_ssize_t) for huge dict. > > In addition to space savings, the new memory layout > makes iteration faster. Currently, keys(), values, and > items() loop over the sparse table, skipping-over free > slots in the hash table. Now, keys/values/items can > loop directly over the dense table, using fewer memory > accesses. > > Another benefit is that resizing is faster and > touches fewer pieces of memory. Currently, every > hash/key/value entry is moved or copied during a > resize. In the new layout, only the indices are > updated. For the most part, the hash/key/value entries > never move (except for an occasional swap to fill a > hole left by a deletion). > > With the reduced memory footprint, we can also expect > better cache utilization. 
> > For those wanting to experiment with the design, > there is a pure Python proof-of-concept here: > > http://code.activestate.com/recipes/578375 > > YMMV: Keep in mind that the above size statics assume a > build with 64-bit Py_ssize_t and 64-bit pointers. The > space savings percentages are a bit different on other > builds. Also, note that in many applications, the size > of the data dominates the size of the container (i.e. > the weight of a bucket of water is mostly the water, > not the bucket). > +1 I like it. The plethora of 64-bit NULL values we cart around in our internals bothers me, this gets rid of some. The worst case of ~30% less memory consumed for the bucket (ignoring the water) is still decent. For a program with a lot of instances of small to medium sized objects I'd expect a measurable win. I'm interested in seeing performance and memory footprint numbers from a C implementation of this. -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Mon Dec 10 08:48:02 2012 From: christian at python.org (Christian Heimes) Date: Mon, 10 Dec 2012 08:48:02 +0100 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> Message-ID: <50C593B2.9060203@python.org> Hello Raymond Am 10.12.2012 02:44, schrieb Raymond Hettinger: > Another benefit is that resizing is faster and > touches fewer pieces of memory. Currently, every > hash/key/value entry is moved or copied during a > resize. In the new layout, only the indices are > updated. For the most part, the hash/key/value entries > never move (except for an occasional swap to fill a > hole left by a deletion). > > With the reduced memory footprint, we can also expect > better cache utilization. On the other hand every lookup and collision checks needs an additional multiplication, addition and pointer dereferencing: entry = entries_baseaddr + sizeof(PyDictKeyEntry) * idx It's hard to predict how the extra CPU instructions are going to affect the performance on different CPUs and scenarios. My guts tell me that your proposal is going to perform better on modern CPUs with lots of L1 and L2 cache and in an average case scenario. A worst case scenario with lots of collisions might be measurable slower for large dicts and small CPU cache. But this is pure speculation and my guts could be terrible wrong. :) I'm +1 to give it a shot. Good luck! Christian From stephen at xemacs.org Mon Dec 10 09:27:44 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 10 Dec 2012 17:27:44 +0900 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> <87vcca1xv1.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87r4my1fcv.fsf@uwakimon.sk.tsukuba.ac.jp> PJ Eby writes: > By "clear", I mean "free of prior assumptions". Ah, well, I guess I've just run into a personal limitation. I can't imagine thinking that is "free of prior assumptions". Not my own, and not by others, either. So, unfortunately, I was left with the conventional opposition in thinking: "clear" vs. "muddy". That impression was only strengthened by the phrase "vs. simply copying fields from distro tools." 
> Put together, the phrase "clear thinking on concrete use cases" means > (at least to me), "dropping all preconceptions of the existing design > and starting over from square one, to ask how best the problem may be > solved, using specific examples as a guide rather than using > generalities." Sure, but ISTM that's the opposite of what you've actually been doing, at least in terms of contributing to my understanding. One obstacle to discussion you have contributed to overcoming in my thinking is the big generality that the packager (ie, the person writing the metadata) is in a position to recommend "good behavior" to the installation tool, vs. being in a position to point out "relevant considerations" for users and tools installing the packager's product. Until that generality is formulated and expressed, it's very difficult to see why the examples and particular solutions to use cases that various proponents have described fail to address some real problems. It was difficult for me to see, at first, what distinction was actually being made. Specifically, I thought that the question about "Obsoletes" vs. "Obsoleted-By" was about which package should be considered authoritative about obsolescence. That is a reasonable distinction for that particular discussion, but there is a deeper, and general, principle behind that. Namely, "metadata is descriptive, not prescriptive." Of course once one understands that principle, the names of the fields don't matter so much, but it is helpful for "naive" users of the metadata if the field names strongly connote description of the package rather than behavior of the tool. From storchaka at gmail.com Mon Dec 10 10:02:26 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Mon, 10 Dec 2012 11:02:26 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <9B40E85E-C809-475B-BF08-4E16274158C5@gmail.com> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C55109.7040107@mrabarnett.plus.com> <9B40E85E-C809-475B-BF08-4E16274158C5@gmail.com> Message-ID: On 10.12.12 05:30, Raymond Hettinger wrote: > On Dec 9, 2012, at 10:03 PM, MRAB > wrote: >> What happens when a key is deleted? Will that leave an empty slot in >> the entry array? > Yes. See the __delitem__() method in the pure python implemention > at http://code.activestate.com/recipes/578375 You can move the last entry on freed place. This will preserve compactness of entries array and simplify and speedup iterations and some other operations. 
def __delitem__(self, key, hashvalue=None):
    if hashvalue is None:
        hashvalue = hash(key)
    found, i = self._lookup(key, hashvalue)
    if found:
        index = self.indices[i]
        self.indices[i] = DUMMY
        self.size -= 1
        if index != self.size:
            # Move the last entry into the freed slot to keep
            # the entry arrays dense.
            lasthash = self.hashlist[self.size]
            lastkey = self.keylist[self.size]
            found, j = self._lookup(lastkey, lasthash)
            assert found
            assert i != j
            self.indices[j] = index
            self.hashlist[index] = lasthash
            self.keylist[index] = lastkey
            self.valuelist[index] = self.valuelist[self.size]
        # Clear the now-unused last slot.
        self.hashlist[self.size] = UNUSED
        self.keylist[self.size] = UNUSED
        self.valuelist[self.size] = UNUSED
    else:
        raise KeyError(key)

From storchaka at gmail.com Mon Dec 10 10:12:27 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 10 Dec 2012 11:12:27 +0200
Subject: [Python-Dev] More compact dictionaries with faster iteration
In-Reply-To: <50C593B2.9060203@python.org>
References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org>
Message-ID: 

On 10.12.12 09:48, Christian Heimes wrote:
> On the other hand every lookup and collision checks needs an additional
> multiplication, addition and pointer dereferencing:
>
> entry = entries_baseaddr + sizeof(PyDictKeyEntry) * idx

I think the main slowdown will be in getting the index out of an array with a variable element size. It requires a conditional jump, which is not as cheap as an addition or a shift.

switch (self->index_size) {
case 1:
    idx = ((uint8_t *)self->indices)[i];
    break;
case 2:
    idx = ((uint16_t *)self->indices)[i];
    break;
...
}

I think that variable-size indices aren't worth the effort.

From mark at hotpy.org Mon Dec 10 10:06:51 2012
From: mark at hotpy.org (Mark Shannon)
Date: Mon, 10 Dec 2012 09:06:51 +0000
Subject: [Python-Dev] More compact dictionaries with faster iteration
In-Reply-To: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com>
References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com>
Message-ID: <50C5A62B.3030001@hotpy.org>

On 10/12/12 01:44, Raymond Hettinger wrote:
> The current memory layout for dictionaries is
> unnecessarily inefficient. It has a sparse table of
> 24-byte entries containing the hash value, key pointer,
> and value pointer.
>
> Instead, the 24-byte entries should be stored in a
> dense table referenced by a sparse table of indices.

What minimum size and resizing factor do you propose for the entries array?

Cheers,
Mark.

From victor.stinner at gmail.com Mon Dec 10 10:34:36 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 10 Dec 2012 10:34:36 +0100
Subject: [Python-Dev] Do more at compile time; less at runtime
In-Reply-To: <50C50F1B.2080005@hotpy.org>
References: <50C50F1B.2080005@hotpy.org>
Message-ID: 

Do you know my registervm project? I try to emit better bytecode. See the implementation at http://hg.python.org/sandbox/registervm

See for example the documentation: hg.python.org/sandbox/registervm/file/tip/REGISTERVM.txt

I implemented other optimisations. See also the WPython project, which also uses more efficient bytecode. http://code.google.com/p/wpython/

Victor

On Dec 9, 2012 23:23, "Mark Shannon" wrote:

> Hi all,
>
> The current CPython bytecode interpreter is rather more complex than it
> needs to be. A number of bytecodes could be eliminated and a few more
> simplified by moving the work involved in handling compound statements
> (loops, try-blocks, etc) from the interpreter to the compiler.
>
> This simplest example of this is the while loop...
> while cond: > body > > This currently compiled as > > start: > if not cond goto end > body > goto start > end: > > but it could be compiled as > > goto test: > start: > body > if cond goto start > > which eliminates one instruction per iteration. > > A more complex example is a return in a try-finally block. > > try: > part1 > if cond: > return X > part2 > finally: > part3 > > Currently, handling the return is complex and involves "pseudo > exceptions", but if part3 were duplicated by the compiler, then the RETURN > bytecode could just perform a simple return. > The code above would be compiled thus... > > PUSH_BLOCK try > part1 > if not X goto endif > push X > POP_BLOCK > part3 <<< duplicated > RETURN_VALUE > endif: > part2 > POP_BLOCK > part3 <<< duplicated > > The changes I am proposing are: > > Allow negative line deltas in the lnotab array (bytecode deltas would > remain non-negative) > Remove the SETUP_LOOP, BREAK and CONTINUE bytecodes > Simplify the RETURN bytecode > Eliminate "pseudo exceptions" from the interpreter > Simplify (or perhaps eliminate) SETUP_TRY, END_FINALLY, END_WITH. > Reverse the sense of the FOR_ITER bytecode (ie. jump on not-exhausted) > > > The net effect of these changes would be: > Reduced code size and reduced code complexity. > A small (1-5%)? increase in speed, due the simplification of the > bytecodes and a very small change in the number of bytecodes executed. > A small change in the static size of the bytecodes (-2% to +2%)? > > Although this is a quite intrusive change, I think it is worthwhile as it > simplifies ceval.c considerably. > The interpreter has become rather convoluted and any simplification has to > be a good thing. > > I've already implemented negative line deltas and the transformed while > loop: https://bitbucket.org/**markshannon/cpython-lnotab-**signed > I'm currently working on the block unwinding. > > So, > Good idea? Bad idea? > Should I write a PEP or is the bug tracker sufficient? > > Cheers, > Mark. > > > > > > > > ______________________________**_________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/**mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/**mailman/options/python-dev/** > victor.stinner%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From arigo at tunes.org Mon Dec 10 10:40:30 2012 From: arigo at tunes.org (Armin Rigo) Date: Mon, 10 Dec 2012 10:40:30 +0100 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> Message-ID: Hi Raymond, On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger wrote: > Instead, the data should be organized as follows: > > indices = [None, 1, None, None, None, 0, None, 2] > entries = [[-9092791511155847987, 'timmy', 'red'], > [-8522787127447073495, 'barry', 'green'], > [-6480567542315338377, 'guido', 'blue']] As a side note, your suggestion also enables order-preserving dictionaries: iter() would automatically yield items in the order they were inserted, as long as there was no deletion. People will immediately start relying on this "feature"... and be confused by the behavior of deletion. :-/ A bient?t, Armin. 
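Armin's observation is easy to demonstrate with a toy model of the dense layout. This is an illustration only, not Raymond's recipe: a plain dict stands in for the sparse, hash-probed index table, and the names here are invented for the sketch:

entries = []    # dense list of [hash, key, value] rows
index = {}      # stand-in for the sparse indices table: key -> row number
free = []       # rows vacated by deletions

def put(key, value):
    if key in index:
        entries[index[key]][2] = value
    elif free:                          # reuse a hole left by a deletion
        row = free.pop()
        entries[row] = [hash(key), key, value]
        index[key] = row
    else:                               # append to the dense tail
        index[key] = len(entries)
        entries.append([hash(key), key, value])

def delete(key):
    row = index.pop(key)
    entries[row] = None
    free.append(row)

def keys():
    return [e[1] for e in entries if e is not None]

put('timmy', 'red'); put('barry', 'green'); put('guido', 'blue')
print(keys())    # ['timmy', 'barry', 'guido'] -- insertion order
delete('timmy'); put('dave', 'orange')
print(keys())    # ['dave', 'barry', 'guido'] -- reusing the hole broke it

As long as put() only appends, iteration follows insertion order; the first reuse of a hole left by a deletion breaks it, which is exactly the confusion Armin anticipates.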
From kachayev at gmail.com Mon Dec 10 11:01:53 2012 From: kachayev at gmail.com (Alexey Kachayev) Date: Mon, 10 Dec 2012 12:01:53 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> Message-ID: Hi! 2012/12/10 Armin Rigo > Hi Raymond, > > On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger > wrote: > > Instead, the data should be organized as follows: > > > > indices = [None, 1, None, None, None, 0, None, 2] > > entries = [[-9092791511155847987, 'timmy', 'red'], > > [-8522787127447073495, 'barry', 'green'], > > [-6480567542315338377, 'guido', 'blue']] > > As a side note, your suggestion also enables order-preserving > dictionaries: iter() would automatically yield items in the order they > were inserted, as long as there was no deletion. People will > immediately start relying on this "feature"... and be confused by the > behavior of deletion. :-/ > I'm not sure about "relying" cause currently Python supports this feature only with OrderedDict object. So it's common for python developers do not rely on inserting ordering when using generic dict. > > A bient?t, > > Armin. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/kachayev%40gmail.com > -- Kind regards, Alexey S. Kachayev, CTO at Kitapps Inc. ---------- http://codemehanika.org Skype: kachayev Tel: +380-996692092 -------------- next part -------------- An HTML attachment was scrubbed... URL: From dholth at gmail.com Mon Dec 10 14:29:02 2012 From: dholth at gmail.com (Daniel Holth) Date: Mon, 10 Dec 2012 08:29:02 -0500 Subject: [Python-Dev] [Distutils] Is is worth disentangling distutils? In-Reply-To: <50C58DA5.3000307@cavallinux.eu> References: <50C58DA5.3000307@cavallinux.eu> Message-ID: On Mon, Dec 10, 2012 at 2:22 AM, Antonio Cavallo wrote: > Hi, > I wonder if is it worth/if there is any interest in trying to "clean" up > distutils: nothing in terms to add new features, just a *major* cleanup > retaining the exact same interface. > > > I'm not planning anything like *adding features* or rewriting rpm/rpmbuild > here, simply cleaning up that un-holy code mess. Yes it served well, don't > get me wrong, and I think it did work much better than anything it was > meant to replace it. > > I'm not into the py3 at all so I wonder how possibly it could fit/collide > into the big plan. > > Or I'll be wasting my time? > It has been tried before. IIUC the nature of distutils and extending distutils is that client code depends on the entire tangle. If you try to clean it up you will break backwards compatibility. distutils2 is designed to break backwards compatibility with distutils and is essentially a cleaned up distutils. Have you tried Bento? http://bento.readthedocs.org/en/latest/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From a.badger at gmail.com Mon Dec 10 14:57:50 2012
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Mon, 10 Dec 2012 05:57:50 -0800
Subject: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]]
In-Reply-To: 
References: <50C3ADAB.4070308@mrabarnett.plus.com> <50C3E636.1020003@pearwood.info> <50C3F356.8060106@pearwood.info>
Message-ID: <20121210135750.GE2613@unaka.lan>

On Sun, Dec 09, 2012 at 01:51:09PM +1100, Chris Angelico wrote:
> On Sun, Dec 9, 2012 at 1:11 PM, Steven D'Aprano wrote:
> > Why would a software package called "Spam" install a top-level module called
> > "Jam" rather than "Spam"? Isn't the whole point of Python packages to solve
> > this namespace problem?
>
> That would require/demand that the software package MUST define a
> module with its own name, and MUST NOT define any other top-level
> modules, and also that package names MUST be unique. (RFC 2119
> keywords.) That would work, as long as those restrictions are
> acceptable.
>
/me notes that setuptools itself is an example of a package that violates this rule (setuptools and pkg_resources).

No objections to "That would work, as long as those restrictions are acceptable."... that seems to sum up where we're at.

-Toshio
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
URL: 

From a.badger at gmail.com Mon Dec 10 15:13:58 2012
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Mon, 10 Dec 2012 06:13:58 -0800
Subject: [Python-Dev] Conflicts [was Re: Keyword meanings [was: Accept just PEP-0426]]
In-Reply-To: 
References: <50C3ADAB.4070308@mrabarnett.plus.com> <50C3E636.1020003@pearwood.info> <50C40406.4020301@mrabarnett.plus.com>
Message-ID: <20121210141358.GF2613@unaka.lan>

On Sun, Dec 09, 2012 at 12:48:45AM -0500, PJ Eby wrote:
>
> Do any of the distro folks know of a Python project tagged as
> conflicting with another for their distro, where the conflict does
> *not* involve any files in conflict?
>
In Fedora we do work to avoid most types of Conflicts (backporting fixes, etc) but I can give some examples of where Conflicts could have been used in the past:

In docutils prior to the latest release, certain portions of docutils were broken if PyXML was installed (since PyXML replaces certain stdlib xml.* functionality). So older docutils versions could have had a Conflicts: PyXML. Nick has since provided a technique for docutils to use that loads from the stdlib first and only goes to PyXML if the functionality is not available there.

Various libraries in web stacks have had bugs that prevent the proper functioning of the web framework at the top level. In case of major issues (security, unable to start up), these top-level frameworks could use versioned Conflicts to prevent installation. For instance: TurboGears might have a Conflicts: CherryPy < 2.3.1

Note, though, that if parallel-installable versions (and selecting the proper version from among them) work, then this type of Conflict wouldn't be necessary. You'd have versioned Requires: instead.

-Toshio
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From barry at python.org Mon Dec 10 16:28:28 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 10 Dec 2012 10:28:28 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <50C593B2.9060203@python.org> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> Message-ID: <20121210102828.6b4d9a1e@limelight.wooz.org> On Dec 10, 2012, at 08:48 AM, Christian Heimes wrote: >It's hard to predict how the extra CPU instructions are going to affect >the performance on different CPUs and scenarios. My guts tell me that >your proposal is going to perform better on modern CPUs with lots of L1 >and L2 cache and in an average case scenario. A worst case scenario with >lots of collisions might be measurable slower for large dicts and small >CPU cache. I'd be interested to see what effect this has on memory constrained platforms such as many current ARM applications (mostly likely 32bit for now). Python's memory consumption is an overheard complaint for folks developing for those platforms. Cheers, -Barry From solipsis at pitrou.net Mon Dec 10 16:48:45 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 10 Dec 2012 16:48:45 +0100 Subject: [Python-Dev] More compact dictionaries with faster iteration References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> Message-ID: <20121210164845.04942fa3@pitrou.net> Le Mon, 10 Dec 2012 10:40:30 +0100, Armin Rigo a ?crit : > Hi Raymond, > > On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger > wrote: > > Instead, the data should be organized as follows: > > > > indices = [None, 1, None, None, None, 0, None, 2] > > entries = [[-9092791511155847987, 'timmy', 'red'], > > [-8522787127447073495, 'barry', 'green'], > > [-6480567542315338377, 'guido', 'blue']] > > As a side note, your suggestion also enables order-preserving > dictionaries: iter() would automatically yield items in the order they > were inserted, as long as there was no deletion. People will > immediately start relying on this "feature"... and be confused by the > behavior of deletion. :-/ If that's really an issue, we can deliberately scramble the iteration order a bit :-) (of course it might negatively impact HW prefetching) Regards Antoine. From steve at pearwood.info Mon Dec 10 17:15:11 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 11 Dec 2012 03:15:11 +1100 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> Message-ID: <50C60A8F.60107@pearwood.info> On 10/12/12 20:40, Armin Rigo wrote: > As a side note, your suggestion also enables order-preserving > dictionaries: iter() would automatically yield items in the order they > were inserted, as long as there was no deletion. People will > immediately start relying on this "feature"... and be confused by the > behavior of deletion. :-/ If we want to avoid the attractive nuisance of iteration order being almost, but not quite, order-preserving, there is a simple fix: when iterating over the dict, instead of starting at the start of the table, start at some arbitrary point in the middle and wrap around. That will increase the cost of iteration slightly, but avoid misleading behaviour. I think all we need do is change the existing __iter__ method from this: def __iter__(self): for key in self.keylist: if key is not UNUSED: yield key to this: # Untested! 
def __iter__(self): # Choose an arbitrary starting point. # 103 is chosen because it is prime. n = (103 % len(self.keylist)) if self.keylist else 0 for key in self.keylist[n:] + self.keylist[:n]: # I presume the C version will not duplicate the keylist. if key is not UNUSED: yield key This mixes the order of iteration up somewhat, just enough to avoid misleading people into thinking that this is order-preserving. The order will depend on the size of the dict, not the history. For example, with keys a, b, c, ... added in that order, the output is: len=1 key=a len=2 key=ba len=3 key=bca len=4 key=dabc len=5 key=deabc len=6 key=bcdefa len=7 key=fgabcde len=8 key=habcdefg len=9 key=efghiabcd which I expect is mixed up enough to discourage programmers from jumping to conclusions about dicts having guaranteed order. -- Steven From pje at telecommunity.com Mon Dec 10 17:16:49 2012 From: pje at telecommunity.com (PJ Eby) Date: Mon, 10 Dec 2012 11:16:49 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <20121210164845.04942fa3@pitrou.net> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> Message-ID: On Mon, Dec 10, 2012 at 10:48 AM, Antoine Pitrou wrote: > Le Mon, 10 Dec 2012 10:40:30 +0100, > Armin Rigo a ?crit : >> Hi Raymond, >> >> On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger >> wrote: >> > Instead, the data should be organized as follows: >> > >> > indices = [None, 1, None, None, None, 0, None, 2] >> > entries = [[-9092791511155847987, 'timmy', 'red'], >> > [-8522787127447073495, 'barry', 'green'], >> > [-6480567542315338377, 'guido', 'blue']] >> >> As a side note, your suggestion also enables order-preserving >> dictionaries: iter() would automatically yield items in the order they >> were inserted, as long as there was no deletion. People will >> immediately start relying on this "feature"... and be confused by the >> behavior of deletion. :-/ > > If that's really an issue, we can deliberately scramble the iteration > order a bit :-) (of course it might negatively impact HW prefetching) On the other hand, this would also make a fast ordered dictionary subclass possible, just by not using the free list for additions, combined with periodic compaction before adds or after deletes. From pje at telecommunity.com Mon Dec 10 17:18:59 2012 From: pje at telecommunity.com (PJ Eby) Date: Mon, 10 Dec 2012 11:18:59 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: <20121210033808.GG28908@mcnabbs.org> References: <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <20121206064925.GC2613@unaka.lan> <20121207170134.GD2613@unaka.lan> <20121210033808.GG28908@mcnabbs.org> Message-ID: On Sun, Dec 9, 2012 at 10:38 PM, Andrew McNabb wrote: > On Fri, Dec 07, 2012 at 05:02:26PM -0500, PJ Eby wrote: >> If the packages have files in conflict, they won't be both installed. >> If they don't have files in conflict, there's nothing important to be >> informed of. If one is installing pexpect-u, then one does not need >> to discover that it is a successor of pexpect. > > In the specific case of pexpect and pexpect-u, the files don't actually > conflict. The pexpect package includes a "pexpect.py" file, while > pexpect-u includes a "pexpect/" directory. These conflict, but not in > the easily detectable sense. Excellent! A concrete non-file use case. 
Setuptools handles this particular scenario by including a list of top-level module or package names, but newer tools ought to look out for this scenario, too. From solipsis at pitrou.net Mon Dec 10 17:22:52 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 10 Dec 2012 17:22:52 +0100 Subject: [Python-Dev] More compact dictionaries with faster iteration References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> Message-ID: <20121210172252.07f68f75@pitrou.net> Le Mon, 10 Dec 2012 11:16:49 -0500, PJ Eby a ?crit : > On Mon, Dec 10, 2012 at 10:48 AM, Antoine Pitrou > wrote: > > Le Mon, 10 Dec 2012 10:40:30 +0100, > > Armin Rigo a ?crit : > >> Hi Raymond, > >> > >> On Mon, Dec 10, 2012 at 2:44 AM, Raymond Hettinger > >> wrote: > >> > Instead, the data should be organized as follows: > >> > > >> > indices = [None, 1, None, None, None, 0, None, 2] > >> > entries = [[-9092791511155847987, 'timmy', 'red'], > >> > [-8522787127447073495, 'barry', 'green'], > >> > [-6480567542315338377, 'guido', 'blue']] > >> > >> As a side note, your suggestion also enables order-preserving > >> dictionaries: iter() would automatically yield items in the order > >> they were inserted, as long as there was no deletion. People will > >> immediately start relying on this "feature"... and be confused by > >> the behavior of deletion. :-/ > > > > If that's really an issue, we can deliberately scramble the > > iteration order a bit :-) (of course it might negatively impact HW > > prefetching) > > On the other hand, this would also make a fast ordered dictionary > subclass possible, just by not using the free list for additions, > combined with periodic compaction before adds or after deletes. I suspect that's what Raymond has in mind :-) Regards Antoine. From brett at python.org Mon Dec 10 17:27:45 2012 From: brett at python.org (Brett Cannon) Date: Mon, 10 Dec 2012 11:27:45 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <20121210102828.6b4d9a1e@limelight.wooz.org> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <20121210102828.6b4d9a1e@limelight.wooz.org> Message-ID: On Mon, Dec 10, 2012 at 10:28 AM, Barry Warsaw wrote: > On Dec 10, 2012, at 08:48 AM, Christian Heimes wrote: > > >It's hard to predict how the extra CPU instructions are going to affect > >the performance on different CPUs and scenarios. My guts tell me that > >your proposal is going to perform better on modern CPUs with lots of L1 > >and L2 cache and in an average case scenario. A worst case scenario with > >lots of collisions might be measurable slower for large dicts and small > >CPU cache. > > I'd be interested to see what effect this has on memory constrained > platforms > such as many current ARM applications (mostly likely 32bit for now). > Python's > memory consumption is an overheard complaint for folks developing for those > platforms. > Maybe Kristjan can shed some light on how this would have helped him on the ps3 work he has been doing for Dust 514 as he has had memory issues there. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pje at telecommunity.com Mon Dec 10 17:34:58 2012 From: pje at telecommunity.com (PJ Eby) Date: Mon, 10 Dec 2012 11:34:58 -0500 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: <87r4my1fcv.fsf@uwakimon.sk.tsukuba.ac.jp> References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> <87vcca1xv1.fsf@uwakimon.sk.tsukuba.ac.jp> <87r4my1fcv.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Mon, Dec 10, 2012 at 3:27 AM, Stephen J. Turnbull wrote: > PJ Eby writes: > > > By "clear", I mean "free of prior assumptions". > > Ah, well, I guess I've just run into a personal limitation. I can't > imagine thinking that is "free of prior assumptions". Not my > own, and not by others, either. I suppose I should have said, "free of *known* prior assumptions", since the trick to suspending assumptions is to find the ones you *have*. The deeper assumptions, alas, can usually only be found by clashing opinions with others... then stepping back and going, "wait... what does he/she believe that's *different* from what I believe, that allows them to have that differing opinion?" And then that's how you find out what it is that *you're* assuming, that you didn't know you were assuming. ;-) (Not to mention what the other person is.) > > Put together, the phrase "clear thinking on concrete use cases" means > > (at least to me), "dropping all preconceptions of the existing design > > and starting over from square one, to ask how best the problem may be > > solved, using specific examples as a guide rather than using > > generalities." > > Sure, but ISTM that's the opposite of what you've actually been doing, > at least in terms of contributing to my understanding. One obstacle > to discussion you have contributed to overcoming in my thinking is the > big generality that the packager (ie, the person writing the metadata) > is in a position to recommend "good behavior" to the installation > tool, vs. being in a position to point out "relevant considerations" > for users and tools installing the packager's product. Right, but I started from a concrete scenario I wanted to avoid, which led me to question the assumption that those fields were actually useful. As soon as I began questioning *that* assumption and asking for use cases (2 years ago, in the last PEP 345 revision discussion), it became apparent to me that there was something seriously wrong with the conflicts and obsoletes fields, as they had almost no real utility as they were defined and understood at that point. > Until that generality is formulated and expressed, it's very difficult > to see why the examples and particular solutions to use cases that > various proponents have described fail to address some real problems. Unfortunately, it's a chicken-and-egg problem: until you know what assumptions are being made, you can't formulate them. It's an iterative process of exposing assumptions, until you succeed in actually communicating. ;-) Heck, even something as simple as my assumptions about what "clear thinking" meant and what I was trying to say has taken some back and forth to clarify. ;-) > It was difficult for me to see, at first, what distinction was > actually being made. > > Specifically, I thought that the question about "Obsoletes" vs. > "Obsoleted-By" was about which package should be considered > authoritative about obsolescence. 
That is a reasonable distinction > for that particular discussion, but there is a deeper, and general, > principle behind that. Namely, "metadata is descriptive, not > prescriptive." Actually, the principle I was clinging to for *both* fields was not giving project authors authority over other people's projects. It's fine for metadata to be prescriptive (e.g. requirements), it's just that it should be prescriptive *only* for that project in isolation. (In the broader sense, it also applies to the distro situation: the project author doesn't really have authority over the distro, either, so it can only be a suggestion there, as well.) From a.badger at gmail.com Mon Dec 10 17:35:49 2012 From: a.badger at gmail.com (Toshio Kuratomi) Date: Mon, 10 Dec 2012 08:35:49 -0800 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> Message-ID: On Fri, Dec 7, 2012 at 10:46 PM, PJ Eby wrote: > > In any case, as I said before, I don't have an issue with the fields > all being declared as being for informational purposes only. My issue > is only with recommendations for automated tool behavior that permit > one project's author to exercise authority over another project's > installation. Skipping over a lot of other replies between you and I because I think that we disagree on a lot but that's all moot if we agree here. I have no problems with Obsoletes, Conflicts, Requires, and Provides types of fields are marked informational. In fact, there are many cases where packages are overzealous in their use of Requires right now that cause distributions to patch the dependency information in the package metadata. -Toshio -------------- next part -------------- An HTML attachment was scrubbed... URL: From christian at python.org Mon Dec 10 17:39:04 2012 From: christian at python.org (Christian Heimes) Date: Mon, 10 Dec 2012 17:39:04 +0100 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <20121210102828.6b4d9a1e@limelight.wooz.org> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <20121210102828.6b4d9a1e@limelight.wooz.org> Message-ID: <50C61028.3020200@python.org> Am 10.12.2012 16:28, schrieb Barry Warsaw: > I'd be interested to see what effect this has on memory constrained platforms > such as many current ARM applications (mostly likely 32bit for now). Python's > memory consumption is an overheard complaint for folks developing for those > platforms. I had ARM platforms in mind, too. Unfortunately we don't have any ARM platforms for testing and performance measurement. Even Trent's Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi, that's all. Perhaps somebody (PSF ?) is willing to donate a couple of ARM boards to Snakebite. I'm thinking of Raspberry Pi (ARMv6), Pandaboard (ARMv7 Cortex-A9) and similar. 
Christian From p.f.moore at gmail.com Mon Dec 10 17:44:46 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 10 Dec 2012 16:44:46 +0000 Subject: [Python-Dev] Keyword meanings [was: Accept just PEP-0426] In-Reply-To: References: <50abef0e.ab96320a.6675.ffffb439@mx.google.com> <87pq375z44.fsf@uwakimon.sk.tsukuba.ac.jp> <87obir5r2u.fsf@uwakimon.sk.tsukuba.ac.jp> <90224964F19542FFA2B4AE2D115B54C1@gmail.com> <2FEACDF3DA3048CE920F97832F24E584@gmail.com> Message-ID: On 10 December 2012 16:35, Toshio Kuratomi wrote: > I have no problems with Obsoletes, Conflicts, Requires, and Provides types > of fields are marked informational. In fact, there are many cases where > packages are overzealous in their use of Requires right now that cause > distributions to patch the dependency information in the package metadata. Given the endless debate on these fields, and the fact that it pretty much all seems to be about what happens when tools enforce them, I'm +1 on this. Particularly as these fields were not the focus of this change to the spec in any case. Paul. From kristjan at ccpgames.com Mon Dec 10 17:06:23 2012 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=) Date: Mon, 10 Dec 2012 16:06:23 +0000 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <20121210102828.6b4d9a1e@limelight.wooz.org> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <20121210102828.6b4d9a1e@limelight.wooz.org> Message-ID: Indeed, I had to change the dict tuning parameters to make dicts behave reasonably on the PS3. Interesting stuff indeed. > -----Original Message----- > From: Python-Dev [mailto:python-dev- > bounces+kristjan=ccpgames.com at python.org] On Behalf Of Barry Warsaw > Sent: 10. desember 2012 15:28 > To: python-dev at python.org > Subject: Re: [Python-Dev] More compact dictionaries with faster iteration > > I'd be interested to see what effect this has on memory constrained > platforms such as many current ARM applications (mostly likely 32bit for > now). Python's memory consumption is an overheard complaint for folks > developing for those platforms. > > Cheers, > -Barry > From barry at python.org Mon Dec 10 17:53:17 2012 From: barry at python.org (Barry Warsaw) Date: Mon, 10 Dec 2012 11:53:17 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <50C61028.3020200@python.org> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <20121210102828.6b4d9a1e@limelight.wooz.org> <50C61028.3020200@python.org> Message-ID: <20121210115317.3f63bfa9@resist.wooz.org> On Dec 10, 2012, at 05:39 PM, Christian Heimes wrote: >I had ARM platforms in mind, too. Unfortunately we don't have any ARM >platforms for testing and performance measurement. Even Trent's >Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi, >that's all. I have a few ARM machines that I can use for testing, though I can't provide external access to them. * http://buildbot.python.org/all/buildslaves/warsaw-ubuntu-arm (Which oops, I see is down. Why did I not get notifications about that?) * I have a Nexus 7 flashed with Ubuntu 12.10 (soon to be 13.04). * Pandaboard still sitting in its box. ;) >Perhaps somebody (PSF ?) is willing to donate a couple of ARM boards to >Snakebite. I'm thinking of Raspberry Pi (ARMv6), Pandaboard (ARMv7 >Cortex-A9) and similar. Suitable ARM boards can be had cheap, probably overwhelmed by labor costs of getting them up and running. 
I am not offering for *that*. ;)

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:

From arigo at tunes.org Mon Dec 10 19:01:51 2012
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 10 Dec 2012 19:01:51 +0100
Subject: [Python-Dev] More compact dictionaries with faster iteration
In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net>
Message-ID:

Hi Philip,

On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby wrote:
> On the other hand, this would also make a fast ordered dictionary
> subclass possible, just by not using the free list for additions,
> combined with periodic compaction before adds or after deletes.

Technically, I could see Python switching to ordered dictionaries everywhere. Raymond's insight suddenly makes it easy for CPython and PyPy, and at least Jython could use the LinkedHashMap class (although this would need checking with Jython guys). I'd vaguely argue that dictionary orders are one of the few non-reproducible factors in a Python program, so it might be a good thing. But only vaguely --- maybe I got far too used to random orders over time...

A bientôt,

Armin.

From rdmurray at bitdance.com Mon Dec 10 18:56:38 2012
From: rdmurray at bitdance.com (R. David Murray)
Date: Mon, 10 Dec 2012 12:56:38 -0500
Subject: [Python-Dev] More compact dictionaries with faster iteration
In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <20121210102828.6b4d9a1e@limelight.wooz.org>
Message-ID: <20121210175638.A70772500E9@webabinitio.net>

On Mon, 10 Dec 2012 16:06:23 +0000, kristjan at ccpgames.com wrote:
> Indeed, I had to change the dict tuning parameters to make dicts
> behave reasonably on the PS3.
>
> Interesting stuff indeed.

Out of both curiosity and a possible application of my own for the information, what values did you end up with?

--David

From a.cavallo at cavallinux.eu Mon Dec 10 08:22:13 2012
From: a.cavallo at cavallinux.eu (Antonio Cavallo)
Date: Mon, 10 Dec 2012 07:22:13 +0000
Subject: [Python-Dev] Is it worth disentangling distutils?
Message-ID: <50C58DA5.3000307@cavallinux.eu>

Hi,
I wonder if it is worth it / if there is any interest in trying to "clean" up distutils: nothing in terms of adding new features, just a *major* cleanup retaining the exact same interface.

I'm not planning anything like *adding features* or rewriting rpm/rpmbuild here, simply cleaning up that un-holy code mess. Yes it served well, don't get me wrong, and I think it did work much better than anything it was meant to replace.

I'm not into py3 at all, so I wonder how it could possibly fit into (or collide with) the big plan.

Or will I be wasting my time?

Thanks

From brian at python.org Mon Dec 10 19:19:48 2012
From: brian at python.org (Brian Curtin)
Date: Mon, 10 Dec 2012 12:19:48 -0600
Subject: [Python-Dev] Is it worth disentangling distutils?
In-Reply-To: <50C58DA5.3000307@cavallinux.eu> References: <50C58DA5.3000307@cavallinux.eu>
Message-ID:

On Mon, Dec 10, 2012 at 1:22 AM, Antonio Cavallo wrote:
> I'm not into py3 at all, so I wonder how it could possibly fit into (or
> collide with) the big plan.
>
> Or will I be wasting my time?

If you're not doing it on Python 3 then you are wasting your time.
From pje at telecommunity.com Mon Dec 10 19:38:52 2012 From: pje at telecommunity.com (PJ Eby) Date: Mon, 10 Dec 2012 13:38:52 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> Message-ID: On Mon, Dec 10, 2012 at 1:01 PM, Armin Rigo wrote: > On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby wrote: >> On the other hand, this would also make a fast ordered dictionary >> subclass possible, just by not using the free list for additions, >> combined with periodic compaction before adds or after deletes. > > Technically, I could see Python switching to ordered dictionaries > everywhere. Raymond's insight suddenly makes it easy for CPython and > PyPy, and at least Jython could use the LinkedHashMap class (although > this would need checking with Jython guys). What about IronPython? Also, note that using ordered dictionaries carries a performance cost for dictionaries whose keys change a lot. This probably wouldn't affect most dictionaries in most programs, because module and object dictionaries generally don't delete and re-add a lot of keys very often. But in cases where a dictionary is used as a queue or stack or something of that sort, the packing costs could add up. Under the current scheme, as long as collisions were minimal, the contents wouldn't be repacked very often. Without numbers it's hard to say for certain, but the advantage to keeping ordered dictionaries a distinct type is that the standard dictionary type can then get that extra bit of speed in exchange for dropping the ordering requirement. From solipsis at pitrou.net Mon Dec 10 19:47:37 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 10 Dec 2012 19:47:37 +0100 Subject: [Python-Dev] More compact dictionaries with faster iteration References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <20121210102828.6b4d9a1e@limelight.wooz.org> <50C61028.3020200@python.org> <20121210115317.3f63bfa9@resist.wooz.org> Message-ID: <20121210194737.6ff98ff2@pitrou.net> On Mon, 10 Dec 2012 11:53:17 -0500 Barry Warsaw wrote: > On Dec 10, 2012, at 05:39 PM, Christian Heimes wrote: > > >I had ARM platforms in mind, too. Unfortunately we don't have any ARM > >platforms for testing and performance measurement. Even Trent's > >Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi, > >that's all. > > I have a few ARM machines that I can use for testing, though I can't provide > external access to them. > > * http://buildbot.python.org/all/buildslaves/warsaw-ubuntu-arm > (Which oops, I see is down. Why did I not get notifications about that?) Because buildbot.python.org is waiting for someone (mail.python.org, perhaps) to accept its SMTP requests. Feel free to ping the necessary people ;-) Regards Antoine. From bwmaister at gmail.com Mon Dec 10 21:02:13 2012 From: bwmaister at gmail.com (Brandon W Maister) Date: Mon, 10 Dec 2012 15:02:13 -0500 Subject: [Python-Dev] Emacs users: hg-tools-grep In-Reply-To: <20121208165106.725ccabf@resist.wooz.org> References: <20121208165106.725ccabf@resist.wooz.org> Message-ID: > > P.S. Who wants to abuse Jono and Matthew's copyright again and provide a > git version? > Oh, I do! I also feel weird about adding a copyright to this, but how will other people feel comfortable using it if I don't? 
Also I put it in github, in case people want to fix it:
https://github.com/quodlibetor/git-tools.el

;; Copyright (c) 2012 Brandon W Maister
;;
;; Permission is hereby granted, free of charge, to any person obtaining
;; a copy of this software and associated documentation files (the
;; "Software"), to deal in the Software without restriction, including
;; without limitation the rights to use, copy, modify, merge, publish,
;; distribute, sublicense, and/or sell copies of the Software, and to
;; permit persons to whom the Software is furnished to do so, subject to
;; the following conditions:
;;
;; The above copyright notice and this permission notice shall be
;; included in all copies or substantial portions of the Software.
;;
;; THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
;; EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
;; MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
;; NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
;; LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
;; OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
;; WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

;; This code is based on hg-tools.el, which in turn is based on bzr-tools.el
;; Copyright (c) 2008-2012 Jonathan Lange, Matthew Lefkowitz, Barry A. Warsaw

(provide 'git-tools)

(defconst git-tools-grep-command
  "git ls-files -z | xargs -0 grep -In %s"
  "The command used for grepping files using git. See `git-tools-grep'.")

;; Run 'code' at the root of the branch which dirname is in.
(defmacro git-tools-at-branch-root (dirname &rest code)
  `(let ((default-directory (locate-dominating-file (expand-file-name ,dirname) ".git"))) ,@code))

(defun git-tools-grep (expression dirname)
  "Search a branch for `expression'. If there's a C-u prefix, prompt for `dirname'."
  (interactive
   (let* ((string (read-string "Search for: "))
          (dir (if (null current-prefix-arg)
                   default-directory
                 (read-directory-name (format "Search for %s in: " string)))))
     (list string dir)))
  (git-tools-at-branch-root dirname
    (grep-find (format git-tools-grep-command (shell-quote-argument expression)))))

On Sat, Dec 8, 2012 at 4:51 PM, Barry Warsaw wrote:

> Hark fellow Emacsers. All you unenlightened heathens can stop reading now.
>
> A few years ago, my colleague Jono Lange wrote probably the best little chunk
> of Emacs lisp ever. `M-x bzr-tools-grep` lets you easily search a Bazaar
> repository for a case-sensitive string, providing you with a nice *grep*
> buffer which you can scroll through. When you find a code sample you want to
> look at, C-c C-c visits the file and plops you right at the matching line.
> You *only* grep through files under version control, so you get to ignore
> generated files, and compilation artifacts, etc.
>
> Of course, this doesn't help you for working on the Python code base, because
> Mercurial. I finally whipped up this straight up rip of Jono's code to work
> with hg. I'm actually embarrassed to put a copyright on this thing, and would
> happily just donate it to Jono, drop it in Python's Misc directory, or slip it
> like a lump of coal into the xmas stocking of whoever wants to "maintain" it
> for the next 20 years.
>
> But anyway, it's already proven enormously helpful to me, so here it is.
>
> Cheers,
> -Barry
>
> P.S. Who wants to abuse Jono and Matthew's copyright again and provide a
> git version?
>
> ;; Copyright (c) 2012 Barry A.
Warsaw
> ;;
> ;; Permission is hereby granted, free of charge, to any person obtaining
> ;; a copy of this software and associated documentation files (the
> ;; "Software"), to deal in the Software without restriction, including
> ;; without limitation the rights to use, copy, modify, merge, publish,
> ;; distribute, sublicense, and/or sell copies of the Software, and to
> ;; permit persons to whom the Software is furnished to do so, subject to
> ;; the following conditions:
> ;;
> ;; The above copyright notice and this permission notice shall be
> ;; included in all copies or substantial portions of the Software.
> ;;
> ;; THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> ;; EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> ;; MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> ;; NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
> ;; LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
> ;; OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
> ;; WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>
> ;; This code is based on bzr-tools.el
> ;; Copyright (c) 2008-2009 Jonathan Lange, Matthew Lefkowitz
>
> (provide 'hg-tools)
>
> (defconst hg-tools-grep-command
>   "hg locate --print0 | xargs -0 grep -In %s"
>   "The command used for grepping files using hg. See `hg-tools-grep'.")
>
> ;; Run 'code' at the root of the branch which dirname is in.
> (defmacro hg-tools-at-branch-root (dirname &rest code)
>   `(let ((default-directory (locate-dominating-file (expand-file-name
>     ,dirname) ".hg"))) ,@code))
>
>
> (defun hg-tools-grep (expression dirname)
>   "Search a branch for `expression'. If there's a C-u prefix, prompt for
>   `dirname'."
>   (interactive
>    (let* ((string (read-string "Search for: "))
>           (dir (if (null current-prefix-arg)
>                    default-directory
>                  (read-directory-name (format "Search for %s in: "
>                  string)))))
>      (list string dir)))
>   (hg-tools-at-branch-root dirname
>     (grep-find (format hg-tools-grep-command (shell-quote-argument
>     expression)))))
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/quodlibetor%40gmail.com
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at python.org Mon Dec 10 21:22:44 2012
From: barry at python.org (Barry Warsaw)
Date: Mon, 10 Dec 2012 15:22:44 -0500
Subject: [Python-Dev] Emacs users: hg-tools-grep
In-Reply-To: References: <20121208165106.725ccabf@resist.wooz.org>
Message-ID: <20121210152244.46942b1b@resist.wooz.org>

On Dec 10, 2012, at 03:02 PM, Brandon W Maister wrote:

>I also feel weird about adding a copyright to this, but how will other
>people feel comfortable using it if I don't?

Nice! I've been told that Emacs 23 is probably a minimum requirement because locate-dominating-file is missing in <= Emacs 22. I'm using Emacs 24.2 almost everywhere these days.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From tjreedy at udel.edu Mon Dec 10 21:50:15 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 10 Dec 2012 15:50:15 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> Message-ID: On 12/10/2012 1:38 PM, PJ Eby wrote: > On Mon, Dec 10, 2012 at 1:01 PM, Armin Rigo wrote: >> On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby wrote: >>> On the other hand, this would also make a fast ordered dictionary >>> subclass possible, just by not using the free list for additions, >>> combined with periodic compaction before adds or after deletes. >> >> Technically, I could see Python switching to ordered dictionaries >> everywhere. Raymond's insight suddenly makes it easy for CPython and >> PyPy, and at least Jython could use the LinkedHashMap class (although >> this would need checking with Jython guys). > > What about IronPython? > > Also, note that using ordered dictionaries carries a performance cost > for dictionaries whose keys change a lot. This probably wouldn't > affect most dictionaries in most programs, because module and object > dictionaries generally don't delete and re-add a lot of keys very > often. But in cases where a dictionary is used as a queue or stack or > something of that sort, the packing costs could add up. Under the > current scheme, as long as collisions were minimal, the contents > wouldn't be repacked very often. > > Without numbers it's hard to say for certain, but the advantage to > keeping ordered dictionaries a distinct type is that the standard > dictionary type can then get that extra bit of speed in exchange for > dropping the ordering requirement. I think that there should be a separate OrderedDict class (or subclass) and that the dict docs should warn (if not now) that while iterating a dict *may*, in some circumstances, give items in insertion *or* sort order, it *will not* in all cases. Example of the latter: >>> d = {8:0, 9:0, 15:0, 13:0, 14:0} >>> for k in d: print(k) 8 9 13 14 15 If one entered the keys in sorted order, as might be natural, one might mistakenly think that insertion order was being reproduced. -- Terry Jan Reedy From tim.delaney at aptare.com Mon Dec 10 22:29:50 2012 From: tim.delaney at aptare.com (Tim Delaney) Date: Tue, 11 Dec 2012 08:29:50 +1100 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> Message-ID: On 11 December 2012 05:01, Armin Rigo wrote: > Hi Philip, > > On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby wrote: > > On the other hand, this would also make a fast ordered dictionary > > subclass possible, just by not using the free list for additions, > > combined with periodic compaction before adds or after deletes. > > Technically, I could see Python switching to ordered dictionaries > everywhere. Raymond's insight suddenly makes it easy for CPython and > PyPy, and at least Jython could use the LinkedHashMap class (although > this would need checking with Jython guys). I'd vaguely argue that > dictionary orders are one of the few non-reproducible factors in a > Python program, so it might be a good thing. But only vaguely --- > maybe I got far too used to random orders over time... 
> Whilst I think Python should not move to ordered dictionaries everywhere, I would say there is an argument (no pun intended) for making **kwargs a dictionary that maintains insertion order *if there are no deletions*. It sounds like we could get that for free with this implementation, although from another post IronPython might not have something suitable. I think there are real advantages to doing so - a trivial one being the ability to easily initialise an ordered dictionary from another ordered dictionary. I could also see an argument for having this property for all dicts. There are many dictionaries that are never deleted from (probably most dict literals) and it would be nice to have an iteration order for them that matched the source code. However if deletions occur all bets would be off. If you need to maintain insertion order in the face of deletions, use an explicit ordereddict. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Mon Dec 10 22:51:06 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 10 Dec 2012 16:51:06 -0500 Subject: [Python-Dev] Guido, Dropbox, and Python In-Reply-To: References: Message-ID: For those who have not heard, Guido left Google Friday and starts at Dropbox in January. (I hope you enjoy the break in between ;-). https://twitter.com/gvanrossum/status/277126763295944705 https://tech.dropbox.com/2012/12/welcome-guido/ My question, Guido, is how this will affect Python development, and in particular, your work on async. If not proprietary info, does or will Dropbox use Python3? -- Terry Jan Reedy From guido at python.org Mon Dec 10 23:01:44 2012 From: guido at python.org (Guido van Rossum) Date: Mon, 10 Dec 2012 14:01:44 -0800 Subject: [Python-Dev] Guido, Dropbox, and Python In-Reply-To: References: Message-ID: On Mon, Dec 10, 2012 at 1:51 PM, Terry Reedy wrote: > For those who have not heard, Guido left Google Friday and starts at Dropbox > in January. (I hope you enjoy the break in between ;-). Thank you, I am already enjoying it! > https://twitter.com/gvanrossum/status/277126763295944705 > https://tech.dropbox.com/2012/12/welcome-guido/ > > My question, Guido, is how this will affect Python development, and in > particular, your work on async. If not proprietary info, does or will > Dropbox use Python3? It should not change a thing. I don't know whether Dropbox uses Python 3 but I don't think it will make any difference. Of course I have not yet started working at Dropbox so I will be able to give better answers in a few months. -- --Guido van Rossum (python.org/~guido) From andrew.svetlov at gmail.com Mon Dec 10 23:14:10 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Tue, 11 Dec 2012 00:14:10 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> Message-ID: On Mon, Dec 10, 2012 at 11:29 PM, Tim Delaney wrote: > On 11 December 2012 05:01, Armin Rigo wrote: >> >> Hi Philip, >> >> On Mon, Dec 10, 2012 at 5:16 PM, PJ Eby wrote: >> > On the other hand, this would also make a fast ordered dictionary >> > subclass possible, just by not using the free list for additions, >> > combined with periodic compaction before adds or after deletes. >> >> Technically, I could see Python switching to ordered dictionaries >> everywhere. 
Raymond's insight suddenly makes it easy for CPython and >> PyPy, and at least Jython could use the LinkedHashMap class (although >> this would need checking with Jython guys). I'd vaguely argue that >> dictionary orders are one of the few non-reproducible factors in a >> Python program, so it might be a good thing. But only vaguely --- >> maybe I got far too used to random orders over time... > > > Whilst I think Python should not move to ordered dictionaries everywhere, I > would say there is an argument (no pun intended) for making **kwargs a > dictionary that maintains insertion order *if there are no deletions*. It > sounds like we could get that for free with this implementation, although > from another post IronPython might not have something suitable. > Please, no. dict and kwargs should be unordered without any assumptions. > I think there are real advantages to doing so - a trivial one being the > ability to easily initialise an ordered dictionary from another ordered > dictionary. > It can be done with adding short-circuit for OrderedDict class to accept another OrderedDict instance. > I could also see an argument for having this property for all dicts. There > are many dictionaries that are never deleted from (probably most dict > literals) and it would be nice to have an iteration order for them that > matched the source code. > > However if deletions occur all bets would be off. If you need to maintain > insertion order in the face of deletions, use an explicit ordereddict. > > Tim Delaney > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com > -- Thanks, Andrew Svetlov From pje at telecommunity.com Tue Dec 11 00:05:04 2012 From: pje at telecommunity.com (PJ Eby) Date: Mon, 10 Dec 2012 18:05:04 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> Message-ID: On Mon, Dec 10, 2012 at 4:29 PM, Tim Delaney wrote: > Whilst I think Python should not move to ordered dictionaries everywhere, I > would say there is an argument (no pun intended) for making **kwargs a > dictionary that maintains insertion order *if there are no deletions*. Oooh. Me likey. There have been many times where I've wished kwargs were ordered when designing an API. (Oddly, I don't remember any one of the APIs specifically, so don't ask me for a good example. I just remember a bunch of different physical locations where I was when I thought, "Ooh, what if I could... no, that's not going to work.") One other useful place for ordered dictionaries is class definitions processed by class decorators: no need to write a metaclass just to know what order stuff was defined in. > It sounds like we could get that for free with this implementation, although > from another post IronPython might not have something suitable. Actually, IronPython may already have ordered dictionaries by default; see: http://mail.python.org/pipermail/ironpython-users/2006-May/002319.html It's described as an implementation detail that may change, perhaps that could be changed to being unchanging. ;-) > I think there are real advantages to doing so - a trivial one being the ability > to easily initialise an ordered dictionary from another ordered dictionary. 
Or to merge two of them together, either at creation or .update(). I'm really starting to wonder if it might not be worth paying the compaction overhead to just make all dictionaries ordered, all the time. The problem is that if you are always adding new keys and deleting old ones (as might be done in a LRU cache, a queue, or other things like that) you'll probably see a lot of compaction overhead compared to today's dicts. OTOH... with a good algorithm for deciding *when* to compact, we can actually make the amortized cost O(1) or so, so maybe that's not a big deal. The cost to do a compaction is at worst, the current size of the table. So if you wait until a table has twice as many entries (more precisely, until the index of the last entry is twice what it was at last compaction), you will amortize the compaction cost down to one entry move per add, or O(1). That would handle the case of a cache or queue, but I'm not sure how it would work with supersized dictionaries that are then deleted down to a fraction of their original size. I suppose if you delete your way down to half the entries being populated, then you end up with two moves per delete, or O(2). (Yes, I know that's not a valid O number.) So, offhand, it seems pretty doable, and unlikely to significantly change worst-case performance even for pathological use cases. (Like using a dict when you should be using a deque.) From pje at telecommunity.com Tue Dec 11 00:06:43 2012 From: pje at telecommunity.com (PJ Eby) Date: Mon, 10 Dec 2012 18:06:43 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> Message-ID: On Mon, Dec 10, 2012 at 5:14 PM, Andrew Svetlov wrote: > Please, no. dict and kwargs should be unordered without any assumptions. Why? From fwierzbicki at gmail.com Tue Dec 11 00:13:14 2012 From: fwierzbicki at gmail.com (fwierzbicki at gmail.com) Date: Mon, 10 Dec 2012 15:13:14 -0800 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> Message-ID: On Mon, Dec 10, 2012 at 10:01 AM, Armin Rigo wrote: > Technically, I could see Python switching to ordered dictionaries > everywhere. Raymond's insight suddenly makes it easy for CPython and > PyPy, and at least Jython could use the LinkedHashMap class (although > this would need checking with Jython guys). I honestly hope this doesn't happen - we use ConcurrentHashMap for our dictionaries (which lack ordering) and I'm sure getting it to preserve insertion order would cost us. -Frank From raymond.hettinger at gmail.com Tue Dec 11 00:17:57 2012 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 10 Dec 2012 18:17:57 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <50C593B2.9060203@python.org> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> Message-ID: <3A8E318B-93F9-4CB0-92DE-C5EC7E2DBA77@gmail.com> On Dec 10, 2012, at 2:48 AM, Christian Heimes wrote: > On the other hand every lookup and collision checks needs an additional > multiplication, addition and pointer dereferencing: > > entry = entries_baseaddr + sizeof(PyDictKeyEntry) * idx Currently, the dict implementation allows alternative lookup functions based on whether the keys are all strings. The choice of lookup function is stored in a function pointer. 
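In the pure-Python model it is roughly this (illustrative names only, not the exact ones dictobject.c uses):

def lookup_unicode(d, key, hashvalue):
    ...  # fast path: every stored key is an exact str

def lookup_general(d, key, hashvalue):
    ...  # general path: a key may run arbitrary __eq__ code

class SketchDict:
    def __init__(self):
        # selected once; rebound to lookup_general if a non-str key appears
        self.lookup = lookup_unicode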
That lets each lookup use the currently active lookup function without having to make any computations or branches. Likewise, the lookup functions could be swapped between char, short, long, and ulong index sizes during the resize step. IOW, the selection only needs to be made once per resize, not one per lookup. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From fwierzbicki at gmail.com Tue Dec 11 00:21:37 2012 From: fwierzbicki at gmail.com (fwierzbicki at gmail.com) Date: Mon, 10 Dec 2012 15:21:37 -0800 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> Message-ID: On Mon, Dec 10, 2012 at 3:13 PM, fwierzbicki at gmail.com wrote: > On Mon, Dec 10, 2012 at 10:01 AM, Armin Rigo wrote: >> Technically, I could see Python switching to ordered dictionaries >> everywhere. Raymond's insight suddenly makes it easy for CPython and >> PyPy, and at least Jython could use the LinkedHashMap class (although >> this would need checking with Jython guys). > I honestly hope this doesn't happen - we use ConcurrentHashMap for our > dictionaries (which lack ordering) and I'm sure getting it to preserve > insertion order would cost us. I just found this http://code.google.com/p/concurrentlinkedhashmap/ so maybe it wouldn't be all that bad. I still personally like the idea of leaving basic dict order undetermined (there is already an OrderedDict if you need it right?) But if ConcurrentLinkedHashMap is as good as is suggested on that page then Jython doesn't need to be the thing that blocks the argument. -Frank From raymond.hettinger at gmail.com Tue Dec 11 00:39:22 2012 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 10 Dec 2012 18:39:22 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <50C5A62B.3030001@hotpy.org> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C5A62B.3030001@hotpy.org> Message-ID: <6C5A898E-05EE-49F1-9266-7099CC10C09B@gmail.com> On Dec 10, 2012, at 4:06 AM, Mark Shannon wrote: >> Instead, the 24-byte entries should be stored in a >> dense table referenced by a sparse table of indices. > > What minimum size and resizing factor do you propose for the entries array? There are many ways to do this. I don't know which is best. The proof-of-concept code uses the existing list resizing logic. Another approach is to pre-allocate the two-thirds maximum (This is simple and fast but gives the smallest space savings.) A hybrid approach is to allocate in two steps (1/3 and then 2/3 if needed). Since hash tables aren't a new problem, there may already be published research on the best way to handle the entries array. Raymond -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From raymond.hettinger at gmail.com Tue Dec 11 00:44:49 2012 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 10 Dec 2012 18:44:49 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> Message-ID: <3ED1B2FB-05CB-4479-9D91-8E103A358097@gmail.com> On Dec 10, 2012, at 1:38 PM, PJ Eby wrote: > Without numbers it's hard to say for certain, but the advantage to > keeping ordered dictionaries a distinct type is that the standard > dictionary type can then get that extra bit of speed in exchange for > dropping the ordering requirement. I expect that dicts and OrderedDicts will remain separate for reasons of speed, space, and respect for people's mental models. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at hotpy.org Tue Dec 11 01:04:14 2012 From: mark at hotpy.org (Mark Shannon) Date: Tue, 11 Dec 2012 00:04:14 +0000 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <6C5A898E-05EE-49F1-9266-7099CC10C09B@gmail.com> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C5A62B.3030001@hotpy.org> <6C5A898E-05EE-49F1-9266-7099CC10C09B@gmail.com> Message-ID: <50C6787E.6010200@hotpy.org> On 10/12/12 23:39, Raymond Hettinger wrote: > > On Dec 10, 2012, at 4:06 AM, Mark Shannon > wrote: > >>> Instead, the 24-byte entries should be stored in a >>> dense table referenced by a sparse table of indices. >> >> What minimum size and resizing factor do you propose for the entries >> array? > > There are many ways to do this. I don't know which is best. > The proof-of-concept code uses the existing list resizing logic. > Another approach is to pre-allocate the two-thirds maximum What do you mean by maximum? > (This is simple and fast but gives the smallest space savings.) > A hybrid approach is to allocate in two steps (1/3 and then 2/3 > if needed). I think you need to do some more analysis on this. It is possible that there is some improvement to be had from your approach, but I don't think the improvements will be as large as you have claimed. I suspect that you may be merely trading performance for reduced memory use, which can be done much more easily by reducing the minimum size and increasing the load factor. Cheers, Mark. From raymond.hettinger at gmail.com Tue Dec 11 04:45:07 2012 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 10 Dec 2012 22:45:07 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <50C6787E.6010200@hotpy.org> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C5A62B.3030001@hotpy.org> <6C5A898E-05EE-49F1-9266-7099CC10C09B@gmail.com> <50C6787E.6010200@hotpy.org> Message-ID: On Dec 10, 2012, at 7:04 PM, Mark Shannon wrote: >> Another approach is to pre-allocate the two-thirds maximum >> (This is simple and fast but gives the smallest space savings.) > > What do you mean by maximum? A dict with an index table size of 8 gets resized when it is two-thirds full, so the maximum number of entries is 5. If you pre-allocate five entries for the initial dict, you've spent 5 * 24 bytes + 8 bytes for the indices for a total of 128 bytes. This compares to the current table of 8 * 24 bytes totaling 192 bytes. Many other strategies are possible. The proof-of-concept code uses the one employed by regular python lists. Their growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, .... 
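For reference, that sequence is easy to reproduce with a few lines of Python (this sketch just mirrors the over-allocation rule in list_resize() in Objects/listobject.c; it is not part of the posted recipe):

def overallocate(newsize):
    # CPython grows a list to newsize plus roughly 1/8 headroom
    return newsize + (newsize >> 3) + (3 if newsize < 9 else 6)

sizes = [0]
while sizes[-1] < 88:
    sizes.append(overallocate(sizes[-1] + 1))
print(sizes)    # [0, 4, 8, 16, 25, 35, 46, 58, 72, 88]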
This produces nice memory savings for entry lists.

If you have a suggested allocation pattern or other constructive suggestion, it would be welcome. Further sniping and unsubstantiated FUD would not.

Raymond
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From trent at snakebite.org Tue Dec 11 05:20:55 2012
From: trent at snakebite.org (Trent Nelson)
Date: Mon, 10 Dec 2012 23:20:55 -0500
Subject: [Python-Dev] More compact dictionaries with faster iteration
In-Reply-To: <20121210115317.3f63bfa9@resist.wooz.org> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <20121210102828.6b4d9a1e@limelight.wooz.org> <50C61028.3020200@python.org> <20121210115317.3f63bfa9@resist.wooz.org>
Message-ID: <20121211042054.GA23095@snakebite.org>

On Mon, Dec 10, 2012 at 08:53:17AM -0800, Barry Warsaw wrote:
> On Dec 10, 2012, at 05:39 PM, Christian Heimes wrote:
>
> >I had ARM platforms in mind, too. Unfortunately we don't have any ARM
> >platforms for testing and performance measurement. Even Trent's
> >Snakebite doesn't have ARM boxes. I've a first generation Raspberry Pi,
> >that's all.
> >Perhaps somebody (PSF ?) is willing to donate a couple of ARM boards to
> >Snakebite. I'm thinking of Raspberry Pi (ARMv6), Pandaboard (ARMv7
> >Cortex-A9) and similar.
>
> Suitable ARM boards can be had cheap, probably overwhelmed by labor costs of
> getting them up and running. I am not offering for *that*. ;)

If someone donates the hardware, I'll take care of everything else.

Trent.

From raymond.hettinger at gmail.com Tue Dec 11 05:59:31 2012
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 10 Dec 2012 23:59:31 -0500
Subject: [Python-Dev] More compact dictionaries with faster iteration
In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C55109.7040107@mrabarnett.plus.com> <9B40E85E-C809-475B-BF08-4E16274158C5@gmail.com>
Message-ID:

On Dec 10, 2012, at 4:02 AM, Serhiy Storchaka wrote:

> On 10.12.12 05:30, Raymond Hettinger wrote:
>> On Dec 9, 2012, at 10:03 PM, MRAB > > wrote:
>>> What happens when a key is deleted? Will that leave an empty slot in
>>> the entry array?
>> Yes. See the __delitem__() method in the pure python implementation
>> at http://code.activestate.com/recipes/578375
>
> You can move the last entry into the freed place. This will preserve the compactness of the entries array and simplify and speed up iteration and some other operations.
>
> def __delitem__(self, key, hashvalue=None):
>     if hashvalue is None:
>         hashvalue = hash(key)
>     found, i = self._lookup(key, hashvalue)
>     if found:
>         index = self.indices[i]
>         self.indices[i] = DUMMY
>         self.size -= 1
>         size = self.size          # position of the last entry
>         if index != size:
>             # move the last entry into the freed slot to keep the arrays dense
>             lasthash = self.hashlist[size]
>             lastkey = self.keylist[size]
>             found, j = self._lookup(lastkey, lasthash)
>             assert found
>             assert i != j
>             self.indices[j] = index
>             self.hashlist[index] = lasthash
>             self.keylist[index] = lastkey
>             self.valuelist[index] = self.valuelist[size]
>             index = size
>         self.hashlist[index] = UNUSED
>         self.keylist[index] = UNUSED
>         self.valuelist[index] = UNUSED
>     else:
>         raise KeyError(key)

That is a clever improvement. Thank you. Using your idea (plus some tweaks) cleans up the code a bit (simplifying iteration, simplifying the resizing logic, and eliminating the UNUSED constant). I'm updating the posted code to reflect your suggestion.

Thanks again,

Raymond
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From 76069016 at qq.com Tue Dec 11 07:08:27 2012 From: 76069016 at qq.com (=?utf-8?B?SXNtbA==?=) Date: Tue, 11 Dec 2012 14:08:27 +0800 Subject: [Python-Dev] where is the python "import" implemented Message-ID: An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Dec 11 08:16:27 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 11 Dec 2012 08:16:27 +0100 Subject: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no References: <3YL4HM367BzRYL@mail.python.org> Message-ID: <20121211081627.0f0235e1@pitrou.net> On Tue, 11 Dec 2012 03:05:19 +0100 (CET) gregory.p.smith wrote: > Using 'long double' to force this structure to be worst case aligned is no > longer required as of Python 2.5+ when the gc_refs changed from an int (4 > bytes) to a Py_ssize_t (8 bytes) as the minimum size is 16 bytes. > > The use of a 'long double' triggered a warning by Clang trunk's > Undefined-Behavior Sanitizer as on many platforms a long double requires > 16-byte alignment but the Python memory allocator only guarantees 8 byte > alignment. > > So our code would allocate and use these structures with technically improper > alignment. Though it didn't matter since the 'dummy' field is never used. > This silences that warning. > > Spelunking into code history, the double was added in 2001 to force better > alignment on some platforms and changed to a long double in 2002 to appease > Tru64. That issue should no loner be present since the upgrade from int to > Py_ssize_t where the minimum structure size increased to 16 (unless anyone > knows of a platform where ssize_t is 4 bytes?) What?? Every 32-bit platform has a 4 bytes ssize_t (and size_t). > We can probably get rid of the double and this union hack all together today. > That is a slightly more invasive change that can be left for later. How do you suggest to get rid of it? Some platforms still have strict alignment rules and we must enforce that PyObjects (*) are always aligned to the largest possible alignment, since a PyObject-derived struct can hold arbitrary C types. (*) GC-enabled PyObjects, anyway. Others will be naturally aligned thanks to the memory allocator. What's more, I think you shouldn't be doing this kind of change in a bugfix release. It might break compiled C extensions since you are changing some characteristics of object layout (although you would probably only break those extensions which access the GC header, which is probably not many of them). Resource consumption improvements generally go only into the next feature release. Regards Antoine. From solipsis at pitrou.net Tue Dec 11 08:21:30 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 11 Dec 2012 08:21:30 +0100 Subject: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no References: <3YL4HM367BzRYL@mail.python.org> <20121211081627.0f0235e1@pitrou.net> Message-ID: <20121211082130.69fbc6c0@pitrou.net> On Tue, 11 Dec 2012 08:16:27 +0100 Antoine Pitrou wrote: > On Tue, 11 Dec 2012 03:05:19 +0100 (CET) > gregory.p.smith wrote: > > Using 'long double' to force this structure to be worst case aligned is no > > longer required as of Python 2.5+ when the gc_refs changed from an int (4 > > bytes) to a Py_ssize_t (8 bytes) as the minimum size is 16 bytes. 
> > > > The use of a 'long double' triggered a warning by Clang trunk's
> > Undefined-Behavior Sanitizer as on many platforms a long double requires
> > 16-byte alignment but the Python memory allocator only guarantees 8 byte
> > alignment.
> >
> > So our code would allocate and use these structures with technically improper
> > alignment. Though it didn't matter since the 'dummy' field is never used.
> > This silences that warning.
> >
> > Spelunking into code history, the double was added in 2001 to force better
> > alignment on some platforms and changed to a long double in 2002 to appease
> > Tru64. That issue should no loner be present since the upgrade from int to
> > Py_ssize_t where the minimum structure size increased to 16 (unless anyone
> > knows of a platform where ssize_t is 4 bytes?)
>
> What?? Every 32-bit platform has a 4 bytes ssize_t (and size_t).
>
> > We can probably get rid of the double and this union hack all together today.
> > That is a slightly more invasive change that can be left for later.
>
> How do you suggest to get rid of it? Some platforms still have strict
> alignment rules and we must enforce that PyObjects (*) are always
> aligned to the largest possible alignment, since a PyObject-derived
> struct can hold arbitrary C types.

Ok, I hadn't seen your proposal. I find it reasonable:

"A more correct non-hacky alternative if any alignment issues are still found would be to use a compiler specific alignment declaration on the structure and determine which value to use at configure time."

However, the commit is still problematic, and I think it should be reverted. We can't remove the alignment hack just because it seems to be useless on x86(-64).

Regards

Antoine.

From mark at hotpy.org Tue Dec 11 09:41:32 2012
From: mark at hotpy.org (Mark Shannon)
Date: Tue, 11 Dec 2012 08:41:32 +0000
Subject: [Python-Dev] More compact dictionaries with faster iteration
In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C5A62B.3030001@hotpy.org> <6C5A898E-05EE-49F1-9266-7099CC10C09B@gmail.com> <50C6787E.6010200@hotpy.org>
Message-ID: <50C6F1BC.4040000@hotpy.org>

On 11/12/12 03:45, Raymond Hettinger wrote:
>
> On Dec 10, 2012, at 7:04 PM, Mark Shannon > wrote:
>
>>> Another approach is to pre-allocate the two-thirds maximum
>>> (This is simple and fast but gives the smallest space savings.)
>>
>> What do you mean by maximum?
>
> A dict with an index table size of 8 gets resized when it is two-thirds
> full,
Small dicts (size 0, 1, or 2) get the most benefit. """ is it a surprise that I am sceptical? I like you idea. I just don't want everyone getting their hopes up for what may turn out to be a fairly minor improvement. Don't forget Unladen Swallow :) Cheers, Mark. From martin at v.loewis.de Tue Dec 11 10:08:29 2012 From: martin at v.loewis.de (martin at v.loewis.de) Date: Tue, 11 Dec 2012 10:08:29 +0100 Subject: [Python-Dev] where is the python "import" implemented In-Reply-To: References: Message-ID: <20121211100829.Horde.v4gEfsL8999QxvgNpwKhnBA@webmail.df.eu> > ??????? in this situation, I can not find the source code how python > implement it. I test a wrong format pyc, and got a error "ImportError: bad > magic number"?and I search "bad magic number" in the source code,? I > find it is in importlib/_bootstrap.py(line 815)?but when I modify this > error info(eg: test bad magic) and run again, nothing is changed. It seems > that the file is not the correct position. This is the right position. When you change _bootstrap.py, you need to run "make" again, to freeze the modified _bootstrap.py. Regards, Martin From solipsis at pitrou.net Tue Dec 11 10:11:39 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 11 Dec 2012 10:11:39 +0100 Subject: [Python-Dev] where is the python "import" implemented References: Message-ID: <20121211101139.0db0826d@pitrou.net> Hello, Le Tue, 11 Dec 2012 14:08:27 +0800, "Isml" <76069016 at qq.com> a ?crit : > Hi, everyone, > I am testing modifying the pyc file when it is imported. As I > know, there is three situation: 1?runing in the python.exe > eg: python.exe test.pyc > in this situation, I find the source on line 1983 in file > pythonrun.c 2?import the pyc from a zip file > I find the source on line 1132 in zipimport.c > 3?do a normal import > eg: two file : main.py and testmodule.py > and in main.py: > import testmodule > > in this situation, I can not find the source code how python > implement it. I test a wrong format pyc, and got a error > "ImportError: bad magic number"?and I search "bad magic number" in > the source code, I find it is in importlib/_bootstrap.py(line > 815)?but when I modify this error info(eg: test bad magic) and run > again, nothing is changed. It seems that the file is not the correct > position. importlib/_bootstrap.py is indeed the place, but you need to run "make" once you have modified that file. _bootstrap.py is frozen into the executable at compile time, because otherwise the bootstrap issues are intractable. Regards Antoine. From solipsis at pitrou.net Tue Dec 11 10:13:31 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 11 Dec 2012 10:13:31 +0100 Subject: [Python-Dev] More compact dictionaries with faster iteration References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <3A8E318B-93F9-4CB0-92DE-C5EC7E2DBA77@gmail.com> Message-ID: <20121211101331.05087056@pitrou.net> Le Mon, 10 Dec 2012 18:17:57 -0500, Raymond Hettinger a ?crit : > > On Dec 10, 2012, at 2:48 AM, Christian Heimes > wrote: > > > On the other hand every lookup and collision checks needs an > > additional multiplication, addition and pointer dereferencing: > > > > entry = entries_baseaddr + sizeof(PyDictKeyEntry) * idx > > > Currently, the dict implementation allows alternative lookup > functions based on whether the keys are all strings. > The choice of lookup function is stored in a function pointer. 
> That lets each lookup use the currently active lookup > function without having to make any computations or branches. An indirect function call is technically a branch, as seen from the CPU (and not necessarily a very predictable one, although recent Intel CPUs are said to be quite good at that). Regards Antoine. From solipsis at pitrou.net Tue Dec 11 10:15:38 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 11 Dec 2012 10:15:38 +0100 Subject: [Python-Dev] More compact dictionaries with faster iteration References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C5A62B.3030001@hotpy.org> <6C5A898E-05EE-49F1-9266-7099CC10C09B@gmail.com> <50C6787E.6010200@hotpy.org> <50C6F1BC.4040000@hotpy.org> Message-ID: <20121211101538.50742ae1@pitrou.net> Le Tue, 11 Dec 2012 08:41:32 +0000, Mark Shannon a ?crit : > > > > If you have a suggested allocation pattern or other > > constructive suggestion, it would be would welcome. > It seems like a reasonable starting point. > Trying to avoid resizing the index array and the entries array at the > same time is probably a good idea. Why would you want to avoid that? If we want to allocate the dict's data as a single memory block (which saves a bit in memory consumption and also makes dict allocations faster), we need to resize both arrays at the same time. Regards Antoine. From 76069016 at qq.com Tue Dec 11 10:36:23 2012 From: 76069016 at qq.com (=?utf-8?B?SXNtbA==?=) Date: Tue, 11 Dec 2012 17:36:23 +0800 Subject: [Python-Dev] =?utf-8?b?5Zue5aSN77yaICB3aGVyZSBpcyB0aGUgcHl0aG9u?= =?utf-8?q?_=22import=22_implemented?= Message-ID: An HTML attachment was scrubbed... URL: From storchaka at gmail.com Tue Dec 11 12:12:10 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 11 Dec 2012 13:12:10 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C55109.7040107@mrabarnett.plus.com> <9B40E85E-C809-475B-BF08-4E16274158C5@gmail.com> Message-ID: Yet some comments about your Python implementation. 1. Don't use "is FREE" and "is DUMMY" as array doesn't preserve identity. 2. Wrong limits used in _make_index(): 128 overflows 'b', 65536 overflows 'h' and 'l' can be not enough for ssize_t. 3. round_upto_powtwo() can be implemented as 1 << n.bit_length(). 4. i * 5 faster than (i << 2) + i on Python. 5. You can get rid of "size" attribute and use len(self.keylist) instead. From regebro at gmail.com Tue Dec 11 16:23:37 2012 From: regebro at gmail.com (Lennart Regebro) Date: Tue, 11 Dec 2012 16:23:37 +0100 Subject: [Python-Dev] Draft PEP for time zone support. Message-ID: This PEP is also available on github: https://github.com/regebro/tz-pep/blob/master/pep-04tz.txt Text: PEP: 4?? Title: Time zone support improvements Version: $Revision$ Last-Modified: $Date$ Author: Lennart Regebro BDFL-Delegate: Barry Warsaw Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 11-Dec-2012 Post-History: 11-Dec-2012 Abstract ======== This PEP proposes the implementation of concrete time zone support in the Python standard library, and also improvements to the time zone API to deal with ambiguous time specifications during DST changes. Proposal ======== Concrete time zone support -------------------------- The time zone support in Python has no concrete implementation in the standard library, only a tzinfo baseclass, and since Python 3.2, one concrete time zone: UTC. 
To properly support time zones you need to include a database of all time zones, both current and historical, including daylight saving changes. But such information changes frequently, so even if we include the latest information in a Python release, that information would be outdated just a few months later.

Timezone support has therefore only been available through two third-party modules, ``pytz`` and ``dateutil``, both of which include and wrap the "zoneinfo" database. This database, also called "tz" or "the Olson database", is the de-facto standard time zone database, and it is included in most variants of Unix operating systems, including OS X.

This gives us the opportunity to include only the code that supports the zoneinfo data in the standard library, but by default use the operating system's copy of the data, which typically will be kept updated by the updating mechanism of the operating system or distribution.

For those who have an operating system that does not include the tz database, for example Windows, a distribution containing the latest tz database should also be available at the Python Package Index, so it can be easily installed with the Python packaging tools such as ``easy_install`` or ``pip``. This could also be done on Unices that are no longer receiving updates and therefore have an outdated database.

With such a mechanism Python would have full time zone support in the standard library on most platforms, and a simple package installation would provide time zone support on those platforms where the tz database isn't included, such as Windows.

The time zone support will be implemented by a new module called ``timezone``, based on Stuart Bishop's ``pytz`` module.

Getting the local time zone
---------------------------

On Unix there is no standard way of finding the name of the time zone that is being used. All the information that is available is the time zone abbreviations, such as ``EST`` and ``PDT``, but many of those abbreviations are ambiguous and therefore you can't rely on them to figure out which time zone you are located in.

There is however a standard for finding the compiled time zone information since it's located in ``/etc/localtime``. Therefore it is possible to create a local time zone object with the correct time zone information even though you don't know the name of the time zone. A function in ``datetime`` should be provided to return the local time zone.

The support for this will be made by integrating Lennart Regebro's ``tzlocal`` module into the new ``timezone`` module.

Ambiguous times
---------------

When changing over from daylight savings time the clock is turned back one hour. This means that the times during that hour happen twice, once with DST and then once without DST. Similarly, when changing to daylight savings time, one hour goes missing.

The current time zone API cannot differentiate between the two ambiguous times during a change from DST. For example, in Stockholm the time of 2012-10-28 02:00:00 happens twice, both at UTC 2012-10-28 00:00:00 and also at 2012-10-28 01:00:00.
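Expressed with the standard library's existing fixed-offset zones, purely for illustration (``timezone`` here is the stdlib class, not the module this PEP proposes)::

    >>> from datetime import datetime, timezone, timedelta
    >>> cest = timezone(timedelta(hours=2), "CEST")   # Stockholm during DST
    >>> cet = timezone(timedelta(hours=1), "CET")     # Stockholm after the change
    >>> datetime(2012, 10, 28, 2, 0, tzinfo=cest).astimezone(timezone.utc)
    datetime.datetime(2012, 10, 28, 0, 0, tzinfo=datetime.timezone.utc)
    >>> datetime(2012, 10, 28, 2, 0, tzinfo=cet).astimezone(timezone.utc)
    datetime.datetime(2012, 10, 28, 1, 0, tzinfo=datetime.timezone.utc)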
The current time zone API cannot disambiguate this and therefore it's unclear which time should be returned::

    # This could be either 00:00 or 01:00 UTC:
    >>> dt = datetime(2012, 10, 28, 2, 0, tzinfo=timezone('Europe/Stockholm'))
    # But we can not specify which:
    >>> dt.astimezone(timezone('UTC'))
    datetime.datetime(2012, 10, 28, 1, 0, tzinfo=<UTC>)

``pytz`` solved this problem by adding ``is_dst`` parameters to several methods of the tzinfo objects to make it possible to disambiguate times when this is desired.

This PEP proposes to add these ``is_dst`` parameters to the relevant methods of the ``datetime`` API, and therefore add this functionality directly to ``datetime``. This is likely the hardest part of this PEP as this involves updating the C implementation of the ``datetime`` module.

Implementation API
==================

The new ``timezone``-module
---------------------------

The public API of the new ``timezone``-module contains one new class, one new function and one new exception.

* New class: ``DstTzInfo``

  This class provides a concrete implementation of the ``zoneinfo`` base class that implements DST support.

* New function: ``get_timezone(name=None, db=None)``

  This function takes a name string that must specify a valid zoneinfo timezone, e.g. "US/Eastern", "Europe/Warsaw" or "Etc/GMT+11". If not given, the local timezone will be looked up. If an invalid zone name is given, or the local timezone cannot be retrieved, the function raises ``UnknownTimeZoneError``.

  The function also takes an optional path to the location of the zoneinfo database which should be used. If not specified, the function will check if the ``timezonedata`` module is installed, and then use that location or otherwise use the database in ``/usr/share/zoneinfo``.

  If no database is found an ``UnknownTimeZoneError`` or subclass thereof will be raised with a message explaining that no zoneinfo database can be found, but that you can install one with the ``timezonedata`` package.

* New exception: ``UnknownTimeZoneError``

  This exception is raised when giving a time zone specification that can't be found::

    >>> get_timezone('Europe/New_York')
    Traceback (most recent call last):
    ...
    UnknownTimeZoneError: There is no time zone called 'Europe/New_York'

Changes in the ``datetime``-module
----------------------------------

A new ``is_dst`` parameter is added to several of the ``tzinfo`` methods to handle time ambiguity during DST changeovers.

* ``tzinfo.utcoffset(self, dt, is_dst=True)``

* ``tzinfo.dst(self, dt, is_dst=True)``

* ``tzinfo.tzname(self, dt, is_dst=True)``

The ``is_dst`` parameter can be ``True`` (default), ``False``, or ``None``.

``True`` will specify that the given datetime should be interpreted as happening during daylight savings time, i.e. that the time specified is before the change from DST.

``False`` will specify that the given datetime should be interpreted as not happening during daylight savings time, i.e. that the time specified is after the change from DST.

``None`` will raise an ``AmbiguousTimeError`` exception if the time specified was during a DST changeover. It will also raise a ``NonExistentTimeError`` if a time is specified during the "missing time" in a change to DST.

There are also two new exceptions:

* ``AmbiguousTimeError``

  This exception is raised when giving a datetime specification that is ambiguous while ``is_dst`` is set to ``None``::

    >>> datetime(2012, 10, 28, 2, 0, tzinfo=timezone('Europe/Stockholm'), is_dst=None)
    Traceback (most recent call last):
    ...
    AmbiguousTimeError: 2012-10-28 02:00:00 is ambiguous in time zone Europe/Stockholm

* ``NonExistentTimeError``

  This exception is raised when giving a datetime specification that does not
  exist, while setting ``is_dst`` to ``None``::

    >>> datetime(2012, 3, 25, 2, 0, tzinfo=timezone('Europe/Stockholm'), is_dst=None)
    Traceback (most recent call last):
    ...
    NonExistentTimeError: 2012-03-25 02:00:00 does not exist in time zone Europe/Stockholm

The ``timezonedata``-package
-----------------------------

The zoneinfo database will be packaged for easy installation with
``easy_install``/``pip``/``buildout``. This package will not install any
Python code, and will not contain any Python code except that which is needed
for installation.

Differences from the ``pytz`` API
=================================

* ``pytz`` has the functions ``localize()`` and ``normalize()`` to work
  around the fact that ``tzinfo`` doesn't have ``is_dst``. When ``is_dst`` is
  implemented directly in ``datetime.tzinfo`` they are no longer needed.

* The ``pytz`` function ``timezone()`` is instead called ``get_timezone()``
  for clarity.

* ``get_timezone()`` will return the local time zone if called without
  parameters.

* The class ``pytz.StaticTzInfo`` is there to provide the ``is_dst`` support
  for static timezones. When ``is_dst`` support is included in
  ``datetime.tzinfo`` it is no longer needed.

Discussion
==========

Should the Windows installer include the data package?
------------------------------------------------------

It has been suggested that the Windows installer should include the data
package. This would mean that an explicit installation no longer would be
needed on Windows. On the other hand, that would mean that many Windows users
would not be aware that the database quickly becomes outdated and would not
keep it updated.

Resources
=========

* http://pytz.sourceforge.net/
* http://pypi.python.org/pypi/tzlocal
* http://pypi.python.org/pypi/python-dateutil

Copyright
=========

This document has been placed in the public domain.

From dirkjan at ochtman.nl Tue Dec 11 16:39:33 2012
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Tue, 11 Dec 2012 16:39:33 +0100
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: References: Message-ID: 

On Tue, Dec 11, 2012 at 4:23 PM, Lennart Regebro wrote:
> Proposal
> ========
>
> The time zone support will be implemented by a new module called ``timezone``,
> based on Stuart Bishop's ``pytz`` module.

I wonder if there needs to be something here about how to port from
pytz to the new timezone library.

> * New function: ``get_timezone(name=None, db=None)``
>
> This function takes a name string that must be a string specifying a
> valid zoneinfo timezone, ie "US/Eastern", "Europe/Warsaw" or "Etc/GMT+11".
> If not given, the local timezone will be looked up. If an invalid zone name
> is given, or the local timezone can not be retrieved, the function raises
> ``UnknownTimeZoneError``.
>
> The function also takes an optional path to the location of the zoneinfo
> database which should be used. If not specified, the function will check if
> the ``timezonedata`` module is installed, and then use that location
> or otherwise
> use the database in ``/usr/share/zoneinfo``.
>
> If no database is found an ``UnknownTimeZoneError`` or subclass thereof will
> be raised with a message explaining that no zoneinfo database can be found,
> but that you can install one with the ``timezonedata`` package.
It seems like calling get_timezone() with an unknown timezone should
just throw ValueError, not necessarily some custom Exception? It would
probably be a good idea to have a different exception for the case of
no database available.

> Differences from the ``pytz`` API
> =================================
>
> * ``pytz`` has the functions ``localize()`` and ``normalize()`` to work
> around the fact that ``tzinfo`` doesn't have ``is_dst``. When ``is_dst`` is
> implemented directly in ``datetime.tzinfo`` they are no longer needed.
>
> * The ``pytz`` function ``timezone()`` is instead called
> ``get_timezone()`` for clarity.
>
> * ``get_timezone()`` will return the local time zone if called
> without parameters.
>
> * The class ``pytz.StaticTzInfo`` is there to provide the ``is_dst``
> support for static
> timezones. When ``is_dst`` support is included in
> ``datetime.tzinfo`` it is no longer needed.

This feels a bit superfluous. Why not keep a bit more of the pytz API
to make porting easy? The pytz API has proven itself in the wild, so I
don't see much point in renaming "for clarity". It also seems
relatively painless to keep localize() and normalize() functions
around for easy porting.

> Discussion
> ==========
>
> Should the Windows installer include the data package?
> ------------------------------------------------------
>
> It has been suggested that the Windows installer should include the data
> package. This would mean that an explicit installation no longer would be
> needed on Windows. On the other hand, that would mean that many Windows users
> would not be aware that the database quickly becomes outdated and would not
> keep it updated.

I still submit that it's pretty much just as easy to forget to update
the database whether it's been installed by hand zero or one times, so
I don't find your argument convincing. I don't mind the result much,
though.

Looking forward to having timezone support in the stdlib!

Cheers,

Dirkjan

From p.f.moore at gmail.com Tue Dec 11 16:48:18 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 11 Dec 2012 15:48:18 +0000
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: References: Message-ID: 

On 11 December 2012 15:39, Dirkjan Ochtman wrote:
>> Should the Windows installer include the data package?
>> ------------------------------------------------------
>>
>> It has been suggested that the Windows installer should include the data
>> package. This would mean that an explicit installation no longer would be
>> needed on Windows. On the other hand, that would mean that many Windows users
>> would not be aware that the database quickly becomes outdated and would not
>> keep it updated.
>
> I still submit that it's pretty much just as easy to forget to update
> the database whether it's been installed by hand zero or one times, so
> I don't find your argument convincing. I don't mind the result much,
> though.

I agree. Also, in corporate or similar environments where each
individual package installation must be approved, having at least some
timezone data in the base install ensures that all Python code can
assume the *existence* of timezone support (if not necessarily the
accuracy of that data).

If the base Windows installer does not include timezone data, then the
documentation should note this and offer advice on how to write code
that degrades gracefully without timezones.
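A minimal sketch of what such "degrade gracefully" advice could look like,
assuming the draft's proposed ``timezone`` module and its
``UnknownTimeZoneError`` exception (neither exists yet, so both names are
illustrative only)::

    from datetime import datetime, timezone as fixed_tz

    def best_effort_timezone(name):
        # Prefer real zoneinfo data; fall back to fixed UTC when the
        # proposed module or its database is missing.
        try:
            # Hypothetical names taken from the draft PEP above.
            from timezone import get_timezone, UnknownTimeZoneError
        except ImportError:
            return fixed_tz.utc
        try:
            return get_timezone(name)
        except UnknownTimeZoneError:
            # No zoneinfo database installed, or unknown zone name.
            return fixed_tz.utc

    print(datetime.now(best_effort_timezone('Europe/Stockholm')))

The fallback is plain UTC from the existing ``datetime.timezone`` class, so
such code keeps working (with reduced accuracy) on a bare Windows install.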
If the base installer *does* include timezone data, of course, there
should be a documented mechanism for updating it (we don't want magic
like the old xml package used, I assume).

Paul.

From brian at python.org Tue Dec 11 16:58:12 2012
From: brian at python.org (Brian Curtin)
Date: Tue, 11 Dec 2012 09:58:12 -0600
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: References: Message-ID: 

On Tue, Dec 11, 2012 at 9:48 AM, Paul Moore wrote:
> On 11 December 2012 15:39, Dirkjan Ochtman wrote:
>>> Should the Windows installer include the data package?
>>> ------------------------------------------------------
>>>
>>> It has been suggested that the Windows installer should include the data
>>> package. This would mean that an explicit installation no longer would be
>>> needed on Windows. On the other hand, that would mean that many Windows users
>>> would not be aware that the database quickly becomes outdated and would not
>>> keep it updated.
>>
>> I still submit that it's pretty much just as easy to forget to update
>> the database whether it's been installed by hand zero or one times, so
>> I don't find your argument convincing. I don't mind the result much,
>> though.
>
> I agree. Also, in corporate or similar environments where each
> individual package installation must be approved, having at least some
> timezone data in the base install ensures that all Python code can
> assume the *existence* of timezone support (if not necessarily the
> accuracy of that data).
>
> If the base Windows installer does not include timezone data, then the
> documentation should note this and offer advice on how to write code
> that degrades gracefully without timezones.
>
> If the base installer *does* include timezone data, of course, there
> should be a documented mechanism for updating it (we don't want magic
> like the old xml package used, I assume).

I think we should try to get the data into the base installer and then
include a small updater, perhaps putting it in a Windows scheduled task
and checking PyPI periodically for newer versions. If a new one comes
up, prompt if the user wants it.

From solipsis at pitrou.net Tue Dec 11 17:07:55 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 11 Dec 2012 17:07:55 +0100
Subject: [Python-Dev] Draft PEP for time zone support.
References: Message-ID: <20121211170755.3e0a7f77@pitrou.net>

On Tue, 11 Dec 2012 16:23:37 +0100, Lennart Regebro wrote:
>
> Changes in the ``datetime``-module
> --------------------------------------
>
> A new ``is_dst`` parameter is added to several of the `tzinfo`
> methods to handle time ambiguity during DST changeovers.
>
> * ``tzinfo.utcoffset(self, dt, is_dst=True)``
>
> * ``tzinfo.dst(self, dt, is_dst=True)``
>
> * ``tzinfo.tzname(self, dt, is_dst=True)``
>
> The ``is_dst`` parameter can be ``True`` (default), ``False``, or
> ``None``.
>
> ``True`` will specify that the given datetime should be interpreted
> as happening during daylight savings time, ie that the time specified
> is before the change from DST.

Why is it True by default? Do we have statistics showing that Python
gets more use in summer?

Regards

Antoine.
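For concreteness, this is how the existing ``pytz`` flag behaves for the
ambiguous hour the draft uses as its example — a sketch against pytz's
documented ``localize()`` API (in pytz the flag lives on ``localize()``
rather than on the ``tzinfo`` methods the draft proposes)::

    from datetime import datetime
    import pytz

    tz = pytz.timezone('Europe/Stockholm')
    wall = datetime(2012, 10, 28, 2, 30)  # naive wall clock time, occurs twice

    print(tz.localize(wall, is_dst=True).utcoffset())   # 2:00:00 (CEST, first pass)
    print(tz.localize(wall, is_dst=False).utcoffset())  # 1:00:00 (CET, second pass)

    try:
        tz.localize(wall, is_dst=None)  # refuse to guess
    except pytz.exceptions.AmbiguousTimeError as exc:
        print("ambiguous:", exc)

Outside the changeover hours the flag makes no difference, which is why the
choice of default only matters for these edge cases.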
From ned at nedbatchelder.com Tue Dec 11 17:45:48 2012
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Tue, 11 Dec 2012 11:45:48 -0500
Subject: [Python-Dev] Do more at compile time; less at runtime
In-Reply-To: <50C50F1B.2080005@hotpy.org>
References: <50C50F1B.2080005@hotpy.org>
Message-ID: <50C7633C.2040104@nedbatchelder.com>

On 12/9/2012 5:22 PM, Mark Shannon wrote:
> The current CPython bytecode interpreter is rather more complex than
> it needs to be. A number of bytecodes could be eliminated and a few
> more simplified by moving the work involved in handling compound
> statements (loops, try-blocks, etc) from the interpreter to the compiler.

As with all suggestions to optimize the bytecode generation, I'd like to
re-iterate the need for a way to disable all optimization, for the sake of
reasoning about the program. For example, debugging, coverage measurement,
etc. This idea was misunderstood and defeated in
http://bugs.python.org/issue2506, but I strongly believe it is important.

--Ned.

From guido at python.org Tue Dec 11 17:59:58 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 11 Dec 2012 08:59:58 -0800
Subject: [Python-Dev] Do more at compile time; less at runtime
In-Reply-To: <50C7633C.2040104@nedbatchelder.com>
References: <50C50F1B.2080005@hotpy.org> <50C7633C.2040104@nedbatchelder.com>
Message-ID: 

+1

On Dec 11, 2012 8:47 AM, "Ned Batchelder" wrote:

> On 12/9/2012 5:22 PM, Mark Shannon wrote:
>
>> The current CPython bytecode interpreter is rather more complex than it
>> needs to be. A number of bytecodes could be eliminated and a few more
>> simplified by moving the work involved in handling compound statements
>> (loops, try-blocks, etc) from the interpreter to the compiler.
>>
>
> As with all suggestions to optimize the bytecode generation, I'd like to
> re-iterate the need for a way to disable all optimization, for the sake of
> reasoning about the program. For example, debugging, coverage measurement,
> etc. This idea was misunderstood and defeated in
> http://bugs.python.org/issue2506, but I strongly believe it is important.
>
> --Ned.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dinov at microsoft.com Tue Dec 11 20:37:13 2012
From: dinov at microsoft.com (Dino Viehland)
Date: Tue, 11 Dec 2012 19:37:13 +0000
Subject: [Python-Dev] More compact dictionaries with faster iteration
In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net>
Message-ID: 

PJ wrote:
> Actually, IronPython may already have ordered dictionaries by default; see:
>
> http://mail.python.org/pipermail/ironpython-users/2006-May/002319.html
>
> It's described as an implementation detail that may change, perhaps that
> could be changed to being unchanging. ;-)
>

I think this has changed since 2006. IronPython was originally using the
.NET dictionary class and just locking while using it, but it now has a
custom dictionary which is thread safe for multiple readers and allows 1
writer. But it doesn't do anything to preserve order of insertions.

OTOH changing certain dictionaries in IronPython (such as keyword args) to
be ordered would certainly be possible.
Personally I just wouldn't want to see it be the default as that seems like
unnecessary overhead when the specialized class exists.

From barry at python.org Tue Dec 11 21:31:37 2012
From: barry at python.org (Barry Warsaw)
Date: Tue, 11 Dec 2012 15:31:37 -0500
Subject: [Python-Dev] Draft PEP for time zone support.
References: Message-ID: <20121211153137.29619bf6@resist.wooz.org>

On Dec 11, 2012, at 04:23 PM, Lennart Regebro wrote:

>This PEP is also available on github:
>
>https://github.com/regebro/tz-pep/blob/master/pep-04tz.txt

wget returns some html gobbledygook. Why-oh-why github?!

>PEP: 4??

I've assigned this PEP 431, reformatted a few extra wide paragraphs, committed
and pushed.

Thanks Lennart!
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From donald.stufft at gmail.com Tue Dec 11 21:34:45 2012
From: donald.stufft at gmail.com (Donald Stufft)
Date: Tue, 11 Dec 2012 15:34:45 -0500
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: <20121211153137.29619bf6@resist.wooz.org>
References: <20121211153137.29619bf6@resist.wooz.org>
Message-ID: 

On Tuesday, December 11, 2012 at 3:31 PM, Barry Warsaw wrote:
> On Dec 11, 2012, at 04:23 PM, Lennart Regebro wrote:
>
> > This PEP is also available on github:
> >
> > https://github.com/regebro/tz-pep/blob/master/pep-04tz.txt
>
> wget returns some html gobbledygook. Why-oh-why github?!

wget https://raw.github.com/regebro/tz-pep/master/pep-04tz.txt

> > PEP: 4??
>
> I've assigned this PEP 431, reformatted a few extra wide paragraphs, committed
> and pushed.
>
> Thanks Lennart!
> -Barry
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From bwmaister at gmail.com Tue Dec 11 21:37:01 2012
From: bwmaister at gmail.com (Brandon W Maister)
Date: Tue, 11 Dec 2012 15:37:01 -0500
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: <20121211153137.29619bf6@resist.wooz.org>
References: <20121211153137.29619bf6@resist.wooz.org>
Message-ID: 

Barry you want github raw:
https://raw.github.com/regebro/tz-pep/master/pep-04tz.txt

On Tue, Dec 11, 2012 at 3:31 PM, Barry Warsaw wrote:

> On Dec 11, 2012, at 04:23 PM, Lennart Regebro wrote:
>
> >This PEP is also available on github:
> >
> >https://github.com/regebro/tz-pep/blob/master/pep-04tz.txt
>
> wget returns some html gobbledygook. Why-oh-why github?!
>
> >PEP: 4??
>
> I've assigned this PEP 431, reformatted a few extra wide paragraphs,
> committed
> and pushed.
>
> Thanks Lennart!
> -Barry
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guido at python.org Wed Dec 12 01:11:11 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 11 Dec 2012 16:11:11 -0800
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: <20121211170755.3e0a7f77@pitrou.net>
References: <20121211170755.3e0a7f77@pitrou.net>
Message-ID: 

On Tue, Dec 11, 2012 at 8:07 AM, Antoine Pitrou wrote:
> On Tue, 11 Dec 2012 16:23:37 +0100, Lennart Regebro wrote:
>>
>> Changes in the ``datetime``-module
>> --------------------------------------
>>
>> A new ``is_dst`` parameter is added to several of the `tzinfo`
>> methods to handle time ambiguity during DST changeovers.
>>
>> * ``tzinfo.utcoffset(self, dt, is_dst=True)``
>>
>> * ``tzinfo.dst(self, dt, is_dst=True)``
>>
>> * ``tzinfo.tzname(self, dt, is_dst=True)``
>>
>> The ``is_dst`` parameter can be ``True`` (default), ``False``, or
>> ``None``.
>>
>> ``True`` will specify that the given datetime should be interpreted
>> as happening during daylight savings time, ie that the time specified
>> is before the change from DST.
>
> Why is it True by default? Do we have statistics showing that Python
> gets more use in summer?

My question exactly.

The rest sounds good -- definitely use the system tz database on Unixy
systems, pre-install on Windows and make updating easy. Some bikeshedding
about static I don't really understand, so I'll leave that to others.

-- 
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com Wed Dec 12 01:15:34 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Dec 2012 10:15:34 +1000
Subject: [Python-Dev] More compact dictionaries with faster iteration
In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net>
Message-ID: 

On Wed, Dec 12, 2012 at 5:37 AM, Dino Viehland wrote:

> OTOH changing certain dictionaries in IronPython (such as keyword args) to
> be
> ordered would certainly be possible. Personally I just wouldn't want to
> see it
> be the default as that seems like unnecessary overhead when the specialized
> class exists.
>

Which reminds me, I was going to note that one of the main gains with
ordered keyword arguments is their use in the construction of
string-keyed objects where you want to be able to control the order of
iteration (e.g. for serialisation or display purposes). Currently you
have to go the path of something like namedtuple where you define the
order of iteration in one operation, and set the values in another.

Initialising an ordered dict itself is one obvious use case, but anything
else where you want to control the iteration order *and* set field names
and values in a single call will potentially benefit.

Independently of that, I'll note that this change would make it possible
to add a .sort() method to dictionaries. Any subsequent mutation of the
dictionary would require resorting, though (which isn't really all that
different from maintaining a sorted list).

The performance impact definitely needs to be benchmarked though, as the
need to read two memory locations rather than one for a dictionary read
could have weird caching effects. (Fortunately, many of the benchmarks run
on Python 3.3 now, so it should be possible to get that data fairly easily)

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ncoghlan at gmail.com Wed Dec 12 01:58:12 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Dec 2012 10:58:12 +1000
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: References: Message-ID: 

On Wed, Dec 12, 2012 at 1:23 AM, Lennart Regebro wrote:

> Abstract
> ========
>
> This PEP proposes the implementation of concrete time zone support in the
> Python standard library, and also improvements to the time zone API to deal
> with ambiguous time specifications during DST changes.
>

Thanks for tackling this one, Lennart.

> Proposal
> ========
>
> Concrete time zone support
> --------------------------
>
> The time zone support in Python has no concrete implementation in the
> standard library, only a tzinfo baseclass, and since Python 3.2, one
> concrete
> time zone: UTC.

This isn't quite right - the current concrete timezones support any fixed
offset from UTC, not just UTC itself.
http://docs.python.org/3/library/datetime#timezone-objects

(Although there are a couple of bugs in those docs at the moment:
http://bugs.python.org/issue16667)

> The time zone support will be implemented by a new module called
> ``timezone``,
> based on Stuart Bishop's ``pytz`` module.
>

Ick, why a new module? Why not just add this directly to datetime? (It
doesn't need to be provided by the C accelerator, it can go straight in
the pure Python part).

> This PEP proposes to add these ``is_dst`` parameters to the relevant
> methods
> of the ``datetime`` API, and therefore add this functionality directly to
> ``datetime``. This is likely the hardest part of this PEP as this
> involves updating
> the
>

Missing the end of this sentence...

> The ``timezonedata``-package
> -----------------------------
>
> The zoneinfo database will be packaged for easy installation with
> ``easy_install``/``pip``/``buildout``. This package will not install any
> Python code, and will not contain any Python code except that which is
> needed
> for installation.
>

I'd prefer a more aggressive name for this like "tzdata_override". My
rationale is that *nix users need to be thoroughly aware that if they
install this package, they will stop benefiting from the automatic tz
database updates provided by their OS (especially if they install it
into the system site packages on a distro that has migrated to Python 3
for system tools).

Such a name would also make it possible to provide *two* packaged
databases, one checked before the OS data (tzdata_override), and one
shipped with Python itself that is used only if the OS doesn't provide the
timezone database (tzdata_fallback). tzdata_fallback would then be updated
to the latest Olson database for each maintenance release. Cross-platform
applications that wanted more reliably up to date timezone data could then
conditionally depend on tzdata_override for Windows deployments (using the
environment marker support in metadata 1.2+).

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From fumanchu at aminus.org Wed Dec 12 02:11:20 2012
From: fumanchu at aminus.org (Robert Brewer)
Date: Tue, 11 Dec 2012 17:11:20 -0800
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: References: <20121211170755.3e0a7f77@pitrou.net>
Message-ID: 

Guido van Rossum wrote:
> Sent: Tuesday, December 11, 2012 4:11 PM
> To: Antoine Pitrou
> Cc: python-dev at python.org
> Subject: Re: [Python-Dev] Draft PEP for time zone support.
> > On Tue, Dec 11, 2012 at 8:07 AM, Antoine Pitrou
> wrote:
> > On Tue, 11 Dec 2012 16:23:37 +0100, Lennart Regebro wrote:
> >>
> >> Changes in the ``datetime``-module
> >> --------------------------------------
> >>
> >> A new ``is_dst`` parameter is added to several of the `tzinfo`
> >> methods to handle time ambiguity during DST changeovers.
> >>
> >> * ``tzinfo.utcoffset(self, dt, is_dst=True)``
> >>
> >> * ``tzinfo.dst(self, dt, is_dst=True)``
> >>
> >> * ``tzinfo.tzname(self, dt, is_dst=True)``
> >>
> >> The ``is_dst`` parameter can be ``True`` (default), ``False``, or
> >> ``None``.
> >>
> >> ``True`` will specify that the given datetime should be interpreted
> >> as happening during daylight savings time, ie that the time
> specified
> >> is before the change from DST.
> >
> > Why is it True by default? Do we have statistics showing that Python
> > gets more use in summer?
>
> My question exactly.

"Summer" in the USA, at least, is 238 days in 2012, while "Winter" into
2013 is only 126 days:

>>> import datetime
>>> datetime.date(2012, 11, 4) - datetime.date(2012, 3, 11)
datetime.timedelta(238)
>>> datetime.date(2013, 3, 10) - datetime.date(2012, 11, 4)
datetime.timedelta(126)

Robert Brewer
fumanchu at aminus.org

From barry at python.org Wed Dec 12 03:50:49 2012
From: barry at python.org (Barry Warsaw)
Date: Tue, 11 Dec 2012 21:50:49 -0500
Subject: [Python-Dev] Draft PEP for time zone support.
References: Message-ID: <20121211215049.0e872633@resist.wooz.org>

Great work, Lennart. I really like this PEP. Feedback follows (I haven't
yet read the rest of the messages in this thread ;).

On Dec 11, 2012, at 04:23 PM, Lennart Regebro wrote:

>This PEP proposes to add these ``is_dst`` parameters to the relevant methods
>of the ``datetime`` API, and therefore add this functionality directly to
>``datetime``. This is likely the hardest part of this PEP as this
>involves updating
>the

Oops, something got cut off there.

>The new ``timezone``-module
>---------------------------
>
>The public API of the new ``timezone``-module contains one new class, one new
>function and one new exception.

Why add a new module instead of putting all this into the existing datetime
module, either directly or as a submodule? Seems like the obvious place to
put it instead of claiming another top-level module name.

>* New class: ``DstTzInfo``
>
> This class provides a concrete implementation of the ``zoneinfo`` base
> class that implements DST support.

Is this a subclass of datetime.tzinfo?

>* New function: ``get_timezone(name=None, db=None)``
>
> This function takes a name string that must be a string specifying a
> valid zoneinfo timezone, ie "US/Eastern", "Europe/Warsaw" or "Etc/GMT+11".
> If not given, the local timezone will be looked up. If an invalid zone name
> is given, or the local timezone can not be retrieved, the function raises
> ``UnknownTimeZoneError``.
>
> The function also takes an optional path to the location of the zoneinfo
> database which should be used. If not specified, the function will check if
> the ``timezonedata`` module is installed, and then use that location or
> otherwise use the database in ``/usr/share/zoneinfo``.

I'm bikeshedding, but can we find a better name than `db` for the second
argument? Something that makes it obvious we're looking for a file system
path?

>* New Exception: ``UnknownTimeZoneError``

I'd really like to see a TimeZoneError base class from which all these new
exceptions inherit.
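One shape that hierarchy could take — a sketch only, nothing the draft
currently specifies:

    class TimeZoneError(Exception):
        """Base class for all time zone related errors."""

    class UnknownTimeZoneError(TimeZoneError):
        """The zone name is not in the database, or no database was found."""

    class AmbiguousTimeError(TimeZoneError):
        """The wall-clock time occurs twice and is_dst=None was passed."""

    class NonExistentTimeError(TimeZoneError):
        """The wall-clock time was skipped and is_dst=None was passed."""

    # Callers can then catch the whole family at once:
    try:
        raise AmbiguousTimeError("2012-10-28 02:00:00 is ambiguous")
    except TimeZoneError as exc:
        print("time zone problem:", exc)

Making ``UnknownTimeZoneError`` additionally inherit from ``ValueError``
would also address Dirkjan's earlier suggestion, but that is a separate
design choice.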
>A new ``is_dst`` parameter is added to several of the `tzinfo` methods to
>handle time ambiguity during DST changeovers.
>
>* ``tzinfo.utcoffset(self, dt, is_dst=True)``

I lied a little bit - I did skim the other messages, so I'll reserve comment
on the default value of is_dst for follow ups.

>* ``AmbiguousTimeError``
>
>* ``NonExistentTimeError``

I'm not positive we need separate exceptions here, but I guess it can't hurt,
and with the base class idea above, we can catch both either explicitly, or
by catching the base class.

>
>The ``timezonedata``-package
>-----------------------------

Just to be clear, this doesn't expose any new modules, right?

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From barry at python.org Wed Dec 12 03:54:21 2012
From: barry at python.org (Barry Warsaw)
Date: Tue, 11 Dec 2012 21:54:21 -0500
Subject: [Python-Dev] Draft PEP for time zone support.
References: Message-ID: <20121211215421.3da8c466@resist.wooz.org>

On Dec 11, 2012, at 03:48 PM, Paul Moore wrote:

>I agree. Also, in corporate or similar environments where each
>individual package installation must be approved, having at least some
>timezone data in the base install ensures that all Python code can
>assume the *existence* of timezone support (if not necessarily the
>accuracy of that data).

One other thing that the PEP should describe is what happens on a distro that
has timezone data, but on which you also pip install the PyPI tzdata package.
Which one wins? Is there a way to control it, other than providing an
explicit path? Is there a way to uninstall the PyPI package? Does the API
need to provide a method which tells you where the database it is using by
default lives?

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From barry at python.org Wed Dec 12 03:59:24 2012
From: barry at python.org (Barry Warsaw)
Date: Tue, 11 Dec 2012 21:59:24 -0500
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: References: <20121211153137.29619bf6@resist.wooz.org>
Message-ID: <20121211215924.0154e24b@resist.wooz.org>

On Dec 11, 2012, at 03:37 PM, Brandon W Maister wrote:

>Barry you want github raw:
>https://raw.github.com/regebro/tz-pep/master/pep-04tz.txt

I found that out. I was mostly just complaining. ;)

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From barry at python.org Wed Dec 12 03:57:24 2012
From: barry at python.org (Barry Warsaw)
Date: Tue, 11 Dec 2012 21:57:24 -0500
Subject: [Python-Dev] Draft PEP for time zone support.
References: <20121211153137.29619bf6@resist.wooz.org>
Message-ID: <20121211215724.117f8d27@resist.wooz.org>

On Dec 11, 2012, at 03:31 PM, Barry Warsaw wrote:

>I've assigned this PEP 431, reformatted a few extra wide paragraphs, committed
>and pushed.

Unfortunately, it looks like the online PEP updater isn't working.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From barry at python.org Wed Dec 12 03:58:49 2012
From: barry at python.org (Barry Warsaw)
Date: Tue, 11 Dec 2012 21:58:49 -0500
Subject: [Python-Dev] Draft PEP for time zone support.
References: Message-ID: <20121211215849.419276ec@resist.wooz.org>

On Dec 11, 2012, at 04:23 PM, Lennart Regebro wrote:

>A new ``is_dst`` parameter is added to several of the `tzinfo` methods to
>handle time ambiguity during DST changeovers.

>``None`` will raise an ``AmbiguousTimeError`` exception if the time specified
>was during a DST change over. It will also raise a ``NonExistentTimeError``
>if a time is specified during the "missing time" in a change to DST.

I think None should be the default.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From ncoghlan at gmail.com Wed Dec 12 04:14:23 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Dec 2012 13:14:23 +1000
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: <20121211215849.419276ec@resist.wooz.org>
References: <20121211215849.419276ec@resist.wooz.org>
Message-ID: 

On Wed, Dec 12, 2012 at 12:58 PM, Barry Warsaw wrote:

> On Dec 11, 2012, at 04:23 PM, Lennart Regebro wrote:
>
> >A new ``is_dst`` parameter is added to several of the `tzinfo` methods to
> >handle time ambiguity during DST changeovers.
>
> >``None`` will raise an ``AmbiguousTimeError`` exception if the time
> specified
> >was during a DST change over. It will also raise a
> ``NonExistentTimeError``
> >if a time is specified during the "missing time" in a change to DST.
>
> I think None should be the default.
>

That's a backwards compatibility risk, though - many applications are
likely coping just fine with the slightly corrupted time values, but would
fall over if an exception was raised instead. The default should probably
be chosen so that the single argument form of these calls continues to
behave the same in 3.4 as it does in 3.3, emitting a DeprecationWarning to
say that the default behaviour is going to change in 3.5 (so the *actual*
default would be a sentinel value, in order to tell the difference between
an explicit True being passed and relying on the default behaviour).

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From barry at python.org Wed Dec 12 04:19:30 2012
From: barry at python.org (Barry Warsaw)
Date: Tue, 11 Dec 2012 22:19:30 -0500
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: References: <20121211215849.419276ec@resist.wooz.org>
Message-ID: <20121211221930.7df170ac@resist.wooz.org>

On Dec 12, 2012, at 01:14 PM, Nick Coghlan wrote:

>That's a backwards compatibility risk, though - many applications are
>likely coping just fine with the slightly corrupted time values, but would
>fall over if an exception was raised instead. The default should probably
>be chosen so that the single argument form of these calls continues to
>behave the same in 3.4 as it does in 3.3, emitting a DeprecationWarning to
>say that the default behaviour is going to change in 3.5 (so the *actual*
>default would be a sentinel value, in order to tell the difference between
>an explicit True being passed and relying on the default behaviour).

+1

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From guido at python.org Wed Dec 12 04:51:50 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 11 Dec 2012 19:51:50 -0800
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: <20121211221930.7df170ac@resist.wooz.org>
References: <20121211215849.419276ec@resist.wooz.org> <20121211221930.7df170ac@resist.wooz.org>
Message-ID: 

On Tue, Dec 11, 2012 at 7:19 PM, Barry Warsaw wrote:
> On Dec 12, 2012, at 01:14 PM, Nick Coghlan wrote:
>
>>That's a backwards compatibility risk, though - many applications are
>>likely coping just fine with the slightly corrupted time values, but would
>>fall over if an exception was raised instead.

Right.

>>The default should probably
>>be chosen so that the single argument form of these calls continues to
>>behave the same in 3.4 as it does in 3.3, emitting a DeprecationWarning to
>>say that the default behaviour is going to change in 3.5 (so the *actual*
>>default would be a sentinel value, in order to tell the difference between
>>an explicit True being passed and relying on the default behaviour).
>
> +1

I don't think it's worth deprecating the old behavior.

-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org Wed Dec 12 02:43:59 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 11 Dec 2012 17:43:59 -0800
Subject: [Python-Dev] Draft PEP for time zone support.
I think the meaning of the parameter must be clarified, perhaps as follows: - ignored except during the ambiguous hour and during the impossible hour - during the ambiguous or impossible hour: - if True, prefer/pretend DST - if False, prefer/pretend non-DST - if None, raise an error Here I'd prefer the default to be None if I had to do it over again, but given that the current behavior is one of the first two (which one?) we probably can't do that. Still, it's slightly confusing that passing None is not the same as omitting the parameter altogether -- there aren't many APIs that explicitly support passing None but don't use it as the default (though there probably are some precedents). Maybe requesting an error should be done through some other special value, and None should be the same as omitted and the same as the old behavior? But where would the special value come from? It should be made as easy as possible to "do the right thing" (i.e. raise an error). Or maybe have a separate Boolean flag to request an error? -- --Guido van Rossum (python.org/~guido) From p.f.moore at gmail.com Wed Dec 12 09:13:16 2012 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 12 Dec 2012 08:13:16 +0000 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: Message-ID: On 12 December 2012 00:58, Nick Coghlan wrote: > I'd prefer a more aggressive name for this like "tzdata_override". My > rationale is that *nix users need to thoroughly aware that if they install > this package, they will stop benefiting from the automatic tz database > updates provided by their OS (especially if they install it into the system > site packages on a distro that has migrated to Python 3 for system tools). > > Such a name would also make it possible to provide *two* packaged databases, > one checked before the OS data (tzdata_override), and one shipped with > Python itself that is used only if the OS doesn't provide the timezone > database (tzdata_fallback). tzdata_fallback would then be updated to the > latest Olsen database for each maintenance release. Cross-platform > applications that wanted more reliably up to date timezone data could then > conditionally depend on tzdata_override for Windows deployments (using the > environment marker support in metadata 1.2+). That sounds sensible, EIBTI and all that. It is a lot simpler than shipping the package and some sort of auto-updater, too. Paul From chris.jerdonek at gmail.com Wed Dec 12 09:16:32 2012 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Wed, 12 Dec 2012 00:16:32 -0800 Subject: [Python-Dev] Guido, Dropbox, and Python In-Reply-To: References: Message-ID: <1910465344822117677@unknownmsgid> On Dec 10, 2012, at 1:52 PM, Terry Reedy wrote: > My question, Guido, is how this will affect Python development, and in particular, your work on async. If not proprietary info, does or will Dropbox use Python3? I talked to some Dropbox people tonight, and they said they use 2.7 for the client and 2.5 for the server. It is a project for them to switch the server to using 2.7. --Chris Sent from my iPhone From christian at python.org Wed Dec 12 10:53:34 2012 From: christian at python.org (Christian Heimes) Date: Wed, 12 Dec 2012 10:53:34 +0100 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: Message-ID: <50C8541E.60406@python.org> Am 12.12.2012 01:58, schrieb Nick Coghlan: > Ick, why a new module? Why not just add this directly to datetime? 
(It > doesn't need to be provided by the C accelerator, it can go straight in > the pure Python part). +1 for something like datetime.timezone How well does hg handle files renames? The datetime module could be converted to a package. > I'd prefer a more aggressive name for this like "tzdata_override". My > rationale is that *nix users need to thoroughly aware that if they > install this package, they will stop benefiting from the automatic tz > database updates provided by their OS (especially if they install it > into the system site packages on a distro that has migrated to Python 3 > for system tools). +1, too. From petri at digip.org Wed Dec 12 12:27:21 2012 From: petri at digip.org (Petri Lehtinen) Date: Wed, 12 Dec 2012 13:27:21 +0200 Subject: [Python-Dev] Emacs users: hg-tools-grep In-Reply-To: References: <20121208165106.725ccabf@resist.wooz.org> Message-ID: <20121212112721.GF29473@p29> Brandon W Maister wrote: > (defconst git-tools-grep-command > "git ls-files -z | xargs -0 grep -In %s" > "The command used for grepping files using git. See `git-tools-grep'.") What's wrong with git grep? From rosslagerwall at gmail.com Wed Dec 12 15:12:41 2012 From: rosslagerwall at gmail.com (Ross Lagerwall) Date: Wed, 12 Dec 2012 14:12:41 +0000 Subject: [Python-Dev] Emacs users: hg-tools-grep In-Reply-To: <20121212112721.GF29473@p29> References: <20121208165106.725ccabf@resist.wooz.org> <20121212112721.GF29473@p29> Message-ID: <20121212141241.GA17564@hobo.wolfson.cam.ac.uk> On Wed, Dec 12, 2012 at 01:27:21PM +0200, Petri Lehtinen wrote: > Brandon W Maister wrote: > > (defconst git-tools-grep-command > > "git ls-files -z | xargs -0 grep -In %s" > > "The command used for grepping files using git. See `git-tools-grep'.") > > What's wrong with git grep? Or "hg grep", for that matter? -- Ross Lagerwall From python-dev at masklinn.net Wed Dec 12 15:20:30 2012 From: python-dev at masklinn.net (Xavier Morel) Date: Wed, 12 Dec 2012 15:20:30 +0100 Subject: [Python-Dev] Emacs users: hg-tools-grep In-Reply-To: <20121212141241.GA17564@hobo.wolfson.cam.ac.uk> References: <20121208165106.725ccabf@resist.wooz.org> <20121212112721.GF29473@p29> <20121212141241.GA17564@hobo.wolfson.cam.ac.uk> Message-ID: On 2012-12-12, at 15:12 , Ross Lagerwall wrote: > On Wed, Dec 12, 2012 at 01:27:21PM +0200, Petri Lehtinen wrote: >> Brandon W Maister wrote: >>> (defconst git-tools-grep-command >>> "git ls-files -z | xargs -0 grep -In %s" >>> "The command used for grepping files using git. See `git-tools-grep'.") >> >> What's wrong with git grep? > > Or "hg grep", for that matter? hg grep searches the history, not the working copy. *-tools-grep only searches the working copy but automatically filters files to only search in files under version control. Which as far as I know is indeed what git-grep does already. From petri at digip.org Wed Dec 12 15:42:50 2012 From: petri at digip.org (Petri Lehtinen) Date: Wed, 12 Dec 2012 16:42:50 +0200 Subject: [Python-Dev] Emacs users: hg-tools-grep In-Reply-To: <20121212141241.GA17564@hobo.wolfson.cam.ac.uk> References: <20121208165106.725ccabf@resist.wooz.org> <20121212112721.GF29473@p29> <20121212141241.GA17564@hobo.wolfson.cam.ac.uk> Message-ID: <20121212144250.GM29473@p29> Ross Lagerwall wrote: > On Wed, Dec 12, 2012 at 01:27:21PM +0200, Petri Lehtinen wrote: > > Brandon W Maister wrote: > > > (defconst git-tools-grep-command > > > "git ls-files -z | xargs -0 grep -In %s" > > > "The command used for grepping files using git. See `git-tools-grep'.") > > > > What's wrong with git grep? 
> > Or "hg grep", for that matter? hg grep searches in the repository history, so it's not good for this. From bwmaister at gmail.com Wed Dec 12 15:46:36 2012 From: bwmaister at gmail.com (Brandon W Maister) Date: Wed, 12 Dec 2012 09:46:36 -0500 Subject: [Python-Dev] Emacs users: hg-tools-grep In-Reply-To: References: <20121208165106.725ccabf@resist.wooz.org> <20121212112721.GF29473@p29> <20121212141241.GA17564@hobo.wolfson.cam.ac.uk> Message-ID: Yes indeed-- in my eagerness to make my first post to python-dev be well-received I completely forgot about git grep. brandon On Wed, Dec 12, 2012 at 9:20 AM, Xavier Morel wrote: > On 2012-12-12, at 15:12 , Ross Lagerwall wrote: > > > On Wed, Dec 12, 2012 at 01:27:21PM +0200, Petri Lehtinen wrote: > >> Brandon W Maister wrote: > >>> (defconst git-tools-grep-command > >>> "git ls-files -z | xargs -0 grep -In %s" > >>> "The command used for grepping files using git. See > `git-tools-grep'.") > >> > >> What's wrong with git grep? > > > > Or "hg grep", for that matter? > > hg grep searches the history, not the working copy. *-tools-grep only > searches the working copy but automatically filters files to only search > in files under version control. > > Which as far as I know is indeed what git-grep does already. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/quodlibetor%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From regebro at gmail.com Wed Dec 12 16:56:54 2012 From: regebro at gmail.com (Lennart Regebro) Date: Wed, 12 Dec 2012 16:56:54 +0100 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: <50C8541E.60406@python.org> References: <50C8541E.60406@python.org> Message-ID: General comments: It seems like the consensus is moving towards making sure there always is a database available. If this means including it in the standard Python distribution as well, or only on Windows, I don't know, opinions on that are welcome. The steps to look for a database would then change to: 1. The path specified, if not None. 2. The module for timezone "overrides". 3. The OS database. 4. The database included in Python. We need to determine if a warning should be raised in case of 4 or not, as well as the name for the override module. I think the word "override" here is possibly unclear, I'd prefer something like "timezone-update" or similar. I'm personally a bit sceptical to writing a special updater/installer just for this. I don't want to have a special unique way to install this package. As it comes to OS packages, Christian Heimes pointed out that most Windows installations today has Java installed, and kept updated, and it has a zoneinfo database. We could consider using that on Windows as well, although it admittedly feels quite icky. I haven't been able to find any other common locations for the zoneinfo database on Windows. Specific answers: On Tue, Dec 11, 2012 at 4:39 PM, Dirkjan Ochtman wrote: > I wonder if there needs to be something here about how to port from > pytz to the new timezone library. It would be nice to have, but I don't think it's necessary to have in the PEP. > It seems like calling get_timezone() with an unknown timezone should > just throw ValueError, not necessarily some custom Exception? That could very well be. What are others opinions on this? 
> Why not keep a bit more of the pytz API to make porting easy?

The renaming of the timezone() function to get_timezone() is indeed small,
but changing pytz.timezone(foo) to timezone.timezone(foo) is really
significantly easier than renaming it to timezone.get_timezone(foo).

If we keep all of the API intact you could do

    try:
        import pytz as timezone
    except ImportError:
        import timezone

Which would make porting quicker, that's true, but do we really want to keep
unnecessary APIs around forever? Isn't it better to minimize the noise from
the start?

> It also seems relatively painless to keep localize() and normalize()
> functions around for easy porting.

Sure, but you then have two ways of doing the same thing, which I think we
should avoid.

On Tue, Dec 11, 2012 at 5:07 PM, Antoine Pitrou wrote:
>> The ``is_dst`` parameter can be ``True`` (default), ``False``, or
>> ``None``.
>>
> Why is it True by default? Do we have statistics showing that Python
> gets more use in summer?

Because for some reason both me and Stuart Bishop thought it should be, but
at least in my case I don't have any actual good reason why. Checking with
how pytz does this shows that pytz in fact defaults to False, so I think
the default should be False.

On Wed, Dec 12, 2012 at 3:50 AM, Barry Warsaw wrote:
>> This is likely the hardest part of this PEP as this involves updating the
> Oops, something got cut off there.

Ah, yes, I was going to write that the difficult bit was updating the
_datetime.c module.

> Why add a new module instead of putting all this into the existing datetime
> module, either directly or as a submodule?

pytz as it is consists of several modules, and a significant amount of code,
it didn't feel right to move all that into the datetime.py module. It also
didn't feel right to then not implement it in _datetime.c, but perhaps
that's just me being silly. But a submodule could work.

> I'm bikeshedding, but can we find a better name than `db` for the second
> argument? Something that makes it obvious we're looking for a file system
> path?

Absolutely. db_path?

> I'd really like to see a TimeZoneError base class from which all these new
> exceptions inherit.

That makes sense.

>>The ``timezonedata``-package
>>-----------------------------
>
> Just to be clear, this doesn't expose any new modules, right?

That's the intention, yes, although I haven't investigated ways of knowing
if it's installed or not yet, and exposing a module is the obvious way of
doing that. But I'm hoping there will be better ways, right?

> One other thing that the PEP should describe is what happens on a distro that
> has timezone data, but on which you also pip install the PyPI tzdata package.
> Which one wins? Is there a way to control it, other than providing an
> explicit path? Is there a way to uninstall the PyPI package? Does the API
> need to provide a method which tells you where the database it is using by
> default lives?

The PyPI package wins, I'll clarify that bit. I think the data should end up
in site-packages somewhere, and that it should be installable and
uninstallable with pip/easy_install and by simply deleting it.

On Wed, Dec 12, 2012 at 4:14 AM, Nick Coghlan wrote:
> That's a backwards compatibility risk, though - many applications are likely
> coping just fine with the slightly corrupted time values, but would fall
> over if an exception was raised instead.
The default should probably be
> chosen so that the single argument form of these calls continues to behave
> the same in 3.4 as it does in 3.3, emitting a DeprecationWarning to say that
> the default behaviour is going to change in 3.5 (so the *actual* default
> would be a sentinel value, in order to tell the difference between an explicit
> True being passed and relying on the default behaviour).

Although explicit is better than implicit, I think this is one case where
this doesn't apply. The cases where you really care which half past two you
meant, or the cases where you want an error when you specify 2012-03-25
02:30 in Europe/Stockholm, are exceedingly rare. Most people would not know
this can happen, and therefore they would not handle the errors, but they
would not want the application to fail when it does happen. I think the
default therefore should be True or False.

From brian at python.org Wed Dec 12 17:11:15 2012
From: brian at python.org (Brian Curtin)
Date: Wed, 12 Dec 2012 10:11:15 -0600
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: References: <50C8541E.60406@python.org>
Message-ID: 

On Wed, Dec 12, 2012 at 9:56 AM, Lennart Regebro wrote:
> General comments:
>
> It seems like the consensus is moving towards making sure there always is a
> database available. If this means including it in the standard Python
> distribution as well, or only on Windows, I don't know, opinions on that are
> welcome.
>
> The steps to look for a database would then change to:
>
> 1. The path specified, if not None.
>
> 2. The module for timezone "overrides".
>
> 3. The OS database.
>
> 4. The database included in Python.
>
> We need to determine if a warning should be raised in case of 4 or not, as
> well as the name for the override module. I think the word "override" here is
> possibly unclear, I'd prefer something like "timezone-update" or similar.
>
> I'm personally a bit sceptical about writing a special updater/installer just
> for this. I don't want to have a special unique way to install this package.
>
> When it comes to OS packages, Christian Heimes pointed out that most Windows
> installations today have Java installed, and kept updated, and it has a
> zoneinfo database. We could consider using that on Windows as well, although
> it admittedly feels quite icky.

Depending on Java being installed or even installing it alongside Python
would be a funny April Fools prank. This can't happen.

I don't think it's all that bad to include a small script on Windows
which runs every few days to check PyPI, then present an option to
update the info. This is what Java itself is doing anyway.

From dirkjan at ochtman.nl Wed Dec 12 17:21:52 2012
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Wed, 12 Dec 2012 17:21:52 +0100
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: References: <50C8541E.60406@python.org>
Message-ID: 

On Wed, Dec 12, 2012 at 4:56 PM, Lennart Regebro wrote:
>> Why not keep a bit more of the pytz API to make porting easy?
>
> The renaming of the timezone() function to get_timezone() is indeed small,
> but changing pytz.timezone(foo) to timezone.timezone(foo) is really
> significantly easier than renaming it to timezone.get_timezone(foo).
>
> If we keep all of the API intact you could do
>
> try:
> import pytz as timezone
> except ImportError:
> import timezone
>
> Which would make porting quicker, that's true, but do we really want to keep
> unnecessary APIs around forever? Isn't it better to minimize the noise from
> the start?
That entirely depends on what you define to be "the start". It seems to me
the consensus on python-dev has been that packages primarily evolve outside
the stdlib; it seems a bit weird to then, at the time of stdlib inclusion,
start changing the API.

>> Why is it True by default? Do we have statistics showing that Python
>> gets more use in summer?
>
> Because for some reason both me and Stuart Bishop thought it should be, but
> at least in my case I don't have any actual good reason why. Checking with
> how pytz does this shows that pytz in fact defaults to False, so I think
> the default should be False.

Here, too, I think that sticking with pytz's default would be a good idea.

Cheers,

Dirkjan

From p.f.moore at gmail.com Wed Dec 12 17:28:23 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 12 Dec 2012 16:28:23 +0000
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: References: <50C8541E.60406@python.org>
Message-ID: 

On 12 December 2012 16:11, Brian Curtin wrote:
> I don't think it's all that bad to include a small script on Windows
> which runs every few days to check PyPI, then present an option to
> update the info. This is what Java itself is doing anyway.

What would that do in an environment without internet access? Or with a
firewall blocking Python's requests and returning an error page without
warning (so the updater just sees incorrect data)? What about corporate
environments that want to control the rollout of updates? (I can't imagine
that in practice, but certainly companies do it for Java). Most Windows
updaters use the "official" Windows APIs so that they work properly with
odd cases like ISA proxies taking credentials from the Windows user login.
Python's stdlib doesn't support that type of thing.

I'm -1 on auto-updating because it's too easy to produce a "nearly right"
solution that doesn't work in highly-controlled (e.g.,
corporate) environments. And a "correct" solution would be hard to support
with python-dev's level of Windows expertise.
Really, this is not worse than having a vulnerable OpenSSL > linked with your Python executable. Purity does not bring any > advantage here. Bingo. As long as the recipe to update is clear, most users can ignore this, because the countries about which they care don't change DST rules often enough for it to matter. When it does matter, they'll know (changing the DST rules is something that local news sources tend to track :-) and they can update their software when stuff they use starts getting the time wrong. Obviously sysadmins responsible for large numbers of users can make this into a routine, and ditto people who run services. But these folks are professionals and are good at automating tasks like this. -- --Guido van Rossum (python.org/~guido) From Steve.Dower at microsoft.com Wed Dec 12 17:54:36 2012 From: Steve.Dower at microsoft.com (Steve Dower) Date: Wed, 12 Dec 2012 16:54:36 +0000 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> Message-ID: Paul Moore wrote: > On 12 December 2012 16:11, Brian Curtin wrote: > > I don't think it's all that bad to include a small script on Windows > > which runs every few days to check PyPI, then present an option to > > update the info. This is what Java itself is doing anyway. > > What would that do in an environment without internet access? Or with a > firewall blocking Python's requests and returning an error page without > warning (so the updater just sees incorrect data)? What about corporate > environments that want to control the rollout of updates? (I can't imagine > that in practice, but certainly companies do it for Java). Most Windows > updaters use the "official" Windows APIs so that they work properly with > odd cases like ISA proxies taking credentials from the Windows user login. > Python's stdlib doesn't support that type of thing. > > I'm -1 on auto-updating because it's too easy to produce a "nearly right" > solution that doesn't work in highly-controlled (e.g., > corporate) environments. And a "correct" solution would be hard to support > with python-dev's level of Windows expertise. And what about embedded installations of Python, such as in TortoiseHg? And all the people (such as myself) who disable updaters that they don't like or didn't expect? The "correct" solution on Windows may be to use a static database for historical dates and the information in the registry for current and future dates. The registry is updated through Windows Update, which is at least as reliable as anything Python could do. (I'm not sure exactly what the state of updates to older versions is like, but I'd assume WinXP still gets timezone updates and Win2K doesn't.) Details of the registry entries are at http://msdn.microsoft.com/en-us/library/ms725481.aspx. It looks like the data is focused on modern timezones rather than localities, which would mean a many-to-one mapping from zoneinfo. Unfortunately it doesn't look like there's enough overlap to allow an automated mapping. That said, it is incredibly easy to convert between UTC and local (http://msdn.microsoft.com/en-us/library/ms724949.aspx), even for dates in the past or future when the information is available. It's just that timezones other than the user's preference are difficult. Cheers, Steve From merwok at netwok.org Wed Dec 12 18:51:41 2012 From: merwok at netwok.org (Éric Araujo) Date: Wed, 12 Dec 2012 12:51:41 -0500 Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: <50C8541E.60406@python.org> References: <50C8541E.60406@python.org> Message-ID: <50C8C42D.80703@netwok.org> Hi, Le 12/12/2012 04:53, Christian Heimes a écrit : > Am 12.12.2012 01:58, schrieb Nick Coghlan: >> Ick, why a new module? Why not just add this directly to datetime? (It >> doesn't need to be provided by the C accelerator, it can go straight in >> the pure Python part). > > +1 for something like datetime.timezone > > How well does hg handle file renames? The datetime module could be > converted to a package. Quite well. It's easy to rename datetime.py to datetime/__init__.py, and subsequent fixes in 3.3's datetime.py will be merged into datetime/__init__.py by Mercurial's merge subsystem. Cheers From d.s.seljebotn at astro.uio.no Wed Dec 12 21:37:08 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Wed, 12 Dec 2012 21:37:08 +0100 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> Message-ID: <50C8EAF4.8070104@astro.uio.no> On 12/12/2012 01:15 AM, Nick Coghlan wrote: > On Wed, Dec 12, 2012 at 5:37 AM, Dino Viehland > wrote: > > OTOH changing certain dictionaries in IronPython (such as keyword > args) to be > ordered would certainly be possible. Personally I just wouldn't > want to see it > be the default as that seems like unnecessary overhead when the > specialized > class exists. > > > Which reminds me, I was going to note that one of the main gains with > ordered keyword arguments, is their use in the construction of > string-keyed objects where you want to be able to control the order of > iteration (e.g. for serialisation or display purposes). Currently you > have to go the path of something like namedtuple where you define the > order of iteration in one operation, and set the values in another. So here's a brand new argument against ordered dicts: The existence of perfect hashing schemes. They fundamentally conflict with ordered dicts. I played with using them for vtable dispatches in Cython this summer, and they can perform really, really well for branch-predicted lookups in hot loops, because you always/nearly always eliminate linear probing, and so there are no branch misses or extra comparisons. (The overhead of a perfect hash table lookup over a traditional vtable lookup was only a couple of cycles in my highly artificial, fully branch-predicted micro-benchmark.) There's some overhead in setup; IIRC, ~20 microseconds for 64 elements, 2 GHz CPU, though that was a first prototype implementation and both algorithmic improvements and tuning should be possible. So it's not useful for everything, but perhaps for things like module dictionaries and classes an "optionally perfect" dict can make sense. Note: I'm NOT suggesting the use of perfect hashing, just making sure its existence is mentioned and that one is aware that if always-ordered dicts become the language standard it precludes this option far off in the future. (Something like a sort() method could still work and make the dict "unperfect"; one could also have a pack() method that made the dict perfect again.) That concludes the on-topic parts of my post. -- Dag Sverre Seljebotn APPENDIX Going off-topic for those who are interested, here are the longwinded and ugly details. My code [1] is based on the paper [2] (pseudo-code in Appendix A), but I adapted it a bit to be useful for tens/hundreds of elements rather than billions.
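(For a quick feel of the end result before the details: in Python terms, the lookup at the heart of the scheme is just the function below. This is a sketch of my own, not code from the repo; the same formula appears in C-ish pseudo-code under point 2.

    def perfect_lookup(h, r, m_f, m_g, d, entries):
        # h: a high-entropy hash of the key
        # r: a shift; m_f, m_g: bit masks; d: the small table of perturbation values
        return entries[((h >> r) & m_f) ^ d[h & m_g]]

Everything that follows is about choosing r, m_f, m_g and d so that this lands on a distinct slot for every key actually stored.)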
The ingredients: 1) You need the hash to be 32 bits (or 64) of good entropy (md5 or murmurhash or similar). (Yes, that's a tall order for CPython, I'm just describing the scheme.) (If the hash collides on all bits you *will* collide, so some fallback is still necessary, just unlikely.) 2) To lookup, the idea is (pseudo-code!) typedef struct { int m_f, m_g, r, k; int16_t d[k]; /* "small" int, like current proposal */ } table_header_t; And then one computes the index of an element with hash "h" using the function ((h >> tab->r) & tab->m_f) ^ tab->d[h & tab->m_g] rather than the usual "h % n". While more arithmetic, arithmetic is cheap and branch misses are not. 3) To set up/repack a table one needs to find the parameters. The general idea is: a) Partition the hashes into k bins by using "h & m_g". There will be collisions, but the number of bins with many collisions will be very small; most bins will have 2 or 1 or 0 elements. b) Starting with the largest bin, distribute the elements according to the hash function. If a bin collides with the existing contents, try another value for d[binindex] until it doesn't. The r parameter lets you try again 32 (or 64) times to find a solution. In my testcases there was a ~0.1% chance of not finding a solution (that is, exhausting possible choices of r) with 64-bit hashes with 4 or 8 elements and no empty table elements. For any other number of elements, or with some empty elements, the chance of failure was much lower. [1] It's not exactly a great demo, but it contains the algorithm. If there's much interest I should clean it up and make a proper benchmark demo out of it: https://github.com/dagss/pyextensibletype/blob/perfecthash/include/perfecthash.h [2] Pagh (1999) http://www.brics.dk/RS/99/13/BRICS-RS-99-13.ps.gz From pje at telecommunity.com Wed Dec 12 22:31:01 2012 From: pje at telecommunity.com (PJ Eby) Date: Wed, 12 Dec 2012 16:31:01 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <50C8EAF4.8070104@astro.uio.no> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> <50C8EAF4.8070104@astro.uio.no> Message-ID: On Wed, Dec 12, 2012 at 3:37 PM, Dag Sverre Seljebotn wrote: > On 12/12/2012 01:15 AM, Nick Coghlan wrote: >> >> On Wed, Dec 12, 2012 at 5:37 AM, Dino Viehland >> > wrote: >> >> OTOH changing certain dictionaries in IronPython (such as keyword >> args) to be >> ordered would certainly be possible. Personally I just wouldn't >> want to see it >> be the default as that seems like unnecessary overhead when the >> specialized >> class exists. >> >> >> Which reminds me, I was going to note that one of the main gains with >> ordered keyword arguments, is their use in the construction of >> string-keyed objects where you want to be able to control the order of >> iteration (e.g. for serialisation or display purposes). Currently you >> have to go the path of something like namedtuple where you define the >> order of iteration in one operation, and set the values in another. > > So here's a brand new argument against ordered dicts: The existence of > perfect hashing schemes. They fundamentally conflict with ordered dicts. If I understand your explanation, then they don't conflict with the type of ordering described in this thread. Raymond's optimization separates the "hash table" part from the "contents" part of a dictionary, and there is no requirement that these two parts be in the same size or the same order.
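To make that concrete, here is a rough pure-Python sketch of that split layout; an illustration of my own with invented names, resizing and deletion omitted, and not the actual proposed implementation:

    class SplitDict:
        def __init__(self):
            self.indices = [None] * 8   # sparse "hash table" part: slot -> entry number
            self.entries = []           # dense "contents" part: [hash, key, value]

        def _slot(self, key):
            # plain linear probing over the sparse index table
            h = hash(key)
            i = h % len(self.indices)
            while self.indices[i] is not None:
                entry = self.entries[self.indices[i]]
                if entry[0] == h and entry[1] == key:
                    break
                i = (i + 1) % len(self.indices)
            return i

        def __setitem__(self, key, value):
            i = self._slot(key)
            if self.indices[i] is None:        # new key: append to the dense part
                self.indices[i] = len(self.entries)
                self.entries.append([hash(key), key, value])
            else:                              # existing key: overwrite in place
                self.entries[self.indices[i]][2] = value

        def __getitem__(self, key):
            i = self._slot(key)
            if self.indices[i] is None:
                raise KeyError(key)
            return self.entries[self.indices[i]][2]

        def __iter__(self):
            # iteration never looks at the sparse part, so it is
            # insertion-ordered no matter how indices is organized
            return (entry[1] for entry in self.entries)

Rebuilding self.indices, whether at a different size or with a completely different hashing strategy, never has to touch self.entries.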
Indeed, Raymond's split design lets you re-parameterize the hashing all you want, without perturbing the iteration order at all. You would in fact be able to take a dictionary at any moment, and say, "optimize the 'hash table' part to a non-colliding state based on the current contents", without touching the 'contents' part at all. (One could do this at class creation time on a class dictionary, and just after importing on a module dictionary, for example.) From d.s.seljebotn at astro.uio.no Wed Dec 12 23:06:18 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Wed, 12 Dec 2012 23:06:18 +0100 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> <50C8EAF4.8070104@astro.uio.no> Message-ID: <50C8FFDA.7050908@astro.uio.no> On 12/12/2012 10:31 PM, PJ Eby wrote: > On Wed, Dec 12, 2012 at 3:37 PM, Dag Sverre Seljebotn > wrote: >> On 12/12/2012 01:15 AM, Nick Coghlan wrote: >>> >>> On Wed, Dec 12, 2012 at 5:37 AM, Dino Viehland >> > wrote: >>> >>> OTOH changing certain dictionaries in IronPython (such as keyword >>> args) to be >>> ordered would certainly be possible. Personally I just wouldn't >>> want to see it >>> be the default as that seems like unnecessary overhead when the >>> specialized >>> class exists. >>> >>> >>> Which reminds me, I was going to note that one of the main gains with >>> ordered keyword arguments, is their use in the construction of >>> string-keyed objects where you want to be able to control the order of >>> iteration (e.g. for serialisation or display purposes). Currently you >>> have to go the path of something like namedtuple where you define the >>> order of iteration in one operation, and set the values in another. >> >> >> So here's a brand new argument against ordered dicts: The existence of >> perfect hashing schemes. They fundamentally conflict with ordered dicts. > > If I understand your explanation, then they don't conflict with the > type of ordering described in this thread. Raymond's optimization > separates the "hash table" part from the "contents" part of a > dictionary, and there is no requirement that these two parts be in the > same size or the same order. I don't fully agree. Perfect hashing already separates "hash table" from "contents" (sort of), and saves the memory in much the same way. Yes, you can repeat the trick and have 2 levels of indirection, but that then requires an additional table of small ints which is pure overhead present for the sorting; in short, it's no longer an optimization but just overhead for the sortability. Dag Sverre > > Indeed, Raymond's split design lets you re-parameterize the hashing > all you want, without perturbing the iteration order at all. You > would in fact be able to take a dictionary at any moment, and say, > "optimize the 'hash table' part to a non-colliding state based on the > current contents", without touching the 'contents' part at all. > > (One could do this at class creation time on a class dictionary, and > just after importing on a module dictionary, for example.) 
> From d.s.seljebotn at astro.uio.no Wed Dec 12 23:09:37 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Wed, 12 Dec 2012 23:09:37 +0100 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <50C8FFDA.7050908@astro.uio.no> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> <50C8EAF4.8070104@astro.uio.no> <50C8FFDA.7050908@astro.uio.no> Message-ID: <50C900A1.9000206@astro.uio.no> On 12/12/2012 11:06 PM, Dag Sverre Seljebotn wrote: > On 12/12/2012 10:31 PM, PJ Eby wrote: >> On Wed, Dec 12, 2012 at 3:37 PM, Dag Sverre Seljebotn >> wrote: >>> On 12/12/2012 01:15 AM, Nick Coghlan wrote: >>>> >>>> On Wed, Dec 12, 2012 at 5:37 AM, Dino Viehland >>> > wrote: >>>> >>>> OTOH changing certain dictionaries in IronPython (such as keyword >>>> args) to be >>>> ordered would certainly be possible. Personally I just wouldn't >>>> want to see it >>>> be the default as that seems like unnecessary overhead when the >>>> specialized >>>> class exists. >>>> >>>> >>>> Which reminds me, I was going to note that one of the main gains with >>>> ordered keyword arguments, is their use in the construction of >>>> string-keyed objects where you want to be able to control the order of >>>> iteration (e.g. for serialisation or display purposes). Currently you >>>> have to go the path of something like namedtuple where you define the >>>> order of iteration in one operation, and set the values in another. >>> >>> >>> So here's a brand new argument against ordered dicts: The existence of >>> perfect hashing schemes. They fundamentally conflict with ordered dicts. >> >> If I understand your explanation, then they don't conflict with the >> type of ordering described in this thread. Raymond's optimization >> separates the "hash table" part from the "contents" part of a >> dictionary, and there is no requirement that these two parts be in the >> same size or the same order. > > I don't fully agree. > > Perfect hashing already separates "hash table" from "contents" (sort > of), and saves the memory in much the same way. This was a bit inaccurate, but the point is: The perfect hash function doesn't need any fill-in to avoid collisions, you can (except in exceptional circumstances) fill the table 100%, so the memory is already saved. Dag Sverre > > Yes, you can repeat the trick and have 2 levels of indirection, but that > then requires an additional table of small ints which is pure overhead > present for the sorting; in short, it's no longer an optimization but > just overhead for the sortability. > > Dag Sverre > >> >> Indeed, Raymond's split design lets you re-parameterize the hashing >> all you want, without perturbing the iteration order at all. You >> would in fact be able to take a dictionary at any moment, and say, >> "optimize the 'hash table' part to a non-colliding state based on the >> current contents", without touching the 'contents' part at all. >> >> (One could do this at class creation time on a class dictionary, and >> just after importing on a module dictionary, for example.) >> > From greg.ewing at canterbury.ac.nz Wed Dec 12 20:51:34 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 13 Dec 2012 08:51:34 +1300 Subject: [Python-Dev] Draft PEP for time zone support. 
In-Reply-To: References: <20121211170755.3e0a7f77@pitrou.net> Message-ID: <50C8E046.9030201@canterbury.ac.nz> > On Tue, Dec 11, 2012 at 8:07 AM, Antoine Pitrou wrote: > >>Do we have statistics showing that Python >>gets more use in summer? Well, pythons are cold-blooded, so they're probably more active during the warmer seasons... -- Greg From regebro at gmail.com Thu Dec 13 00:08:57 2012 From: regebro at gmail.com (Lennart Regebro) Date: Thu, 13 Dec 2012 00:08:57 +0100 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> Message-ID: On Wed, Dec 12, 2012 at 5:21 PM, Dirkjan Ochtman wrote: > That entirely depends on what you define to be "the start". It seems > to me the consensus on python-dev has been that packages primarily > evolve outside the stdlib; it seems a bit weird to then, at the time > of stdlib inclusion, start changing the API. But this bit of the API is there only as a hack, because the stdlib does not support is_dst. We are changing that. Hence those extra functions are no longer needed. //Lennart From regebro at gmail.com Thu Dec 13 00:14:12 2012 From: regebro at gmail.com (Lennart Regebro) Date: Thu, 13 Dec 2012 00:14:12 +0100 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> Message-ID: On Wed, Dec 12, 2012 at 5:54 PM, Steve Dower wrote: > Details of the registry entries are at http://msdn.microsoft.com/en-us/library/ms725481.aspx. It looks like the data is focused on modern timezones rather than localities, which would mean a many-to-one mapping from zoneinfo. Unfortunately it doesn't look like there's enough overlap to allow an automated mapping. No, but the Unicode consortium (I think) is keeping a mapping updated manually. I'm using that in tzlocal, to figure out the local timezone of the computer on Windows. However, I think that mixing and matching timezone data in this way from the two systems is likely to be full of pitfalls, edge-cases and complexities I do not dare even think seriously about. There will probably be *fewer* errors by just keeping an old timezone database around. Besides, what if they don't run Windows Update? Then the data is still outdated? //Lennart From tismer at stackless.com Wed Dec 12 23:55:30 2012 From: tismer at stackless.com (Christian Tismer) Date: Wed, 12 Dec 2012 23:55:30 +0100 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <20121211170755.3e0a7f77@pitrou.net> Message-ID: <50C90B62.5050406@stackless.com> On 12.12.12 02:43, Guido van Rossum wrote: > On Tue, Dec 11, 2012 at 5:11 PM, Robert Brewer wrote: >> Guido van Rossum wrote: >>> Sent: Tuesday, December 11, 2012 4:11 PM >>> To: Antoine Pitrou >>> Cc: python-dev at python.org >>> Subject: Re: [Python-Dev] Draft PEP for time zone support. >>> >>> On Tue, Dec 11, 2012 at 8:07 AM, Antoine Pitrou >>> wrote: >>>> Le Tue, 11 Dec 2012 16:23:37 +0100, >>>> Lennart Regebro a écrit : >>>>> Changes in the ``datetime``-module >>>>> -------------------------------------- >>>>> >>>>> A new ``is_dst`` parameter is added to several of the `tzinfo` >>>>> methods to handle time ambiguity during DST changeovers. >>>>> >>>>> * ``tzinfo.utcoffset(self, dt, is_dst=True)`` >>>>> >>>>> * ``tzinfo.dst(self, dt, is_dst=True)`` >>>>> >>>>> * ``tzinfo.tzname(self, dt, is_dst=True)`` >>>>> >>>>> The ``is_dst`` parameter can be ``True`` (default), ``False``, or >>>>> ``None``.
>>>>> >>>>> ``True`` will specify that the given datetime should be interpreted >>>>> as happening during daylight savings time, ie that the time >>> specified >>>>> is before the change from DST. >>>> Why is it True by default? Do we have statistics showing that Python >>>> gets more use in summer? >>> My question exactly. >> "Summer" in the USA, at least, is 238 days in 2012, while "Winter" into 2013 is only 126 days: >> >>>>> import datetime >>>>> datetime.date(2012, 11, 4) - datetime.date(2012, 3, 11) >> datetime.timedelta(238) >>>>> datetime.date(2013, 3, 10) - datetime.date(2012, 11, 4) >> datetime.timedelta(126) > Very funny, but that can't be the real reason. *Most* datetime values > aren't ambiguous, so in those cases the parameter should be ignored, > right? There's only one hour per year where you need to specify it > (two, if we want to artificially assign a meaning to values falling in > the impossible hour). And during those times it's equally likely that > you meant either of the possibilities. I think the meaning of the > parameter must be clarified, perhaps as follows: > > - ignored except during the ambiguous hour and during the impossible hour > - during the ambiguous or impossible hour: > - if True, prefer/pretend DST > - if False, prefer/pretend non-DST > - if None, raise an error > > Here I'd prefer the default to be None if I had to do it over again, > but given that the current behavior is one of the first two (which > one?) we probably can't do that. Still, it's slightly confusing that > passing None is not the same as omitting the parameter altogether -- > there aren't many APIs that explicitly support passing None but don't > use it as the default (though there probably are some precedents). > Maybe requesting an error should be done through some other special > value, and None should be the same as omitted and the same as the old > behavior? But where would the special value come from? It should be > made as easy as possible to "do the right thing" (i.e. raise an > error). Or maybe have a separate Boolean flag to request an error? > I see an issue here that makes me a little uncomfortable: Having a default that makes code work all year but raises an error during the "impossible hour" could be problematic in critical code. Can we make this more explicit by forcing the users to decide? I like the idea of the extra boolean flag here, because it will be explicitly visible that this code intentionally creates an exception. Or even not a flag, but the exception to be raised, or a callable to handle this case? Sloppy coding can be dangerous. So maybe the warning module could be helpful as well: If None is passed and no explicit flag/exception/callable given, bother the user with a warning message ;-) cheers - chris -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From tjreedy at udel.edu Thu Dec 13 00:23:37 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 12 Dec 2012 18:23:37 -0500 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> Message-ID: On 12/12/2012 11:53 AM, Guido van Rossum wrote: > Bingo.
As long as the recipe to update is clear, most users can ignore > this, because the countries about which they care don't change DST > rules often enough for it to matter. When it does matter, they'll know > (changing the DST rules is something that local news sources tend to > track :-) and they can update their software when stuff they use > starts getting the time wrong. Obviously sysadmins responsible for > large numbers of users can make this into a routine, and ditto people > who run services. But these folks are professionals and are good at > automating tasks like this. As a Windows user, I would like there to be one tz data file used by all Python versions on my machine, including ones included with other apps. I would like every installer, including for bug fix releases, to update it. This should be sufficient for 99% of Windows users. As Guido says above, the docs should tell the other 1% how to update it explicitly. -- Terry Jan Reedy From tjreedy at udel.edu Thu Dec 13 00:30:46 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 12 Dec 2012 18:30:46 -0500 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> Message-ID: On 12/12/2012 10:56 AM, Lennart Regebro wrote: >> It seems like calling get_timezone() with an unknown timezone should >> just throw ValueError, not necessarily some custom Exception? > > That could very well be. What are others' opinions on this? ValueError. That is what it is. Nothing special here. >> Why not keep a bit more of the pytz API to make porting easy? > > The renaming of the timezone() function to get_timezone() is indeed small, And gratuitous, to me. I don't generally like 'get' prefixes anyway. > but changing pytz.timezone(foo) to timezone.timezone(foo) is really > significantly easier than renaming it to timezone.get_timezone(foo). > > If we keep all of the API intact you could do > > try: > import pytz as timezone > except ImportError: > import timezone > > Which would make porting quicker, that's true, but do we really want to keep > unnecessary APIs around forever? Isn't it better to minimize the noise from > the start? While the module that was the basis for the ipaddress module was released on PyPI and its API developed however it did, the API was worked over quite a bit before the addition of ipaddress. So I agree that the current API can be revised before being more-or-less frozen in the stdlib. >> It also seems relatively painless to keep localize() and normalize() >> functions around for easy porting. > > Sure, but you then have two ways of doing the same thing, which I think we > should avoid. I agree that this is precisely the time to remove cruft (if indeed it is such). -- Terry Jan Reedy From regebro at gmail.com Thu Dec 13 00:33:13 2012 From: regebro at gmail.com (Lennart Regebro) Date: Thu, 13 Dec 2012 00:33:13 +0100 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> Message-ID: On Thu, Dec 13, 2012 at 12:23 AM, Terry Reedy wrote: > As a Windows user, I would like there to be one tz data file used by all > Python versions on my machine, including ones included with other apps. That would be nice, but where would that be installed? There is no standard location for zoneinfo files. And do we really want to install python-specific files outside the Python tree?
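(For reference, the search order sketched in the PEP draft is easy to picture in code; this is only an illustration of mine, and the parameter names for the override and bundled locations are invented placeholders:

    import os

    def find_zoneinfo(path=None, override_dir=None, bundled_dir=None):
        # 1. explicit path, 2. the "override" module's data, 3. the OS
        # database, 4. the database shipped with Python
        for candidate in (path, override_dir, '/usr/share/zoneinfo', bundled_dir):
            if candidate is not None and os.path.isdir(candidate):
                return candidate
        raise ValueError('no zoneinfo database found')

On Windows step 3 normally finds nothing, which is exactly why the install location matters.)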
//Lennart From python at mrabarnett.plus.com Thu Dec 13 01:10:06 2012 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 13 Dec 2012 00:10:06 +0000 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> Message-ID: <50C91CDE.8050504@mrabarnett.plus.com> On 2012-12-12 23:33, Lennart Regebro wrote: > On Thu, Dec 13, 2012 at 12:23 AM, Terry Reedy wrote: >> As a Windows user, I would like there to be one tz data file used by all >> Python versions on my machine, including ones included with other apps. > > That would be nice, but where would that be installed? There is no > standard location for zoneinfo files. And do we really want to install > python-specific files outside the Python tree? > Python version x.y is installed into, say, C:\Pythonxy, so perhaps a good place would be, say, C:\Python. From brian at python.org Thu Dec 13 01:27:06 2012 From: brian at python.org (Brian Curtin) Date: Wed, 12 Dec 2012 18:27:06 -0600 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: <50C91CDE.8050504@mrabarnett.plus.com> References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> Message-ID: On Wed, Dec 12, 2012 at 6:10 PM, MRAB wrote: > On 2012-12-12 23:33, Lennart Regebro wrote: >> >> On Thu, Dec 13, 2012 at 12:23 AM, Terry Reedy wrote: >>> >>> As a Windows user, I would like there to be one tz data file used by all >>> Python versions on my machine, including ones included with other apps. >> >> >> That would be nice, but where would that be installed? There is no >> standard location for zoneinfo files. And do we really want to install >> python-specific files outside the Python tree? >> > Python version x.y is installed into, say, C:\Pythonxy, so perhaps a > good place would be, say, C:\Python. C:\ProgramData\Python From tjreedy at udel.edu Thu Dec 13 02:24:04 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 12 Dec 2012 20:24:04 -0500 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> Message-ID: On 12/12/2012 7:27 PM, Brian Curtin wrote: > On Wed, Dec 12, 2012 at 6:10 PM, MRAB wrote: >> On 2012-12-12 23:33, Lennart Regebro wrote: >>> >>> On Thu, Dec 13, 2012 at 12:23 AM, Terry Reedy wrote: >>>> >>>> As a Windows user, I would like there to be one tz data file used by all >>>> Python versions on my machine, including ones included with other apps. >>> >>> >>> That would be nice, but where would that be installed? There is no >>> standard location for zoneinfo files. And do we really want to install >>> python-specific files outside the Python tree? There is no 'Python tree' on Windows. Rather, there is a separate tree for each version, located where the user directs. Windows used to have a %APPDATA% directory variable. Not sure about Win 7, let alone 8. Martin and others should know better. Or ask the user where to put it. I know where I would choose, and it would not be on my C drive. Un-installers would not delete (unless a reference count were kept and were decremented to 0). >> Python version x.y is installed into, say, C:\Pythonxy, so perhaps a >> good place would be, say, C:\Python. > > C:\ProgramData\Python Making a new top-level directory without asking is obnoxious.
-- Terry Jan Reedy From brian at python.org Thu Dec 13 02:36:15 2012 From: brian at python.org (Brian Curtin) Date: Wed, 12 Dec 2012 19:36:15 -0600 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> Message-ID: On Dec 12, 2012 7:24 PM, "Terry Reedy" wrote: > > On 12/12/2012 7:27 PM, Brian Curtin wrote: >> >> On Wed, Dec 12, 2012 at 6:10 PM, MRAB wrote: >>> >>> On 2012-12-12 23:33, Lennart Regebro wrote: >>>> >>>> >>>> On Thu, Dec 13, 2012 at 12:23 AM, Terry Reedy wrote: >>>>> >>>>> >>>>> As a Windows user, I would like there to be one tz data file used by all >>>>> Python versions on my machine, including ones included with other apps. >>>> >>>> >>>> >>>> That would be nice, but where would that be installed? There is no >>>> standard location for zoneinfo files. And do we really want to install >>>> python-specific files outside the Python tree? > > > There is no 'Python tree' on Windows. Rather, there is a separate tree for each version, located where the user directs. > > Windows used to have a %APPDATA% directory variable. Not sure about Win 7, let alone 8. Martin and others should know better. > > Or ask the user where to put it. I know where I would choose, and it would not be on my C drive. Un-installers would not delete (unless a reference count were kept and were decremented to 0). > > >>> Python version x.y is installed into, say, C:\Pythonxy, so perhaps a >>> good place would be, say, C:\Python. >> >> >> C:\ProgramData\Python > > > Making a new top-level directory without asking is obnoxious. See http://stackoverflow.com/questions/9518890/what-is-the-significance-programdata-in-windows -------------- next part -------------- An HTML attachment was scrubbed... URL: From v+python at g.nevcal.com Thu Dec 13 02:43:20 2012 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 12 Dec 2012 17:43:20 -0800 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> Message-ID: <50C932B8.9060100@g.nevcal.com> On 12/12/2012 5:36 PM, Brian Curtin wrote: > > >> C:\ProgramData\Python > ^^^^^ That. Is not the path that the link below is talking about, though. > > > > > > Making a new top-level directory without asking is obnoxious. > > See > http://stackoverflow.com/questions/9518890/what-is-the-significance-programdata-in-windows > -------------- next part -------------- An HTML attachment was scrubbed... URL: From janzert at janzert.com Thu Dec 13 03:10:46 2012 From: janzert at janzert.com (Janzert) Date: Wed, 12 Dec 2012 21:10:46 -0500 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: <50C932B8.9060100@g.nevcal.com> References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> <50C932B8.9060100@g.nevcal.com> Message-ID: On 12/12/2012 8:43 PM, Glenn Linderman wrote: > On 12/12/2012 5:36 PM, Brian Curtin wrote: >> >> >> C:\ProgramData\Python >> > > ^^^^^ That. Is not the path that the link below is talking > about, though. > It actually does; it is rather confusing though. :/ It's referring to KNOWNFOLDERID constant FOLDERID_ProgramData. The actual on-disk location for this has changed over Windows versions. As noted below in the SO link given: "Note that this documentation refers to the typical path as per older versions of Windows.
In modern versions of Windows it is located in %SystemDrive%\ProgramData." >> > >> > >> > Making a new top-level directory without asking is obnoxious. >> >> See >> http://stackoverflow.com/questions/9518890/what-is-the-significance-programdata-in-windows >> > > > From brian at python.org Thu Dec 13 03:54:23 2012 From: brian at python.org (Brian Curtin) Date: Wed, 12 Dec 2012 20:54:23 -0600 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> <50C932B8.9060100@g.nevcal.com> Message-ID: On Wed, Dec 12, 2012 at 8:10 PM, Janzert wrote: > On 12/12/2012 8:43 PM, Glenn Linderman wrote: >> >> On 12/12/2012 5:36 PM, Brian Curtin wrote: >>> >>> >>> >> C:\ProgramData\Python >>> >> >> >> ^^^^^ That. Is not the path that the link below is talking >> about, though. >> > > It actually does; it is rather confusing though. :/ It's referring to > KNOWNFOLDERID constant FOLDERID_ProgramData. The actual on-disk location for > this has changed over Windows versions. As noted below in the SO link given: > > "Note that this documentation refers to the typical path as per older > versions of Windows. In modern versions of Windows it is located in > %SystemDrive%\ProgramData." Correct. Anyway, on with the actual timezone stuff... From pje at telecommunity.com Thu Dec 13 06:11:23 2012 From: pje at telecommunity.com (PJ Eby) Date: Thu, 13 Dec 2012 00:11:23 -0500 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <50C8FFDA.7050908@astro.uio.no> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> <50C8EAF4.8070104@astro.uio.no> <50C8FFDA.7050908@astro.uio.no> Message-ID: On Wed, Dec 12, 2012 at 5:06 PM, Dag Sverre Seljebotn wrote: > Perfect hashing already separates "hash table" from "contents" (sort of), > and saves the memory in much the same way. > > Yes, you can repeat the trick and have 2 levels of indirection, but that > then requires an additional table of small ints which is pure overhead > present for the sorting; in short, it's no longer an optimization but just > overhead for the sortability. I'm confused. I understood your algorithm to require repacking, rather than it being a suitable algorithm for incremental change to an existing dictionary. ISTM that that would mean you still pay some sort of overhead (either in time or space) while the dictionary is still being mutated.
From regebro at gmail.com Thu Dec 13 07:06:19 2012 From: regebro at gmail.com (Lennart Regebro) Date: Thu, 13 Dec 2012 07:06:19 +0100 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> Message-ID: On Thu, Dec 13, 2012 at 2:24 AM, Terry Reedy wrote: > Or ask the user where to put it. If we ask where it should be installed, then we need a registry setting for that or we don't know where it's located when it is to be used. And if we ask, then people will install it in non-standard locations. While installers for software with Python don't want their users to be asked, so they'll install it in the standard location, overriding the managers preferred, updated custom location with the standard location with a database that is probably not updated. So I think that asking is not an option at all. It either goes in %PROGRAMDATA%\Python\zoneinfo or it's not shared at all. > I know where I would choose, and it would > not be on my C drive. Un-installers would not delete (unless a reference > count were kept and were decremented to 0). True, and that's annoying when those counters go wrong. All in all I would say I would prefer to install this per Python. //Lennart From regebro at gmail.com Thu Dec 13 07:21:57 2012 From: regebro at gmail.com (Lennart Regebro) Date: Thu, 13 Dec 2012 07:21:57 +0100 Subject: [Python-Dev] [Distutils] Is is worth disentangling distutils? In-Reply-To: <50C58DA5.3000307@cavallinux.eu> References: <50C58DA5.3000307@cavallinux.eu> Message-ID: On Mon, Dec 10, 2012 at 8:22 AM, Antonio Cavallo wrote: > Hi, > I wonder if is it worth/if there is any interest in trying to "clean" up > distutils: nothing in terms to add new features, just a *major* cleanup > retaining the exact same interface. > > > I'm not planning anything like *adding features* or rewriting rpm/rpmbuild > here, simply cleaning up that un-holy code mess. Yes it served well, don't > get me wrong, and I think it did work much better than anything it was meant > to replace it. > > I'm not into the py3 at all so I wonder how possibly it could fit/collide > into the big plan. > > Or I'll be wasting my time? The effort of making something that replaces distutils is, as far as I can understand, currently on the level of taking the best bits out of distutils2 and putting it into Python 3.4 under the name "packaging". I'm sure that effort can need more help. //Lennart From v+python at g.nevcal.com Thu Dec 13 07:39:25 2012 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 12 Dec 2012 22:39:25 -0800 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> <50C932B8.9060100@g.nevcal.com> Message-ID: <50C9781D.9090200@g.nevcal.com> On 12/12/2012 6:10 PM, Janzert wrote: > On 12/12/2012 8:43 PM, Glenn Linderman wrote: >> On 12/12/2012 5:36 PM, Brian Curtin wrote: >>> >>> >> C:\ProgramData\Python >>> >> >> ^^^^^ That. Is not the path that the link below is talking >> about, though. >> > > It actually does; it is rather confusing though. :/ I agree with the below. But I have never seen a version of Windows on which c:\ProgramData was the actual path for FOLDERID_ProgramData. Can you reference documentation that states that it was there, for some version? 
This documentation speaks of: c:\Documents and Settings\AllUsers\Application Data (which I knew from XP, and I think 2000, not sure I remember NT) In Vista.0, Vista.1, and Vista.2, I guess it is moved to C:\users\AllUsers\AppData\Roaming (typically). Neither of those would result in C:\ProgramData\Python. > It's referring to KNOWNFOLDERID constant FOLDERID_ProgramData. The > actual on disk location for this has changed over windows versions. As > noted below in the SO link given: > > "Note that this documentation refers to the typical path as per older > versions of Windows. In modern versions of Windows it is located in > %SystemDrive%\ProgramData." > >>> > >>> > >>> > Making a new top-level directory without asking is obnoxious. >>> >>> See >>> http://stackoverflow.com/questions/9518890/what-is-the-significance-programdata-in-windows >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Dec 13 08:18:01 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 13 Dec 2012 08:18:01 +0100 Subject: [Python-Dev] cpython: expose TCP_FASTOPEN and MSG_FASTOPEN References: <3YMKyG14mkzRnn@mail.python.org> Message-ID: <20121213081801.0255e430@pitrou.net> On Thu, 13 Dec 2012 04:24:54 +0100 (CET) benjamin.peterson wrote: > http://hg.python.org/cpython/rev/5435a9278028 > changeset: 80834:5435a9278028 > user: Benjamin Peterson > date: Wed Dec 12 22:24:47 2012 -0500 > summary: > expose TCP_FASTOPEN and MSG_FASTOPEN > > files: > Misc/NEWS | 3 +++ > Modules/socketmodule.c | 7 ++++++- > 2 files changed, 9 insertions(+), 1 deletions(-) > > > diff --git a/Misc/NEWS b/Misc/NEWS > --- a/Misc/NEWS > +++ b/Misc/NEWS > @@ -163,6 +163,9 @@ > Library > ------- > > +- Expose the TCP_FASTOPEN and MSG_FASTOPEN flags in socket when they're > + available. This should be documented, no? Regards Antoine. From d.s.seljebotn at astro.uio.no Thu Dec 13 08:26:47 2012 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Thu, 13 Dec 2012 08:26:47 +0100 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <20121210164845.04942fa3@pitrou.net> <50C8EAF4.8070104@astro.uio.no> <50C8FFDA.7050908@astro.uio.no> Message-ID: <50C98337.4090402@astro.uio.no> On 12/13/2012 06:11 AM, PJ Eby wrote: > On Wed, Dec 12, 2012 at 5:06 PM, Dag Sverre Seljebotn > wrote: >> Perfect hashing already separates "hash table" from "contents" (sort of), >> and saves the memory in much the same way. >> >> Yes, you can repeat the trick and have 2 levels of indirection, but that >> then requires an additional table of small ints which is pure overhead >> present for the sorting; in short, it's no longer an optimization but just >> overhead for the sortability. > > I'm confused. I understood your algorithm to require repacking, > rather than it being a suitable algorithm for incremental change to an > existing dictionary. ISTM that that would mean you still pay some > sort of overhead (either in time or space) while the dictionary is > still being mutated. As-is the algorithm just assumes all key-value-pairs are available at creation time. So yes, if you don't reallocate when making the dict perfect then it could make sense to combine it with the scheme discussed in this thread. If one does leave some free slots open there's some probability of an insertion working without complete repacking, but the probability is smaller than with a normal dict. 
Hybrid schemes and trade-offs in this direction could be possible. > > Also, I'm not sure how 2 levels of indirection come into it. The > algorithm you describe has, as I understand it, only up to 12 > perturbation values ("bins"), so it's a constant space overhead, not a > variable one. What's more, you can possibly avoid the extra memory > access by using a different perfect hashing algorithm, at the cost of > a slower optimization step or using a little more memory. I said there are k perturbation values; you need an additional array some_int_t d[k] where some_int_t is large enough to hold n (the number of entries). Just like what's proposed in this thread. The paper recommends k > 2*n, but in my experiments I could get away with k = n in 99.9% of the cases (given perfect entropy in the hashes...). So the overhead is roughly the same as what's proposed here. I think the most promising thing would be to always have a single integer table and either use it for indirection (usual case) or perfect hash function parameters (say, after a pack() method has been called and before new insertions). >> Note: I'm NOT suggesting the use of perfect hashing, just making >> sure its existence is mentioned and that one is aware that if >> always-ordered dicts become the language standard it precludes >> this option far off in the future. > > Not really. It means that some forms of perfect hashing might require > adding a few more ints worth of overhead for the dictionaries that use > it. If it's really a performance benefit for very-frequently-used > dictionaries, that might still be worthwhile. > As mentioned above the overhead is larger. I think the main challenge is to switch to a hashing scheme with larger entropy for strings, like murmurhash3. Having lots of zero bits in the string for short strings will kill the scheme, since it needs several attempts to succeed (the "r" parameter). So string hashing is slowed down a bit (given the caching I don't know how important this is). Ideally one should make sure the hashes are 64-bit on 64-bit platforms too (IIUC long is 32-bit on Windows but I don't know Windows well). But the main reason I say I'm not proposing it is I don't have time to code it up for demonstration and people like to have something to look at when they get proposals :-) Dag Sverre From janzert at janzert.com Thu Dec 13 08:32:53 2012 From: janzert at janzert.com (Janzert) Date: Thu, 13 Dec 2012 02:32:53 -0500 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: <50C9781D.9090200@g.nevcal.com> References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> <50C932B8.9060100@g.nevcal.com> <50C9781D.9090200@g.nevcal.com> Message-ID: On 12/13/2012 1:39 AM, Glenn Linderman wrote: > On 12/12/2012 6:10 PM, Janzert wrote: >> On 12/12/2012 8:43 PM, Glenn Linderman wrote: >>> On 12/12/2012 5:36 PM, Brian Curtin wrote: >>>> >>>> >> C:\ProgramData\Python >>>> >>> >>> ^^^^^ That. Is not the path that the link below is talking >>> about, though. >>> >> >> It actually does; it is rather confusing though. :/ > > I agree with the below. But I have never seen a version of Windows on > which c:\ProgramData was the actual path for FOLDERID_ProgramData. Can > you reference documentation that states that it was there, for some > version?
This documentation speaks of: > > c:\Documents and Settings\AllUsers\Application Data (which I knew from > XP, and I think 2000, not sure I remember NT) > > In Vista.0, Vista.1, and Vista.2, I guess it is moved to > C:\users\AllUsers\AppData\Roaming (typically). > > Neither of those would result in C:\ProgramData\Python. > The SO answer links to the KNOWNFOLDERID docs; the relevant entry specifically is at http://msdn.microsoft.com/en-us/library/windows/desktop/dd378457.aspx#FOLDERID_ProgramData which gives the default path as, %ALLUSERSPROFILE% (%ProgramData%, %SystemDrive%\ProgramData) checking on my local Windows 7 install gives: C:\>echo %ALLUSERSPROFILE% C:\ProgramData C:\>echo %ProgramData% C:\ProgramData >> It's referring to KNOWNFOLDERID constant FOLDERID_ProgramData. The >> actual on-disk location for this has changed over Windows versions. As >> noted below in the SO link given: >> >> "Note that this documentation refers to the typical path as per older >> versions of Windows. In modern versions of Windows it is located in >> %SystemDrive%\ProgramData." >> >>>> > >>>> > >>>> > Making a new top-level directory without asking is obnoxious. >>>> >>>> See >>>> http://stackoverflow.com/questions/9518890/what-is-the-significance-programdata-in-windows >>>> > > > From tjreedy at udel.edu Thu Dec 13 09:22:59 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 13 Dec 2012 03:22:59 -0500 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> Message-ID: On 12/13/2012 1:06 AM, Lennart Regebro wrote: > On Thu, Dec 13, 2012 at 2:24 AM, Terry Reedy wrote: >> Or ask the user where to put it. > > If we ask where it should be installed, then we need a registry > setting for that Right. > So I think that asking is not an option at all. It either goes in > %PROGRAMDATA%\Python\zoneinfo or it's not shared at all. If that works for all XP+ versions, fine. > >> I know where I would choose, and it would >> not be on my C drive. Un-installers would not delete (unless a reference >> count were kept and were decremented to 0). > > True, and that's annoying when those counters go wrong. It seems to me that Windows has a mechanism for this, at least in some versions. But maybe it only works for DLLs. > All in all I would say I would prefer to install this per Python. Then explicit update requires multiple downloads or copying. This is a violation of DRY. If it is not too large, it would not hurt to never delete it. -- Terry Jan Reedy From v+python at g.nevcal.com Thu Dec 13 09:38:01 2012 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 13 Dec 2012 00:38:01 -0800 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> <50C932B8.9060100@g.nevcal.com> <50C9781D.9090200@g.nevcal.com> Message-ID: <50C993E9.8060204@g.nevcal.com> On 12/12/2012 11:32 PM, Janzert wrote: > On 12/13/2012 1:39 AM, Glenn Linderman wrote: >> On 12/12/2012 6:10 PM, Janzert wrote: >>> On 12/12/2012 8:43 PM, Glenn Linderman wrote: >>>> On 12/12/2012 5:36 PM, Brian Curtin wrote: >>>>> >>>>> >> C:\ProgramData\Python >>>>> >>>> >>>> ^^^^^ That. Is not the path that the link below is talking >>>> about, though. >>>> >>> >>> It actually does; it is rather confusing though. :/ >> >> I agree with the below.
But I have never seen a version of Windows on >> which c:\ProgramData was the actual path for FOLDERID_ProgramData. Can >> you reference documentation that states that it was there, for some >> version? This documentation speaks of: >> >> c:\Documents and Settings\AllUsers\Application Data (which I knew from >> XP, and I think 2000, not sure I remember NT) >> >> In Vista.0, Vista.1, and Vista.2, I guess it is moved to >> C:\users\AllUsers\AppData\Roaming (typically). >> >> Neither of those would result in C:\ProgramData\Python. >> The SO answer links to the KNOWNFOLDERID docs; the relevant entry > specifically is at > > http://msdn.microsoft.com/en-us/library/windows/desktop/dd378457.aspx#FOLDERID_ProgramData > > which gives the default path as, > > %ALLUSERSPROFILE% (%ProgramData%, %SystemDrive%\ProgramData) > > checking on my local Windows 7 install gives: > > C:\>echo %ALLUSERSPROFILE% > C:\ProgramData > > C:\>echo %ProgramData% > C:\ProgramData Interesting. It _did_ say something about data that is not specific to a user... and yet I overlooked that. Those environment variable settings are, indeed, on my Win 7 machine, so I have erred and apologize. That said, the directory C:\ProgramData does NOT exist on my Win 7 machine, so it appears that VERY LITTLE software actually uses that setting. (I have nearly a hundred free and commercial packages installed on this machine. Not that 100 is a large percentage of the available software for Windows, but if the use were common, 100 packages would be likely to contain one that used it, eh?). Thanks for the education, especially because you had to beat it into my skull! -------------- next part -------------- An HTML attachment was scrubbed... URL: From regebro at gmail.com Thu Dec 13 10:07:34 2012 From: regebro at gmail.com (Lennart Regebro) Date: Thu, 13 Dec 2012 10:07:34 +0100 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> Message-ID: On Thu, Dec 13, 2012 at 9:22 AM, Terry Reedy wrote: > On 12/13/2012 1:06 AM, Lennart Regebro wrote: >> All in all I would say I would prefer to install this per Python. > > Then explicit update requires multiple downloads or copying. This is a > violation of DRY. If it is not too large, it would not hurt to never delete > it. Yes, but this is no different than if you want to keep any library updated over multiple Python versions. And I don't want to invent another installation procedure that works for just this, or have a little script that checks periodically for updates only for this, adding to the plethora of update checkers on Windows already. You either keep your Python and its libraries updated or you do not, I don't think this is any different, and I think it should have the exact same mechanisms and functions as all other third-party PyPI packages. //Lennart From solipsis at pitrou.net Thu Dec 13 12:06:18 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 13 Dec 2012 12:06:18 +0100 Subject: [Python-Dev] Draft PEP for time zone support. References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> Message-ID: <20121213120618.5ee39f51@pitrou.net> Le Thu, 13 Dec 2012 10:07:34 +0100, Lennart Regebro a écrit : > On Thu, Dec 13, 2012 at 9:22 AM, Terry Reedy wrote: > > On 12/13/2012 1:06 AM, Lennart Regebro wrote: > >> All in all I would say I would prefer to install this per Python.
> > > > Then explicit update requires multiple downloads or copying. This > > is a violation of DRY. If it is not too large, it would not hurt to > > never delete it. > > Yes, but this is no different than if you want to keep any library > updated over multiple Python versions. And I don't want to invent > another installation procedure that works for just this, or have a > little script that checks periodically for updates only for this, > adding to the plethora of update checkers on Windows already. You > either keep your Python and its libraries updated or you do not, I > don't think this is any different, and I think it should have the > exact same mechanisms and functions as all other third-party PyPI > packages. Agreed. This doesn't warrant special-casing. Regards Antoine. From christian at python.org Thu Dec 13 12:34:31 2012 From: christian at python.org (Christian Heimes) Date: Thu, 13 Dec 2012 12:34:31 +0100 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> Message-ID: <50C9BD47.6050305@python.org> Am 13.12.2012 10:07, schrieb Lennart Regebro: > Yes, but this is no different than if you want to keep any library > updated over multiple Python versions. And I don't want to invent > another installation procedure that works for just this, or have a > little script that checks periodically for updates only for this, > adding to the plethora of update checkers on Windows already. You > either keep your Python and its libraries updated or you do not, I > don't think this is any different, and I think it should have the > exact same mechanisms and functions as all other third-party PyPI > packages. +1 This PEP does fine without any auto-update feature. Please let Lennart concentrate on the task at hand. If an auto-update system is still wanted, it can and should be designed by somebody else as a separate PEP. IMHO it's not Lennart's obligation to do so. Christian From benjamin at python.org Thu Dec 13 15:02:34 2012 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 13 Dec 2012 09:02:34 -0500 Subject: [Python-Dev] cpython: expose TCP_FASTOPEN and MSG_FASTOPEN In-Reply-To: <20121213081801.0255e430@pitrou.net> References: <3YMKyG14mkzRnn@mail.python.org> <20121213081801.0255e430@pitrou.net> Message-ID: 2012/12/13 Antoine Pitrou : > On Thu, 13 Dec 2012 04:24:54 +0100 (CET) > benjamin.peterson wrote: >> http://hg.python.org/cpython/rev/5435a9278028 >> changeset: 80834:5435a9278028 >> user: Benjamin Peterson >> date: Wed Dec 12 22:24:47 2012 -0500 >> summary: >> expose TCP_FASTOPEN and MSG_FASTOPEN >> >> files: >> Misc/NEWS | 3 +++ >> Modules/socketmodule.c | 7 ++++++- >> 2 files changed, 9 insertions(+), 1 deletions(-) >> >> >> diff --git a/Misc/NEWS b/Misc/NEWS >> --- a/Misc/NEWS >> +++ b/Misc/NEWS >> @@ -163,6 +163,9 @@ >> Library >> ------- >> >> +- Expose the TCP_FASTOPEN and MSG_FASTOPEN flags in socket when they're >> + available. > > This should be documented, no? Other similar constants are documented only by TCP_* and a suggestion to look at the manpage. -- Regards, Benjamin From tjreedy at udel.edu Thu Dec 13 21:45:59 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 13 Dec 2012 15:45:59 -0500 Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: 
References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com>
Message-ID: 

On 12/13/2012 4:07 AM, Lennart Regebro wrote:
> On Thu, Dec 13, 2012 at 9:22 AM, Terry Reedy wrote:
>> On 12/13/2012 1:06 AM, Lennart Regebro wrote:
>>> All in all I would say I would prefer to install this per Python.
>>
>> Then explicit update requires multiple downloads or copying. This is a
>> violation of DRY. If it is not too large, it would not hurt to never delete
>> it.
>
> Yes, but this is no different than if you want to keep any library
> updated over multiple Python versions.

How I do that for my multi-version packages is to put them in a separate
'python' directory and put python.pth with the path to that directory in
the various site-packages directories. Any change to the *one* copy is
available to all versions and all will operate the same if the code is
truly multi-version. When I installed 3.3, I copied python.pth into its
site-packages and was ready to go.

> And I don't want to invent another installation procedure
> that works for just this,

An email or so ago, you said that the tz database should go in
C:\programdata (which currently does not exist on my machine either).
That would be a new, invented installation procedure.

> or have a little script that checks periodically
> for updates only for this,
> adding to the plethora of update checkers on Windows already.

I *never* suggested this. In fact, I said that installing an updated
database (available to all Python versions) with each release would be
sufficient for nearly everyone on Windows.

> either keep your Python and its libraries updated or you do not; I
> don't think this is any different, and I think it should have the
> exact same mechanisms and functions as all other third-party PyPI
> packages.

When I suggested that users be able to put the database where they want,
*just like with any other third-party PyPI package*, you are the one who
said no, this should be special cased.

The situation is this: most *nixes have or can have one system tz
database. Python code that uses it will give the same answer regardless
of the Python version. Windows apparently does not have such a thing. So
we can a) not use the tz database in the stdlib because it would not
work on Windows (the de facto current situation); b) use it but let the
functions fail on Windows; c) install a different version of the
database with each Python installation, that can only be used by that
installation, so that results may depend on the Python version. (This
seems to be what you are now proposing, and if bugfix releases update
the data only for that version, could result in earlier versions giving
more accurate answers.); d) install one database at a time so all Python
versions give the same answer, just as on *nix.

-- 
Terry Jan Reedy

From rosuav at gmail.com  Thu Dec 13 21:57:52 2012
From: rosuav at gmail.com (Chris Angelico)
Date: Fri, 14 Dec 2012 07:57:52 +1100
Subject: [Python-Dev] Downloads page: Which version of Python should be listed first?
Message-ID: 

The default version shown on http://docs.python.org/ is now 3.3.0,
which I think is a Good Thing. However, http://python.org/download/
puts 2.7 first, and says:

"""If you don't know which version to use, start with Python 2.7; more
existing third party software is compatible with Python 2 than Python
3 right now."""

Firstly, is this still true? (I wouldn't have a clue.)
And secondly, would this be better worded as "one's better but the
other's a good fall-back"? Something like:

"""Don't know which version to use? Python 3.3 is the recommended
version for new projects, but much existing software is compatible
with Python 2."""

I only ever send people there to learn about programming, not to get a
dependency for an existing codebase, so I don't know what is actually
used.

ChrisA

From rosslagerwall at gmail.com  Thu Dec 13 22:14:51 2012
From: rosslagerwall at gmail.com (Ross Lagerwall)
Date: Thu, 13 Dec 2012 21:14:51 +0000
Subject: [Python-Dev] Downloads page: Which version of Python should be listed first?
In-Reply-To: 
References: 
Message-ID: <20121213211451.GA21697@hobo.wolfson.cam.ac.uk>

On Fri, Dec 14, 2012 at 07:57:52AM +1100, Chris Angelico wrote:
> The default version shown on http://docs.python.org/ is now 3.3.0,
> which I think is a Good Thing. However, http://python.org/download/
> puts 2.7 first, and says:
>
> """If you don't know which version to use, start with Python 2.7; more
> existing third party software is compatible with Python 2 than Python
> 3 right now."""
>
> Firstly, is this still true? (I wouldn't have a clue.) And secondly,
> would this be better worded as "one's better but the other's a good
> fall-back"? Something like:
>
> """Don't know which version to use? Python 3.3 is the recommended
> version for new projects, but much existing software is compatible
> with Python 2."""
>

I would say listing 3.3 as the recommended version to use is a good
thing, especially as distros like Ubuntu and Fedora transition to
Python 3. It also makes sense, given that the docs default to 3.3.

-- 
Ross Lagerwall

From a.cavallo at cavallinux.eu  Fri Dec 14 01:10:45 2012
From: a.cavallo at cavallinux.eu (Antonio Cavallo)
Date: Fri, 14 Dec 2012 00:10:45 +0000
Subject: [Python-Dev] [Distutils] Is is worth disentangling distutils?
In-Reply-To: 
References: <50C58DA5.3000307@cavallinux.eu>
Message-ID: <50CA6E85.9060900@cavallinux.eu>

I'll have a look into distutils2, I thought it was (another) dead end.
In every case my target is py2k (2.7.x) and I've no case for
transitioning to py3k (too much risk).

Lennart Regebro wrote:
> On Mon, Dec 10, 2012 at 8:22 AM, Antonio Cavallo
> wrote:
>> Hi,
>> I wonder if it is worth/if there is any interest in trying to "clean" up
>> distutils: nothing in terms of adding new features, just a *major* cleanup
>> retaining the exact same interface.
>>
>>
>> I'm not planning anything like *adding features* or rewriting rpm/rpmbuild
>> here, simply cleaning up that un-holy code mess. Yes it served well, don't
>> get me wrong, and I think it did work much better than anything it was meant
>> to replace.
>>
>> I'm not into py3 at all, so I wonder how it could possibly fit
>> into/collide with the big plan.
>>
>> Or will I be wasting my time?
>
> The effort of making something that replaces distutils is, as far as I
> can understand, currently on the level of taking the best bits out of
> distutils2 and putting it into Python 3.4 under the name "packaging".
> I'm sure that effort can use more help.
>
> //Lennart

From ncoghlan at gmail.com  Fri Dec 14 02:06:41 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 14 Dec 2012 11:06:41 +1000
Subject: [Python-Dev] [Distutils] Is is worth disentangling distutils?
In-Reply-To: <50CA6E85.9060900@cavallinux.eu>
References: <50C58DA5.3000307@cavallinux.eu> <50CA6E85.9060900@cavallinux.eu>
Message-ID: 

On Fri, Dec 14, 2012 at 10:10 AM, Antonio Cavallo wrote:
> I'll have a look into distutils2, I thought it was (another) dead end.
> In every case my target is py2k (2.7.x) and I've no case for
> transitioning to py3k (too much risk).

distutils2 started as a copy of distutils, so it's hard to tell the
difference between the parts which have been fixed and the parts which
are still just distutils legacy components (this is why the merge back
was dropped from 3.3 - too many pieces simply weren't ready and simply
would have perpetuated problems inherited from distutils).

distlib (https://distlib.readthedocs.org/en/latest/overview.html) is a
successor project that takes a different view of building up the low
level pieces without inheriting the bad parts of the distutils legacy (a
problem suffered by both setuptools/distribute and distutils2). distlib
also runs natively on both 2.x and 3.x, as the idea is that these
interoperability standards should be well supported in *current* Python
versions, not just those where the stdlib has caught up (i.e. now 3.4 at
the earliest)

The aim is to get to a situation more like that with wsgiref, where the
stdlib defines the foundation and key APIs and data formats needed for
interoperability, while allowing a flourishing ecosystem of
user-oriented tools (like pip, bento, zc.buildout, etc) that still solve
the key problems addressed by setuptools/distribute without the opaque
and hard to extend distutils core that can make the existing tools
impossible to debug when they go wrong.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From trent at snakebite.org  Fri Dec 14 02:21:24 2012
From: trent at snakebite.org (Trent Nelson)
Date: Thu, 13 Dec 2012 20:21:24 -0500
Subject: [Python-Dev] Mercurial workflow question...
Message-ID: <20121214012047.GA46660@snakebite.org>

Scenario: I'm working on a change that I want to actively test on a
bunch of Snakebite hosts. Getting the change working is going to be an
iterative process -- lots of small commits trying to attack the problem
one little bit at a time.

Eventually I'll get to a point where I'm happy with the change. So,
it's time to do all the necessary cruft that needs to be done before
making the change public. Updating docs, tweaking style, Misc/NEWS,
etc. That'll involve at least a few more commits.

Most changes will also need to be merged to other branches, so that
needs to be taken care of. (And it's a given that I would have been
pulling and merging from hg.p.o/cpython during the whole process.)

Then, finally, it's time to push.

Now, if I understand how Mercurial works correctly, using the above
workflow will result in all those little intermediate hacky commits
being forever preserved in the global/public cpython repo. I will have
polluted the history of all affected files with all my changes.

That just doesn't "feel" right. But, it appears as though it's an
intrinsic side-effect of how Mercurial works. With git, you have a bit
more flexibility to affect how your final public commits look via merge
fast-forwarding. Subversion gives you the ultimate control of how your
final commit looks (albeit at the expense of having to do the merging
in a much more manual fashion).

As I understand it, even if I contain all my intermediate commits in a
server-side cloned repo, that doesn't really change anything; all
commits will eventually be reflected in cpython via the final
`hg push`.
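(As an illustrative aside, a minimal transcript of the situation being
described -- the file name and commit messages here are hypothetical:

    $ hg clone ssh://hg@hg.python.org/cpython work
    $ cd work
    $ echo tweak >> Lib/foo.py && hg commit -m "wip: first attempt"
    $ echo fix >> Lib/foo.py && hg commit -m "wip: second attempt"
    $ hg push    # both wip changesets land, permanently, in cpython

Once pushed, those changesets are part of the public history and can no
longer be rewritten.)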
As I understand it, even if I contain all my intermediate commits in a server-side cloned repo, that doesn't really change anything; all commits will eventually be reflected in cpython via the final `hg push`. So, my first question is this: is this actually a problem? Is the value I'm placing on "pristine" log histories misplaced in the DVCS world? Do we, as a collective, care? I can think of two alternate approaches I could use: - Use a common NFS mount for each source tree on every Snakebite box (and coercing each build to be done in a separate area). Get everything perfect and then do a single commit of all changes. The thing I don't like about this approach is that I can't commit/rollback/tweak/bisect intermediate commits as I go along -- some changes are complex and take a few attempts to get right. - Use a completely separate clone to house all the intermediate commits, then generate a diff once the final commit is ready, then apply that diff to the main cpython repo, then push that. This approach is fine, but it seems counter-intuitive to the whole concept of DVCS. Thoughts? Trent. From larry at hastings.org Fri Dec 14 03:02:54 2012 From: larry at hastings.org (Larry Hastings) Date: Thu, 13 Dec 2012 18:02:54 -0800 Subject: [Python-Dev] Mercurial workflow question... In-Reply-To: <20121214012047.GA46660@snakebite.org> References: <20121214012047.GA46660@snakebite.org> Message-ID: <50CA88CE.80705@hastings.org> On 12/13/2012 05:21 PM, Trent Nelson wrote: > Thoughts? % hg help rebase //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Dec 14 03:36:01 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 14 Dec 2012 12:36:01 +1000 Subject: [Python-Dev] Mercurial workflow question... In-Reply-To: <50CA88CE.80705@hastings.org> References: <20121214012047.GA46660@snakebite.org> <50CA88CE.80705@hastings.org> Message-ID: On Fri, Dec 14, 2012 at 12:02 PM, Larry Hastings wrote: > On 12/13/2012 05:21 PM, Trent Nelson wrote: > > Thoughts? > > > % hg help rebase > And also the histedit extension (analagous to "git rebase -i"). Both Git and Hg recognise there is a difference between interim commits and ones you want to publish and provide tools to revise a series of commits into a simpler set for publication to an official repo. The difference is that in Git this is allowed by default for all branches (which can create fun and games if someone upstream of you edits the history of you branch you used as a base for your own work), while Hg makes a distinction between different phases (secret -> draft -> public) and disallows operations that rewrite history if they would affect public changesets. So the challenge with Mercurial over Git is ensuring the relevant branches stay in "draft" mode locally even though you want to push them to a server-side clone for distribution to the build servers. I know one way to do that would be to ask that the relevant clone be switched to non-publishing mode (see http://mercurial.selenic.com/wiki/Phases#Publishing_Repository). I don't know if there's another way to do it without altering the config on the server. General intro to phases: http://www.logilab.org/blogentry/88203 Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ericsnowcurrently at gmail.com Fri Dec 14 03:38:51 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 13 Dec 2012 19:38:51 -0700 Subject: [Python-Dev] Downloads page: Which version of Python should be listed first? In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 1:57 PM, Chris Angelico wrote: > The default version shown on http://docs.python.org/ is now 3.3.0, > which I think is a Good Thing. However, http://python.org/download/ > puts 2.7 first, and says: > > """If you don't know which version to use, start with Python 2.7; more > existing third party software is compatible with Python 2 than Python > 3 right now.""" > > Firstly, is this still true? (I wouldn't have a clue.) Nope: http://py3ksupport.appspot.com/ http://python3wos.appspot.com/ (plone and zope skew the results) -eric From rdmurray at bitdance.com Fri Dec 14 03:48:23 2012 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 13 Dec 2012 21:48:23 -0500 Subject: [Python-Dev] Mercurial workflow question... In-Reply-To: <20121214012047.GA46660@snakebite.org> References: <20121214012047.GA46660@snakebite.org> Message-ID: <20121214024824.3BCCC2500B2@webabinitio.net> On Thu, 13 Dec 2012 20:21:24 -0500, Trent Nelson wrote: > - Use a completely separate clone to house all the intermediate > commits, then generate a diff once the final commit is ready, > then apply that diff to the main cpython repo, then push that. > This approach is fine, but it seems counter-intuitive to the > whole concept of DVCS. Perhaps. But that's exactly what I did with the email package changes for 3.3. You seem to have a tension between "all those dirty little commits" and "clean history" and the fact that a dvcs is designed to preserve all those commits...if you don't want those intermediate commits in the official repo, then why is a diff/patch a bad way to achieve that? If you keep your pulls up to date in your feature repo, the diff/patch process is simple and smooth. The repo I worked on the email features in is still available, too, if anyone is crazy enough to want to know about those intermediate steps... --David From chris.jerdonek at gmail.com Fri Dec 14 04:00:39 2012 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Thu, 13 Dec 2012 19:00:39 -0800 Subject: [Python-Dev] Mercurial workflow question... In-Reply-To: <20121214024824.3BCCC2500B2@webabinitio.net> References: <20121214012047.GA46660@snakebite.org> <20121214024824.3BCCC2500B2@webabinitio.net> Message-ID: On Thu, Dec 13, 2012 at 6:48 PM, R. David Murray wrote: > On Thu, 13 Dec 2012 20:21:24 -0500, Trent Nelson wrote: >> - Use a completely separate clone to house all the intermediate >> commits, then generate a diff once the final commit is ready, >> then apply that diff to the main cpython repo, then push that. >> This approach is fine, but it seems counter-intuitive to the >> whole concept of DVCS. > > Perhaps. But that's exactly what I did with the email package changes > for 3.3. > > You seem to have a tension between "all those dirty little commits" and > "clean history" and the fact that a dvcs is designed to preserve all > those commits...if you don't want those intermediate commits in the > official repo, then why is a diff/patch a bad way to achieve that? Right. And you usually have to do this beforehand anyways to upload your changes to the tracker for review. 
Also, for the record (not that anyone has said anything to the
contrary), our dev guide says, "You should collapse changesets of a
single feature or bugfix before pushing the result to the main
repository. The reason is that we don't want the history to be full of
intermediate commits recording the private history of the person
working on a patch. If you are using the rebase extension, consider
adding the --collapse option to hg rebase. The collapse extension is
another choice."

(from http://docs.python.org/devguide/committing.html#working-with-mercurial )

--Chris

From barry at python.org  Fri Dec 14 01:59:27 2012
From: barry at python.org (Barry Warsaw)
Date: Thu, 13 Dec 2012 19:59:27 -0500
Subject: [Python-Dev] Mercurial workflow question...
References: <20121214012047.GA46660@snakebite.org> <50CA88CE.80705@hastings.org>
Message-ID: <20121213195927.1ced5464@resist.wooz.org>

On Dec 14, 2012, at 12:36 PM, Nick Coghlan wrote:

>Both Git and Hg recognise there is a difference between interim commits and
>ones you want to publish and provide tools to revise a series of commits
>into a simpler set for publication to an official repo.

One of the things I love about Bazaar is that it has a concept of "main
line of development" that usually makes all this hand-wringing a
non-issue.

When I merge my development branch, with all its interim commits, into
trunk, all those revisions go with it. But it never matters because when
you view history (and bisect, etc.) on trunk, you see the merge as one
commit. Sure, you can descend into the right-hand side if you want to
see all those sub-commits, and the graphical tools allow you to expand
them fairly easily, but usually you just ignore them.

Nothing's completely for free of course, and having a main line of
development does mean you have to be careful about merge directionality,
but that's generally something you ingrain in your workflow once, and
then forget about it.

The bottom line is that Bazaar users rarely feel the need to rebase,
even though you can if you want to.

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From nad at acm.org  Fri Dec 14 07:11:49 2012
From: nad at acm.org (Ned Deily)
Date: Thu, 13 Dec 2012 22:11:49 -0800
Subject: [Python-Dev] Mercurial workflow question...
References: <20121214012047.GA46660@snakebite.org> <20121214024824.3BCCC2500B2@webabinitio.net>
Message-ID: 

In article <20121214024824.3BCCC2500B2 at webabinitio.net>,
 "R. David Murray" wrote:
> On Thu, 13 Dec 2012 20:21:24 -0500, Trent Nelson wrote:
> > - Use a completely separate clone to house all the intermediate
> >   commits, then generate a diff once the final commit is ready,
> >   then apply that diff to the main cpython repo, then push that.
> >   This approach is fine, but it seems counter-intuitive to the
> >   whole concept of DVCS.
>
> Perhaps. But that's exactly what I did with the email package changes
> for 3.3.
>
> You seem to have a tension between "all those dirty little commits" and
> "clean history" and the fact that a dvcs is designed to preserve all
> those commits...if you don't want those intermediate commits in the
> official repo, then why is a diff/patch a bad way to achieve that? If
> you keep your pulls up to date in your feature repo, the diff/patch
> process is simple and smooth.
Also, if you prefer to go the patch route, hg provides the mq extension
(inspired by quilt) to simplify managing patches, including version
controlling the patches. I find it much easier to deal that way with
maintenance changes that may have a non-trivial gestation period.

-- 
 Ned Deily,
 nad at acm.org

From stephen at xemacs.org  Fri Dec 14 08:14:29 2012
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 14 Dec 2012 16:14:29 +0900
Subject: [Python-Dev] Mercurial workflow question...
In-Reply-To: <20121214024824.3BCCC2500B2@webabinitio.net>
References: <20121214012047.GA46660@snakebite.org> <20121214024824.3BCCC2500B2@webabinitio.net>
Message-ID: <877golyuju.fsf@uwakimon.sk.tsukuba.ac.jp>

R. David Murray writes:

> those commits...if you don't want those intermediate commits in the
> official repo, then why is a diff/patch a bad way to achieve that?

Because a decent VCS provides TOOWTDI. And sometimes there are
different degrees of "intermediate", or perhaps you even want to slice,
dice, and mince the patches at the hunk level. Presenting the logic of
a change is often best done in pieces, in an ahistorical way, while
debugging often benefits from the context of an exact sequential
history.

That said, diff/patch across repos is not per se evil, and may be
easier for users to visualize than the results of the DAG
transformations (such as rebase) provided by existing dVCSes.

From greg at krypto.org  Fri Dec 14 08:27:41 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Thu, 13 Dec 2012 23:27:41 -0800
Subject: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no
In-Reply-To: <20121211081627.0f0235e1@pitrou.net>
References: <3YL4HM367BzRYL@mail.python.org> <20121211081627.0f0235e1@pitrou.net>
Message-ID: 

On Mon, Dec 10, 2012 at 11:16 PM, Antoine Pitrou wrote:
> On Tue, 11 Dec 2012 03:05:19 +0100 (CET)
> gregory.p.smith wrote:
> > Using 'long double' to force this structure to be worst case aligned
> > is no longer required as of Python 2.5+ when the gc_refs changed from
> > an int (4 bytes) to a Py_ssize_t (8 bytes) as the minimum size is 16
> > bytes.
> >
> > The use of a 'long double' triggered a warning by Clang trunk's
> > Undefined-Behavior Sanitizer as on many platforms a long double
> > requires 16-byte alignment but the Python memory allocator only
> > guarantees 8 byte alignment.
> >
> > So our code would allocate and use these structures with technically
> > improper alignment. Though it didn't matter since the 'dummy' field
> > is never used. This silences that warning.
> >
> > Spelunking into code history, the double was added in 2001 to force
> > better alignment on some platforms and changed to a long double in
> > 2002 to appease Tru64. That issue should no longer be present since
> > the upgrade from int to Py_ssize_t where the minimum structure size
> > increased to 16 (unless anyone knows of a platform where ssize_t is 4
> > bytes?)
>
> What?? Every 32-bit platform has a 4 bytes ssize_t (and size_t).

No they don't.

size_t and ssize_t exist in large part because they are often larger
than an int or long on 32bit platforms. They are 64-bit on Linux
regardless of platform (i think there is a way to force a compile in
ancient mode that forces them and the APIs being used to be 32-bit
size_t variants but nobody does that).

> > We can probably get rid of the double and this union hack all together
> > today. That is a slightly more invasive change that can be left for
> > later.
>
> How do you suggest to get rid of it?
> Some platforms still have strict
> alignment rules and we must enforce that PyObjects (*) are always
> aligned to the largest possible alignment, since a PyObject-derived
> struct can hold arbitrary C types.
>
> (*) GC-enabled PyObjects, anyway. Others will be naturally aligned
> thanks to the memory allocator.
>
>
> What's more, I think you shouldn't be doing this kind of change in a
> bugfix release. It might break compiled C extensions since you are
> changing some characteristics of object layout (although you would
> probably only break those extensions which access the GC header, which
> is probably not many of them). Resource consumption improvements
> generally go only into the next feature release.

This isn't a resource consumption improvement. It is a compilation
correctness change with zero impact on the generated code or ABI
compatibility before and after. The structure, as defined, was flagged
as problematic by Clang's undefined behavior sanitizer because it
contains a 'long double' which requires 16-byte alignment but Python's
own memory allocator was using an 8 byte boundary.

So changing the definition of the dummy side of the union makes zero
difference to already compiled code as it (a) doesn't change the
structure's size and (b) all existing implementations already align
these on an 8 byte boundary.

-gps
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From greg at krypto.org  Fri Dec 14 08:48:12 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Thu, 13 Dec 2012 23:48:12 -0800
Subject: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no
In-Reply-To: 
References: <3YL4HM367BzRYL@mail.python.org> <20121211081627.0f0235e1@pitrou.net>
Message-ID: 

On Thu, Dec 13, 2012 at 11:27 PM, Gregory P. Smith wrote:
>
> On Mon, Dec 10, 2012 at 11:16 PM, Antoine Pitrou wrote:
>
>> On Tue, 11 Dec 2012 03:05:19 +0100 (CET)
>> gregory.p.smith wrote:
>> > Using 'long double' to force this structure to be worst case aligned
>> > is no longer required as of Python 2.5+ when the gc_refs changed from
>> > an int (4 bytes) to a Py_ssize_t (8 bytes) as the minimum size is 16
>> > bytes.
>> >
>> > The use of a 'long double' triggered a warning by Clang trunk's
>> > Undefined-Behavior Sanitizer as on many platforms a long double
>> > requires 16-byte alignment but the Python memory allocator only
>> > guarantees 8 byte alignment.
>> >
>> > So our code would allocate and use these structures with technically
>> > improper alignment. Though it didn't matter since the 'dummy' field
>> > is never used. This silences that warning.
>> >
>> > Spelunking into code history, the double was added in 2001 to force
>> > better alignment on some platforms and changed to a long double in
>> > 2002 to appease Tru64. That issue should no longer be present since
>> > the upgrade from int to Py_ssize_t where the minimum structure size
>> > increased to 16 (unless anyone knows of a platform where ssize_t is 4
>> > bytes?)
>>
>> What?? Every 32-bit platform has a 4 bytes ssize_t (and size_t).
>>
>
> No they don't.
>
> size_t and ssize_t exist in large part because they are often larger
> than an int or long on 32bit platforms. They are 64-bit on Linux
> regardless of platform (i think there is a way to force a compile in
> ancient mode that forces them and the APIs being used to be 32-bit
> size_t variants but nobody does that).
>
>> > We can probably get rid of the double and this union hack all
>> > together today.
>> > That is a slightly more invasive change that can be left for later.
>>
>> How do you suggest to get rid of it? Some platforms still have strict
>> alignment rules and we must enforce that PyObjects (*) are always
>> aligned to the largest possible alignment, since a PyObject-derived
>> struct can hold arbitrary C types.
>>
>> (*) GC-enabled PyObjects, anyway. Others will be naturally aligned
>> thanks to the memory allocator.
>>
>>
>> What's more, I think you shouldn't be doing this kind of change in a
>> bugfix release. It might break compiled C extensions since you are
>> changing some characteristics of object layout (although you would
>> probably only break those extensions which access the GC header, which
>> is probably not many of them). Resource consumption improvements
>> generally go only into the next feature release.
>>

BTW - This change was done on tip only. The comment about this being 'in
a bugfix release' is wrong.

While I personally believe this is needed in all of the release branches
I didn't commit this one there *just in case* there is some weird
platform where this change actually makes a difference. I don't believe
such a thing exists in 2012, but as there is no way that is worth my
time for me to find that out, I didn't put it in a bugfix branch.

-gps

> This isn't a resource consumption improvement. It is a compilation
> correctness change with zero impact on the generated code or ABI
> compatibility before and after. The structure, as defined, was flagged
> as problematic by Clang's undefined behavior sanitizer because it
> contains a 'long double' which requires 16-byte alignment but Python's
> own memory allocator was using an 8 byte boundary.
>
> So changing the definition of the dummy side of the union makes zero
> difference to already compiled code as it (a) doesn't change the
> structure's size and (b) all existing implementations already align
> these on an 8 byte boundary.
>
> -gps
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From greg at krypto.org  Fri Dec 14 08:50:59 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Thu, 13 Dec 2012 23:50:59 -0800
Subject: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no
In-Reply-To: <20121211082130.69fbc6c0@pitrou.net>
References: <3YL4HM367BzRYL@mail.python.org> <20121211081627.0f0235e1@pitrou.net> <20121211082130.69fbc6c0@pitrou.net>
Message-ID: 

On Mon, Dec 10, 2012 at 11:21 PM, Antoine Pitrou wrote:
> On Tue, 11 Dec 2012 08:16:27 +0100
> Antoine Pitrou wrote:
>
> > On Tue, 11 Dec 2012 03:05:19 +0100 (CET)
> > gregory.p.smith wrote:
> > > Using 'long double' to force this structure to be worst case aligned
> > > is no longer required as of Python 2.5+ when the gc_refs changed
> > > from an int (4 bytes) to a Py_ssize_t (8 bytes) as the minimum size
> > > is 16 bytes.
> > >
> > > The use of a 'long double' triggered a warning by Clang trunk's
> > > Undefined-Behavior Sanitizer as on many platforms a long double
> > > requires 16-byte alignment but the Python memory allocator only
> > > guarantees 8 byte alignment.
> > >
> > > So our code would allocate and use these structures with technically
> > > improper alignment. Though it didn't matter since the 'dummy' field
> > > is never used. This silences that warning.
> > >
> > > Spelunking into code history, the double was added in 2001 to force
> > > better alignment on some platforms and changed to a long double in
> > > 2002 to appease Tru64.
> > > That issue should no longer be present since the upgrade from int
> > > to Py_ssize_t where the minimum structure size increased to 16
> > > (unless anyone knows of a platform where ssize_t is 4 bytes?)
> >
> > What?? Every 32-bit platform has a 4 bytes ssize_t (and size_t).
> >
> > > We can probably get rid of the double and this union hack all
> > > together today.
> > > That is a slightly more invasive change that can be left for later.
> >
> > How do you suggest to get rid of it? Some platforms still have strict
> > alignment rules and we must enforce that PyObjects (*) are always
> > aligned to the largest possible alignment, since a PyObject-derived
> > struct can hold arbitrary C types.
>
> Ok, I hadn't seen your proposal. I find it reasonable:
>
> “A more correct non-hacky alternative if any alignment issues are still
> found would be to use a compiler specific alignment declaration on the
> structure and determine which value to use at configure time.”
>
>
> However, the commit is still problematic, and I think it should be
> reverted. We can't remove the alignment hack just because it seems to
> be useless on x86(-64).

I didn't remove it. I made it match what our memory allocator is already
doing.

Thanks for reviewing commits in such detail BTW. I do appreciate it.

BTW, I didn't notice your replies until now because you didn't include
me in the to/cc list on the responses. Please do that if you want a
faster response. :)

-gps
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ronaldoussoren at mac.com  Fri Dec 14 09:02:08 2012
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Fri, 14 Dec 2012 09:02:08 +0100
Subject: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no
In-Reply-To: 
References: <3YL4HM367BzRYL@mail.python.org> <20121211081627.0f0235e1@pitrou.net>
Message-ID: 

On 14 Dec, 2012, at 8:27, "Gregory P. Smith" wrote:

>
> On Mon, Dec 10, 2012 at 11:16 PM, Antoine Pitrou wrote:
> On Tue, 11 Dec 2012 03:05:19 +0100 (CET)
> gregory.p.smith wrote:
> > Using 'long double' to force this structure to be worst case aligned
> > is no longer required as of Python 2.5+ when the gc_refs changed from
> > an int (4 bytes) to a Py_ssize_t (8 bytes) as the minimum size is 16
> > bytes.
> >
> > The use of a 'long double' triggered a warning by Clang trunk's
> > Undefined-Behavior Sanitizer as on many platforms a long double
> > requires 16-byte alignment but the Python memory allocator only
> > guarantees 8 byte alignment.
> >
> > So our code would allocate and use these structures with technically
> > improper alignment. Though it didn't matter since the 'dummy' field
> > is never used. This silences that warning.
> >
> > Spelunking into code history, the double was added in 2001 to force
> > better alignment on some platforms and changed to a long double in
> > 2002 to appease Tru64. That issue should no longer be present since
> > the upgrade from int to Py_ssize_t where the minimum structure size
> > increased to 16 (unless anyone knows of a platform where ssize_t is 4
> > bytes?)
>
> What?? Every 32-bit platform has a 4 bytes ssize_t (and size_t).
>
> No they don't.
>
> size_t and ssize_t exist in large part because they are often larger
> than an int or long on 32bit platforms. They are 64-bit on Linux
> regardless of platform (i think there is a way to force a compile in
> ancient mode that forces them and the APIs being used to be 32-bit
> size_t variants but nobody does that).
Are you sure about this? What you describe seems to be loff_t (the
typedef for file offsets), not size_t (the typedef for sizes of memory
blocks). The size of memory blocks is limited to a 32-bit number on
32-bit systems (for the obvious reason).

size_t is 32-bit on at least some platforms:

$ cat t.c
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    printf("sizeof(size_t): %d\n", (int)sizeof(size_t));
    return 0;
}

$ cc -o t t.c -arch i386
$ ./t
sizeof(size_t): 4

This session is on an OSX system, but you'll get the same output on a
32-bit linux system with default compiler settings (I've tested this on
a SLES10 system).

> > We can probably get rid of the double and this union hack all
> > together today.
> > That is a slightly more invasive change that can be left for later.
>
> How do you suggest to get rid of it? Some platforms still have strict
> alignment rules and we must enforce that PyObjects (*) are always
> aligned to the largest possible alignment, since a PyObject-derived
> struct can hold arbitrary C types.
>
> (*) GC-enabled PyObjects, anyway. Others will be naturally aligned
> thanks to the memory allocator.
>
>
> What's more, I think you shouldn't be doing this kind of change in a
> bugfix release. It might break compiled C extensions since you are
> changing some characteristics of object layout (although you would
> probably only break those extensions which access the GC header, which
> is probably not many of them). Resource consumption improvements
> generally go only into the next feature release.
>
> This isn't a resource consumption improvement. It is a compilation
> correctness change with zero impact on the generated code or ABI
> compatibility before and after. The structure, as defined, was flagged
> as problematic by Clang's undefined behavior sanitizer because it
> contains a 'long double' which requires 16-byte alignment but Python's
> own memory allocator was using an 8 byte boundary.
>
> So changing the definition of the dummy side of the union makes zero
> difference to already compiled code as it (a) doesn't change the
> structure's size and (b) all existing implementations already align
> these on an 8 byte boundary.
>
> -gps
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/ronaldoussoren%40mac.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From regebro at gmail.com  Fri Dec 14 09:31:40 2012
From: regebro at gmail.com (Lennart Regebro)
Date: Fri, 14 Dec 2012 09:31:40 +0100
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: 
References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com>
Message-ID: 

OK, so it's been 12 hours with no further discussion, so I'll make an
attempt to summarize what I think are the consensus changes before
updating the PEP.

1. Python will include a timezone database both in the source
distribution and the Windows installer (although I suspect that binary
packages for Linux distributions may skip this, but that's OK).

2. The timezone module becomes datetime.timezone, meaning datetime.py
is moved to datetime/__init__.py

3. get_timezone() will be just timezone() as no voices were raised to
defend get_timezone()

4. The db parameter in timezone() will be renamed db_path

5. is_dst will default to False

6. The UnknownTimeZoneError exception will be just a ValueError
7. The two errors raised when converting timezones will both inherit
from a base exception.

8. A better name for the timezone data package. "tzdata-override" was
suggested, I prefer "tzdata-update" as it is clearer.

//Lennart

From greg at krypto.org  Fri Dec 14 09:40:28 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Fri, 14 Dec 2012 00:40:28 -0800
Subject: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no
In-Reply-To: 
References: <3YL4HM367BzRYL@mail.python.org> <20121211081627.0f0235e1@pitrou.net>
Message-ID: 

On Fri, Dec 14, 2012 at 12:02 AM, Ronald Oussoren wrote:
>
> On 14 Dec, 2012, at 8:27, "Gregory P. Smith" wrote:
>
> On Mon, Dec 10, 2012 at 11:16 PM, Antoine Pitrou wrote:
>
>> On Tue, 11 Dec 2012 03:05:19 +0100 (CET)
>> gregory.p.smith wrote:
>> > Using 'long double' to force this structure to be worst case aligned
>> > is no longer required as of Python 2.5+ when the gc_refs changed from
>> > an int (4 bytes) to a Py_ssize_t (8 bytes) as the minimum size is 16
>> > bytes.
>> >
>> > The use of a 'long double' triggered a warning by Clang trunk's
>> > Undefined-Behavior Sanitizer as on many platforms a long double
>> > requires 16-byte alignment but the Python memory allocator only
>> > guarantees 8 byte alignment.
>> >
>> > So our code would allocate and use these structures with technically
>> > improper alignment. Though it didn't matter since the 'dummy' field
>> > is never used. This silences that warning.
>> >
>> > Spelunking into code history, the double was added in 2001 to force
>> > better alignment on some platforms and changed to a long double in
>> > 2002 to appease Tru64. That issue should no longer be present since
>> > the upgrade from int to Py_ssize_t where the minimum structure size
>> > increased to 16 (unless anyone knows of a platform where ssize_t is 4
>> > bytes?)
>>
>> What?? Every 32-bit platform has a 4 bytes ssize_t (and size_t).
>>
>
> No they don't.
>
> size_t and ssize_t exist in large part because they are often larger
> than an int or long on 32bit platforms. They are 64-bit on Linux
> regardless of platform (i think there is a way to force a compile in
> ancient mode that forces them and the APIs being used to be 32-bit
> size_t variants but nobody does that).
>
>
> Are you sure about this? What you describe seems to be loff_t (the
> typedef for file offsets), not size_t (the typedef for sizes of memory
> blocks). The size of memory blocks is limited to a 32-bit number on
> 32-bit systems (for the obvious reason).
>
> size_t is 32-bit on at least some platforms:
>
> $ cat t.c
> #include <stdio.h>
> #include <stdlib.h>
>
> int main(void)
> {
>     printf("sizeof(size_t): %d\n", (int)sizeof(size_t));
>     return 0;
> }
>
> $ cc -o t t.c -arch i386
> $ ./t
> sizeof(size_t): 4
>
> This session is on an OSX system, but you'll get the same output on a
> 32-bit linux system with default compiler settings (I've tested this on
> a SLES10 system).
>

You are correct. My bad. size_t vs off_t vs my full brain strikes again.

Regardless it doesn't change the correctness of my change. Though I'd
love it if someone would figure out the cross platform compiler macro
based struct alignment incantations to get rid of the need for the union
with dummy all together. It wasn't even clear from the 2002 change
description if changing double to long double over 10 years ago was
actually _fixing_ a bug or was done for no good reason?
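(In the same spirit as the t.c transcript above, the alignments under
discussion can be checked directly where a C11 compiler is available;
the union below is a simplified stand-in for the dummy member, not the
actual PyGC_Head definition:

    $ cat align.c
    #include <stdio.h>
    #include <stdalign.h>

    union dummy { long double ld; double d; };

    int main(void)
    {
        printf("alignof(long double): %zu\n", alignof(long double));
        printf("alignof(double):      %zu\n", alignof(double));
        printf("alignof(union dummy): %zu\n", alignof(union dummy));
        return 0;
    }
    $ cc -std=c11 -o align align.c && ./align

On a typical x86-64 box this prints 16 for the long double variants and
8 for plain double, which is exactly the mismatch with the 8-byte
obmalloc boundary being discussed.)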
Our allocator (obmalloc.c) doesn't allocate memory at many platforms'
required 16 byte long double alignment, it uses 8 byte alignment so the
change couldn't have had any impact unless someone compiled
--without-pymalloc using a system allocator that guaranteed a larger
value. I wouldn't expect that and have seen no indication of that
anywhere.

The history of the long double and double additions:

http://hg.python.org/cpython/rev/7065135f9202 long double -
http://hg.python.org/cpython/rev/b4f829941f3d double -
http://bugs.python.org/issue467145 why double was added, it fixed the
HPUX 11 build.

-gps

>> > We can probably get rid of the double and this union hack all
>> > together today.
>> > That is a slightly more invasive change that can be left for later.
>>
>> How do you suggest to get rid of it? Some platforms still have strict
>> alignment rules and we must enforce that PyObjects (*) are always
>> aligned to the largest possible alignment, since a PyObject-derived
>> struct can hold arbitrary C types.
>>
>> (*) GC-enabled PyObjects, anyway. Others will be naturally aligned
>> thanks to the memory allocator.
>>
>>
>> What's more, I think you shouldn't be doing this kind of change in a
>> bugfix release. It might break compiled C extensions since you are
>> changing some characteristics of object layout (although you would
>> probably only break those extensions which access the GC header, which
>> is probably not many of them). Resource consumption improvements
>> generally go only into the next feature release.
>>
>
> This isn't a resource consumption improvement. It is a compilation
> correctness change with zero impact on the generated code or ABI
> compatibility before and after. The structure, as defined, was flagged
> as problematic by Clang's undefined behavior sanitizer because it
> contains a 'long double' which requires 16-byte alignment but Python's
> own memory allocator was using an 8 byte boundary.
>
> So changing the definition of the dummy side of the union makes zero
> difference to already compiled code as it (a) doesn't change the
> structure's size and (b) all existing implementations already align
> these on an 8 byte boundary.
>
> -gps
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dickinsm at gmail.com  Fri Dec 14 09:42:44 2012
From: dickinsm at gmail.com (Mark Dickinson)
Date: Fri, 14 Dec 2012 08:42:44 +0000
Subject: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no
In-Reply-To: 
References: <3YL4HM367BzRYL@mail.python.org> <20121211081627.0f0235e1@pitrou.net>
Message-ID: 

On Fri, Dec 14, 2012 at 7:27 AM, Gregory P. Smith wrote:
> So changing the definition of the dummy side of the union makes zero
> difference to already compiled code as it (a) doesn't change the structure's
> size and (b) all existing implementations already align these on an 8 byte
> boundary.

It looks to me as though the struct size *is* changed, at least on
some platforms. Before this commit, I get (OS X 10.6, 64-bit
non-debug build):

Python 3.4.0a0 (default:b4c383f31881+, Dec 14 2012, 08:30:39)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class A(object): pass
...
>>> a = A()
>>> import sys
>>> sys.getsizeof(a)
64

After it:

Python 3.4.0a0 (default:76bc92fb90c1+, Dec 14 2012, 08:33:48)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> class A(object): pass
...
>>> a = A()
>>> import sys
>>> sys.getsizeof(a)
56

-- 
Mark

From regebro at gmail.com  Fri Dec 14 09:56:16 2012
From: regebro at gmail.com (Lennart Regebro)
Date: Fri, 14 Dec 2012 09:56:16 +0100
Subject: [Python-Dev] [Distutils] Is is worth disentangling distutils?
In-Reply-To: <50CAE828.2090409@cavallinux.eu>
References: <50C58DA5.3000307@cavallinux.eu> <50CA6E85.9060900@cavallinux.eu> <50CAE828.2090409@cavallinux.eu>
Message-ID: 

On Fri, Dec 14, 2012 at 9:49 AM, Antonio Cavallo wrote:
> My requirements would be quite simple:
> 2. cross compiling

That is *not* a simple requirement.

//Lennart

From greg at krypto.org  Fri Dec 14 10:14:04 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Fri, 14 Dec 2012 01:14:04 -0800
Subject: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no
In-Reply-To: 
References: <3YL4HM367BzRYL@mail.python.org> <20121211081627.0f0235e1@pitrou.net>
Message-ID: 

Yes, see the followup. My comments before were all misinterpreting
size_t.

Same result on x86_64 linux. On a 64-bit platform the 24 byte structure
now occupies 24 bytes instead of being padded to 32. Nice. On a 32-bit
platform it should remain 16 bytes.

The PyGC_Head union structure is NOT part of the ABI laid out in
http://www.python.org/dev/peps/pep-0384/ and is accurately excluded from
the .h file when Py_LIMITED_API is defined so changing this in 3.4
should not be a problem. This structure occupies the space before gc
tracked PyObject* pointers.

-gps

On Fri, Dec 14, 2012 at 12:42 AM, Mark Dickinson wrote:
> On Fri, Dec 14, 2012 at 7:27 AM, Gregory P. Smith wrote:
> > So changing the definition of the dummy side of the union makes zero
> > difference to already compiled code as it (a) doesn't change the
> > structure's size and (b) all existing implementations already align
> > these on an 8 byte boundary.
>
> It looks to me as though the struct size *is* changed, at least on
> some platforms. Before this commit, I get (OS X 10.6, 64-bit
> non-debug build):
>
> Python 3.4.0a0 (default:b4c383f31881+, Dec 14 2012, 08:30:39)
> [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> class A(object): pass
> ...
> >>> a = A()
> >>> import sys
> >>> sys.getsizeof(a)
> 64
>
>
> After it:
>
> Python 3.4.0a0 (default:76bc92fb90c1+, Dec 14 2012, 08:33:48)
> [GCC 4.2.1 (Apple Inc. build 5664)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> class A(object): pass
> ...
> >>> a = A()
> >>> import sys
> >>> sys.getsizeof(a)
> 56
>
> --
> Mark
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From solipsis at pitrou.net  Fri Dec 14 10:34:25 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 14 Dec 2012 10:34:25 +0100
Subject: [Python-Dev] Mercurial workflow question...
References: <20121214012047.GA46660@snakebite.org> <20121214024824.3BCCC2500B2@webabinitio.net>
Message-ID: <20121214103425.68ebb4b0@pitrou.net>

Le Thu, 13 Dec 2012 21:48:23 -0500,
"R. David Murray" a écrit :
> On Thu, 13 Dec 2012 20:21:24 -0500, Trent Nelson
> wrote:
> > - Use a completely separate clone to house all the
> > intermediate commits, then generate a diff once the final commit is
> > ready, then apply that diff to the main cpython repo, then push
> > that. This approach is fine, but it seems counter-intuitive to the
> > whole concept of DVCS.
>
> Perhaps. But that's exactly what I did with the email package changes
> for 3.3.
>
> You seem to have a tension between "all those dirty little commits"
> and "clean history" and the fact that a dvcs is designed to preserve
> all those commits...if you don't want those intermediate commits in
> the official repo, then why is a diff/patch a bad way to achieve
> that? If you keep your pulls up to date in your feature repo, the
> diff/patch process is simple and smooth.

+1. We definitely don't want tons of small incremental commits in the
official repo. "One changeset == one issue" should be the ideal horizon
when committing changes.

Regards

Antoine.

From solipsis at pitrou.net  Fri Dec 14 10:41:41 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 14 Dec 2012 10:41:41 +0100
Subject: [Python-Dev] cpython: Using 'long double' to force this structure to be worst case aligned is no
References: <3YL4HM367BzRYL@mail.python.org> <20121211081627.0f0235e1@pitrou.net>
Message-ID: <20121214104141.73b38274@pitrou.net>

Le Fri, 14 Dec 2012 01:14:04 -0800,
"Gregory P. Smith" a écrit :
> Yes, see the followup. My comments before were all misinterpreting
> size_t.
>
> Same result on x86_64 linux. On a 64-bit platform the 24 byte
> structure now occupies 24 bytes instead of being padded to 32.
> Nice. On a 32-bit platform it should remain 16 bytes.

But you are losing the 16-byte alignment that the union was precisely
designed to enforce.

> The PyGC_Head union structure is NOT part of the ABI laid out in
> http://www.python.org/dev/peps/pep-0384/ and is accurately excluded
> from the .h file when Py_LIMITED_API is defined so changing this in
> 3.4 should not be a problem.

Not an ABI problem in 3.4 indeed (except that it might break platforms
with strict alignment requirements).

It should be noted that the GC head isn't part of standard atomic types
(int, float, str...), so the memory gain will not be very noticeable
IMO.

Regards

Antoine.

From christian at python.org  Fri Dec 14 12:01:32 2012
From: christian at python.org (Christian Heimes)
Date: Fri, 14 Dec 2012 12:01:32 +0100
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: 
References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com>
Message-ID: <50CB070C.9090409@python.org>

Am 14.12.2012 09:31, schrieb Lennart Regebro:
> 1. Python will include a timezone database both in the source
> distribution and the Windows installer (although I suspect that binary
> packages for Linux distributions may skip this, but that's OK).

You need to specify the details. Where is the database stored and under
which conditions is it updated? Suggestions:

* The zoneinfo database is stored in the new package 'tzdata', that's
  Lib/tzdata in the source dist. The files are kept in our hg
  repository, too.

* A tool chain is provided that compiles the zoneinfos from an Olson
  tar.gz file. (Bonus points for a download + update script.) The tool
  chain is included in the source dist, e.g. Tools/.

* The db is updated on a regular basis during the development, alpha
  and beta phase by any core dev.
  Every patch level release shall contain the latest version of the db,
  maybe except for security releases.

* It's the release manager's responsibility to make sure all final
  releases contain the current db. This needs to be added to the RM's
  TODO list.

Who is going to create the tzdata_update package, how is it compiled
and how often should it be released?

One other thing: the zoneinfo database should be compatible with
zipfile distributions. The module should be able to load the files from
a stdlib zipfile. The feature is important for freeze, py2exe and
py2app.

Christian

From barry at python.org  Fri Dec 14 16:29:03 2012
From: barry at python.org (Barry Warsaw)
Date: Fri, 14 Dec 2012 10:29:03 -0500
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: <50CB070C.9090409@python.org>
References: <50C8541E.60406@python.org> <20121212174436.40acd4e1@pitrou.net> <50C91CDE.8050504@mrabarnett.plus.com> <50CB070C.9090409@python.org>
Message-ID: <20121214102903.67b2061f@resist.wooz.org>

On Dec 14, 2012, at 12:01 PM, Christian Heimes wrote:

>* It's the release manager's responsibility to make sure all final
>  releases contain the current db. This needs to be added to the
>  RM's TODO list.

That would be PEP 101.

-Barry

From a.cavallo at cavallinux.eu  Fri Dec 14 09:49:44 2012
From: a.cavallo at cavallinux.eu (Antonio Cavallo)
Date: Fri, 14 Dec 2012 08:49:44 +0000
Subject: [Python-Dev] [Distutils] Is is worth disentangling distutils?
In-Reply-To: 
References: <50C58DA5.3000307@cavallinux.eu> <50CA6E85.9060900@cavallinux.eu>
Message-ID: <50CAE828.2090409@cavallinux.eu>

Mmm, so the question would be distutils2 or distlib?
I think tarek made a graph of the different packaging systems... seen on
reddit some time ago.

My requirements would be quite simple:

1. support a DESTDIR approach (where a package can be installed in an
   intermediate directory before its final destination)
2. cross compiling
3. install even if a dependency package is not installed, decoupling
   installation from "configuration"

Point 1 is needed for system integrators (e.g. people working on rpm
builds).
Point 2 is entirely mine :)
Point 3 is the same philosophical difference between the build, install
and configuration steps: it's part of good practice.

In short it shouldn't replace the system-wide dependency manager (in rpm
it would be yum/zypp; in debian it is much more confused; on windows it
doesn't exist; while macos takes the approach of packing everything up
in one place).

Funnily enough distutils (the old dead horse) does it all except point
2: that is my reason to clean up the code. I've just seen py3k distutils
but it would be worth a backport to py2k.

Thanks

Nick Coghlan wrote:
> On Fri, Dec 14, 2012 at 10:10 AM, Antonio Cavallo
> wrote:
>
> > I'll have a look into distutils2, I thought it was (another) dead end.
> > In every case my target is py2k (2.7.x) and I've no case for
> > transitioning to py3k (too much risk).
>
> distutils2 started as a copy of distutils, so it's hard to tell the
> difference between the parts which have been fixed and the parts which
> are still just distutils legacy components (this is why the merge back
> was dropped from 3.3 - too many pieces simply weren't ready and simply
> would have perpetuated problems inherited from distutils).
> > distlib (https://distlib.readthedocs.org/en/latest/overview.html) is a > successor project that takes a different view of building up the low > level pieces without inheriting the bad parts of the distutils legacy (a > problem suffered by both setuptools/distribute and distutils2). distlib > also runs natively on both 2.x and 3.x, as the idea is that these > interoperability standards should be well supported in *current* Python > versions, not just those where the stdlib has caught up (i.e. now 3.4 at > the earliest) > > The aim is to get to a situation more like that with wsgiref, where the > stdlib defines the foundation and key APIs and data formats needed for > interoperability, while allowing a flourishing ecosystem of > user-oriented tools (like pip, bento, zc.buildout, etc) that still solve > the key problems addressed by setuptools/distribute without the opaque > and hard to extend distutils core that can make the existing tools > impossible to debug when they go wrong. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | > Brisbane, Australia From status at bugs.python.org Fri Dec 14 18:07:28 2012 From: status at bugs.python.org (Python tracker) Date: Fri, 14 Dec 2012 18:07:28 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20121214170728.A44BE1CDC6@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2012-12-07 - 2012-12-14) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 3826 (+10) closed 24631 (+34) total 28457 (+44) Open issues with patches: 1683 Issues opened (37) ================== #16421: importlib.machinery.ExtensionFileLoader cannot load several mo http://bugs.python.org/issue16421 reopened by haypo #16637: py-bt, py-locals, etc. 
GDB commands fail with output-radix 16 http://bugs.python.org/issue16637 opened by mshroyer #16638: support multi-line docstring signatures in IDLE calltips http://bugs.python.org/issue16638 opened by chris.jerdonek #16640: Less code under lock in sched.scheduler http://bugs.python.org/issue16640 opened by serhiy.storchaka #16641: sched.scheduler.enter arguments should not be modifiable http://bugs.python.org/issue16641 opened by serhiy.storchaka #16642: Mention new "kwargs" named tuple parameter in sched module http://bugs.python.org/issue16642 opened by serhiy.storchaka #16643: Wrong documented default value for timefunc parameter in sched http://bugs.python.org/issue16643 opened by serhiy.storchaka #16644: Wrong code in ContextManagerTests.test_invalid_args() in test_ http://bugs.python.org/issue16644 opened by serhiy.storchaka #16645: Wrong test_extract_hardlink() in test_tarfile.py http://bugs.python.org/issue16645 opened by serhiy.storchaka #16646: FTP.makeport() loses socket error details http://bugs.python.org/issue16646 opened by serhiy.storchaka #16647: LMTP.connect() loses socket error details http://bugs.python.org/issue16647 opened by serhiy.storchaka #16648: stdib should use new exception types from PEP 3151 http://bugs.python.org/issue16648 opened by asvetlov #16649: Add a PyCF_DISPLAY_EXPRESSION_RESULTS flag http://bugs.python.org/issue16649 opened by ncoghlan #16650: Popen._internal_poll() references errno.ECHILD outside of the http://bugs.python.org/issue16650 opened by serhiy.storchaka #16651: Find out what stdlib modules lack a pure Python implementation http://bugs.python.org/issue16651 opened by brett.cannon #16652: socket.getfqdn docs are not explicit enough about the algorith http://bugs.python.org/issue16652 opened by r.david.murray #16653: reference kept in f_locals prevents the tracing/profiling of a http://bugs.python.org/issue16653 opened by xdegaye #16655: IDLE list.append calltips test failures http://bugs.python.org/issue16655 opened by chris.jerdonek #16656: os.walk ignores international dirs on Windows http://bugs.python.org/issue16656 opened by techtonik #16657: traceback.format_tb incorrect docsting http://bugs.python.org/issue16657 opened by mgedmin #16658: Missing "return" in HTTPConnection.send() http://bugs.python.org/issue16658 opened by amaury.forgeotdarc #16659: Pure Python implementation of random http://bugs.python.org/issue16659 opened by serhiy.storchaka #16661: test_posix.test_getgrouplist fails on some systems - incorrect http://bugs.python.org/issue16661 opened by gregory.p.smith #16662: load_tests not invoked in package/__init__.py http://bugs.python.org/issue16662 opened by rbcollins #16663: Poor documentation for METH_KEYWORDS http://bugs.python.org/issue16663 opened by r3m0t #16664: [PATCH] Test Glob: files starting with . 
http://bugs.python.org/issue16664 opened by Sebastian.Kreft #16665: doc for builtin hex() is poor http://bugs.python.org/issue16665 opened by rurpy2 #16666: docs wrongly imply socket.getaddrinfo takes keyword arguments http://bugs.python.org/issue16666 opened by Mikel.Ward #16667: timezone docs need "versionadded: 3.2" http://bugs.python.org/issue16667 opened by ncoghlan #16669: Docstrings for namedtuple http://bugs.python.org/issue16669 opened by serhiy.storchaka #16670: Point class may be not be a good example for namedtuple http://bugs.python.org/issue16670 opened by julien.tayon #16672: improve tracing performances when f_trace is NULL http://bugs.python.org/issue16672 opened by xdegaye #16674: Faster getrandbits() for small integers http://bugs.python.org/issue16674 opened by serhiy.storchaka #16676: Segfault under Python 3.3 after PyType_GenericNew http://bugs.python.org/issue16676 opened by tseaver #16677: Hard to find operator precedence in Lang Ref. http://bugs.python.org/issue16677 opened by rurpy2 #16678: optparse: parse only known options http://bugs.python.org/issue16678 opened by techtonik #16679: Wrong URL path decoding http://bugs.python.org/issue16679 opened by claudep Most recent 15 issues with no replies (15) ========================================== #16677: Hard to find operator precedence in Lang Ref. http://bugs.python.org/issue16677 #16676: Segfault under Python 3.3 after PyType_GenericNew http://bugs.python.org/issue16676 #16674: Faster getrandbits() for small integers http://bugs.python.org/issue16674 #16663: Poor documentation for METH_KEYWORDS http://bugs.python.org/issue16663 #16658: Missing "return" in HTTPConnection.send() http://bugs.python.org/issue16658 #16657: traceback.format_tb incorrect docsting http://bugs.python.org/issue16657 #16655: IDLE list.append calltips test failures http://bugs.python.org/issue16655 #16652: socket.getfqdn docs are not explicit enough about the algorith http://bugs.python.org/issue16652 #16646: FTP.makeport() loses socket error details http://bugs.python.org/issue16646 #16645: Wrong test_extract_hardlink() in test_tarfile.py http://bugs.python.org/issue16645 #16643: Wrong documented default value for timefunc parameter in sched http://bugs.python.org/issue16643 #16642: Mention new "kwargs" named tuple parameter in sched module http://bugs.python.org/issue16642 #16640: Less code under lock in sched.scheduler http://bugs.python.org/issue16640 #16626: Infinite recursion in glob.glob('*:') on Windows http://bugs.python.org/issue16626 #16623: argparse help formatter does not honor non-breaking space http://bugs.python.org/issue16623 Most recent 15 issues waiting for review (15) ============================================= #16679: Wrong URL path decoding http://bugs.python.org/issue16679 #16674: Faster getrandbits() for small integers http://bugs.python.org/issue16674 #16672: improve tracing performances when f_trace is NULL http://bugs.python.org/issue16672 #16669: Docstrings for namedtuple http://bugs.python.org/issue16669 #16667: timezone docs need "versionadded: 3.2" http://bugs.python.org/issue16667 #16664: [PATCH] Test Glob: files starting with . 
http://bugs.python.org/issue16664 #16659: Pure Python implementation of random http://bugs.python.org/issue16659 #16657: traceback.format_tb incorrect docsting http://bugs.python.org/issue16657 #16653: reference kept in f_locals prevents the tracing/profiling of a http://bugs.python.org/issue16653 #16650: Popen._internal_poll() references errno.ECHILD outside of the http://bugs.python.org/issue16650 #16647: LMTP.connect() loses socket error details http://bugs.python.org/issue16647 #16646: FTP.makeport() loses socket error details http://bugs.python.org/issue16646 #16645: Wrong test_extract_hardlink() in test_tarfile.py http://bugs.python.org/issue16645 #16644: Wrong code in ContextManagerTests.test_invalid_args() in test_ http://bugs.python.org/issue16644 #16642: Mention new "kwargs" named tuple parameter in sched module http://bugs.python.org/issue16642 Top 10 most discussed issues (10) ================================= #16612: Integrate "Argument Clinic" specialized preprocessor into CPyt http://bugs.python.org/issue16612 21 msgs #16656: os.walk ignores international dirs on Windows http://bugs.python.org/issue16656 21 msgs #16651: Find out what stdlib modules lack a pure Python implementation http://bugs.python.org/issue16651 17 msgs #15207: mimetypes.read_windows_registry() uses the wrong regkey, creat http://bugs.python.org/issue15207 9 msgs #14894: distutils.LooseVersion fails to compare number and a word http://bugs.python.org/issue14894 7 msgs #16659: Pure Python implementation of random http://bugs.python.org/issue16659 7 msgs #16661: test_posix.test_getgrouplist fails on some systems - incorrect http://bugs.python.org/issue16661 6 msgs #16666: docs wrongly imply socket.getaddrinfo takes keyword arguments http://bugs.python.org/issue16666 6 msgs #7741: Allow multiple statements in code.InteractiveConsole.push http://bugs.python.org/issue7741 5 msgs #16665: doc for builtin hex() is poor http://bugs.python.org/issue16665 5 msgs Issues closed (32) ================== #7719: distutils: ignore .nfsXXXX files http://bugs.python.org/issue7719 closed by eric.araujo #11797: 2to3 does not correct "reload" http://bugs.python.org/issue11797 closed by python-dev #12446: StreamReader Readlines behavior odd http://bugs.python.org/issue12446 closed by serhiy.storchaka #13091: ctypes: memory leak http://bugs.python.org/issue13091 closed by pitrou #13390: Hunt memory allocations in addition to reference leaks http://bugs.python.org/issue13390 closed by pitrou #13512: ~/.pypirc created insecurely http://bugs.python.org/issue13512 closed by eric.araujo #13614: setup.py register fails if long_description contains ReST http://bugs.python.org/issue13614 closed by eric.araujo #14475: codecs.StreamReader.read behaves differently from regular file http://bugs.python.org/issue14475 closed by serhiy.storchaka #15209: Re-raising exceptions from an expression http://bugs.python.org/issue15209 closed by ncoghlan #15526: test_startfile crash on Windows 7 AMD64 http://bugs.python.org/issue15526 closed by sbt #15872: shutil.rmtree(..., ignore_errors=True) doesn't ignore all erro http://bugs.python.org/issue15872 closed by hynek #16049: Create abstract base classes by inheritance rather than a dire http://bugs.python.org/issue16049 closed by asvetlov #16267: order of decorators @abstractmethod and @classmethod is signif http://bugs.python.org/issue16267 closed by ncoghlan #16495: bytes_decode() unnecessarily examines encoding http://bugs.python.org/issue16495 closed by chris.jerdonek #16582: Tkinter calls 
SystemExit with string
       http://bugs.python.org/issue16582  closed by asvetlov

#16598: Docs: double newlines printed in some file iteration examples
       http://bugs.python.org/issue16598  closed by asvetlov

#16602: weakref can return an object with 0 refcount
       http://bugs.python.org/issue16602  closed by pitrou

#16614: argparse should have an option to require un-abbreviated optio
       http://bugs.python.org/issue16614  closed by terry.reedy

#16616: test_poll.PollTests.poll_unit_tests() is dead code
       http://bugs.python.org/issue16616  closed by sbt

#16628: leak in ctypes.resize()
       http://bugs.python.org/issue16628  closed by pitrou

#16629: IDLE: Calltips test fails due to int docstring change
       http://bugs.python.org/issue16629  closed by chris.jerdonek

#16634: urllib.error.HTTPError.reason is not documented
       http://bugs.python.org/issue16634  closed by orsenthil

#16636: codecs: readline() followed by readlines() returns trunkated r
       http://bugs.python.org/issue16636  closed by serhiy.storchaka

#16639: not your all issuse send
       http://bugs.python.org/issue16639  closed by rosslagerwall

#16654: IDLE problems with Mac OS 10.6.8 ("print syntax")
       http://bugs.python.org/issue16654  closed by r.david.murray

#16660: Segmentation fault when importing hashlib
       http://bugs.python.org/issue16660  closed by gregory.p.smith

#16668: Remove python3dll.vcxproj from pcbuild.sln
       http://bugs.python.org/issue16668  closed by loewis

#16671: logging.handlers.QueueListener sentinel should not be None
       http://bugs.python.org/issue16671  closed by vinay.sajip

#16673: Corrections in the library OS (PEP8)
       http://bugs.python.org/issue16673  closed by ezio.melotti

#16675: Ship Python with a package manager
       http://bugs.python.org/issue16675  closed by r.david.murray

#16680: Line buffering in socket._fileobject is broken
       http://bugs.python.org/issue16680  closed by neologix

#1748064: inspect.getargspec fails on built-in or slot wrapper methods
       http://bugs.python.org/issue1748064  closed by asvetlov

From benno at benno.id.au  Fri Dec 14 20:17:19 2012
From: benno at benno.id.au (Ben Leslie)
Date: Sat, 15 Dec 2012 06:17:19 +1100
Subject: [Python-Dev] http.client Nagle/delayed-ack optimization
Message-ID:

The http.client HTTPConnection._send_output method has an optimization
for avoiding bad interactions between delayed-ack and the Nagle
algorithm:

http://hg.python.org/cpython/file/f32f67d26035/Lib/http/client.py#l884

Unfortunately this interacts rather poorly in the case where the
message_body is a bytes instance and is rather large.

If the message_body is bytes it is appended to the headers, which causes
a copy of the data. When message_body is large this duplication of data
can cause a significant spike in memory usage.

(In my particular case I was uploading a 200MB file to 30 hosts at the
same time, leading to memory spikes over 6GB.)

I've solved this by subclassing and removing the optimization; however,
I'd appreciate thoughts on how this could best be solved in the library
itself. Options I have thought of are:

1: Have some size threshold on the copy (a rough sketch of this appears
   below). A little bit too much magic. Unclear what the size threshold
   should be.

2: Provide an explicit argument to turn the optimization on/off. This is
   ugly as it would need to be plumbed up the call chain to the request
   method.

3: Provide a property on the HTTPConnection object which enables the
   optimization or not. Optionally configured as part of __init__.

4: Add a class level attribute (similar to auto_open, default_port, etc)
   which controls the optimization.
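To illustrate option 1, here is a rough, untested sketch along the lines
of the current _send_output (the 1MB cutoff and the _LARGE_BODY_LIMIT
name are placeholders of mine, not anything from the module):

    _LARGE_BODY_LIMIT = 1024 * 1024   # placeholder threshold, needs tuning

    def _send_output(self, message_body=None):
        # Send the buffered request headers, optionally followed by a body.
        self._buffer.extend((b"", b""))
        msg = b"\r\n".join(self._buffer)
        del self._buffer[:]
        # Keep the single-send() optimization (which avoids the
        # delayed-ack/Nagle interaction) only when the copy is cheap.
        if isinstance(message_body, bytes) and \
                len(message_body) < _LARGE_BODY_LIMIT:
            msg = msg + message_body
            message_body = None
        self.send(msg)
        if message_body is not None:
            # The body is large, or not bytes (e.g. a file object):
            # send it separately and run the risk of Nagle.
            self.send(message_body)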
I'd be very interested to get some feedback so I can craft the
appropriate patch.

Thanks,

Benno

From solipsis at pitrou.net  Fri Dec 14 20:27:00 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 14 Dec 2012 20:27:00 +0100
Subject: [Python-Dev] http.client Nagle/delayed-ack optimization
References:
Message-ID: <20121214202700.169ed8a9@pitrou.net>

On Sat, 15 Dec 2012 06:17:19 +1100
Ben Leslie wrote:
> The http.client HTTPConnection._send_output method has an optimization
> for avoiding bad interactions between delayed-ack and the Nagle
> algorithm:
>
> http://hg.python.org/cpython/file/f32f67d26035/Lib/http/client.py#l884
>
> Unfortunately this interacts rather poorly in the case where the
> message_body is a bytes instance and is rather large.
>
> If the message_body is bytes it is appended to the headers, which
> causes a copy of the data. When message_body is large this duplication
> of data can cause a significant spike in memory usage.
>
> (In my particular case I was uploading a 200MB file to 30 hosts at the
> same time, leading to memory spikes over 6GB.)
>
> I've solved this by subclassing and removing the optimization; however,
> I'd appreciate thoughts on how this could best be solved in the library
> itself. Options I have thought of are:
>
> 1: Have some size threshold on the copy. A little bit too much magic.
> Unclear what the size threshold should be.

I think a hardcoded threshold is the right thing to do. It doesn't sound
very useful to try doing a single send() call when you have a large
chunk of data (say, more than 1 MB).

Regards

Antoine.

From a.cavallo at cavallinux.eu  Fri Dec 14 22:51:50 2012
From: a.cavallo at cavallinux.eu (Antonio Cavallo)
Date: Fri, 14 Dec 2012 21:51:50 +0000
Subject: [Python-Dev] [Distutils] Is is worth disentangling distutils?
In-Reply-To:
References: <50C58DA5.3000307@cavallinux.eu> <50CA6E85.9060900@cavallinux.eu>
 <50CAE828.2090409@cavallinux.eu>
Message-ID: <50CB9F76.4020203@cavallinux.eu>

It is not that complex... What's ahead is even more complex.

Lennart Regebro wrote:
> On Fri, Dec 14, 2012 at 9:49 AM, Antonio Cavallo wrote:
>> My requirements would be quite simple:
>> 2. cross compiling
>
> That is *not* a simple requirement.
>
> //Lennart

From tjreedy at udel.edu  Sat Dec 15 01:52:28 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 14 Dec 2012 19:52:28 -0500
Subject: [Python-Dev] Downloads page: Which version of Python should be
 listed first?
In-Reply-To: <20121213211451.GA21697@hobo.wolfson.cam.ac.uk>
References: <20121213211451.GA21697@hobo.wolfson.cam.ac.uk>
Message-ID:

On 12/13/2012 4:14 PM, Ross Lagerwall wrote:
> On Fri, Dec 14, 2012 at 07:57:52AM +1100, Chris Angelico wrote:
>> The default version shown on http://docs.python.org/ is now 3.3.0,
>> which I think is a Good Thing. However, http://python.org/download/
>> puts 2.7 first, and says:
>>
>> """If you don't know which version to use, start with Python 2.7; more
>> existing third party software is compatible with Python 2 than Python
>> 3 right now."""
>>
>> Firstly, is this still true? (I wouldn't have a clue.) And secondly,
>> would this be better worded as "one's better but the other's a good
>> fall-back"? Something like:
>>
>> """Don't know which version to use?
Python 3.3 is the recommended >> version for new projects, but much existing software is compatible >> with Python 2.""" >> > > I would say listing 3.3 as the recommended version to use is a good > thing, especially as distros like Ubuntu and Fedora transition to Python > 3. It also makes sense, given that the docs default to 3.3. From the LibreOffice 4.0beta1 release notes http://wiki.documentfoundation.org/ReleaseNotes/4.0 Python The bundled Python was upgraded from Python 2.6 to Python 3.3 (Michael Stahl) Python extensions and macros may require some degree of re-work to work on the latest Python; see for example Porting to Python 3. The last is a link to Lennart Regebro's book: http://python3porting.com/ -- Terry Jan Reedy From jeanpierreda at gmail.com Sat Dec 15 02:31:13 2012 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Fri, 14 Dec 2012 20:31:13 -0500 Subject: [Python-Dev] Downloads page: Which version of Python should be listed first? In-Reply-To: References: Message-ID: On Thu, Dec 13, 2012 at 9:38 PM, Eric Snow wrote: >> """If you don't know which version to use, start with Python 2.7; more >> existing third party software is compatible with Python 2 than Python >> 3 right now.""" >> >> Firstly, is this still true? (I wouldn't have a clue.) > > Nope: > > http://py3ksupport.appspot.com/ > http://python3wos.appspot.com/ (plone and zope skew the results) Until those numbers hit 100%, or until projects start dropping support for Python 2.x, the statement would still be true. -- Devin From rosuav at gmail.com Sat Dec 15 02:55:32 2012 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 15 Dec 2012 12:55:32 +1100 Subject: [Python-Dev] Downloads page: Which version of Python should be listed first? In-Reply-To: References: Message-ID: On Sat, Dec 15, 2012 at 12:31 PM, Devin Jeanpierre wrote: > On Thu, Dec 13, 2012 at 9:38 PM, Eric Snow wrote: >>> """If you don't know which version to use, start with Python 2.7; more >>> existing third party software is compatible with Python 2 than Python >>> 3 right now.""" >>> >>> Firstly, is this still true? (I wouldn't have a clue.) >> >> Nope: >> >> http://py3ksupport.appspot.com/ >> http://python3wos.appspot.com/ (plone and zope skew the results) > > Until those numbers hit 100%, or until projects start dropping support > for Python 2.x, the statement would still be true. Not necessarily dropping; all you need is for new projects to not bother supporting 2.x and the statement can become false. ChrisA From stephen at xemacs.org Sat Dec 15 03:44:59 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 15 Dec 2012 11:44:59 +0900 Subject: [Python-Dev] Downloads page: Which version of Python should be listed first? In-Reply-To: References: Message-ID: <87pq2cxcd0.fsf@uwakimon.sk.tsukuba.ac.jp> Devin Jeanpierre writes: > Until those numbers hit 100%, or until projects start dropping support > for Python 2.x, the statement would still be true. This is simply not true. Once the numbers hit somewhere in the neighborhood of 50%, the network effects (the need to "connect" to the more progressive projects) are going to rule, and "most projects" will face strong pressure to adapt (which means providing Python 3 support, *not* "dropping Python 2"). On the other hand, some projects will never bother because they're completely standalone: you won't ever reach 100%, and even 90% may not be necessary for the ecology to be considered "fully adapted to Python 3". 
From chris at kateandchris.net Sat Dec 15 20:28:09 2012 From: chris at kateandchris.net (Chris Lambacher) Date: Sat, 15 Dec 2012 14:28:09 -0500 Subject: [Python-Dev] [Distutils] Is is worth disentangling distutils? In-Reply-To: <50CB9F76.4020203@cavallinux.eu> References: <50C58DA5.3000307@cavallinux.eu> <50CA6E85.9060900@cavallinux.eu> <50CAE828.2090409@cavallinux.eu> <50CB9F76.4020203@cavallinux.eu> Message-ID: You can already cross compile with distutils, though it is not exactly easy: http://pyvideo.org/video/682/cross-compiling-python-c-extensions-for-embedde -Chris On Fri, Dec 14, 2012 at 4:51 PM, Antonio Cavallo wrote: > It is not that complex... What's ahead is even more complex. > > > > > > Lennart Regebro wrote: > >> On Fri, Dec 14, 2012 at 9:49 AM, Antonio Cavallo >> wrote: >> >>> My requirements would quite simple: >>> 2. cross compiling >>> >> >> That is *not* a simple requirement. >> >> //Lennart >> > ______________________________**_________________ > Distutils-SIG maillist - Distutils-SIG at python.org > http://mail.python.org/**mailman/listinfo/distutils-sig > -- Christopher Lambacher chris at kateandchris.net -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Sun Dec 16 13:28:24 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 16 Dec 2012 13:28:24 +0100 Subject: [Python-Dev] cpython: Issue #16049: add abc.ABC helper class. In-Reply-To: <3YMhG258Y2zRRl@mail.python.org> References: <3YMhG258Y2zRRl@mail.python.org> Message-ID: Am 13.12.2012 18:09, schrieb andrew.svetlov: > http://hg.python.org/cpython/rev/9347869d1066 > changeset: 80840:9347869d1066 > user: Andrew Svetlov > date: Thu Dec 13 19:09:33 2012 +0200 > summary: > Issue #16049: add abc.ABC helper class. > > Patch by Bruno Dupuis. > > files: > Doc/library/abc.rst | 18 ++++++++++++++---- > Lib/abc.py | 6 ++++++ > Lib/test/test_abc.py | 13 +++++++++++++ > Misc/NEWS | 3 +++ > 4 files changed, 36 insertions(+), 4 deletions(-) > > > diff --git a/Doc/library/abc.rst b/Doc/library/abc.rst > --- a/Doc/library/abc.rst > +++ b/Doc/library/abc.rst > @@ -12,9 +12,9 @@ > -------------- > > This module provides the infrastructure for defining :term:`abstract base > -classes ` (ABCs) in Python, as outlined in :pep:`3119`; see the PEP for why this > -was added to Python. (See also :pep:`3141` and the :mod:`numbers` module > -regarding a type hierarchy for numbers based on ABCs.) > +classes ` (ABCs) in Python, as outlined in :pep:`3119`; > +see the PEP for why this was added to Python. (See also :pep:`3141` and the > +:mod:`numbers` module regarding a type hierarchy for numbers based on ABCs.) > > The :mod:`collections` module has some concrete classes that derive from > ABCs; these can, of course, be further derived. In addition the > @@ -23,7 +23,7 @@ > hashable or a mapping. > > > -This module provides the following class: > +This module provides the following classes: > > .. class:: ABCMeta > > @@ -127,6 +127,16 @@ > available as a method of ``Foo``, so it is provided separately. > > > +.. class:: ABC > + > + A helper class that has :class:`ABCMeta` as metaclass. :class:`ABC` is the > + standard class to inherit from in order to create an abstract base class, > + avoiding sometimes confusing metaclass usage. > + > + Note that :class:`ABC` type is still :class:`ABCMeta`, therefore inheriting > + from :class:`ABC` requires usual precautions regarding metaclasses usage > + as multiple inheritance may lead to metaclass conflicts. > + Needs a versionadded. 
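For example (assuming this lands in 3.4, which its position on the
default branch suggests), something along the lines of:

    .. class:: ABC

       ...

       .. versionadded:: 3.4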
Georg

From raymond.hettinger at gmail.com  Mon Dec 17 06:17:08 2012
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sun, 16 Dec 2012 21:17:08 -0800
Subject: [Python-Dev] Mercurial workflow question...
In-Reply-To:
References: <20121214012047.GA46660@snakebite.org>
 <20121214024824.3BCCC2500B2@webabinitio.net>
Message-ID:

On Dec 13, 2012, at 7:00 PM, Chris Jerdonek wrote:

> On Thu, Dec 13, 2012 at 6:48 PM, R. David Murray wrote:
>> On Thu, 13 Dec 2012 20:21:24 -0500, Trent Nelson wrote:
>>> - Use a completely separate clone to house all the intermediate
>>> commits, then generate a diff once the final commit is ready,
>>> then apply that diff to the main cpython repo, then push that.
>>> This approach is fine, but it seems counter-intuitive to the
>>> whole concept of DVCS.
>>
>> Perhaps. But that's exactly what I did with the email package changes
>> for 3.3.
>>
>> You seem to have a tension between "all those dirty little commits" and
>> "clean history" and the fact that a dvcs is designed to preserve all
>> those commits...if you don't want those intermediate commits in the
>> official repo, then why is a diff/patch a bad way to achieve that?
>
> Right. And you usually have to do this beforehand anyways to upload
> your changes to the tracker for review.
>
> Also, for the record (not that anyone has said anything to the
> contrary), our dev guide says, "You should collapse changesets of a
> single feature or bugfix before pushing the result to the main
> repository. The reason is that we don't want the history to be full of
> intermediate commits recording the private history of the person
> working on a patch. If you are using the rebase extension, consider
> adding the --collapse option to hg rebase. The collapse extension is
> another choice."
>
> (from http://docs.python.org/devguide/committing.html#working-with-mercurial )

Does hg's ability to "make merges easier than svn" depend on having
all the intermediate commits? I thought the theory was that the smaller
changesets provided extra information that made it possible to merge
two expansive groups of changes.

Raymond

From tim.delaney at aptare.com  Mon Dec 17 06:26:30 2012
From: tim.delaney at aptare.com (Tim Delaney)
Date: Mon, 17 Dec 2012 16:26:30 +1100
Subject: [Python-Dev] Mercurial workflow question...
In-Reply-To:
References: <20121214012047.GA46660@snakebite.org>
 <20121214024824.3BCCC2500B2@webabinitio.net>
Message-ID:

Apologies for the top-posting (damned Gmail ...).

Tim Delaney

From tim.delaney at aptare.com  Mon Dec 17 06:25:26 2012
From: tim.delaney at aptare.com (Tim Delaney)
Date: Mon, 17 Dec 2012 16:25:26 +1100
Subject: [Python-Dev] Mercurial workflow question...
In-Reply-To:
References: <20121214012047.GA46660@snakebite.org>
 <20121214024824.3BCCC2500B2@webabinitio.net>
Message-ID:

Possibly. A collapsed changeset is more likely to have larger hunks of
changes, e.g. two changesets that each modified adjacent pieces of code
get collapsed down to a single change hunk, which would make the merge
machinery have to work harder to detect moved hunks, etc. In practice,
so long as each collapsed changeset is for a single change I haven't
seen this be a major issue.

However, I'm personally a "create a new named branch for each task, keep
all intermediate history" kind of guy (and I get to set the rules for my
team ;) so I don't see collapsed changesets very often.
Tim Delaney On 17 December 2012 16:17, Raymond Hettinger wrote: > > On Dec 13, 2012, at 7:00 PM, Chris Jerdonek > wrote: > > > On Thu, Dec 13, 2012 at 6:48 PM, R. David Murray > wrote: > >> On Thu, 13 Dec 2012 20:21:24 -0500, Trent Nelson > wrote: > >>> - Use a completely separate clone to house all the intermediate > >>> commits, then generate a diff once the final commit is ready, > >>> then apply that diff to the main cpython repo, then push that. > >>> This approach is fine, but it seems counter-intuitive to the > >>> whole concept of DVCS. > >> > >> Perhaps. But that's exactly what I did with the email package changes > >> for 3.3. > >> > >> You seem to have a tension between "all those dirty little commits" and > >> "clean history" and the fact that a dvcs is designed to preserve all > >> those commits...if you don't want those intermediate commits in the > >> official repo, then why is a diff/patch a bad way to achieve that? > > > > Right. And you usually have to do this beforehand anyways to upload > > your changes to the tracker for review. > > > > Also, for the record (not that anyone has said anything to the > > contrary), our dev guide says, "You should collapse changesets of a > > single feature or bugfix before pushing the result to the main > > repository. The reason is that we don?t want the history to be full of > > intermediate commits recording the private history of the person > > working on a patch. If you are using the rebase extension, consider > > adding the --collapse option to hg rebase. The collapse extension is > > another choice." > > > > (from > http://docs.python.org/devguide/committing.html#working-with-mercurial ) > > > Does hg's ability to "make merges easier than svn" depend on having > all the intermediate commits? I thought the theory was that the smaller > changesets provided extra information that made it possible to merge > two expansive groups of changes. > > > Raymond > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/timothy.c.delaney%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Mon Dec 17 07:22:00 2012 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 17 Dec 2012 15:22:00 +0900 Subject: [Python-Dev] Mercurial workflow question... In-Reply-To: References: <20121214012047.GA46660@snakebite.org> <20121214024824.3BCCC2500B2@webabinitio.net> Message-ID: <87bodtxkon.fsf@uwakimon.sk.tsukuba.ac.jp> Raymond Hettinger writes: > Does hg's ability to "make merges easier than svn" depend on having > all the intermediate commits? I thought the theory was that the smaller > changesets provided extra information that made it possible to merge > two expansive groups of changes. Tim Delaney's explanation is correct as far as it goes. But I would give a pretty firm "No" as the answer to your question. The big difference between svn (and CVS) and hg (and git and bzr) at the time of migrating the Python repository was that svn didn't track merges, only branches. So in svn you get a 3-way merge with the branch point as the base version. This meant that you could not track progress of the mainline while working on a branch. svn tends to report the merge of recent mainline changes back into the mainline as conflicts when merging your branch into the mainline[1][2], all too often resulting in a big mess. 
hg, because it records merges as well as branches, can use the most recent common version (typically the mainline parent of the most recent "catch-up" merge) as the base version. This means that (1) there are somewhat fewer divergences because your branch already contains most changes to the mainline, and (2) you don't get "spurious" conflicts. On the other hand, more frequent intermediate committing is mostly helpful in bisection, and so the usefulness depends on very disciplined committing (only commit build- and test-able code). Summary: only the frequency of intermediate merge commits really matters. Because in hg it's possible to have frequent "catch-up" merges from mainline, you get smaller merges with fewer conflicts both at "catch-up" time and at merge-to-mainline time. Footnotes: [1] Not the whole story, but OK for this purpose. Technical details available on request. [2] I have paid almost no attention to svn since Python migrated to hg, so perhaps svn has improved merge support in the meantime. But that doesn't really matter since svn is merely being used to help explain why commit granularity doesn't matter much to hg's merge capabilities. From andrew.svetlov at gmail.com Tue Dec 18 13:28:59 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Tue, 18 Dec 2012 14:28:59 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Following issue #13390, fix compilation --without-pymalloc, and make In-Reply-To: <3YQGh3369WzNhB@mail.python.org> References: <3YQGh3369WzNhB@mail.python.org> Message-ID: Looks like Windows buildbots broken by this commit. On Tue, Dec 18, 2012 at 12:07 AM, antoine.pitrou wrote: > http://hg.python.org/cpython/rev/a85673b55177 > changeset: 80923:a85673b55177 > user: Antoine Pitrou > date: Mon Dec 17 23:05:59 2012 +0100 > summary: > Following issue #13390, fix compilation --without-pymalloc, and make sys.getallocatedblocks() return 0 in that situation. > > files: > Doc/library/sys.rst | 15 ++++++++------- > Lib/test/test_sys.py | 7 ++++++- > Objects/obmalloc.c | 7 +++++++ > 3 files changed, 21 insertions(+), 8 deletions(-) > > > diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst > --- a/Doc/library/sys.rst > +++ b/Doc/library/sys.rst > @@ -396,16 +396,17 @@ > .. function:: getallocatedblocks() > > Return the number of memory blocks currently allocated by the interpreter, > - regardless of their size. This function is mainly useful for debugging > - small memory leaks. Because of the interpreter's internal caches, the > - result can vary from call to call; you may have to call > - :func:`_clear_type_cache()` to get more predictable results. > + regardless of their size. This function is mainly useful for tracking > + and debugging memory leaks. Because of the interpreter's internal > + caches, the result can vary from call to call; you may have to call > + :func:`_clear_type_cache()` and :func:`gc.collect()` to get more > + predictable results. > + > + If a Python build or implementation cannot reasonably compute this > + information, :func:`getallocatedblocks()` is allowed to return 0 instead. > > .. versionadded:: 3.4 > > - .. impl-detail:: > - Not all Python implementations may be able to return this information. > - > > .. 
function:: getcheckinterval() > > diff --git a/Lib/test/test_sys.py b/Lib/test/test_sys.py > --- a/Lib/test/test_sys.py > +++ b/Lib/test/test_sys.py > @@ -7,6 +7,7 @@ > import operator > import codecs > import gc > +import sysconfig > > # count the number of test runs, used to create unique > # strings to intern in test_intern() > @@ -616,9 +617,13 @@ > "sys.getallocatedblocks unavailable on this build") > def test_getallocatedblocks(self): > # Some sanity checks > + with_pymalloc = sysconfig.get_config_var('WITH_PYMALLOC') > a = sys.getallocatedblocks() > self.assertIs(type(a), int) > - self.assertGreater(a, 0) > + if with_pymalloc: > + self.assertGreater(a, 0) > + else: > + self.assertEqual(a, 0) > try: > # While we could imagine a Python session where the number of > # multiple buffer objects would exceed the sharing of references, > diff --git a/Objects/obmalloc.c b/Objects/obmalloc.c > --- a/Objects/obmalloc.c > +++ b/Objects/obmalloc.c > @@ -1316,6 +1316,13 @@ > { > PyMem_FREE(p); > } > + > +Py_ssize_t > +_Py_GetAllocatedBlocks(void) > +{ > + return 0; > +} > + > #endif /* WITH_PYMALLOC */ > > #ifdef PYMALLOC_DEBUG > > -- > Repository URL: http://hg.python.org/cpython > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > -- Thanks, Andrew Svetlov From raymond.hettinger at gmail.com Tue Dec 18 22:35:40 2012 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 18 Dec 2012 13:35:40 -0800 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <20121211101331.05087056@pitrou.net> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <3A8E318B-93F9-4CB0-92DE-C5EC7E2DBA77@gmail.com> <20121211101331.05087056@pitrou.net> Message-ID: <0EC9E09D-4E36-4E8E-BAF3-79E187D07AD3@gmail.com> On Dec 11, 2012, at 1:13 AM, Antoine Pitrou wrote: >> >> On Dec 10, 2012, at 2:48 AM, Christian Heimes >> wrote: >> >>> On the other hand every lookup and collision checks needs an >>> additional multiplication, addition and pointer dereferencing: >>> >>> entry = entries_baseaddr + sizeof(PyDictKeyEntry) * idx >> >> >> Currently, the dict implementation allows alternative lookup >> functions based on whether the keys are all strings. >> The choice of lookup function is stored in a function pointer. >> That lets each lookup use the currently active lookup >> function without having to make any computations or branches. > > An indirect function call is technically a branch, as seen from the CPU > (and not necessarily a very predictable one, although recent Intel > CPUs are said to be quite good at that). FWIW, we already have an indirection to the lookup function. I would piggyback on that, so no new indirections are required. My plan now is to apply the space compaction idea to sets. That code is less complex than dicts, and set operations stand to benefit the most from improved iteration speed. The steps would be: * Create a good set of benchmarks for set operations for both size and speed. * Start with the simplest version of the idea: separate the entries table from the hash table. Keep the hash table at Py_ssize_t, and pre-allocate the entry array to two-thirds the size of the hash table. This should give about a 25% space savings and speed-up iteration for all the set-to-set operations. * If that all works out, I want to trim the entry table for frozensefs so that the entry table has no over-allocations. 
This should give a much larger space savings. * Next, I want to experiment with the idea of using char/short/long sizes for the hash table. Since there is already an existing lookup indirection, this can be done with no additional overhead. Small sets will get the most benefit for the space savings and the cache performance for hash lookups should improve nicely (for a hash table of size 32 could fit in a single cache line). At each step, I'll run the benchmarks to make sure the expected speed and space benefits are being achieved. As a side-effect, sets without deletions will retain their insertion order. If this is of concern, it would be no problem to implement Antoine's idea of scrambling the entries during iteration. Raymond P.S. I've gotten a lot of suggestions for improvements to the proof-of-concept code. Thank you for that. The latest version is at: http://code.activestate.com/recipes/578375/ In that code, entries are stored in regular Python lists and inherit their over-allocation characteristics (about 12.5% overallocated for large lists). There are many other possible allocation strategies with their own respective speed/space trade-offs. -------------- next part -------------- An HTML attachment was scrubbed... URL: From "ja...py" at farowl.co.uk Tue Dec 18 22:15:23 2012 From: "ja...py" at farowl.co.uk (Jeff Allen) Date: Tue, 18 Dec 2012 21:15:23 +0000 Subject: [Python-Dev] Understanding the buffer API In-Reply-To: <20120808104700.GA29854@sleipnir.bytereef.org> References: <501C5FF8.10209@farowl.co.uk> <20120804091150.GA12337@sleipnir.bytereef.org> <501D283F.9050207@farowl.co.uk> <20120804152549.GA16358@sleipnir.bytereef.org> <20120804164140.GA16845@sleipnir.bytereef.org> <20120808104700.GA29854@sleipnir.bytereef.org> Message-ID: <50D0DCEB.4070800@farowl.co.uk> On 08/08/2012 11:47, Stefan Krah wrote: > Nick Coghlan wrote: >> It does place a constraint on consumers that they can't assume those >> fields will be NULL just because they didn't ask for them, but I'm >> struggling to think of any reason why a client would actually *check* >> that instead of just assuming it. > Can we continue this discussion some other time, perhaps after 3.3 is out? > I'd like to respond, but need a bit more time to think about it than I have > right now (for this issue). Those who contributed to the design of it through discussion here may be interested in how this has turned out in Jython. Although Jython is still at a 2.7 alpha, the buffer API has proved itself in a few parts of the core now and feels reasonably solid. It works for bytes in one dimension. There's a bit of description here: http://wiki.python.org/jython/BufferProtocol Long story short, I took the route of providing all information, which makes the navigational parts of the flags argument unequivocally a statement of what navigation the client is assuming will be sufficient. (The exception if thrown says explicitly that it won't be enough.) It follows that if two clients want a view on the same object, an exporter can safely give them the same one. Buffers take care of export counting for the exporter (as in the bytearray resize lock), and buffers can give you a sliced view of themselves without help from the exporter. The innards of memoryview are much simpler for all this and enable it to implement slicing (as in CPython 3.3) in one dimension. There may be ideas worth stealing here if the CPython buffer is revisited. N dimensional arrays and indirect addressing, while supported in principle, have no implementation. 
I'm fairly sure multi-byte items, as a way to export arrays of other types, makes no sense in Java where type security is strict and a parallel but type-safe approach will be needed. Jeff Allen From andrew.svetlov at gmail.com Tue Dec 18 23:40:49 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Wed, 19 Dec 2012 00:40:49 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <0EC9E09D-4E36-4E8E-BAF3-79E187D07AD3@gmail.com> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <50C593B2.9060203@python.org> <3A8E318B-93F9-4CB0-92DE-C5EC7E2DBA77@gmail.com> <20121211101331.05087056@pitrou.net> <0EC9E09D-4E36-4E8E-BAF3-79E187D07AD3@gmail.com> Message-ID: Good plan! On Tue, Dec 18, 2012 at 11:35 PM, Raymond Hettinger wrote: > > On Dec 11, 2012, at 1:13 AM, Antoine Pitrou wrote: > > > On Dec 10, 2012, at 2:48 AM, Christian Heimes > wrote: > > On the other hand every lookup and collision checks needs an > additional multiplication, addition and pointer dereferencing: > > entry = entries_baseaddr + sizeof(PyDictKeyEntry) * idx > > > > Currently, the dict implementation allows alternative lookup > functions based on whether the keys are all strings. > The choice of lookup function is stored in a function pointer. > That lets each lookup use the currently active lookup > function without having to make any computations or branches. > > > An indirect function call is technically a branch, as seen from the CPU > (and not necessarily a very predictable one, although recent Intel > CPUs are said to be quite good at that). > > > FWIW, we already have an indirection to the lookup function. > I would piggyback on that, so no new indirections are required. > > My plan now is to apply the space compaction idea to sets. > That code is less complex than dicts, and set operations > stand to benefit the most from improved iteration speed. > > The steps would be: > > * Create a good set of benchmarks for set operations > for both size and speed. > > * Start with the simplest version of the idea: separate the > entries table from the hash table. Keep the hash table at > Py_ssize_t, and pre-allocate the entry array to two-thirds the size > of the hash table. This should give about a 25% space savings > and speed-up iteration for all the set-to-set operations. > > * If that all works out, I want to trim the entry table for frozensefs > so that the entry table has no over-allocations. This should > give a much larger space savings. > > * Next, I want to experiment with the idea of using char/short/long > sizes for the hash table. Since there is already an existing > lookup indirection, this can be done with no additional overhead. > Small sets will get the most benefit for the space savings and > the cache performance for hash lookups should improve nicely > (for a hash table of size 32 could fit in a single cache line). > > At each step, I'll run the benchmarks to make sure the expected > speed and space benefits are being achieved. > > As a side-effect, sets without deletions will retain their insertion > order. If this is of concern, it would be no problem to implement > Antoine's idea of scrambling the entries during iteration. > > > Raymond > > > P.S. I've gotten a lot of suggestions for improvements to the > proof-of-concept code. Thank you for that. 
The latest version > is at: http://code.activestate.com/recipes/578375/ > In that code, entries are stored in regular Python lists > and inherit their over-allocation characteristics (about > 12.5% overallocated for large lists). There are many > other possible allocation strategies with their own > respective speed/space trade-offs. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com > -- Thanks, Andrew Svetlov From ncoghlan at gmail.com Wed Dec 19 08:24:43 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 19 Dec 2012 17:24:43 +1000 Subject: [Python-Dev] [Python-checkins] cpython: Mention OSError instead of IOError in the docs. In-Reply-To: <3YQsVs2mGnzQl0@mail.python.org> References: <3YQsVs2mGnzQl0@mail.python.org> Message-ID: On Wed, Dec 19, 2012 at 7:16 AM, andrew.svetlov wrote: > http://hg.python.org/cpython/rev/a6ea6f803017 > changeset: 80934:a6ea6f803017 > user: Andrew Svetlov > date: Tue Dec 18 23:16:44 2012 +0200 > summary: > Mention OSError instead of IOError in the docs. > > files: > Doc/faq/library.rst | 4 ++-- > 1 files changed, 2 insertions(+), 2 deletions(-) > > > diff --git a/Doc/faq/library.rst b/Doc/faq/library.rst > --- a/Doc/faq/library.rst > +++ b/Doc/faq/library.rst > @@ -209,7 +209,7 @@ > try: > c = sys.stdin.read(1) > print("Got character", repr(c)) > - except IOError: > + except OSError: > pass > finally: > termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm) > @@ -222,7 +222,7 @@ > :func:`termios.tcsetattr` turns off stdin's echoing and disables > canonical > mode. :func:`fcntl.fnctl` is used to obtain stdin's file descriptor > flags > and modify them for non-blocking mode. Since reading stdin when it is > empty > - results in an :exc:`IOError`, this error is caught and ignored. > + results in an :exc:`OSError`, this error is caught and ignored. > With any of these changes in the docs, please don't forget to include appropriate "versionchanged" directives. Many people using the Python 3 docs at "docs.python.org/3/" will still be on Python 3.2, and thus relying on the presence of such directives to let them know that while the various OS-related exception names are now just aliases for OSError in 3.3+, the distinctions still matter in 3.2. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.svetlov at gmail.com Wed Dec 19 12:46:07 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Wed, 19 Dec 2012 13:46:07 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Mention OSError instead of IOError in the docs. In-Reply-To: References: <3YQsVs2mGnzQl0@mail.python.org> Message-ID: Done in #e5958a4e52ef On Wed, Dec 19, 2012 at 9:24 AM, Nick Coghlan wrote: > On Wed, Dec 19, 2012 at 7:16 AM, andrew.svetlov > wrote: >> >> http://hg.python.org/cpython/rev/a6ea6f803017 >> changeset: 80934:a6ea6f803017 >> user: Andrew Svetlov >> date: Tue Dec 18 23:16:44 2012 +0200 >> summary: >> Mention OSError instead of IOError in the docs. 
>> >> files: >> Doc/faq/library.rst | 4 ++-- >> 1 files changed, 2 insertions(+), 2 deletions(-) >> >> >> diff --git a/Doc/faq/library.rst b/Doc/faq/library.rst >> --- a/Doc/faq/library.rst >> +++ b/Doc/faq/library.rst >> @@ -209,7 +209,7 @@ >> try: >> c = sys.stdin.read(1) >> print("Got character", repr(c)) >> - except IOError: >> + except OSError: >> pass >> finally: >> termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm) >> @@ -222,7 +222,7 @@ >> :func:`termios.tcsetattr` turns off stdin's echoing and disables >> canonical >> mode. :func:`fcntl.fnctl` is used to obtain stdin's file descriptor >> flags >> and modify them for non-blocking mode. Since reading stdin when it is >> empty >> - results in an :exc:`IOError`, this error is caught and ignored. >> + results in an :exc:`OSError`, this error is caught and ignored. > > > With any of these changes in the docs, please don't forget to include > appropriate "versionchanged" directives. Many people using the Python 3 docs > at "docs.python.org/3/" will still be on Python 3.2, and thus relying on the > presence of such directives to let them know that while the various > OS-related exception names are now just aliases for OSError in 3.3+, the > distinctions still matter in 3.2. > > Cheers, > Nick. > > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > -- Thanks, Andrew Svetlov From techtonik at gmail.com Wed Dec 19 14:57:32 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 19 Dec 2012 16:57:32 +0300 Subject: [Python-Dev] hg annotate is broken on hg.python.org In-Reply-To: References: Message-ID: On Mon, Dec 10, 2012 at 4:39 AM, Chris Jerdonek wrote: > On Sun, Dec 9, 2012 at 3:30 PM, anatoly techtonik > wrote: > > Just to let you know that annotate in hgweb is broken for Python sources. > > > > > http://hg.python.org/cpython/annotate/692be1f9fa1d/Lib/distutils/tests/test_register.py > > Maybe I'm missing something, but what's broken about it? There was broken html markup for some revisions. Everything is ok now. Next time I'll try to post a screenshot. Also, in my > experience it's okay to file issues about hg.python.org on the main > tracker if you suspect something isn't right or you think should be > improved. > I bet such issues will be closed as 'invalid', so there is no place for them there as explained in this request: http://psf.upfronthosting.co.za/roundup/meta/issue340 -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Wed Dec 19 23:14:47 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 20 Dec 2012 01:14:47 +0300 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: <5D7E1D12-6E32-41CE-9C58-7D34AD1D37B8@twistedmatrix.com> References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> <5D7E1D12-6E32-41CE-9C58-7D34AD1D37B8@twistedmatrix.com> Message-ID: On Sun, Dec 9, 2012 at 7:14 AM, Glyph wrote: > On Dec 7, 2012, at 5:10 PM, anatoly techtonik wrote: > > What about reading from other file descriptors? subprocess.Popen allows >> arbitrary file descriptors to be used. Is there any provision here for >> reading and writing non-blocking from or to those? > > > On Windows it is WriteFile/ReadFile and PeekNamedPipe. 
On Linux it is
> select. Of course a test is needed, but why should it not just work?
>
> This is exactly why the provision needs to be made explicitly.
>
> On Windows it is WriteFile and ReadFile and PeekNamedPipe - unless the
> handle is a socket in which case it needs to be WSARecv. Or maybe it's
> some other weird thing - like, maybe a mailslot - and you need to call
> a different API.

IIRC on Windows there is no socket descriptor that can be used as a file
descriptor. Seems reasonable to limit the implementation to standard
file descriptors on this platform.

> On *nix it really shouldn't be select. select cannot wait upon a file
> descriptor whose *value* is greater than FD_SETSIZE, which means it
> sets a hard (and small) limit on the number of things that a process
> which wants to use this facility can be doing.

I didn't know that. Should a note be added to
http://docs.python.org/2/library/select ?
I also thought that poll acts like, well, a polling function - eating
100% CPU while looping over inputs over and over, checking if there is
something to react to.

> On the other hand, if you hard-code another arbitrary limit like this
> into the stdlib subprocess module, it will just be another great reason
> why Twisted's spawnProcess is the best and everyone should use it
> instead, so be my guest ;-).

spawnProcess requires a reactor. This PEP is an alternative for the
proponents of green energy. =)

From techtonik at gmail.com  Wed Dec 19 23:20:36 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Thu, 20 Dec 2012 01:20:36 +0300
Subject: [Python-Dev] PEP 3145 (With Contents)
In-Reply-To:
References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com>
 <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com>
 <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain>
Message-ID:

On Sun, Dec 9, 2012 at 7:17 AM, Gregory P. Smith wrote:

> I'm really not sure what this PEP is trying to get at given that it
> contains no examples and sounds from the descriptions to be adding a
> complicated api on top of something that already, IMNSHO, has too much
> in it (subprocess.Popen).
>
> Regardless, any user can use the stdout/err/in file objects with their
> own code that handles them asynchronously (yes that can be painful but
> that is what is required for _any_ socket or pipe I/O you don't want to
> block on).

And how does one use stdout/stderr/stdin asynchronously in a
cross-platform manner? IIUC the problem is that every read is blocking.

> It *sounds* to me like this entire PEP could be written and released as
> a third party module on PyPI that offers a subprocess.Popen subclass
> adding some more convenient non-blocking APIs. That's where I'd start
> if I were interested in this as a future feature.

I've rewritten the PEP based on how I understand the code. I don't know
how to update it or how to comply with the open documentation license,
so I just attach it and add the PEPs list to CC. I too have a feeling
that the PEP should be stripped of the additional high level API until
the low level functionality is well understood and accepted.
--
anatoly t.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pep-3145.diff Type: application/octet-stream Size: 9273 bytes Desc: not available URL: From ben+python at benfinney.id.au Thu Dec 20 01:18:59 2012 From: ben+python at benfinney.id.au (Ben Finney) Date: Thu, 20 Dec 2012 11:18:59 +1100 Subject: [Python-Dev] Draft PEP for time zone support. References: <50C8541E.60406@python.org> Message-ID: <7wsj71vamk.fsf@benfinney.id.au> Terry Reedy writes: > On 12/12/2012 10:56 AM, Lennart Regebro wrote: > > >> It seems like calling get_timezone() with an unknown timezone > >> should just throw ValueError, not necessarily some custom > >> Exception? > > > > That could very well be. What are others opinions on this? > > ValueError. That is what it is. Nothing special here. I think it's useful to have this raise a custom exception UnknownTimeZoneError, subclassed from ValueError. That satisfies those (including me!) who think it should be a ValueError, while also making the exception more specific so it can be handled apart from other possible ValueError causes. In short: +1 to ?class UnknownTimeZoneError(ValueError)?. -- \ ?Members of the general public commonly find copyright rules | `\ implausible, and simply disbelieve them.? ?Jessica Litman, | _o__) _Digital Copyright_ | Ben Finney From glyph at twistedmatrix.com Thu Dec 20 01:47:05 2012 From: glyph at twistedmatrix.com (Glyph) Date: Wed, 19 Dec 2012 16:47:05 -0800 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> <5D7E1D12-6E32-41CE-9C58-7D34AD1D37B8@twistedmatrix.com> Message-ID: <31357E70-8F23-4464-9CE2-B1F39EF44585@twistedmatrix.com> On Dec 19, 2012, at 2:14 PM, anatoly techtonik wrote: > On Sun, Dec 9, 2012 at 7:14 AM, Glyph wrote: > On Dec 7, 2012, at 5:10 PM, anatoly techtonik wrote: > >> What about reading from other file descriptors? subprocess.Popen allows arbitrary file descriptors to be used. Is there any provision here for reading and writing non-blocking from or to those? >> >> On Windows it is WriteFile/ReadFile and PeekNamedPipe. On Linux it is select. Of course a test is needed, but why it should not just work? > > > This is exactly why the provision needs to be made explicitly. > > On Windows it is WriteFile and ReadFile and PeekNamedPipe - unless the handle is a socket in which case it needs to be WSARecv. Or maybe it's some other weird thing - like, maybe a mailslot - and you need to call a different API. > > IIRC on Windows there is no socket descriptor that can be used as a file descriptor. Seems reasonable to limit the implementation to standard file descriptors in this platform. Via the documentation of ReadFile: hFile [in] A handle to the device (for example, a file, file stream, physical disk, volume, console buffer, tape drive, socket, communications resource, mailslot, or pipe). (...) For asynchronous read operations, hFile can be any handle that is opened with the FILE_FLAG_OVERLAPPED flag by the CreateFilefunction, or a socket handle returned by the socket or accept function. (emphasis mine). So, you can treat sockets as regular files in some contexts, and not in others. Of course there are other reasons to use WSARecv instead of ReadFile sometimes, which is why there are multiple functions. > On *nix it really shouldn't be select. 
From exarkun at twistedmatrix.com Thu Dec 20 01:02:42 2012 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Thu, 20 Dec 2012 00:02:42 -0000 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> <5D7E1D12-6E32-41CE-9C58-7D34AD1D37B8@twistedmatrix.com> Message-ID: <20121220000242.6389.1569707808.divmod.xquotient.316@localhost6.localdomain6> Please stop copying me on this thread. Thanks, Jean-Paul From techtonik at gmail.com Thu Dec 20 04:46:02 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 20 Dec 2012 06:46:02 +0300 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: <31357E70-8F23-4464-9CE2-B1F39EF44585@twistedmatrix.com> References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> <5D7E1D12-6E32-41CE-9C58-7D34AD1D37B8@twistedmatrix.com> <31357E70-8F23-4464-9CE2-B1F39EF44585@twistedmatrix.com> Message-ID: On Thu, Dec 20, 2012 at 3:47 AM, Glyph wrote: > > On Dec 19, 2012, at 2:14 PM, anatoly techtonik > wrote: > > On Sun, Dec 9, 2012 at 7:14 AM, Glyph wrote: > >> On Dec 7, 2012, at 5:10 PM, anatoly techtonik >> wrote: >> >> What about reading from other file descriptors? subprocess.Popen allows >>> arbitrary file descriptors to be used. Is there any provision here for >>> reading and writing non-blocking from or to those? >> >> >> On Windows it is WriteFile/ReadFile and PeekNamedPipe. On Linux it is >> select. Of course a test is needed, but why it should not just work? >> >> >> This is exactly why the provision needs to be made explicitly. >> >> On Windows it is WriteFile and ReadFile and PeekNamedPipe - unless the >> handle is a socket in which case it needs to be WSARecv. Or maybe it's >> some other weird thing - like, maybe a mailslot - and you need to call a >> different API. >> > > IIRC on Windows there is no socket descriptor that can be used as a file > descriptor.
Seems reasonable to limit the implementation to standard file > descriptors in this platform. > > > Via the documentation of ReadFile: < http://msdn.microsoft.com/en-us/library/windows/desktop/aa365467(v=vs.85).aspx > > > hFile [in] > > A handle to the device (for example, a file, file stream, physical disk, > volume, console buffer, tape drive, *socket*, communications resource, > mailslot, or pipe). (...) For asynchronous read operations, hFile can be > any handle that is opened with the FILE_FLAG_OVERLAPPED flag by > the CreateFile function, or a *socket handle returned by > the socket or accept function*. > > > (emphasis mine). > > So, you can treat sockets as regular files in some contexts, and not in > others. Of course there are other reasons to use WSARecv instead of > ReadFile sometimes, which is why there are multiple functions. > handle != descriptor, and Python documentation explicitly says that socket descriptor is limited, so it's ok to continue not supporting socket descriptors for pipes. http://docs.python.org/2/library/socket.html#socket.socket.fileno > On *nix it really shouldn't be select. select cannot wait upon a file >> descriptor whose *value* is greater than FD_SETSIZE, which means it sets >> a hard (and small) limit on the number of things that a process which wants >> to use this facility can be doing. >> > > I didn't know that. Should a note be added to > http://docs.python.org/2/library/select ? > > > The note that should be added there is simply "you should know how the > select system call works in C if you want to use this module". > Why spreading FUD if it is possible to define a good entrypoint for those who want to learn, but don't have enough time? Why not to say directly that select interface is outdated? > On the other hand, if you hard-code another arbitrary limit like this >> into the stdlib subprocess module, it will just be another great reason why >> Twisted's spawnProcess is the best and everyone should use it instead, so >> be my guest ;-). >> > > spawnProcess requires a reactor. This PEP is an alternative for the > proponents of green energy. =) > > > Do you know what happens when you take something that is supposed to be > happening *inside* a reactor, and then move it *outside* a reactor? It's > not called "green energy", it's called "a bomb" ;-). > The biggest complain about nuclear physics is that to understand what's going on it should have been gone 3D long ago. =) I think Twisted needs to organize competition on the best visualization of underlying concepts. It will help people grasp the concepts behind and different problems much faster (as well as gain an ability to compare different reactors). -------------- next part -------------- An HTML attachment was scrubbed... URL: From kristjan at ccpgames.com Thu Dec 20 15:08:43 2012 From: kristjan at ccpgames.com (Kristján Valur Jónsson) Date: Thu, 20 Dec 2012 14:08:43 +0000 Subject: [Python-Dev] http.client Nagle/delayed-ack optimization In-Reply-To: <20121214202700.169ed8a9@pitrou.net> References: <20121214202700.169ed8a9@pitrou.net> Message-ID: How serendipitous, I was just reporting a similar problem to Sony in one of their console sdks yesterday :) Indeed, the Nagle problem only shows up if you are sending more than one segment that are not full size. It will not occur in a sequence of full segments. Therefore, it is perfectly ok to send the headers + payload as a set of large chunks. The problem only occurs if sending two or more short segments.
So, if sending even the short headers, followed by the large payload, there is no problem. The problem exists only if, in addition to the short headers, you are sending the short payload. In summary: If the payload is less than the MSS (consider this perhaps 2k) send it along with the headers. Otherwise, you can go ahead and send the headers, and the payload (in large chunks if you want) without fear. See: http://en.wikipedia.org/wiki/Nagle%27s_algorithm and http://en.wikipedia.org/wiki/TCP_delayed_acknowledgment K > -----Original Message----- > From: Python-Dev [mailto:python-dev- > bounces+kristjan=ccpgames.com at python.org] On Behalf Of Antoine Pitrou > Sent: 14. desember 2012 19:27 > To: python-dev at python.org > Subject: Re: [Python-Dev] http.client Nagle/delayed-ack optimization > > On Sat, 15 Dec 2012 06:17:19 +1100 > Ben Leslie wrote: > > The http.client HTTPConnection._send_output method has an optimization > > for avoiding bad interactions between delayed-ack and the Nagle > algorithm: > > > > http://hg.python.org/cpython/file/f32f67d26035/Lib/http/client.py#l884 > > > > Unfortunately this interacts rather poorly if the case where the > > message_body is a bytes instance and is rather large. > > > > If the message_body is bytes it is appended to the headers, which > > causes a copy of the data. When message_body is large this duplication > > of data can cause a significant spike in memory usage. > > > > (In my particular case I was uploading a 200MB file to 30 hosts at the > > same leading to memory spikes over 6GB. > > > > I've solved this by subclassing and removing the optimization, however > > I'd appreciate thoughts on how this could best be solved in the library itself. > > > > Options I have thought of are: > > > > 1: Have some size threshold on the copy. A little bit too much magic. > > Unclear what the size threshold should be. > > I think a hardcoded threshold is the right thing to do. It doesn't sound very > useful to try doing a single send() call when you have a large chunk of data > (say, more than 1 MB). > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python- > dev/kristjan%40ccpgames.com
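In code, the rule Kristján summarizes reduces to a size check before the write. A sketch (the 2 KB cutoff is his illustrative figure, not a measured MSS, and the helper name is made up; sendall is used for brevity):

    MSS_THRESHOLD = 2048  # illustrative cutoff; a real MSS varies per path

    def send_request(sock, headers, body):
        if len(body) <= MSS_THRESHOLD:
            # A short payload rides along with the headers in one segment,
            # avoiding the Nagle/delayed-ACK stall on two short writes.
            sock.sendall(headers + body)
        else:
            # A large payload goes separately: full-size segments do not
            # trigger the Nagle interaction, and no extra copy is made.
            sock.sendall(headers)
            sock.sendall(body)

This also addresses Ben Leslie's memory spike, since the large body is never concatenated onto the headers.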
From barry at python.org Thu Dec 20 17:43:15 2012 From: barry at python.org (Barry Warsaw) Date: Thu, 20 Dec 2012 11:43:15 -0500 Subject: [Python-Dev] Draft PEP for time zone support. References: <50C8541E.60406@python.org> <7wsj71vamk.fsf@benfinney.id.au> Message-ID: <20121220114315.554a52ac@resist.wooz.org> On Dec 20, 2012, at 11:18 AM, Ben Finney wrote: >Terry Reedy writes: > >> On 12/12/2012 10:56 AM, Lennart Regebro wrote: >> >> >> It seems like calling get_timezone() with an unknown timezone >> >> should just throw ValueError, not necessarily some custom >> >> Exception? >> > >> > That could very well be. What are others opinions on this? >> >> ValueError. That is what it is. Nothing special here. > >I think it's useful to have this raise a custom exception >UnknownTimeZoneError, subclassed from ValueError. That satisfies those >(including me!) who think it should be a ValueError, while also making >the exception more specific so it can be handled apart from other >possible ValueError causes. > >In short: +1 to 'class UnknownTimeZoneError(ValueError)'. That would be `class UnknownTimeZoneError(ValueError, TimeZoneError)`. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL:
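Spelled out, the hierarchy Ben and Barry are converging on might look like the sketch below. TimeZoneError is assumed here from the draft PEP and the lookup is a stand-in, not the PEP's real implementation:

    class TimeZoneError(Exception):          # assumed base from the draft PEP
        pass

    class UnknownTimeZoneError(ValueError, TimeZoneError):
        pass

    def get_timezone(name):
        zones = {'UTC': 'UTC'}               # stand-in for the real zone database
        try:
            return zones[name]
        except KeyError:
            raise UnknownTimeZoneError(name)

    try:
        get_timezone('Mars/Olympus_Mons')
    except ValueError as err:
        # Callers that only know about ValueError still catch it;
        # timezone-aware callers can catch the more specific classes.
        print('unknown time zone:', err)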
From brian at python.org Thu Dec 20 18:33:27 2012 From: brian at python.org (Brian Curtin) Date: Thu, 20 Dec 2012 11:33:27 -0600 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet Message-ID: Last week in Raymond's dictionary thread, the topic of ARM came up, along with the relative lack of build slave coverage. Today Trent Nelson received the PandaBoard purchased by the PSF, and a Raspberry Pi should be coming shortly as well. http://blog.python.org/2012/12/pandaboard-raspberry-pi-coming-to.html Thanks to the PSF for purchasing and thanks to Trent for offering to host them in Snakebite! From glyph at twistedmatrix.com Thu Dec 20 05:02:43 2012 From: glyph at twistedmatrix.com (Glyph) Date: Wed, 19 Dec 2012 20:02:43 -0800 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> <5D7E1D12-6E32-41CE-9C58-7D34AD1D37B8@twistedmatrix.com> <31357E70-8F23-4464-9CE2-B1F39EF44585@twistedmatrix.com> Message-ID: On Dec 19, 2012, at 7:46 PM, anatoly techtonik wrote: > >> On *nix it really shouldn't be select. select cannot wait upon a file descriptor whose value is greater than FD_SETSIZE, which means it sets a hard (and small) limit on the number of things that a process which wants to use this facility can be doing. >> >> I didn't know that. Should a note be added to http://docs.python.org/2/library/select ? > > The note that should be added there is simply "you should know how the select system call works in C if you want to use this module". > > Why spreading FUD if it is possible to define a good entrypoint for those who want to learn, but don't have enough time? Why not to say directly that select interface is outdated? It's not FUD. If you know how select() works in C, you may well want to call it. It's the most portable multiplexing API, although it has a number of limitations. Really, what most users in this situation ought to be using is Twisted, but it seems there is not sufficient interest to bundle Twisted's core in the stdlib. However, the thing Guido is working on lately may be interoperable enough with Twisted that you can upgrade to it more easily in future versions of Python, so one day it may be reasonable to say select is outdated. (Maybe not though. It's a good thing nobody told me that select was deprecated in favor of asyncore.) >> On the other hand, if you hard-code another arbitrary limit like this into the stdlib subprocess module, it will just be another great reason why Twisted's spawnProcess is the best and everyone should use it instead, so be my guest ;-). >> >> spawnProcess requires a reactor. This PEP is an alternative for the proponents of green energy. =) > > Do you know what happens when you take something that is supposed to be happening inside a reactor, and then move it outside a reactor? It's not called "green energy", it's called "a bomb" ;-). > > The biggest complain about nuclear physics is that to understand what's going on it should have been gone 3D long ago. =) I think Twisted needs to organize competition on the best visualization of underlying concepts. It will help people grasp the concepts behind and different problems much faster (as well as gain an ability to compare different reactors). I would love for someone to do this, of course, but now we're _really_ off topic. -glyph -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Thu Dec 20 19:06:18 2012 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 20 Dec 2012 10:06:18 -0800 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: References: Message-ID: On Thu, Dec 20, 2012 at 9:33 AM, Brian Curtin wrote: > Last week in Raymond's dictionary thread, the topic of ARM came up, > along with the relative lack of build slave coverage. Today Trent > Nelson received the PandaBoard purchased by the PSF, and a Raspberry > Pi should be coming shortly as well. > > http://blog.python.org/2012/12/pandaboard-raspberry-pi-coming-to.html > > Thanks to the PSF for purchasing and thanks to Trent for offering to > host them in Snakebite! > __________________________ > That's good news. A related question about Snakebite, though. Maybe I missed something obvious, but is there an overview of how the core devs can use it? In particular, I'd want to know if Snakebite runs Python's tests regularly - and if it does, how can I see the status. How do I know if any commit of mine broke some host Snakebite has? How can I SSH to that host in order to reproduce and fix the problem? Some sort of a blog post about this, at least, would be very helpful for me and possibly other developers as well. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Thu Dec 20 19:10:45 2012 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 20 Dec 2012 12:10:45 -0600 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: References: Message-ID: 2012/12/20 Eli Bendersky : > On Thu, Dec 20, 2012 at 9:33 AM, Brian Curtin wrote: >> >> Last week in Raymond's dictionary thread, the topic of ARM came up, >> along with the relative lack of build slave coverage. Today Trent >> Nelson received the PandaBoard purchased by the PSF, and a Raspberry >> Pi should be coming shortly as well. >> >> http://blog.python.org/2012/12/pandaboard-raspberry-pi-coming-to.html >> >> Thanks to the PSF for purchasing and thanks to Trent for offering to >> host them in Snakebite! >> __________________________ > > > That's good news. A related question about Snakebite, though. Maybe I > missed something obvious, but is there an overview of how the core devs can > use it? In particular, I'd want to know if Snakebite runs Python's tests > regularly - and if it does, how can I see the status. How do I know if any > commit of mine broke some host Snakebite has? How can I SSH to that host in > order to reproduce and fix the problem? Some sort of a blog post about this, > at least, would be very helpful for me and possibly other developers as > well. http://mail.python.org/pipermail/python-dev/2012-September/121651.html Presumably that should go somewhere more permanent. -- Regards, Benjamin From trent at snakebite.org Thu Dec 20 19:43:49 2012 From: trent at snakebite.org (Trent Nelson) Date: Thu, 20 Dec 2012 13:43:49 -0500 Subject: [Python-Dev] Possible GIL/threading issue involving subprocess and PyMem_MALLOC... Message-ID: <20121220184348.GA69156@snakebite.org> This seems odd to me so I wanted to see what others think.
The unit test Lib/unittest/test/test_runner.py:Test_TextRunner.test_warnings will eventually hit subprocess.Popen._communicate. The `mswindows` implementation of this method relies on threads to buffer stdin/stdout. That'll eventually result in PyOs_StdioReadline being called without the GIL being held. PyOs_StdioReadline calls PyMem_MALLOC, PyMem_FREE and possibly PyMem_REALLOC. On a debug build, these macros are redirected to their _PyMem_Debug* counterparts. The call hierarchy for _PyMem_DebugMalloc looks like this: void * _PyMem_DebugMalloc(size_t nbytes) { return _PyObject_DebugMallocApi(_PYMALLOC_MEM_ID, nbytes); } /* generic debug memory api, with an "id" to identify the API in use */ void * _PyObject_DebugMallocApi(char id, size_t nbytes) { uchar *p; /* base address of malloc'ed block */ uchar *tail; /* p + 2*SST + nbytes == pointer to tail pad bytes */ size_t total; /* nbytes + 4*SST */ bumpserialno(); ------------^^^^^^^^^^^^^^^ total = nbytes + 4*SST; if (total < nbytes) /* overflow: can't represent total as a size_t */ return NULL; p = (uchar *)PyObject_Malloc(total); -------------------------^^^^^^^^^^^^^^^^^^^^^^^ if (p == NULL) return NULL; Both bumpserialno() and PyObject_Malloc affect global state. The latter also has a bunch of LOCK() and UNLOCK() statements, but these end up being no-ops: /* * Python's threads are serialized, * so object malloc locking is disabled. */ #define SIMPLELOCK_DECL(lock) /* simple lock declaration */ #define SIMPLELOCK_INIT(lock) /* allocate (if needed) and ... */ #define SIMPLELOCK_FINI(lock) /* free/destroy an existing */ #define SIMPLELOCK_LOCK(lock) /* acquire released lock */ #define SIMPLELOCK_UNLOCK(lock) /* release acquired lock */ ... /* * This malloc lock */ SIMPLELOCK_DECL(_malloc_lock) #define LOCK() SIMPLELOCK_LOCK(_malloc_lock) #define UNLOCK() SIMPLELOCK_UNLOCK(_malloc_lock) #define LOCK_INIT() SIMPLELOCK_INIT(_malloc_lock) #define LOCK_FINI() SIMPLELOCK_FINI(_malloc_lock) The PyObject_Malloc() one concerns me the most, as it affects huge amounts of global state. Also, I just noticed PyOs_StdioReadline() can call PyErr_SetString, which will result in a bunch of other calls that should only be made whilst the GIL is held. So, like I said, this seems like a bit of a head scratcher. Legit issue or am I missing something? Trent. From trent at snakebite.org Thu Dec 20 19:47:17 2012 From: trent at snakebite.org (Trent Nelson) Date: Thu, 20 Dec 2012 13:47:17 -0500 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: References: Message-ID: <20121220184717.GB69156@snakebite.org> On Thu, Dec 20, 2012 at 10:10:45AM -0800, Benjamin Peterson wrote: > 2012/12/20 Eli Bendersky : > > On Thu, Dec 20, 2012 at 9:33 AM, Brian Curtin wrote: > >> > >> Last week in Raymond's dictionary thread, the topic of ARM came up, > >> along with the relative lack of build slave coverage. Today Trent > >> Nelson received the PandaBoard purchased by the PSF, and a Raspberry > >> Pi should be coming shortly as well. > >> > >> http://blog.python.org/2012/12/pandaboard-raspberry-pi-coming-to.html > >> > >> Thanks to the PSF for purchasing and thanks to Trent for offering to > >> host them in Snakebite! > >> __________________________ > > > > > > That's good news. A related question about Snakebite, though. Maybe I > > missed something obvious, but is there an overview of how the core devs can > > use it? In particular, I'd want to know if Snakebite runs Python's tests > > regularly - and if it does, how can I see the status. 
How do I know if any > > commit of mine broke some host Snakebite has? How can I SSH to that host in > > order to reproduce and fix the problem? Some sort of a blog post about this, > > at least, would be very helpful for me and possibly other developers as > > well. > > http://mail.python.org/pipermail/python-dev/2012-September/121651.html > > Presumably that should go somewhere more permanent. Indeed, I'm going to carve out some time over the Christmas/NY break to work on this. There should really be a "Developer's Guide" that explains how to get the most out of the network. Trent. From trent at snakebite.org Thu Dec 20 19:52:56 2012 From: trent at snakebite.org (Trent Nelson) Date: Thu, 20 Dec 2012 13:52:56 -0500 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: References: Message-ID: <20121220185256.GC69156@snakebite.org> On Thu, Dec 20, 2012 at 09:33:27AM -0800, Brian Curtin wrote: > Last week in Raymond's dictionary thread, the topic of ARM came up, > along with the relative lack of build slave coverage. Today Trent > Nelson received the PandaBoard purchased by the PSF, and a Raspberry > Pi should be coming shortly as well. > > http://blog.python.org/2012/12/pandaboard-raspberry-pi-coming-to.html > > Thanks to the PSF for purchasing and thanks to Trent for offering to > host them in Snakebite! No problemo'. If only all the other Snakebite servers could fit in my palm and run off 0.25A. (The HP-UX and Tru64 boxes in particular take up 7U/8U, weigh 160/140lbs, and chew about ~12A each.) I'll work on setting the ARM boards up next week. Trent. From a.cavallo at cavallinux.eu Thu Dec 20 20:00:14 2012 From: a.cavallo at cavallinux.eu (a.cavallo at cavallinux.eu) Date: Thu, 20 Dec 2012 20:00:14 +0100 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet Message-ID: <5354.1356030014@cavallinux.eu> How about folding them??? I did it, now I don't need a power supply anymore :O On Thu 20/12/12 19:52, Trent Nelson trent at snakebite.org wrote: > No problemo'. If only all the other Snakebite servers could fit in > my palm and run off 0.25A. From solipsis at pitrou.net Thu Dec 20 20:18:53 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 20 Dec 2012 20:18:53 +0100 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet References: <20121220185256.GC69156@snakebite.org> Message-ID: <20121220201853.69f67e1f@pitrou.net> On Thu, 20 Dec 2012 13:52:56 -0500 Trent Nelson wrote: > On Thu, Dec 20, 2012 at 09:33:27AM -0800, Brian Curtin wrote: > > Last week in Raymond's dictionary thread, the topic of ARM came up, > > along with the relative lack of build slave coverage. Today Trent > > Nelson received the PandaBoard purchased by the PSF, and a Raspberry > > Pi should be coming shortly as well. > > > > http://blog.python.org/2012/12/pandaboard-raspberry-pi-coming-to.html > > > > Thanks to the PSF for purchasing and thanks to Trent for offering to > > host them in Snakebite! > > No problemo'. If only all the other Snakebite servers could fit in > my palm and run off 0.25A. (The HP-UX and Tru64 boxes in particular > take up 7U/8U, weigh 160/140lbs, and chew about ~12A each.) > > I'll work on setting the ARM boards up next week. For the record, Barry's ARM buildbot has been failing for a long time: http://buildbot.python.org/all/buildslaves/warsaw-ubuntu-arm Regards Antoine. 
From trent at snakebite.org Thu Dec 20 20:54:16 2012 From: trent at snakebite.org (Trent Nelson) Date: Thu, 20 Dec 2012 14:54:16 -0500 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: <20121220185256.GC69156@snakebite.org> References: <20121220185256.GC69156@snakebite.org> Message-ID: <20121220195416.GA69248@snakebite.org> On Thu, Dec 20, 2012 at 10:52:56AM -0800, Trent Nelson wrote: > I'll work on setting the ARM boards up next week. Does anyone have a preference regarding the operating system? There are a bunch of choices listed here: http://www.omappedia.org/wiki/Main_Page As long as it can run a recent sshd and zsh, I have no preference. Trent. From brett at python.org Thu Dec 20 21:18:41 2012 From: brett at python.org (Brett Cannon) Date: Thu, 20 Dec 2012 15:18:41 -0500 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> Message-ID: You cannot rewrite an existing PEP if you are not one of the original owners, nor can you add yourself as an author to a PEP without permission from the original authors. And please do not CC the peps mailing list on discussions. It should only be used to mail in new PEPs or acceptable patches to PEPs. On Wed, Dec 19, 2012 at 5:20 PM, anatoly techtonik wrote: > On Sun, Dec 9, 2012 at 7:17 AM, Gregory P. Smith wrote: > >> I'm really not sure what this PEP is trying to get at given that it >> contains no examples and sounds from the descriptions to be adding a >> complicated api on top of something that already, IMNSHO, has too much it >> (subprocess.Popen). >> >> Regardless, any user can use the stdout/err/in file objects with their >> own code that handles them asynchronously (yes that can be painful but that >> is what is required for _any_ socket or pipe I/O you don't want to block >> on). >> > > And how to use stdout/stderr/in asynchronously in cross-platform manner? > IIUC the problem is that every read is blocking. > > >> It *sounds* to me like this entire PEP could be written and released as >> a third party module on PyPI that offers a subprocess.Popen subclass adding >> some more convenient non-blocking APIs. That's where I'd start if I were >> interested in this as a future feature. >> > > I've rewritten the PEP based on how do I understand the code. I don't know > how to update it and how to comply with open documentation license, so I > just attach it and add PEPs list to CC. Me too has a feeling that the PEP > should be stripped of additional high level API until low level > functionality is well understood and accepted. > > -- > anatoly t. > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From barry at python.org Thu Dec 20 21:23:50 2012 From: barry at python.org (Barry Warsaw) Date: Thu, 20 Dec 2012 15:23:50 -0500 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: <20121220195416.GA69248@snakebite.org> References: <20121220185256.GC69156@snakebite.org> <20121220195416.GA69248@snakebite.org> Message-ID: <20121220152350.4d832508@limelight.wooz.org> On Dec 20, 2012, at 02:54 PM, Trent Nelson wrote: >On Thu, Dec 20, 2012 at 10:52:56AM -0800, Trent Nelson wrote: >> I'll work on setting the ARM boards up next week. > > Does anyone have a preference regarding the operating system? There > are a bunch of choices listed here: > > http://www.omappedia.org/wiki/Main_Page > > As long as it can run a recent sshd and zsh, I have no preference. Well, I'm biased of course, but Ubuntu should be easy to install and should run just fine AFAIK. I'm running 12.10 on my ARM buildbot (an iMX.53 board). -Barry From chris.jerdonek at gmail.com Thu Dec 20 21:55:05 2012 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Thu, 20 Dec 2012 12:55:05 -0800 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> Message-ID: On Thu, Dec 20, 2012 at 12:18 PM, Brett Cannon wrote: > > And please do not CC the peps mailing list on discussions. It should only be > used to mail in new PEPs or acceptable patches to PEPs. PEP 1 should perhaps be clarified if the above is the case. Currently, PEP 1 says all PEP-related e-mail should be sent there: "The PEP editors assign PEP numbers and change their status. Please send all PEP-related email to (no cross-posting please). Also see PEP Editor Responsibilities & Workflow below." as well as: "A PEP editor must subscribe to the list. All PEP-related correspondence should be sent (or CC'd) to (but please do not cross-post!)." (Incidentally, the statement not to cross-post seems contradictory if a PEP-related e-mail is also sent to python-dev, for example.) --Chris > On Wed, Dec 19, 2012 at 5:20 PM, anatoly techtonik > wrote: >> >> On Sun, Dec 9, 2012 at 7:17 AM, Gregory P. Smith wrote: >>> >>> I'm really not sure what this PEP is trying to get at given that it >>> contains no examples and sounds from the descriptions to be adding a >>> complicated api on top of something that already, IMNSHO, has too much it >>> (subprocess.Popen). >>> >>> Regardless, any user can use the stdout/err/in file objects with their >>> own code that handles them asynchronously (yes that can be painful but that >>> is what is required for _any_ socket or pipe I/O you don't want to block >>> on). >> >> >> And how to use stdout/stderr/in asynchronously in cross-platform manner? >> IIUC the problem is that every read is blocking. >> >>> >>> It sounds to me like this entire PEP could be written and released as a >>> third party module on PyPI that offers a subprocess.Popen subclass adding >>> some more convenient non-blocking APIs. That's where I'd start if I were >>> interested in this as a future feature. >> >> >> I've rewritten the PEP based on how do I understand the code. I don't know >> how to update it and how to comply with open documentation license, so I >> just attach it and add PEPs list to CC. 
Me too has a feeling that the PEP >> should be stripped of additional high level API until low level >> functionality is well understood and accepted. >> >> -- >> anatoly t. >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/brett%40python.org >> > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/chris.jerdonek%40gmail.com > From andrew.svetlov at gmail.com Thu Dec 20 22:11:27 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Thu, 20 Dec 2012 23:11:27 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Mention OSError instead of IOError in the docs. In-Reply-To: References: <3YQsVs2mGnzQl0@mail.python.org> Message-ID: I'm not sure about applying the doc changes to 3.3. They are very minor. The main change will be the deprecation of the aliases in the docs, which can be applied only to the upcoming release. On Wed, Dec 19, 2012 at 7:05 PM, Serhiy Storchaka wrote: > On 19.12.12 09:24, Nick Coghlan wrote: >> >> With any of these changes in the docs, please don't forget to include >> appropriate "versionchanged" directives. Many people using the Python 3 >> docs at "docs.python.org/3/ " will still be >> >> on Python 3.2, and thus relying on the presence of such directives to >> let them know that while the various OS-related exception names are now >> just aliases for OSError in 3.3+, the distinctions still matter in 3.2. > > > I also propose to apply all this documentation changes to 3.3. > > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins -- Thanks, Andrew Svetlov From brett at python.org Thu Dec 20 22:12:40 2012 From: brett at python.org (Brett Cannon) Date: Thu, 20 Dec 2012 16:12:40 -0500 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> Message-ID: On Thu, Dec 20, 2012 at 3:55 PM, Chris Jerdonek wrote: > On Thu, Dec 20, 2012 at 12:18 PM, Brett Cannon wrote: > > > > And please do not CC the peps mailing list on discussions. It should > only be > > used to mail in new PEPs or acceptable patches to PEPs. > > PEP 1 should perhaps be clarified if the above is the case. > Currently, PEP 1 says all PEP-related e-mail should be sent there: > > "The PEP editors assign PEP numbers and change their status. Please > send all PEP-related email to (no cross-posting > please). Also see PEP Editor Responsibilities & Workflow below." > > as well as: > > "A PEP editor must subscribe to the list. All > PEP-related correspondence should be sent (or CC'd) to > (but please do not cross-post!)." > > (Incidentally, the statement not to cross-post seems contradictory if > a PEP-related e-mail is also sent to python-dev, for example.) > But it very clearly states to NOT cross-post which is exactly what Anatoly did and that is what I take issue with the most. I personally don't see any confusion with the wording. It clearly states that if you are a PEP author you should mail the peps editors and NOT cross-post.
If you are an editor, make sure any emailing you do with an individual CCs the list but do NOT cross-post. -Brett > > --Chris > > > > > On Wed, Dec 19, 2012 at 5:20 PM, anatoly techtonik > > wrote: > >> > >> On Sun, Dec 9, 2012 at 7:17 AM, Gregory P. Smith > wrote: > >>> > >>> I'm really not sure what this PEP is trying to get at given that it > >>> contains no examples and sounds from the descriptions to be adding a > >>> complicated api on top of something that already, IMNSHO, has too much > it > >>> (subprocess.Popen). > >>> > >>> Regardless, any user can use the stdout/err/in file objects with their > >>> own code that handles them asynchronously (yes that can be painful but > that > >>> is what is required for _any_ socket or pipe I/O you don't want to > block > >>> on). > >> > >> > >> And how to use stdout/stderr/in asynchronously in cross-platform manner? > >> IIUC the problem is that every read is blocking. > >> > >>> > >>> It sounds to me like this entire PEP could be written and released as a > >>> third party module on PyPI that offers a subprocess.Popen subclass > adding > >>> some more convenient non-blocking APIs. That's where I'd start if I > were > >>> interested in this as a future feature. > >> > >> > >> I've rewritten the PEP based on how do I understand the code. I don't > know > >> how to update it and how to comply with open documentation license, so I > >> just attach it and add PEPs list to CC. Me too has a feeling that the > PEP > >> should be stripped of additional high level API until low level > >> functionality is well understood and accepted. > >> > >> -- > >> anatoly t. > >> > >> _______________________________________________ > >> Python-Dev mailing list > >> Python-Dev at python.org > >> http://mail.python.org/mailman/listinfo/python-dev > >> Unsubscribe: > >> http://mail.python.org/mailman/options/python-dev/brett%40python.org > >> > > > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > > > http://mail.python.org/mailman/options/python-dev/chris.jerdonek%40gmail.com > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu Dec 20 22:57:17 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 21 Dec 2012 07:57:17 +1000 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: <20121220152350.4d832508@limelight.wooz.org> References: <20121220185256.GC69156@snakebite.org> <20121220195416.GA69248@snakebite.org> <20121220152350.4d832508@limelight.wooz.org> Message-ID: I'd vote for Fedora on at least one of them (like Barry, I'm biased, though) Cheers, Nick. -- Sent from my phone, thus the relative brevity :) On Dec 21, 2012 6:27 AM, "Barry Warsaw" wrote: > On Dec 20, 2012, at 02:54 PM, Trent Nelson wrote: > > >On Thu, Dec 20, 2012 at 10:52:56AM -0800, Trent Nelson wrote: > >> I'll work on setting the ARM boards up next week. > > > > Does anyone have a preference regarding the operating system? There > > are a bunch of choices listed here: > > > > http://www.omappedia.org/wiki/Main_Page > > > > As long as it can run a recent sshd and zsh, I have no preference. > > Well, I'm biased of course, but Ubuntu should be easy to install and should > run just fine AFAIK. I'm running 12.10 on my ARM buildbot (an iMX.53 > board). 
> > -Barry > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From meadori at gmail.com Thu Dec 20 23:34:29 2012 From: meadori at gmail.com (Meador Inge) Date: Thu, 20 Dec 2012 16:34:29 -0600 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: <20121220152350.4d832508@limelight.wooz.org> References: <20121220185256.GC69156@snakebite.org> <20121220195416.GA69248@snakebite.org> <20121220152350.4d832508@limelight.wooz.org> Message-ID: On Thu, Dec 20, 2012 at 2:23 PM, Barry Warsaw wrote: > On Dec 20, 2012, at 02:54 PM, Trent Nelson wrote: > >>On Thu, Dec 20, 2012 at 10:52:56AM -0800, Trent Nelson wrote: >>> I'll work on setting the ARM boards up next week. >> >> Does anyone have a preference regarding the operating system? There >> are a bunch of choices listed here: >> >> http://www.omappedia.org/wiki/Main_Page >> >> As long as it can run a recent sshd and zsh, I have no preference. > > Well, I'm biased of course, but Ubuntu should be easy to install and should > run just fine AFAIK. I'm running 12.10 on my ARM buildbot (an iMX.53 board). +1 for Ubuntu. I have setup a PandaBoard at home with Ubuntu and built/tested Python with it. The installation was very easy. -- # Meador From eliben at gmail.com Fri Dec 21 00:10:49 2012 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 20 Dec 2012 15:10:49 -0800 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: <20121220184717.GB69156@snakebite.org> References: <20121220184717.GB69156@snakebite.org> Message-ID: > > > > > > That's good news. A related question about Snakebite, though. Maybe I > > > missed something obvious, but is there an overview of how the core > devs can > > > use it? In particular, I'd want to know if Snakebite runs Python's > tests > > > regularly - and if it does, how can I see the status. How do I know if > any > > > commit of mine broke some host Snakebite has? How can I SSH to that > host in > > > order to reproduce and fix the problem? Some sort of a blog post about > this, > > > at least, would be very helpful for me and possibly other developers as > > > well. > > > > http://mail.python.org/pipermail/python-dev/2012-September/121651.html > > > > Presumably that should go somewhere more permanent. > > Indeed, I'm going to carve out some time over the Christmas/NY break > to work on this. There should really be a "Developer's Guide" that > explains how to get the most out of the network. > > Thanks, indeed a more permanent place would be nice. So from reading the above, am I correct in the understanding that these hosts don't actually run tests at the moment? They only do if we log into them to test stuff? I think it would be really nice if they could actually run as buildbot slaves and execute Python tests continuously. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Fri Dec 21 01:24:27 2012 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 20 Dec 2012 18:24:27 -0600 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: References: <20121220184717.GB69156@snakebite.org> Message-ID: 2012/12/20 Eli Bendersky : > >> > > >> > > That's good news. A related question about Snakebite, though. 
Maybe I >> > > missed something obvious, but is there an overview of how the core >> > > devs can >> > > use it? In particular, I'd want to know if Snakebite runs Python's >> > > tests >> > > regularly - and if it does, how can I see the status. How do I know if >> > > any >> > > commit of mine broke some host Snakebite has? How can I SSH to that >> > > host in >> > > order to reproduce and fix the problem? Some sort of a blog post about >> > > this, >> > > at least, would be very helpful for me and possibly other developers >> > > as >> > > well. >> > >> > http://mail.python.org/pipermail/python-dev/2012-September/121651.html >> > >> > Presumably that should go somewhere more permanent. >> >> Indeed, I'm going to carve out some time over the Christmas/NY break >> to work on this. There should really be a "Developer's Guide" that >> explains how to get the most out of the network. >> > > Thanks, indeed a more permanent place would be nice. So from reading the > above, am I correct in the understanding that these hosts don't actually run > tests at the moment? They only do if we log into them to test stuff? I think > it would be really nice if they could actually run as buildbot slaves and > execute Python tests continuously. Some of them are buildbots, some are not. -- Regards, Benjamin From chris.jerdonek at gmail.com Fri Dec 21 01:35:14 2012 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Thu, 20 Dec 2012 16:35:14 -0800 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> Message-ID: On Thu, Dec 20, 2012 at 1:12 PM, Brett Cannon wrote: > > On Thu, Dec 20, 2012 at 3:55 PM, Chris Jerdonek > wrote: >> >> On Thu, Dec 20, 2012 at 12:18 PM, Brett Cannon wrote: >> > >> > And please do not CC the peps mailing list on discussions. It should >> > only be >> > used to mail in new PEPs or acceptable patches to PEPs. >> >> PEP 1 should perhaps be clarified if the above is the case. >> Currently, PEP 1 says all PEP-related e-mail should be sent there: >> >> "The PEP editors assign PEP numbers and change their status. Please >> send all PEP-related email to (no cross-posting >> please). Also see PEP Editor Responsibilities & Workflow below." >> >> as well as: >> >> "A PEP editor must subscribe to the list. All >> PEP-related correspondence should be sent (or CC'd) to >> (but please do not cross-post!)." >> >> (Incidentally, the statement not to cross-post seems contradictory if >> a PEP-related e-mail is also sent to python-dev, for example.) > > > But it very clearly states to NOT cross-post which is exactly what Anatoly > did and that is what I take issue with the most. I personally don't see any > confusion with the wording. It clearly states that if you are a PEP author > you should mail the peps editors and NOT cross-post. If you are an editor, > make sure any emailing you do with an individual CCs the list but do NOT > cross-post. I don't disagree that he shouldn't have cross-posted. I was just pointing out that the language should be clarified. What's confusing is that the current language implies that one shouldn't send any PEP-related e-mails to any mailing list other than peps at . In particular, how can one discuss PEPs on python-dev or python-ideas without violating that language (e.g. this e-mail which is related to PEP 1)? 
It is probably just a matter of clarifying what "PEP-related" means. --Chris From greg at krypto.org Fri Dec 21 02:47:40 2012 From: greg at krypto.org (Gregory P. Smith) Date: Thu, 20 Dec 2012 17:47:40 -0800 Subject: [Python-Dev] Possible GIL/threading issue involving subprocess and PyMem_MALLOC... In-Reply-To: <20121220184348.GA69156@snakebite.org> References: <20121220184348.GA69156@snakebite.org> Message-ID: On Thu, Dec 20, 2012 at 10:43 AM, Trent Nelson wrote: > This seems odd to me so I wanted to see what others think. The unit > test Lib/unittest/test/test_runner.py:Test_TextRunner.test_warnings > will eventually hit subprocess.Popen._communicate. > > The `mswindows` implementation of this method relies on threads to > buffer stdin/stdout. That'll eventually result in PyOs_StdioReadline > being called without the GIL being held. PyOs_StdioReadline calls > PyMem_MALLOC, PyMem_FREE and possibly PyMem_REALLOC. > Those threads are implemented in Python so how would the GIL ever not be held? -gps > > On a debug build, these macros are redirected to their _PyMem_Debug* > counterparts. The call hierarchy for _PyMem_DebugMalloc looks like > this: > > void * > _PyMem_DebugMalloc(size_t nbytes) > { > return _PyObject_DebugMallocApi(_PYMALLOC_MEM_ID, nbytes); > } > > /* generic debug memory api, with an "id" to > identify the API in use */ > void * > _PyObject_DebugMallocApi(char id, size_t nbytes) > { > uchar *p; /* base address of malloc'ed block */ > uchar *tail; /* p + 2*SST + nbytes == > pointer to tail pad bytes */ > size_t total; /* nbytes + 4*SST */ > > bumpserialno(); > ------------^^^^^^^^^^^^^^^ > > total = nbytes + 4*SST; > if (total < nbytes) > /* overflow: can't represent total as a size_t */ > return NULL; > > p = (uchar *)PyObject_Malloc(total); > -------------------------^^^^^^^^^^^^^^^^^^^^^^^ > if (p == NULL) > return NULL; > > > > Both bumpserialno() and PyObject_Malloc affect global state. The > latter > also has a bunch of LOCK() and UNLOCK() statements, but these end up > being > no-ops: > > /* > * Python's threads are serialized, > * so object malloc locking is disabled. > */ > #define SIMPLELOCK_DECL(lock) /* simple lock declaration */ > #define SIMPLELOCK_INIT(lock) /* allocate (if needed) and ... */ > #define SIMPLELOCK_FINI(lock) /* free/destroy an existing */ > #define SIMPLELOCK_LOCK(lock) /* acquire released lock */ > #define SIMPLELOCK_UNLOCK(lock) /* release acquired lock */ > ... > /* > * This malloc lock > */ > SIMPLELOCK_DECL(_malloc_lock) > #define LOCK() SIMPLELOCK_LOCK(_malloc_lock) > #define UNLOCK() SIMPLELOCK_UNLOCK(_malloc_lock) > #define LOCK_INIT() SIMPLELOCK_INIT(_malloc_lock) > #define LOCK_FINI() SIMPLELOCK_FINI(_malloc_lock) > > The PyObject_Malloc() one concerns me the most, as it affects huge > amounts of global state. Also, I just noticed PyOs_StdioReadline() > can call PyErr_SetString, which will result in a bunch of other > calls that should only be made whilst the GIL is held. > > So, like I said, this seems like a bit of a head scratcher. Legit > issue or am I missing something? > > Trent. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/greg%40krypto.org > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From trent at snakebite.org Fri Dec 21 04:12:31 2012 From: trent at snakebite.org (Trent Nelson) Date: Thu, 20 Dec 2012 22:12:31 -0500 Subject: [Python-Dev] Possible GIL/threading issue involving subprocess and PyMem_MALLOC... In-Reply-To: References: <20121220184348.GA69156@snakebite.org> Message-ID: <20121221031231.GA69960@snakebite.org> On Thu, Dec 20, 2012 at 05:47:40PM -0800, Gregory P. Smith wrote: > On Thu, Dec 20, 2012 at 10:43 AM, Trent Nelson > wrote: > > This seems odd to me so I wanted to see what others think. The unit > test Lib/unittest/test/test_runner.py:Test_TextRunner.test_warnings > will eventually hit subprocess.Popen._communicate. > > The `mswindows` implementation of this method relies on threads to > buffer stdin/stdout. That'll eventually result in > PyOs_StdioReadline > being called without the GIL being held. PyOs_StdioReadline calls > PyMem_MALLOC, PyMem_FREE and possibly PyMem_REALLOC. > > Those threads are implemented in Python so how would the GIL ever not be > held? > -gps PyOS_Readline drops the GIL prior to calling PyOS_StdioReadline: Py_BEGIN_ALLOW_THREADS --------^^^^^^^^^^^^^^^^^^^^^^ #ifdef WITH_THREAD PyThread_acquire_lock(_PyOS_ReadlineLock, 1); #endif /* This is needed to handle the unlikely case that the * interpreter is in interactive mode *and* stdin/out are not * a tty. This can happen, for example if python is run like * this: python -i < test1.py */ if (!isatty (fileno (sys_stdin)) || !isatty (fileno (sys_stdout))) rv = PyOS_StdioReadline (sys_stdin, sys_stdout, prompt); -----------------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ else rv = (*PyOS_ReadlineFunctionPointer)(sys_stdin, sys_stdout, prompt); Py_END_ALLOW_THREADS Trent. From trent at snakebite.org Fri Dec 21 04:13:45 2012 From: trent at snakebite.org (Trent Nelson) Date: Thu, 20 Dec 2012 22:13:45 -0500 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: References: <20121220184717.GB69156@snakebite.org> Message-ID: <20121221031345.GB69960@snakebite.org> On Thu, Dec 20, 2012 at 03:10:49PM -0800, Eli Bendersky wrote: > > > > > > That's good news. A related question about Snakebite, though. Maybe > I > > > missed something obvious, but is there an overview of how the core > devs can > > > use it? In particular, I'd want to know if Snakebite runs Python's > tests > > > regularly - and if it does, how can I see the status. How do I know > if any > > > commit of mine broke some host Snakebite has? How can I SSH to that > host in > > > order to reproduce and fix the problem? Some sort of a blog post > about this, > > > at least, would be very helpful for me and possibly other developers > as > > > well. > > > > http://mail.python.org/pipermail/python-dev/2012-September/121651.html > > > > Presumably that should go somewhere more permanent. > > Indeed, I'm going to carve out some time over the Christmas/NY break > to work on this. There should really be a "Developer's Guide" that > explains how to get the most out of the network. > > Thanks, indeed a more permanent place would be nice. So from reading the > above, am I correct in the understanding that these hosts don't actually > run tests at the moment? They only do if we log into them to test stuff? I > think it would be really nice if they could actually run as buildbot > slaves and execute Python tests continuously. Almost all of them are running slaves, and have been since ~August. Take a look at our buildbot page -- any host with [SB] in the name is a Snakebite host. Trent. 
From eliben at gmail.com Fri Dec 21 04:56:18 2012 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 20 Dec 2012 19:56:18 -0800 Subject: [Python-Dev] PandaBoard, Raspberry Pi coming to Buildbot fleet In-Reply-To: <20121221031345.GB69960@snakebite.org> References: <20121220184717.GB69156@snakebite.org> <20121221031345.GB69960@snakebite.org> Message-ID: On Thu, Dec 20, 2012 at 7:13 PM, Trent Nelson wrote: > On Thu, Dec 20, 2012 at 03:10:49PM -0800, Eli Bendersky wrote: > > > > > > > > That's good news. A related question about Snakebite, though. > Maybe > > I > > > > missed something obvious, but is there an overview of how the > core > > devs can > > > > use it? In particular, I'd want to know if Snakebite runs > Python's > > tests > > > > regularly - and if it does, how can I see the status. How do I > know > > if any > > > > commit of mine broke some host Snakebite has? How can I SSH to > that > > host in > > > > order to reproduce and fix the problem? Some sort of a blog post > > about this, > > > > at least, would be very helpful for me and possibly other > developers > > as > > > > well. > > > > > > > http://mail.python.org/pipermail/python-dev/2012-September/121651.html > > > > > > Presumably that should go somewhere more permanent. > > > > Indeed, I'm going to carve out some time over the Christmas/NY > break > > to work on this. There should really be a "Developer's Guide" > that > > explains how to get the most out of the network. > > > > Thanks, indeed a more permanent place would be nice. So from reading > the > > above, am I correct in the understanding that these hosts don't > actually > > run tests at the moment? They only do if we log into them to test > stuff? I > > think it would be really nice if they could actually run as buildbot > > slaves and execute Python tests continuously. > > Almost all of them are running slaves, and have been since ~August. > Take a look at our buildbot page -- any host with [SB] in the name > is a Snakebite host. > > Trent. > Ah, I see. Thanks. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From 76069016 at qq.com Fri Dec 21 07:03:40 2012 From: 76069016 at qq.com (Isml) Date: Fri, 21 Dec 2012 14:03:40 +0800 Subject: [Python-Dev] compile python 3.3 with bz2 support Message-ID: hi, everyone: I want to compile python 3.3 with bz2 support on RedHat 5.5 but failed to do so. Here is how I do it: 1. download bzip2 and compile it (make; make -f Makefile_libbz2_so; make install) 2. change to the python 3.3 source directory: ./configure --with-bz2=/usr/local/include 3. make 4. make install after the installation completes, I test it: [root at localhost Python-3.3.0]# python3 -c "import bz2" Traceback (most recent call last): File "", line 1, in File "/usr/local/lib/python3.3/bz2.py", line 21, in from _bz2 import BZ2Compressor, BZ2Decompressor ImportError: No module named '_bz2' By the way, RedHat 5.5 has a built-in python 2.4.3. Would it be a problem? -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Fri Dec 21 08:25:25 2012 From: songofacandy at gmail.com (INADA Naoki) Date: Fri, 21 Dec 2012 16:25:25 +0900 Subject: [Python-Dev] What is the "sequence"? ([issue16728] collections.abc.Sequence should provide __subclasshook__) Message-ID: I've reported http://bugs.python.org/issue16728, but I am confused now about what a sequence is. The glossary defines a sequence as an iterable having __getitem__ and __len__. An object that doesn't have __iter__ is still iterable when it has __getitem__. http://docs.python.org/3/reference/datamodel.html says: > Sequences also support slicing: a[i:j] selects all items with index *k* such that *i* <= *k* < *j*. When used as an expression, a slice is a sequence of the same type. This implies that the index set is renumbered so that it starts at 0. But I think this sentence describes the standard types, not the definition of a sequence. http://docs.python.org/3/library/collections.abc.html says: > This module provides *abstract base classes* that can be used to test whether a class provides a particular interface; for example, whether it is hashable or whether it is a mapping. And collections.abc.Sequence requires "index()" and "count()". What is the requirement for calling something a "sequence"? Off topic: Sequence.__iter__ uses __len__ and __getitem__, but the default iterator uses only __getitem__. This difference is ugly. -- INADA Naoki -------------- next part -------------- An HTML attachment was scrubbed... URL:
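Both behaviours INADA describes can be checked directly; a small sketch (Squares is a made-up class for illustration):

    import collections.abc

    class Squares:
        def __init__(self, n):
            self.n = n
        def __getitem__(self, i):
            if not 0 <= i < self.n:
                raise IndexError(i)
            return i * i
        def __len__(self):
            return self.n

    s = Squares(4)
    print(list(s))  # [0, 1, 4, 9]: iter() falls back to __getitem__
    print(isinstance(s, collections.abc.Sequence))  # False: no __subclasshook__
    collections.abc.Sequence.register(Squares)
    print(isinstance(s, collections.abc.Sequence))  # True only after registering

So a class can satisfy the glossary's definition of a sequence while still failing the isinstance check, which is the gap the issue is about.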
From kristjan at ccpgames.com Fri Dec 21 10:31:44 2012 From: kristjan at ccpgames.com (Kristján Valur Jónsson) Date: Fri, 21 Dec 2012 09:31:44 +0000 Subject: [Python-Dev] Possible GIL/threading issue involving subprocess and PyMem_MALLOC... In-Reply-To: <20121221031231.GA69960@snakebite.org> References: <20121220184348.GA69156@snakebite.org> <20121221031231.GA69960@snakebite.org> Message-ID: I ran into this the other day. I had put in hooks in the PyMem_MALLOC to track memory per tasklet, and it crashed in those cases because it was being called without the GIL. My local patch was simply to _not_ release the GIL. Clearly, calling PyMem_MALLOC without the GIL is an API violation. K > -----Original Message----- > From: Python-Dev [mailto:python-dev- > bounces+kristjan=ccpgames.com at python.org] On Behalf Of Trent Nelson > Sent: 21. desember 2012 03:13 > To: Gregory P. Smith > Cc: Python-Dev > Subject: Re: [Python-Dev] Possible GIL/threading issue involving subprocess > and PyMem_MALLOC... > > On Thu, Dec 20, 2012 at 05:47:40PM -0800, Gregory P. Smith wrote: > > On Thu, Dec 20, 2012 at 10:43 AM, Trent Nelson > > wrote: > > > > This seems odd to me so I wanted to see what others think. The unit > > test Lib/unittest/test/test_runner.py:Test_TextRunner.test_warnings > > will eventually hit subprocess.Popen._communicate. > > > > The `mswindows` implementation of this method relies on threads to > > buffer stdin/stdout. That'll eventually result in > > PyOs_StdioReadline > > being called without the GIL being held. PyOs_StdioReadline calls > > PyMem_MALLOC, PyMem_FREE and possibly PyMem_REALLOC. > > > > Those threads are implemented in Python so how would the GIL ever not > be > > held? > > -gps > > PyOS_Readline drops the GIL prior to calling PyOS_StdioReadline: > > Py_BEGIN_ALLOW_THREADS > --------^^^^^^^^^^^^^^^^^^^^^^ > #ifdef WITH_THREAD > PyThread_acquire_lock(_PyOS_ReadlineLock, 1); > #endif > > /* This is needed to handle the unlikely case that the > * interpreter is in interactive mode *and* stdin/out are not > * a tty.
This can happen, for example if python is run like
> * this: python -i < test1.py
> */
> if (!isatty (fileno (sys_stdin)) || !isatty (fileno (sys_stdout)))
> rv = PyOS_StdioReadline (sys_stdin, sys_stdout, prompt);
> -----------------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> else
> rv = (*PyOS_ReadlineFunctionPointer)(sys_stdin, sys_stdout,
> prompt);
> Py_END_ALLOW_THREADS
>
>
> Trent.

From solipsis at pitrou.net Fri Dec 21 10:43:11 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 21 Dec 2012 10:43:11 +0100
Subject: [Python-Dev] Possible GIL/threading issue involving subprocess and PyMem_MALLOC...
References: <20121220184348.GA69156@snakebite.org>
 <20121221031231.GA69960@snakebite.org>
Message-ID: <20121221104311.1f3e2c7e@pitrou.net>

Le Fri, 21 Dec 2012 09:31:44 +0000,
Kristján Valur Jónsson a écrit :
> I ran into this the other day. I had put in hooks in the
> PyMem_MALLOC to track memory per tasklet, and it crashed in those
> cases because it was being called without the GIL. My local patch
> was simply to _not_ release the GIL. Clearly, calling PyMem_MALLOC
> without the GIL is an API violation.

Indeed, this deserves fixing.
(it would be better to still release the GIL around the low-level I/O
call, of course)

Thanks Trent for finding this!

Antoine.

From trent at snakebite.org Fri Dec 21 11:50:52 2012
From: trent at snakebite.org (Trent Nelson)
Date: Fri, 21 Dec 2012 05:50:52 -0500
Subject: [Python-Dev] Possible GIL/threading issue involving subprocess and PyMem_MALLOC...
In-Reply-To: <20121221104311.1f3e2c7e@pitrou.net>
References: <20121220184348.GA69156@snakebite.org>
 <20121221031231.GA69960@snakebite.org>
 <20121221104311.1f3e2c7e@pitrou.net>
Message-ID: <20121221105051.GA70597@snakebite.org>

On Fri, Dec 21, 2012 at 01:43:11AM -0800, Antoine Pitrou wrote:
> Le Fri, 21 Dec 2012 09:31:44 +0000,
> Kristján Valur Jónsson a écrit :
> > I ran into this the other day. I had put in hooks in the
> > PyMem_MALLOC to track memory per tasklet, and it crashed in those
> > cases because it was being called without the GIL. My local patch
> > was simply to _not_ release the GIL. Clearly, calling PyMem_MALLOC
> > without the GIL is an API violation.
>
> Indeed, this deserves fixing.
> (it would be better to still release the GIL around the low-level I/O
> call, of course)

Created http://bugs.python.org/issue16742 to capture the issue for now.
I want to make some more progress on the parallel stuff first so if
somebody wants to tackle it in the meantime, be my guest.

> Thanks Trent for finding this!

Unexpected (but handy) side-effect of the parallel context work :-)

(I wonder if that's the only thread-safety issue in our code base...)

> Antoine.

Trent.
From brett at python.org Fri Dec 21 15:46:55 2012
From: brett at python.org (Brett Cannon)
Date: Fri, 21 Dec 2012 09:46:55 -0500
Subject: [Python-Dev] PEP 3145 (With Contents)
In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com>
 <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com>
 <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain>
Message-ID: 

On Thu, Dec 20, 2012 at 7:35 PM, Chris Jerdonek wrote:

> On Thu, Dec 20, 2012 at 1:12 PM, Brett Cannon wrote:
> >
> > On Thu, Dec 20, 2012 at 3:55 PM, Chris Jerdonek <chris.jerdonek at gmail.com>
> > wrote:
> >>
> >> On Thu, Dec 20, 2012 at 12:18 PM, Brett Cannon wrote:
> >> >
> >> > And please do not CC the peps mailing list on discussions. It should
> >> > only be used to mail in new PEPs or acceptable patches to PEPs.
> >>
> >> PEP 1 should perhaps be clarified if the above is the case.
> >> Currently, PEP 1 says all PEP-related e-mail should be sent there:
> >>
> >> "The PEP editors assign PEP numbers and change their status. Please
> >> send all PEP-related email to peps at python.org (no cross-posting
> >> please). Also see PEP Editor Responsibilities & Workflow below."
> >>
> >> as well as:
> >>
> >> "A PEP editor must subscribe to the peps at python.org list. All
> >> PEP-related correspondence should be sent (or CC'd) to
> >> peps at python.org (but please do not cross-post!)."
> >>
> >> (Incidentally, the statement not to cross-post seems contradictory if
> >> a PEP-related e-mail is also sent to python-dev, for example.)
> >
> > But it very clearly states to NOT cross-post which is exactly what Anatoly
> > did and that is what I take issue with the most. I personally don't see any
> > confusion with the wording. It clearly states that if you are a PEP author
> > you should mail the peps editors and NOT cross-post. If you are an editor,
> > make sure any emailing you do with an individual CCs the list but do NOT
> > cross-post.
>
> I don't disagree that he shouldn't have cross-posted. I was just
> pointing out that the language should be clarified. What's confusing
> is that the current language implies that one shouldn't send any
> PEP-related e-mails to any mailing list other than peps at . In
> particular, how can one discuss PEPs on python-dev or python-ideas
> without violating that language (e.g. this e-mail which is related to
> PEP 1)? It is probably just a matter of clarifying what "PEP-related"
> means.

I'm just not seeing the confusion, sorry. And we have never really had
any confusion over this wording before. If you want to send a patch to
tweak the wording to make it more clear, then please go ahead and I will
consider it, but I'm not worried enough about it to try to come up with
some rewording myself.

From status at bugs.python.org Fri Dec 21 18:07:23 2012
From: status at bugs.python.org (Python tracker)
Date: Fri, 21 Dec 2012 18:07:23 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20121221170723.0B2EC1C884@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2012-12-14 - 2012-12-21)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.
Issues counts and deltas: open 3844 (+18) closed 24677 (+46) total 28521 (+64) Open issues with patches: 1686 Issues opened (38) ================== #16682: Document that audioop works with bytes, not strings http://bugs.python.org/issue16682 opened by serhiy.storchaka #16684: Unicode property value abbreviated names and long names http://bugs.python.org/issue16684 opened by PanderMusubi #16685: Deprecate accepting strings as arguments in audioop functions http://bugs.python.org/issue16685 opened by serhiy.storchaka #16686: audioop overflow issues http://bugs.python.org/issue16686 opened by serhiy.storchaka #16688: Backreferences make case-insensitive regex fail on non-ASCII s http://bugs.python.org/issue16688 opened by pyos #16689: stdout stderr redirection mess http://bugs.python.org/issue16689 opened by techtonik #16690: Reference leak with custom tp_dealloc in PyType_FromSpec http://bugs.python.org/issue16690 opened by bfroehle #16692: Support TLS 1.1 and TLS 1.2 http://bugs.python.org/issue16692 opened by pitrou #16694: Add pure Python operator module http://bugs.python.org/issue16694 opened by zach.ware #16695: Clarify fnmatch & glob docs about the handling of leading "."s http://bugs.python.org/issue16695 opened by hynek #16698: test_posix.test_getgroups fails on some systems http://bugs.python.org/issue16698 opened by rosslagerwall #16699: Mountain Lion buildbot lacks disk space http://bugs.python.org/issue16699 opened by pitrou #16700: Document that bytes OS API can returns unusable results on Win http://bugs.python.org/issue16700 opened by serhiy.storchaka #16701: Docs missing the behavior of += (in-place add) for lists. http://bugs.python.org/issue16701 opened by montysinngh #16702: Force urllib2_localnet test not to use http proxies http://bugs.python.org/issue16702 opened by jeffknupp #16705: Use concrete classes inherited from OSError instead of errno c http://bugs.python.org/issue16705 opened by asvetlov #16709: unittest discover order is filesystem specific - hard to repro http://bugs.python.org/issue16709 opened by rbcollins #16712: collections.abc.Sequence should not provide __reversed__ http://bugs.python.org/issue16712 opened by naoki #16713: "tel" URIs should support params http://bugs.python.org/issue16713 opened by pitrou #16715: Get rid of IOError. Use OSError instead http://bugs.python.org/issue16715 opened by asvetlov #16716: Deprecate OSError aliases in the doc http://bugs.python.org/issue16716 opened by asvetlov #16718: Mysterious atexit fail http://bugs.python.org/issue16718 opened by techtonik #16720: Get rid of os.error. 
Use OSError instead http://bugs.python.org/issue16720 opened by serhiy.storchaka #16721: configure incorrectly adds -OPT:Olimit=0 for clang http://bugs.python.org/issue16721 opened by Vladimir.Timofeev #16723: io.TextIOWrapper on urllib.request.urlopen terminates prematur http://bugs.python.org/issue16723 opened by mdehoon #16726: expat ParseFile expects bytes, not string http://bugs.python.org/issue16726 opened by mdehoon #16728: Missing cross-reference in sequence glossary entry http://bugs.python.org/issue16728 opened by naoki #16729: Document how to provide defaults for setup.py commands options http://bugs.python.org/issue16729 opened by techtonik #16730: _fill_cache in _bootstrap.py crashes without directory execute http://bugs.python.org/issue16730 opened by David.Pritchard #16731: xxlimited/xxmodule docstrings ambiguous http://bugs.python.org/issue16731 opened by danielsh #16732: setup.py support for xxmodule without tkinker http://bugs.python.org/issue16732 opened by danielsh #16733: Solaris ctypes_test failures http://bugs.python.org/issue16733 opened by yippi #16737: Different behaviours in script run directly and via runpy.run_ http://bugs.python.org/issue16737 opened by vinay.sajip #16739: texttestresult should decorate the stream with _WritelnDecorat http://bugs.python.org/issue16739 opened by elopio #16741: `int()`, `float()`, etc think python strings are null-terminat http://bugs.python.org/issue16741 opened by gangesmaster #16742: PyOS_Readline drops GIL and calls PyOS_StdioReadline, which is http://bugs.python.org/issue16742 opened by trent #16743: mmap accepts files > 1 GB, but processes only 1 GB http://bugs.python.org/issue16743 opened by schlamar #16744: sys.path.append causes wrong behaviour http://bugs.python.org/issue16744 opened by rappy Most recent 15 issues with no replies (15) ========================================== #16744: sys.path.append causes wrong behaviour http://bugs.python.org/issue16744 #16743: mmap accepts files > 1 GB, but processes only 1 GB http://bugs.python.org/issue16743 #16742: PyOS_Readline drops GIL and calls PyOS_StdioReadline, which is http://bugs.python.org/issue16742 #16741: `int()`, `float()`, etc think python strings are null-terminat http://bugs.python.org/issue16741 #16733: Solaris ctypes_test failures http://bugs.python.org/issue16733 #16732: setup.py support for xxmodule without tkinker http://bugs.python.org/issue16732 #16729: Document how to provide defaults for setup.py commands options http://bugs.python.org/issue16729 #16726: expat ParseFile expects bytes, not string http://bugs.python.org/issue16726 #16720: Get rid of os.error. Use OSError instead http://bugs.python.org/issue16720 #16715: Get rid of IOError. 
Use OSError instead http://bugs.python.org/issue16715 #16712: collections.abc.Sequence should not provide __reversed__ http://bugs.python.org/issue16712 #16702: Force urllib2_localnet test not to use http proxies http://bugs.python.org/issue16702 #16699: Mountain Lion buildbot lacks disk space http://bugs.python.org/issue16699 #16695: Clarify fnmatch & glob docs about the handling of leading "."s http://bugs.python.org/issue16695 #16692: Support TLS 1.1 and TLS 1.2 http://bugs.python.org/issue16692 Most recent 15 issues waiting for review (15) ============================================= #16739: texttestresult should decorate the stream with _WritelnDecorat http://bugs.python.org/issue16739 #16732: setup.py support for xxmodule without tkinker http://bugs.python.org/issue16732 #16731: xxlimited/xxmodule docstrings ambiguous http://bugs.python.org/issue16731 #16730: _fill_cache in _bootstrap.py crashes without directory execute http://bugs.python.org/issue16730 #16720: Get rid of os.error. Use OSError instead http://bugs.python.org/issue16720 #16702: Force urllib2_localnet test not to use http proxies http://bugs.python.org/issue16702 #16694: Add pure Python operator module http://bugs.python.org/issue16694 #16688: Backreferences make case-insensitive regex fail on non-ASCII s http://bugs.python.org/issue16688 #16686: audioop overflow issues http://bugs.python.org/issue16686 #16682: Document that audioop works with bytes, not strings http://bugs.python.org/issue16682 #16679: Wrong URL path decoding http://bugs.python.org/issue16679 #16674: Faster getrandbits() for small integers http://bugs.python.org/issue16674 #16672: improve tracing performances when f_trace is NULL http://bugs.python.org/issue16672 #16669: Docstrings for namedtuple http://bugs.python.org/issue16669 #16667: timezone docs need "versionadded: 3.2" http://bugs.python.org/issue16667 Top 10 most discussed issues (10) ================================= #16688: Backreferences make case-insensitive regex fail on non-ASCII s http://bugs.python.org/issue16688 13 msgs #16718: Mysterious atexit fail http://bugs.python.org/issue16718 13 msgs #16728: Missing cross-reference in sequence glossary entry http://bugs.python.org/issue16728 9 msgs #16612: Integrate "Argument Clinic" specialized preprocessor into CPyt http://bugs.python.org/issue16612 7 msgs #16685: Deprecate accepting strings as arguments in audioop functions http://bugs.python.org/issue16685 7 msgs #16694: Add pure Python operator module http://bugs.python.org/issue16694 7 msgs #15533: subprocess.Popen(cwd) documentation http://bugs.python.org/issue15533 5 msgs #16618: Different glob() results for strings and bytes http://bugs.python.org/issue16618 5 msgs #16669: Docstrings for namedtuple http://bugs.python.org/issue16669 4 msgs #16716: Deprecate OSError aliases in the doc http://bugs.python.org/issue16716 4 msgs Issues closed (43) ================== #8853: getaddrinfo should accept port of type long http://bugs.python.org/issue8853 closed by petri.lehtinen #10155: Add fixups for encoding problems to wsgiref http://bugs.python.org/issue10155 closed by aclover #11175: allow argparse FileType to accept encoding and errors argument http://bugs.python.org/issue11175 closed by petri.lehtinen #14901: Python Windows FAQ is Very Outdated http://bugs.python.org/issue14901 closed by brian.curtin #15743: test_urllib2/test_urllib use deprecated urllib.Request methods http://bugs.python.org/issue15743 closed by jeffknupp #15783: decimal: Support None default values in the C accelerator 
modu http://bugs.python.org/issue15783 closed by skrah #16298: httplib.HTTPResponse.read could potentially leave the socket o http://bugs.python.org/issue16298 closed by pitrou #16480: pyvenv 3.3 fails to create symlinks for /local/{bi http://bugs.python.org/issue16480 closed by doko #16488: Add context manager support to epoll object http://bugs.python.org/issue16488 closed by pitrou #16597: file descriptor not being closed with context manager on IOErr http://bugs.python.org/issue16597 closed by python-dev #16626: Infinite recursion in glob.glob('*:') on Windows http://bugs.python.org/issue16626 closed by pitrou #16646: FTP.makeport() loses socket error details http://bugs.python.org/issue16646 closed by giampaolo.rodola #16647: LMTP.connect() loses socket error details http://bugs.python.org/issue16647 closed by asvetlov #16661: test_posix.test_getgrouplist fails on some systems - incorrect http://bugs.python.org/issue16661 closed by rosslagerwall #16664: Test Glob: files starting with . http://bugs.python.org/issue16664 closed by hynek #16670: Point class may be not be a good example for namedtuple http://bugs.python.org/issue16670 closed by terry.reedy #16678: optparse: parse only known options http://bugs.python.org/issue16678 closed by r.david.murray #16681: Documentation 'bidirectional category' should be 'bidirectiona http://bugs.python.org/issue16681 closed by ezio.melotti #16683: Resort audioop documentation http://bugs.python.org/issue16683 closed by ezio.melotti #16687: Fix small gramatical error and add reference link in hashlib d http://bugs.python.org/issue16687 closed by python-dev #16691: How to use ctypes.windll.user32.MessageBoxW http://bugs.python.org/issue16691 closed by ned.deily #16693: Assertion error in ceval if Chainmap(object()) used as locals http://bugs.python.org/issue16693 closed by python-dev #16696: BytesWarning in glob.glob http://bugs.python.org/issue16696 closed by pitrou #16697: argparse kwarg 'choices' documentation http://bugs.python.org/issue16697 closed by r.david.murray #16703: except statement turns defined variable into undefined http://bugs.python.org/issue16703 closed by brett.cannon #16704: Get rid of select.error in stdlib. Use OSError instead http://bugs.python.org/issue16704 closed by asvetlov #16706: Get rid of os.error. Use OSError instead http://bugs.python.org/issue16706 closed by asvetlov #16707: --with-pydebug and --without-pymalloc are incompatible http://bugs.python.org/issue16707 closed by pitrou #16708: Module: shutil will not import when writen in the text editor http://bugs.python.org/issue16708 closed by r.david.murray #16710: json encode/decode error http://bugs.python.org/issue16710 closed by amaury.forgeotdarc #16711: s/next()/__next__/ in collections.abc.Iterator document. http://bugs.python.org/issue16711 closed by asvetlov #16714: Raise exceptions, don't throw http://bugs.python.org/issue16714 closed by asvetlov #16717: Get rid of socket.error. Use OSError instead http://bugs.python.org/issue16717 closed by asvetlov #16719: Get rid of WindowsError. 
Use OSError instead
 http://bugs.python.org/issue16719 closed by asvetlov

#16722: __index__() overrides __bytes__() when bytes() is called
 http://bugs.python.org/issue16722 closed by python-dev

#16724: Define `binary data` representation in Python
 http://bugs.python.org/issue16724 closed by r.david.murray

#16725: Add 'ident' property to SysLogHandler like in Python 3.x
 http://bugs.python.org/issue16725 closed by r.david.murray

#16727: Windows installers for 2.7.3 don't install python27.dll correc
 http://bugs.python.org/issue16727 closed by loewis

#16734: Delay interpreter startup phase until script is read
 http://bugs.python.org/issue16734 closed by christian.heimes

#16735: zipfile.is_zipfile wrongly recognizes non-zip as zip
 http://bugs.python.org/issue16735 closed by r.david.murray

#16736: select.poll() converts long to int without checking for overfl
 http://bugs.python.org/issue16736 closed by sbt

#16738: Comparisons difference: bytes with bytes, str with str
 http://bugs.python.org/issue16738 closed by christian.heimes

#16740: Types created with PyType_FromSpec lack a __module__ attribute
 http://bugs.python.org/issue16740 closed by bfroehle

From phd at phdru.name Fri Dec 21 07:17:45 2012
From: phd at phdru.name (Oleg Broytman)
Date: Fri, 21 Dec 2012 10:17:45 +0400
Subject: [Python-Dev] compile python 3.3 with bz2 support
In-Reply-To: References: Message-ID: <20121221061745.GB12583@iskra.aviel.ru>

Hello. We are sorry but we cannot help you. This mailing list is to
work on developing Python (adding new features to Python itself and
fixing bugs); if you're having problems learning, understanding or
using Python, please find another forum. Probably
python-list/comp.lang.python mailing list/news group is the best place;
there are Python developers who participate in it; you may get a
faster, and probably more complete, answer there. See
http://www.python.org/community/ for other lists/news groups/fora.
Thank you for understanding.

On Fri, Dec 21, 2012 at 02:03:40PM +0800, Isml <76069016 at qq.com> wrote:
> hi, everyone:
> I want to compile python 3.3 with bz2 support on RedHat 5.5 but failed to do that. Here is how I did it:
> 1. download bzip2 and compile it (make; make -f Makefile_libbz2_so; make install)
> 2. change to the python 3.3 source directory: ./configure --with-bz2=/usr/local/include
> 3. make
> 4. make install
>
> after the installation completes, I test it:
> [root at localhost Python-3.3.0]# python3 -c "import bz2"
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> File "/usr/local/lib/python3.3/bz2.py", line 21, in <module>
> from _bz2 import BZ2Compressor, BZ2Decompressor
> ImportError: No module named '_bz2'

   You have to install bz2 development files (headers and libraries)
before recompiling python.

> By the way, RedHat 5.5 has a built-in python 2.4.3. Would it be a problem?

   Depends on what you are going to do.

Oleg.
-- 
 Oleg Broytman http://phdru.name/ phd at phdru.name
 Programmers don't die, they just GOSUB without RETURN.

From csebasha at gmail.com Fri Dec 21 08:05:31 2012
From: csebasha at gmail.com (csebasha)
Date: Thu, 20 Dec 2012 23:05:31 -0800 (PST)
Subject: [Python-Dev] Testing the tests by modifying the ordering of dict items.
In-Reply-To: <4F05A9CC.3000806@hotpy.org>
References: <4F05A9CC.3000806@hotpy.org>
Message-ID: <1356073531934-5000138.post@n6.nabble.com>

Hello Mark,

Did you raise a bug for this?
From guido at python.org Fri Dec 21 19:57:12 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 10:57:12 -0800
Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted
Message-ID: 

Dear python-dev *and* python-ideas,

I am posting PEP 3156 here for early review and discussion. As you can
see from the liberally sprinkled TBD entries it is not done, but I am
about to disappear on vacation for a few weeks and I am reasonably
happy with the state of things so far. (Of course feedback may change
this. :-) Also, there has already been some discussion on python-ideas
(and even on Twitter) so I don't want python-dev to feel out of the
loop -- this *is* a proposal for a new standard library module. (But
no, I haven't picked the module name yet. :-)

There's an -- also incomplete -- reference implementation at
http://code.google.com/p/tulip/ -- unlike the first version of tulip,
this version actually has (some) unittests.

Let the bikeshedding begin!

(Oh, happy holidays too. :-)

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
PEP: 3156
Title: Asynchronous IO Support Rebooted
Version: $Revision$
Last-Modified: $Date$
Author: Guido van Rossum
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Dec-2012
Post-History: TBD

Abstract
========

This is a proposal for asynchronous I/O in Python 3, starting with
Python 3.3. Consider this the concrete proposal that is missing from
PEP 3153. The proposal includes a pluggable event loop API, transport
and protocol abstractions similar to those in Twisted, and a
higher-level scheduler based on ``yield from`` (PEP 380). A reference
implementation is in the works under the code name tulip.

Introduction
============

The event loop is the place where most interoperability occurs. It
should be easy for (Python 3.3 ports of) frameworks like Twisted,
Tornado, or ZeroMQ to either adapt the default event loop
implementation to their needs using a lightweight wrapper or proxy, or
to replace the default event loop implementation with an adaptation of
their own event loop implementation. (Some frameworks, like Twisted,
have multiple event loop implementations. This should not be a problem
since these all have the same interface.)

It should even be possible for two different third-party frameworks to
interoperate, either by sharing the default event loop implementation
(each using its own adapter), or by sharing the event loop
implementation of either framework. In the latter case two levels of
adaptation would occur (from framework A's event loop to the standard
event loop interface, and from there to framework B's event loop).
Which event loop implementation is used should be under control of the
main program (though a default policy for event loop selection is
provided).

Thus, two separate APIs are defined:

- getting and setting the current event loop object

- the interface of a conforming event loop and its minimum guarantees

An event loop implementation may provide additional methods and
guarantees.

The event loop interface does not depend on ``yield from``. Rather, it
uses a combination of callbacks, additional interfaces (transports and
protocols), and Futures. The latter are similar to those defined in
PEP 3148, but have a different implementation and are not tied to
threads. In particular, they have no wait() method; the user is
expected to use callbacks.
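For example, here is a rough sketch of the callback style (this
assumes the provisional 'tulip' package name and uses only the APIs
specified below; everything is of course still subject to the TBDs)::

    import tulip

    def addr_resolved(future):
        # Called later, in the event loop's context, once the lookup
        # is done; the Future is passed as the only argument.
        if future.exception() is not None:
            print('lookup failed:', future.exception())
        else:
            print('resolved to:', future.result())

    ev = tulip.get_event_loop()
    fut = ev.getaddrinfo('python.org', 80)  # returns a Future
    fut.add_done_callback(addr_resolved)
    ev.run()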
In particular, they have no wait() method; the user is expected to use callbacks. For users (like myself) who don't like using callbacks, a scheduler is provided for writing asynchronous I/O code as coroutines using the PEP 380 ``yield from`` expressions. The scheduler is not pluggable; pluggability occurs at the event loop level, and the scheduler should work with any conforming event loop implementation. For interoperability between code written using coroutines and other async frameworks, the scheduler has a Task class that behaves like a Future. A framework that interoperates at the event loop level can wait for a Future to complete by adding a callback to the Future. Likewise, the scheduler offers an operation to suspend a coroutine until a callback is called. Limited interoperability with threads is provided by the event loop interface; there is an API to submit a function to an executor (see PEP 3148) which returns a Future that is compatible with the event loop. Non-goals ========= Interoperability with systems like Stackless Python or greenlets/gevent is not a goal of this PEP. Specification ============= Dependencies ------------ Python 3.3 is required. No new language or standard library features beyond Python 3.3 are required. No third-party modules or packages are required. Module Namespace ---------------- The specification here will live in a new toplevel package. Different components will live in separate submodules of that package. The package will import common APIs from their respective submodules and make them available as package attributes (similar to the way the email package works). The name of the toplevel package is currently unspecified. The reference implementation uses the name 'tulip', but the name will change to something more boring if and when the implementation is moved into the standard library (hopefully for Python 3.4). Until the boring name is chosen, this PEP will use 'tulip' as the toplevel package name. Classes and functions given without a module name are assumed to be accessed via the toplevel package. Event Loop Policy: Getting and Setting the Event Loop ----------------------------------------------------- To get the current event loop, use ``get_event_loop()``. This returns an instance of the ``EventLoop`` class defined below or an equivalent object. It is possible that ``get_event_loop()`` returns a different object depending on the current thread, or depending on some other notion of context. To set the current event loop, use ``set_event_loop(event_loop)``, where ``event_loop`` is an instance of the ``EventLoop`` class or equivalent. This uses the same notion of context as ``get_event_loop()``. For the benefit of unit tests and other special cases there's a third policy function: ``init_event_loop()``, which creates a new EventLoop instance and calls ``set_event_loop()`` with it. TBD: Maybe we should have a ``create_default_event_loop_instance()`` function instead? To change the way the above three functions work (including their notion of context), call ``set_event_loop_policy(policy)``, where ``policy`` is an event loop policy object. The policy object can be any object that has methods ``get_event_loop()``, ``set_event_loop(event_loop)`` and ``init_event_loop()`` behaving like the functions described above. The default event loop policy is an instance of the class ``DefaultEventLoopPolicy``. The current event loop policy object can be retrieved by calling ``get_event_loop_policy()``. 
An event loop policy may but does not have to enforce that there is
only one event loop in existence. The default event loop policy does
not enforce this, but it does enforce that there is only one event
loop per thread.

Event Loop Interface
--------------------

(A note about times: as usual in Python, all timeouts, intervals and
delays are measured in seconds, and may be ints or floats. The
accuracy and precision of the clock are up to the implementation; the
default implementation uses ``time.monotonic()``.)

A conforming event loop object has the following methods:

- ``run()``. Runs the event loop until there is nothing left to do.
  This means, in particular:

  - No more calls scheduled with ``call_later()``,
    ``call_repeatedly()``, ``call_soon()``, or
    ``call_soon_threadsafe()``, except for cancelled calls.

  - No more registered file descriptors. It is up to the registering
    party to unregister a file descriptor when it is closed.

  Note: ``run()`` blocks until the termination condition is met, or
  until ``stop()`` is called.

  Note: if you schedule a call with ``call_repeatedly()``, ``run()``
  will not exit until you cancel it.

  TBD: How many variants of this do we really need?

- ``stop()``. Stops the event loop as soon as it is convenient. It is
  fine to restart the loop with ``run()`` (or one of its variants)
  subsequently.

  Note: How soon exactly is up to the implementation. All immediate
  callbacks that were already scheduled to run before ``stop()`` is
  called must still be run, but callbacks scheduled after it is called
  (or scheduled to be run later) will not be run.

- ``run_forever()``. Runs the event loop until ``stop()`` is called.

- ``run_until_complete(future, timeout=None)``. Runs the event loop
  until the Future is done. If a timeout is given, it waits at most
  that long. If the Future is done, its result is returned, or its
  exception is raised; if the timeout expires before the Future is
  done, or if ``stop()`` is called, ``TimeoutError`` is raised (but
  the Future is not cancelled). This cannot be called when the event
  loop is already running.

  Note: This API is most useful for tests and the like. It should not
  be used as a substitute for ``yield from future`` or other ways to
  wait for a Future (e.g. registering a done callback).

- ``run_once(timeout=None)``. Run the event loop for a little while.
  If a timeout is given, any I/O poll made will block at most that
  long; otherwise, an I/O poll is not constrained in time.

  Note: Exactly how much work this does is up to the implementation.
  One constraint: if a callback immediately schedules itself using
  ``call_soon()``, causing an infinite loop, ``run_once()`` should
  still return.

- ``call_later(delay, callback, *args)``. Arrange for
  ``callback(*args)`` to be called approximately ``delay`` seconds in
  the future, once, unless cancelled. Returns a ``Handler`` object
  representing the callback, whose ``cancel()`` method can be used to
  cancel the callback.

- ``call_repeatedly(interval, callback, *args)``. Like
  ``call_later()`` but calls the callback repeatedly, every
  ``interval`` seconds, until the ``Handler`` returned is cancelled.
  The first call is in ``interval`` seconds.

- ``call_soon(callback, *args)``. Equivalent to ``call_later(0,
  callback, *args)``.

- ``call_soon_threadsafe(callback, *args)``. Like
  ``call_soon(callback, *args)``, but when called from another thread
  while the event loop is blocked waiting for I/O, unblocks the event
  loop. This is the *only* method that is safe to call from another
  thread or from a signal handler.
  (To schedule a callback for a later time in a threadsafe manner, you
  can use ``ev.call_soon_threadsafe(ev.call_later, when, callback,
  *args)``.)

- TBD: A way to register a callback that is already wrapped in a
  ``Handler``. Maybe ``call_soon()`` could just check
  ``isinstance(callback, Handler)``? It should silently skip a
  cancelled callback.

Some methods in the standard conforming interface return Futures:

- ``wrap_future(future)``. This takes a PEP 3148 Future (i.e., an
  instance of ``concurrent.futures.Future``) and returns a Future
  compatible with the event loop (i.e., a ``tulip.Future`` instance).

- ``run_in_executor(executor, function, *args)``. Arrange to call
  ``function(*args)`` in an executor (see PEP 3148). Returns a Future
  whose result on success is the return value of that call. This is
  equivalent to ``wrap_future(executor.submit(function, *args))``. If
  ``executor`` is ``None``, a default ``ThreadPoolExecutor`` with 5
  threads is used. (TBD: Should the default executor be shared
  between different event loops? Should we even have a default
  executor? Should we be able to set its thread count? Should we
  even have this method?)

- ``set_default_executor(executor)``. Set the default executor used
  by ``run_in_executor()``.

- ``getaddrinfo(host, port, family=0, type=0, proto=0, flags=0)``.
  Similar to the ``socket.getaddrinfo()`` function but returns a
  Future. The Future's result on success will be a list of the same
  format as returned by ``socket.getaddrinfo()``. The default
  implementation calls ``socket.getaddrinfo()`` using
  ``run_in_executor()``, but other implementations may choose to
  implement their own DNS lookup.

- ``getnameinfo(sockaddr, flags=0)``. Similar to
  ``socket.getnameinfo()`` but returns a Future. The Future's result
  on success will be a tuple ``(host, port)``. Same implementation
  remarks as for ``getaddrinfo()``.

- ``create_transport(protocol_factory, host, port, **kwargs)``.
  Creates a transport and a protocol and ties them together. Returns
  a Future whose result on success is a (transport, protocol) pair.
  Note that when the Future completes, the protocol's
  ``connection_made()`` method has not yet been called; that will
  happen when the connection handshake is complete. When it is
  impossible to connect to the given host and port, the Future will
  raise an exception instead.

  Optional keyword arguments:

  - ``family``, ``type``, ``proto``, ``flags``: Address family,
    socket type, protocol, and miscellaneous flags to be passed
    through to ``getaddrinfo()``. These all default to ``0`` except
    ``type`` which defaults to ``socket.SOCK_STREAM``.

  - ``ssl``: Pass ``True`` to create an SSL transport (by default a
    plain TCP transport is created). Or pass an ``ssl.SSLContext``
    object to override the default SSL context object to be used.

  TBD: Should this be called create_connection()?

- ``start_serving(...)``. Enters a loop that accepts connections.

  TBD: Signature. There are two possibilities:

  1. You pass it a non-blocking socket that you have already prepared
     with ``bind()`` and ``listen()`` (these system calls do not block
     AFAIK), a protocol factory (I hesitate to use this word :-), and
     optional flags that control the transport creation (e.g. ssl).

  2. Instead of a socket, you pass it a host and port, and some more
     optional flags (e.g. to control IPv4 vs IPv6, or to set the
     backlog value to be passed to ``listen()``).

  In either case, once it has a socket, it will wrap it in a
  transport, and then enter a loop accepting connections (the best way
  to implement such a loop depends on the platform).
Each time a connection is accepted, a transport and protocol are created for it. This should return an object that can be used to control the serving loop, e.g. to stop serving, abort all active connections, and (if supported) adjust the backlog or other parameters. It may also have an API to inquire about active connections. If version (2) is selected, it should probably return a Future whose result on success will be that control object, and which becomes done once the accept loop is started. TBD: It may be best to use version (2), since on some platforms the best way to start a server may not involve sockets (but will still involve transports and protocols). TBD: Be more specific. TBD: Some platforms may not be interested in implementing all of these, e.g. start_serving() may be of no interest to mobile apps. (Although, there's a Minecraft server on my iPad...) The following methods for registering callbacks for file descriptors are optional. If they are not implemented, accessing the method (without calling it) returns AttributeError. The default implementation provides them but the user normally doesn't use these directly -- they are used by the transport implementations exclusively. Also, on Windows these may be present or not depending on whether a select-based or IOCP-based event loop is used. These take integer file descriptors only, not objects with a fileno() method. The file descriptor should represent something pollable -- i.e. no disk files. - ``add_reader(fd, callback, *args)``. Arrange for ``callback(*args)`` to be called whenever file descriptor ``fd`` is ready for reading. Returns a ``Handler`` object which can be used to cancel the callback. Note that, unlike ``call_later()``, the callback may be called many times. Calling ``add_reader()`` again for the same file descriptor implicitly cancels the previous callback for that file descriptor. (TBD: Returning a ``Handler`` that can be cancelled seems awkward. Let's forget about that.) (TBD: Change this to raise an exception if a handler is already set.) - ``add_writer(fd, callback, *args)``. Like ``add_reader()``, but registers the callback for writing instead of for reading. - ``remove_reader(fd)``. Cancels the current read callback for file descriptor ``fd``, if one is set. A no-op if no callback is currently set for the file descriptor. (The reason for providing this alternate interface is that it is often more convenient to remember the file descriptor than to remember the ``Handler`` object.) (TBD: Return ``True`` if a handler was removed, ``False`` if not.) - ``remove_writer(fd)``. This is to ``add_writer()`` as ``remove_reader()`` is to ``add_reader()``. - ``add_connector(fd, callback, *args)``. Like ``add_writer()`` but meant to wait for ``connect()`` operations, which on some platforms require different handling (e.g. ``WSAPoll()`` on Windows). - ``remove_connector(fd)``. This is to ``remove_writer()`` as ``add_connector()`` is to ``add_writer()``. TBD: What about multiple callbacks per fd? The current semantics is that ``add_reader()/add_writer()`` replace a previously registered callback. Change this to raise an exception if a callback is already registered. The following methods for doing async I/O on sockets are optional. They are alternative to the previous set of optional methods, intended for transport implementations on Windows using IOCP (if the event loop supports it). The socket argument has to be a non-blocking socket. - ``sock_recv(sock, n)``. Receive up to ``n`` bytes from socket ``sock``. 
  Returns a Future whose result on success will be a bytes object.

- ``sock_sendall(sock, data)``. Send bytes ``data`` to the socket
  ``sock``. Returns a Future whose result on success will be
  ``None``. (TBD: Is it better to emulate ``sendall()`` or ``send()``
  semantics? I think ``sendall()`` -- but perhaps it should still be
  *named* ``send()``?)

- ``sock_connect(sock, address)``. Connect to the given address.
  Returns a Future whose result on success will be ``None``.

- ``sock_accept(sock)``. Accept a connection from a socket. The
  socket must be in listening mode and bound to an address. Returns a
  Future whose result on success will be a tuple ``(conn, peer)``
  where ``conn`` is a connected non-blocking socket and ``peer`` is
  the peer address. (TBD: People tell me that this style of API is
  too slow for high-volume servers. So there's also
  ``start_serving()`` above. Then do we still need this?)

TBD: Optional methods are not so good. Perhaps these should be
required? It may still depend on the platform which set is more
efficient.

Callback Sequencing
-------------------

When two callbacks are scheduled for the same time, they are run in
the order in which they are registered. For example::

  ev.call_soon(foo)
  ev.call_soon(bar)

guarantees that ``foo()`` is called before ``bar()``.

If ``call_soon()`` is used, this guarantee is true even if the system
clock were to run backwards. This is also the case for
``call_later(0, callback, *args)``. However, if ``call_later()`` is
used with a nonzero delay, all bets are off if the system clock were
to run backwards. (A good event loop implementation should use
``time.monotonic()`` to avoid problems when the clock runs backward.
See PEP 418.)

Context
-------

All event loops have a notion of context. For the default event loop
implementation, the context is a thread. An event loop implementation
should run all callbacks in the same context. An event loop
implementation should run only one callback at a time, so callbacks
can assume automatic mutual exclusion with other callbacks scheduled
in the same event loop.

Exceptions
----------

There are two categories of exceptions in Python: those that derive
from the ``Exception`` class and those that derive from
``BaseException``. Exceptions deriving from ``Exception`` will
generally be caught and handled appropriately; for example, they will
be passed through by Futures, and they will be logged and ignored when
they occur in a callback.

However, exceptions deriving only from ``BaseException`` are never
caught, and will usually cause the program to terminate with a
traceback. (Examples of this category include ``KeyboardInterrupt``
and ``SystemExit``; it is usually unwise to treat these the same as
most other exceptions.)

The Handler Class
-----------------

The various methods for registering callbacks (e.g. ``call_later()``)
all return an object representing the registration that can be used to
cancel the callback. For want of a better name this object is called
a ``Handler``, although the user never needs to instantiate instances
of this class. There is one public method:

- ``cancel()``. Attempt to cancel the callback. TBD: Exact
  specification.

Read-only public attributes:

- ``callback``. The callback function to be called.

- ``args``. The argument tuple with which to call the callback
  function.

- ``cancelled``. True if ``cancel()`` has been called.

Note that some callbacks (e.g. those registered with ``call_later()``)
are meant to be called only once. Others (e.g.
those registered with ``add_reader()``) are meant to be called
multiple times.

TBD: An API to call the callback (encapsulating the exception handling
necessary)? Should it record how many times it has been called?
Maybe this API should just be ``__call__()``? (But it should suppress
exceptions.)

TBD: Public attribute recording the realtime value when the callback
is scheduled? (Since this is needed anyway for storing it in a heap.)

Futures
-------

The ``tulip.Future`` class here is intentionally similar to the
``concurrent.futures.Future`` class specified by PEP 3148, but there
are slight differences. The supported public API is as follows,
indicating the differences with PEP 3148:

- ``cancel()``. TBD: Exact specification.

- ``cancelled()``.

- ``running()``. Note that the meaning of this method is essentially
  "cannot be cancelled and isn't done yet". (TBD: Would be nice if
  this could be set *and* cleared in some cases, e.g. sock_recv().)

- ``done()``.

- ``result()``. Difference with PEP 3148: This has no timeout
  argument and does *not* wait; if the future is not yet done, it
  raises an exception.

- ``exception()``. Difference with PEP 3148: This has no timeout
  argument and does *not* wait; if the future is not yet done, it
  raises an exception.

- ``add_done_callback(fn)``. Difference with PEP 3148: The callback
  is never called immediately, and always in the context of the
  caller. (Typically, a context is a thread.) You can think of this
  as calling the callback through ``call_soon_threadsafe()``. Note
  that the callback (unlike all other callbacks defined in this PEP,
  and ignoring the convention from the section "Callback Style" below)
  is always called with a single argument, the Future object.

The internal methods defined in PEP 3148 are not supported. (TBD:
Maybe we do need to support these, in order to make it easy to write
user code that returns a Future?)

A ``tulip.Future`` object is not acceptable to the ``wait()`` and
``as_completed()`` functions in the ``concurrent.futures`` package.

A ``tulip.Future`` object is acceptable to a ``yield from`` expression
when used in a coroutine. This is implemented through the
``__iter__()`` interface on the Future. See the section "Coroutines
and the Scheduler" below.

When a Future is garbage-collected, if it has an associated exception
but neither ``result()`` nor ``exception()`` nor ``__iter__()`` has
ever been called (or the latter hasn't raised the exception yet --
details TBD), the exception should be logged. TBD: At what level?

In the future (pun intended) we may unify ``tulip.Future`` and
``concurrent.futures.Future``, e.g. by adding an ``__iter__()`` method
to the latter that works with ``yield from``. To prevent accidentally
blocking the event loop by calling e.g. ``result()`` on a Future
that's not done yet, the blocking operation may detect that an event
loop is active in the current thread and raise an exception instead.
However the current PEP strives to have no dependencies beyond Python
3.3, so changes to ``concurrent.futures.Future`` are off the table for
now.

Transports
----------

A transport is an abstraction on top of a socket or something similar
(for example, a UNIX pipe or an SSL connection). Transports are
strongly influenced by Twisted and PEP 3153. Users rarely implement
or instantiate transports -- rather, event loops offer utility methods
to set up transports.

Transports work in conjunction with protocols.
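For example, here is a rough sketch of tying the two together with
``create_transport()`` (the protocol methods used here are described
in the "Protocols" section below; the ``Greeter`` class and the
request bytes are just placeholders)::

    import tulip

    class Greeter:
        # A toy protocol: write one request, print whatever comes back.

        def connection_made(self, transport):
            self.transport = transport
            transport.write(b'HEAD / HTTP/1.0\r\n\r\n')

        def data_received(self, data):
            print('received:', data)

        def eof_received(self):
            self.transport.close()

        def connection_lost(self, exc):
            print('connection lost:', exc)

    ev = tulip.get_event_loop()
    ev.create_transport(Greeter, 'python.org', 80)
    ev.run()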
Protocols are typically written without knowing or caring about the exact type of transport used, and transports can be used with a wide variety of protocols. For example, an HTTP client protocol implementation may be used with either a plain socket transport or an SSL transport. The plain socket transport can be used with many different protocols besides HTTP (e.g. SMTP, IMAP, POP, FTP, IRC, SPDY). Most connections have an asymmetric nature: the client and server usually have very different roles and behaviors. Hence, the interface between transport and protocol is also asymmetric. From the protocol's point of view, *writing* data is done by calling the ``write()`` method on the transport object; this buffers the data and returns immediately. However, the transport takes a more active role in *reading* data: whenever some data is read from the socket (or other data source), the transport calls the protocol's ``data_received()`` method. Transports have the following public methods: - ``write(data)``. Write some bytes. The argument must be a bytes object. Returns ``None``. The transport is free to buffer the bytes, but it must eventually cause the bytes to be transferred to the entity at the other end, and it must maintain stream behavior. That is, ``t.write(b'abc'); t.write(b'def')`` is equivalent to ``t.write(b'abcdef')``, as well as to:: t.write(b'a') t.write(b'b') t.write(b'c') t.write(b'd') t.write(b'e') t.write(b'f') - ``writelines(iterable)``. Equivalent to:: for data in iterable: self.write(data) - ``write_eof()``. Close the writing end of the connection. Subsequent calls to ``write()`` are not allowed. Once all buffered data is transferred, the transport signals to the other end that no more data will be received. Some protocols don't support this operation; in that case, calling ``write_eof()`` will raise an exception. (Note: This used to be called ``half_close()``, but unless you already know what it is for, that name doesn't indicate *which* end is closed.) - ``can_write_eof()``. Return ``True`` if the protocol supports ``write_eof()``, ``False`` if it does not. (This method is needed because some protocols need to change their behavior when ``write_eof()`` is unavailable. For example, in HTTP, to send data whose size is not known ahead of time, the end of the data is typically indicated using ``write_eof()``; however, SSL does not support this, and an HTTP protocol implementation would have to use the "chunked" transfer encoding in this case. But if the data size is known ahead of time, the best approach in both cases is to use the Content-Length header.) - ``pause()``. Suspend delivery of data to the protocol until a subsequent ``resume()`` call. Between ``pause()`` and ``resume()``, the protocol's ``data_received()`` method will not be called. This has no effect on ``write()``. - ``resume()``. Restart delivery of data to the protocol via ``data_received()``. - ``close()``. Sever the connection with the entity at the other end. Any data buffered by ``write()`` will (eventually) be transferred before the connection is actually closed. The protocol's ``data_received()`` method will not be called again. Once all buffered data has been flushed, the protocol's ``connection_lost()`` method will be called with ``None`` as the argument. Note that this method does not wait for all that to happen. - ``abort()``. Immediately sever the connection. Any data still buffered by the transport is thrown away. 
  Soon, the protocol's ``connection_lost()`` method will be called
  with ``None`` as argument. (TBD: Distinguish in the
  ``connection_lost()`` argument between ``close()``, ``abort()`` or a
  close initiated by the other end? Or add a transport method to
  inquire about this? Glyph's proposal was to pass different
  exceptions for this purpose.)

TBD: Provide flow control the other way -- the transport may need to
suspend the protocol if the amount of data buffered becomes a burden.
Proposal: let the transport call ``protocol.pause()`` and
``protocol.resume()`` if they exist; if they don't exist, the protocol
doesn't support flow control. (Perhaps different names to avoid
confusion between protocols and transports?)

Protocols
---------

Protocols are always used in conjunction with transports. While a few
common protocols are provided (e.g. decent though not necessarily
excellent HTTP client and server implementations), most protocols will
be implemented by user code or third-party libraries.

A protocol must implement the following methods, which will be called
by the transport. Consider these callbacks that are always called by
the event loop in the right context. (See the "Context" section
above.)

- ``connection_made(transport)``. Indicates that the transport is
  ready and connected to the entity at the other end. The protocol
  should probably save the transport reference as an instance variable
  (so it can call its ``write()`` and other methods later), and may
  write an initial greeting or request at this point.

- ``data_received(data)``. The transport has read some bytes from the
  connection. The argument is always a non-empty bytes object. There
  are no guarantees about the minimum or maximum size of the data
  passed along this way. ``p.data_received(b'abcdef')`` should be
  treated exactly equivalent to::

    p.data_received(b'abc')
    p.data_received(b'def')

- ``eof_received()``. This is called when the other end called
  ``write_eof()`` (or something equivalent). The default
  implementation calls ``close()`` on the transport, which causes
  ``connection_lost()`` to be called (eventually) on the protocol.

- ``connection_lost(exc)``. The transport has been closed or aborted,
  has detected that the other end has closed the connection cleanly,
  or has encountered an unexpected error. In the first three cases
  the argument is ``None``; for an unexpected error, the argument is
  the exception that caused the transport to give up. (TBD: Do we
  need to distinguish between the first three cases?)

Here is a chart indicating the order and multiplicity of calls:

1. ``connection_made()`` -- exactly once
2. ``data_received()`` -- zero or more times
3. ``eof_received()`` -- at most once
4. ``connection_lost()`` -- exactly once

TBD: Discuss whether user code needs to do anything to make sure that
protocol and transport aren't garbage-collected prematurely.

Callback Style
--------------

Most interfaces taking a callback also take positional arguments. For
instance, to arrange for ``foo("abc", 42)`` to be called soon, you
call ``ev.call_soon(foo, "abc", 42)``. To schedule the call
``foo()``, use ``ev.call_soon(foo)``. This convention greatly reduces
the number of small lambdas required in typical callback programming.

This convention specifically does *not* support keyword arguments.
Keyword arguments are used to pass optional extra information about
the callback. This allows graceful evolution of the API without
having to worry about whether a keyword might be significant to a
callee somewhere.
If you have a callback that *must* be called with a keyword argument, you can use a lambda or ``functools.partial``. For example:: ev.call_soon(functools.partial(foo, "abc", repeat=42)) Choosing an Event Loop Implementation ------------------------------------- TBD. (This is about the choice to use e.g. select vs. poll vs. epoll, and how to override the choice. Probably belongs in the event loop policy.) Coroutines and the Scheduler ============================ This is a separate toplevel section because its status is different from the event loop interface. Usage of coroutines is optional, and it is perfectly fine to write code using callbacks only. On the other hand, there is only one implementation of the scheduler/coroutine API, and if you're using coroutines, that's the one you're using. Coroutines ---------- A coroutine is a generator that follows certain conventions. For documentation purposes, all coroutines should be decorated with ``@tulip.coroutine``, but this cannot be strictly enforced. Coroutines use the ``yield from`` syntax introduced in PEP 380, instead of the original ``yield`` syntax. The word "coroutine", like the word "generator", is used for two different (though related) concepts: - The function that defines a coroutine (a function definition decorated with ``tulip.coroutine``). If disambiguation is needed, we call this a *coroutine function*. - The object obtained by calling a coroutine function. This object represents a computation or an I/O operation (usually a combination) that will complete eventually. For disambiguation we call it a *coroutine object*. Things a coroutine can do: - ``result = yield from future`` -- suspends the coroutine until the future is done, then returns the future's result, or raises its exception, which will be propagated. - ``result = yield from coroutine`` -- wait for another coroutine to produce a result (or raise an exception, which will be propagated). The ``coroutine`` expression must be a *call* to another coroutine. - ``results = yield from tulip.par(futures_and_coroutines)`` -- Wait for a list of futures and/or coroutines to complete and return a list of their results. If one of the futures or coroutines raises an exception, that exception is propagated, after attempting to cancel all other futures and coroutines in the list. - ``return result`` -- produce a result to the coroutine that is waiting for this one using ``yield from``. - ``raise exception`` -- raise an exception in the coroutine that is waiting for this one using ``yield from``. Calling a coroutine does not start its code running -- it is just a generator, and the coroutine object returned by the call is really a generator object, which doesn't do anything until you iterate over it. In the case of a coroutine object, there are two basic ways to start it running: call ``yield from coroutine`` from another coroutine (assuming the other coroutine is already running!), or convert it to a Task. Coroutines can only run when the event loop is running. Tasks ----- A Task is an object that manages an independently running coroutine. The Task interface is the same as the Future interface. The task becomes done when its coroutine returns or raises an exception; if it returns a result, that becomes the task's result, if it raises an exception, that becomes the task's exception. 
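For example, a rough sketch of running a coroutine as a Task (the
exact spelling for creating a Task is an assumption here; the
reference implementation currently uses the class constructor
directly)::

    import tulip

    @tulip.coroutine
    def first_sockaddr(host):
        ev = tulip.get_event_loop()
        infos = yield from ev.getaddrinfo(host, 80)
        return infos[0][4]  # the sockaddr of the first entry

    ev = tulip.get_event_loop()
    task = tulip.Task(first_sockaddr('python.org'))  # assumed spelling
    ev.run_until_complete(task)
    print(task.result())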
Cancelling a task that's not done yet prevents its coroutine from
completing; in this case an exception is thrown into the coroutine
that it may catch to further handle cancellation, but it doesn't have
to (this is done using the standard ``close()`` method on generators,
described in PEP 342).

The ``par()`` function described above runs coroutines in parallel by
converting them to Tasks. (Arguments that are already Tasks or
Futures are not converted.)

Tasks are also useful for interoperating between coroutines and
callback-based frameworks like Twisted. After converting a coroutine
into a Task, callbacks can be added to the Task.

You may ask, why not convert all coroutines to Tasks? The
``@tulip.coroutine`` decorator could do this. This would slow things
down considerably in the case where one coroutine calls another (and
so on), as switching to a "bare" coroutine has much less overhead than
switching to a Task.

The Scheduler
-------------

The scheduler has no public interface. You interact with it by using
``yield from future`` and ``yield from task``. In fact, there is no
single object representing the scheduler -- its behavior is
implemented by the ``Task`` and ``Future`` classes using only the
public interface of the event loop, so it will work with third-party
event loop implementations, too.

Sleeping
--------

TBD: ``yield sleep(seconds)``. Can use ``sleep(0)`` to suspend to
poll for I/O.

Wait for First
--------------

TBD: Need an interface to wait for the first of a collection of
Futures.

Coroutines and Protocols
------------------------

The best way to use coroutines to implement protocols is probably to
use a streaming buffer that gets filled by ``data_received()`` and can
be read asynchronously using methods like ``read(n)`` and
``readline()`` that return a Future. When the connection is closed,
``read()`` should return a Future whose result is ``b''``, or raise an
exception if ``connection_lost()`` is called with an exception.

To write, the ``write()`` method (and friends) on the transport can be
used -- these do not return Futures. A standard protocol
implementation should be provided that sets this up and kicks off the
coroutine when ``connection_made()`` is called.

TBD: Be more specific.

Cancellation
------------

TBD. When a Task is cancelled its coroutine may see an exception at
any point where it is yielding to the scheduler (i.e., potentially at
any ``yield from`` operation). We need to spell out which exception
is raised.

Also TBD: timeouts.

Open Issues
===========

- A debugging API? E.g. something that logs a lot of stuff, or logs
  unusual conditions (like queues filling up faster than they drain)
  or even callbacks taking too much time...

- Do we need introspection APIs? E.g. asking for the read callback
  given a file descriptor. Or when the next scheduled call is. Or
  the list of file descriptors registered with callbacks.

- Should we have ``future.add_callback(callback, *args)``, using the
  convention from the section "Callback Style" above, or should we
  stick with the PEP 3148 specification of
  ``future.add_done_callback(callback)`` which calls
  ``callback(future)``? (Glyph suggested using a different method
  name since add_done_callback() does not guarantee that the callback
  will be called in the right context.)

- Returning a Future is relatively expensive, and it is quite possible
  that some types of calls *usually* complete immediately (e.g.
  writing small amounts of data to a socket).
  A trick used by Richard Oudkerk in the tulip project's proactor
  branch makes calls like recv() either return a regular result or
  *raise* a Future.  The caller (likely a transport) must then write
  code like this::

    try:
        res = ev.sock_recv(sock, 8192)
    except Future as f:
        yield from sch.block_future(f)
        res = f.result()

- Do we need a larger vocabulary of operations for combining
  coroutines and/or futures?  E.g. in addition to par() we could have
  a way to run several coroutines sequentially (returning all results
  or passing the result of one to the next and returning the final
  result?).  We might also introduce explicit locks (though these
  will be a bit of a pain to use, as we can't use the
  ``with lock: block`` syntax).  Anyway, I think all of these are easy
  enough to write using ``Task``.  Proposal:
  ``f = yield from wait_one(fs)`` takes a set of Futures and sets f to
  the first of those that is done.  (Yes, this requires an
  intermediate Future to wait for.)  You can then write::

    while fs:
        f = tulip.wait_one(fs)
        fs.remove(f)

- Support for datagram protocols, "connected" or otherwise?  Probably
  need more socket I/O methods, e.g. ``sock_sendto()`` and
  ``sock_recvfrom()``.  Or users can write their own (it's not rocket
  science).  Is it reasonable to map ``write()``, ``writelines()``,
  ``data_received()`` to single datagrams?

- Task or callback priorities?  (I hope not.)

- An EventEmitter in the style of NodeJS?  Or make this a separate
  PEP?  It's easy enough to do in user space, though it may benefit
  from standardization.  (See
  https://github.com/mnot/thor/blob/master/thor/events.py and
  https://github.com/mnot/thor/blob/master/doc/events.md for
  examples.)

Acknowledgments
===============

Apart from PEP 3153, influences include PEP 380 and Greg Ewing's
tutorial for ``yield from``, Twisted, Tornado, ZeroMQ, pyftpdlib,
tulip (the author's attempts at synthesis of all these), wattle (Steve
Dower's counter-proposal), numerous discussions on python-ideas from
September through December 2012, a Skype session with Steve Dower and
Dino Viehland, email exchanges with Ben Darnell, an audience with
Niels Provos (original author of libevent), and two in-person meetings
with several Twisted developers, including Glyph, Brian Warner, David
Reid, and Duncan McGreggor.  Also, the author's previous work on async
support in the NDB library for Google App Engine was an important
influence.

Copyright
=========

This document has been placed in the public domain.

..
  Local Variables:
  mode: indented-text
  indent-tabs-mode: nil
  sentence-end-double-space: t
  fill-column: 70
  coding: utf-8
  End:

From jnoller at gmail.com Fri Dec 21 20:06:47 2012
From: jnoller at gmail.com (Jesse Noller)
Date: Fri, 21 Dec 2012 14:06:47 -0500
Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted
In-Reply-To:
References:
Message-ID:

On Friday, December 21, 2012 at 1:57 PM, Guido van Rossum wrote:
> Dear python-dev *and* python-ideas,
>
> I am posting PEP 3156 here for early review and discussion. As you can
> see from the liberally sprinkled TBD entries it is not done, but I am
> about to disappear on vacation for a few weeks and I am reasonably
> happy with the state of things so far. (Of course feedback may change
> this. :-) Also, there has already been some discussion on python-ideas
> (and even on Twitter) so I don't want python-dev to feel out of the
> loop -- this *is* a proposal for a new standard library module. (But
> no, I haven't picked the module name yet.
:-) > > There's an -- also incomplete -- reference implementation at > http://code.google.com/p/tulip/ -- unlike the first version of tulip, > this version actually has (some) unittests. > > Let the bikeshedding begin! > > (Oh, happy holidays too. :-) > > -- > --Guido van Rossum (python.org/~guido (http://python.org/~guido)) > I really do like tulip as the name. It's quite pretty. From exarkun at twistedmatrix.com Fri Dec 21 20:08:49 2012 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Fri, 21 Dec 2012 19:08:49 -0000 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> Message-ID: <20121221190849.6389.1662921795.divmod.xquotient.355@localhost6.localdomain6> Please stop copying me on this thread. Thanks, Jean-Paul From guido at python.org Fri Dec 21 20:09:39 2012 From: guido at python.org (Guido van Rossum) Date: Fri, 21 Dec 2012 11:09:39 -0800 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Fri, Dec 21, 2012 at 11:06 AM, Jesse Noller wrote: > I really do like tulip as the name. It's quite pretty. I chose it because Twisted and Tornado both start with T. But those have kind of dark associations; I wanted to offset that with something lighter. (OTOH we could use a black tulip as a logo. :-) Regardless, it's not the kind of name we tend to use for the stdlib. It'll probably end up being asynclib or something... -- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Fri Dec 21 20:44:11 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 21 Dec 2012 20:44:11 +0100 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted References: Message-ID: <20121221204411.1269322b@pitrou.net> Hello, > To get the current event loop, use get_event_loop(). This returns an > instance of the EventLoop class defined below or an equivalent > object. It is possible that get_event_loop() returns a different > object depending on the current thread, or depending on some other > notion of context. > > To set the current event loop, use set_event_loop(event_loop), where > event_loop is an instance of the EventLoop class or equivalent. This > uses the same notion of context as get_event_loop(). So can we instantiate an EventLoop directly and then call set_event_loop() with it? Or is the use case different? > - ``create_transport(protocol_factory, host, port, **kwargs)``. > Creates a transport and a protocol and ties them together. Returns > a Future whose result on success is a (transport, protocol) pair. > Note that when the Future completes, the protocol's > ``connection_made()`` method has not yet been called; that will > happen when the connection handshake is complete. When it is > impossible to connect to the given host and port, the Future will > raise an exception instead. > > Optional keyword arguments: > > - ``family``, ``type``, ``proto``, ``flags``: Address familty, > socket type, protcol, and miscellaneous flags to be passed through > to ``getaddrinfo()``. These all default to ``0`` except ``type`` > which defaults to ``socket.SOCK_STREAM``. > > - ``ssl``: Pass ``True`` to create an SSL transport (by default a > plain TCP is created). Or pass an ``ssl.SSLContext`` object to > override the default SSL context object to be used. > > TBD: Should this be called create_connection()? 
Either create_connection() or create_client(). create_transport() is wrong, since server transports wouldn't use that function. I would favour create_client() if this function is also meant to support UDP (I know you haven't thought about UDP yet, but it is an important and common use case). I have another question about that API: if I want to cancel the connection attempt after a given delay, how do I do that? If I call cancel() on the future, does it cancel the connect() call? As for SSL, there are security issues with having a "default SSL context" (notably, any decent client use of SSL *must* check the server certificate against an appropriate set of CAs). It's much better to force users to pass a context explicitly. Choosing default settings should only be for higher-level APIs like urllib.request. (btw, don't you mean that family defaults to AF_INET?) > If executor is None, a default ThreadPoolExecutor with 5 threads is used Is it because Twisted's thread pool has minThreads=5? :) > The transport is free to buffer the bytes, but it must eventually > cause the bytes to be transferred to the entity at the other end, and > it must maintain stream behavior. That is, t.write(b'abc'); > t.write(b'def') is equivalent to t.write(b'abcdef') I think this is a bad idea. The kernel's network stack should do the buffering (and choose appropriate algorithms for that), not the user-level framework. The transport should write the bytes as soon as the fd is ready for writing, and it should write the same chunks as given by the user, not a concatenation of them. Besides, it would be better if transports weren't automatically *streaming* transports. There are connected datagram protocols, such as named pipes under Windows (multiprocessing already uses non-blocking Windows named pipes). > Proposal: let the transport call protocol.pause() and > protocol.resume() if they exist; if they don't exist, the protocol > doesn't support flow control. +1. The Protocol base class can provide default no-op implementations. > TBD: Discuss whether user code needs to do anything to make sure that > protocol and transport aren't garbage-collected prematurely. The transport should be tied to the event loop as long as the connection holds, and the protocol will hold to the transport. > TBD: Need an interface to wait for the first of a collection of Futures. Have you looked at Twisted's DeferredList? http://twistedmatrix.com/documents/12.1.0/api/twisted.internet.defer.DeferredList.html I think par() could take a keyword-only argument to specify you want the callback to be triggered on the first result (and perhaps being able to choose between "the first success result" and "the first success or error result"). > A trick used by Richard Oudkerk in the tulip project's proactor > branch makes calls like recv() either return a regular result or > raise a Future. The caller (likely a transport) must then write code > like this: Isn't it a case of premature optimization? If we want to keep this, there should be a nicer API, perhaps like Twisted's maybeDeferred: http://twistedmatrix.com/documents/current/api/twisted.internet.defer.html#maybeDeferred > We might also introduce explicit locks (though these will be a bit of > a pain to use, as we can't use the with lock: block syntax). I don't understand why you couldn't use "with lock" in a coroutine. Am I misunderstanding something? > Is it reasonable to map write(), writelines(), data_received() to > single datagrams? 
Well, at least that's how Twisted does it (not sure about writelines()). Regards Antoine. From solipsis at pitrou.net Fri Dec 21 20:50:33 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 21 Dec 2012 20:50:33 +0100 Subject: [Python-Dev] peps: Specify start_serving(). Add Post-History. References: <3YSg56288VzNZl@mail.python.org> Message-ID: <20121221205033.634afbaf@pitrou.net> On Fri, 21 Dec 2012 20:34:18 +0100 (CET) guido.van.rossum wrote: > > - In either case, once it has a socket, it will wrap it in a > - transport, and then enter a loop accepting connections (the best way > - to implement such a loop depends on the platform). Each time a > - connection is accepted, a transport and protocol are created for it. > + TBD: Support SSL? I don't even know how to do that synchronously, > + and I suppose it needs a certificate. You need a SSLContext, and that SSLContext must have a cert / key pair defined using the `load_cert_chain()` method. I supposed you meant "asynchronously", not "synchronously". The listening socket doesn't have to be a SSL socket, only the connected sockets need to be wrapped. The non-blocking handshake shouldn't be different (AFAICT) for a server SSL socket than for a client SSL socket. Regards Antoine. From guido at python.org Fri Dec 21 21:37:25 2012 From: guido at python.org (Guido van Rossum) Date: Fri, 21 Dec 2012 12:37:25 -0800 Subject: [Python-Dev] peps: Specify start_serving(). Add Post-History. In-Reply-To: <20121221205033.634afbaf@pitrou.net> References: <3YSg56288VzNZl@mail.python.org> <20121221205033.634afbaf@pitrou.net> Message-ID: I really meant *synchronously*... I usually start with working sync code and then figure out what to do to make it async. I'll give what you suggest a try. --Guido van Rossum (sent from Android phone) On Dec 21, 2012 11:54 AM, "Antoine Pitrou" wrote: > On Fri, 21 Dec 2012 20:34:18 +0100 (CET) > guido.van.rossum wrote: > > > > - In either case, once it has a socket, it will wrap it in a > > - transport, and then enter a loop accepting connections (the best way > > - to implement such a loop depends on the platform). Each time a > > - connection is accepted, a transport and protocol are created for it. > > + TBD: Support SSL? I don't even know how to do that synchronously, > > + and I suppose it needs a certificate. > > You need a SSLContext, and that SSLContext must have a cert / key pair > defined using the `load_cert_chain()` method. > > I supposed you meant "asynchronously", not "synchronously". The > listening socket doesn't have to be a SSL socket, only the connected > sockets need to be wrapped. The non-blocking handshake shouldn't be > different (AFAICT) for a server SSL socket than for a client SSL socket. > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Fri Dec 21 22:00:02 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 21 Dec 2012 22:00:02 +0100 Subject: [Python-Dev] peps: Specify start_serving(). Add Post-History. In-Reply-To: References: <3YSg56288VzNZl@mail.python.org> <20121221205033.634afbaf@pitrou.net> Message-ID: <20121221220002.7ba7cfcd@pitrou.net> On Fri, 21 Dec 2012 12:37:25 -0800 Guido van Rossum wrote: > I really meant *synchronously*... 
I usually start with working sync code
> and then figure out what to do to make it async. I'll give what you suggest
> a try.

Ah. Then I hope the doc example can help you:
http://docs.python.org/dev/library/ssl.html#server-side-operation

Regards

Antoine.


From _ at lvh.cc Fri Dec 21 22:04:04 2012
From: _ at lvh.cc (Laurens Van Houtven)
Date: Fri, 21 Dec 2012 22:04:04 +0100
Subject: [Python-Dev] [Python-ideas] PEP 3156 - Asynchronous IO Support Rebooted
In-Reply-To:
References:
Message-ID:

Looks reasonable to me :) Comments:

create_transport "combines" a transport and a protocol. Is that process
reversible? that might seem like an exotic thing (and I guess it kind of
is), but I've wanted this e.g for websockets, and I guess there's a few
other cases where it could be useful :)

eof_received on protocols seems unusual. What's the rationale?

I know we disagree that callbacks (of the line_received variety) are a good
idea for blocking IO (I think we should have universal protocol
implementations), but can we agree that they're what we want for tulip? If
so, I can try to figure out a way to get them to fit together :) I'm
assuming that this means you'd like protocols and transports in this PEP?

A generic comment on yield from APIs that I'm sure has been discussed in
some e-mail I missed: is there an obvious way to know up front whether
something needs to be yielded or yield frommed? In twisted, which is what
I'm used to it's all deferreds; but here a future's yield from but sleep's
yield?

Will comment more as I keep reading I'm sure :)

On Fri, Dec 21, 2012 at 8:09 PM, Guido van Rossum wrote:

> On Fri, Dec 21, 2012 at 11:06 AM, Jesse Noller wrote:
> > I really do like tulip as the name. It's quite pretty.
>
> I chose it because Twisted and Tornado both start with T. But those
> have kind of dark associations; I wanted to offset that with something
> lighter. (OTOH we could use a black tulip as a logo. :-)
>
> Regardless, it's not the kind of name we tend to use for the stdlib.
> It'll probably end up being asynclib or something...
>
> --
> --Guido van Rossum (python.org/~guido)

--
cheers
lvh
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org Fri Dec 21 22:10:45 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 13:10:45 -0800
Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted
In-Reply-To: <20121221204411.1269322b@pitrou.net>
References: <20121221204411.1269322b@pitrou.net>
Message-ID:

Inline.

--Guido van Rossum (sent from Android phone)
On Dec 21, 2012 11:47 AM, "Antoine Pitrou" wrote:

>
> Hello,
>
> > To get the current event loop, use get_event_loop(). This returns an
> > instance of the EventLoop class defined below or an equivalent
> > object. It is possible that get_event_loop() returns a different
> > object depending on the current thread, or depending on some other
> > notion of context.
> >
> > To set the current event loop, use set_event_loop(event_loop), where
> > event_loop is an instance of the EventLoop class or equivalent. This
> > uses the same notion of context as get_event_loop().
>
> So can we instantiate an EventLoop directly and then call
> set_event_loop() with it? Or is the use case different?

That's an abstract class, but if you know a concrete implementation you
can use this. E.g. latest Tulip unit tests.
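Something like this, for example (a sketch only -- ``UnixEventLoop`` is
a stand-in for whatever concrete class an implementation provides, not
a name the PEP specifies):

    import tulip

    loop = tulip.UnixEventLoop()   # hypothetical concrete subclass
    tulip.set_event_loop(loop)     # make it the current loop
    assert tulip.get_event_loop() is loop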
> > - ``create_transport(protocol_factory, host, port, **kwargs)``. > > Creates a transport and a protocol and ties them together. Returns > > a Future whose result on success is a (transport, protocol) pair. > > Note that when the Future completes, the protocol's > > ``connection_made()`` method has not yet been called; that will > > happen when the connection handshake is complete. When it is > > impossible to connect to the given host and port, the Future will > > raise an exception instead. > > > > Optional keyword arguments: > > > > - ``family``, ``type``, ``proto``, ``flags``: Address familty, > > socket type, protcol, and miscellaneous flags to be passed through > > to ``getaddrinfo()``. These all default to ``0`` except ``type`` > > which defaults to ``socket.SOCK_STREAM``. > > > > - ``ssl``: Pass ``True`` to create an SSL transport (by default a > > plain TCP is created). Or pass an ``ssl.SSLContext`` object to > > override the default SSL context object to be used. > > > > TBD: Should this be called create_connection()? > > Either create_connection() or create_client(). create_transport() is > wrong, since server transports wouldn't use that function. > > I would favour create_client() if this function is also meant to > support UDP (I know you haven't thought about UDP yet, but it is an > important and common use case). OK. > I have another question about that API: if I want to cancel the > connection attempt after a given delay, how do I do that? If I call > cancel() on the future, does it cancel the connect() call? It does in Tulip, because it's really a task. Maybe this should be in the spec? > As for SSL, there are security issues with having a "default SSL > context" (notably, any decent client use of SSL *must* check the server > certificate against an appropriate set of CAs). It's much better to > force users to pass a context explicitly. Choosing default settings > should only be for higher-level APIs like urllib.request. Hm. That makes simple tests harder. But I understand the concern. > (btw, don't you mean that family defaults to AF_INET?) > > > If executor is None, a default ThreadPoolExecutor with 5 threads is used > > Is it because Twisted's thread pool has minThreads=5? :) Yes, and to encourage the use of set_default_executor() ... :-) > > The transport is free to buffer the bytes, but it must eventually > > cause the bytes to be transferred to the entity at the other end, and > > it must maintain stream behavior. That is, t.write(b'abc'); > > t.write(b'def') is equivalent to t.write(b'abcdef') > > I think this is a bad idea. The kernel's network stack should do the > buffering (and choose appropriate algorithms for that), not the > user-level framework. The transport should write the bytes as soon as > the fd is ready for writing, and it should write the same chunks as > given by the user, not a concatenation of them. I asked Glyph about this. It depends on the OS... Mac syscalls are so slow that it is better to join in user space. This should really be up to the transport, although for stream transports the given equivalency should definitely hold. > Besides, it would be better if transports weren't automatically > *streaming* transports. There are connected datagram protocols, such as > named pipes under Windows (multiprocessing already uses non-blocking > Windows named pipes). I think we need to support datagrams, but the default ought to be stream. 
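(To illustrate the user-space joining mentioned above, a sketch of one
possible transport write path -- this is not tulip's actual code:)

    class _SketchTransport:
        def __init__(self, sock):
            self._sock = sock
            self._chunks = []

        def write(self, data):
            self._chunks.append(data)      # no syscall per write() call

        def _on_writable(self):
            data = b''.join(self._chunks)  # join once, in user space
            del self._chunks[:]
            sent = self._sock.send(data)
            if sent < len(data):
                self._chunks.append(data[sent:])  # keep the unsent tail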
> > Proposal: let the transport call protocol.pause() and > > protocol.resume() if they exist; if they don't exist, the protocol > > doesn't support flow control. > > +1. The Protocol base class can provide default no-op implementations. OK. > > TBD: Discuss whether user code needs to do anything to make sure that > > protocol and transport aren't garbage-collected prematurely. > > The transport should be tied to the event loop as long as the > connection holds, and the protocol will hold to the transport. OK. > > TBD: Need an interface to wait for the first of a collection of Futures. > > Have you looked at Twisted's DeferredList? > http://twistedmatrix.com/documents/12.1.0/api/twisted.internet.defer.DeferredList.html No, I am trying to stay away from them. > I think par() could take a keyword-only argument to specify you want > the callback to be triggered on the first result (and perhaps being > able to choose between "the first success result" and "the first > success or error result"). Good idea. This is unexplored. > > A trick used by Richard Oudkerk in the tulip project's proactor > > branch makes calls like recv() either return a regular result or > > raise a Future. The caller (likely a transport) must then write code > > like this: > > Isn't it a case of premature optimization? Yeah, we should not do this. > If we want to keep this, there should be a nicer API, perhaps like > Twisted's maybeDeferred: > http://twistedmatrix.com/documents/current/api/twisted.internet.defer.html#maybeDeferred Ugh. > > We might also introduce explicit locks (though these will be a bit of > > a pain to use, as we can't use the with lock: block syntax). > > I don't understand why you couldn't use "with lock" in a coroutine. Am > I misunderstanding something? If another task has the lock we must yield. But 'with' can't do that. > > Is it reasonable to map write(), writelines(), data_received() to > > single datagrams? > > Well, at least that's how Twisted does it (not sure about writelines()). OK. > Regards > > Antoine. Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Fri Dec 21 22:43:28 2012 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Fri, 21 Dec 2012 13:43:28 -0800 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> Message-ID: On Fri, Dec 21, 2012 at 6:46 AM, Brett Cannon wrote: > > On Thu, Dec 20, 2012 at 7:35 PM, Chris Jerdonek > wrote: >> >> I don't disagree that he shouldn't have cross-posted. I was just >> pointing out that the language should be clarified. What's confusing >> is that the current language implies that one shouldn't send any >> PEP-related e-mails to any mailing list other than peps at . In >> particular, how can one discuss PEPs on python-dev or python-ideas >> without violating that language (e.g. this e-mail which is related to >> PEP 1)? It is probably just a matter of clarifying what "PEP-related" >> means. > > > I'm just not seeing the confusion, sorry. And we have never really had any > confusion over this wording before. If you want to send a patch to tweak the > wording to me more clear then please go ahead and I will consider it, but > I'm not worried enough about it to try to come up with some rewording > myself. 
I uploaded a proposed patch to this issue: http://bugs.python.org/issue16746 --Chris From jonathan at slenders.be Fri Dec 21 23:26:09 2012 From: jonathan at slenders.be (Jonathan Slenders) Date: Fri, 21 Dec 2012 23:26:09 +0100 Subject: [Python-Dev] [Python-ideas] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: As far as I understand, "yield from" will always work, because a Future object can act like an iterator, and you can delegate your own generator to this iterator at the place of "yield from". "yield" only works if the parameter behind yield is already a Future object. Right Guido? In case of sleep, sleep could be implemented to return a Future object. 2012/12/21 Laurens Van Houtven <_ at lvh.cc> > A generic comment on yield from APIs that I'm sure has been discussed in > some e-mail I missed: is there an obvious way to know up front whether > something needs to be yielded or yield frommed? In twisted, which is what > I'm used to it's all deferreds; but here a future's yield from but sleep's > yield? > > > > On Fri, Dec 21, 2012 at 8:09 PM, Guido van Rossum wrote: > >> On Fri, Dec 21, 2012 at 11:06 AM, Jesse Noller wrote: >> > I really do like tulip as the name. It's quite pretty. >> >> I chose it because Twisted and Tornado both start with T. But those >> have kind of dark associations; I wanted to offset that with something >> lighter. (OTOH we could use a black tulip as a logo. :-) >> >> Regardless, it's not the kind of name we tend to use for the stdlib. >> It'll probably end up being asynclib or something... >> >> -- >> --Guido van Rossum (python.org/~guido) >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > > > > -- > cheers > lvh > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat Dec 22 00:07:51 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 22 Dec 2012 09:07:51 +1000 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: We were tentatively calling it "concurrent.eventloop" at the 2011 language summit. -- Sent from my phone, thus the relative brevity :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sat Dec 22 02:02:09 2012 From: guido at python.org (Guido van Rossum) Date: Fri, 21 Dec 2012 17:02:09 -0800 Subject: [Python-Dev] [Python-ideas] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Fri, Dec 21, 2012 at 1:04 PM, Laurens Van Houtven <_ at lvh.cc> wrote: > Looks reasonable to me :) Comments: > > create_transport "combines" a transport and a protocol. Is that process > reversible? that might seem like an exotic thing (and I guess it kind of > is), but I've wanted this e.g for websockets, and I guess there's a few > other cases where it could be useful :) If you really need this, it's probably best to start out doing this as a nonstandard extension of an implementation. The current *implementation* makes it simple enough, but I don't think it's worth complicating the PEP. Working code might convince me otherwise. > eof_received on protocols seems unusual. What's the rationale? 
Well how else would you indicate that the other end did a half-close
(in Twisted terminology)? You can't call connection_lost() because you
might still want to write more. E.g. this is how HTTP servers work if
there's no Content-length or chunked encoding on a request body: they
read until EOF, then do their thing and write the response.

> I know we disagree that callbacks (of the line_received variety) are a good
> idea for blocking IO (I think we should have universal protocol
> implementations), but can we agree that they're what we want for tulip? If
> so, I can try to figure out a way to get them to fit together :) I'm
> assuming that this means you'd like protocols and transports in this PEP?

Sorry, I have no idea what you're talking about. Can you clarify? I do
know that the PEP is weakest in specifying how a coroutine can
implement a transport. However my plans are clear: in the old tulip
code there's a BufferedReader; somehow the coroutine will receive a
"stdin" and a "stdout" where the "stdin" is a BufferedReader, which has
methods like read(), readline() etc. which return Futures and must be
invoked using yield from; and "stdout" is a transport, which has
write() and friends that don't return anything but just buffer stuff
and start the I/O asynchronously (and may try to slow down the
protocol by calling its pause() method).

> A generic comment on yield from APIs that I'm sure has been discussed in
> some e-mail I missed: is there an obvious way to know up front whether
> something needs to be yielded or yield frommed? In twisted, which is what
> I'm used to it's all deferreds; but here a future's yield from but sleep's
> yield?

In PEP 3156 conformant code you're supposed always to use 'yield from'.
The only time you see a bare yield is when it's part of the
implementation's internals. (However I think tulip actually will handle
a yield the same way as a yield from, except that it's slower because
it makes a roundtrip to the scheduler, a.k.a. trampoline.)

> Will comment more as I keep reading I'm sure :)

Please do!

--
--Guido van Rossum (python.org/~guido)

From guido at python.org Sat Dec 22 02:03:26 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 17:03:26 -0800
Subject: [Python-Dev] [Python-ideas] PEP 3156 - Asynchronous IO Support Rebooted
In-Reply-To:
References:
Message-ID:

On Fri, Dec 21, 2012 at 2:26 PM, Jonathan Slenders wrote:
> As far as I understand, "yield from" will always work, because a Future
> object can act like an iterator, and you can delegate your own generator to
> this iterator at the place of "yield from".
> "yield" only works if the parameter behind yield is already a Future object.
> Right Guido?

Correct! Sounds like you got it now. That's the magic of yield from..

> In case of sleep, sleep could be implemented to return a Future object.

It does; in tulip/futures.py:

    def sleep(when, result=None):
        future = Future()
        future._event_loop.call_later(when, future.set_result, result)
        return future

--
--Guido van Rossum (python.org/~guido)

From guido at python.org Sat Dec 22 02:04:47 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 21 Dec 2012 17:04:47 -0800
Subject: [Python-Dev] peps: Specify start_serving(). Add Post-History.
In-Reply-To: <20121221220002.7ba7cfcd@pitrou.net> References: <3YSg56288VzNZl@mail.python.org> <20121221205033.634afbaf@pitrou.net> <20121221220002.7ba7cfcd@pitrou.net> Message-ID: On Fri, Dec 21, 2012 at 1:00 PM, Antoine Pitrou wrote: > On Fri, 21 Dec 2012 12:37:25 -0800 > Guido van Rossum wrote: >> I really meant *synchronously*... I usually start with working sync code >> and then figure out what to do to make it async. I'll give what you suggest >> a try. > > Ah. Then I hope the doc example can help you: > http://docs.python.org/dev/library/ssl.html#server-side-operation Heh. Thanks. However I wouldn't know where to get a certificate. And for unittests, distributing a certificate sounds like an obvious bad idea. :-) -- --Guido van Rossum (python.org/~guido) From benjamin at python.org Sat Dec 22 02:08:22 2012 From: benjamin at python.org (Benjamin Peterson) Date: Fri, 21 Dec 2012 19:08:22 -0600 Subject: [Python-Dev] peps: Specify start_serving(). Add Post-History. In-Reply-To: References: <3YSg56288VzNZl@mail.python.org> <20121221205033.634afbaf@pitrou.net> <20121221220002.7ba7cfcd@pitrou.net> Message-ID: 2012/12/21 Guido van Rossum : > On Fri, Dec 21, 2012 at 1:00 PM, Antoine Pitrou wrote: >> On Fri, 21 Dec 2012 12:37:25 -0800 >> Guido van Rossum wrote: >>> I really meant *synchronously*... I usually start with working sync code >>> and then figure out what to do to make it async. I'll give what you suggest >>> a try. >> >> Ah. Then I hope the doc example can help you: >> http://docs.python.org/dev/library/ssl.html#server-side-operation > > Heh. Thanks. However I wouldn't know where to get a certificate. And > for unittests, distributing a certificate sounds like an obvious bad > idea. :-) It's fairly easy to generate a "fake" self-signed one for testing purposes. We already have some in the test suite. -- Regards, Benjamin From guido at python.org Sat Dec 22 02:24:15 2012 From: guido at python.org (Guido van Rossum) Date: Fri, 21 Dec 2012 17:24:15 -0800 Subject: [Python-Dev] [Python-ideas] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Fri, Dec 21, 2012 at 5:17 PM, Jasper St. Pierre wrote: > On Fri, Dec 21, 2012 at 8:02 PM, Guido van Rossum wrote: > > ... snip ... > >> In PEP 3156 conformant code you're supposed always to use 'yield >> from'. The only time you see a bare yield is when it's part of the >> implementation's internals. (However I think tulip actually will >> handle a yield the same way as a yield from, except that it's slower >> because it makes a roundtrip to the scheduler, a.k.a. trampoline.) > > > Would it be possible to fail on "yield"? Silently being slower when you > forget to type a keyword is something I can imagine will creep up a lot by > mistake, and I don't think it's a good idea to silently be slower when the > only different is five more characters. That's also a possibility. If someone can figure out a patch that would be great. -- --Guido van Rossum (python.org/~guido) From jstpierre at mecheye.net Sat Dec 22 02:17:16 2012 From: jstpierre at mecheye.net (Jasper St. Pierre) Date: Fri, 21 Dec 2012 20:17:16 -0500 Subject: [Python-Dev] [Python-ideas] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: Message-ID: On Fri, Dec 21, 2012 at 8:02 PM, Guido van Rossum wrote: ... snip ... In PEP 3156 conformant code you're supposed always to use 'yield > from'. The only time you see a bare yield is when it's part of the > implementation's internals. 
(However I think tulip actually will > handle a yield the same way as a yield from, except that it's slower > because it makes a roundtrip to the scheduler, a.k.a. trampoline.) > Would it be possible to fail on "yield"? Silently being slower when you forget to type a keyword is something I can imagine will creep up a lot by mistake, and I don't think it's a good idea to silently be slower when the only different is five more characters. > Will comment more as I keep reading I'm sure :) > > Please do! > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Jasper -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at hotpy.org Sat Dec 22 12:37:40 2012 From: mark at hotpy.org (Mark Shannon) Date: Sat, 22 Dec 2012 11:37:40 +0000 Subject: [Python-Dev] Testing the tests by modifying the ordering of dict items. In-Reply-To: <1356073531934-5000138.post@n6.nabble.com> References: <4F05A9CC.3000806@hotpy.org> <1356073531934-5000138.post@n6.nabble.com> Message-ID: <50D59B84.6060701@hotpy.org> On 21/12/12 07:05, csebasha wrote: > Hello Mark, > > Did you raise bug for this? > No need now. The hash randomization, which was added a while ago, will render the tests obsolete. Cheers, Mark. From stefan at bytereef.org Sat Dec 22 15:14:27 2012 From: stefan at bytereef.org (Stefan Krah) Date: Sat, 22 Dec 2012 15:14:27 +0100 Subject: [Python-Dev] hg.python.org down Message-ID: <20121222141427.GA28134@sleipnir.bytereef.org> Hi, hg.python.org seems to be unreachable (tested from various IP addresses). Stefan Krah From ncoghlan at gmail.com Sat Dec 22 15:58:15 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 23 Dec 2012 00:58:15 +1000 Subject: [Python-Dev] hg.python.org down In-Reply-To: <20121222141427.GA28134@sleipnir.bytereef.org> References: <20121222141427.GA28134@sleipnir.bytereef.org> Message-ID: On Sun, Dec 23, 2012 at 12:14 AM, Stefan Krah wrote: > Hi, > > hg.python.org seems to be unreachable (tested from various IP addresses). The docs build daemon started complaining on python-checkins about 2:10 pm UTC (on the 22nd), so about the same time you noticed the issue. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From chris.jerdonek at gmail.com Sat Dec 22 21:26:14 2012 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sat, 22 Dec 2012 12:26:14 -0800 Subject: [Python-Dev] hg.python.org down In-Reply-To: References: <20121222141427.GA28134@sleipnir.bytereef.org> Message-ID: On Sat, Dec 22, 2012 at 6:58 AM, Nick Coghlan wrote: > On Sun, Dec 23, 2012 at 12:14 AM, Stefan Krah wrote: >> Hi, >> >> hg.python.org seems to be unreachable (tested from various IP addresses). > > The docs build daemon started complaining on python-checkins about > 2:10 pm UTC (on the 22nd), so about the same time you noticed the > issue. For the record, it seems to be back up. I don't know since when precisely, but the last of the complaints on python-checkins seems to have been about two hours ago. (The complaints were happening every 5 minutes.) 
--Chris

From manisandro at gmail.com Sat Dec 22 20:36:04 2012
From: manisandro at gmail.com (Sandro Mani)
Date: Sat, 22 Dec 2012 20:36:04 +0100
Subject: [Python-Dev] [Distutils, Python3] Incorrect shared library extension on linux
Message-ID: <50D60BA4.8060500@gmail.com>

Hello,

First: I'm using Python3 as available in Fedora rawhide
(python3-3.3.0-2.fc19.x86_64).

Attempting to build a project using python3/distutils, I noticed that
find_library_file would not find any library at all. Some investigation
showed that this was due to the fact that libraries were searched with
the ".cpython-33m.so" extension. Even more investigation showed that
the library extension was being overridden by the one defined in the
/usr/lib64/python3.3/config-3.3m/Makefile shipped by python3-libs. See
below for the detailed analysis. The python-versioned extension
obviously makes no sense for regular shared objects which are not
python binary modules, so this is clearly wrong. As a workaround I
commented out sysconfig.py at customize_compiler::235
(compiler.shared_lib_extension = so_ext, see below), and things seem
to work.

Is this a distribution bug or an upstream bug?

Thanks,

Sandro


Detailed analysis:

setup.py:

    def _find_library_file(self, library):
        return self.compiler.find_library_file(
            self.compiler.library_dirs, library)

---

In function /usr/lib64/python3.3/distutils/unixcompiler.py at find_library_file::266:

    shared_f = self.library_filename(lib, lib_type='shared')

In function /usr/lib64/python3.3/distutils/ccompiler.py at library_filename::882:

    ext = getattr(self, lib_type + "_lib_extension")

-> Where does shared_lib_extension get defined?

* At /usr/lib64/python3.3/distutils/ccompiler.py::66
    shared_lib_extension = None
  -> default for abstract class

* At /usr/lib64/python3.3/distutils/unixcompiler.py::77
    shared_lib_extension = ".so"
  -> this is the correct value

* In function /usr/lib64/python3.3/distutils/sysconfig.py at customize_compiler::235
  by /usr/lib64/python3.3/distutils/sysconfig.py at customize_compiler::235
    compiler.shared_lib_extension = so_ext
  by /usr/lib64/python3.3/distutils/sysconfig.py at customize_compiler::194
    (cc, cxx, opt, cflags, ccshared, ldshared, so_ext, ar, ar_flags) = \
        get_config_vars('CC', 'CXX', 'OPT', 'CFLAGS', 'CCSHARED',
                        'LDSHARED', 'SO', 'AR', 'ARFLAGS'))
  by /usr/lib64/python3.3/distutils/sysconfig.py at get_config_vars::530
    526 global _config_vars
    527 if _config_vars is None:
    528     func = globals().get("_init_" + os.name)  # -> os.name = posix
    529     if func:
    530         func()  # -> _init_posix, populates _config_vars
  by /usr/lib64/python3.3/distutils/sysconfig.py at _init_posix::439
    435 g = {}
    436 # load the installed Makefile:
    437 try:
    438     filename = get_makefile_filename()  # /usr/lib64/python3.3/config-3.3m/Makefile
    439     parse_makefile(filename, g)
    ...
    476 global _config_vars
    477 _config_vars = g  # -> _config_vars["SO"] = ".cpython-33m.so"
  by /usr/lib64/python3.3/config-3.3m/Makefile::122
    SO= .cpython-33m.so

From barry at python.org Sat Dec 22 21:46:40 2012
From: barry at python.org (Barry Warsaw)
Date: Sat, 22 Dec 2012 15:46:40 -0500
Subject: [Python-Dev] [Python-checkins] Cron /home/docs/build-devguide
In-Reply-To: <50D619DB.2080304@udel.edu>
References: <50D619DB.2080304@udel.edu>
Message-ID: <20121222154640.2d41b24e@limelight.wooz.org>

On Dec 22, 2012, at 03:36 PM, Terry Reedy wrote:

>I always reject the requests as I don't believe these messages belong here. I
>even asked, some months ago, on pydev who was responsible for the robot that
>sends these but got no answer.
Today, apparently, another list admin decided >on the opposite response and gave root at python.org blanket permission to flood >this list with irrelavancy. It it not my responsibility and I have no idea >how to fix it. > >While people with push priviliges are supposed to subscribe to this list, I >know there is at least one who unsubscribed because of the volume. This will >only encourage more to leave, so I hope someone can stop it. Actually, I made docs at dinsdale.python.org an acceptable alias so these messages won't just fill up the hold queue of the list. It did get the problem fixed, didn't it? ;) I don't remember the previous conversation. If folks really don't want those messages hitting the checkins list, then errors should probably be sent to some address that can do something about the problem when they occur. Maybe that's not docs at dinsdale.python.org, or maybe that alias should point somewhere else (is that address ever used in the good path?). I have no idea where that address is used. As for the noise issue, well, I hope such failures shouldn't happen very often. We can set up an auto-discard, but then I worry that problems will just go unnoticed for days. -Barry From mal at egenix.com Sat Dec 22 22:17:32 2012 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 22 Dec 2012 22:17:32 +0100 Subject: [Python-Dev] [Python-checkins] Cron /home/docs/build-devguide In-Reply-To: <50D619DB.2080304@udel.edu> References: <50D619DB.2080304@udel.edu> Message-ID: <50D6236C.8040305@egenix.com> On 22.12.2012 21:36, Terry Reedy wrote: > > On 12/22/2012 1:30 PM, Cron Daemon wrote: >> abort: error: Connection timed out >> _______________________________________________ >> Python-checkins mailing list >> Python-checkins at python.org >> http://mail.python.org/mailman/listinfo/python-checkins > > As a volunteer checkin-list admin, I occasionally get messages like this: > ''' > As list administrator, your authorization is requested for the > following mailing list posting: > > List: Python-checkins at python.org > From: root at python.org > Subject: Cron /home/docs/build-devguide > Reason: Message has implicit destination > > At your convenience, visit: > > http://mail.python.org/mailman/admindb/python-checkins > > to approve or deny the request. > ''' > > I always reject the requests as I don't believe these messages belong here. I even asked, some > months ago, on pydev who was responsible for the robot that sends these but got no answer. Today, > apparently, another list admin decided on the opposite response and gave root at python.org blanket > permission to flood this list with irrelavancy. It it not my responsibility and I have no idea how > to fix it. You can add a sender filter to have the messages automatically discarded. > While people with push priviliges are supposed to subscribe to this list, I know there is at least > one who unsubscribed because of the volume. This will only encourage more to leave, so I hope > someone can stop it. I think such messages should go to a sys admin list. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Dec 22 2012) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2012-12-14: Released mxODBC.Connect 2.0.2 ... http://egenix.com/go38 2013-01-22: Python Meeting Duesseldorf ... 
31 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From tjreedy at udel.edu Sun Dec 23 00:13:44 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 22 Dec 2012 18:13:44 -0500 Subject: [Python-Dev] Cron /home/docs/build-devguide In-Reply-To: <20121222154640.2d41b24e@limelight.wooz.org> References: <50D619DB.2080304@udel.edu> <20121222154640.2d41b24e@limelight.wooz.org> Message-ID: On 12/22/2012 3:46 PM, Barry Warsaw wrote: > On Dec 22, 2012, at 03:36 PM, Terry Reedy wrote: > Actually, I made docs at dinsdale.python.org an acceptable alias so these > messages won't just fill up the hold queue of the list. It did get the > problem fixed, didn't it? ;) It solved the admin problem for we two, but over 100(?) people, including me, got a slew of bogus checkin messages. > I don't remember the previous conversation. If folks really don't want those > messages hitting the checkins list, then errors should probably be sent to > some address that can do something about the problem when they occur. Maybe > that's not docs at dinsdale.python.org, or maybe that alias should point > somewhere else (is that address ever used in the good path?). I have no idea > where that address is used. > > As for the noise issue, well, I hope such failures shouldn't happen very > often. We can set up an auto-discard, but then I worry that problems will > just go unnoticed for days. They should be sent to the small group of people who can usefully respond to them. -- Terry Jan Reedy From solipsis at pitrou.net Sun Dec 23 12:22:54 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 23 Dec 2012 12:22:54 +0100 Subject: [Python-Dev] [Distutils, Python3] Incorrect shared library extension on linux References: <50D60BA4.8060500@gmail.com> Message-ID: <20121223122254.1032af6b@pitrou.net> Hello, On Sat, 22 Dec 2012 20:36:04 +0100 Sandro Mani wrote: > Hello, > > First: I'm using Python3 as available in Fedora rawhide > (python3-3.3.0-2.fc19.x86_64). > > Attempting to build a project using python3/distutils, I noticed that > find_library_file would not find any library at all. Some investigation > showed that this was due to the fact that libraries were searched with > the ".cpython-33m.so" extension. Even more investigation showed that the > library extension was read being overridden by the one defined in the > /usr/lib64/python3.3/config-3.3m/Makefile shipped by python3-libs. See > below for the detailed analysis. The python-versioned extension > obviously makes no sense for regular shared objects which are not python > binary modules, so this is clearly wrong. As a workaround I commented > sysconfig.py at customize_compiler::235 (compiler.shared_lib_extension = > so_ext, see below), and things seem to work. > > Is this a distribution bug or an upstream bug? Probably an upstream bug, I suggest you file it at http://bugs.python.org. Regards Antoine. 
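(Until this is resolved in the stdlib, the same workaround can be
applied from a setup.py instead of editing sysconfig.py -- a sketch
only, with an illustrative class name:)

    from distutils.command.build_ext import build_ext

    class BuildExtWorkaround(build_ext):
        def build_extensions(self):
            # Undo the python-versioned suffix that customize_compiler()
            # copied from the Makefile's SO variable, so that
            # find_library_file() looks for plain ".so" files again.
            self.compiler.shared_lib_extension = '.so'
            super().build_extensions()

    # and pass cmdclass={'build_ext': BuildExtWorkaround} to setup().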
From tjreedy at udel.edu Sun Dec 23 21:03:36 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 23 Dec 2012 15:03:36 -0500
Subject: [Python-Dev] [Python-checkins] cpython (3.3): Issue #16045: add more unit tests for built-in int()
In-Reply-To: <3YTgPm4v0RzRXN@mail.python.org>
References: <3YTgPm4v0RzRXN@mail.python.org>
Message-ID: <50D76398.8040403@udel.edu>

> +    # For example, PyPy 1.9.0 raised TypeError for these cases because it
> +    # expects x to be a string if base is given.
> +    @support.cpython_only
> +    def test_base_arg_with_no_x_arg(self):
> +        self.assertEquals(int(base=6), 0)
> +        # Even invalid bases don't raise an exception.
> +        self.assertEquals(int(base=1), 0)
> +        self.assertEquals(int(base=1000), 0)
> +        self.assertEquals(int(base='foo'), 0)

I think the above behavior is buggy and should be changed rather than
frozen into CPython with a test. According to the docs, PyPy does it
right.

The current online doc gives the signature as
int(x=0)
int(x, base=10) <>

The 3.3.0 docstring says
"When converting a string, use the optional base. It is an error to
supply a base when converting a non-string."

Certainly, accepting any object as a base, violating "The allowed
values are 0 and 2-36." just because giving a base is itself invalid
is crazy.

--
Terry Jan Reedy

From chris.jerdonek at gmail.com Sun Dec 23 22:47:46 2012
From: chris.jerdonek at gmail.com (Chris Jerdonek)
Date: Sun, 23 Dec 2012 13:47:46 -0800
Subject: [Python-Dev] [Python-checkins] cpython (3.3): Issue #16045: add more unit tests for built-in int()
In-Reply-To: <50D76398.8040403@udel.edu>
References: <3YTgPm4v0RzRXN@mail.python.org> <50D76398.8040403@udel.edu>
Message-ID:

On Sun, Dec 23, 2012 at 12:03 PM, Terry Reedy wrote:
>
>> +    # For example, PyPy 1.9.0 raised TypeError for these cases because it
>> +    # expects x to be a string if base is given.
>> +    @support.cpython_only
>> +    def test_base_arg_with_no_x_arg(self):
>> +        self.assertEquals(int(base=6), 0)
>> +        # Even invalid bases don't raise an exception.
>> +        self.assertEquals(int(base=1), 0)
>> +        self.assertEquals(int(base=1000), 0)
>> +        self.assertEquals(int(base='foo'), 0)
>
> I think the above behavior is buggy and should be changed rather than frozen
> into CPython with a test. According to the docs, PyPy does it right.

I support further discussion here. (I did draft the patch, but it was
a first version. I did not commit the patch.)

> The current online doc gives the signature as
> int(x=0)
> int(x, base=10) <>
>
> The 3.3.0 docstring says
> "When converting a string, use the optional base. It is an error to supply
> a base when converting a non-string."

One way to partially explain CPython's behavior is that when base is
provided, the function behaves as if x defaults to '0' rather than 0.
This is similar to the behavior of str(), which defaults to b'' when
encoding or errors is provided, but otherwise defaults to '':

http://docs.python.org/dev/library/stdtypes.html#str

> Certainly, accepting any object as a base, violating "The allowed values are
> 0 and 2-36." just because giving a base is itself invalid is crazy.

For further background (and you can see this is the 2.7 commit),
int(base='foo') did raise TypeError in 2.7, but this particular case
was relaxed in Python 3.
--Chris From tjreedy at udel.edu Mon Dec 24 03:19:14 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 23 Dec 2012 21:19:14 -0500 Subject: [Python-Dev] [Python-checkins] cpython (3.3): Issue #16045: add more unit tests for built-in int() In-Reply-To: References: <3YTgPm4v0RzRXN@mail.python.org> <50D76398.8040403@udel.edu> Message-ID: On 12/23/2012 4:47 PM, Chris Jerdonek wrote: > On Sun, Dec 23, 2012 at 12:03 PM, Terry Reedy wrote: >> >>> + # For example, PyPy 1.9.0 raised TypeError for these cases because it >>> + # expects x to be a string if base is given. >>> + @support.cpython_only >>> + def test_base_arg_with_no_x_arg(self): >>> + self.assertEquals(int(base=6), 0) >>> + # Even invalid bases don't raise an exception. >>> + self.assertEquals(int(base=1), 0) >>> + self.assertEquals(int(base=1000), 0) >>> + self.assertEquals(int(base='foo'), 0) >> >> >> I think the above behavior is buggy and should be changed rather than frozen >> into CPython with a test. According to the docs, PyPy does it right. In any case, the discrepancy between doc and behavior is a bug and should be fixed one way or the other way. Unlike int(), I do not see a realistic use case for int(base=x) that would make it anything other than a bug. > I support further discussion here. (I did draft the patch, but it was > a first version. I did not commit the patch.) > >> The current online doc gives the signature as >> int(x=0) >> int(x, base=10) <> >> >> The 3.3.0 docstring says >> "When converting a string, use the optional base. It is an error to supply >> a base when converting a non-string." > > One way to partially explain CPython's behavior is that when base is > provided, the function behaves as if x defaults to '0' rather than 0. That explanation does not work. int('0', base = invalid) and int(x='0', base=invalid) raise TypeError or ValueError. If providing a value explicit changes behavior, then that value is not the default. To make '0' really be the base-present default, the doc and above behavior should be changed. Or, make '' the default and have int('', base=whatever) return 0 instead of raising. (This would be the actual parallel to the str case.) > This is similar to the behavior of str(), which defaults to b'' when > encoding or errors is provided, but otherwise defaults to '': This is different. Providing b'' explicitly has no effect. str(encoding=x, errors=y) and str(b'', encoding=x, errors=y) act the same. If x or y is not a string, both raise TypeError. (Unlike int and base.) A bad encoding string is ignored because the encoding lookup is not done unless there is something to encode. (This is why the ignore-base base-default should be '', not '0'.) A bad error specification is (I believe) ignored for any error-free bytes/encoding pair because, again, the lookup is only done when needed. > http://docs.python.org/dev/library/stdtypes.html#str > >> Certainly, accepting any object as a base, violating "The allowed values are >> 0 and 2-36." just because giving a base is itself invalid is crazy. > > For further background (and you can see this is the 2.7 commit), > int(base='foo') did raise TypeError in 2.7, but this particular case > was relaxed in Python 3. Since the doc was not changed, that introduced a bug. 
-- Terry Jan Reedy From chris.jerdonek at gmail.com Mon Dec 24 04:24:44 2012 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sun, 23 Dec 2012 19:24:44 -0800 Subject: [Python-Dev] [Python-checkins] cpython (3.3): Issue #16045: add more unit tests for built-in int() In-Reply-To: References: <3YTgPm4v0RzRXN@mail.python.org> <50D76398.8040403@udel.edu> Message-ID: On Sun, Dec 23, 2012 at 6:19 PM, Terry Reedy wrote: > On 12/23/2012 4:47 PM, Chris Jerdonek wrote: >> On Sun, Dec 23, 2012 at 12:03 PM, Terry Reedy wrote: >>> >>>> + # For example, PyPy 1.9.0 raised TypeError for these cases because >>>> it >>>> + # expects x to be a string if base is given. >>>> + @support.cpython_only >>>> + def test_base_arg_with_no_x_arg(self): >>>> + self.assertEquals(int(base=6), 0) >>>> + # Even invalid bases don't raise an exception. >>>> + self.assertEquals(int(base=1), 0) >>>> + self.assertEquals(int(base=1000), 0) >>>> + self.assertEquals(int(base='foo'), 0) >>> >>> I think the above behavior is buggy and should be changed rather than >>> frozen >>> into CPython with a test. According to the docs, PyPy does it right. > > In any case, the discrepancy between doc and behavior is a bug and should be > fixed one way or the other way. Unlike int(), I do not see a realistic use > case for int(base=x) that would make it anything other than a bug. Just to be clear, I agree with you that something needs fixing (and again, I did not commit the patch). But I want to clarify a couple of your responses to my points. >> One way to partially explain CPython's behavior is that when base is >> provided, the function behaves as if x defaults to '0' rather than 0. > > > That explanation does not work. int('0', base = invalid) and int(x='0', > base=invalid) raise TypeError or ValueError. I was referring to the behavioral discrepancy between CPython returning 0 for int(base=valid) and the part of the docstring you quoted which says, "It is an error to supply a base when converting a non-string." I wasn't justifying the case of int(base=invalid). That's why I said "partially" explains. The int(base=valid) case is covered by the following line of the CPython-specific test that was committed (which in PyPy raises TypeError): + self.assertEquals(int(base=6), 0) > If providing a value explicit > changes behavior, then that value is not the default. To make '0' really be > the base-present default, the doc and above behavior should be changed. Or, > make '' the default and have int('', base=whatever) return 0 instead of > raising. (This would be the actual parallel to the str case.) >> This is similar to the behavior of str(), which defaults to b'' when >> encoding or errors is provided, but otherwise defaults to '': > > This is different. Providing b'' explicitly has no effect. > str(encoding=x, errors=y) and str(b'', encoding=x, errors=y) act the same. > If x or y is not a string, both raise TypeError. (Unlike int and base.) A > bad encoding string is ignored because the encoding lookup is not done > unless there is something to encode. (This is why the ignore-base > base-default should be '', not '0'.) A bad error specification is (I > believe) ignored for any error-free bytes/encoding pair because, again, the > lookup is only done when needed. Again, I was referring to the "valid" case. My point was that str()'s object argument defaults to '' when encoding or errors isn't given, and otherwise defaults to b''. 
You can see that the object argument defaults to '' in the simpler case here: >>> str(), str(object=''), str(object=b'') ('', '', "b''") But when the encoding argument is given the default is different (it is b''): >>> str(object='', encoding='utf-8') TypeError: decoding str is not supported >>> str(encoding='utf-8'), str(object=b'', encoding='utf-8') ('', '') But again, these are clarifications of my comments. I'm not disagreeing with your larger point. --Chris From ajaygargnsit at gmail.com Mon Dec 24 08:57:06 2012 From: ajaygargnsit at gmail.com (Ajay Garg) Date: Mon, 24 Dec 2012 13:27:06 +0530 Subject: [Python-Dev] Is it possible to switch into the context of a child-process, spawned by "subprocess" module? Message-ID: Hi all. This is more about knowing whether something is possible in the core Python architecture; hence the question to this mailing list :) I have a situation where I am spawning a child process via the "subprocess" module. This child process is equivalent to the process that would have been created if I had run a vanilla python-script in another shell. In this (new) (child) process, new objects are instantiated, and methods get called on those objects as usual. Now, what I need is to somehow switch into this (new) (child) process from the current (parent) process, and be able to call methods-on-the-objects-of-the-child-process. Also, please note that since the child process contains a GUI, I intend for the results of calling the methods-on-the-objects-of-the-child-process to be reflected in the child-process GUI. Is it possible? Or am I trying to achieve something impossible as per the core Python architecture? I will be thankful for any pointers regarding this. Regards, Ajay -------------- next part -------------- An HTML attachment was scrubbed... URL: From ajaygargnsit at gmail.com Mon Dec 24 09:26:20 2012 From: ajaygargnsit at gmail.com (Ajay Garg) Date: Mon, 24 Dec 2012 13:56:20 +0530 Subject: [Python-Dev] Running GUI and "GObject.mainloop.run()" together? Message-ID: Hi all. This is another question that arises as part of my efforts to run a GUI application, as well as a dbus-service, within the same process. (The other question being at http://mail.python.org/pipermail/python-dev/2012-December/123287.html) For a recap of the brief history, I have a parent process, that is spawning a child process via "subprocess". Currently, the child-process is a GUI process; however, I intend to "behave" it as a dbus-service as well. Thus, a) I subclassed the child-process "main" class from "dbus.service.Object"; however, I then got the error "metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases" b) I then used composition, wherein another class, "RemoteListener" deriving from "dbus.service.Object" was made an attribute of the "main" class. That worked. However, when I do dbus.mainloop.glib.DBusGMainLoop(set_as_default=True) GObject.mainloop.run() in the "RemoteListener"'s __init__ method, the GUI of the "main" class fails to load (apparently, because the "mainloop.run()" causes the singular, main-thread to go into busy-wait). c) I tried option b), but now instantiating "RemoteListener" in a separate thread; however, no improvement :( Is there a way to run GUI and a dbus-service together? Or is a dbus-service a pure "backend" process?
:P I will be grateful for any pointers; if you need me to test anything else, please let me know, I will be more than willing :) Regards, Ajay -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.svetlov at gmail.com Mon Dec 24 09:36:52 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 24 Dec 2012 10:36:52 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Fix #14470. Remove w9xpopen per PEP 11. In-Reply-To: <3YTzS91KzGzRgT@mail.python.org> References: <3YTzS91KzGzRgT@mail.python.org> Message-ID: You missed artifacts in ./PC/VC6 ./PC/VS7.1 ./PC/VS8.0 ./PC/VS9.0 On Mon, Dec 24, 2012 at 12:55 AM, brian.curtin wrote: > http://hg.python.org/cpython/rev/c903e4f1121d > changeset: 81005:c903e4f1121d > parent: 81003:e3d0417d8266 > user: Brian Curtin > date: Sun Dec 23 16:53:21 2012 -0600 > summary: > Fix #14470. Remove w9xpopen per PEP 11. > > As stated in PEP 11, 3.4 removes code on Windows platforms where > COMSPEC points to command.com. The w9xpopen project in Visual Studio > was added to support that case, and there was a special case in subprocess > to cover that situation. This change removes the w9xpopen project from > the Visual Studio solution and removes any references to the w9xpopen > executable. > > files: > Lib/subprocess.py | 32 -- > PC/w9xpopen.c | 112 ------- > PCbuild/pcbuild.sln | 2 - > PCbuild/python.vcxproj | 6 +- > PCbuild/w9xpopen.vcxproj | 287 ------------------- > PCbuild/w9xpopen.vcxproj.filters | 13 - > Tools/msi/msi.py | 2 - > 7 files changed, 1 insertions(+), 453 deletions(-) > > > diff --git a/Lib/subprocess.py b/Lib/subprocess.py > --- a/Lib/subprocess.py > +++ b/Lib/subprocess.py > @@ -1029,23 +1029,6 @@ > return Handle(h) > > > - def _find_w9xpopen(self): > - """Find and return absolut path to w9xpopen.exe""" > - w9xpopen = os.path.join( > - os.path.dirname(_winapi.GetModuleFileName(0)), > - "w9xpopen.exe") > - if not os.path.exists(w9xpopen): > - # Eeek - file-not-found - possibly an embedding > - # situation - see if we can locate it in sys.exec_prefix > - w9xpopen = os.path.join(os.path.dirname(sys.base_exec_prefix), > - "w9xpopen.exe") > - if not os.path.exists(w9xpopen): > - raise SubprocessError( > - "Cannot locate w9xpopen.exe, which is needed for " > - "Popen to work with your shell or platform.") > - return w9xpopen > - > - > def _execute_child(self, args, executable, preexec_fn, close_fds, > pass_fds, cwd, env, > startupinfo, creationflags, shell, > @@ -1074,21 +1057,6 @@ > startupinfo.wShowWindow = _winapi.SW_HIDE > comspec = os.environ.get("COMSPEC", "cmd.exe") > args = '{} /c "{}"'.format (comspec, args) > - if (_winapi.GetVersion() >= 0x80000000 or > - os.path.basename(comspec).lower() == "command.com"): > - # Win9x, or using command.com on NT. We need to > - # use the w9xpopen intermediate program. For more > - # information, see KB Q150956 > - # (http://web.archive.org/web/20011105084002/http://support.microsoft.com/support/kb/articles/Q150/9/56.asp) > - w9xpopen = self._find_w9xpopen() > - args = '"%s" %s' % (w9xpopen, args) > - # Not passing CREATE_NEW_CONSOLE has been known to > - # cause random failures on win9x. Specifically a > - # dialog: "Your program accessed mem currently in > - # use at xxx" and a hopeful warning about the > - # stability of your system. Cost is Ctrl+C won't > - # kill children. 
> - creationflags |= _winapi.CREATE_NEW_CONSOLE > > # Start the process > try: > diff --git a/PC/w9xpopen.c b/PC/w9xpopen.c > deleted file mode 100644 > --- a/PC/w9xpopen.c > +++ /dev/null > @@ -1,112 +0,0 @@ > -/* > - * w9xpopen.c > - * > - * Serves as an intermediate stub Win32 console application to > - * avoid a hanging pipe when redirecting 16-bit console based > - * programs (including MS-DOS console based programs and batch > - * files) on Window 95 and Windows 98. > - * > - * This program is to be launched with redirected standard > - * handles. It will launch the command line specified 16-bit > - * console based application in the same console, forwarding > - * its own redirected standard handles to the 16-bit child. > - > - * AKA solution to the problem described in KB: Q150956. > - */ > - > -#define WIN32_LEAN_AND_MEAN > -#include <windows.h> > -#include <stdio.h> > -#include <stdlib.h> /* for malloc and its friends */ > - > -const char *usage = > -"This program is used by Python's os.popen function\n" > -"to work around a limitation in Windows 95/98. It is\n" > -"not designed to be used as a stand-alone program."; > - > -int main(int argc, char *argv[]) > -{ > - BOOL bRet; > - STARTUPINFO si; > - PROCESS_INFORMATION pi; > - DWORD exit_code=0; > - size_t cmdlen = 0; > - int i; > - char *cmdline, *cmdlinefill; > - > - if (argc < 2) { > - if (GetFileType(GetStdHandle(STD_INPUT_HANDLE))==FILE_TYPE_CHAR) > - /* Attached to a console, and therefore not executed by Python > - Display a message box for the inquisitive user > - */ > - MessageBox(NULL, usage, argv[0], MB_OK); > - else { > - /* Eeek - executed by Python, but args are screwed! > - Write an error message to stdout so there is at > - least some clue for the end user when it appears > - in their output. > - A message box would be hidden and blocks the app. > - */ > - fprintf(stdout, "Internal popen error - no args specified\n%s\n", usage); > - } > - return 1; > - } > - /* Build up the command-line from the args. > - Args with a space are quoted, existing quotes are escaped. > - To keep things simple calculating the buffer size, we assume > - every character is a quote - ie, we allocate double what we need > - in the worst case. As this is only double the command line passed > - to us, there is a good chance this is reasonably small, so the total > - allocation will almost always be < 512 bytes. > - */ > - for (i=1;i<argc;i++) > - cmdlen += strlen(argv[i])*2 + 3; /* one space, maybe 2 quotes */ > - cmdline = cmdlinefill = (char *)malloc(cmdlen+1); > - if (cmdline == NULL) > - return -1; > - for (i=1;i<argc;i++) { > - const char *arglook; > - int bQuote = strchr(argv[i], ' ') != NULL; > - if (bQuote) > - *cmdlinefill++ = '"'; > - /* escape quotes */ > - for (arglook=argv[i];*arglook;arglook++) { > - if (*arglook=='"') > - *cmdlinefill++ = '\\'; > - *cmdlinefill++ = *arglook; > - } > - if (bQuote) > - *cmdlinefill++ = '"'; > - *cmdlinefill++ = ' '; > - } > - *cmdlinefill = '\0'; > - > - /* Make child process use this app's standard files.
*/ > - ZeroMemory(&si, sizeof si); > - si.cb = sizeof si; > - si.dwFlags = STARTF_USESTDHANDLES; > - si.hStdInput = GetStdHandle(STD_INPUT_HANDLE); > - si.hStdOutput = GetStdHandle(STD_OUTPUT_HANDLE); > - si.hStdError = GetStdHandle(STD_ERROR_HANDLE); > - > - bRet = CreateProcess( > - NULL, cmdline, > - NULL, NULL, > - TRUE, 0, > - NULL, NULL, > - &si, &pi > - ); > - > - free(cmdline); > - > - if (bRet) { > - if (WaitForSingleObject(pi.hProcess, INFINITE) != WAIT_FAILED) { > - GetExitCodeProcess(pi.hProcess, &exit_code); > - } > - CloseHandle(pi.hProcess); > - CloseHandle(pi.hThread); > - return exit_code; > - } > - > - return 1; > -} > diff --git a/PCbuild/pcbuild.sln b/PCbuild/pcbuild.sln > --- a/PCbuild/pcbuild.sln > +++ b/PCbuild/pcbuild.sln > @@ -14,8 +14,6 @@ > EndProject > Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "pythonw", "pythonw.vcxproj", "{F4229CC3-873C-49AE-9729-DD308ED4CD4A}" > EndProject > -Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "w9xpopen", "w9xpopen.vcxproj", "{E9E0A1F6-0009-4E8C-B8F8-1B8F5D49A058}" > -EndProject > Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "make_buildinfo", "make_buildinfo.vcxproj", "{C73F0EC1-358B-4177-940F-0846AC8B04CD}" > EndProject > Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "winsound", "winsound.vcxproj", "{28B5D777-DDF2-4B6B-B34F-31D938813856}" > diff --git a/PCbuild/python.vcxproj b/PCbuild/python.vcxproj > --- a/PCbuild/python.vcxproj > +++ b/PCbuild/python.vcxproj > @@ -357,12 +357,8 @@ > <Project>{cf7ac3d1-e2df-41d2-bea6-1e2556cdea26}</Project> > <ReferenceOutputAssembly>false</ReferenceOutputAssembly> > </ProjectReference> > - <ProjectReference Include="w9xpopen.vcxproj"> > - <Project>{e9e0a1f6-0009-4e8c-b8f8-1b8f5d49a058}</Project> > - <ReferenceOutputAssembly>false</ReferenceOutputAssembly> > - </ProjectReference> > </ItemGroup> > <Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" /> > <ImportGroup Label="ExtensionTargets"> > </ImportGroup> > -</Project> > \ No newline at end of file > +</Project> > diff --git a/PCbuild/w9xpopen.vcxproj b/PCbuild/w9xpopen.vcxproj > deleted file mode 100644 > --- a/PCbuild/w9xpopen.vcxproj > +++ /dev/null > @@ -1,287 +0,0 @@ > -?
> - [287 lines of MSBuild XML deleted here: the whole of PCbuild/w9xpopen.vcxproj (Debug/Release/PGInstrument/PGUpdate configurations for Win32 and x64). The mail archiver stripped the XML markup, leaving only stray element values, so the hunk is not reproduced.] > diff --git a/PCbuild/w9xpopen.vcxproj.filters b/PCbuild/w9xpopen.vcxproj.filters > deleted file mode 100644 > --- a/PCbuild/w9xpopen.vcxproj.filters > +++ /dev/null > @@ -1,13 +0,0 @@ > -?
> - [13 lines of XML deleted here: the whole of PCbuild/w9xpopen.vcxproj.filters (a "Source Files" filter with GUID {abc2dffd-3f2a-47bd-b89b-0314c99ef21e}); the markup was likewise stripped by the archiver.] > diff --git a/Tools/msi/msi.py b/Tools/msi/msi.py > --- a/Tools/msi/msi.py > +++ b/Tools/msi/msi.py > @@ -956,8 +956,6 @@ > # Add all executables, icons, text files into the TARGETDIR component > root = PyDirectory(db, cab, None, srcdir, "TARGETDIR", "SourceDir") > default_feature.set_current() > - if not msilib.Win64: > - root.add_file("%s/w9xpopen.exe" % PCBUILD) > root.add_file("README.txt", src="README") > root.add_file("NEWS.txt", src="Misc/NEWS") > generate_license() > > -- > Repository URL: http://hg.python.org/cpython > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > -- Thanks, Andrew Svetlov From storchaka at gmail.com Mon Dec 24 11:27:30 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Mon, 24 Dec 2012 12:27:30 +0200 Subject: [Python-Dev] [Python-checkins] cpython (3.3): Issue #16045: add more unit tests for built-in int() In-Reply-To: <50D76398.8040403@udel.edu> References: <3YTgPm4v0RzRXN@mail.python.org> <50D76398.8040403@udel.edu> Message-ID: On 23.12.12 22:03, Terry Reedy wrote: > I think the above behavior is buggy and should be changed rather than > frozen into CPython with a test. According to the docs, PyPy does it right. http://bugs.python.org/issue16761 From tjreedy at udel.edu Mon Dec 24 12:21:24 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 24 Dec 2012 06:21:24 -0500 Subject: [Python-Dev] Is it possible to switch into the context of a child-process, spawned by "subprocess" module? In-Reply-To: References: Message-ID: On 12/24/2012 2:57 AM, Ajay Garg wrote: > Hi all. > > This is more about knowing whether something is possible in the core Python > architecture; hence the question to this mailing list :) The pydev list is for discussion of development of future Python, not for how to use current Python. Usage questions should go to python-list or other forums. (I am 99.99% sure the answer to this one is no.) -- Terry Jan Reedy From techtonik at gmail.com Mon Dec 24 14:42:20 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 24 Dec 2012 16:42:20 +0300 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> Message-ID: What should I do in case Eric lost interest after his GSoC project for PSF appeared as useless for python-dev community? Should I rewrite the proposal from scratch? On Thu, Dec 20, 2012 at 11:18 PM, Brett Cannon wrote: > You cannot rewrite an existing PEP if you are not one of the original > owners, nor can you add yourself as an author to a PEP without permission > from the original authors. > > And please do not CC the peps mailing list on discussions. It should only > be used to mail in new PEPs or acceptable patches to PEPs. > > > On Wed, Dec 19, 2012 at 5:20 PM, anatoly techtonik wrote: > >> On Sun, Dec 9, 2012 at 7:17 AM, Gregory P. Smith wrote: >> >>> I'm really not sure what this PEP is trying to get at given that it >>> contains no examples and sounds from the descriptions to be adding a >>> complicated api on top of something that already, IMNSHO, has too much in it >>> (subprocess.Popen).
>>> >>> Regardless, any user can use the stdout/err/in file objects with their >>> own code that handles them asynchronously (yes that can be painful but that >>> is what is required for _any_ socket or pipe I/O you don't want to block >>> on). >>> >> >> And how does one use stdout/stderr/stdin asynchronously in a cross-platform manner? >> IIUC the problem is that every read is blocking. >> >> >>> It *sounds* to me like this entire PEP could be written and released as >>> a third party module on PyPI that offers a subprocess.Popen subclass adding >>> some more convenient non-blocking APIs. That's where I'd start if I were >>> interested in this as a future feature. >>> >> >> I've rewritten the PEP based on how I understand the code. I don't >> know how to update it or how to comply with the open documentation license, so >> I just attach it and add the PEPs list to CC. I too have a feeling that the PEP >> should be stripped of the additional high-level API until the low-level >> functionality is well understood and accepted. >> >> -- >> anatoly t. >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/brett%40python.org >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From joaquinsargiotto at gmail.com Mon Dec 24 15:26:37 2012 From: joaquinsargiotto at gmail.com (Joaquin Sargiotto) Date: Mon, 24 Dec 2012 11:26:37 -0300 Subject: [Python-Dev] Is it possible to switch into the context of a child-process, spawned by "subprocess" module? In-Reply-To: References: Message-ID: On Dec 24, 2012, 4:59 a.m., "Ajay Garg" wrote: > > Hi all. > > This is more about knowing whether something is possible in the core Python architecture; hence the question to this mailing list :) > > I have a situation where I am spawning a child process via the "subprocess" module. > This child process is equivalent to the process that would have been created if I had run a vanilla python-script in another shell. > > In this (new) (child) process, new objects are instantiated, and methods get called on those objects as usual. > > Now, what I need is to somehow switch into this (new) (child) process from the current (parent) process, and be able to call methods-on-the-objects-of-the-child-process. > Also, please note that since the child process contains a GUI, I intend for the results of calling the methods-on-the-objects-of-the-child-process to be reflected in the child-process GUI. > > > Is it possible? Or am I trying to achieve something impossible as per the core Python architecture? > Hint: xmlrpclib. And that should be the end of this thread. Regards > > > I will be thankful for any pointers regarding this. > > Regards, > Ajay > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/joaquinsargiotto%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ajaygargnsit at gmail.com Mon Dec 24 17:02:41 2012 From: ajaygargnsit at gmail.com (Ajay Garg) Date: Mon, 24 Dec 2012 21:32:41 +0530 Subject: [Python-Dev] Is it possible to switch into the context of a child-process, spawned by "subprocess" module? In-Reply-To: References: Message-ID: Terry, Sorry; and thanks for the info.
Joaquin, Thanks for the pointer; I will investigate :) On Mon, Dec 24, 2012 at 7:56 PM, Joaquin Sargiotto < joaquinsargiotto at gmail.com> wrote: > > On Dec 24, 2012, 4:59 a.m., "Ajay Garg" wrote: > > > > > Hi all. > > > > This is more about knowing whether something is possible in the core Python > architecture; hence the question to this mailing list :) > > > > I have a situation where I am spawning a child process via the "subprocess" > module. > > This child process is equivalent to the process that would have been > created if I had run a vanilla python-script in another shell. > > > > In this (new) (child) process, new objects are instantiated, and methods > get called on those objects as usual. > > > > Now, what I need is to somehow switch into this (new) (child) process > from the current (parent) process, and be able to call > methods-on-the-objects-of-the-child-process. > > Also, please note that since the child process contains a GUI, I intend > for the results of calling the methods-on-the-objects-of-the-child-process > to be reflected in the child-process GUI. > > > > > > Is it possible? Or am I trying to achieve something impossible as per > the core Python architecture? > > > > Hint: xmlrpclib. > > And that should be the end of this thread. > > Regards > > > > > > > I will be thankful for any pointers regarding this. > > > > Regards, > > Ajay > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/joaquinsargiotto%40gmail.com > > > > -- Regards, Ajay -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian at python.org Mon Dec 24 18:02:27 2012 From: brian at python.org (Brian Curtin) Date: Mon, 24 Dec 2012 11:02:27 -0600 Subject: [Python-Dev] [Python-checkins] cpython: Fix #14470. Remove w9xpopen per PEP 11. In-Reply-To: References: <3YTzS91KzGzRgT@mail.python.org> Message-ID: On Mon, Dec 24, 2012 at 2:36 AM, Andrew Svetlov wrote: > You missed artifacts in ./PC/VC6 ./PC/VS7.1 ./PC/VS8.0 ./PC/VS9.0 Fixed in http://hg.python.org/cpython/rev/deee9f0a4b98 Also reported http://bugs.python.org/issue16769 about removing some of those directories because they are pretty much useless. From jeremy.kloth at gmail.com Mon Dec 24 20:16:08 2012 From: jeremy.kloth at gmail.com (Jeremy Kloth) Date: Mon, 24 Dec 2012 12:16:08 -0700 Subject: [Python-Dev] [Python-checkins] cpython: Use OESeeror instead of os.error (#16720) In-Reply-To: <3YVSsD55GszRnt@mail.python.org> References: <3YVSsD55GszRnt@mail.python.org> Message-ID: On Mon, Dec 24, 2012 at 11:00 AM, andrew.svetlov wrote: > http://hg.python.org/cpython/rev/6cfe2982de42 > changeset: 81017:6cfe2982de42 > parent: 81011:a7c9869a5114 > user: Andrew Svetlov > date: Mon Dec 24 19:58:48 2012 +0200 > summary: > Use OESeeror instead of os.error (#16720) > > diff --git a/Lib/os.py b/Lib/os.py > --- a/Lib/os.py > +++ b/Lib/os.py > @@ -275,7 +275,7 @@ > while head and tail: > try: > rmdir(head) > - except error: > + except OSrror: > break > head, tail = path.split(head) > Shouldn't that be OSError? From andrew.svetlov at gmail.com Mon Dec 24 20:48:27 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Mon, 24 Dec 2012 21:48:27 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Use OESeeror instead of os.error (#16720) In-Reply-To: References: <3YVSsD55GszRnt@mail.python.org> Message-ID: Sorry, my bad.
Fixed in e2e5181b10f8 On Mon, Dec 24, 2012 at 9:16 PM, Jeremy Kloth wrote: > On Mon, Dec 24, 2012 at 11:00 AM, andrew.svetlov > wrote: >> http://hg.python.org/cpython/rev/6cfe2982de42 >> changeset: 81017:6cfe2982de42 >> parent: 81011:a7c9869a5114 >> user: Andrew Svetlov >> date: Mon Dec 24 19:58:48 2012 +0200 >> summary: >> Use OESeeror instead of os.error (#16720) >> >> diff --git a/Lib/os.py b/Lib/os.py >> --- a/Lib/os.py >> +++ b/Lib/os.py >> @@ -275,7 +275,7 @@ >> while head and tail: >> try: >> rmdir(head) >> - except error: >> + except OSrror: >> break >> head, tail = path.split(head) >> > > Shouldn't that be OSError? -- Thanks, Andrew Svetlov From storchaka at gmail.com Mon Dec 24 20:59:29 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Mon, 24 Dec 2012 21:59:29 +0200 Subject: [Python-Dev] [Python-checkins] cpython: Use OESeeror instead of os.error (#16720) In-Reply-To: References: <3YVSsD55GszRnt@mail.python.org> Message-ID: On 24.12.12 21:48, Andrew Svetlov wrote: > Sorry, my bad. Fixed in e2e5181b10f8 It's my fault. Sorry. >>> summary: >>> Use OESeeror instead of os.error (#16720) But it's not my. ;) From glyph at twistedmatrix.com Mon Dec 24 23:58:17 2012 From: glyph at twistedmatrix.com (Glyph) Date: Mon, 24 Dec 2012 14:58:17 -0800 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: <20121221204411.1269322b@pitrou.net> Message-ID: <41C7D2F7-CA38-46C0-8477-B3F2FA034712@twistedmatrix.com> On Dec 21, 2012, at 1:10 PM, Guido van Rossum wrote: > > > The transport is free to buffer the bytes, but it must eventually > > > cause the bytes to be transferred to the entity at the other end, and > > > it must maintain stream behavior. That is, t.write(b'abc'); > > > t.write(b'def') is equivalent to t.write(b'abcdef') > > > > I think this is a bad idea. The kernel's network stack should do the > > buffering (and choose appropriate algorithms for that), not the > > user-level framework. The transport should write the bytes as soon as > > the fd is ready for writing, and it should write the same chunks as > > given by the user, not a concatenation of them. > > I asked Glyph about this. It depends on the OS... Mac syscalls are so slow that it is better to join in user space. This should really be up to the transport, although for stream transports the given equivalency should definitely hold. > It's not so much that "mac syscalls are slow" as that "syscalls are not free, and the cost varies". Older versions of MacOS were particularly bad. Some versions of Linux had bizarre regressions in the performance of send() or recv() or pipe(). The things that pass for syscalls on Windows can be particularly catastrophically slow (although this is practically a consideration for filesystem APIs, not socket APIs, who knows what this the future will hold). There are a number of other reasons why this should be this way as well. User-space has the ability to buffer indefinitely, and the kernel does not. Sometimes, send() returns a truncated value, and you have to deal with this. Since you've allocated the memory for the value you're calling write() with anyway, you might as well stash it away in the framework. The alternative is to let every application implement - and by implement, I mean "screw up" - a low-performance buffering implementation. User-space has more information about the type of information being sent. 
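To make that concrete, the user-space buffering pattern being described looks roughly like this (a sketch, not Twisted's actual implementation; it assumes a non-blocking socket whose writability the event loop reports back):

    class BufferedTransport:
        def __init__(self, sock):
            self._sock = sock            # assumed already non-blocking
            self._buffer = bytearray()   # user-space buffer, unbounded

        def write(self, data):
            # Never blocks and never reports a partial write;
            # the transport owns the bytes from here on.
            self._buffer.extend(data)

        def _handle_writable(self):
            # Called by the event loop when the socket is writable.
            if self._buffer:
                sent = self._sock.send(self._buffer)  # may be truncated
                del self._buffer[:sent]               # keep the remainder

Whether consecutive write() calls get coalesced into one send() or flushed separately is then a tuning decision inside the transport, invisible to the application.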
If the user does write() write() write() within one loop iteration, the framework can hypothetically optimize that into a single syscall using scatter-gather I/O. (Fun fact: we tried this, and it turns out that some implementations of scatter-gather I/O are actually *slower* than naive repeated calls; information like this should, again, be preserved within the framework.) In order to preserve compatibility with other systems (Twisted, Tornado, et. al.), the framework must be within its rights to do the buffering itself, even if it actually does exactly what you're suggesting because that happens to be better for performance in some circumstances. Choosing different buffering strategies for different applications is an important tuning option. Applications which appear to work in some contexts if the boundaries of data passed to send() are exactly the same as the boundaries of the data sent to write() should not be coddled; this just makes them harder to debug later. They should be broken as soon as possible. This is a subtle, pernicious and nearly constant error that people new to networking make and the sooner it surfaces, the better. The segments passed to data_received() should be as different as possible from the segments passed to write(). > > Besides, it would be better if transports weren't automatically > > *streaming* transports. There are connected datagram protocols, such as > > named pipes under Windows (multiprocessing already uses non-blocking > > Windows named pipes). > > I think we need to support datagrams, but the default ought to be stream. > In my humble (but entirely, verifiably correct) opinion, thinking of this as a "default" is propagating a design error in the BSD sockets API. Datagram and stream sockets have radically different semantics. In Twisted, "dataReceived" and "datagramReceived" are different methods for a good reason. Again, it's very very easy to fall into the trap of thinking that a TCP segment is a datagram and writing all your application code as if it were. After all, it probably works over localhost most of the time! This difference in semantics mirrored by a difference in method naming has helped quite a few people grok the distinction between streaming and datagrams over the years; I think it would be a good idea if Tulip followed suit. -glyph -------------- next part -------------- An HTML attachment was scrubbed... URL: From glyph at twistedmatrix.com Mon Dec 24 23:59:13 2012 From: glyph at twistedmatrix.com (Glyph) Date: Mon, 24 Dec 2012 14:59:13 -0800 Subject: [Python-Dev] PEP 3156 - Asynchronous IO Support Rebooted In-Reply-To: References: <20121221204411.1269322b@pitrou.net> Message-ID: <78FB229D-E20A-431F-9641-2BE97A4305F5@twistedmatrix.com> On Dec 21, 2012, at 1:10 PM, Guido van Rossum wrote: > > > TBD: Need an interface to wait for the first of a collection of Futures. > > > > Have you looked at Twisted's DeferredList? > > http://twistedmatrix.com/documents/12.1.0/api/twisted.internet.defer.DeferredList.html > > No, I am trying to stay away from them. > Those who do not understand Deferreds are doomed to re-implement them poorly ;-). (And believe me, I've seen more than a few poor re-implementations at this point...) -g -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eric.pruitt at gmail.com Tue Dec 25 04:53:58 2012 From: eric.pruitt at gmail.com (Eric Pruitt) Date: Mon, 24 Dec 2012 21:53:58 -0600 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> Message-ID: <20121225035358.GA7501@codevat.com> Hey, Anatoly, you are free to modify the PEP and code. I do not have any plans to work on this right now. Eric On Mon, Dec 24, 2012 at 04:42:20PM +0300, anatoly techtonik wrote: > What should I do in case Eric lost interest after his GSoC project for PSF > appeared as useless for python-dev community? Should I rewrite the proposal > from scratch? > > On Thu, Dec 20, 2012 at 11:18 PM, Brett Cannon wrote: > > > You cannot rewrite an existing PEP if you are not one of the original > > owners, nor can you add yourself as an author to a PEP without permission > > from the original authors. > > > > And please do not CC the peps mailing list on discussions. It should only > > be used to mail in new PEPs or acceptable patches to PEPs. From brian at python.org Tue Dec 25 05:44:51 2012 From: brian at python.org (Brian Curtin) Date: Mon, 24 Dec 2012 22:44:51 -0600 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> Message-ID: On Mon, Dec 24, 2012 at 7:42 AM, anatoly techtonik wrote: > What should I do in case Eric lost interest after his GSoC project for PSF > appeared as useless for python-dev community? Should I rewrite the proposal > from scratch? Before you attempt that, start by trying to have a better attitude towards people's contributions around here. From ajaygargnsit at gmail.com Tue Dec 25 17:34:21 2012 From: ajaygargnsit at gmail.com (Ajay Garg) Date: Tue, 25 Dec 2012 22:04:21 +0530 Subject: [Python-Dev] Running GUI and "GObject.mainloop.run()" together? In-Reply-To: <50D9CC4F.3000403@collabora.co.uk> References: <50D9CC4F.3000403@collabora.co.uk> Message-ID: Thanks Simon. Thanks for the extensive info; however it needs some hours (if not days :P) to be digested. On Tue, Dec 25, 2012 at 9:24 PM, Simon McVittie < simon.mcvittie at collabora.co.uk> wrote: > On 24/12/12 08:26, Ajay Garg wrote: > > For a recap of the brief history, I have a parent process, that is > > spawning a child process via "subprocess". > > Currently, the child-process is a GUI process; however, I intend to > > "behave" it as a dbus-service as well. > > In general that is something that can work, but it's necessary to > understand a bit about how main loops work, and how the modules of your > process deal with a main loop. > > Just saying "GUI" is not very informative: there are dozens of GUI > frameworks that you might be using, each with their own requirements and > oddities. If you say Gtk, or Qt, or Tk, or Windows MFC, or whatever > specific GUI framework you're using, then it becomes possible to say > something concrete about your situation. > > Based on later mails in the thread you seem to be using Gtk. > > I should note here that you seem to be using PyGtk (the "traditional" > Gtk 2 Python binding), which is deprecated. The modern version is to use > PyGI, the Python GObject-Introspection binding, and Gtk 3. 
> > When using PyGI, you have a choice of two D-Bus implementations: either > GDBus (part of gi.repository.GIO), or dbus-python ("import dbus"). I > would recommend GDBus, since dbus-python is constrained by backwards > compatibility with some flawed design decisions. > > However, assuming you're stuck with dbus-python: > > > I then used composition, wherein another class, "RemoteListener" > > deriving from "dbus.service.Object" was made an attribute of the "main" > > class. That worked. > > However, when I do > > > > dbus.mainloop.glib.DBusGMainLoop(set_as_default=True) > > GObject.mainloop.run() > > > > in the "RemoteListener"'s __init__ method, the GUI of the "main" class > > fails to load (apparently, because the "mainloop.run()" causes the > > singular, main-thread to go into busy-wait). > > Almost; it's not a busy-wait. GObject.mainloop.run() is the equivalent > of this pseudocode: > > def run(self): > while not global_default_main_context.someone_has_called_quit: > if global_default_main_context.has_more_events(): > global_default_main_context.process_next_event() > else: > global_default_main_context.wait_for_an_event() > > so it will loop until someone calls GObject.mainloop.quit() or > equivalent, or forever if that never happens - but as long as nothing > "interesting" happens, it will block on a poll() or select() syscall in > what my pseudocode calls wait_for_an_event(), which is the right thing > to do in event-driven programming like GLib/Gtk. > > (If you replace the last line of my pseudocode with "continue", that > would be a busy-wait.) > > > I tried option b), but now instantiating "RemoteListener" in a separate > > thread > > It is unclear whether the dbus-glib main loop glue (as set up by > DBusGMainLoop) is thread-safe or not. The safest assumption is always > "if you don't know whether foo is thread-safe, it probably isn't". In > any case, if it *is* thread-safe, the subset of it that's exposed > through dbus-python isn't enough to use it in multiple threads. > > GDBus, as made available via PyGI (specifically, gi.repository.GIO), is > known to be thread-safe. > > > Is there a way to run GUI and a dbus-service together? > > The general answer: only if either the GUI and the D-Bus code > run in different threads, or if they run in the same thread and can be > made to share a main context. > > The specific answer for Gtk: yes, they can easily share a main context. > > This: > > > dbus.mainloop.glib.DBusGMainLoop(set_as_default=True) > > sets up dbus-python's mainloop integration to integrate with the global > default main-context in GLib (implementation detail: it currently uses > dbus-glib to do that). What that means is that whenever a D-Bus > connection started by dbus-python wants to listen for events on a > socket, or wait for a timeout, it will ask GLib to add those to the > global default main context as event sources. > > This: > > > GObject.mainloop.run() > > iterates GLib's global default main context, analogous to the pseudocode > I mentioned before. Any "interesting" events that happen will cause your > code to be executed. > > A typical GUI application also needs to run the main loop to > wait for events. In PyGtk, you'd typically do that with: > > > Gtk.main() > > Gtk also uses GLib's global default main context, so this is pretty > similar to GObject.mainloop.run() - if you just remove the call to > GObject.mainloop.run() and use Gtk.main() instead, everything should be > fine. 
> > > As per http://www.pygtk.org/pygtk2reference/class- > > gobjectmainloop.html, it seems that we must be able to add event > > sources to gobject.mainloop > > Yes. For instance, gobject.timeout_add(), gobject.idle_add() and > gobject.io_add_watch() all add event sources to the default main context. > > dbus.mainloop.glib.DBusGMainLoop(set_as_default=True) tells dbus-python > that when it needs to add an event source to "the" main loop, it should > use equivalent C functions in GLib to do so. > > (In principle, DBusGMainLoop ought to take a GObject.MainContext as an > optional argument - but that's never been implemented, and it currently > always uses the default main context, which is the same one Gtk uses, > and which should only be iterated from the main thread.) > > > Once the event sources are added, each instance of gobject.mainloop > > (in its particular thread), will cater to only those sources. > > No, that's not true; gobject.mainloop is a namespace for a set of global > functions, not an object. If you must use multiple threads (not > recommended), please see the GLib C API documentation for details of how > main loops and main contexts relate, then the PyGtk documentation to see > how that translates into Python. > > > How is dbus."mainloop.glib.DBusGMainLoop(set_as_default=True)" > > related to gobject.mainloop? > > It instantiates a new DBusGMainLoop and sets it as dbus-python's global > default main-loop-integration object. (With hindsight, DBusGMainLoop was > a poor choice of name - it should have been DBusGMainIntegration or > something.) The result is that whenever a new dbus.connection.Connection > is instantiated, it will call methods on that DBusGMainLoop to connect > its event sources up to the default GLib main context, which is the same > one used by Gtk. > > dbus.bus.BusConnection, dbus.Bus, dbus.SessionBus etc. are > dbus.connection.Connection subclasses, so anything I say about > dbus.connection.Connection applies equally to them. > > > How is dbus."mainloop.glib.DBusGMainLoop(set_as_default=False)" > > related to gobject.mainloop? > > It instantiates a new DBusGMainLoop and doesn't use it for anything. If > you save the returned DBusGMainLoop in a variable (e.g. > my_dbus_g_main_loop = DBusGMainLoop(...)), then you can pass a keyword > argument mainloop=my_dbus_g_main_loop to a dbus.connection.Connection > constructor, and that dbus.connection.Connection will use that > DBusGMainLoop instead of dbus-python's global default. In practice, only > a very unusual application would need to do that. > > There is currently no point in having more than one DBusGMainLoop; it > would become useful if dbus-glib was thread-safe, and if dbus-python > supported non-default GLib main-contexts. > > > Is it necessary at all to specify > > "mainloop.glib.DBusGMainLoop(set_as_default=True)" or > > "mainloop.glib.DBusGMainLoop(set_as_default=False)" when using > > gobject.mainloop? > > Yes. Otherwise, dbus-python has no way to know that your application is > going to be iterating the GLib main loop, as opposed to Qt or Tk or > Enlightenment or something. > > > currently for the client, I am having the (client) (parent) process > > run the command "dbus-send" via the python-subprocess API. > > Does there exist a python API to do it in a cleaner manner? > > Yes, either dbus-python or GDBus. Each of those can do everything > dbus-send can, and more. 
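(For concreteness, the dbus-python equivalent of a dbus-send method call is roughly the following sketch; the service and object names are the same hypothetical placeholders as in the earlier sketch:)

    import dbus
    bus = dbus.SessionBus()
    # dbus-send --session --dest=com.example.Sample --print-reply \
    #     /com/example/Sample com.example.SampleInterface.Echo string:hi
    obj = bus.get_object('com.example.Sample', '/com/example/Sample')
    iface = dbus.Interface(obj, dbus_interface='com.example.SampleInterface')
    print(iface.Echo('hi'))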
> For a start, could you please point me to the paradigm for sending a dbus-signal from the client to the server (where the server has "add_signal_receiver" set up). From the limited googling that I did, I remember someone saying that for sending a signal, the typical setting-up-of-a-proxy-object is not required; however, I could not hit upon the exact dbus-python mechanism to send a signal :-\ > > S > _______________________________________________ > dbus mailing list > dbus at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dbus > -- Regards, Ajay -------------- next part -------------- An HTML attachment was scrubbed... URL: From ajaygargnsit at gmail.com Tue Dec 25 18:05:31 2012 From: ajaygargnsit at gmail.com (Ajay Garg) Date: Tue, 25 Dec 2012 22:35:31 +0530 Subject: [Python-Dev] Running GUI and "GObject.mainloop.run()" together? In-Reply-To: References: <50D9CC4F.3000403@collabora.co.uk> Message-ID: Also, I think I am now starting to get the hang of things; however, one doubt solved raises another doubt :D The reason I began looking at the two-threads approach is that, when trying to use the GUI (Gtk) application as a dbus-service, I was getting the error "This connection was not provided by any of .service files". I now see that the reason was that I wasn't calling "dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)" prior to the statement "Gtk.main()". Now, by your helpful guidance, wherein you stated that "Gtk.main()" has the same effect as "GObject.MainLoop.run()", I added the statement "dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)" before "Gtk.main()", and voila, now I can get a proxy object and invoke methods remotely. Thus, two threads are now not needed. However, this raises a new doubt: in my second approach, wherein I used "add_signal_receiver" (at the server side) and dbus-send (at the client side), how come things worked, since I did not add "dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)" in this approach at all? I presume that even in this case, dbus would need to know that the GUI application is ok to listen to dbus-signals? Are the different requirements in these two approaches expected? Or is it an inconsistency with dbus-python? On Tue, Dec 25, 2012 at 10:04 PM, Ajay Garg wrote: > Thanks Simon. > > Thanks for the extensive info; however it needs some hours (if not days > :P) to be digested. > > On Tue, Dec 25, 2012 at 9:24 PM, Simon McVittie < > simon.mcvittie at collabora.co.uk> wrote: > >> [snip: Simon's message, quoted in full earlier in the thread]
-- Regards, Ajay -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Tue Dec 25 18:45:43 2012 From: brett at python.org (Brett Cannon) Date: Tue, 25 Dec 2012 12:45:43 -0500 Subject: [Python-Dev] PEP 3145 (With Contents) In-Reply-To: References: <171e8a410909082052j6f9b81c2i9030c5f9074125c@mail.gmail.com> <171e8a410909150925w5711d955v1376771e29999493@mail.gmail.com> <20090915182456.12215.1594428165.divmod.xquotient.17@localhost.localdomain> Message-ID: On Dec 24, 2012 11:44 PM, "Brian Curtin" wrote: > > On Mon, Dec 24, 2012 at 7:42 AM, anatoly techtonik wrote: > > What should I do in case Eric lost interest after his GSoC project for PSF > > appeared as useless for python-dev community? Should I rewrite the proposal > > from scratch? > > Before you attempt that, start by trying to have a better attitude > towards people's contributions around here. Ignoring the extremely negative and counter-productive attitude (which if not changed could quite easily lead to no PEP editors wanting to work with you, Anatoly, and thus blocking your changes from being accepted), you are also ignoring the two other authors on that PEP, who also need to agree to adding you to the PEP as an author and to your general direction/approach. -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.svetlov at gmail.com Tue Dec 25 22:55:06 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Tue, 25 Dec 2012 23:55:06 +0200 Subject: [Python-Dev] Raising OSError concrete classes from errno code Message-ID: Currently we have an exception tree of classes inherited from OSError. When we use the C API we can call the PyErr_SetFromErrno and PyErr_SetFromErrnoWithFilename[Object] functions. These raise a concrete exception class (FileNotFoundError, for example) based on the implicit errno value. I cannot see a way to do it from Python. Maybe adding a builtin like exception_from_errno(errno, filename=None) would make sense? The function would return an exception instance whose concrete class depends on the errno value. For example, if I've got EPOLLERR from a poll call I can get the error code via s.getsockopt(SOL_SOCKET, SO_ERROR), but I cannot raise the concrete exception from the given errno code.
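A sketch of the intended use (exception_from_errno is the hypothetical builtin proposed above, not an existing function; s is the socket that polled EPOLLERR):

    import socket

    err = s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        # hypothetical: would return e.g. ConnectionRefusedError(err, ...)
        # when err == errno.ECONNREFUSED
        raise exception_from_errno(err)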
-- Thanks, Andrew Svetlov From benjamin at python.org Tue Dec 25 23:03:22 2012 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 25 Dec 2012 16:03:22 -0600 Subject: [Python-Dev] Raising OSError concrete classes from errno code In-Reply-To: References: Message-ID: 2012/12/25 Andrew Svetlov : > Currently we have an exception tree of classes inherited from OSError. > When we use the C API we can call the PyErr_SetFromErrno and > PyErr_SetFromErrnoWithFilename[Object] functions. > These raise a concrete exception class (FileNotFoundError, for > example) based on the implicit errno value. > I cannot see a way to do it from Python. > > Maybe adding a builtin like exception_from_errno(errno, filename=None) > would make sense? > The function would return an exception instance whose concrete class depends on the errno value I think a static method on OSError like .from_errno would be good. -- Regards, Benjamin From andrew.svetlov at gmail.com Tue Dec 25 23:05:26 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Wed, 26 Dec 2012 00:05:26 +0200 Subject: [Python-Dev] Raising OSError concrete classes from errno code In-Reply-To: References: Message-ID: A static method is better than a new builtin function, agreed. On Wed, Dec 26, 2012 at 12:03 AM, Benjamin Peterson wrote: > 2012/12/25 Andrew Svetlov : >> Currently we have an exception tree of classes inherited from OSError. >> When we use the C API we can call the PyErr_SetFromErrno and >> PyErr_SetFromErrnoWithFilename[Object] functions. >> These raise a concrete exception class (FileNotFoundError, for >> example) based on the implicit errno value. >> I cannot see a way to do it from Python. >> >> Maybe adding a builtin like exception_from_errno(errno, filename=None) >> would make sense? >> The function would return an exception instance whose concrete class depends on the errno value > > I think a static method on OSError like .from_errno would be good. > > > -- > Regards, > Benjamin From storchaka at gmail.com Wed Dec 26 09:50:40 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 26 Dec 2012 10:50:40 +0200 Subject: [Python-Dev] Raising OSError concrete classes from errno code In-Reply-To: References: Message-ID: On 25.12.12 23:55, Andrew Svetlov wrote: > Currently we have an exception tree of classes inherited from OSError. > When we use the C API we can call the PyErr_SetFromErrno and > PyErr_SetFromErrnoWithFilename[Object] functions. > These raise a concrete exception class (FileNotFoundError, for > example) based on the implicit errno value. > I cannot see a way to do it from Python. >>> raise OSError(errno.ENOENT, 'No such file or directory', 'qwerty') Traceback (most recent call last): File "<stdin>", line 1, in <module> FileNotFoundError: [Errno 2] No such file or directory: 'qwerty' From ncoghlan at gmail.com Wed Dec 26 11:16:23 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 26 Dec 2012 20:16:23 +1000 Subject: [Python-Dev] Raising OSError concrete classes from errno code In-Reply-To: References: Message-ID: On Wed, Dec 26, 2012 at 6:50 PM, Serhiy Storchaka wrote: > On 25.12.12 23:55, Andrew Svetlov wrote: >> >> Currently we have an exception tree of classes inherited from OSError. >> When we use the C API we can call the PyErr_SetFromErrno and >> PyErr_SetFromErrnoWithFilename[Object] functions. >> These raise a concrete exception class (FileNotFoundError, for >> example) based on the implicit errno value. >> I cannot see a way to do it from Python.
>
> >>>> raise OSError(errno.ENOENT, 'No such file or directory', 'qwerty')
> Traceback (most recent call last):
> File "", line 1, in
> FileNotFoundError: [Errno 2] No such file or directory: 'qwerty'

As Serhiy's example shows, this mapping of error numbers to subclasses
is implemented directly in OSError.__new__. We did this so that code
could catch the new exceptions, even when dealing with old code that
raises the legacy exception types.

http://docs.python.org/3/library/exceptions#OSError could probably do
with an example like the one quoted in order to make this clearer

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From andrew.svetlov at gmail.com Wed Dec 26 12:37:13 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Wed, 26 Dec 2012 13:37:13 +0200
Subject: [Python-Dev] Raising OSError concrete classes from errno code
In-Reply-To:
References:
Message-ID:

On Wed, Dec 26, 2012 at 12:16 PM, Nick Coghlan wrote:
> On Wed, Dec 26, 2012 at 6:50 PM, Serhiy Storchaka wrote:
>> On 25.12.12 23:55, Andrew Svetlov wrote:
>>> Currently we have exception tree of classes inherited from OSError
>>> When we use C API we can call PyErr_SetFromErrno and
>>> PyErr_SetFromErrnoWithFilename[Object] functions.
>>> This ones raise concrete exception class (FileNotFoundError for
>>> example) looking on implicit errno value.
>>> I cannot see the way to do it from python.
>>
>>
>>>>> raise OSError(errno.ENOENT, 'No such file or directory', 'qwerty')
>> Traceback (most recent call last):
>> File "", line 1, in
>> FileNotFoundError: [Errno 2] No such file or directory: 'qwerty'
>
> As Serhiy's example shows, this mapping of error numbers to subclasses
> is implemented directly in OSError.__new__. We did this so that code
> could catch the new exceptions, even when dealing with old code that
> raises the legacy exception types.
>
Sorry.
Looks like OSError.__new__ requires at least two arguments for
executing subclass search mechanism:

>>> OSError(errno.ENOENT)
OSError(2,)
>>> OSError(errno.ENOENT, 'error msg')
FileNotFoundError(2, 'error msg')

I had tried the first one and got confused.

> http://docs.python.org/3/library/exceptions#OSError could probably do
> with an example like the one quoted in order to make this clearer
>
Added http://bugs.python.org/issue16785 for this.

> Cheers,
> Nick.
>
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com

--
Thanks,
Andrew Svetlov

From storchaka at gmail.com Wed Dec 26 13:02:05 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 26 Dec 2012 14:02:05 +0200
Subject: [Python-Dev] Raising OSError concrete classes from errno code
In-Reply-To:
References:
Message-ID:

It should be in Python-Ideas: add keyword argument support for OSError
and subclasses with suitable default values. I.e.
>>> OSError(errno=errno.ENOENT) FileNotFoundError(2, 'No such file or directory') >>> FileNotFoundError(filename='qwerty') FileNotFoundError(2, 'No such file or directory') >>> FileNotFoundError(strerr='Bad file') FileNotFoundError(2, 'Bad file') From jcea at jcea.es Wed Dec 26 16:53:35 2012 From: jcea at jcea.es (Jesus Cea) Date: Wed, 26 Dec 2012 16:53:35 +0100 Subject: [Python-Dev] push changesets hooks failing Message-ID: <50DB1D7F.9030806@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I got this when pushing: """ jcea at ubuntu:~/hg/python/cpython$ hg push pushing to ssh://hg at hg.python.org/cpython/ searching for changes searching for changes remote: adding changesets remote: adding manifests remote: adding file changes remote: added 4 changesets with 9 changes to 3 files remote: buildbot: change(s) sent successfully remote: sent email to roundup at report at bugs.python.org remote: notified python-checkins at python.org of incoming changeset 0ffaf1079a7a remote: error: incoming.irker hook raised an exception: [Errno 111] Connection refused remote: notified python-checkins at python.org of incoming changeset 3801ee5d5d73 remote: error: incoming.irker hook raised an exception: [Errno 111] Connection refused remote: notified python-checkins at python.org of incoming changeset b6a9f8fd9443 remote: error: incoming.irker hook raised an exception: [Errno 111] Connection refused remote: notified python-checkins at python.org of incoming changeset 3f7d5c235d82 remote: error: incoming.irker hook raised an exception: [Errno 111] Connection refused """ - -- Jes?s Cea Avi?n _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . _/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iQCVAwUBUNsdf5lgi5GaxT1NAQKuzQP+IPef5nx00zKdUwL4LoLDds05Dl+WtrFu Vs+Nvm4haa1+NNJ1owodtA5Xp01pDhMrhv4dvFcfEdbF2zLi3h8Xo+9oO6sEGhqE cMJZJxRCa4RdC9zpFzw0jWS7Udn/j91veWqaR/HLPYeKWcaXqWOegI+f2aoCBbQ7 5cd8Ynqihxw= =xUEy -----END PGP SIGNATURE----- From andrew.svetlov at gmail.com Wed Dec 26 17:07:17 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Wed, 26 Dec 2012 18:07:17 +0200 Subject: [Python-Dev] push changesets hooks failing In-Reply-To: <50DB1D7F.9030806@jcea.es> References: <50DB1D7F.9030806@jcea.es> Message-ID: Looks like IRC bot is broken for last days. I constantly get the same, but it related only to IRC, not to HG repo itself. 
On Wed, Dec 26, 2012 at 5:53 PM, Jesus Cea wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > I got this when pushing: > > """ > jcea at ubuntu:~/hg/python/cpython$ hg push > pushing to ssh://hg at hg.python.org/cpython/ > searching for changes > searching for changes > remote: adding changesets > remote: adding manifests > remote: adding file changes > remote: added 4 changesets with 9 changes to 3 files > remote: buildbot: change(s) sent successfully > remote: sent email to roundup at report at bugs.python.org > remote: notified python-checkins at python.org of incoming changeset > 0ffaf1079a7a > remote: error: incoming.irker hook raised an exception: [Errno 111] > Connection refused > remote: notified python-checkins at python.org of incoming changeset > 3801ee5d5d73 > remote: error: incoming.irker hook raised an exception: [Errno 111] > Connection refused > remote: notified python-checkins at python.org of incoming changeset > b6a9f8fd9443 > remote: error: incoming.irker hook raised an exception: [Errno 111] > Connection refused > remote: notified python-checkins at python.org of incoming changeset > 3f7d5c235d82 > remote: error: incoming.irker hook raised an exception: [Errno 111] > Connection refused > """ > > - -- > Jes?s Cea Avi?n _/_/ _/_/_/ _/_/_/ > jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ > jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ > . _/_/ _/_/ _/_/ _/_/ _/_/ > "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ > "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ > "El amor es poner tu felicidad en la felicidad de otro" - Leibniz > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.10 (GNU/Linux) > Comment: Using GnuPG with undefined - http://www.enigmail.net/ > > iQCVAwUBUNsdf5lgi5GaxT1NAQKuzQP+IPef5nx00zKdUwL4LoLDds05Dl+WtrFu > Vs+Nvm4haa1+NNJ1owodtA5Xp01pDhMrhv4dvFcfEdbF2zLi3h8Xo+9oO6sEGhqE > cMJZJxRCa4RdC9zpFzw0jWS7Udn/j91veWqaR/HLPYeKWcaXqWOegI+f2aoCBbQ7 > 5cd8Ynqihxw= > =xUEy > -----END PGP SIGNATURE----- > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com -- Thanks, Andrew Svetlov From solipsis at pitrou.net Wed Dec 26 17:42:02 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 26 Dec 2012 17:42:02 +0100 Subject: [Python-Dev] Raising OSError concrete classes from errno code References: Message-ID: <20121226174202.36b73434@pitrou.net> On Wed, 26 Dec 2012 13:37:13 +0200 Andrew Svetlov wrote: > > > > As Serhiy's example shows, this mapping of error numbers to subclasses > > is implemented directly in OSError.__new__. We did this so that code > > could catch the new exceptions, even when dealing with old code that > > raises the legacy exception types. > > > Sorry. > Looks like OSError.__new__ requires at least two arguments for > executing subclass search mechanism: > > >>> OSError(errno.ENOENT) > OSError(2,) > >>> OSError(errno.ENOENT, 'error msg') > FileNotFoundError(2, 'error msg') Indeed, it does. I did this for consistency, because calling OSError with only one argument doesn't set the "errno" attribute at all: >>> e = OSError(5) >>> e.errno >>> Regards Antoine. 
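A minimal sketch of the helper discussed in this thread, in pure Python
for 3.3 and later. The name from_errno is hypothetical (taken from
Benjamin's suggestion); the sketch simply relies on the two-argument
form so that OSError.__new__ performs the errno-to-subclass mapping
described above:

    import errno
    import os

    def from_errno(err, filename=None):
        # Passing at least two arguments is what triggers the
        # errno -> subclass mapping in OSError.__new__ (Python 3.3+).
        if filename is None:
            return OSError(err, os.strerror(err))
        return OSError(err, os.strerror(err), filename)

    # raise from_errno(errno.ENOENT, 'qwerty')
    # -> FileNotFoundError: [Errno 2] No such file or directory: 'qwerty'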
From andrew.svetlov at gmail.com Wed Dec 26 18:23:40 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Wed, 26 Dec 2012 19:23:40 +0200 Subject: [Python-Dev] Raising OSError concrete classes from errno code In-Reply-To: <20121226174202.36b73434@pitrou.net> References: <20121226174202.36b73434@pitrou.net> Message-ID: Thanks for the elaboration! On Wed, Dec 26, 2012 at 6:42 PM, Antoine Pitrou wrote: > On Wed, 26 Dec 2012 13:37:13 +0200 > Andrew Svetlov wrote: >> > >> > As Serhiy's example shows, this mapping of error numbers to subclasses >> > is implemented directly in OSError.__new__. We did this so that code >> > could catch the new exceptions, even when dealing with old code that >> > raises the legacy exception types. >> > >> Sorry. >> Looks like OSError.__new__ requires at least two arguments for >> executing subclass search mechanism: >> >> >>> OSError(errno.ENOENT) >> OSError(2,) >> >>> OSError(errno.ENOENT, 'error msg') >> FileNotFoundError(2, 'error msg') > > Indeed, it does. I did this for consistency, because calling OSError > with only one argument doesn't set the "errno" attribute at all: > >>>> e = OSError(5) >>>> e.errno >>>> > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com -- Thanks, Andrew Svetlov From g.brandl at gmx.net Wed Dec 26 18:50:57 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 26 Dec 2012 18:50:57 +0100 Subject: [Python-Dev] push changesets hooks failing In-Reply-To: References: <50DB1D7F.9030806@jcea.es> Message-ID: Should now be fixed. I updated the daemon behind the hook to the newest version, and hope it will be more stable now. Georg On 12/26/2012 05:07 PM, Andrew Svetlov wrote: > Looks like IRC bot is broken for last days. > I constantly get the same, but it related only to IRC, not to HG repo itself. 
> > On Wed, Dec 26, 2012 at 5:53 PM, Jesus Cea wrote: > I got this when pushing: > > """ > jcea at ubuntu:~/hg/python/cpython$ hg push > pushing to ssh://hg at hg.python.org/cpython/ > searching for changes > searching for changes > remote: adding changesets > remote: adding manifests > remote: adding file changes > remote: added 4 changesets with 9 changes to 3 files > remote: buildbot: change(s) sent successfully > remote: sent email to roundup at report at bugs.python.org > remote: notified python-checkins at python.org of incoming changeset > 0ffaf1079a7a > remote: error: incoming.irker hook raised an exception: [Errno 111] > Connection refused > remote: notified python-checkins at python.org of incoming changeset > 3801ee5d5d73 > remote: error: incoming.irker hook raised an exception: [Errno 111] > Connection refused > remote: notified python-checkins at python.org of incoming changeset > b6a9f8fd9443 > remote: error: incoming.irker hook raised an exception: [Errno 111] > Connection refused > remote: notified python-checkins at python.org of incoming changeset > 3f7d5c235d82 > remote: error: incoming.irker hook raised an exception: [Errno 111] > Connection refused > """ > >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com > > > From andrew.svetlov at gmail.com Wed Dec 26 18:53:16 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Wed, 26 Dec 2012 19:53:16 +0200 Subject: [Python-Dev] push changesets hooks failing In-Reply-To: References: <50DB1D7F.9030806@jcea.es> Message-ID: Thanks On Wed, Dec 26, 2012 at 7:50 PM, Georg Brandl wrote: > Should now be fixed. I updated the daemon behind the hook to the newest > version, and hope it will be more stable now. > > Georg > > On 12/26/2012 05:07 PM, Andrew Svetlov wrote: >> Looks like IRC bot is broken for last days. >> I constantly get the same, but it related only to IRC, not to HG repo itself. 
>> >> On Wed, Dec 26, 2012 at 5:53 PM, Jesus Cea wrote: >> I got this when pushing: >> >> """ >> jcea at ubuntu:~/hg/python/cpython$ hg push >> pushing to ssh://hg at hg.python.org/cpython/ >> searching for changes >> searching for changes >> remote: adding changesets >> remote: adding manifests >> remote: adding file changes >> remote: added 4 changesets with 9 changes to 3 files >> remote: buildbot: change(s) sent successfully >> remote: sent email to roundup at report at bugs.python.org >> remote: notified python-checkins at python.org of incoming changeset >> 0ffaf1079a7a >> remote: error: incoming.irker hook raised an exception: [Errno 111] >> Connection refused >> remote: notified python-checkins at python.org of incoming changeset >> 3801ee5d5d73 >> remote: error: incoming.irker hook raised an exception: [Errno 111] >> Connection refused >> remote: notified python-checkins at python.org of incoming changeset >> b6a9f8fd9443 >> remote: error: incoming.irker hook raised an exception: [Errno 111] >> Connection refused >> remote: notified python-checkins at python.org of incoming changeset >> 3f7d5c235d82 >> remote: error: incoming.irker hook raised an exception: [Errno 111] >> Connection refused >> """ >> >>> _______________________________________________ >>> Python-Dev mailing list >>> Python-Dev at python.org >>> http://mail.python.org/mailman/listinfo/python-dev >>> Unsubscribe: http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com >> >> >> > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com -- Thanks, Andrew Svetlov From ajaygargnsit at gmail.com Wed Dec 26 21:00:22 2012 From: ajaygargnsit at gmail.com (Ajay Garg) Date: Thu, 27 Dec 2012 01:30:22 +0530 Subject: [Python-Dev] Running GUI and "GObject.mainloop.run()" together? In-Reply-To: <6B8B567C-5679-4D41-B9BD-3453EC9D26EB@twistedmatrix.com> References: <50D9CC4F.3000403@collabora.co.uk> <6B8B567C-5679-4D41-B9BD-3453EC9D26EB@twistedmatrix.com> Message-ID: Oops.. I am extremely sorry for posting to Python-Dev (I did not intend to; just a bad habit of "Reply-to-All"). Sorry again. On Thu, Dec 27, 2012 at 1:28 AM, Glyph wrote: > On Dec 25, 2012, at 9:05 AM, Ajay Garg wrote: > > > Also, I think I am now starting to get a hang of things; however, one > doubt solved raises another doubt :D > > > > The reason I began looking out for the two-threads-approach, is because > when trying to use the GUI (Gtk) application as a dbus-service, I was > getting the error "This connection was not provided by any of .service > files". > > Please stop cross-posting this to the python-dev list. It isn't > appropriate, as several people have said already. > > -glyph > > -- Regards, Ajay -------------- next part -------------- An HTML attachment was scrubbed... URL: From svenbrauch at googlemail.com Thu Dec 27 14:03:21 2012 From: svenbrauch at googlemail.com (Sven Brauch) Date: Thu, 27 Dec 2012 14:03:21 +0100 Subject: [Python-Dev] Range information in the AST -- once more Message-ID: Hello! I'm writing a static language analyzer for an IDE which reuses the CPython parser (for parsing) [1]. Two years ago, I asked about a few changes to be made to the AST provided by CPython, but the discussion thread dried up before a definite decision was made. 
I decided to just copy the parser code to my project and make the
necessary changes there back then. I'm bringing this up again now
because, after my project's recent first release, packagers are
(understandably) a bit unhappy about the Python fork residing in its
repository. I would really like to get rid of that fork and link
against the vanilla libpython instead.

There are two things which are at the very least required to make this
work:

1. The col_offset and lineno of an Attribute must give the beginning
   of the word that names the attribute, not the beginning of the
   expression. Example: In "foo.bar.baz", the col_offset of the
   Attribute belonging to "bar" says "0" currently; it would need to
   be "4".

2. Column offsets and line numbers would need to be available for
   function arguments (those with and without stars), and for alias
   nodes.

In total, this requires very little change to the existing code, "a
few tens of lines changed at most" order of magnitude; those are
mostly trivial changes. As far as I can tell, the impact on existing
code using the AST stuff will be about zero. Even if there was some
really obscure case where the change would matter, porting would only
require about three lines of Python code.

Additionally, there are a few more things which would be useful to
have available from the AST (namely the ranges of class and function
names when they are defined -- currently only the start of the first
decorator is available), but since those are reasonably easy to work
around it's not that important. It would still be nice though.

If you think this is a reasonable suggestion then I'll be happy to
provide a patch for more detailed discussion.

Greetings,
Sven

________
[1] See https://projects.kde.org/kdev-python

From ncoghlan at gmail.com Thu Dec 27 15:44:39 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 28 Dec 2012 00:44:39 +1000
Subject: [Python-Dev] Range information in the AST -- once more
In-Reply-To:
References:
Message-ID:

On Thu, Dec 27, 2012 at 11:03 PM, Sven Brauch wrote:
> Hello!
>
> I'm writing a static language analyzer for an IDE which reuses the
> CPython parser (for parsing) [1]. Two years ago, I asked about a few
> changes to be made to the AST provided by CPython, but the discussion
> thread dried up before a definite decision was made. I decided to just
> copy the parser code to my project and make the necessary changes
> there back then. I'm bringing this up again now because, after my
> project's recent first release, packagers are (understandably) a bit
> unhappy about the Python fork residing in its repository. I would
> really like to get rid of that fork and link against the vanilla
> libpython instead.

It certainly sounds like it's worth considering for 3.4. It's a new
feature, though, so it unfortunately wouldn't be possible to backport
it to any earlier releases.

Regards,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From svenbrauch at googlemail.com Thu Dec 27 16:38:21 2012
From: svenbrauch at googlemail.com (Sven Brauch)
Date: Thu, 27 Dec 2012 16:38:21 +0100
Subject: [Python-Dev] Range information in the AST -- once more
In-Reply-To:
References:
Message-ID:

2012/12/27 Nick Coghlan :
> It certainly sounds like it's worth considering for 3.4. It's a new
> feature, though, so it unfortunately wouldn't be possible to backport
> it to any earlier releases.

Yes, that is understandable.
It wouldn't be much of a problem though,
my whole project is pretty bleeding-edge anyways, and depending on
python >= 3.4 wouldn't hurt. For me it would only be important to have
an acceptable solution for this long-term.

Greetings,
Sven

From storchaka at gmail.com Thu Dec 27 17:15:06 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Thu, 27 Dec 2012 18:15:06 +0200
Subject: [Python-Dev] How old Python version should be supported in tests?
Message-ID: <50DC740A.8070203@gmail.com>

I found code like "if sys.version_info < (2, 4):" in some tests. Should
old versions (< 2.6) be supported in tests? Can such support code be
removed (note that other tests are likely not compatible with old
versions)?

From guido at python.org Thu Dec 27 17:20:08 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 27 Dec 2012 08:20:08 -0800
Subject: [Python-Dev] Range information in the AST -- once more
In-Reply-To:
References:
Message-ID:

So just submit a patch to the tracker...

--Guido

On Thursday, December 27, 2012, Sven Brauch wrote:

> 2012/12/27 Nick Coghlan :
> > It certainly sounds like it's worth considering for 3.4. It's a new
> > feature, though, so it unfortunately wouldn't be possible to backport
> > it to any earlier releases.
>
> Yes, that is understandable. It wouldn't be much of a problem though,
> my whole project is pretty bleeding-edge anyways, and depending on
> python >= 3.4 wouldn't hurt. For me it would only be important to have
> an acceptable solution for this long-term.
>
> Greetings,
> Sven
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/guido%40python.org
>

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From benjamin at python.org Thu Dec 27 17:24:37 2012
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 27 Dec 2012 10:24:37 -0600
Subject: [Python-Dev] How old Python version should be supported in tests?
In-Reply-To: <50DC740A.8070203@gmail.com>
References: <50DC740A.8070203@gmail.com>
Message-ID:

2012/12/27 Serhiy Storchaka :
> I found code like "if sys.version_info < (2, 4):" in some tests. Should
> old versions (< 2.6) be supported in tests? Can such support code be
> removed (note that other tests are likely not compatible with old
> versions)?

It would be great if it could all be killed, but I suppose it might be
in some externally maintained module. Which tests?

--
Regards,
Benjamin

From rdmurray at bitdance.com Thu Dec 27 17:58:47 2012
From: rdmurray at bitdance.com (R. David Murray)
Date: Thu, 27 Dec 2012 11:58:47 -0500
Subject: [Python-Dev] How old Python version should be supported in tests?
In-Reply-To:
References: <50DC740A.8070203@gmail.com>
Message-ID: <20121227165848.43F752500B5@webabinitio.net>

On Thu, 27 Dec 2012 10:24:37 -0600, Benjamin Peterson wrote:
> 2012/12/27 Serhiy Storchaka :
> > I found code like "if sys.version_info < (2, 4):" in some tests. Should
> > old versions (< 2.6) be supported in tests? Can such support code be
> > removed (note that other tests are likely not compatible with old
> > versions)?
>
> It would be great if it could all be killed, but I suppose it might be
> in some externally maintained module. Which tests?

There are also a few cases where for one reason or another the module
maintainer wants it to stay backward compatible.
I'm thinking specifically of the platform module, which I'm pretty
sure has that restriction. So, it has to be considered on a case by
case basis.

--David

From benjamin at python.org Thu Dec 27 18:08:55 2012
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 27 Dec 2012 11:08:55 -0600
Subject: [Python-Dev] How old Python version should be supported in tests?
In-Reply-To: <201212271902.31677.storchaka@gmail.com>
References: <50DC740A.8070203@gmail.com> <201212271902.31677.storchaka@gmail.com>
Message-ID:

2012/12/27 Serhiy Storchaka :
> On Thursday, 27 December 2012 18:24:37, you wrote:
>> It would be great if it could all be killed, but I suppose it might be
>> in some externally maintained module. Which tests?
>
> They are bsddb, sqlite3, ctypes and multiprocessing.

I don't see the point in permuting things too much in the 2.7 branch.

--
Regards,
Benjamin

From chris.jerdonek at gmail.com Thu Dec 27 21:26:13 2012
From: chris.jerdonek at gmail.com (Chris Jerdonek)
Date: Thu, 27 Dec 2012 12:26:13 -0800
Subject: [Python-Dev] [Python-checkins] cpython (merge 2.7 -> 2.7): Null merge.
In-Reply-To: <3YXMV60LQSzRYd@mail.python.org>
References: <3YXMV60LQSzRYd@mail.python.org>
Message-ID:

On Thu, Dec 27, 2012 at 12:05 PM, serhiy.storchaka wrote:
> http://hg.python.org/cpython/rev/26eb2979465c
> changeset: 81094:26eb2979465c
> branch: 2.7
> parent: 81086:ccbb16719540
> parent: 81090:d3c81ef728ae
> user: Serhiy Storchaka
> date: Thu Dec 27 22:00:12 2012 +0200
> summary:
> Null merge.

Great to see your first check-ins, Serhiy. Congratulations!

I think for this case we usually say "Merge heads," which is different
from the case of a null merge (i.e. where the diff is empty, for
example when registering that a 3.x commit should not be forward-ported
to a later version).

--Chris

>
> files:
> Lib/idlelib/EditorWindow.py | 2 +-
> Misc/NEWS | 3 +++
> 2 files changed, 4 insertions(+), 1 deletions(-)
>
>
> diff --git a/Lib/idlelib/EditorWindow.py b/Lib/idlelib/EditorWindow.py
> --- a/Lib/idlelib/EditorWindow.py
> +++ b/Lib/idlelib/EditorWindow.py
> @@ -1611,7 +1611,7 @@
> try:
> try:
> _tokenize.tokenize(self.readline, self.tokeneater)
> - except _tokenize.TokenError:
> + except (_tokenize.TokenError, SyntaxError):
> # since we cut off the tokenizer early, we can trigger
> # spurious errors
> pass
> diff --git a/Misc/NEWS b/Misc/NEWS
> --- a/Misc/NEWS
> +++ b/Misc/NEWS
> @@ -168,6 +168,9 @@
> Library
> -------
>
> +- Issue #16504: IDLE now catches SyntaxErrors raised by tokenizer. Patch by
> + Roger Serwy.
> +
> - Issue #16702: test_urllib2_localnet tests now correctly ignores proxies for
> localhost tests.
>
>
> --
> Repository URL: http://hg.python.org/cpython
>
> _______________________________________________
> Python-checkins mailing list
> Python-checkins at python.org
> http://mail.python.org/mailman/listinfo/python-checkins
>

From svenbrauch at googlemail.com Thu Dec 27 22:48:07 2012
From: svenbrauch at googlemail.com (Sven Brauch)
Date: Thu, 27 Dec 2012 22:48:07 +0100
Subject: [Python-Dev] Range information in the AST -- once more
In-Reply-To:
References:
Message-ID:

2012/12/27 Guido van Rossum :
> So just submit a patch to the tracker...
>
> --Guido
>
>
> On Thursday, December 27, 2012, Sven Brauch wrote:
>>
>> 2012/12/27 Nick Coghlan :
>> > It certainly sounds like it's worth considering for 3.4. It's a new
>> > feature, though, so it unfortunately wouldn't be possible to backport
>> > it to any earlier releases.
>>
>> Yes, that is understandable.
It wouldn't be much of a problem tough, >> my whole project is pretty bleeding-edge anyways, and depending on >> python >= 3.4 wouldn't hurt. For me it would only be important to have >> an acceptable solution for this long-term. >> >> Greetings, >> Sven >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > -- > --Guido van Rossum (python.org/~guido) I submitted a patch to the tracker, see http://bugs.python.org/issue16795. The patch only contains the very minimum set of changes which are necessary. There's still a few things you'll need to work around when writing a static language analyzer, but none of them is too much work, so I didn't include them for now in order to keep things compact. Thanks and best regards, Sven From status at bugs.python.org Fri Dec 28 18:07:21 2012 From: status at bugs.python.org (Python tracker) Date: Fri, 28 Dec 2012 18:07:21 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20121228170721.6E7B21C9A3@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2012-12-21 - 2012-12-28) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 3834 (-10) closed 24745 (+68) total 28579 (+58) Open issues with patches: 1678 Issues opened (32) ================== #9856: Change object.__format__(s) where s is non-empty to a TypeErro http://bugs.python.org/issue9856 reopened by eric.smith #16747: Remove 'file' type reference from 'iterable' glossary entry http://bugs.python.org/issue16747 opened by zach.ware #16748: Ensure test discovery doesn't break for modules testing C and http://bugs.python.org/issue16748 opened by zach.ware #16749: Fatal Python Error http://bugs.python.org/issue16749 opened by DaVinci #16754: Incorrect shared library extension on linux http://bugs.python.org/issue16754 opened by smani #16755: Distutils2 incorrectly works with unicode package names http://bugs.python.org/issue16755 opened by hotsyk #16757: Faster _PyUnicode_FindMaxChar() http://bugs.python.org/issue16757 opened by serhiy.storchaka #16758: SubprocessStartupError http://bugs.python.org/issue16758 opened by LtCdr(ret)nazrinasir #16761: Fix int(base=X) http://bugs.python.org/issue16761 opened by serhiy.storchaka #16762: test_subprocess failure on OpenBSD/NetBSD buildbots http://bugs.python.org/issue16762 opened by neologix #16763: test_ssl with connect_ex don't handle unreachable server corre http://bugs.python.org/issue16763 opened by neologix #16764: Make zlib accept keyword-arguments http://bugs.python.org/issue16764 opened by ebfe #16767: Cannot install Python 2.7 in Wine 1.4.1 http://bugs.python.org/issue16767 opened by Joe.Borg #16769: Remove some old Visual Studio versions from PC/ directory http://bugs.python.org/issue16769 opened by brian.curtin #16772: int() accepts float number base http://bugs.python.org/issue16772 opened by serhiy.storchaka #16773: int() half-accepts UserString http://bugs.python.org/issue16773 opened by serhiy.storchaka #16774: Additional recipes for itertools docs http://bugs.python.org/issue16774 opened by kachayev #16776: Document PyCFunction_New and PyCFunction_NewEx functions http://bugs.python.org/issue16776 opened by asvetlov #16778: Logger.findCaller needs to be smarter http://bugs.python.org/issue16778 opened by 
glynnc #16781: execfile/exec execution in other than global scope uses locals http://bugs.python.org/issue16781 opened by techtonik #16782: No curses.initwin: Incorrect package docstring for curses http://bugs.python.org/issue16782 opened by ballingt #16783: sqlite3 accepts strings it cannot return http://bugs.python.org/issue16783 opened by William.D..Colburn #16784: Int tests enhancement and refactoring http://bugs.python.org/issue16784 opened by serhiy.storchaka #16785: Document the fact that constructing OSError with erron returns http://bugs.python.org/issue16785 opened by asvetlov #16786: argparse doesn't offer localization interface for "version" ac http://bugs.python.org/issue16786 opened by thorsten #16787: asyncore.dispatcher_with_send - increase the send buffer size http://bugs.python.org/issue16787 opened by neologix #16795: Patch: some changes to AST to make it more useful for static l http://bugs.python.org/issue16795 opened by scummos #16798: DTD not checked http://bugs.python.org/issue16798 opened by txomon #16799: switch regrtest from getopt options to argparse Namespace http://bugs.python.org/issue16799 opened by chris.jerdonek #16800: tempfile._get_default_tempdir() leaves files behind when HD is http://bugs.python.org/issue16800 opened by kichik #16801: Preserve original representation for integers / floats in docs http://bugs.python.org/issue16801 opened by larry #16802: fileno argument to socket.socket() undocumented http://bugs.python.org/issue16802 opened by sbt Most recent 15 issues with no replies (15) ========================================== #16802: fileno argument to socket.socket() undocumented http://bugs.python.org/issue16802 #16799: switch regrtest from getopt options to argparse Namespace http://bugs.python.org/issue16799 #16798: DTD not checked http://bugs.python.org/issue16798 #16795: Patch: some changes to AST to make it more useful for static l http://bugs.python.org/issue16795 #16786: argparse doesn't offer localization interface for "version" ac http://bugs.python.org/issue16786 #16783: sqlite3 accepts strings it cannot return http://bugs.python.org/issue16783 #16782: No curses.initwin: Incorrect package docstring for curses http://bugs.python.org/issue16782 #16776: Document PyCFunction_New and PyCFunction_NewEx functions http://bugs.python.org/issue16776 #16773: int() half-accepts UserString http://bugs.python.org/issue16773 #16763: test_ssl with connect_ex don't handle unreachable server corre http://bugs.python.org/issue16763 #16762: test_subprocess failure on OpenBSD/NetBSD buildbots http://bugs.python.org/issue16762 #16754: Incorrect shared library extension on linux http://bugs.python.org/issue16754 #16742: PyOS_Readline drops GIL and calls PyOS_StdioReadline, which is http://bugs.python.org/issue16742 #16733: Solaris ctypes_test failures http://bugs.python.org/issue16733 #16732: setup.py support for xxmodule without tkinker http://bugs.python.org/issue16732 Most recent 15 issues waiting for review (15) ============================================= #16800: tempfile._get_default_tempdir() leaves files behind when HD is http://bugs.python.org/issue16800 #16795: Patch: some changes to AST to make it more useful for static l http://bugs.python.org/issue16795 #16787: asyncore.dispatcher_with_send - increase the send buffer size http://bugs.python.org/issue16787 #16782: No curses.initwin: Incorrect package docstring for curses http://bugs.python.org/issue16782 #16774: Additional recipes for itertools docs http://bugs.python.org/issue16774 #16772: 
int() accepts float number base http://bugs.python.org/issue16772 #16764: Make zlib accept keyword-arguments http://bugs.python.org/issue16764 #16762: test_subprocess failure on OpenBSD/NetBSD buildbots http://bugs.python.org/issue16762 #16761: Fix int(base=X) http://bugs.python.org/issue16761 #16757: Faster _PyUnicode_FindMaxChar() http://bugs.python.org/issue16757 #16755: Distutils2 incorrectly works with unicode package names http://bugs.python.org/issue16755 #16747: Remove 'file' type reference from 'iterable' glossary entry http://bugs.python.org/issue16747 #16743: mmap on Windows can mishandle files larger than sys.maxsize http://bugs.python.org/issue16743 #16739: texttestresult should decorate the stream with _WritelnDecorat http://bugs.python.org/issue16739 #16732: setup.py support for xxmodule without tkinker http://bugs.python.org/issue16732 Top 10 most discussed issues (10) ================================= #16772: int() accepts float number base http://bugs.python.org/issue16772 20 msgs #16743: mmap on Windows can mishandle files larger than sys.maxsize http://bugs.python.org/issue16743 19 msgs #16761: Fix int(base=X) http://bugs.python.org/issue16761 15 msgs #14373: C implementation of functools.lru_cache http://bugs.python.org/issue14373 8 msgs #16694: Add pure Python operator module http://bugs.python.org/issue16694 8 msgs #9856: Change object.__format__(s) where s is non-empty to a TypeErro http://bugs.python.org/issue9856 7 msgs #8713: multiprocessing needs option to eschew fork() under Linux http://bugs.python.org/issue8713 6 msgs #16712: collections.abc.Sequence should not provide __reversed__ http://bugs.python.org/issue16712 6 msgs #16781: execfile/exec execution in other than global scope uses locals http://bugs.python.org/issue16781 6 msgs #16737: Different behaviours in script run directly and via runpy.run_ http://bugs.python.org/issue16737 5 msgs Issues closed (63) ================== #9022: TypeError in wsgiref.handlers when using CGIHandler http://bugs.python.org/issue9022 closed by orsenthil #10646: ntpath.samefile doesn't work for hard links http://bugs.python.org/issue10646 closed by brian.curtin #10919: Environment variables are not expanded in _winreg when using R http://bugs.python.org/issue10919 closed by brian.curtin #11939: Implement stat.st_dev and os.path.samefile on windows http://bugs.python.org/issue11939 closed by brian.curtin #12944: Accept arbitrary files for packaging's upload command http://bugs.python.org/issue12944 closed by asvetlov #13198: Remove duplicate definition of write_record_file http://bugs.python.org/issue13198 closed by asvetlov #14420: winreg SetValueEx DWord type incompatible with value argument http://bugs.python.org/issue14420 closed by brian.curtin #14470: Remove using of w9xopen in subprocess module http://bugs.python.org/issue14470 closed by brian.curtin #14574: SocketServer doesn't handle client disconnects properly http://bugs.python.org/issue14574 closed by asvetlov #14834: A list of broken links on the python.org website http://bugs.python.org/issue14834 closed by benjamin.peterson #14870: Descriptions of os.utime() and os.utimensat() use wrong notati http://bugs.python.org/issue14870 closed by hynek #15302: Use argparse instead of getopt in test.regrtest http://bugs.python.org/issue15302 closed by chris.jerdonek #15324: --fromfile, --match, and --randomize don't work in regrtest http://bugs.python.org/issue15324 closed by chris.jerdonek #15325: --fromfile does not work for regrtest http://bugs.python.org/issue15325 
closed by chris.jerdonek #15326: --random does not work for regrtest http://bugs.python.org/issue15326 closed by chris.jerdonek #15422: Get rid of PyCFunction_New macro http://bugs.python.org/issue15422 closed by asvetlov #15701: AttributeError from HTTPError when using digest auth http://bugs.python.org/issue15701 closed by orsenthil #16443: Add docstrings to regular expression match objects http://bugs.python.org/issue16443 closed by asvetlov #16496: Simplify and optimize random_seed() http://bugs.python.org/issue16496 closed by mark.dickinson #16504: IDLE - fatal error when opening a file with certain tokenizing http://bugs.python.org/issue16504 closed by serhiy.storchaka #16511: IDLE configuration file: blank height and width fields trip up http://bugs.python.org/issue16511 closed by asvetlov #16581: define "PEP editor" in PEP 1 http://bugs.python.org/issue16581 closed by ncoghlan #16618: Different glob() results for strings and bytes http://bugs.python.org/issue16618 closed by hynek #16644: Wrong code in ContextManagerTests.test_invalid_args() in test_ http://bugs.python.org/issue16644 closed by asvetlov #16650: Popen._internal_poll() references errno.ECHILD outside of the http://bugs.python.org/issue16650 closed by asvetlov #16666: docs wrongly imply socket.getaddrinfo takes keyword arguments http://bugs.python.org/issue16666 closed by ezio.melotti #16672: improve tracing performances when f_trace is NULL http://bugs.python.org/issue16672 closed by python-dev #16677: Hard to find operator precedence in Lang Ref. http://bugs.python.org/issue16677 closed by ezio.melotti #16682: Document that audioop works with bytes, not strings http://bugs.python.org/issue16682 closed by serhiy.storchaka #16689: stdout stderr redirection mess http://bugs.python.org/issue16689 closed by neologix #16702: Force urllib2_localnet test not to use http proxies http://bugs.python.org/issue16702 closed by orsenthil #16713: "tel" URIs should support params http://bugs.python.org/issue16713 closed by orsenthil #16715: Get rid of IOError. Use OSError instead http://bugs.python.org/issue16715 closed by asvetlov #16720: Get rid of os.error. 
Use OSError instead http://bugs.python.org/issue16720 closed by asvetlov #16744: sys.path.append causes wrong behaviour http://bugs.python.org/issue16744 closed by christian.heimes #16745: Hide symbols in _decimal.so http://bugs.python.org/issue16745 closed by skrah #16746: clarify what should be sent to peps@ http://bugs.python.org/issue16746 closed by chris.jerdonek #16750: Python Code module implements uncomputable function http://bugs.python.org/issue16750 closed by mark.dickinson #16751: Using modern unittest asserts in the documentation http://bugs.python.org/issue16751 closed by rhettinger #16752: Missing import in modulefinder.py http://bugs.python.org/issue16752 closed by brett.cannon #16753: #include broken on FreeBSD 9.1-RELEASE http://bugs.python.org/issue16753 closed by skrah #16756: buggy assignment to items of a list created by a * operator http://bugs.python.org/issue16756 closed by christian.heimes #16759: winreg.QueryValueEx returns signed 32bit value instead of unsi http://bugs.python.org/issue16759 closed by brian.curtin #16760: Get rid of MatchObject in regex HOWTO http://bugs.python.org/issue16760 closed by ezio.melotti #16765: Superfluous import in cgi module http://bugs.python.org/issue16765 closed by ezio.melotti #16766: small disadvantage of htmlentitydefs http://bugs.python.org/issue16766 closed by ezio.melotti #16768: CTRL-Y, yank, behaves as CTRL-Z with curses on OS X http://bugs.python.org/issue16768 closed by ned.deily #16770: Selection in IDLE often skips first character http://bugs.python.org/issue16770 closed by ned.deily #16771: issuse http://bugs.python.org/issue16771 closed by benjamin.peterson #16775: Add test coverage for os.removedirs() http://bugs.python.org/issue16775 closed by asvetlov #16777: "Evaluation order" doc section is wrong about dicts http://bugs.python.org/issue16777 closed by ezio.melotti #16779: Fix compiler warning when building extension modules on 64-bit http://bugs.python.org/issue16779 closed by skrah #16780: fail to compile python in msys with mingw http://bugs.python.org/issue16780 closed by r.david.murray #16788: Add samestat to Lib/ntpath.py __all__ http://bugs.python.org/issue16788 closed by brian.curtin #16789: :meth:`quit` links to constants instead of own module http://bugs.python.org/issue16789 closed by python-dev #16790: provide ability to share tests between int and long tests http://bugs.python.org/issue16790 closed by chris.jerdonek #16791: itertools.chain.from_iterable doesn't stop http://bugs.python.org/issue16791 closed by rhettinger #16792: Mark small ints test as CPython-only http://bugs.python.org/issue16792 closed by serhiy.storchaka #16793: Get rid of deprecated assertEquals etc in tests http://bugs.python.org/issue16793 closed by serhiy.storchaka #16794: Can't get a list of modules in Python's help system http://bugs.python.org/issue16794 closed by r.david.murray #16796: Fix argparse docs typo: "an special action" to "a special acti http://bugs.python.org/issue16796 closed by ezio.melotti #16797: sporadic test_faulthandler failure http://bugs.python.org/issue16797 closed by ezio.melotti #879399: socket line buffering http://bugs.python.org/issue879399 closed by asvetlov From regebro at gmail.com Fri Dec 28 19:02:36 2012 From: regebro at gmail.com (Lennart Regebro) Date: Fri, 28 Dec 2012 19:02:36 +0100 Subject: [Python-Dev] Draft PEP for time zone support. 
In-Reply-To: <20121220114315.554a52ac@resist.wooz.org>
References: <50C8541E.60406@python.org> <7wsj71vamk.fsf@benfinney.id.au> <20121220114315.554a52ac@resist.wooz.org>
Message-ID:

On Thu, Dec 20, 2012 at 5:43 PM, Barry Warsaw wrote:

>
> That would be `class UnknownTimeZoneError(ValueError, TimeZoneError)`.
>

As of today, in Pytz, UnknownTimeZoneError in fact subclasses KeyError.
Any opinions against that? There is no TimeZoneError today, and it
would only be used for this UnknownTimeZoneError, so I'm not sure it
has much value.

//Lennart
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From barry at python.org Fri Dec 28 21:04:13 2012
From: barry at python.org (Barry Warsaw)
Date: Fri, 28 Dec 2012 15:04:13 -0500
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To:
References: <50C8541E.60406@python.org> <7wsj71vamk.fsf@benfinney.id.au> <20121220114315.554a52ac@resist.wooz.org>
Message-ID: <20121228150413.5566fdbd@anarchist>

On Dec 28, 2012, at 07:02 PM, Lennart Regebro wrote:

>On Thu, Dec 20, 2012 at 5:43 PM, Barry Warsaw wrote:
>
>>
>> That would be `class UnknownTimeZoneError(ValueError, TimeZoneError)`.
>>
>
>As of today, in Pytz, UnknownTimeZoneError in fact subclasses KeyError. Any
>opinions against that? There is no TimeZoneError today, and it would only be
>used for this UnknownTimeZoneError, so I'm not sure it has much value.

Agreed. If this is the only exception defined in the module, it sounds
fine to me.

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:

From tjreedy at udel.edu Fri Dec 28 21:45:37 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 28 Dec 2012 15:45:37 -0500
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To:
References: <50C8541E.60406@python.org> <7wsj71vamk.fsf@benfinney.id.au> <20121220114315.554a52ac@resist.wooz.org>
Message-ID:

On 12/28/2012 1:02 PM, Lennart Regebro wrote:
> On Thu, Dec 20, 2012 at 5:43 PM, Barry Warsaw
> wrote:
>
>     That would be `class UnknownTimeZoneError(ValueError, TimeZoneError)`.
>
> As of today, in Pytz, UnknownTimeZoneError in fact subclasses KeyError.
> Any opinions against that?

Since the erroneous value is used as a key for a database lookup, and
the error is probably detected by trying the lookup, I think that is
OK, even if the user does not use []s.

> There is no TimeZoneError today, and it would only be used for this
> UnknownTimeZoneError, so I'm not sure it has much value.

--
Terry Jan Reedy

From regebro at gmail.com Fri Dec 28 21:23:46 2012
From: regebro at gmail.com (Lennart Regebro)
Date: Fri, 28 Dec 2012 21:23:46 +0100
Subject: [Python-Dev] PEP 431 Time zone support improvements - Update
Message-ID:

Happy Holidays! Here is the update of PEP 431 with the changes that
emerged after the earlier discussion.

A raw download is here:
https://raw.github.com/regebro/tz-pep/master/pep-04tz.txt

PEP: 431
Title: Time zone support improvements
Version: $Revision$
Last-Modified: $Date$
Author: Lennart Regebro
BDFL-Delegate: Barry Warsaw
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 11-Dec-2012
Post-History: 11-Dec-2012, 28-Dec-2012

Abstract
========

This PEP proposes the implementation of concrete time zone support in
the Python standard library, and also improvements to the time zone
API to deal with ambiguous time specifications during DST changes.
Proposal
========

Concrete time zone support
--------------------------

The time zone support in Python has no concrete implementation in the
standard library outside of a tzinfo baseclass that supports fixed
offsets. To properly support time zones you need to include a database
of all time zones, both current and historical, including daylight
saving changes. But such information changes frequently, so even if we
include the latest information in a Python release, that information
would be outdated just a few months later.

Time zone support has therefore only been available through two
third-party modules, ``pytz`` and ``dateutil``, both of which include
and wrap the "zoneinfo" database. This database, also called "tz" or
"the Olson database", is the de facto standard database of time zones,
and it is included in most Unix and Unix-like operating systems,
including OS X.

This gives us the opportunity to include the code that supports the
zoneinfo data in the standard library, but by default use the
operating system's copy of the data, which typically will be kept
updated by the updating mechanism of the operating system or
distribution.

For those who have an operating system that does not include the
zoneinfo database, for example Windows, the Python source distribution
will include a copy of the zoneinfo database, and a distribution
containing the latest zoneinfo database will also be available at the
Python Package Index, so it can be easily installed with the Python
packaging tools such as ``easy_install`` or ``pip``. This could also
be done on Unices that are no longer receiving updates and therefore
have an outdated database.

With such a mechanism Python would have full time zone support in the
standard library on any platform, and a simple package installation
would provide an updated time zone database on those platforms where
the zoneinfo database isn't included, such as Windows, or on platforms
where OS updates are no longer provided.

The time zone support will be implemented by making the ``datetime``
module into a package, and creating a new submodule called
``timezone``, based on Stuart Bishop's ``pytz`` module.

Getting the local time zone
---------------------------

On Unix there is no standard way of finding the name of the time zone
that is being used. All the information that is available is the time
zone abbreviations, such as ``EST`` and ``PDT``, but many of those
abbreviations are ambiguous and therefore you can't rely on them to
figure out which time zone you are located in. There is however a
standard for finding the compiled time zone information, since it's
located in ``/etc/localtime``. Therefore it is possible to create a
local time zone object with the correct time zone information even
though you don't know the name of the time zone, as sketched below. A
function in ``datetime`` should be provided to return the local time
zone. The support for this will be made by integrating Lennart
Regebro's ``tzlocal`` module into the new ``timezone`` module.

For Windows it will look up the local Windows time zone name, and use
a mapping between Windows time zone names and zoneinfo time zone names
provided by the Unicode consortium to convert that to a zoneinfo time
zone. The mapping should be updated before each major or bugfix
release; scripts for doing so will be provided in the ``Tools/``
directory.
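A sketch of the Unix part of this lookup, as it can already be done
today with ``pytz`` internals (``build_tzinfo`` is a ``pytz`` helper,
not part of the proposed API)::

    from pytz import tzfile

    def get_localzone():
        # /etc/localtime holds compiled zoneinfo data for the local
        # zone, even though it does not reveal the zone's name.
        with open('/etc/localtime', 'rb') as f:
            return tzfile.build_tzinfo('local', f)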
Ambiguous times
---------------

When changing over from daylight savings time, the clock is turned
back one hour. This means that the times during that hour happen
twice, once without DST and then once with DST. Similarly, when
changing to daylight savings time, one hour goes missing. The current
time zone API cannot differentiate between the two ambiguous times
during a change from DST. For example, in Stockholm the time of
2012-10-28 02:00:00 happens twice, both at UTC 2012-10-28 00:00:00
and also at 2012-10-28 01:00:00.

The current time zone API cannot disambiguate this and therefore it's
unclear which time should be returned::

    # This could be either 00:00 or 01:00 UTC:
    >>> dt = datetime(2012, 10, 28, 2, 0, tzinfo=timezone('Europe/Stockholm'))
    # But we can not specify which:
    >>> dt.astimezone(timezone('UTC'))
    datetime.datetime(2012, 10, 28, 1, 0, tzinfo=<UTC>)

``pytz`` solved this problem by adding ``is_dst`` parameters to
several methods of the tzinfo objects to make it possible to
disambiguate times when this is desired.

This PEP proposes to add these ``is_dst`` parameters to the relevant
methods of the ``datetime`` API, and therefore add this functionality
directly to ``datetime``. This is likely the hardest part of this PEP,
as it involves updating the C version of the ``datetime`` library with
this functionality; this means writing new code, and not just
reorganizing existing external libraries.

Implementation API
==================

The zoneinfo database
---------------------

The latest version of the zoneinfo database should exist in the
``Lib/tzdata`` directory of the Python source control system. This
copy of the database should be updated before every Python feature
and bug-fix release, but not for releases of Python versions that are
in security-fix-only mode. Scripts to update the database will be
provided in ``Tools/``, and the release instructions will be updated
to include this update.

The new ``datetime.timezone``-module
------------------------------------

The public API of the new ``timezone``-module contains one new class,
one new function and one new exception.

* New class: ``DstTzInfo``

  This class provides a concrete implementation of the ``tzinfo``
  base class that implements DST support.

* New function: ``timezone(name=None, db_path=None)``

  This function takes a name string specifying a valid zoneinfo time
  zone, e.g. "US/Eastern", "Europe/Warsaw" or "Etc/GMT+11". If not
  given, the local time zone will be looked up. If an invalid zone
  name is given, or the local time zone cannot be retrieved, the
  function raises ``UnknownTimeZoneError``.

  The function also takes an optional path to the location of the
  zoneinfo database which should be used. If not specified, the
  function will look for databases in the following order:

  1. Use the database in ``/usr/share/zoneinfo``, if it exists.

  2. Check if the ``tzdata-update`` module is installed, and then use
     that database.

  3. Check the Python-provided database in ``Lib/tzdata``.

  If no database is found an ``UnknownTimeZoneError`` or subclass
  thereof will be raised with a message explaining that no zoneinfo
  database can be found, but that you can install one with the
  ``tzdata-update`` package.

* New Exception: ``UnknownTimeZoneError``

  This exception is a subclass of KeyError and raised when giving a
  time zone specification that can't be found::

    >>> timezone('Europe/New_York')
    Traceback (most recent call last):
    ...
    UnknownTimeZoneError: There is no time zone called 'Europe/New_York'
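A short sketch of how client code would use this lookup and handle the
new exception (names as proposed above)::

    try:
        tz = timezone('Mars/Olympus_Mons')
    except UnknownTimeZoneError:
        # Fall back to UTC when the requested zone is unknown.
        tz = timezone('UTC')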
Changes in the ``datetime``-module
----------------------------------

A new ``is_dst`` parameter is added to several of the ``tzinfo``
methods to handle time ambiguity during DST changeovers.

* ``tzinfo.utcoffset(self, dt, is_dst=False)``

* ``tzinfo.dst(self, dt, is_dst=False)``

* ``tzinfo.tzname(self, dt, is_dst=False)``

The ``is_dst`` parameter can be ``False`` (default), ``True``, or
``None``.

``False`` will specify that the given datetime should be interpreted
as not happening during daylight savings time, i.e. that the time
specified is after the change from DST.

``True`` will specify that the given datetime should be interpreted
as happening during daylight savings time, i.e. that the time
specified is before the change from DST.

``None`` will raise an ``AmbiguousTimeError`` exception if the time
specified was during a DST changeover. It will also raise a
``NonExistentTimeError`` if a time is specified during the "missing
time" in a change to DST.

There are also three new exceptions:

* ``InvalidTimeError``

  This exception serves as a base for ``AmbiguousTimeError`` and
  ``NonExistentTimeError``, to enable you to trap these two
  separately. It will subclass from ValueError, so that you can catch
  these errors together with inputs like the 29th of February 2011.

* ``AmbiguousTimeError``

  This exception is raised when giving a datetime specification that
  is ambiguous while setting ``is_dst`` to ``None``::

    >>> datetime(2012, 10, 28, 2, 0, tzinfo=timezone('Europe/Stockholm'), is_dst=None)
    Traceback (most recent call last):
    ...
    AmbiguousTimeError: 2012-10-28 02:00:00 is ambiguous in time zone Europe/Stockholm

* ``NonExistentTimeError``

  This exception is raised when giving a datetime specification that
  does not exist while setting ``is_dst`` to ``None``::

    >>> datetime(2012, 3, 25, 2, 0, tzinfo=timezone('Europe/Stockholm'), is_dst=None)
    Traceback (most recent call last):
    ...
    NonExistentTimeError: 2012-03-25 02:00:00 does not exist in time zone Europe/Stockholm

The ``tzdata-update``-package
-----------------------------

The zoneinfo database will be packaged for easy installation with
``easy_install``/``pip``/``buildout``. This package will not install
any Python code, and will not contain any Python code except that
which is needed for installation.

Differences from the ``pytz`` API
=================================

* ``pytz`` has the functions ``localize()`` and ``normalize()`` to
  work around the fact that ``tzinfo`` doesn't have ``is_dst``. When
  ``is_dst`` is implemented directly in ``datetime.tzinfo`` they are
  no longer needed.

* ``timezone()`` will return the local time zone if called without
  parameters.

* The class ``pytz.StaticTzInfo`` is there to provide the ``is_dst``
  support for static timezones. When ``is_dst`` support is included
  in ``datetime.tzinfo`` it is no longer needed.

* ``InvalidTimeError`` subclasses from ``ValueError``.

Discussion
==========

Should the Windows installer include the data package?
------------------------------------------------------

It has been suggested that the Windows installer should include the
data package. This would mean that an explicit installation would no
longer be needed on Windows. On the other hand, that would mean that
many Windows users would not be aware that the database quickly
becomes outdated and would not keep it updated.
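Example
=======

A sketch of how the proposed API fits together, using the
``timezone()`` function and the ``is_dst`` parameter described above;
the reprs shown are illustrative rather than final::

    >>> tz = timezone('Europe/Stockholm')

    # The ambiguous hour at the end of DST can now be disambiguated:
    >>> dt = datetime(2012, 10, 28, 2, 0, tzinfo=tz, is_dst=True)
    >>> dt.astimezone(timezone('UTC'))
    datetime.datetime(2012, 10, 28, 0, 0, tzinfo=<UTC>)

    >>> dt = datetime(2012, 10, 28, 2, 0, tzinfo=tz, is_dst=False)
    >>> dt.astimezone(timezone('UTC'))
    datetime.datetime(2012, 10, 28, 1, 0, tzinfo=<UTC>)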
Resources
=========

* http://pytz.sourceforge.net/
* http://pypi.python.org/pypi/tzlocal
* http://pypi.python.org/pypi/python-dateutil
* http://unicode.org/cldr/data/common/supplemental/windowsZones.xml

Copyright
=========

This document has been placed in the public domain.

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From ronaldoussoren at mac.com Fri Dec 28 22:12:38 2012
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Fri, 28 Dec 2012 22:12:38 +0100
Subject: [Python-Dev] PEP 431 Time zone support improvements - Update
In-Reply-To: References: Message-ID:

On 28 Dec, 2012, at 21:23, Lennart Regebro wrote: > Happy Holidays! Here is the update of PEP 431 with the changes that emerged after the earlier discussion.

Why is the new timezone support added in a submodule of datetime? Adding the new function and exception to datetime itself wouldn't clutter the API that much, and datetime already contains some timezone support (datetime.tzinfo).

Ronald

From solipsis at pitrou.net Fri Dec 28 22:59:50 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 28 Dec 2012 22:59:50 +0100
Subject: [Python-Dev] peps: PEP 432 updates in response to initial comments
References: <3YXwF72ym4zP4V@mail.python.org> <50DDF90F.3040008@udel.edu>
Message-ID: <20121228225950.053a9485@pitrou.net>

On Fri, 28 Dec 2012 14:54:55 -0500 Terry Reedy wrote: > > 4. Running ./python_d from within a PCBuild/python_d interactive window > and on a regular disk averages .10 seconds. The slowdown is probably a > mixture of disk access and extra debug code, but is not bad. There is no > flashing (probably because there already is a window, whereas on Windows > IDLE runs code within a windowless pythonw process) and ^C works. This > is definitely a better environment for this type of test ;-).

You'd get more meaningful numbers by using a non-debug build (PCBuild/python.exe, I guess). Our debugging additions + the lack of compiler optimizations butcher performance. It would be extra nice if you had numbers comparing 3.3, 3.2 and 2.7 (under Windows, that is).

Regards Antoine.

From barry at barrys-emacs.org Fri Dec 28 22:39:56 2012
From: barry at barrys-emacs.org (Barry Scott)
Date: Fri, 28 Dec 2012 21:39:56 +0000
Subject: [Python-Dev] PYTHONPATH processing change from 2.6 to 2.7 and Mac bundle builder problems
Message-ID:

I'm trying to track down why bundlebuilder no longer works with python 2.7 to create runnable Mac OS X apps.

I have got as far as seeing that imports of modules are failing.

What I see is that sys.path does not contain all the elements from the PYTHONPATH variable.

No matter what I put in PYTHONPATH only the first element is in sys.path.
In detail here is what I'm using to test this: $ export PYTHONHOME=/Users/barry/wc/svn/pysvn/WorkBench/Kit/MacOSX/tmp/pysvn_workbench_svn178-1.6.6-0-x86_64/WorkBench.app/Contents/Resources $ export PYTHONPATH=/Users/barry/wc/svn/pysvn/WorkBench/Kit/MacOSX/tmp/pysvn_workbench_svn178-1.6.6-0-x86_64/WorkBench.app/Contents/Resources:/Users/barry/wc/svn/pysvn/WorkBench/Kit/MacOSX/tmp/pysvn_workbench_svn178-1.6.6-0-x86_64/WorkBench.app/Contents/Resources/Modules.zip $ python2.7 -c "import sys;print sys.path" ['', '/Users/barry/wc/svn/pysvn/WorkBench/Kit/MacOSX/tmp/pysvn_workbench_svn178-1.6.6-0-x86_64/WorkBench.app/Contents/Resources'] $ python2.6 -c "import sys;print sys.path" 'import site' failed; use -v for traceback ['', '/Users/barry/wc/svn/pysvn/WorkBench/Kit/MacOSX/tmp/pysvn_workbench_svn178-1.6.6-0-x86_64/WorkBench.app/Contents/Resources', '/Users/barry/wc/svn/pysvn/WorkBench/Kit/MacOSX/tmp/pysvn_workbench_svn178-1.6.6-0-x86_64/WorkBench.app/Contents/Resources/Modules.zip', '/Users/barry/wc/svn/pysvn/WorkBench/Kit/MacOSX/tmp/pysvn_workbench_svn178-1.6.6-0-x86_64/WorkBench.app/Contents/Resources/lib/python26.zip', '/Users/barry/wc/svn/pysvn/WorkBench/Kit/MacOSX/tmp/pysvn_workbench_svn178-1.6.6-0-x86_64/WorkBench.app/Contents/Resources/lib/python2.6/', '/Users/barry/wc/svn/pysvn/WorkBench/Kit/MacOSX/tmp/pysvn_workbench_svn178-1.6.6-0-x86_64/WorkBench.app/Contents/Resources/lib/python2.6/plat-darwin', '/Users/barry/wc/svn/pysvn/WorkBench/Kit/MacOSX/tmp/pysvn_workbench_svn178-1.6.6-0-x86_64/WorkBench.app/Contents/Resources/lib/python2.6/plat-mac', '/Users/barry/wc/svn/pysvn/WorkBench/Kit/MacOSX/tmp/pysvn_workbench_svn178-1.6.6-0-x86_64/WorkBench.app/Contents/Resources/lib/python2.6/plat-mac/lib-scriptpackages', '/Users/barry/wc/svn/pysvn/WorkBench/Kit/MacOSX/tmp/pysvn_workbench_svn178-1.6.6-0-x86_64/WorkBench.app/Contents/Resources/lib/python2.6/lib-tk', '/Users/barry/wc/svn/pysvn/WorkBench/Kit/MacOSX/tmp/pysvn_workbench_svn178-1.6.6-0-x86_64/WorkBench.app/Contents/Resources/lib/python2.6/lib-old', '/Users/barry/wc/svn/pysvn/WorkBench/Kit/MacOSX/tmp/pysvn_workbench_svn178-1.6.6-0-x86_64/WorkBench.app/Contents/Resources/lib/python2.6/lib-dynload'] Any insight into what has changed and what might need changing in bundlebuilder would be appreciated. Barry From solipsis at pitrou.net Fri Dec 28 23:30:16 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 28 Dec 2012 23:30:16 +0100 Subject: [Python-Dev] PYTHONPATH processing change from 2.6 to 2.7 and Mac bundle builder problems References: Message-ID: <20121228233016.377aeffb@pitrou.net> On Fri, 28 Dec 2012 21:39:56 +0000 Barry Scott wrote: > I'm trying to track down why bundlebuilder no longer works with python 2.7 > to create runnable Mac OS X apps. > > I have got as far as seeing that imports of modules are failing. > > What I see is that sys.path does not contain all the elements from the > PYTHONPATH variable. > > No matter what I put in PYTHONPATH only the first element is in sys.path. I can't reproduce under Linux: $ PYTHONPATH=/x:/y python -Sc "import sys; print(sys.path)" ['', '/x', '/y', '/usr/lib/python27.zip', '/usr/lib64/python2.7/', '/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk', '/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload'] Regards Antoine. 
From doug.hellmann at gmail.com Fri Dec 28 23:59:05 2012
From: doug.hellmann at gmail.com (Doug Hellmann)
Date: Fri, 28 Dec 2012 17:59:05 -0500
Subject: [Python-Dev] question about packaging
Message-ID: <480CF8A8-0461-4C20-8A3C-2944C883E78B@gmail.com>

A couple of us from the OpenStack project are interested in getting involved in the packaging rewrite/update project. I was following that work for a while, but have lost track of its current state. Can someone point me to the right mailing list, and maybe a status page or something so I can start figuring out where we might be able to help?

Thanks,
Doug

From barry at barrys-emacs.org Sat Dec 29 00:10:11 2012
From: barry at barrys-emacs.org (Barry Scott)
Date: Fri, 28 Dec 2012 23:10:11 +0000
Subject: [Python-Dev] PYTHONPATH processing change from 2.6 to 2.7 and Mac bundle builder problems
In-Reply-To: <20121228233016.377aeffb@pitrou.net>
References: <20121228233016.377aeffb@pitrou.net>
Message-ID: <9E6E3321-B0E7-4E77-AFCB-9C78556499EF@barrys-emacs.org>

You did not set PYTHONHOME, which affects the code in calculate_path a lot. Also there is platform-specific code in that code.

Barry

On 28 Dec 2012, at 22:30, Antoine Pitrou wrote: > On Fri, 28 Dec 2012 21:39:56 +0000 > Barry Scott wrote: >> I'm trying to track down why bundlebuilder no longer works with python 2.7 >> to create runnable Mac OS X apps. >> >> I have got as far as seeing that imports of modules are failing. >> >> What I see is that sys.path does not contain all the elements from the >> PYTHONPATH variable. >> >> No matter what I put in PYTHONPATH only the first element is in sys.path. > > I can't reproduce under Linux: > > $ PYTHONPATH=/x:/y python -Sc "import sys; print(sys.path)" > ['', '/x', '/y', '/usr/lib/python27.zip', '/usr/lib64/python2.7/', > '/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk', > '/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload'] > > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/barry%40barrys-emacs.org >

From a.cavallo at cavallinux.eu Sat Dec 29 00:12:34 2012
From: a.cavallo at cavallinux.eu (Antonio Cavallo)
Date: Fri, 28 Dec 2012 23:12:34 +0000
Subject: [Python-Dev] question about packaging
In-Reply-To: <480CF8A8-0461-4C20-8A3C-2944C883E78B@gmail.com>
References: <480CF8A8-0461-4C20-8A3C-2944C883E78B@gmail.com>
Message-ID: <50DE2762.90509@cavallinux.eu>

There's the distutils mailing list:

http://mail.python.org/mailman/listinfo/distutils-sig

Doug Hellmann wrote: > A couple of us from the OpenStack project are interested in getting involved in the packaging rewrite/update project. I was following that work for a while, but have lost track of its current state. Can someone point me to the right mailing list, and maybe a status page or something so I can start figuring out where we might be able to help?
> > Thanks, > Doug > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/a.cavallo%40cavallinux.eu From nad at acm.org Sat Dec 29 00:57:33 2012 From: nad at acm.org (Ned Deily) Date: Fri, 28 Dec 2012 15:57:33 -0800 Subject: [Python-Dev] PYTHONPATH processing change from 2.6 to 2.7 and Mac bundle builder problems References: <20121228233016.377aeffb@pitrou.net> <9E6E3321-B0E7-4E77-AFCB-9C78556499EF@barrys-emacs.org> Message-ID: In article <9E6E3321-B0E7-4E77-AFCB-9C78556499EF at barrys-emacs.org>, Barry Scott wrote: > You did not set PYTHONHOME that effects the code in calculate_path a lot. > Also there is platform specific code in tht code. > On 28 Dec 2012, at 22:30, Antoine Pitrou wrote: > > On Fri, 28 Dec 2012 21:39:56 +0000 > > Barry Scott wrote: > >> I'm trying to track down why bundlebuilder no longer works with python 2.7 > >> to create runnable Mac OS X apps. > >> > >> I have got as far as seeing that imports of modules are failing. > >> > >> What I see is that sys.path does not contain all the elements from the > >> PYTHONPATH variable. > >> > >> No matter what I put in PYTHONPATH only the first element is in sys.path. > > > > I can't reproduce under Linux: > > > > $ PYTHONPATH=/x:/y python -Sc "import sys; print(sys.path)" > > ['', '/x', '/y', '/usr/lib/python27.zip', '/usr/lib64/python2.7/', > > '/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk', > > '/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload'] Barry, I think this discussion should be taking place on the bug tracker (http://bugs.python.org), rather than in python-dev. bundlebuilder is unique to OS X and fairly esoteric. Please open an issue there and include a sample of how you created an app with bundlebuilder and what Python 2.7 version you are using and what version of OS X. Thanks! -- Ned Deily, nad at acm.org From steve at pearwood.info Sat Dec 29 02:23:00 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 29 Dec 2012 12:23:00 +1100 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <7wsj71vamk.fsf@benfinney.id.au> <20121220114315.554a52ac@resist.wooz.org> Message-ID: <50DE45F4.5060108@pearwood.info> On 29/12/12 05:02, Lennart Regebro wrote: > On Thu, Dec 20, 2012 at 5:43 PM, Barry Warsaw wrote: > >> >> That would be `class UnknownTimeZoneError(ValueError, TimeZoneError)`. >> > > As of today, in Pytz, UnknownTimeZoneError in fact subclasses KeyError. > Any opinions against that? The PEP says: * New function :``timezone(name=None, db_path=None)`` This function takes a name string that must be a string specifying a valid zoneinfo timezone, ie "US/Eastern", "Europe/Warsaw" or "Etc/GMT+11". It isn't 100% clear to me from the PEP what a valid name string would be, but I assume that it will accept anything that the time.tzset function will accept: http://docs.python.org/3/library/time.html#time.tzset If so, then valid "name strings" may be either: - strings which define the timezone rule explicitly, e.g: 'AEST-10AEDT-11,M10.5.0,M3.5.0' - or for convenience, rules already defined in your OS's timezone database: 'Australia/Melbourne' In either case, I don't think KeyError is the appropriate exception type. 
I think that if I were to see a time zone string such as:

'Europe/Melbourne' # no such place
'Eorupe/Stockhome' # misspelled
'Etc/GMT+999' # invalid offset
'AEST+10ASDT+11,M1050,M350' # invalid starting and ending dates
'*&vbegs156s^g' # utter rubbish

I would describe it as an *invalid* timezone, not a "missing" timezone. So ValueError is a more appropriate base exception than KeyError.

> There is no TimeZoneError today, and it would only be used for this > UnknownTimeZoneError, so I'm not sure it has much value.

In that case, can you rename UnknownTimeZoneError to TimeZoneError, which is shorter and easier to read, write and remember? (We have KeyError rather than UnknownKeyError, NameError rather than UnknownNameError, etc.)

-- Steven

From regebro at gmail.com Sat Dec 29 05:40:54 2012
From: regebro at gmail.com (Lennart Regebro)
Date: Sat, 29 Dec 2012 05:40:54 +0100
Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: <50DE45F4.5060108@pearwood.info>
References: <50C8541E.60406@python.org> <7wsj71vamk.fsf@benfinney.id.au> <20121220114315.554a52ac@resist.wooz.org> <50DE45F4.5060108@pearwood.info>
Message-ID:

On Sat, Dec 29, 2012 at 2:23 AM, Steven D'Aprano wrote: > The PEP says: > > * New function :``timezone(name=None, db_path=None)`` > > > This function takes a name string that must be a string specifying a > valid zoneinfo timezone, ie "US/Eastern", "Europe/Warsaw" or > "Etc/GMT+11". > > > It isn't 100% clear to me from the PEP what a valid name string would be, > but I assume that it will accept anything that the time.tzset function > will accept: >

No, valid names are the names of time zones in the zoneinfo database. There isn't really any use case for defining your own rules, as that would mean that you want a time zone that doesn't exist, which seems a bit pointless. :-)

(We have KeyError rather than UnknownKeyError, NameError rather than > UnknownNameError, etc.)

Sure, but what would a KeyError otherwise be, unless an unknown or non-existing key?

//Lennart

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From regebro at gmail.com Sat Dec 29 05:48:01 2012
From: regebro at gmail.com (Lennart Regebro)
Date: Sat, 29 Dec 2012 05:48:01 +0100
Subject: [Python-Dev] PEP 431 Time zone support improvements - Update
In-Reply-To: References: Message-ID:

On Fri, Dec 28, 2012 at 10:12 PM, Ronald Oussoren wrote: > > On 28 Dec, 2012, at 21:23, Lennart Regebro wrote: > > > Happy Holidays! Here is the update of PEP 431 with the changes that > emerged after the earlier discussion. > > Why is the new timezone support added in a submodule of datetime?

Because several people wanted it that way and nobody objected.

> Adding the new > function and exception to datetime itself wouldn't clutter the API that > much

It will make the datetime.py twice as long though, and the second longest module in the stdlib, beaten only by decimal.py. Perhaps this is not a problem.

//Lennart

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From benjamin at python.org Sat Dec 29 06:26:41 2012
From: benjamin at python.org (Benjamin Peterson)
Date: Fri, 28 Dec 2012 23:26:41 -0600
Subject: [Python-Dev] PEP 431 Time zone support improvements - Update
In-Reply-To: References: Message-ID:

2012/12/28 Lennart Regebro : > It will make the datetime.py twice as long though, and the second longest > module in the stdlib, beaten only by decimal.py. Perhaps this is not a > problem.

No one ever accused datetime manipulation of being simple.
-- Regards, Benjamin From ncoghlan at gmail.com Sat Dec 29 07:00:39 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 29 Dec 2012 16:00:39 +1000 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: Message-ID: On Sat, Dec 29, 2012 at 6:23 AM, Lennart Regebro wrote: > Happy Holidays! Here is the update of PEP 431 with the changes that emerged > after the earlier discussion. > > A raw download is here: > https://raw.github.com/regebro/tz-pep/master/pep-04tz.txt For UI purposes, "pytz" has some helpers to get lists of timezone names (all, common and country specific): http://pytz.sourceforge.net/#helpers Is there a specific reason you chose to exclude those from the PEP? > Discussion > ========== > > Should the windows installer include the data package? > ------------------------------------------------------ > > It has been suggested that the Windows installer should include the data > package. This would mean that an explicit installation no longer would be > needed on Windows. On the other hand, that would mean that many using > Windows > would not be aware that the database quickly becomes outdated and would not > keep it updated. I'm still a fan of *always* shipping fallback tzdata, regardless of platform. The stdlib would then look in three places for timezone data when datetime.timezone was first imported: 1. the "tzdata-update" database 2. the OS provided database 3. the fallback database Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From doug.hellmann at gmail.com Sat Dec 29 00:38:01 2012 From: doug.hellmann at gmail.com (Doug Hellmann) Date: Fri, 28 Dec 2012 18:38:01 -0500 Subject: [Python-Dev] question about packaging In-Reply-To: <50DE2762.90509@cavallinux.eu> References: <480CF8A8-0461-4C20-8A3C-2944C883E78B@gmail.com> <50DE2762.90509@cavallinux.eu> Message-ID: <4FB57362-504A-41E6-AB94-5C8EE8C55858@gmail.com> Is that where the discussions are actively happening? Last time I looked at it I thought there was a google group or something. It's bee quite a while, though, so I may just be confused. On Dec 28, 2012, at 6:12 PM, Antonio Cavallo wrote: > There's the distutil mailing list: > > http://mail.python.org/mailman/listinfo/distutils-sig > > > > Doug Hellmann wrote: >> A couple of us from the OpenStack project are interested in getting involved in the packaging rewrite/update project. I was following that work for a while, but have lost track of its current state. Can someone point me to the right mailing list, and maybe a status page or something so I can start figuring out where we might be able to help? >> >> Thanks, >> Doug >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: http://mail.python.org/mailman/options/python-dev/a.cavallo%40cavallinux.eu From solipsis at pitrou.net Sat Dec 29 11:45:45 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 29 Dec 2012 11:45:45 +0100 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update References: Message-ID: <20121229114545.27f2432c@pitrou.net> On Sat, 29 Dec 2012 16:00:39 +1000 Nick Coghlan wrote: > > > Discussion > > ========== > > > > Should the windows installer include the data package? > > ------------------------------------------------------ > > > > It has been suggested that the Windows installer should include the data > > package. 
This would mean that an explicit installation no longer would be > > needed on Windows. On the other hand, that would mean that many using > > Windows > > would not be aware that the database quickly becomes outdated and would not > > keep it updated. > > I'm still a fan of *always* shipping fallback tzdata, regardless of > platform. The stdlib would then look in three places for timezone data > when datetime.timezone was first imported: > > 1. the "tzdata-update" database > 2. the OS provided database > 3. the fallback database

+1 !

Regards Antoine.

From solipsis at pitrou.net Sat Dec 29 11:47:39 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 29 Dec 2012 11:47:39 +0100
Subject: [Python-Dev] question about packaging
References: <480CF8A8-0461-4C20-8A3C-2944C883E78B@gmail.com>
Message-ID: <20121229114739.54ef8b52@pitrou.net>

On Fri, 28 Dec 2012 17:59:05 -0500 Doug Hellmann wrote: > A couple of us from the OpenStack project are interested in getting involved in the packaging rewrite/update project. I was following that work for a while, but have lost track of its current state. Can someone point me to the right mailing list, and maybe a status page or something so I can start figuring out where we might be able to help?

The current effort seems to be distlib, Vinay's project to gather the "good parts" of packaging and distutils as a library API: http://packages.python.org/distlib/ (there's an active bitbucket repo)

Regards Antoine.

From dirkjan at ochtman.nl Sat Dec 29 15:29:54 2012
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Sat, 29 Dec 2012 15:29:54 +0100
Subject: [Python-Dev] PEP 431 Time zone support improvements - Update
In-Reply-To: <20121229114545.27f2432c@pitrou.net>
References: <20121229114545.27f2432c@pitrou.net>
Message-ID:

On Sat, Dec 29, 2012 at 11:45 AM, Antoine Pitrou wrote: >> I'm still a fan of *always* shipping fallback tzdata, regardless of >> platform. The stdlib would then look in three places for timezone data >> when datetime.timezone was first imported: >> >> 1. the "tzdata-update" database >> 2. the OS provided database >> 3. the fallback database > > +1 !

Yeah, from me as well.

Cheers, Dirkjan

From a.cavallo at cavallinux.eu Sat Dec 29 15:37:21 2012
From: a.cavallo at cavallinux.eu (Antonio Cavallo)
Date: Sat, 29 Dec 2012 14:37:21 +0000
Subject: [Python-Dev] question about packaging
In-Reply-To: <20121229114739.54ef8b52@pitrou.net>
References: <480CF8A8-0461-4C20-8A3C-2944C883E78B@gmail.com> <20121229114739.54ef8b52@pitrou.net>
Message-ID: <50DF0021.7070003@cavallinux.eu>

Correct me if I'm wrong, but isn't distlib targeting resource management? distutils is targeted to distribute python modules/packages instead; small differences, but in the field they really mean different things.

distlib is under http://hg.python.org/distlib too :O ..my first guess would be that's the released "branch".

Antoine Pitrou wrote: > On Fri, 28 Dec 2012 17:59:05 -0500 > Doug Hellmann wrote: >> A couple of us from the OpenStack project are interested in getting involved in the packaging rewrite/update project. I was following that work for a while, but have lost track of its current state. Can someone point me to the right mailing list, and maybe a status page or something so I can start figuring out where we might be able to help? > > The current effort seems to be distlib, Vinay's project to gather the > "good parts" of packaging and distutils as a library API: > http://packages.python.org/distlib/ > (there's an active bitbucket repo) > > Regards > > Antoine.
> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/a.cavallo%40cavallinux.eu From doug.hellmann at gmail.com Sat Dec 29 17:00:25 2012 From: doug.hellmann at gmail.com (Doug Hellmann) Date: Sat, 29 Dec 2012 11:00:25 -0500 Subject: [Python-Dev] question about packaging In-Reply-To: <20121229114739.54ef8b52@pitrou.net> References: <480CF8A8-0461-4C20-8A3C-2944C883E78B@gmail.com> <20121229114739.54ef8b52@pitrou.net> Message-ID: On Dec 29, 2012, at 5:47 AM, Antoine Pitrou wrote: > On Fri, 28 Dec 2012 17:59:05 -0500 > Doug Hellmann wrote: >> A couple of us from the OpenStack project are interested in getting involved in the packaging rewrite/update project. I was following that work for a while, but have lost track of its current state. Can someone point me to the right mailing list, and maybe a status page or something so I can start figuring out where we might be able to help? > > The current effort seems to be distlib, Vinay's project to gather the > "good parts" of packaging and distutils as a library API: > http://packages.python.org/distlib/ > (there's an active bitbucket repo) > > Regards > > Antoine. Thanks, I'll start digging in there and reading the PEPs to catch up. Doug From regebro at gmail.com Sat Dec 29 19:29:57 2012 From: regebro at gmail.com (Lennart Regebro) Date: Sat, 29 Dec 2012 19:29:57 +0100 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: Message-ID: On Sat, Dec 29, 2012 at 7:00 AM, Nick Coghlan wrote: > On Sat, Dec 29, 2012 at 6:23 AM, Lennart Regebro > wrote: > > Happy Holidays! Here is the update of PEP 431 with the changes that > emerged > > after the earlier discussion. > > > > A raw download is here: > > https://raw.github.com/regebro/tz-pep/master/pep-04tz.txt > > For UI purposes, "pytz" has some helpers to get lists of timezone > names (all, common and country specific): > http://pytz.sourceforge.net/#helpers > Funnily enough, I woke up this morning thinking that this should be added, and wondering why pytz didn't have such lists. So I just missed (or rather forgot) that they existed. I'll add them. Is there a specific reason you chose to exclude those from the PEP? > > > Discussion > > ========== > > > > Should the windows installer include the data package? > > ------------------------------------------------------ > > > > It has been suggested that the Windows installer should include the data > > package. This would mean that an explicit installation no longer would be > > needed on Windows. On the other hand, that would mean that many using > > Windows > > would not be aware that the database quickly becomes outdated and would > not > > keep it updated. > > I'm still a fan of *always* shipping fallback tzdata Yes, and I did update the rest of the PEP with this, but I missed the discussion part. //Lennart -------------- next part -------------- An HTML attachment was scrubbed... URL: From tseaver at palladion.com Sat Dec 29 19:38:57 2012 From: tseaver at palladion.com (Tres Seaver) Date: Sat, 29 Dec 2012 13:38:57 -0500 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 12/29/2012 01:00 AM, Nick Coghlan wrote: > I'm still a fan of *always* shipping fallback tzdata, regardless of > platform. 
The stdlib would then look in three places for timezone > data when datetime.timezone was first imported: > > 1. the "tzdata-update" database 2. the OS provided database 3. the > fallback database - -Lots for enabling fallback by default except on platforms known not to have their own database, or given some explicit 'siteconfigure.py'-like knob enabling it. A clean error is better than a bad-but-silent answer. Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iEYEARECAAYFAlDfOMEACgkQ+gerLs4ltQ7mfQCgxV13Ch7eW/yDwCPMfEebeNuY xr0An1yvuUkVUQGY8nKDt9GxemdLlHMA =JtY0 -----END PGP SIGNATURE----- From regebro at gmail.com Sat Dec 29 19:54:42 2012 From: regebro at gmail.com (Lennart Regebro) Date: Sat, 29 Dec 2012 19:54:42 +0100 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: Message-ID: On Sat, Dec 29, 2012 at 7:38 PM, Tres Seaver wrote: > - -Lots for enabling fallback by default except on platforms known not to > have their own database Well, it's the same thing really. If the platform does have a database, the fallback will not be used. Of course, there is the case of the database existing on the platform normally, but somebody for some reason deleting the files, but I don't think that case deserves an error message. I also expect that most platform distributions, such as for Ubuntu, will not include the fallback database, as it will never be used. I'll add something about that and that we need to raise an error of some sort (any opinions on what?) if no database is found at all. //Lennart -------------- next part -------------- An HTML attachment was scrubbed... URL: From regebro at gmail.com Sat Dec 29 19:56:43 2012 From: regebro at gmail.com (Lennart Regebro) Date: Sat, 29 Dec 2012 19:56:43 +0100 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: Message-ID: On Sat, Dec 29, 2012 at 7:54 PM, Lennart Regebro wrote: > On Sat, Dec 29, 2012 at 7:38 PM, Tres Seaver wrote: > >> - -Lots for enabling fallback by default except on platforms known not to >> have their own database > > > Well, it's the same thing really. If the platform does have a database, > the fallback will not be used. > Of course, there is the case of the database existing on the platform > normally, but somebody for some reason deleting the files, but I don't > think that case deserves an error message. > > I also expect that most platform distributions, such as for Ubuntu, will > not include the fallback database, as it will never be used. I'll add > something about that and that we need to raise an error of some sort (any > opinions on what?) if no database is found at all. > Actually I already added that, but opinions on what error to raise are still welcome. Currently it says: If no database is found an ``UnknownTimeZoneError`` or subclass thereof will be raised with a message explaining that no zoneinfo database can be found, but that you can install one with the ``tzdata-update`` package. -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From arfrever.fta at gmail.com Sat Dec 29 20:05:04 2012
From: arfrever.fta at gmail.com (Arfrever Frehtes Taifersar Arahesis)
Date: Sat, 29 Dec 2012 20:05:04 +0100
Subject: [Python-Dev] PEP 431 Time zone support improvements - Update
In-Reply-To: References: Message-ID: <201212292005.06359.Arfrever.FTA@gmail.com>

2012-12-29 19:54:42 Lennart Regebro napisał(a): > On Sat, Dec 29, 2012 at 7:38 PM, Tres Seaver wrote: > > > - -Lots for enabling fallback by default except on platforms known not to > > have their own database > > Well, it's the same thing really. If the platform does have a database, the > fallback will not be used.

I suggest that the configure script support --enable-internal-timezone-database / --disable-internal-timezone-database options. --disable-internal-timezone-database should cause the installation targets in the Makefile to not install the timezone database files.

-- Arfrever Frehtes Taifersar Arahesis

-------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: This is a digitally signed message part. URL:

From solipsis at pitrou.net Sat Dec 29 20:04:19 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 29 Dec 2012 20:04:19 +0100
Subject: [Python-Dev] PEP 431 Time zone support improvements - Update
References: Message-ID: <20121229200419.4fc73e12@pitrou.net>

On Sat, 29 Dec 2012 19:56:43 +0100 Lennart Regebro wrote: > On Sat, Dec 29, 2012 at 7:54 PM, Lennart Regebro wrote: > > > On Sat, Dec 29, 2012 at 7:38 PM, Tres Seaver wrote: > > > >> - -Lots for enabling fallback by default except on platforms known not to > > >> have their own database > > > > > > Well, it's the same thing really. If the platform does have a database, > > > the fallback will not be used. > > > Of course, there is the case of the database existing on the platform > > > normally, but somebody for some reason deleting the files, but I don't > > > think that case deserves an error message. > > > > > > I also expect that most platform distributions, such as for Ubuntu, will > > > not include the fallback database, as it will never be used. I'll add > > > something about that and that we need to raise an error of some sort (any > > > opinions on what?) if no database is found at all. > > > > > > > Actually I already added that, but opinions on what error to raise are > > still welcome. Currently it says: > > > > If no database is found an ``UnknownTimeZoneError`` or subclass > thereof > > will > > be raised with a message explaining that no zoneinfo database can be > > found, > > but that you can install one with the ``tzdata-update`` package.

Why should we care about that situation if we *do* provide a database? Distributions can decide to exclude some files from their packages, but it's their problem, not ours.

Regards Antoine.
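To make the lookup-and-error behaviour under discussion concrete, here is a
hypothetical sketch following the three-step order Nick Coghlan suggested
(the names _find_database and tzdata_update are illustrative only, not
actual API):

    import os

    class UnknownTimeZoneError(KeyError):
        """The draft PEP's proposed exception (a KeyError subclass)."""

    def _find_database(fallback_path='Lib/tzdata'):
        # 1. A database installed via the 'tzdata-update' package wins.
        try:
            import tzdata_update  # hypothetical import name for the package
            return tzdata_update.database_path
        except ImportError:
            pass
        # 2. Otherwise use the OS-provided database, if there is one.
        if os.path.isdir('/usr/share/zoneinfo'):
            return '/usr/share/zoneinfo'
        # 3. Finally fall back to the database shipped with Python.
        if os.path.isdir(fallback_path):
            return fallback_path
        raise UnknownTimeZoneError(
            "No zoneinfo database found; you can install one with the "
            "'tzdata-update' package")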
From eliben at gmail.com Sat Dec 29 20:32:56 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 29 Dec 2012 11:32:56 -0800 Subject: [Python-Dev] [Python-checkins] cpython (3.3): Issue #16641: Fix default values of sched.scheduler.enter arguments were In-Reply-To: <3YYZKb2NQLzRw2@mail.python.org> References: <3YYZKb2NQLzRw2@mail.python.org> Message-ID: On Sat, Dec 29, 2012 at 11:17 AM, serhiy.storchaka < python-checkins at python.org> wrote: > http://hg.python.org/cpython/rev/1c9c0f92df65 > changeset: 81134:1c9c0f92df65 > branch: 3.3 > parent: 81132:5db0833f135b > user: Serhiy Storchaka > date: Sat Dec 29 21:13:45 2012 +0200 > summary: > Issue #16641: Fix default values of sched.scheduler.enter arguments were > modifiable. > > files: > Doc/library/sched.rst | 23 ++++++++++++++--------- > Lib/sched.py | 8 ++++++-- > Misc/NEWS | 3 +++ > 3 files changed, 23 insertions(+), 11 deletions(-) > > > diff --git a/Doc/library/sched.rst b/Doc/library/sched.rst > --- a/Doc/library/sched.rst > +++ b/Doc/library/sched.rst > @@ -36,19 +36,22 @@ > > >>> import sched, time > >>> s = sched.scheduler(time.time, time.sleep) > - >>> def print_time(): print("From print_time", time.time()) > + >>> def print_time(a='default'): > + ... print("From print_time", time.time(), a) > ... > >>> def print_some_times(): > ... print(time.time()) > - ... s.enter(5, 1, print_time, ()) > - ... s.enter(10, 1, print_time, ()) > + ... s.enter(10, 1, print_time) > + ... s.enter(5, 2, print_time, argument=('positional',)) > + ... s.enter(5, 1, print_time, kwargs={'a': 'keyword'}) > ... s.run() > ... print(time.time()) > ... > >>> print_some_times() > 930343690.257 > - From print_time 930343695.274 > - From print_time 930343700.273 > + From print_time 930343695.274 positional > + From print_time 930343695.275 keyword > + From print_time 930343700.273 default > 930343700.276 > > .. _scheduler-objects: > @@ -59,7 +62,7 @@ > :class:`scheduler` instances have the following methods and attributes: > > > -.. method:: scheduler.enterabs(time, priority, action, argument=[], > kwargs={}) > +.. method:: scheduler.enterabs(time, priority, action, argument=(), > kwargs={}) > > Schedule a new event. The *time* argument should be a numeric type > compatible > with the return value of the *timefunc* function passed to the > constructor. > @@ -67,8 +70,10 @@ > *priority*. > > Executing the event means executing ``action(*argument, **kwargs)``. > - *argument* must be a sequence holding the parameters for *action*. > - *kwargs* must be a dictionary holding the keyword parameters for > *action*. > + Optional *argument* argument must be a sequence holding the parameters > + for *action* if any used. > + Optional *kwargs* argument must be a dictionary holding the keyword > + parameters for *action* if any used. > I don't see how this change improves the documentation. To keep the grammar correct and just state that the arguments are optional, I would simply replace "must be" by "is". For example: *argument* is a sequence holding the parameters for *action*. This is short, and since the function signature clearly shows that argument has a default value, I think it conveys the meaning it should. Eli -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From regebro at gmail.com Sat Dec 29 21:16:05 2012 From: regebro at gmail.com (Lennart Regebro) Date: Sat, 29 Dec 2012 21:16:05 +0100 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: <20121229200419.4fc73e12@pitrou.net> References: <20121229200419.4fc73e12@pitrou.net> Message-ID: On Sat, Dec 29, 2012 at 8:04 PM, Antoine Pitrou wrote: > On Sat, 29 Dec 2012 19:56:43 +0100 > Lennart Regebro wrote: > > On Sat, Dec 29, 2012 at 7:54 PM, Lennart Regebro > wrote: > > > > > On Sat, Dec 29, 2012 at 7:38 PM, Tres Seaver >wrote: > > > > > >> - -Lots for enabling fallback by default except on platforms known > not to > > >> have their own database > > > > > > > > > Well, it's the same thing really. If the platform does have a database, > > > the fallback will not be used. > > > Of course, there is the case of the database existing on the platform > > > normally, but somebody for some reason deleting the files, but I don't > > > think that case deserves an error message. > > > > > > I also expect that most platform distributions, such as for Ubuntu, > will > > > not include the fallback database, as it will never be used. I'll add > > > something about that and that we need to raise an error of some sort > (any > > > opinions on what?) if no database is found at all. > > > > > > > Actually I already added that, but opinions on what error to raise are > > still welcome. Currently it says: > > > > If no database is found an ``UnknownTimeZoneError`` or subclass > thereof > > will > > be raised with a message explaining that no zoneinfo database can be > > found, > > but that you can install one with the ``tzdata-update`` package. > > Why should we care about that situation if we *do* provide a database? > Distributions can decide to exclude some files from their packages, but > it's their problem, not ours. > Yes, but a comprehensible error message is useful even if somebody messed up the system/configuration. //Lennart -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.jerdonek at gmail.com Sat Dec 29 21:48:48 2012 From: chris.jerdonek at gmail.com (Chris Jerdonek) Date: Sat, 29 Dec 2012 12:48:48 -0800 Subject: [Python-Dev] [Python-checkins] Cron /home/docs/build-devguide In-Reply-To: References: Message-ID: On Sat, Dec 29, 2012 at 12:05 PM, Cron Daemon wrote: > /home/docs/devguide/documenting.rst:768: WARNING: term not in glossary: bytecode Why is this warning reported? 
I can't reproduce on my system, and on my system and in the published online docs, the term successfully links to: http://docs.python.org/3/glossary.html#term-bytecode (in the section http://docs.python.org/devguide/documenting.html#information-units ) --Chris > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins From g.brandl at gmx.net Sat Dec 29 22:07:49 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 29 Dec 2012 22:07:49 +0100 Subject: [Python-Dev] [Python-checkins] cpython (3.3): Issue #16641: Fix default values of sched.scheduler.enter arguments were In-Reply-To: References: <3YYZKb2NQLzRw2@mail.python.org> Message-ID: On 12/29/2012 08:32 PM, Eli Bendersky wrote: > > > > On Sat, Dec 29, 2012 at 11:17 AM, serhiy.storchaka > wrote: > > http://hg.python.org/cpython/rev/1c9c0f92df65 > changeset: 81134:1c9c0f92df65 > branch: 3.3 > parent: 81132:5db0833f135b > user: Serhiy Storchaka > > date: Sat Dec 29 21:13:45 2012 +0200 > summary: > Issue #16641: Fix default values of sched.scheduler.enter arguments were > modifiable. > > files: > Doc/library/sched.rst | 23 ++++++++++++++--------- > Lib/sched.py | 8 ++++++-- > Misc/NEWS | 3 +++ > 3 files changed, 23 insertions(+), 11 deletions(-) > > > diff --git a/Doc/library/sched.rst b/Doc/library/sched.rst > --- a/Doc/library/sched.rst > +++ b/Doc/library/sched.rst > @@ -36,19 +36,22 @@ > > >>> import sched, time > >>> s = sched.scheduler(time.time, time.sleep) > - >>> def print_time(): print("From print_time", time.time()) > + >>> def print_time(a='default'): > + ... print("From print_time", time.time(), a) > ... > >>> def print_some_times(): > ... print(time.time()) > - ... s.enter(5, 1, print_time, ()) > - ... s.enter(10, 1, print_time, ()) > + ... s.enter(10, 1, print_time) > + ... s.enter(5, 2, print_time, argument=('positional',)) > + ... s.enter(5, 1, print_time, kwargs={'a': 'keyword'}) > ... s.run() > ... print(time.time()) > ... > >>> print_some_times() > 930343690.257 > - From print_time 930343695.274 > - From print_time 930343700.273 > + From print_time 930343695.274 positional > + From print_time 930343695.275 keyword > + From print_time 930343700.273 default > 930343700.276 > > .. _scheduler-objects: > @@ -59,7 +62,7 @@ > :class:`scheduler` instances have the following methods and attributes: > > > -.. method:: scheduler.enterabs(time, priority, action, argument=[], kwargs={}) > +.. method:: scheduler.enterabs(time, priority, action, argument=(), kwargs={}) > > Schedule a new event. The *time* argument should be a numeric type > compatible > with the return value of the *timefunc* function passed to the constructor. > @@ -67,8 +70,10 @@ > *priority*. > > Executing the event means executing ``action(*argument, **kwargs)``. > - *argument* must be a sequence holding the parameters for *action*. > - *kwargs* must be a dictionary holding the keyword parameters for *action*. > + Optional *argument* argument must be a sequence holding the parameters > + for *action* if any used. > + Optional *kwargs* argument must be a dictionary holding the keyword > + parameters for *action* if any used. > > > I don't see how this change improves the documentation. To keep the grammar > correct and just state that the arguments are optional, I would simply replace > "must be" by "is". For example: > > *argument* is a sequence holding the parameters for *action*. 
> > This is short, and since the function signature clearly shows that argument has > a default value, I think it conveys the meaning it should. Hi Eli, I'm sure we non-native speakers are fine with any improvements you can make during commit review. Georg From eliben at gmail.com Sat Dec 29 22:44:14 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 29 Dec 2012 13:44:14 -0800 Subject: [Python-Dev] [Python-checkins] cpython (3.3): Issue #16641: Fix default values of sched.scheduler.enter arguments were In-Reply-To: References: <3YYZKb2NQLzRw2@mail.python.org> Message-ID: > > Executing the event means executing ``action(*argument, > **kwargs)``. > > - *argument* must be a sequence holding the parameters for > *action*. > > - *kwargs* must be a dictionary holding the keyword parameters for > *action*. > > + Optional *argument* argument must be a sequence holding the > parameters > > + for *action* if any used. > > + Optional *kwargs* argument must be a dictionary holding the > keyword > > + parameters for *action* if any used. > > > > > > I don't see how this change improves the documentation. To keep the > grammar > > correct and just state that the arguments are optional, I would simply > replace > > "must be" by "is". For example: > > > > *argument* is a sequence holding the parameters for *action*. > > > > This is short, and since the function signature clearly shows that > argument has > > a default value, I think it conveys the meaning it should. > > Hi Eli, > > I'm sure we non-native speakers are fine with any improvements you can make > during commit review. > > Georg > Georg, I also wrote a private email to Serhiy proposing to help, but since you brought this up here: I think that my comment was constructive. What should have I done differently? Go ahead and modify the phrasing in a separate commit? I see a couple of problems with that: 1. It can be somewhat disrespectful to a new committer, and I wanted to reach a consensus first. 2. Serhiy diligently committed this into 3 or 4 different Python branches. With all due respect, going through the merge/push dance is far above the effort I'm willing to invest in this. Eli P.S. I would argue that you are more native-speaker than myself w.r.t. English :-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Sat Dec 29 23:17:03 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 29 Dec 2012 23:17:03 +0100 Subject: [Python-Dev] [Python-checkins] cpython (3.3): Issue #16641: Fix default values of sched.scheduler.enter arguments were In-Reply-To: References: <3YYZKb2NQLzRw2@mail.python.org> Message-ID: On 12/29/2012 10:44 PM, Eli Bendersky wrote: > > > Executing the event means executing ``action(*argument, **kwargs)``. > > - *argument* must be a sequence holding the parameters for *action*. > > - *kwargs* must be a dictionary holding the keyword parameters for > *action*. > > + Optional *argument* argument must be a sequence holding the parameters > > + for *action* if any used. > > + Optional *kwargs* argument must be a dictionary holding the keyword > > + parameters for *action* if any used. > > > > > > I don't see how this change improves the documentation. To keep the grammar > > correct and just state that the arguments are optional, I would simply replace > > "must be" by "is". For example: > > > > *argument* is a sequence holding the parameters for *action*. 
> > > > This is short, and since the function signature clearly shows that > argument has > > a default value, I think it conveys the meaning it should.

Nothing at all. In case you read my comment as sarcasm, please don't. It probably was a little unreflected.

> Go ahead and modify the phrasing in a separate commit? I see a couple of problems with that: > > 1. It can be somewhat disrespectful to a new committer, and I wanted to reach a > consensus first.

I would not feel unrespected by someone correcting nits in my English grammar. (Maybe it would be different for Python grammar :)

> 2. Serhiy diligently committed this into 3 or 4 different Python branches. With > all due respect, going through the merge/push dance is far above the effort I'm > willing to invest in this.

I understand. I probably would have not done it instantly either, but put it on my todo list and committed when I had some other change as well.

> Eli > > P.S. I would argue that you are more native-speaker than myself w.r.t. English :-)

Then I was mistaken there. (Are there multiple levels of nativeness? :) Or is it because English is half-Germanic?)

cheers, Georg

From cs at zip.com.au Sat Dec 29 23:55:26 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Sun, 30 Dec 2012 09:55:26 +1100
Subject: [Python-Dev] PEP 431 Time zone support improvements - Update
In-Reply-To: References: Message-ID: <20121229225526.GA2713@cskk.homeip.net>

On 29Dec2012 21:16, Lennart Regebro wrote: | On Sat, Dec 29, 2012 at 8:04 PM, Antoine Pitrou wrote: | > Why should we care about that situation if we *do* provide a database? | > Distributions can decide to exclude some files from their packages, but | > it's their problem, not ours. | | Yes, but a comprehensible error message is useful even if somebody messed | up the system/configuration.

Couldn't you just agree to augment the exception with some "I looked here, there and there" information. It avoids a lot of bikeshedding and makes things clear. You're not diagnosing system misconfiguration, just saying "I can't find stuff, and here is where I looked".

Cheers,
-- Cameron Simpson

From tjreedy at udel.edu Sat Dec 29 23:59:04 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 29 Dec 2012 17:59:04 -0500
Subject: [Python-Dev] PEP 431 Time zone support improvements - Update
In-Reply-To: References: <20121229200419.4fc73e12@pitrou.net>
Message-ID:

On 12/29/2012 3:16 PM, Lennart Regebro wrote: > Yes, but a comprehensible error message is useful even if somebody > messed up the system/configuration.

Just reuse whatever exception type you create and add a sensible message. Hopefully, it will never be seen.

-- Terry Jan Reedy

From eliben at gmail.com Sun Dec 30 01:31:33 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Sat, 29 Dec 2012 16:31:33 -0800
Subject: [Python-Dev] test___all__ polluting sys.modules?
Message-ID:

Hi,

This came up while investigating some test-order-dependency failures in issue 16076.

test___all__ goes over modules that have `__all__` in them and does 'from <module> import *' on them. This leaves a lot of modules in sys.modules, which may interfere with some tests that do fancy things with sys.modules.
In particular, the ElementTree tests have trouble with it because they carefully set up the imports to get the C or the Python version of etree (see issues 15083 and 15075). Would it make sense to save the sys.modules state and restore it in test___all__ so that sys.modules isn't affected by this test? Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Sun Dec 30 01:46:39 2012 From: benjamin at python.org (Benjamin Peterson) Date: Sat, 29 Dec 2012 18:46:39 -0600 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: Message-ID: 2012/12/29 Eli Bendersky : > Hi, > > This came up while investigating some test-order-dependency failures in > issue 16076. > > test___all__ goes over modules that have `__all__` in them and does 'from > import *' on them. This leaves a lot of modules in sys.modules, > which may interfere with some tests that do fancy things with sys,modules. > In particular, the ElementTree tests have trouble with it because they > carefully set up the imports to get the C or the Python version of etree > (see issues 15083 and 15075). > > Would it make sense to save the sys.modules state and restore it in > test___all__ so that sys.modules isn't affected by this test? Sounds reasonable to me. -- Regards, Benjamin From andrew.svetlov at gmail.com Sun Dec 30 01:48:13 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Sun, 30 Dec 2012 02:48:13 +0200 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: Message-ID: See (unfinished but trivial enough) http://bugs.python.org/issue14715 for proposed way to modules/importsystem cleanup On Sun, Dec 30, 2012 at 2:31 AM, Eli Bendersky wrote: > Hi, > > This came up while investigating some test-order-dependency failures in > issue 16076. > > test___all__ goes over modules that have `__all__` in them and does 'from > import *' on them. This leaves a lot of modules in sys.modules, > which may interfere with some tests that do fancy things with sys,modules. > In particular, the ElementTree tests have trouble with it because they > carefully set up the imports to get the C or the Python version of etree > (see issues 15083 and 15075). > > Would it make sense to save the sys.modules state and restore it in > test___all__ so that sys.modules isn't affected by this test? > > Eli > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/andrew.svetlov%40gmail.com > -- Thanks, Andrew Svetlov From eliben at gmail.com Sun Dec 30 01:56:51 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 29 Dec 2012 16:56:51 -0800 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: Message-ID: On Sat, Dec 29, 2012 at 4:46 PM, Benjamin Peterson wrote: > 2012/12/29 Eli Bendersky : > > Hi, > > > > This came up while investigating some test-order-dependency failures in > > issue 16076. > > > > test___all__ goes over modules that have `__all__` in them and does 'from > > import *' on them. This leaves a lot of modules in sys.modules, > > which may interfere with some tests that do fancy things with > sys,modules. > > In particular, the ElementTree tests have trouble with it because they > > carefully set up the imports to get the C or the Python version of etree > > (see issues 15083 and 15075). 
> > > > Would it make sense to save the sys.modules state and restore it in > > test___all__ so that sys.modules isn't affected by this test? > > Sounds reasonable to me. > Thanks. http://bugs.python.org/issue16817 Eli > > -- > Regards, > Benjamin > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vinay_sajip at yahoo.co.uk Sun Dec 30 02:02:34 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 30 Dec 2012 01:02:34 +0000 (UTC) Subject: [Python-Dev] question about packaging References: <480CF8A8-0461-4C20-8A3C-2944C883E78B@gmail.com> <20121229114739.54ef8b52@pitrou.net> Message-ID: > On Dec 29, 2012, at 5:47 AM, Antoine Pitrou wrote: > > The current effort seems to be distlib, Vinay's project to gather the > > "good parts" of packaging and distutils as a library API: > > http://packages.python.org/distlib/ > > (there's an active bitbucket repo) See https://bitbucket.org/vinay.sajip/distlib/ for the latest code, which is periodically pushed to http://hg.python.org/distlib/ The latest documentation is at https://distlib.readthedocs.org/en/latest/ While distlib focuses on the packaging PEPs and standardised formats, it is intended to be possible to build packaging systems on top of it. Compared to distutils2, distlib aims to make it easier to transition from existing packaging infrastructure and tools (distutils, setuptools/distribute). Some of the PEPs are still in flux (e.g. PEP 426, PEP 427). Regards, Vinay Sajip From vinay_sajip at yahoo.co.uk Sun Dec 30 02:06:58 2012 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sun, 30 Dec 2012 01:06:58 +0000 (UTC) Subject: [Python-Dev] question about packaging References: <480CF8A8-0461-4C20-8A3C-2944C883E78B@gmail.com> <20121229114739.54ef8b52@pitrou.net> <50DF0021.7070003@cavallinux.eu> Message-ID: Antonio Cavallo cavallinux.eu> writes: > Correct if I'm wrong but distlib isn't targeting resources managent? > distutils is targeted to distribute python modules/packages instead; > small differences but on the field they really mean different things. distlib is intended to target more than resource management, but it's not a full-blown packaging system. Rather, it's intended to implement common pieces of functionality needed by packaging systems in a hopefully non-controversial and useful way. > distlib is under http://hg.python.org/distlib too :O Actually the BitBucket repo is more active and readthedocs has the latest docs, but I do periodically update the above repo on hg.python.org. Regards, Vinay Sajip From ncoghlan at gmail.com Sun Dec 30 02:34:23 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 30 Dec 2012 11:34:23 +1000 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: Message-ID: On Sun, Dec 30, 2012 at 10:46 AM, Benjamin Peterson wrote: > 2012/12/29 Eli Bendersky : >> Hi, >> >> This came up while investigating some test-order-dependency failures in >> issue 16076. >> >> test___all__ goes over modules that have `__all__` in them and does 'from >> import *' on them. This leaves a lot of modules in sys.modules, >> which may interfere with some tests that do fancy things with sys,modules. >> In particular, the ElementTree tests have trouble with it because they >> carefully set up the imports to get the C or the Python version of etree >> (see issues 15083 and 15075). >> >> Would it make sense to save the sys.modules state and restore it in >> test___all__ so that sys.modules isn't affected by this test? > > Sounds reasonable to me. 
I've tried this as an inherent property of regrtest before (to resolve some problems with test_pydoc), and it didn't work - we have too many modules with non-local side effects (IIRC, mostly related to the copy and pickle registries). Given that it checks the whole standard library, test___all__ is likely to run into the same problem.

Hence test.support.import_fresh_module - it can ensure you get the module you want, regardless of the preexisting contents of sys.modules. (http://docs.python.org/dev/library/test#test.support.import_fresh_module)

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
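A small sketch of how import_fresh_module can pin down which ElementTree
implementation a test gets, using its documented fresh and blocked
parameters (the module names are those from the ElementTree case discussed
in this thread):

    from test import support

    # Pure Python version: block the _elementtree accelerator so the
    # Python code in xml.etree.ElementTree is what actually gets imported.
    py_etree = support.import_fresh_module('xml.etree.ElementTree',
                                           blocked=['_elementtree'])

    # Accelerated version: force a fresh import that includes the C
    # extension module.
    c_etree = support.import_fresh_module('xml.etree.ElementTree',
                                          fresh=['_elementtree'])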
| > | | > | Yes, but a comprehensible error message is useful even if somebody messed | > | up the system/configuration. | > | > Couldn't you just agree to augment the exception with some "I looked | > here, there and there" information. It avoids a lot of bikeshedding and | > makes things clear. You're not diagnosing system misconfiguration, just | > saying "I can't find stuff, and here is where I looked". | | Since the location of the tzdata-update package isn't a fixed place it's | hard to say "I looked here, there and there", though. I don't think anyone | has suggested making any diagnostics. :-) I think I misunderstood the context. Never mind. -- Cameron Simpson Displays are sold by the acre, not the function. - overheard by WIRED at the Intelligent Printing conference Oct2006 From steve at pearwood.info Sun Dec 30 11:19:54 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 30 Dec 2012 21:19:54 +1100 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: References: <50C8541E.60406@python.org> <7wsj71vamk.fsf@benfinney.id.au> <20121220114315.554a52ac@resist.wooz.org> <50DE45F4.5060108@pearwood.info> Message-ID: <50E0154A.6070108@pearwood.info> On 29/12/12 15:40, Lennart Regebro wrote: > On Sat, Dec 29, 2012 at 2:23 AM, Steven D'Aprano wrote: > >> The PEP says: >> >> * New function :``timezone(name=None, db_path=None)`` >> >> >> This function takes a name string that must be a string specifying a >> valid zoneinfo timezone, ie "US/Eastern", "Europe/Warsaw" or >> "Etc/GMT+11". >> >> >> It isn't 100% clear to me from the PEP what a valid name string would be, >> but I assume that it will accept anything that the time.tzset function >> will accept: >> > > No, valid names are the names of time zones in the zoneinfo database. If I've understood it correctly, that contradicts the PEP. One example given is "Etc/GMT+11", which is not a timezone *name*, but a timezone name *plus an offset*. Presumably if GMT+11 is legal, so should be GMT+10:30. There is no "Etc/GMT+11" named here: http://en.wikipedia.org/wiki/List_of_tz_database_time_zones nor is it included in /usr/share/zoneinfo/zone.tab in either of the systems I looked at (one Debian, one Centos), but there is Etc/GMT. So I conclude that the PEP allows timezone rules, not just names. Either way, I think the PEP needs to clarify what counts as a valid name string. > There isn't really any use case for defining your own rules as that would mean > that you want a time zone that doesn't exist, which seems a bit pointless. > :-) It means you want a time zone that doesn't exist in the database, which is not the same as not existing in real life. Perhaps the database is out-of-date, or the government has suddenly declared a daylight savings change that isn't reflected yet in the database. Or you want to set your own TZ rules for testing. Or you've just declared independence from the central government and are setting up your own TZ rules. time.tzset supports rules as well as names. Is there some reason why this module should not do the same? I also quote from /usr/share/doc/tzdata-2012f/README on my Centos system: [quote] README for the tz distribution [...] The 1989 update of the time zone package featured [...] * reference data from the United States Naval Observatory for folks who want to do additional time zones [end quote] So the people who prepare tzdata on Red Hat systems clearly think that there are use-cases for making additional time zones.
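For reference, the time.tzset() behaviour referred to above looks like
this (Unix-only; the XST/XDT abbreviations are invented for the example,
and this is only a sketch of the existing time module API, not of the
proposed timezone() function):

    import os
    import time

    # A zoneinfo name works:
    os.environ['TZ'] = 'Europe/Warsaw'
    time.tzset()

    # ...but so does a hand-written POSIX rule: standard time at
    # UTC+10, DST at UTC+11, DST from the first Sunday of October
    # to the first Sunday of April.
    os.environ['TZ'] = 'XST-10XDT,M10.1.0,M4.1.0'
    time.tzset()
    print(time.strftime('%Z %z'))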
-- Steven From ronaldoussoren at mac.com Sun Dec 30 10:47:00 2012 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Sun, 30 Dec 2012 10:47:00 +0100 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: Message-ID: On 29 Dec, 2012, at 5:48, Lennart Regebro wrote: > On Fri, Dec 28, 2012 at 10:12 PM, Ronald Oussoren wrote: > > On 28 Dec, 2012, at 21:23, Lennart Regebro wrote: > > > Happy Holidays! Here is the update of PEP 431 with the changes that emerged after the earlier discussion. > > Why is the new timezone support added in a submodule of datetime? > > Because several people wanted it that way and nobody objected. > > Adding the new function and exception to datetime itself wouldn't clutter the API that much > > It will make the datetime.py twice as long though, and the second longest module in the stdlib, beaten only by decimal.py. Perhaps this is not a problem. The module could be split into several modules in a package without affecting the public API if that would help with maintenance, similar to unittest. Ronald -------------- next part -------------- An HTML attachment was scrubbed... URL: From shibturn at gmail.com Sun Dec 30 13:43:10 2012 From: shibturn at gmail.com (Richard Oudkerk) Date: Sun, 30 Dec 2012 12:43:10 +0000 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: Message-ID: On 30/12/2012 12:31am, Eli Bendersky wrote: > Would it make sense to save the sys.modules state and restore it in > test___all__ so that sys.modules isn't affected by this test? Deleting module objects can cause problems because the destructor replaces values in the globals dict by None. If anything defined there has "escaped" and depends on any globals then you are liable to encounter errors. For example, setuptools restores sys.modules after running each test. This was causing errors at shutdown from an atexit function registered by multiprocessing. The atexit function was still registered, but no longer valid, because the module had been garbage collected and the globals had been replaced by None. Personally I would like to get rid of the "purge globals" behaviour for modules deleted before shutdown has started: if someone manipulates sys.modules then they can just call gc.collect() if they want to promptly get rid of orphaned reference cycles. -- Richard From eliben at gmail.com Sun Dec 30 14:33:56 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 30 Dec 2012 05:33:56 -0800 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: Message-ID: On Sat, Dec 29, 2012 at 5:34 PM, Nick Coghlan wrote: > On Sun, Dec 30, 2012 at 10:46 AM, Benjamin Peterson > wrote: > > 2012/12/29 Eli Bendersky : > >> Hi, > >> > >> This came up while investigating some test-order-dependency failures in > >> issue 16076. > >> > >> test___all__ goes over modules that have `__all__` in them and does > 'from > >> <module> import *' on them. This leaves a lot of modules in sys.modules, > >> which may interfere with some tests that do fancy things with > sys.modules. > >> In particular, the ElementTree tests have trouble with it because they > >> carefully set up the imports to get the C or the Python version of etree > >> (see issues 15083 and 15075). > >> > >> Would it make sense to save the sys.modules state and restore it in > >> test___all__ so that sys.modules isn't affected by this test? > > > > Sounds reasonable to me.
> > I've tried this as an inherent property of regrtest before (to resolve > some problems with test_pydoc), and it didn't work - we have too many > modules with non-local side effects (IIRC, mostly related to the copy > and pickle registries). > > Given that it checks the whole standard library, test___all__ is > likely to run into the same problem. > > Yes, I'm running into all kinds of weird problems when saving/restoring sys.modules around test___all__. This is not the first time I get to fight this test-run-dependency problem and it's very frustrating. This may be a naive question, but why don't we run the tests in separate interpreters? For example with -j we do (which creates all kinds of strange intermittent problems depending on which tests got bundled into the same process). Is this a matter of performance? Because that would help get rid of these dependencies between tests, which would probably save core devs some work and headache. After all, since a lot of the interpreter state is global (for example sys.modules), does it not make sense to run each test in a clean environment? Many tests do fancy things with the global environment which makes them difficult to keep clean and separate. > Hence test.support.import_fresh_module - it can ensure you get the > module you want, regardless of the preexisting contents of > sys.modules. ( > http://docs.python.org/dev/library/test#test.support.import_fresh_module) > Yes, this is the solution currently used in test_xml_etree. However, once pickling tests are added things stop working. Pickle uses __import__ to import the module a class belongs to, bypassing all such trickery. So if test___all__ got _elementtree into sys.modules, pickle's __import__ finds it even if all the tests in test_xml_etree manage to ignore it for the Python version because they use import_fresh_module. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at bytereef.org Sun Dec 30 14:54:20 2012 From: stefan at bytereef.org (Stefan Krah) Date: Sun, 30 Dec 2012 14:54:20 +0100 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: Message-ID: <20121230135420.GA24322@sleipnir.bytereef.org> Eli Bendersky wrote: > Yes, this is the solution currently used in test_xml_etree. However, once > pickling tests are added things stop working. Pickle uses __import__ to import > the module a class belongs to, bypassing all such trickery. So if test___all__ > got _elementtree into sys.modules, pickle's __import__ finds it even if all the > tests in test_xml_etree manage to ignore it for the Python version because they > use import_fresh_module. I ran into the same problem for test_decimal. The only thing that appears to work is to set sys.modules['decimal'] explicitly before calling dumps()/loads(). See: PythonAPItests.test_pickle() ContextAPItests.test_pickle() test_decimal ruthlessly switches sys.modules['decimal'] many times. At the end of all tests there is a sanity check that asserts that the number of changes were in fact balanced. Stefan Krah From eliben at gmail.com Sun Dec 30 15:06:30 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 30 Dec 2012 06:06:30 -0800 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: <20121230135420.GA24322@sleipnir.bytereef.org> References: <20121230135420.GA24322@sleipnir.bytereef.org> Message-ID: On Sun, Dec 30, 2012 at 5:54 AM, Stefan Krah wrote: > Eli Bendersky wrote: > > Yes, this is the solution currently used in test_xml_etree. 
However, once > > pickling tests are added things stop working. Pickle uses __import__ to > import > > the module a class belongs to, bypassing all such trickery. So if > test___all__ > > got _elementtree into sys.modules, pickle's __import__ finds it even if > all the > > tests in test_xml_etree manage to ignore it for the Python version > because they > > use import_fresh_module. > > I ran into the same problem for test_decimal. The only thing that appears > to work is to set sys.modules['decimal'] explicitly before calling > dumps()/loads(). See: > > PythonAPItests.test_pickle() > ContextAPItests.test_pickle() > > > test_decimal ruthlessly switches sys.modules['decimal'] many times. At the > end of all tests there is a sanity check that asserts that the number of > changes were in fact balanced. Thank you, I'll try this. I'm also experimenting with other approaches. By the way, from clean default checkout: $ ./python -mtest test___all__ test_int [1/2] test___all__ [2/2] test_int test test_int failed -- Traceback (most recent call last): File "/home/eliben/python-src/default/Lib/test/test_int.py", line 236, in test_keyword_args self.assertRaises(TypeError, int, base=10) AssertionError: TypeError not raised by int 1 test OK. 1 test failed: test_int Should this really fail? [I haven't investigated the root cause yet] Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Sun Dec 30 15:17:01 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 30 Dec 2012 06:17:01 -0800 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: <20121230135420.GA24322@sleipnir.bytereef.org> Message-ID: On Sun, Dec 30, 2012 at 6:06 AM, Eli Bendersky wrote: > > > > On Sun, Dec 30, 2012 at 5:54 AM, Stefan Krah wrote: > >> Eli Bendersky wrote: >> > Yes, this is the solution currently used in test_xml_etree. However, >> once >> > pickling tests are added things stop working. Pickle uses __import__ to >> import >> > the module a class belongs to, bypassing all such trickery. So if >> test___all__ >> > got _elementtree into sys.modules, pickle's __import__ finds it even if >> all the >> > tests in test_xml_etree manage to ignore it for the Python version >> because they >> > use import_fresh_module. >> >> I ran into the same problem for test_decimal. The only thing that appears >> to work is to set sys.modules['decimal'] explicitly before calling >> dumps()/loads(). See: >> >> PythonAPItests.test_pickle() >> ContextAPItests.test_pickle() >> >> >> test_decimal ruthlessly switches sys.modules['decimal'] many times. At the >> end of all tests there is a sanity check that asserts that the number of >> changes were in fact balanced. > > > Thank you, I'll try this. I'm also experimenting with other approaches. > > By the way, from clean default checkout: > > $ ./python -mtest test___all__ test_int > [1/2] test___all__ > [2/2] test_int > test test_int failed -- Traceback (most recent call last): > File "/home/eliben/python-src/default/Lib/test/test_int.py", line 236, > in test_keyword_args > self.assertRaises(TypeError, int, base=10) > AssertionError: TypeError not raised by int > > 1 test OK. > 1 test failed: > test_int > > Should this really fail? [I haven't investigated the root cause yet] > This is a false alarm, sorry. Please ignore this report (I had some stale build artifacts). Eli -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eliben at gmail.com Sun Dec 30 15:19:50 2012 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 30 Dec 2012 06:19:50 -0800 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: <20121230135420.GA24322@sleipnir.bytereef.org> References: <20121230135420.GA24322@sleipnir.bytereef.org> Message-ID: On Sun, Dec 30, 2012 at 5:54 AM, Stefan Krah wrote: > Eli Bendersky wrote: > > Yes, this is the solution currently used in test_xml_etree. However, once > > pickling tests are added things stop working. Pickle uses __import__ to > import > > the module a class belongs to, bypassing all such trickery. So if > test___all__ > > got _elementtree into sys.modules, pickle's __import__ finds it even if > all the > > tests in test_xml_etree manage to ignore it for the Python version > because they > > use import_fresh_module. > > I ran into the same problem for test_decimal. The only thing that appears > to work is to set sys.modules['decimal'] explicitly before calling > dumps()/loads(). See: > > PythonAPItests.test_pickle() > ContextAPItests.test_pickle() > > Yes, this seems to have done the trick. Thanks for the suggestion. I'm still curious about the test-in-clean-env question though. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at bytereef.org Sun Dec 30 15:31:34 2012 From: stefan at bytereef.org (Stefan Krah) Date: Sun, 30 Dec 2012 15:31:34 +0100 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: <20121230135420.GA24322@sleipnir.bytereef.org> Message-ID: <20121230143134.GA24716@sleipnir.bytereef.org> Eli Bendersky wrote: > Yes, this seems to have done the trick. Thanks for the suggestion. > > I'm still curious about the test-in-clean-env question though. I think that in general we do want to check unexpected interactions between tests, that's also why the test order is randomized. Here's an example that might not have been caught with clean-env tests: http://bugs.python.org/issue7384 Stefan Krah From ncoghlan at gmail.com Sun Dec 30 15:38:47 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 31 Dec 2012 00:38:47 +1000 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: <20121230135420.GA24322@sleipnir.bytereef.org> Message-ID: On Mon, Dec 31, 2012 at 12:19 AM, Eli Bendersky wrote: > > > > On Sun, Dec 30, 2012 at 5:54 AM, Stefan Krah wrote: >> >> Eli Bendersky wrote: >> > Yes, this is the solution currently used in test_xml_etree. However, >> > once >> > pickling tests are added things stop working. Pickle uses __import__ to >> > import >> > the module a class belongs to, bypassing all such trickery. So if >> > test___all__ >> > got _elementtree into sys.modules, pickle's __import__ finds it even if >> > all the >> > tests in test_xml_etree manage to ignore it for the Python version >> > because they >> > use import_fresh_module. >> >> I ran into the same problem for test_decimal. The only thing that appears >> to work is to set sys.modules['decimal'] explicitly before calling >> dumps()/loads(). See: >> >> PythonAPItests.test_pickle() >> ContextAPItests.test_pickle() >> > > Yes, this seems to have done the trick. Thanks for the suggestion. It may be worth offering a context manager/decorator equivalent to "import_fresh_module". > I'm still curious about the test-in-clean-env question though. 
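(For illustration, a context-manager take on Stefan's sys.modules
pinning trick might look roughly like this -- a sketch only; the
pinned_module name is invented and nothing like it exists in
test.support today:

    import contextlib
    import sys

    @contextlib.contextmanager
    def pinned_module(name, module):
        # Force sys.modules[name] to a known module object for the
        # duration of the block, then put back whatever was there
        # before.  Useful around pickle round-trips, since pickle's
        # __import__ bypasses import_fresh_module's trickery.
        saved = sys.modules.get(name)
        sys.modules[name] = module
        try:
            yield module
        finally:
            if saved is None:
                del sys.modules[name]
            else:
                sys.modules[name] = saved

A test could then write, e.g., "with pinned_module('decimal', P):
pickle.loads(pickle.dumps(obj))" to guarantee that the pure-Python
version is the one pickle sees.)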
As Stefan noted, the main advantage we get is that sometimes the failure to clean up properly is in the standard lib code rather than the tests, and with complete isolation we'd be less likely to notice the problem. Once you combine that with the fact that rearchitecting regrtest to work that way would be quite a bit of work, the motivation to make it happen goes way down. However, specifically spinning out the "import the world" tests like test_pydoc and test___all__ to a separate process might be worth the effort. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rdmurray at bitdance.com Sun Dec 30 15:48:30 2012 From: rdmurray at bitdance.com (R. David Murray) Date: Sun, 30 Dec 2012 09:48:30 -0500 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: <20121230135420.GA24322@sleipnir.bytereef.org> Message-ID: <20121230144830.F00802500B2@webabinitio.net> On Mon, 31 Dec 2012 00:38:47 +1000, Nick Coghlan wrote: > On Mon, Dec 31, 2012 at 12:19 AM, Eli Bendersky wrote: > > On Sun, Dec 30, 2012 at 5:54 AM, Stefan Krah wrote: > >> > >> Eli Bendersky wrote: > >> > Yes, this is the solution currently used in test_xml_etree. However, > >> > once > >> > pickling tests are added things stop working. Pickle uses __import__ to > >> > import > >> > the module a class belongs to, bypassing all such trickery. So if > >> > test___all__ > >> > got _elementtree into sys.modules, pickle's __import__ finds it even if > >> > all the > >> > tests in test_xml_etree manage to ignore it for the Python version > >> > because they > >> > use import_fresh_module. > >> > >> I ran into the same problem for test_decimal. The only thing that appears > >> to work is to set sys.modules['decimal'] explicitly before calling > >> dumps()/loads(). See: > >> > >> PythonAPItests.test_pickle() > >> ContextAPItests.test_pickle() > > > > Yes, this seems to have done the trick. > > It may be worth offering a context manager/decorator equivalent to > "import_fresh_module". I suggested making import_fresh_module a context manager in the issue that Eli opened about test___all__. > > I'm still curious about the test-in-clean-env question though. > > As Stefan noted, the main advantage we get is that sometimes the > failure to clean up properly is in the standard lib code rather than > the tests, and with complete isolation we'd be less likely to notice > the problem. > > Once you combine that with the fact that rearchitecting regrtest to > work that way would be quite a bit of work, the motivation to make it > happen goes way down. > > However, specifically spinning out the "import the world" tests like > test_pydoc and test___all__ to a separate process might be worth the > effort. Adding something to regrtest (or unittest?) so that certain nominated test modules are run in a subprocess has been discussed previously, but so far no one has stepped up to implement it :) (I think this came up originally for test_site, but I don't remember for sure.) --David From tseaver at palladion.com Sun Dec 30 17:39:21 2012 From: tseaver at palladion.com (Tres Seaver) Date: Sun, 30 Dec 2012 11:39:21 -0500 Subject: [Python-Dev] Draft PEP for time zone support.
In-Reply-To: <50E0154A.6070108@pearwood.info> References: <50C8541E.60406@python.org> <7wsj71vamk.fsf@benfinney.id.au> <20121220114315.554a52ac@resist.wooz.org> <50DE45F4.5060108@pearwood.info> <50E0154A.6070108@pearwood.info> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 12/30/2012 05:19 AM, Steven D'Aprano wrote: > There is no "Etc/GMT+11" named here: > > http://en.wikipedia.org/wiki/List_of_tz_database_time_zones > > nor is it included in /usr/share/zoneinfo/zone.tab in either of the > systems I looked at (one Debian, one Centos), but there is Etc/GMT. So > I conclude that the PEP allows timezone rules, not just names. FWIW, my Ubuntu box has zone data for 'Etc/GMT+11': $ file /usr/share/zoneinfo/posix/Etc/GMT+11 /usr/share/zoneinfo/posix/Etc/GMT+11: timezone data, version 2, \ 1 gmt time flag, 1 std time flag, no leap seconds, no transition times, \ 1 abbreviation char Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iEYEARECAAYFAlDgbjkACgkQ+gerLs4ltQ6w2QCgzqAFfOAigwVZMZEh+il+0grb jsYAoMm1g8xnXe1dcgkFMEX0n14grDSH =rCdb -----END PGP SIGNATURE----- From greg.ewing at canterbury.ac.nz Sun Dec 30 22:25:25 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 31 Dec 2012 10:25:25 +1300 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: References: Message-ID: <50E0B145.1050905@canterbury.ac.nz> Richard Oudkerk wrote: > Personally I would like to get rid of the "purge globals" behaviour for > modules deleted before shutdown has started: if someone manipulates > sys.modules then they can just call gc.collect() if they want to > promptly get rid of orphaned reference cycles. Now that we have cyclic gc, is there any need for the shutdown purge at all? -- Greg From barry at barrys-emacs.org Sun Dec 30 22:50:26 2012 From: barry at barrys-emacs.org (Barry Scott) Date: Sun, 30 Dec 2012 21:50:26 +0000 Subject: [Python-Dev] PYTHONPATH processing change from 2.6 to 2.7 and Mac bundle builder problems In-Reply-To: References: <20121228233016.377aeffb@pitrou.net> <9E6E3321-B0E7-4E77-AFCB-9C78556499EF@barrys-emacs.org> Message-ID: <5C3D2F1F-FC03-4C26-A2C4-CABEC729E271@barrys-emacs.org> Issue filed as http://bugs.python.org/issue16821 I now have a fix that I can use, a trivial patch to the bundlebuilder.py from 2.6 gives me working code. The bundlebuilder in 2.7 is not in good shape, the code changes from the 2.6 version have bugs in them, at least one is a show stopper. I'd have to assume that the code was not tested. I'd suggest that you revert to the 2.6 version and apply the patch in the bug report so that this can make it into a 2.7.4 if you do another 2.7 release. I also noticed that it says that bundlebuilder will not be in python 3. Do you expect this functionality to be maintained outside of the core python code? Barry On 28 Dec 2012, at 23:57, Ned Deily wrote: > In article <9E6E3321-B0E7-4E77-AFCB-9C78556499EF at barrys-emacs.org>, > Barry Scott wrote: >> You did not set PYTHONHOME; that affects the code in calculate_path a lot. >> Also there is platform specific code in that code.
>> On 28 Dec 2012, at 22:30, Antoine Pitrou wrote: >>> On Fri, 28 Dec 2012 21:39:56 +0000 >>> Barry Scott wrote: >>>> I'm trying to track down why bundlebuilder no longer works with python 2.7 >>>> to create runnable Mac OS X apps. >>>> >>>> I have got as far as seeing that imports of modules are failing. >>>> >>>> What I see is that sys.path does not contain all the elements from the >>>> PYTHONPATH variable. >>>> >>>> No matter what I put in PYTHONPATH only the first element is in sys.path. >>> >>> I can't reproduce under Linux: >>> >>> $ PYTHONPATH=/x:/y python -Sc "import sys; print(sys.path)" >>> ['', '/x', '/y', '/usr/lib/python27.zip', '/usr/lib64/python2.7/', >>> '/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk', >>> '/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload'] > > Barry, > > I think this discussion should be taking place on the bug tracker > (http://bugs.python.org), rather than in python-dev. bundlebuilder is > unique to OS X and fairly esoteric. Please open an issue there and > include a sample of how you created an app with bundlebuilder and what > Python 2.7 version you are using and what version of OS X. > > Thanks! > > -- > Ned Deily, > nad at acm.org > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/barry%40barrys-emacs.org > From solipsis at pitrou.net Sun Dec 30 22:52:37 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 30 Dec 2012 22:52:37 +0100 Subject: [Python-Dev] test___all__ polluting sys.modules? References: <50E0B145.1050905@canterbury.ac.nz> Message-ID: <20121230225237.55e2f539@pitrou.net> On Mon, 31 Dec 2012 10:25:25 +1300 Greg Ewing wrote: > Richard Oudkerk wrote: > > Personally I would like to get rid of the "purge globals" behaviour for > > modules deleted before shutdown has started: if someone manipulates > > sys.modules then they can just call gc.collect() if they want to > > promptly get rid of orphaned reference cycles. > > Now that we have cyclic gc, is there any need for the > shutdown purge at all? If you have an object with a __del__ method as a module global, the cyclic gc will refuse to consider the module globals at all (which means it will affect unrelated objects). So, yes, I think the shutdown purge is still necessary. Perhaps there are ways to make it smarter. Regards Antoine. From steve at pearwood.info Mon Dec 31 03:06:21 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 31 Dec 2012 13:06:21 +1100 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: <20121229200419.4fc73e12@pitrou.net> Message-ID: <50E0F31D.10507@pearwood.info> On 30/12/12 07:16, Lennart Regebro wrote: >>> If no database is found an ``UnknownTimeZoneError`` or subclass >> thereof >>> will >>> be raised with a message explaining that no zoneinfo database can be >>> found, >>> but that you can install one with the ``tzdata-update`` package. >> >> Why should we care about that situation if we *do* provide a database? >> Distributions can decide to exclude some files from their packages, but >> it's their problem, not ours. >> > > Yes, but a comprehensible error message is useful even if somebody messed > up the system/configuration. 
+1 -- Steven From ncoghlan at gmail.com Mon Dec 31 06:40:39 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 31 Dec 2012 15:40:39 +1000 Subject: [Python-Dev] BDFL delegate for Daniel Holth's packaging PEPs? Message-ID: Does anyone object to my naming myself as BDFL-Delegate for Daniel Holth's packaging PEPs? PEP 425 Compatibility Tags for Built Distributions PEP 426 Metadata for Python Software Packages 1.3 PEP 427 The Wheel Binary Package Format 0.1 I've mentioned doing so before, but IIRC it was in the depths of a larger thread, so I figured I should make a separate post before claiming them in the PEPs repo. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From arfrever.fta at gmail.com Mon Dec 31 10:45:08 2012 From: arfrever.fta at gmail.com (Arfrever Frehtes Taifersar Arahesis) Date: Mon, 31 Dec 2012 10:45:08 +0100 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: <20121230144830.F00802500B2@webabinitio.net> References: <20121230144830.F00802500B2@webabinitio.net> Message-ID: <201212311045.10552.Arfrever.FTA@gmail.com> 2012-12-30 15:48:30 R. David Murray wrote: > On Mon, 31 Dec 2012 00:38:47 +1000, Nick Coghlan wrote: > > However, specifically spinning out the "import the world" tests like > > test_pydoc and test___all__ to a separate process might be worth the > > effort. > > Adding something to regrtest (or unittest?) so that certain nominated > test modules are run in a subprocess has been discussed previously, but > so far no one has stepped up to implement it :) Actually, patches were implemented about 2 years ago, but nobody committed them. http://bugs.python.org/issue1674555 > (I think this came up originally for test_site, but I don't remember for sure.) Yes, test_site. -- Arfrever Frehtes Taifersar Arahesis -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: This is a digitally signed message part. URL: From solipsis at pitrou.net Mon Dec 31 12:44:25 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 31 Dec 2012 12:44:25 +0100 Subject: [Python-Dev] BDFL delegate for Daniel Holth's packaging PEPs? References: Message-ID: <20121231124425.48605328@pitrou.net> On Mon, 31 Dec 2012 15:40:39 +1000 Nick Coghlan wrote: > Does anyone object to my naming myself as BDFL-Delegate for Daniel > Holth's packaging PEPs? > > PEP 425 Compatibility Tags for Built Distributions > PEP 426 Metadata for Python Software Packages 1.3 > PEP 427 The Wheel Binary Package Format 0.1 > > I've mentioned doing so before, but IIRC it was in the depths of a > larger thread, so I figured I should make a separate post before > claiming them in the PEPs repo. Ok for me. It would be nice if one of the past distutils maintainers gave their approval too, but they don't seem very active. Regards Antoine. From tarek at ziade.org Mon Dec 31 13:36:23 2012 From: tarek at ziade.org (Tarek Ziadé) Date: Mon, 31 Dec 2012 13:36:23 +0100 Subject: [Python-Dev] BDFL delegate for Daniel Holth's packaging PEPs? In-Reply-To: <20121231124425.48605328@pitrou.net> References: <20121231124425.48605328@pitrou.net> Message-ID: <50E186C7.7070103@ziade.org> On 12/31/12 12:44 PM, Antoine Pitrou wrote: > On Mon, 31 Dec 2012 15:40:39 +1000 > Nick Coghlan wrote: >> Does anyone object to my naming myself as BDFL-Delegate for Daniel >> Holth's packaging PEPs?
>>
>> PEP 425 Compatibility Tags for Built Distributions >> PEP 426 Metadata for Python Software Packages 1.3 >> PEP 427 The Wheel Binary Package Format 0.1 >> >> I've mentioned doing so before, but IIRC it was in the depths of a >> larger thread, so I figured I should make a separate post before >> claiming them in the PEPs repo. > Ok for me. It would be nice if one of the past distutils maintainers > gave their approval too, but they don't seem very active. FWIW I think Nick is perfect for this job. Cheers Tarek -- Tarek Ziadé - http://ziade.org - @tarek_ziade From solipsis at pitrou.net Mon Dec 31 13:51:08 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 31 Dec 2012 13:51:08 +0100 Subject: [Python-Dev] BDFL delegate for Daniel Holth's packaging PEPs? References: <20121231124425.48605328@pitrou.net> <50E186C7.7070103@ziade.org> Message-ID: <20121231135108.3e2f5365@pitrou.net> On Mon, 31 Dec 2012 13:36:23 +0100 Tarek Ziadé wrote: > On 12/31/12 12:44 PM, Antoine Pitrou wrote: > > On Mon, 31 Dec 2012 15:40:39 +1000 > > Nick Coghlan wrote: > >> Does anyone object to my naming myself as BDFL-Delegate for Daniel > >> Holth's packaging PEPs? > >> > >> PEP 425 Compatibility Tags for Built Distributions > >> PEP 426 Metadata for Python Software Packages 1.3 > >> PEP 427 The Wheel Binary Package Format 0.1 > >> > >> I've mentioned doing so before, but IIRC it was in the depths of a > >> larger thread, so I figured I should make a separate post before > >> claiming them in the PEPs repo. > > Ok for me. It would be nice if one of the past distutils maintainers > > gave their approval too, but they don't seem very active. > FWIW I think Nick is perfect for this job. I meant approval for the PEP, not for Nick :) From regebro at gmail.com Mon Dec 31 14:17:51 2012 From: regebro at gmail.com (Lennart Regebro) Date: Mon, 31 Dec 2012 14:17:51 +0100 Subject: [Python-Dev] Draft PEP for time zone support. In-Reply-To: <50E0154A.6070108@pearwood.info> References: <50C8541E.60406@python.org> <7wsj71vamk.fsf@benfinney.id.au> <20121220114315.554a52ac@resist.wooz.org> <50DE45F4.5060108@pearwood.info> <50E0154A.6070108@pearwood.info> Message-ID: On Sun, Dec 30, 2012 at 11:19 AM, Steven D'Aprano wrote: > If I've understood it correctly, that contradicts the PEP. One example > given is "Etc/GMT+11", which is not a timezone *name*, but a timezone > name *plus an offset*. Presumably if GMT+11 is legal, so should be > GMT+10:30. > This depends on your definition of a timezone name. There is no generally accepted authority for time zone names, the closest one we get is the zoneinfo database itself, which is maintained by ICANN. It has an Etc/GMT+11. There is no "Etc/GMT+11" named here: > > http://en.wikipedia.org/wiki/List_of_tz_database_time_zones > It exists in the database files, http://www.iana.org/time-zones, the ``etcetera`` file. > nor is it included in /usr/share/zoneinfo/zone.tab in either of the systems > zone.tab contains none of the Etc/Something zones. > I looked at (one Debian, one Centos), but there is Etc/GMT. So I conclude > that the PEP allows timezone rules, not just names. > This is more problematic, and for that reason I'll change the PEP to use another example. > Either way, I think the PEP needs to clarify what counts as a valid name > string. A timezone file that exists in the zoneinfo database used. Perhaps the database is out-of-date, or the government has suddenly declared > a daylight savings change that isn't reflected yet in the database.
Or you want to set your own TZ rules for testing. Or you've just declared > independence from the central government and are setting up your own TZ rules. > > time.tzset supports rules as well as names. Is there some reason why this > module should not do the same? > You will be able to make your own rules; the simplest way is probably to add them to your zoneinfo database. Doing so is however not trivial, and outside of the scope of this PEP. //Lennart -------------- next part -------------- An HTML attachment was scrubbed... URL: From regebro at gmail.com Mon Dec 31 14:18:55 2012 From: regebro at gmail.com (Lennart Regebro) Date: Mon, 31 Dec 2012 14:18:55 +0100 Subject: [Python-Dev] PEP 431 Time zone support improvements - Update In-Reply-To: References: Message-ID: On Sun, Dec 30, 2012 at 10:47 AM, Ronald Oussoren wrote: > The module could be split into several modules in a package without > affecting the public API if that would help with maintenance, similar to > unittest. > This is of course true. Maybe that is a good idea. //Lennart -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Dec 31 14:40:31 2012 From: brett at python.org (Brett Cannon) Date: Mon, 31 Dec 2012 08:40:31 -0500 Subject: [Python-Dev] FYI: don't CC the peps mailing list Message-ID: Since this has happened for the second time in the past month, I want to prevent a trend from starting here. Please do not CC the peps mailing list on any discussions as it makes it impossible to know what emails are about an actual update vs. people replying to some discussion which in no way affects the PEP editors. Emails to the peps list should deal only with adding/updating peps and only between the PEP editors and PEP authors. Lennart, I tossed all emails that got held up in moderation, so please check that the latest version of your PEP is in as you want, else send a patch to the peps list directly (inlined text is not as easy to work with). -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdmurray at bitdance.com Mon Dec 31 15:06:04 2012 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 31 Dec 2012 09:06:04 -0500 Subject: [Python-Dev] BDFL delegate for Daniel Holth's packaging PEPs? In-Reply-To: <20121231135108.3e2f5365@pitrou.net> References: <20121231124425.48605328@pitrou.net> <50E186C7.7070103@ziade.org> <20121231135108.3e2f5365@pitrou.net> Message-ID: <20121231140605.4B5612500B2@webabinitio.net> On Mon, 31 Dec 2012 13:51:08 +0100, Antoine Pitrou wrote: > On Mon, 31 Dec 2012 13:36:23 +0100 > Tarek Ziadé wrote: > > On 12/31/12 12:44 PM, Antoine Pitrou wrote: > > > On Mon, 31 Dec 2012 15:40:39 +1000 > > > Nick Coghlan wrote: > > >> Does anyone object to my naming myself as BDFL-Delegate for Daniel > > >> Holth's packaging PEPs? > > >> > > >> PEP 425 Compatibility Tags for Built Distributions > > >> PEP 426 Metadata for Python Software Packages 1.3 > > >> PEP 427 The Wheel Binary Package Format 0.1 > > >> > > >> I've mentioned doing so before, but IIRC it was in the depths of a > > >> larger thread, so I figured I should make a separate post before > > >> claiming them in the PEPs repo. > > > Ok for me. It would be nice if one of the past distutils maintainers > > > gave their approval too, but they don't seem very active. > > FWIW I think Nick is perfect for this job. > > I meant approval for the PEP, not for Nick :)
> > I meant approval for the PEP, not for Nick :) Well, if Nick is "perfect" for the job, this amounts to the almost the same thing :) --David From brett at python.org Mon Dec 31 15:35:17 2012 From: brett at python.org (Brett Cannon) Date: Mon, 31 Dec 2012 09:35:17 -0500 Subject: [Python-Dev] [Python-checkins] peps: Further PEP 432 updates In-Reply-To: <3YZ2nZ4h29zRy8@mail.python.org> References: <3YZ2nZ4h29zRy8@mail.python.org> Message-ID: On Sun, Dec 30, 2012 at 8:39 AM, nick.coghlan wrote: > [SNIP] > The ``-E`` command line option allows all environment variables to be > -ignored when initialising the Python interpreter. An embedding application > +ignored when initializing the Python interpreter. An embedding application > can enable this behaviour by setting ``Py_IgnoreEnvironmentFlag`` before > calling ``Py_Initialize()``. > > In the CPython source code, the ``Py_GETENV`` macro implicitly checks this > flag, and always produces ``NULL`` if it is set. > > + > > > That is true and that is a bug. =) http://bugs.python.org/issue16826 -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael at voidspace.org.uk Mon Dec 31 16:26:53 2012 From: michael at voidspace.org.uk (Michael Foord) Date: Mon, 31 Dec 2012 15:26:53 +0000 Subject: [Python-Dev] Fwd: Broken links on http://www.python.org/download/mac/tcltk/#activetcl-8-5-11 References: Message-ID: <5702D9FF-5A6A-4131-A093-1054155D44DD@voidspace.org.uk> There's a problem with the instructions for using Active TCL with Python for the Mac. Michael Begin forwarded message: > From: "Dr. Anthony G. Francis, Jr." > Subject: Broken links on http://www.python.org/download/mac/tcltk/#activetcl-8-5-11 > Date: 19 December 2012 22:08:02 GMT > To: webmaster at python.org > > Hey gang, > > The Python install instructions for Tck/TK appear to be out of date: > > http://www.python.org/download/mac/tcltk/#activetcl-8-5-11 > As of this writing, there are two known serious problems with the most recent ActivelTcl 8.5 releases for Mac OS X, 8.5.12 and 8.5.12.1. (See Issues #15574 and #15853 for more information.) Until these issues are resolved, use the previous release of ActiveTcl, 8.5.11.1, which is available for download here. This is an Aqua Cocoa Tk. > > But the "download here" link is borken: > > http://downloads.activestate.com/ActiveTcl/releases/8.5.11.1/ > Not Found > The requested URL /ActiveTcl/releases/8.5.11.1/ was not found on this server. > Apache/2.2.3 (CentOS) Server at downloads.activestate.com Port 80 > > As it so turns out, this is on purpose on ActiveState's part: > > http://www.activestate.com/activetcl/downloads > Looking for access to older versions of ActiveTcl? > Community Edition offers access to the newest versions of ActiveTcl. > Access to older versions is available in Business Edition and Enterprise Edition. > > Business edition is a thousand bucks a pop ( http://www.activestate.com/business-edition ) so I has a bit of a sad - I was just trying to run a quick experiment and have dived down a rabbit hole. It appears also that ActiveTcl has gone beyond the buggy edition listed on the Python.org tcltk page, so I'm going to give that a try. > > -Anthony > -- > Dr. Anthony G. Francis, Jr. ~ Renaissance Engineer ~ 408 221 7894 > centaur at dresan.com ~ www.fanufiku.com ~ www.dakotafrost.com -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. 
-- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Dec 31 16:38:22 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 1 Jan 2013 01:38:22 +1000 Subject: [Python-Dev] BDFL delegate for Daniel Holth's packaging PEPs? In-Reply-To: <50E186C7.7070103@ziade.org> References: <20121231124425.48605328@pitrou.net> <50E186C7.7070103@ziade.org> Message-ID: On Mon, Dec 31, 2012 at 10:36 PM, Tarek Ziadé wrote: > On 12/31/12 12:44 PM, Antoine Pitrou wrote: >> On Mon, 31 Dec 2012 15:40:39 +1000 >> Nick Coghlan wrote: >>> Does anyone object to my naming myself as BDFL-Delegate for Daniel >>> Holth's packaging PEPs? >>> >>> PEP 425 Compatibility Tags for Built Distributions >>> PEP 426 Metadata for Python Software Packages 1.3 >>> PEP 427 The Wheel Binary Package Format 0.1 >>> >>> I've mentioned doing so before, but IIRC it was in the depths of a >>> larger thread, so I figured I should make a separate post before >>> claiming them in the PEPs repo. >> >> Ok for me. It would be nice if one of the past distutils maintainers >> gave their approval too, but they don't seem very active. > > FWIW I think Nick is perfect for this job. Picking the other people they want a +1 from before giving their own approval is one of the privileges of BDFL delegation, and indeed I'd include Tarek and/or Éric in my list for these ones :) I'll update the PEPs accordingly. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From eliben at gmail.com Mon Dec 31 17:29:51 2012 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 31 Dec 2012 08:29:51 -0800 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: <201212311045.10552.Arfrever.FTA@gmail.com> References: <20121230144830.F00802500B2@webabinitio.net> <201212311045.10552.Arfrever.FTA@gmail.com> Message-ID: On Mon, Dec 31, 2012 at 1:45 AM, Arfrever Frehtes Taifersar Arahesis < arfrever.fta at gmail.com> wrote: > 2012-12-30 15:48:30 R. David Murray wrote: > > On Mon, 31 Dec 2012 00:38:47 +1000, Nick Coghlan > wrote: > > > However, specifically spinning out the "import the world" tests like > > > test_pydoc and test___all__ to a separate process might be worth the > > > effort. > > > > Adding something to regrtest (or unittest?) so that certain nominated > > test modules are run in a subprocess has been discussed previously, but > > so far no one has stepped up to implement it :) > > Actually, patches were implemented about 2 years ago, but nobody > committed them. > http://bugs.python.org/issue1674555 > > > (I think this came up originally for test_site, but I don't remember for > sure.) > > Yes, test_site. > Thank you Arfrever. I'll take a look at those patches. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From shibturn at gmail.com Mon Dec 31 18:35:32 2012 From: shibturn at gmail.com (Richard Oudkerk) Date: Mon, 31 Dec 2012 17:35:32 +0000 Subject: [Python-Dev] test___all__ polluting sys.modules? In-Reply-To: <20121230225237.55e2f539@pitrou.net> References: <50E0B145.1050905@canterbury.ac.nz> <20121230225237.55e2f539@pitrou.net> Message-ID: On 30/12/2012 9:52pm, Antoine Pitrou wrote: > If you have an object with a __del__ method as a module global, the > cyclic gc will refuse to consider the module globals at all (which > means it will affect unrelated objects). > > So, yes, I think the shutdown purge is still necessary.
> Perhaps there are ways to make it smarter. With my earlier suggestion a module deleted from sys.modules before shutdown can have an unreclaimable global dict (if it contains a global with a __del__ method). Perhaps, instead, modules could use a weakrefable subclass of dict for their globals dicts. The module destructor could save the global dicts of deleted modules in a registry. At shutdown any remaining globals dicts can be purged. -- Richard From nad at acm.org Mon Dec 31 19:38:16 2012 From: nad at acm.org (Ned Deily) Date: Mon, 31 Dec 2012 10:38:16 -0800 Subject: [Python-Dev] Fwd: Broken links on http://www.python.org/download/mac/tcltk/#activetcl-8-5-11 References: <5702D9FF-5A6A-4131-A093-1054155D44DD@voidspace.org.uk> Message-ID: In article <5702D9FF-5A6A-4131-A093-1054155D44DD at voidspace.org.uk>, Michael Foord wrote: > There's a problem with the instructions for using Active TCL with Python for > the Mac. No longer an issue: I updated the web page on 12-26: Revision history: 2012-12-26 - updated for ActiveTcl 8.5.13 and Issue 15853 patch installer -- Ned Deily, nad at acm.org