From alexandre at peadrop.com Sun Sep 2 19:14:45 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Sun, 2 Sep 2007 13:14:45 -0400
Subject: [Python-Dev] Avoiding cascading test failures
In-Reply-To: <43aa6ff70708282015q56699bebk4bc52a38749d121e@mail.gmail.com>
References: <43aa6ff70708282015q56699bebk4bc52a38749d121e@mail.gmail.com>
Message-ID:

On 8/28/07, Collin Winter wrote:
> On 8/22/07, Alexandre Vassalotti wrote:
> > When I was fixing tests failing in the py3k branch, I found the number
> > of duplicate failures annoying. Often, a single bug, in an important
> > method or function, caused a large number of test cases to fail. So, I
> > thought of a simple mechanism for avoiding such cascading failures.
> >
> > My solution is to add a notion of dependency to testcases. A typical
> > usage would look like this:
> >
> >     @depends('test_getvalue')
> >     def test_writelines(self):
> >         ...
> >         memio.writelines([buf] * 100)
> >         self.assertEqual(memio.getvalue(), buf * 100)
> >         ...
>
> This definitely seems like a neat idea. Some thoughts:
>
> * How do you deal with dependencies that cross test modules? Say
>   test A depends on test B, how do we know whether it's worthwhile
>   to run A if B hasn't been run yet? It looks like you run the test
>   anyway (I haven't studied the code closely), but that doesn't
>   seem ideal.

I am not sure what you mean by "test modules". Do you mean module in
the Python sense, or like a test-case class?

> * This might be implemented in the wrong place. For example, the [x
>   for x in dir(self) if x.startswith('test')] you do is most certainly
>   better-placed in a custom TestLoader implementation.

That certainly is a good suggestion. I am not sure yet how I will
implement my idea in the unittest module. However, I am pretty sure
that it will be quite different from my prototype.

> But despite that, I think it's a cool idea and worth pursuing. Could
> you set up a branch (probably of py3k) so we can see how this plays
> out in the large?

Sure.
I need to finish merging pickle and cPickle for Py3k before tackling
this project, though.

-- Alexandre

From ryan.freckleton at gmail.com Mon Sep 3 06:34:26 2007
From: ryan.freckleton at gmail.com (Ryan Freckleton)
Date: Sun, 2 Sep 2007 22:34:26 -0600
Subject: [Python-Dev] Product function patch [issue 1093]
Message-ID: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>

Hello,

At one time Guido mentioned adding a built-in product() function to
cover some of the remaining use cases of the built-in reduce(). I don't
know if this function is still wanted or needed, but I've created an
implementation with tests and documentation at
http://bugs.python.org/issue1093 .

If it is still wanted, could someone review it and give me feedback on
it?

Thanks,
--
=====
--Ryan E. Freckleton

From martin at v.loewis.de Mon Sep 3 07:27:08 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 03 Sep 2007 07:27:08 +0200
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
Message-ID: <46DB9B2C.7010202@v.loewis.de>

> At one time Guido mentioned adding a built-in product() function to
> cover some of the remaining use cases of the built-in reduce().

What is the use case for product()?

Regards,
Martin

From skip at pobox.com Mon Sep 3 14:24:30 2007
From: skip at pobox.com (skip at pobox.com)
Date: Mon, 3 Sep 2007 07:24:30 -0500
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DB9B2C.7010202@v.loewis.de>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
 <46DB9B2C.7010202@v.loewis.de>
Message-ID: <18139.64766.602709.442409@montanaro.dyndns.org>

    >> At one time Guido mentioned adding a built-in product() function to
    >> cover some of the remaining use cases of the built-in reduce().

    Martin> What is the use case for product()?
As I recall, there were basically two uses of reduce(), to sum a series or
(less frequently) to take the product of a series. sum() obviously takes
care of the first use case. product() would take care of the second.

Skip

From guido at python.org Mon Sep 3 16:37:59 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 3 Sep 2007 07:37:59 -0700
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <18139.64766.602709.442409@montanaro.dyndns.org>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
 <46DB9B2C.7010202@v.loewis.de>
 <18139.64766.602709.442409@montanaro.dyndns.org>
Message-ID:

Actually, if you use Google code search, you'll find that multiplying
the numbers in a list doesn't have much use at all. After summing
numbers, joining strings is by far the most common usage -- which is
much better done with the str.join() method.

(PS. I rejected the issue; product() was proposed and rejected when
sum() was originally proposed and accepted, and I don't see anything
to change my mind.)

On 9/3/07, skip at pobox.com wrote:
>
> >> At one time Guido mentioned adding a built-in product() function to
> >> cover some of the remaining use cases of the built-in reduce().
>
> Martin> What is the use case for product()?
>
> As I recall, there were basically two uses of reduce(), to sum a series or
> (less frequently) to take the product of a series. sum() obviously takes
> care of the first use case. product() would take care of the second.
>
> Skip
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org
>

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From orsenthil at gmail.com Mon Sep 3 20:47:48 2007
From: orsenthil at gmail.com (O.R.Senthil Kumaran)
Date: Tue, 4 Sep 2007 00:17:48 +0530
Subject: [Python-Dev] Roundup issue mails "Do not thread!"
Message-ID: <20070903184748.GB5632@gmail.com>

Hi all,

Has anyone observed the missing "email-threads" issue with Roundup bug
tracker email? Is there any workaround for it?

I use mutt and find that the Roundup bug issue000xx mails are not being
threaded. It has nothing to do with my settings, I believe. The
issue000xxx emails might not have an In-Reply-To: or References:
header.

    Why are some msgs threaded and others not?

    You have some msgs which don't have correct In-Reply-To:
    References: headers (or not set at all) and you've turned on
    $strict_threads

    What do "->", "-?-" and "*>" mean in thread trees?

    When you turn off $strict_threads msgs with similar subjects get
    grouped together. In ...

--
O.R.Senthil Kumaran
http://uthcode.sarovar.org

From ty.newton at copperchipgames.com Tue Sep 4 00:20:29 2007
From: ty.newton at copperchipgames.com (Ty Newton)
Date: Tue, 04 Sep 2007 08:20:29 +1000
Subject: [Python-Dev] Porting information
Message-ID: <46DC88AD.5060001@copperchipgames.com>

Hi,

I'm looking into porting CPython to native C# (not like IronPython) so
that it can be used in game software on the XBox360: integrated with
the indie development tool XNA Game Studio Express.

I am looking for some guidance on how to approach this in the most
effective way.

I've started by looking at the parser portion of the code. However I am
not certain this is the best place to start. Since there are so many
ports I assume there is a well trodden path to completing this kind of
task.
Any suggestions would be greatly appreciated.

I would prefer to break the task into portions that can be verified
(tested for correctness) independently or as a stack (one on top of the
next). That way I can catch errors early and have more confidence in
what I am creating.

When I looked through the test suites they all seem to be written in
Python. Is there a test suite for the core of CPython i.e. before the C
code can interpret Python code?

Thanks,
Ty

From greg.ewing at canterbury.ac.nz Tue Sep 4 00:59:30 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 04 Sep 2007 10:59:30 +1200
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
Message-ID: <46DC91D2.7060407@canterbury.ac.nz>

Ryan Freckleton wrote:
> At one time Guido mentioned adding a built-in product() function to
> cover some of the remaining use cases of the built-in reduce().

Speaking of such things, I was thinking the other day
that it might be useful to have somewhere in the stdlib
a full set of functions for doing elementwise operations
and reductions on the built-in array type.

This would make it possible for one to do efficient
bulk arithmetic when the need arises from time to time
without having to pull in a heavyweight dependency
such as Numeric or numpy.

--
Greg

From martin at v.loewis.de Tue Sep 4 04:21:25 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 04 Sep 2007 04:21:25 +0200
Subject: [Python-Dev] Roundup issue mails "Do not thread!"
In-Reply-To: <20070903184748.GB5632@gmail.com>
References: <20070903184748.GB5632@gmail.com>
Message-ID: <46DCC125.8040109@v.loewis.de>

> The issue000xxx emails might not have an In-Reply-To: or References: header.

If messages are entered through the web interface, they won't have these
headers.
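
As a rough illustration of the two headers under discussion (this
sketch is not part of the original exchange): a reply that a threading
mail client can place under its parent carries the parent's Message-ID
in In-Reply-To and appends it to References. The code below uses
Python's standard email package; the make_reply helper, the subject
and the example.invalid message IDs are made up for the example, and
it makes no claim about how Roundup itself builds its mails.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def make_reply(parent, body):
    """Build a reply that threading mail clients can attach to `parent`.

    Hypothetical helper for illustration only.
    """
    reply = EmailMessage()
    subject = str(parent["Subject"] or "")
    if not subject.lower().startswith("re:"):
        subject = "Re: " + subject
    reply["Subject"] = subject
    # A fixed domain avoids a hostname lookup when generating the ID.
    reply["Message-ID"] = make_msgid(domain="example.invalid")

    parent_id = str(parent["Message-ID"])
    # In-Reply-To names the direct parent only.
    reply["In-Reply-To"] = parent_id
    # References carries the whole ancestor chain, oldest first.
    chain = str(parent["References"] or "").split()
    chain.append(parent_id)
    reply["References"] = " ".join(chain)

    reply.set_content(body)
    return reply


# Made-up parent message standing in for a tracker notification.
original = EmailMessage()
original["Subject"] = "[issue1093] Product function patch"
original["Message-ID"] = "<issue1093.msg1@example.invalid>"
original.set_content("Initial report.")

reply = make_reply(original, "Follow-up comment.")
```

A message generated without these two headers gives a client nothing
but the subject line to group by, which is the situation described
above for comments entered through the web interface.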

Regards,
Martin

From orsenthil at gmail.com Tue Sep 4 04:49:43 2007
From: orsenthil at gmail.com (O.R.Senthil Kumaran)
Date: Tue, 4 Sep 2007 08:19:43 +0530
Subject: [Python-Dev] Roundup issue mails "Do not thread!"
In-Reply-To: <46DCC125.8040109@v.loewis.de>
References: <20070903184748.GB5632@gmail.com>
 <46DCC125.8040109@v.loewis.de>
Message-ID: <20070904024943.GA3605@gmail.com>

* "Martin v. Löwis" [2007-09-04 04:21:25]:

> > The issue000xxx emails might not have an In-Reply-To: or References: header.
>
> If messages are entered through the web interface, they won't have these
> headers.

Then I should file a bug/feature request for Roundup.

How are others keeping track? Whenever I open an issue after analyzing
the email message, I find that it has already been discussed and its
state has changed; I had missed the further emails on the same issue
because they were not threaded.

Thanks,

--
O.R.Senthil Kumaran
http://uthcode.sarovar.org

From guido at python.org Tue Sep 4 04:57:49 2007
From: guido at python.org (Guido van Rossum)
Date: Mon, 3 Sep 2007 19:57:49 -0700
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DC91D2.7060407@canterbury.ac.nz>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
 <46DC91D2.7060407@canterbury.ac.nz>
Message-ID:

On 9/3/07, Greg Ewing wrote:
> Speaking of such things, I was thinking the other day
> that it might be useful to have somewhere in the stdlib
> a full set of functions for doing elementwise operations
> and reductions on the built-in array type.
>
> This would make it possible for one to do efficient
> bulk arithmetic when the need arises from time to time
> without having to pull in a heavyweight dependency
> such as Numeric or numpy.

But what's the point, given that numpy already exists? Wouldn't you
just be redoing the work that numpy has already done?
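
To make the request concrete, here is a small sketch (an illustration
only, not code from this thread) of elementwise arithmetic on the
built-in array type. The elementwise helper is a hypothetical name;
nothing like it is being claimed to exist in the stdlib. Note that the
+ operator on two arrays concatenates them rather than adding them,
and that the helper still loops at Python speed, which is the cost a
C-level implementation would be meant to avoid.

```python
from array import array
from operator import add, mul


def elementwise(op, a, b):
    """Apply a binary operator pairwise to two equal-length arrays.

    Hypothetical helper for illustration; it iterates in Python, so it
    is convenient rather than fast.
    """
    if a.typecode != b.typecode or len(a) != len(b):
        raise ValueError("arrays must share a typecode and a length")
    return array(a.typecode, map(op, a, b))


u = array("d", [1.0, 2.0, 3.0])
v = array("d", [10.0, 20.0, 30.0])

total = elementwise(add, u, v)     # pairwise sums
scaled = elementwise(mul, u, v)    # pairwise products
joined = u + v                     # concatenation, not vector addition
```

With the values above, total holds the three pairwise sums while
joined has six elements, which shows why the array type on its own
does not cover the "adding two vectors" case.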

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de Tue Sep 4 05:13:59 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 04 Sep 2007 05:13:59 +0200
Subject: [Python-Dev] Roundup issue mails "Do not thread!"
In-Reply-To: <20070904024943.GA3605@gmail.com>
References: <20070903184748.GB5632@gmail.com>
 <46DCC125.8040109@v.loewis.de>
 <20070904024943.GA3605@gmail.com>
Message-ID: <46DCCD77.7090300@v.loewis.de>

> Then I should file a bug/feature request for Roundup.

Please consider what you are asking for. How precisely should
roundup set the In-reply-to header? It won't know what message
this is a reply to, or whether it is a reply at all.

> How are others keeping track? Whenever I open an issue after analyzing
> the email message, I find that it has already been discussed and its
> state has changed; I had missed the further emails on the same issue
> because they were not threaded.

My email tool has better threading than yours, I guess. IceDove
(Thunderbird) will thread the messages by subject also. The non-threaded
ones get displayed on the second level, appearing in reply to the
original message (or, rather, the youngest message with the same
subject - just as if the message mentioned in In-Reply-To has already
been deleted).

Regards,
Martin

From martin at v.loewis.de Tue Sep 4 05:33:10 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 04 Sep 2007 05:33:10 +0200
Subject: [Python-Dev] Porting information
In-Reply-To: <46DC88AD.5060001@copperchipgames.com>
References: <46DC88AD.5060001@copperchipgames.com>
Message-ID: <46DCD1F6.2040906@v.loewis.de>

> I've started by looking at the parser portion of the code. However I am
> not certain this is the best place to start. Since there are so many
> ports I assume there is a well trodden path to completing this kind of
> task.

I believe this assumption is wrong.
There are not many ports, only a handful (or fewer - Jython,
IronPython, PyPy). While Jython and IronPython may have similar
implementation strategies, I would expect that PyPy took an
entirely different approach.

In any case, there certainly is a step that you apparently failed
to perform as the very first step: set some explicit goals. What
kind of compatibility do you want to achieve in your port, what
other goals would you like to follow? IOW, why is IronPython not
what you want (it *is* a port of CPython to C#, in a sense),
and why is the C# support in PyPy not good enough for you?

> I would prefer to break the task into portions that can be verified
> (tested for correctness) independently or as a stack (one on top of the
> next). That way I can catch errors early and have more confidence in
> what I am creating.

As I don't know what you want to achieve, it is difficult to tell
you what steps to take. I assume your implementation would be similar
to CPython in that it uses the same byte code format. So one path
would be to ignore the compiler altogether, and assume that the byte
code format is given, i.e. start by porting ceval.c.

I'm not sure whether you also want to provide the same low-level
API (i.e. whether you want to provide "Embedding and Extending");
it surely can't be the *same* API, since yours will be C#, whereas
CPython's is, well, C.

If you implement ceval.c, you will find quickly that you need much of
the Objects folder, so implementing the 10 or so most important objects
would be the natural starting point (type, int, string, tuple, dict,
frame, code, class, method - assuming you would target Python 1.5 first,
i.e. no bool, cell, descr, gen, iter, weakref, unicode, object).

> When I looked through the test suites they all seem to be written in
> Python. Is there a test suite for the core of CPython i.e. before the C
> code can interpret Python code?

Yes and no.
The core Python is tested through compilation - if it
compiles without warnings on the relevant compilers, it is considered
good enough to run the Python test suite. For selected features of the
interpreter, there are specific tests, in particular test_capi.
The core of CPython (compiler, objects, builtins) is then tested
through Python code.

Regards,
Martin

From greg.ewing at canterbury.ac.nz Tue Sep 4 11:18:43 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 04 Sep 2007 21:18:43 +1200
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To:
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
 <46DC91D2.7060407@canterbury.ac.nz>
Message-ID: <46DD22F3.6070701@canterbury.ac.nz>

Guido van Rossum wrote:
> But what's the point, given that numpy already exists? Wouldn't you
> just be redoing the work that numpy has already done?

Sometimes I just want to do something simple like
adding two vectors together, and it seems unreasonable
to add the whole of numpy as a dependency just to
get that.

Currently Python has built-in ways of doing arithmetic,
and built-in ways of storing arrays of numbers efficiently,
but no built-in way of doing arithmetic on arrays of
numbers efficiently.

I'd like to see some of the core machinery of numpy moved
into the Python stdlib, and numpy refactored so that it
builds on that. Then there wouldn't be duplication.

--
Greg

From steve at shrogers.com Tue Sep 4 13:57:30 2007
From: steve at shrogers.com (Steven H. Rogers)
Date: Tue, 04 Sep 2007 05:57:30 -0600
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DD22F3.6070701@canterbury.ac.nz>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
 <46DC91D2.7060407@canterbury.ac.nz>
 <46DD22F3.6070701@canterbury.ac.nz>
Message-ID: <46DD482A.1000801@shrogers.com>

Greg Ewing wrote:
> Guido van Rossum wrote:
>
>> But what's the point, given that numpy already exists?
>> Wouldn't you just be redoing the work that numpy has already done?
>>
>
> Sometimes I just want to do something simple like
> adding two vectors together, and it seems unreasonable
> to add the whole of numpy as a dependency just to
> get that.
...
>
> I'd like to see some of the core machinery of numpy moved
> into the Python stdlib, and numpy refactored so that it
> builds on that. Then there wouldn't be duplication.
>

Concur. Array processing would be a very practical addition to the
standard library. It's used extensively in engineering, finance, and
the sciences. It looks like they may find room in the OLPC XO for key
subsets of NumPy and Matplotlib. They want it both as a teaching
resource and to optimize their software suite as a whole. If they're
successful, we'll have a lot of young pythoneers expecting this
functionality.

# Steve

From martin at v.loewis.de Tue Sep 4 14:54:49 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 04 Sep 2007 14:54:49 +0200
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DD22F3.6070701@canterbury.ac.nz>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
 <46DC91D2.7060407@canterbury.ac.nz>
 <46DD22F3.6070701@canterbury.ac.nz>
Message-ID: <46DD5599.9010003@v.loewis.de>

> I'd like to see some of the core machinery of numpy moved
> into the Python stdlib, and numpy refactored so that it
> builds on that. Then there wouldn't be duplication.

I think this requires a PEP, and explicit support from the
NumPy people.

Regards,
Martin

From guido at python.org Tue Sep 4 16:38:58 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Sep 2007 07:38:58 -0700
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DD482A.1000801@shrogers.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
 <46DC91D2.7060407@canterbury.ac.nz>
 <46DD22F3.6070701@canterbury.ac.nz>
 <46DD482A.1000801@shrogers.com>
Message-ID:

On 9/4/07, Steven H. Rogers wrote:
> Concur. Array processing would be a very practical addition to the
> standard library. It's used extensively in engineering, finance, and
> the sciences. It looks like they may find room in the OLPC XO for key
> subsets of NumPy and Matplotlib. They want it both as a teaching
> resource and to optimize their software suite as a whole. If they're
> successful, we'll have a lot of young pythoneers expecting this
> functionality.

I still don't see why the standard library needs to be weighed down
with a competitor to numpy. Including a subset of numpy was considered
in the past, but it's hard to decide on the right subset. In the end
it was decided that numpy is too big to become part of the standard
library. Given all the gyrations it has gone through I definitely
believe this was the right decision.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ncoghlan at gmail.com Tue Sep 4 16:52:29 2007
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 05 Sep 2007 00:52:29 +1000
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DD5599.9010003@v.loewis.de>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com>
 <46DC91D2.7060407@canterbury.ac.nz>
 <46DD22F3.6070701@canterbury.ac.nz>
 <46DD5599.9010003@v.loewis.de>
Message-ID: <46DD712D.9080103@gmail.com>

Martin v. Löwis wrote:
>> I'd like to see some of the core machinery of numpy moved into the
>> Python stdlib, and numpy refactored so that it builds on that. Then
>> there wouldn't be duplication.
>
> I think this requires a PEP, and explicit support from the NumPy
> people.

Travis has actually been working on this off-and-on for the last couple
of years, including mentoring an SoC project last year. I believe PEP
3118 (the revised buffer protocol) was one of the major outcomes -
rather than having yet-another-array-type to copy data to and from in
order to use different libraries, the focus moved to permitting better
interoperability amongst the array types that already exist.

Once we support better interoperability at the data storage level, it
will actually become *more* useful to have a simple multi-dimensional
array type in the standard library as you could easily pass those
objects to functions from more powerful array manipulation libraries as
your needs become more complicated.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org

From janssen at parc.com Tue Sep 4 20:33:50 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Sep 2007 11:33:50 PDT
Subject: [Python-Dev] frozenset C API?
Message-ID: <07Sep4.113351pdt."57996"@synergy1.parc.xerox.com>

I'm looking at building a "frozenset" instance as a return value from
a C function, and the C API seems ridiculously clumsy. Maybe I'm
misunderstanding it. Apparently, I need to create a list object, then
pass that to PyFrozenSet_New(), then decref the list object.

Is that correct?

What I'd like is something more like

    PyFrozenSet_NEW(int) => PySetObject *
    PyFrozenSet_SET_ITEM(s, i, v)

Any idea why these aren't part of the API?

Bill

From guido at python.org Tue Sep 4 20:37:58 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Sep 2007 11:37:58 -0700
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <-4762611594645938717@unknownmsgid>
References: <-4762611594645938717@unknownmsgid>
Message-ID:

I guess nobody has tried to create frozenset instances from C code
before.
Almost everyone uses set anyway. What are you trying to do?

On 9/4/07, Bill Janssen wrote:
> I'm looking at building a "frozenset" instance as a return value from
> a C function, and the C API seems ridiculously clumsy. Maybe I'm
> misunderstanding it. Apparently, I need to create a list object, then
> pass that to PyFrozenSet_New(), then decref the list object.
>
> Is that correct?
>
> What I'd like is something more like
>
>     PyFrozenSet_NEW(int) => PySetObject *
>     PyFrozenSet_SET_ITEM(s, i, v)
>
> Any idea why these aren't part of the API?
>
> Bill
>

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin at v.loewis.de Tue Sep 4 21:02:06 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 04 Sep 2007 21:02:06 +0200
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep4.113351pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep4.113351pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46DDABAE.9020004@v.loewis.de>

Bill Janssen wrote:
> I'm looking at building a "frozenset" instance as a return value from
> a C function, and the C API seems ridiculously clumsy. Maybe I'm
> misunderstanding it. Apparently, I need to create a list object, then
> pass that to PyFrozenSet_New(), then decref the list object.
>
> Is that correct?

Almost. It doesn't have to be a list - any iterable object would do.

Regards,
Martin

From ty.newton at copperchipgames.com Tue Sep 4 21:07:41 2007
From: ty.newton at copperchipgames.com (Ty Newton)
Date: Wed, 05 Sep 2007 05:07:41 +1000
Subject: [Python-Dev] Compiling cpython2.5.1 in VS2005?
Message-ID: <46DDACFD.2080606@copperchipgames.com>

Hi,

I was building Python 2.5.1 in Visual Studio 2005 and noticed some
problems with the instructions.
Can someone confirm this and update the readme file in the PCbuild8
directory? I don't yet have access to the repository.

This is what the readme.txt file says to do:

    All you need to do is open the workspace "pcbuild.sln" in VisualStudio 2005,
    select the platform, select the Debug or Release setting
    (using "Solution Configuration" from the "Standard" toolbar"), and build the
    solution.

    The proper order to build subprojects:

    1) pythoncore (this builds the main Python DLL and library files,
       python25.{dll, lib} in Release mode)
       NOTE: in previous releases, this subproject was
       named after the release number, e.g. python20.

    2) python (this builds the main Python executable,
       python.exe in Release mode)

This is my experience.

DEBUG configuration:

When I select 'pythoncore' (right click) from the solution explorer,
select 'project only', select 'build only pythoncore' I get this error
report:

    Warning 1  warning C4005: 'Yield' : macro redefinition
               E:\Program Files\Microsoft Visual Studio 8\VC\PlatformSDK\include\winbase.h  57
    Warning 2  warning C4005: 'Yield' : macro redefinition
               E:\Program Files\Microsoft Visual Studio 8\VC\PlatformSDK\include\winbase.h  57
    Error 3    fatal error C1083: Cannot open source file:
               '.\getbuildinfo2.c': No such file or directory  c1

It looks like the project dependencies are not kicking in. I assume
this is caused by building the project instead of the solution. So I
did them manually.

First make_versioninfo project:
I select 'make_versioninfo' (right click) from the solution explorer,
select 'project only', select 'build only make_versioninfo'. This
succeeds.

Second make_buildinfo project:
I select 'make_buildinfo' (right click) from the solution explorer,
select 'project only', select 'build only make_buildinfo'. This
succeeds.

Finally I try to make pythoncore again:
I select 'pythoncore' (right click) from the solution explorer, select
'project only', select 'build only pythoncore'. This succeeds.

Now I build python and it also succeeds.

One last thing I noticed is if there are spaces in the path of the
source files the compilation also fails.

Regards,
Ty

From python at rcn.com Tue Sep 4 21:14:17 2007
From: python at rcn.com (Raymond Hettinger)
Date: Tue, 4 Sep 2007 15:14:17 -0400 (EDT)
Subject: [Python-Dev] frozenset C API?
Message-ID: <20070904151417.AFJ20377@ms10.lnh.mail.rcn.net>

You can create a frozenset from any iterable using PyFrozenSet_New().

If you don't have an iterable and want to build up the frozenset one
element at a time, the approach is to create a regular set (or some
other mutable container), add to it, then convert it to a frozenset
when you're done:

    s = PySet_New(NULL);
    PySet_Add(s, obj1);
    PySet_Add(s, obj2);
    PySet_Add(s, obj3);
    f = PyFrozenSet_New(s);
    Py_DECREF(s);

That approach is similar to what you do with tuples in pure python.
You either build them from an iterable "t = tuple(iterable)" or you
build up a mutable object one element at a time and convert it all at
once:

    s = []
    s.append(obj1)
    s.append(obj2)
    t = tuple(s)

The API you propose doesn't work because sets and frozensets are not
indexed like tuples and lists. Accordingly, sets and frozensets have a
C API that is more like dictionaries. Since dictionaries are not
indexable, they also cannot have an API like the one you propose:

    PyDict_NEW(int) => PySetObject *
    PyDict_SET_ITEM(s, index, key, value)

If you find all this really annoying and need to fashion a small
frozenset with a few known objects, consider using the abstract API:

    /* doubled parentheses: frozenset() takes a single iterable argument */
    f = PyObject_CallFunction((PyObject *)&PyFrozenSet_Type, "((OOO))",
                              obj1, obj2, obj3);

That will roll the whole process up into one line.

Hope this was helpful,

Raymond

---------------------------------------------------------------

Bill Janssen
Subject: [Python-Dev] frozenset C API?
To: python-dev at python.org

I'm looking at building a "frozenset" instance as a return value from
a C function, and the C API seems ridiculously clumsy. Maybe I'm
misunderstanding it.
Apparently, I need to create a list object, then
pass that to PyFrozenSet_New(), then decref the list object.

Is that correct?

What I'd like is something more like

    PyFrozenSet_NEW(int) => PySetObject *
    PyFrozenSet_SET_ITEM(s, i, v)

Any idea why these aren't part of the API?

From janssen at parc.com Tue Sep 4 21:21:37 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Sep 2007 12:21:37 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To:
References: <-4762611594645938717@unknownmsgid>
Message-ID: <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com>

I'm working on issue 1583946. Nagle pointed out that each DN (the
"subject" and "issuer" fields in a certificate) may have multiple
values for the same attribute name, and I haven't been able to rule
this out yet. X.509 DNs are sets of X.500 attributes, and X.500
attributes may be either single-valued or multiple-valued. I haven't
found anything in the X.509 standard that prohibits multiple-valued
attributes (yet -- I'm still looking), so I'm working on an alternative
to using dicts to represent the set of attributes in the certificate
that's returned from ssl.sslsocket.getpeercert(). "frozenset" seems
the most appropriate -- it's a non-ordered immutable set of attributes.
Could use a tuple, but (1) that implies order, and (2) using set
operations on the attribute set would be handy to test for various
things, particularly "issubset" and "issuperset".

I think frozenset is quite analogous to tuple at this level, and I
suggest that a similar set of C construction functions would be a good
thing.

Bill

> I guess nobody has tried to create frozenset instances from C code
> before. Almost everyone uses set anyway. What are you trying to do?
>
> On 9/4/07, Bill Janssen wrote:
> > I'm looking at building a "frozenset" instance as a return value from
> > a C function, and the C API seems ridiculously clumsy. Maybe I'm
> > misunderstanding it.
> > Apparently, I need to create a list object, then
> > pass that to PyFrozenSet_New(), then decref the list object.
> >
> > Is that correct?
> >
> > What I'd like is something more like
> >
> >     PyFrozenSet_NEW(int) => PySetObject *
> >     PyFrozenSet_SET_ITEM(s, i, v)
> >
> > Any idea why these aren't part of the API?
> >
> > Bill
> >
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)

From janssen at parc.com Tue Sep 4 21:31:09 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Sep 2007 12:31:09 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <20070904151417.AFJ20377@ms10.lnh.mail.rcn.net>
References: <20070904151417.AFJ20377@ms10.lnh.mail.rcn.net>
Message-ID: <07Sep4.123111pdt."57996"@synergy1.parc.xerox.com>

Raymond, thanks for the note.

> You can create a frozenset from any iterable using PyFrozenSet_New().
>
> If you don't have an iterable and want to build-up the frozenset one
> element at a time, the approach is to create a regular set (or some
> other mutable container), add to it, then convert it to a frozenset
> when you're done:
>
>     s = PySet_New(NULL);
>     PySet_Add(s, obj1);
>     PySet_Add(s, obj2);
>     PySet_Add(s, obj3);
>     f = PyFrozenSet_New(s);
>     Py_DECREF(s);

This is essentially the same thing I mentioned, except using a set
instead of a list as the iterable. I'm just a tad annoyed at the fact
that I know at set creation time exactly how many elements it's going
to have, and this procedure strikes me as a somewhat inefficient way to
create that set. Just tickles my "C inefficiency" funnybone a bit :-).

> The API you propose doesn't work because sets and frozensets are not
> indexed like tuples and lists. Accordingly, sets and frozensets have
> a C API that is more like dictionaries.
Since dictionaries are not
> indexable, they also cannot have an API like the one you propose:
>
>   PyDict_NEW(int) => PySetObject *
>   PyDict_SET_ITEM(s, index, key, value)

Didn't really mean to propose "PyDict_SET_ITEM(s, index, key, value)",
should have been

  PyDict_SET_ITEM(s, index, value)

But your point is still well taken. How about this one, though:

  PyDict_NEW(int) => PySetObject *
  PyDict_ADD(s, value)

ADD would just stick value in the next empty slot (and steal its
reference).

Bill

From janssen at parc.com Tue Sep 4 22:11:11 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Sep 2007 13:11:11 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep4.123111pdt."57996"@synergy1.parc.xerox.com>
References: <20070904151417.AFJ20377@ms10.lnh.mail.rcn.net> <07Sep4.123111pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep4.131116pdt."57996"@synergy1.parc.xerox.com>

> But your point is still well taken. How about this one, though:
>
>   PyDict_NEW(int) => PySetObject *
>   PyDict_ADD(s, value)
>
> ADD would just stick value in the next empty slot (and steal its
> reference).

Sorry, I meant to say

  PyFrozenSet_NEW(int) => PySetObject *
  PyFrozenSet_ADD(s, value)

Bill

From python at rcn.com Tue Sep 4 22:19:45 2007
From: python at rcn.com (Raymond Hettinger)
Date: Tue, 4 Sep 2007 16:19:45 -0400 (EDT)
Subject: [Python-Dev] frozenset C API?
Message-ID: <20070904161945.AFJ44412@ms10.lnh.mail.rcn.net>

[Bill Janssen]
>
> How about this one, though:
>
>   PyDict_NEW(int) => PySetObject *
>   PyDict_ADD(s, value)
>
> ADD would just stick value in the next
> empty slot (and steal its reference).

Dicts, sets and frozensets are implemented as hash tables, not as
arrays, so the above suggestion doesn't make any sense to me. The
location of the "next empty slot" depends on the key associated with
the value being added (btw, where is the "key" handled in your
proposed API?).
Consequently, the PyDict_New(int) step would have no way to know where
to create the n empty slots (since their location is determined by the
hash value of the keys). That is a reason that the tuple/list API
differs from the set/frozenset/dict API.

Raymond

From martin at v.loewis.de Tue Sep 4 22:55:38 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 04 Sep 2007 22:55:38 +0200
Subject: [Python-Dev] Compiling cpython2.5.1 in VS2005?
In-Reply-To: <46DDACFD.2080606@copperchipgames.com>
References: <46DDACFD.2080606@copperchipgames.com>
Message-ID: <46DDC64A.9010700@v.loewis.de>

> Can someone confirm this and update the readme file in the PCbuild8
> directory? I don't yet have access to the repository.

Please provide patches instead, and post them on bugs.python.org.

Regards,
Martin

From greg.ewing at canterbury.ac.nz Tue Sep 4 22:58:07 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 05 Sep 2007 08:58:07 +1200
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DD5599.9010003@v.loewis.de>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com> <46DC91D2.7060407@canterbury.ac.nz> <46DD22F3.6070701@canterbury.ac.nz> <46DD5599.9010003@v.loewis.de>
Message-ID: <46DDC6DF.1080307@canterbury.ac.nz>

Martin v. Löwis wrote:
> I think this requires a PEP, and explicit support from the
> NumPy people.

Someone who knows more about numpy's internals would be needed to
figure out what the details should be like in order to be usable
by numpy. But I could write a PEP about how what I have in mind
would look from the Python level.

--
Greg

From martin at v.loewis.de Tue Sep 4 23:26:20 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 04 Sep 2007 23:26:20 +0200
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> Message-ID: <46DDCD7C.40004@v.loewis.de> > I'm working on issue 1583946. Nagle pointed out that each DN (the > "subject" and "issuer" fields in a certificate) may have multiple > values for the same attribute name, and I haven't been able to rule > this out yet. This is indeed common. In particular, DN= and OU= often occur multiple times. > X.509 DNs are sets of X.500 attributes, and X.500 > attributes may be either single-valued or multiple-valued. Conceptually perhaps (although I doubt that). Practically, Name is Name ::= CHOICE { RDNSequence } RDNSequence ::= SEQUENCE OF RelativeDistinguishedName RelativeDistinguishedName ::= SET OF AttributeTypeAndValue AttributeTypeAndValue ::= SEQUENCE { type AttributeType, value AttributeValue } So it's a sequence of sets of key/value pairs. If you want to have the same type twice, you have two options: either make multiple RDNs, each single-valued, or make a single RDN, with multiple kv-pairs. IIUC, the intention of the multi-valued RDNs is that you have an entity described by multiple attributes. For example, relative to O=Foo, neither GN=Bill nor SN=Janssen might correctly identify a person. So you would create O=Foo,GN=Bill+SN=Janssen. That's allowed, but not really common - instead, people both a) use CN as a unique identifier, and b) put separate attributes for a single object into separate RDNs, as if email=janssen at parc.com was a subnode in the DIT relative to CN="Bill Janssen". > I haven't > found anything in the X.509 standard that prohibits multiple-valued > attributes (yet -- I'm still looking), so I'm working on an > alternative to using dicts to represent the set of attributes in the > certificate that's returned from ssl.sslsocket.getpeercert(). Conceptually, it should be a list (order *is* relevant). 
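As a rough illustration of the structure described above (a sequence of sets of type/value pairs), here is a minimal Python sketch. The nested tuple/frozenset layout and the helper name format_name are assumptions made for illustration only; they are not the representation returned by the ssl module.

```python
# Hypothetical model of an X.509 Name: an ordered sequence of RDNs,
# each RDN being an unordered set of (type, value) pairs.
name = (
    frozenset({("O", "Foo")}),
    frozenset({("GN", "Bill"), ("SN", "Janssen")}),
)


def format_name(rdns):
    """Join RDNs with ',' and the pairs inside one RDN with '+'."""
    parts = []
    for rdn in rdns:
        # Sorting is only for a stable string; a set carries no order.
        pairs = sorted(rdn)
        parts.append("+".join("%s=%s" % (t, v) for t, v in pairs))
    return ",".join(parts)


print(format_name(name))
```

The outer tuple keeps the RDN order, while each inner frozenset allows several attributes in one RDN, as in the O=Foo,GN=Bill+SN=Janssen example.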
It can then be debated whether the RDN can be represented as a dictionary; my understanding is that the intention of RDNs is that the AttributeType is unique within an RDN (but I may be wrong). Regards, Martin From greg.ewing at canterbury.ac.nz Tue Sep 4 23:26:32 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 05 Sep 2007 09:26:32 +1200 Subject: [Python-Dev] Product function patch [issue 1093] In-Reply-To: References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com> <46DC91D2.7060407@canterbury.ac.nz> <46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com> Message-ID: <46DDCD88.7080009@canterbury.ac.nz> Guido van Rossum wrote: > I still don't see why the standard library needs to be weighed down > with a competitor to numpy. The way to get things done efficiently with an interpreted language is for the language or its libraries to provide primitives that work on large chunks of data at once, and can be combined in flexible ways. Python provides many such primitives for working with strings -- the string methods, regexps, etc. But it doesn't provide *any* for numbers, and that strikes me as an odd gap in functionality. What I have in mind would be quite small, so it wouldn't "weigh down" the stdlib. You could think of it as an extension to the operator module that turns it into something useful. :-) And, as I said, if it's designed so that numpy can build on it, then it needn't be competing with numpy. > Including a subset of numpy was considered > in the past, but it's hard to decide on the right subset. What I'm thinking of wouldn't be a "subset" of numpy, in the sense that it wouldn't necessarily share any of the numpy API from the Python perspective. All it would provide is the minimum necessary primitives to get the grunt work done. I'm thinking of having a bunch of functions like add_elementwise(src1, src2, dst, start, chunk, stride) where src1, src2 and dst are anything supporting the new buffer protocol. 
That should be sufficient to support something with a numpy-like API, I think. -- Greg From greg.ewing at canterbury.ac.nz Tue Sep 4 23:30:25 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 05 Sep 2007 09:30:25 +1200 Subject: [Python-Dev] Product function patch [issue 1093] In-Reply-To: <46DD712D.9080103@gmail.com> References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com> <46DC91D2.7060407@canterbury.ac.nz> <46DD22F3.6070701@canterbury.ac.nz> <46DD5599.9010003@v.loewis.de> <46DD712D.9080103@gmail.com> Message-ID: <46DDCE71.7090608@canterbury.ac.nz> Nick Coghlan wrote: > Travis has actually been working on this off-and-on for the last couple > of years, Well, yes, but that's concentrating on a different aspect of things -- the data storage. My proposal concerns what you can *do* with the data, independent of the way it's stored. My idea and Travis's would complement each other, I think. -- Greg From martin at v.loewis.de Tue Sep 4 23:42:27 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 04 Sep 2007 23:42:27 +0200 Subject: [Python-Dev] Product function patch [issue 1093] In-Reply-To: <46DDCD88.7080009@canterbury.ac.nz> References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com> <46DC91D2.7060407@canterbury.ac.nz> <46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com> <46DDCD88.7080009@canterbury.ac.nz> Message-ID: <46DDD143.3030205@v.loewis.de> > What I have in mind would be quite small, so it wouldn't > "weigh down" the stdlib. If it's a builtin, it certainly would. Every builtin weighs down the library, as it clutters the global(est) namespace. > I'm thinking of having a bunch of functions like > > add_elementwise(src1, src2, dst, start, chunk, stride) > > where src1, src2 and dst are anything supporting the > new buffer protocol. That should be sufficient to support > something with a numpy-like API, I think. This sounds like a topic for python-ideas. 
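For illustration only, a minimal Python sketch of what a primitive like add_elementwise might look like. The thread does not define the meaning of start, chunk and stride; the sketch assumes start is the first index, chunk is the number of elements to process, and stride is a positive step between them. It uses memoryview as a stand-in for "anything supporting the buffer protocol".

```python
from array import array


def add_elementwise(src1, src2, dst, start, chunk, stride):
    """Hypothetical semantics: dst[i] = src1[i] + src2[i] for `chunk`
    elements, beginning at index `start` and stepping by `stride`."""
    a = memoryview(src1)
    b = memoryview(src2)
    out = memoryview(dst)
    for i in range(start, start + chunk * stride, stride):
        out[i] = a[i] + b[i]


x = array('d', [1.0, 2.0, 3.0, 4.0])
y = array('d', [10.0, 20.0, 30.0, 40.0])
z = array('d', [0.0] * 4)

# Process two elements, starting at index 0 and skipping every other one.
add_elementwise(x, y, z, 0, 2, 2)
print(list(z))
```

A real implementation would do the loop in C over the raw buffers; the Python loop here only shows the intended calling convention.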
Regards,
Martin

From hasan.diwan at gmail.com Tue Sep 4 23:43:04 2007
From: hasan.diwan at gmail.com (Hasan Diwan)
Date: Tue, 4 Sep 2007 14:43:04 -0700
Subject: [Python-Dev] Math.sqrt(-1) -- nan or ValueError?
Message-ID: <2cda2fc90709041443v225562c5taf3112d341652a42@mail.gmail.com>

I'm trying to fix a failing unit test in revision 57974. The test in
question claims that math.sqrt(-1) should raise ValueError; the code
itself gives "nan" as a result for that expression. I can modify the
test and therefore have it pass, but I'm not sure if an exception
would be more appropriate. I'd be happy for some direction here. Many
thanks!
--
Cheers,
Hasan Diwan

From ty.newton at copperchipgames.com Tue Sep 4 23:46:33 2007
From: ty.newton at copperchipgames.com (Ty Newton)
Date: Wed, 05 Sep 2007 07:46:33 +1000
Subject: [Python-Dev] Compiling cpython2.5.1 in VS2005?
In-Reply-To: <46DDC64A.9010700@v.loewis.de>
References: <46DDACFD.2080606@copperchipgames.com> <46DDC64A.9010700@v.loewis.de>
Message-ID: <46DDD239.5050706@copperchipgames.com>

oh, sorry. I'll do that.

Ty

Martin v. Löwis wrote:
>> Can someone confirm this and update the readme file in the PCbuild8
>> directory? I don't yet have access to the repository.
>
> Please provide patches instead, and post them on bugs.python.org.
> > Regards, > Martin > > From guido at python.org Tue Sep 4 23:55:28 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 4 Sep 2007 14:55:28 -0700 Subject: [Python-Dev] Product function patch [issue 1093] In-Reply-To: <46DDCD88.7080009@canterbury.ac.nz> References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com> <46DC91D2.7060407@canterbury.ac.nz> <46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com> <46DDCD88.7080009@canterbury.ac.nz> Message-ID: By all means do write up a PEP -- it's hard to generalize from that one example. On 9/4/07, Greg Ewing wrote: > Guido van Rossum wrote: > > I still don't see why the standard library needs to be weighed down > > with a competitor to numpy. > > The way to get things done efficiently with an interpreted > language is for the language or its libraries to provide > primitives that work on large chunks of data at once, and > can be combined in flexible ways. > > Python provides many such primitives for working with > strings -- the string methods, regexps, etc. But it doesn't > provide *any* for numbers, and that strikes me as an odd > gap in functionality. > > What I have in mind would be quite small, so it wouldn't > "weigh down" the stdlib. You could think of it as an > extension to the operator module that turns it into > something useful. :-) > > And, as I said, if it's designed so that numpy can build > on it, then it needn't be competing with numpy. > > > Including a subset of numpy was considered > > in the past, but it's hard to decide on the right subset. > > What I'm thinking of wouldn't be a "subset" of numpy, in > the sense that it wouldn't necessarily share any of the > numpy API from the Python perspective. All it would > provide is the minimum necessary primitives to get the > grunt work done. 
>
> I'm thinking of having a bunch of functions like
>
>   add_elementwise(src1, src2, dst, start, chunk, stride)
>
> where src1, src2 and dst are anything supporting the
> new buffer protocol. That should be sufficient to support
> something with a numpy-like API, I think.
>
> --
> Greg
>

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Tue Sep 4 23:58:27 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Sep 2007 14:58:27 -0700
Subject: [Python-Dev] Math.sqrt(-1) -- nan or ValueError?
In-Reply-To: <2cda2fc90709041443v225562c5taf3112d341652a42@mail.gmail.com>
References: <2cda2fc90709041443v225562c5taf3112d341652a42@mail.gmail.com>
Message-ID:

Is this on OSX? That test has been failing (because on that platform
sqrt(-1) returns nan instead of raising ValueError) for years -- but
the test is only run when run in verbose mode, which mostly hides the
issue. Have you read the comment for the test?

On 9/4/07, Hasan Diwan wrote:
> I'm trying to fix a failing unit test in revision 57974. The test in
> question claims that math.sqrt(-1) should raise ValueError; the code itself
> gives "nan" as a result for that expression. I can modify the test and
> therefore have it pass, but I'm not sure if an exception would be more
> appropriate. I'd be happy for some direction here. Many thanks!

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From hasan.diwan at gmail.com Wed Sep 5 00:13:50 2007
From: hasan.diwan at gmail.com (Hasan Diwan)
Date: Tue, 4 Sep 2007 15:13:50 -0700
Subject: [Python-Dev] Math.sqrt(-1) -- nan or ValueError?
In-Reply-To:
References: <2cda2fc90709041443v225562c5taf3112d341652a42@mail.gmail.com>
Message-ID: <2cda2fc90709041513g7f67ec10l29c94c1d8fc4c0d7@mail.gmail.com>

On 04/09/07, Guido van Rossum wrote:
>
> Is this on OSX? That test has been failing (because on that platform
> sqrt(-1) returns nan instead of raising ValueError) for years -- but
> the test is only run when run in verbose mode, which mostly hides the
> issue. Have you read the comment for the test?

Indeed, I am on OSX. Yes, I have read the comment for the test. Would
the following pseudocode be an acceptable fix for the problem:

    if sys.platform == 'darwin' and math.sqrt(-1) == nan:
        return
    else:
        try:
            x = math.sqrt(-1)
        except ValueError:
            pass
    ...

or should I just not bother?
--
Cheers,
Hasan Diwan

From janssen at parc.com Wed Sep 5 00:21:10 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Sep 2007 15:21:10 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <46DDCD7C.40004@v.loewis.de>
References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de>
Message-ID: <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com>

> > X.509 DNs are sets of X.500 attributes, and X.500
> > attributes may be either single-valued or multiple-valued.
>
> Conceptually perhaps (although I doubt that).

I got that from David Chadwick's book at
http://sec.cs.kent.ac.uk/x500book/. ``An attribute comprises an
attribute type and one or more attribute values.'' The question is,
how would a multiple-valued attribute be represented in a certificate
Name? I'm presuming it would appear as multiple attributes with the
same "type", but different values.

> Conceptually, it should be a list (order *is* relevant).
It can
> then be debated whether the RDN can be represented as a dictionary;
> my understanding is that the intention of RDNs is that the AttributeType
> is unique within an RDN (but I may be wrong).

> Name ::= CHOICE { RDNSequence }
>
> RDNSequence ::= SEQUENCE OF RelativeDistinguishedName
>
> RelativeDistinguishedName ::=
>   SET OF AttributeTypeAndValue
>
> AttributeTypeAndValue ::= SEQUENCE {
>   type AttributeType,
>   value AttributeValue }

Order is important in the directory tree, but not (I think) in the DN;
that name is just an unordered set of attributes, because the hierarchy
information has already been lost (the RDN elements cannot be
distinguished from each other using only the internal certificate
information).

In any case, it certainly sounds to me as if there can be multiple
instances of AttributeTypeAndValue with the same "type" field in a
single Name. So I'll represent them as tuples, which will preserve
the order in which they occur in the certificate, and make the value
immutable. Applications which need them as sets can create their
own frozensets from that tuple.

Bill

From janssen at parc.com Wed Sep 5 00:26:56 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Sep 2007 15:26:56 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <20070904161945.AFJ44412@ms10.lnh.mail.rcn.net>
References: <20070904161945.AFJ44412@ms10.lnh.mail.rcn.net>
Message-ID: <07Sep4.152702pdt."57996"@synergy1.parc.xerox.com>

> Dicts, sets and frozensets are implemented as hash tables, not as arrays,

I see, thanks.

> The location of the "next empty slot" depends on the key
> associated with the value being added (btw, where is the "key" handled
> in your proposed API?).

What key? It's a set, not a mapping. The value is the key.

Bill

From guido at python.org Wed Sep 5 00:45:15 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Sep 2007 15:45:15 -0700
Subject: [Python-Dev] Math.sqrt(-1) -- nan or ValueError?
In-Reply-To: <2cda2fc90709041513g7f67ec10l29c94c1d8fc4c0d7@mail.gmail.com>
References: <2cda2fc90709041443v225562c5taf3112d341652a42@mail.gmail.com> <2cda2fc90709041513g7f67ec10l29c94c1d8fc4c0d7@mail.gmail.com>
Message-ID:

I think it's better for the test to fail, to indicate that there's an
unresolved problem on the platform.

On 9/4/07, Hasan Diwan wrote:
> On 04/09/07, Guido van Rossum wrote:
> > Is this on OSX? That test has been failing (because on that platform
> > sqrt(-1) returns nan instead of raising ValueError) for years -- but
> > the test is only run when run in verbose mode, which mostly hides the
> > issue. Have you read the comment for the test?
>
> Indeed, I am on OSX. Yes, I have read the comment for the test. Would the
> following pseudocode be an acceptable fix for the problem:
>     if sys.platform == 'darwin' and math.sqrt(-1) == nan:
>         return
>     else:
>         try:
>             x = math.sqrt(-1)
>         except ValueError:
>             pass
>     ...
> or should I just not bother?
> --
> Cheers,
>
> Hasan Diwan <hasan.diwan at gmail.com>

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From ty.newton at copperchipgames.com Wed Sep 5 02:45:52 2007
From: ty.newton at copperchipgames.com (Ty Newton)
Date: Wed, 05 Sep 2007 10:45:52 +1000
Subject: [Python-Dev] Porting information
In-Reply-To: <46DCD1F6.2040906@v.loewis.de>
References: <46DC88AD.5060001@copperchipgames.com> <46DCD1F6.2040906@v.loewis.de>
Message-ID: <46DDFC40.2000808@copperchipgames.com>

Thanks Martin,

Martin v. Löwis wrote:
>> I've started by looking at the parser portion of the code. However I am
>> not certain this is the best place to start. Since there are so many
>> ports I assume there is a well trodden path to completing this kind of
>> task.
>
> I believe this assumption is wrong. There are not many ports, only a
> handful (or less - Jython, IronPython, PyPy). While Jython and
> IronPython may have similar implementation strategies, I would expect
> that PyPy took an entirely different approach.
> > In any case, there certainly is a step that you apparently failed > to perform as the very first step: set some explicit goals. What > kind of compatibility do you want to achieve in your port, what > other goals would you like to follow? > I thought I'd try and keep my message short so I decided not to go into the explicit objectives. At the most basic it is the ability for developers to run compiled Python as part of the game code. The next step up from that is allowing Python source code to execute and be modified in a 'simple' interactive coding tool: allowing for 'tweaking' code to be implemented outside of the game engine team. Principal constraint: Microsoft support for independent development on the 360 is only provided through the use of their slimmed down .Net compact framework and the XNA Game Studio Express development environment (C# only). This allows Microsoft to implement security within the tool chain and deployment pipeline to sandbox strictly. > IOW, why is IronPython not what you want (it *is* a port of CPython > to C#, in a sense), and why is the C# support in PyPy not good enough > for you? > The impact, to this project, of the reduced API and strict sandboxing in the 360 dev environment is Python implementations like IronPython are not feasible. IronPython uses the reflection capabilities of C# to interpret directly to CLR. Without reflection IronPython simply cannot operate. Unfortunately the 360 API does not include reflection functionality. I had a look into PyPy and concluded that it could produce a result that would operate however I was less certain about integrating it into a development tool chain for the 360. It seems more likely that a 'C#Python' would result in a cleaner development environment - much like the embedded inclusion of Lua scripting in many games software. >> I would prefer to break the task into portions that can be verified >> (tested for correctness) independently or as a stack (one on top of the >> next). 
That way I can catch errors early and have more confidence in
>> what I am creating.
>
> As I don't know what you want to achieve, it is difficult to tell
> you what steps to take.
>
> I assume your implementation would be similar to CPython in that
> it uses the same byte code format. So one path would be to ignore
> the compiler entirely, and assume that the byte code format is given,
> i.e. start by porting ceval.c.
>
> I'm not sure whether you also want to provide the same low-level
> API (i.e. whether you want to provide "Embedding and Extending");
> it surely can't be the *same* API, since yours will be C#, whereas
> CPython's is, well, C. If you implement ceval.c, you will find
> quickly that you need much of the Objects folder, so implementing
> the 10 or so most important objects would be the natural starting
> point (type, int, string, tuple, dict, frame, code, class, method -
> assuming you would target Python 1.5 first, i.e. no bool, cell,
> descr, gen, iter, weakref, unicode, object).
>
>> When I looked through the test suites they all seem to be written in
>> Python. Is there a test suite for the core of CPython i.e. before the C
>> code can interpret Python code?
>
> Yes and no. The core Python is tested through compilation - if it
> compiles without warnings on the relevant compilers, it is considered
> good enough to run the Python test suite. For selected features of
> the interpreter, there are specific tests, in particular test_capi.
>
> The core of CPython (compiler, objects, builtins) is then tested
> through Python code.
>

This seems like a sensible way to start since the test harness needs a
Python interpreter. Although it seems counter-intuitive to build the
bytecode interpreter so that I can test the bytecode compiler...

> Regards,
> Martin
>
>

Thanks for the advice Martin.

Regards,
Ty

From steve at shrogers.com Wed Sep 5 03:39:42 2007
From: steve at shrogers.com (Steven H.
Rogers)
Date: Tue, 04 Sep 2007 19:39:42 -0600
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To:
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com> <46DC91D2.7060407@canterbury.ac.nz> <46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com>
Message-ID: <46DE08DE.4090503@shrogers.com>

Guido van Rossum wrote:
> I still don't see why the standard library needs to be weighed down
> with a competitor to numpy. Including a subset of numpy was considered
> in the past, but it's hard to decide on the right subset. In the end
> it was decided that numpy is too big to become a standard library.
> Given all the gyrations it has gone through I definitely believe this
> was the right decision.

A competitor to NumPy would be counter-productive, but including a core
subset in the standard library that NumPy could be built upon would add
valuable functionality to Python out of the box. It was probably the
best decision to not include NumPy when it was previously considered,
but I think it should be reconsidered for Python 3.x. While defining
the right subset to include has its difficulties, I believe it can be
done. What would be a reasonable target size for inclusion in the
standard library?

# Steve

From steve at shrogers.com Wed Sep 5 03:56:23 2007
From: steve at shrogers.com (Steven H. Rogers)
Date: Tue, 04 Sep 2007 19:56:23 -0600
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DDC6DF.1080307@canterbury.ac.nz>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com> <46DC91D2.7060407@canterbury.ac.nz> <46DD22F3.6070701@canterbury.ac.nz> <46DD5599.9010003@v.loewis.de> <46DDC6DF.1080307@canterbury.ac.nz>
Message-ID: <46DE0CC7.3040503@shrogers.com>

Greg Ewing wrote:
> Martin v. Löwis wrote:
>
>> I think this requires a PEP, and explicit support from the
>> NumPy people.
>>
>
> Someone who knows more about numpy's internals would
> be needed to figure out what the details should be
> like in order to be usable by numpy. But I could write
> a PEP about how what I have in mind would look from
> the Python level.
>

I'm confident that the NumPy developers would support this in
principle. If you want help with the PEP, I'm willing to help.

From guido at python.org Wed Sep 5 04:03:51 2007
From: guido at python.org (Guido van Rossum)
Date: Tue, 4 Sep 2007 19:03:51 -0700
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To: <46DE08DE.4090503@shrogers.com>
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com> <46DC91D2.7060407@canterbury.ac.nz> <46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com> <46DE08DE.4090503@shrogers.com>
Message-ID:

On 9/4/07, Steven H. Rogers wrote:
> Guido van Rossum wrote:
> > I still don't see why the standard library needs to be weighed down
> > with a competitor to numpy. Including a subset of numpy was considered
> > in the past, but it's hard to decide on the right subset. In the end
> > it was decided that numpy is too big to become a standard library.
> > Given all the gyrations it has gone through I definitely believe this
> > was the right decision.
> A competitor to NumPy would be counter-productive, but including a core
> subset in the standard library that NumPy could be built upon would add
> valuable functionality to Python out of the box. It was probably the
> best decision to not include NumPy when it was previously considered,
> but I think it should be reconsidered for Python 3.x. While defining
> the right subset to include has its difficulties, I believe it can be
> done. What would be a reasonable target size for inclusion in the
> standard library?

What makes 3.0 so special? Additions to the stdlib can be considered
at any feature release.
Frankly, 3.0 is already so loaded with new
features (and removals) that I'm not sure it's worth piling this onto
it.

That said, I would much rather argue with a detailed PEP than with yet
another suggestion that we do something. I am already doing enough --
it's up to some other folks to get together and produce a proposal.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From steve at shrogers.com Wed Sep 5 04:35:41 2007
From: steve at shrogers.com (Steven H. Rogers)
Date: Tue, 04 Sep 2007 20:35:41 -0600
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To:
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com> <46DC91D2.7060407@canterbury.ac.nz> <46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com> <46DE08DE.4090503@shrogers.com>
Message-ID: <46DE15FD.9000801@shrogers.com>

Guido van Rossum wrote:
> What makes 3.0 so special? Additions to the stdlib can be considered
> at any feature release. Frankly, 3.0 is already so loaded with new
> features (and removals) that I'm not sure it's worth piling this onto
> it.
>

I actually wrote 3.x, not 3.0. I agree that it makes no sense to add
anything more to 3.0.

From robert.kern at gmail.com Wed Sep 5 04:45:45 2007
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 04 Sep 2007 21:45:45 -0500
Subject: [Python-Dev] Product function patch [issue 1093]
In-Reply-To:
References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com> <46DC91D2.7060407@canterbury.ac.nz> <46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com> <46DE08DE.4090503@shrogers.com>
Message-ID:

Guido van Rossum wrote:
> On 9/4/07, Steven H. Rogers wrote:
>> Guido van Rossum wrote:
>>> I still don't see why the standard library needs to be weighed down
>>> with a competitor to numpy. Including a subset of numpy was considered
>>> in the past, but it's hard to decide on the right subset. In the end
>>> it was decided that numpy is too big to become a standard library.
>>> Given all the gyrations it has gone through I definitely believe this
>>> was the right decision.
>> A competitor to NumPy would be counter-productive, but including a core
>> subset in the standard library that NumPy could be built upon would add
>> valuable functionality to Python out of the box. It was probably the
>> best decision to not include NumPy when it was previously considered,
>> but I think it should be reconsidered for Python 3.x. While defining
>> the right subset to include has its difficulties, I believe it can be
>> done. What would be a reasonable target size for inclusion in the
>> standard library?
>
> What makes 3.0 so special? Additions to the stdlib can be considered
> at any feature release.

The 3.x compatibility break (however alleviated by 2to3) makes a nice
clean cutoff. The numpy that works on Pythons 3.x would essentially be
a port from the current numpy. Consequently, we could modify the numpy
for Pythons 3.x to always rely on the stdlib API to build on top of.
We couldn't do that for the version targeted to Pythons 2.x because we
could only rely on its presence for 2.6+. I don't mind maintaining two
versions of numpy, one for Python 2.x and one for 3.x, but I don't
care to maintain three.

I invite Greg and Steven and whoever else is interested to discuss
ideas for the PEP on numpy-discussion. I'm skeptical, seeing what
currently has been suggested, but some more details could easily allay
that.

http://projects.scipy.org/mailman/listinfo/numpy-discussion

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From janssen at parc.com Wed Sep 5 04:58:11 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 4 Sep 2007 19:58:11 PDT
Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep4.195817pdt."57996"@synergy1.parc.xerox.com> > In any case, it certainly sounds to me as if there can be multiple > instances of AttributeTypeAndValue with the same "type" field in a > single Name. So I'll represent them as tuples, which will preserve > the order in which they occur in the certificate, and make the value > immutable. Applications which need them as sets can create their > own frozensets from that tuple. Here's an example of the new format: {'issuer': (('countryName', u'US'), ('organizationName', u'VeriSign, Inc.'), ('organizationalUnitName', u'VeriSign Trust Network'), ('organizationalUnitName', u'Terms of use at https://www.verisign.com/rpa (c)06'), ('commonName', u'VeriSign Class 3 Extended Validation SSL SGC CA')), 'notAfter': 'May 8 23:59:59 2009 GMT', 'notBefore': 'May 9 00:00:00 2007 GMT', 'subject': (('serialNumber', u'2497886'), ('1.3.6.1.4.1.311.60.2.1.3', u'US'), ('1.3.6.1.4.1.311.60.2.1.2', u'Delaware'), ('countryName', u'US'), ('postalCode', u'94043'), ('stateOrProvinceName', u'California'), ('localityName', u'Mountain View'), ('streetAddress', u'487 East Middlefield Road'), ('organizationName', u'VeriSign, Inc.'), ('organizationalUnitName', u'Production Security Services'), ('organizationalUnitName', u'Terms of use at www.verisign.com/rpa (c)06'), ('commonName', u'www.verisign.com')), 'version': 2} Bill From guido at python.org Wed Sep 5 05:18:36 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 4 Sep 2007 20:18:36 -0700 Subject: [Python-Dev] Product function patch [issue 1093] In-Reply-To: References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com> <46DC91D2.7060407@canterbury.ac.nz> <46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com> 
<46DE08DE.4090503@shrogers.com> Message-ID: On 9/4/07, Robert Kern wrote: > The 3.x compatibility break (however alleviated by 2to3) makes a nice clean > cutoff. The numpy that works on Pythons 3.x would essentially be a port from the > current numpy. Consequently, we could modify the numpy for Pythons 3.x to always > rely on the stdlib API to build on top of. We couldn't do that for the version > targeted to Pythons 2.x because we could only rely on its presence for 2.6+. I > don't mind maintaining two versions of numpy, one for Python 2.x and one for > 3.x, but I don't care to maintain three. I just had a discussion with Glyph "Twisted" Lefkowitz about this. He warns that if every project using Python uses 3.0's incompatibility as an excuse to make their own library/package/project incompatible as well, we will end up with total pandemonium (my paraphrase). I think he has a good point -- we shouldn't be injecting any more instability into the world than absolutely necessary. In any case, the rift is more likely to be between 2.5 and 2.6, since 2.6 will provide backports of most 3.0 features (though without some of the accompanying cleanups, in order to also provide strong backwards compatibility). To be honest, I also doubt the viability of designing and implementing something that would satisfy Greg Ewing's goals *and* be stable enough in the standard library, in under a year. But as I said before, I don't see much point in arguing much further until I see the PEP. I may yet be convinced, but it will have to be a good design and a well-argued proposal. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From steve at shrogers.com Wed Sep 5 05:17:43 2007 From: steve at shrogers.com (Steven H. 
Rogers) Date: Tue, 04 Sep 2007 21:17:43 -0600 Subject: [Python-Dev] Product function patch [issue 1093] In-Reply-To: References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com> <46DC91D2.7060407@canterbury.ac.nz> <46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com> <46DE08DE.4090503@shrogers.com> Message-ID: <46DE1FD7.7060706@shrogers.com> Robert Kern wrote: > I invite Greg and Steven and whoever else is interested to discuss ideas for the > PEP on numpy-discussion. I'm skeptical, seeing what currently has been > suggested, but some more details could easily allay that. > > http://projects.scipy.org/mailman/listinfo/numpy-discussion > Accepted, that's probably the best place for this to continue. Greg's suggestion sounds plausible to me, but needs to be fleshed out. From robert.kern at gmail.com Wed Sep 5 05:28:46 2007 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 04 Sep 2007 22:28:46 -0500 Subject: [Python-Dev] Product function patch [issue 1093] In-Reply-To: References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com> <46DC91D2.7060407@canterbury.ac.nz> <46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com> <46DE08DE.4090503@shrogers.com> Message-ID: Guido van Rossum wrote: > On 9/4/07, Robert Kern wrote: >> The 3.x compatibility break (however alleviated by 2to3) makes a nice clean >> cutoff. The numpy that works on Pythons 3.x would essentially be a port from the >> current numpy. Consequently, we could modify the numpy for Pythons 3.x to always >> rely on the stdlib API to build on top of. We couldn't do that for the version >> targeted to Pythons 2.x because we could only rely on its presence for 2.6+. I >> don't mind maintaining two versions of numpy, one for Python 2.x and one for >> 3.x, but I don't care to maintain three. > > I just had a discussion with Glyph "Twisted" Lefkowitz about this. 
He > warns that if every project using Python uses 3.0's incompatibility as > an excuse to make their own library/package/project incompatible as > well, we will end up with total pandemonium (my paraphrase). I think > he has a good point -- we shouldn't be injecting any more instability > into the world than absolutely necessary. I agree. I didn't mean to imply that the 3.x version of numpy would be incompatible to users of it, just that the codebase that implements it will be different, whether it is automatically or manually translated. Of course, if the API is introduced in 3.(x>0), we end up with the same problem I wanted to avoid. Ah well. See you on the flip side of the PEP. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From martin at v.loewis.de Wed Sep 5 07:25:12 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 05 Sep 2007 07:25:12 +0200 Subject: [Python-Dev] frozenset C API? In-Reply-To: <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> Message-ID: <46DE3DB8.6000004@v.loewis.de> >>> X.509 DNs are sets of X.500 attributes, and X.500 >>> attributes may be either single-valued or multiple-valued. >> Conceptually perhaps (although I doubt that). > > I got that from David Chadwick's book at http://sec.cs.kent.ac.uk/x500book/. > > ``An attribute comprises an attribute type and one or more attribute values.'' Ah, ok. But then, the DN is not a *set* of such attributes, but a sequence. > The question is, how would a multiple-valued attribute be represented > in a certificate Name? I'm presuming it would appear as multiple > attributes with the same "type", but different values. 
Within a single RelativeDistinguishedName, yes. > Order is important in the directory tree, but not (I think) in the DN; > that name is just an unordered set of attributes, because the > hierarchy information has already been lost (the RDN elements cannot > be distinguished from each other using only the internal certificate > information). Hmm. The directory tree only exists through the order in the DN. E.g., from http://java.sun.com/products/jndi/tutorial/ldap/models/x500.html "The X.500 namespace is hierarchical. An entry is unambiguously identified by a distinguished name (DN). A distinguished name is the concatenation of selected attributes from each entry, called the relative distinguished name (RDN), in the tree along a path leading from the root down to the named entry." If the RDNs within a DN were not ordered, you would not get a hierarchical tree, and you could not identify entries unambiguously. > In any case, it certainly sounds to me as if there can be multiple > instances of AttributeTypeAndValue with the same "type" field in a > single Name. So I'll represent them as tuples, which will preserve > the order in which they occur in the certificate, and make the value > immutable. Ok. I think this will still not support multi-valued RDNs properly, but those are uncommon in PKI. Regards, Martin From martin at v.loewis.de Wed Sep 5 07:48:11 2007 From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed, 05 Sep 2007 07:48:11 +0200 Subject: [Python-Dev] frozenset C API?
In-Reply-To: <07Sep4.195817pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <07Sep4.195817pdt."57996"@synergy1.parc.xerox.com> Message-ID: <46DE431B.20403@v.loewis.de> > Here's an example of the new format: > > {'issuer': (('countryName', u'US'), > ('organizationName', u'VeriSign, Inc.'), > ('organizationalUnitName', u'VeriSign Trust Network'), > ('organizationalUnitName', > u'Terms of use at https://www.verisign.com/rpa (c)06'), > ('commonName', > u'VeriSign Class 3 Extended Validation SSL SGC CA')), Can you please take a look at the attached certificates? How are they represented? The DNs of these are structurally different, one being /DC=org/DC=python/CN=foo/CN=bar and the other /DC=org/DC=python/CN=foo2+CN=bar2 Regards, Martin -------------- next part -------------- A non-text attachment was scrubbed... Name: ca1.crt Type: application/x-x509-ca-cert Size: 1008 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20070905/61187ed8/attachment-0002.crt -------------- next part -------------- A non-text attachment was scrubbed... Name: ca2.crt Type: application/x-x509-ca-cert Size: 1008 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20070905/61187ed8/attachment-0003.crt From ndbecker2 at gmail.com Wed Sep 5 14:57:06 2007 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 05 Sep 2007 08:57:06 -0400 Subject: [Python-Dev] python sphinx install? Message-ID: I'm interested in trying out new style (python 2.6) documentation. I see we're using docutils + sphinx? I did: svn co http://svn.python.org/projects/doctools/trunk/ How can I install this to try it with python-2.5? 
From g.brandl at gmx.net Wed Sep 5 15:05:55 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 05 Sep 2007 15:05:55 +0200 Subject: [Python-Dev] python sphinx install? In-Reply-To: References: Message-ID: Neal Becker wrote: > I'm interested in trying out new style (python 2.6) documentation. I see > we're using docutils + sphinx? > > I did: svn co http://svn.python.org/projects/doctools/trunk/ > > How can I install this to try it with python-2.5? What do you want to try with Python 2.5? If you want to build the Python 2.6/3.0 docs, it's easiest to check the Python sources out from http://svn.python.org/projects/python/trunk, go to the Doc directory and do "make html". This will check out sphinx and all other needed libraries into Doc/tools and build the docs. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From alan.mcintyre at gmail.com Wed Sep 5 15:09:21 2007 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 5 Sep 2007 09:09:21 -0400 Subject: [Python-Dev] x86 XP trunk failure Message-ID: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com> Hi all, My build slave (http://www.python.org/dev/buildbot/trunk/x86%20XP%20trunk) keeps failing because of a crash that appears to be in the bsddb module. I assume the master deems the slave to be lost because it's sitting there waiting on me to make a choice on the "debug/abort" dialog box. I can provide details if anybody needs them. I just figured somebody might want to know that this is an actual build/test problem instead of some kind of issue with the internet connection here.
Thanks, Alan From ndbecker2 at gmail.com Wed Sep 5 15:18:29 2007 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 05 Sep 2007 09:18:29 -0400 Subject: [Python-Dev] python sphinx install? References: Message-ID: Georg Brandl wrote: > Neal Becker wrote: >> I'm interested in trying out new style (python 2.6) documentation. I see >> we're using docutils + sphinx? >> >> I did: svn co http://svn.python.org/projects/doctools/trunk/ >> >> How can I install this to try it with python-2.5? > > What do you want to try with Python 2.5? > > If you want to build the Python 2.6/3.0 docs, it's easiest to check the > Python sources out from http://svn.python.org/projects/python/trunk, go to > the Doc directory and do "make html". This will check out sphinx and all > other needed libraries into Doc/tools and build the docs. > > Georg > I want to document my own python code. I figured I might as well start using the new documentation system - but I'm using python-2.5. I intend to use epydoc. I thought maybe I could just add sphinx to my docutils, but maybe not? From g.brandl at gmx.net Wed Sep 5 15:30:24 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 05 Sep 2007 15:30:24 +0200 Subject: [Python-Dev] python sphinx install? In-Reply-To: References: Message-ID: Neal Becker wrote: > Georg Brandl wrote: > >> Neal Becker wrote: >>> I'm interested in trying out new style (python 2.6) documentation. I see >>> we're using docutils + sphinx? >>> >>> I did: svn co http://svn.python.org/projects/doctools/trunk/ >>> >>> How can I install this to try it with python-2.5? >> >> What do you want to try with Python 2.5? >> >> If you want to build the Python 2.6/3.0 docs, it's easiest to check the >> Python sources out from http://svn.python.org/projects/python/trunk, go to >> the Doc directory and do "make html". This will check out sphinx and all >> other needed libraries into Doc/tools and build the docs. >> >> Georg >> > > I want to document my own python code.
I figured I might as well start > using the new documentation system - but I'm using python-2.5. I intend to > use epydoc. I thought maybe I could just add sphinx to my docutils, but > maybe not? I see. Currently, sphinx is not ready to be used by other projects, at least not in conjunction with tools like epydoc. (You should, however, be able to create a rst document hierarchy like Python's and use sphinx for it.) As soon as all needs for the Python documentation are fulfilled, I'll think about how to make the toolset available for other projects. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From martin at v.loewis.de Wed Sep 5 15:34:03 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 05 Sep 2007 15:34:03 +0200 Subject: [Python-Dev] x86 XP trunk failure In-Reply-To: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com> References: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com> Message-ID: <46DEB04B.3090500@v.loewis.de> > My build slave (http://www.python.org/dev/buildbot/trunk/x86%20XP%20trunk) > keeps failing because of a crash that appears to be in the bsddb > module. I assume the master deems the slave to be lost because it's > sitting there waiting on me to make a choice on the "debug/abort" > dialog box. What branch, and for how long has this dialog been sitting around? For crashes in 3.0, there should not be any such dialogs anymore, but there may have been before I turned them off. > I can provide details if anybody needs them. I just figured somebody > might want to know that this is actual build/test problem instead of > some kind of issue with the internet connection here. Thanks. 
You can discard any such dialogs - most likely, they really were from the 3.0 branch, which is known to crash in bsddb. Regards, Martin From alan.mcintyre at gmail.com Wed Sep 5 15:40:57 2007 From: alan.mcintyre at gmail.com (Alan McIntyre) Date: Wed, 5 Sep 2007 09:40:57 -0400 Subject: [Python-Dev] x86 XP trunk failure In-Reply-To: <46DEB04B.3090500@v.loewis.de> References: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com> <46DEB04B.3090500@v.loewis.de> Message-ID: <1d36917a0709050640y6fead745p6bac24f6f6bb520d@mail.gmail.com> On 9/5/07, "Martin v. Löwis" wrote: > > My build slave (http://www.python.org/dev/buildbot/trunk/x86%20XP%20trunk) > > keeps failing because of a crash that appears to be in the bsddb > > module. I assume the master deems the slave to be lost because it's > > sitting there waiting on me to make a choice on the "debug/abort" > > dialog box. > > What branch, and for how long has this dialog been sitting around? It's the trunk; at the moment the debugger is sitting there with python_d.exe at a breakpoint. The current instance of python_d being debugged is only a day or so old. I don't know when this problem started happening, but I think it's been a while (it was happening for all the visible builds on the dashboard when I first noticed it a day or two ago). > For crashes in 3.0, there should not be any such dialogs anymore, but > there may have been before I turned them off. > > > I can provide details if anybody needs them. I just figured somebody > > might want to know that this is an actual build/test problem instead of > > some kind of issue with the internet connection here. > > Thanks. You can discard any such dialogs - most likely, they really were > from the 3.0 branch, which is known to crash in bsddb. Ok. From janssen at parc.com Wed Sep 5 17:17:12 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 5 Sep 2007 08:17:12 PDT Subject: [Python-Dev] frozenset C API?
In-Reply-To: <46DE3DB8.6000004@v.loewis.de> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <46DE3DB8.6000004@v.loewis.de> Message-ID: <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com> > > In any case, it certainly sounds to me as if there can be multiple > > instances of AttributeTypeAndValue with the same "type" field in a > > single Name. So I'll represent them as tuples, which will preserve > > the order in which they occur in the certificate, and make the value > > immutable. > > Ok. I think this will still not support multi-valued RDNs properly, but > those are uncommon in PKI. I'm not sure why not... Can you say more? Bill From janssen at parc.com Wed Sep 5 17:44:09 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 5 Sep 2007 08:44:09 PDT Subject: [Python-Dev] frozenset C API? In-Reply-To: <46DE431B.20403@v.loewis.de> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <07Sep4.195817pdt."57996"@synergy1.parc.xerox.com> <46DE431B.20403@v.loewis.de> Message-ID: <07Sep5.084418pdt."57996"@synergy1.parc.xerox.com> >The DNs of these are structurally different, one being >/DC=org/DC=python/CN=foo/CN=bar and the other >/DC=org/DC=python/CN=foo2+CN=bar2 Ah, I see what you're driving at. You can inspect them yourself by looking at the certs with openssl: % openssl x509 -text -in attachment-0002.crt Certificate: Data: Version: 3 (0x2) Serial Number: a9:29:70:b4:3a:72:27:5a Signature Algorithm: sha1WithRSAEncryption Issuer: DC=org, DC=python, CN=foo, CN=bar Validity Not Before: Sep 5 05:38:20 2007 GMT Not After : Sep 4 05:38:20 2008 GMT Subject: DC=org, DC=python, CN=foo, CN=bar Subject Public Key Info ... 
% openssl x509 -text -in attachment-0003.crt Certificate: Data: Version: 3 (0x2) Serial Number: 82:0a:4f:36:0f:ab:1a:c3 Signature Algorithm: sha1WithRSAEncryption Issuer: DC=org, DC=python, CN=bar2, CN=foo2 Validity Not Before: Sep 5 05:43:26 2007 GMT Not After : Sep 4 05:43:26 2008 GMT Subject: DC=org, DC=python, CN=bar2, CN=foo2 Subject Public Key Info: The hierarchy information does not appear to be preserved. Bill From janssen at parc.com Wed Sep 5 17:48:04 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 5 Sep 2007 08:48:04 PDT Subject: [Python-Dev] frozenset C API? In-Reply-To: <46DE431B.20403@v.loewis.de> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <07Sep4.195817pdt."57996"@synergy1.parc.xerox.com> <46DE431B.20403@v.loewis.de> Message-ID: <07Sep5.084813pdt."57996"@synergy1.parc.xerox.com> More succinctly: % openssl x509 -subject -noout -in attachment-0002.crt subject= /DC=org/DC=python/CN=foo/CN=bar % openssl x509 -subject -noout -in attachment-0003.crt subject= /DC=org/DC=python/CN=bar2/CN=foo2 % Bill From martin at v.loewis.de Wed Sep 5 17:49:10 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 05 Sep 2007 17:49:10 +0200 Subject: [Python-Dev] frozenset C API? In-Reply-To: <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <46DE3DB8.6000004@v.loewis.de> <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com> Message-ID: <46DECFF6.4040107@v.loewis.de> >> Ok. I think this will still not support multi-valued RDNs properly, but >> those are uncommon in PKI. > > I'm not sure why not... Can you say more? See the example certificates. 
If you get (('cn','a'),('email','b')), you can't tell whether that's two single-valued RDNs in a DN, or one multi-valued RDN with two attribute/value pairs. Regards, Martin From martin at v.loewis.de Wed Sep 5 18:05:27 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 05 Sep 2007 18:05:27 +0200 Subject: [Python-Dev] frozenset C API? In-Reply-To: <07Sep5.084418pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <07Sep4.195817pdt."57996"@synergy1.parc.xerox.com> <46DE431B.20403@v.loewis.de> <07Sep5.084418pdt."57996"@synergy1.parc.xerox.com> Message-ID: <46DED3C7.3050308@v.loewis.de> > The hierarchy information does not appear to be preserved. But it only appears so. OpenSSL does not know how to render it properly (hence I say it is not very common in PKI), but they started supporting that when generating certificates, with the -multivalue-rdn option for req, and if you do openssl asn1parse -in ca1.crt you see that they differ: (ca1) l= 17 cons: SEQUENCE l= 10 prim: OBJECT :domainComponent l= 3 prim: IA5STRING :org l= 22 cons: SET l= 20 cons: SEQUENCE l= 10 prim: OBJECT :domainComponent l= 6 prim: IA5STRING :python l= 12 cons: SET l= 10 cons: SEQUENCE l= 3 prim: OBJECT :commonName l= 3 prim: PRINTABLESTRING :foo l= 12 cons: SET l= 10 cons: SEQUENCE l= 3 prim: OBJECT :commonName l= 3 prim: PRINTABLESTRING :bar (ca2) l= 17 cons: SEQUENCE l= 10 prim: OBJECT :domainComponent l= 3 prim: IA5STRING :org l= 22 cons: SET l= 20 cons: SEQUENCE l= 10 prim: OBJECT :domainComponent l= 6 prim: IA5STRING :python l= 26 cons: SET l= 11 cons: SEQUENCE l= 3 prim: OBJECT :commonName l= 4 prim: PRINTABLESTRING :bar2 l= 11 cons: SEQUENCE l= 3 prim: OBJECT :commonName l= 4 prim: PRINTABLESTRING :foo2 In the first case, foo and bar are in different sets, in the second case, they are in the 
same set. For people concerned about security, that makes a difference. If OpenSSL actually supports that in its APIs, my proposal would be to make a multi-valued RDN a more-than-two-tuple, e.g. (('DC','org'),('DC','python'),('CN','bar2','CN','foo2')) That would make it possible to distinguish the names (pun intended), yet still not produce structural overhead for the normal case of single-valued RDNs. Regards, Martin From martin at v.loewis.de Wed Sep 5 18:06:53 2007 From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed, 05 Sep 2007 18:06:53 +0200 Subject: [Python-Dev] frozenset C API? In-Reply-To: <07Sep5.084813pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <07Sep4.195817pdt."57996"@synergy1.parc.xerox.com> <46DE431B.20403@v.loewis.de> <07Sep5.084813pdt."57996"@synergy1.parc.xerox.com> Message-ID: <46DED41D.8010506@v.loewis.de> > % openssl x509 -subject -noout -in attachment-0002.crt > subject= /DC=org/DC=python/CN=foo/CN=bar > % openssl x509 -subject -noout -in attachment-0003.crt > subject= /DC=org/DC=python/CN=bar2/CN=foo2 Well, that's the same bug that John Nagle complains about. This output is incorrect. Regards, Martin From janssen at parc.com Wed Sep 5 18:12:34 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 5 Sep 2007 09:12:34 PDT Subject: [Python-Dev] frozenset C API? In-Reply-To: <46DECFF6.4040107@v.loewis.de> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <46DE3DB8.6000004@v.loewis.de> <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com> <46DECFF6.4040107@v.loewis.de> Message-ID: <07Sep5.091238pdt."57996"@synergy1.parc.xerox.com> > See the example certificates.
If you get (('cn','a'),('email','b')), > you can't tell whether that's two single-valued RDNs in a DN, > or one multi-valued RDN with two attribute/value pairs. Yup, got it. I don't see a way in the OpenSSL library functions I'm using (X509_NAME_ENTRY_get_object, X509_NAME_ENTRY_get_data) to distinguish between different RDNs, but I'll take a look at the source for X509_NAME_print_ex, which does seem to be able to do this. Bill From janssen at parc.com Wed Sep 5 18:26:42 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 5 Sep 2007 09:26:42 PDT Subject: [Python-Dev] frozenset C API? In-Reply-To: <07Sep5.091238pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <46DE3DB8.6000004@v.loewis.de> <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com> <46DECFF6.4040107@v.loewis.de> <07Sep5.091238pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com> > Yup, got it. I don't see a way in the OpenSSL library functions I'm > using (X509_NAME_ENTRY_get_object, X509_NAME_ENTRY_get_data) to > distinguish between different RDNs, but I'll take a look at the source > for X509_NAME_print_ex, which does seem to be able to do this. There's a field on the X509_NAME_ENTRY struct which gives the level. OK, I can make it a tuple (list of RDNs) of tuples (one for each RDN) of tuples (one for each attribute in the RDN). And maybe add a flatten function to the ssl.py module :-). Bill From janssen at parc.com Wed Sep 5 18:27:04 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 5 Sep 2007 09:27:04 PDT Subject: [Python-Dev] frozenset C API? 
In-Reply-To: <46DED41D.8010506@v.loewis.de> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <07Sep4.195817pdt."57996"@synergy1.parc.xerox.com> <46DE431B.20403@v.loewis.de> <07Sep5.084813pdt."57996"@synergy1.parc.xerox.com> <46DED41D.8010506@v.loewis.de> Message-ID: <07Sep5.092704pdt."57996"@synergy1.parc.xerox.com> > Well, that's the same bug that John Nagle complains about. Yes, I agree. Bill From martin at v.loewis.de Wed Sep 5 19:24:35 2007 From: martin at v.loewis.de ("Martin v. Löwis") Date: Wed, 05 Sep 2007 19:24:35 +0200 Subject: [Python-Dev] frozenset C API? In-Reply-To: <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <46DE3DB8.6000004@v.loewis.de> <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com> <46DECFF6.4040107@v.loewis.de> <07Sep5.091238pdt."57996"@synergy1.parc.xerox.com> <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com> Message-ID: <46DEE653.3000909@v.loewis.de> > There's a field on the X509_NAME_ENTRY struct which gives the level. > OK, I can make it a tuple (list of RDNs) of tuples (one for each RDN) > of tuples (one for each attribute in the RDN). And maybe add a > flatten function to the ssl.py module :-). > See my other proposal as well. As nobody actually uses multi-valued RDNs, an option would be to make a single tuple for each RDN, containing all attributes, alternating type and value. Then, a single-valued RDN would turn out as a key-value pair (two-tuple), and a multi-valued RDN would have a length of 2*number-of-attributes.
As for accessor functions, I'd then rather see a get_attr_by_type, returning a list of all values of attributes of that type, across all RDNs in the DN (empty if no attribute was found). People would then do

    x = get_attr_by_type(subj, ssl.commonName)
    if len(x) != 1:
        unsupported_certificate()
    CN = x[0]

Regards, Martin From janssen at parc.com Wed Sep 5 19:49:09 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 5 Sep 2007 10:49:09 PDT Subject: [Python-Dev] frozenset C API? In-Reply-To: <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <46DE3DB8.6000004@v.loewis.de> <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com> <46DECFF6.4040107@v.loewis.de> <07Sep5.091238pdt."57996"@synergy1.parc.xerox.com> <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep5.104910pdt."57996"@synergy1.parc.xerox.com> > OK, I can make it a tuple (list of RDNs) of tuples (one for each RDN) > of tuples (one for each attribute in the RDN).
Which gets us to this: {'issuer': ((('countryName', u'US'),), (('stateOrProvinceName', u'Delaware'),), (('localityName', u'Wilmington'),), (('organizationName', u'Python Software Foundation'),), (('organizationalUnitName', u'SSL'),), (('commonName', u'somemachine.python.org'),)), 'notAfter': 'Feb 16 16:54:50 2013 GMT', 'notBefore': 'Aug 27 16:54:50 2007 GMT', 'subject': ((('countryName', u'US'),), (('stateOrProvinceName', u'Delaware'),), (('localityName', u'Wilmington'),), (('organizationName', u'Python Software Foundation'),), (('organizationalUnitName', u'SSL'),), (('commonName', u'somemachine.python.org'),)), 'version': 2} and {'issuer': ((('countryName', u'US'),), (('organizationName', u'VeriSign, Inc.'),), (('organizationalUnitName', u'VeriSign Trust Network'),), (('organizationalUnitName', u'Terms of use at https://www.verisign.com/rpa (c)06'),), (('commonName', u'VeriSign Class 3 Extended Validation SSL SGC CA'),)), 'notAfter': 'May 8 23:59:59 2009 GMT', 'notBefore': 'May 9 00:00:00 2007 GMT', 'subject': ((('serialNumber', u'2497886'),), (('1.3.6.1.4.1.311.60.2.1.3', u'US'),), (('1.3.6.1.4.1.311.60.2.1.2', u'Delaware'),), (('countryName', u'US'),), (('postalCode', u'94043'),), (('stateOrProvinceName', u'California'),), (('localityName', u'Mountain View'),), (('streetAddress', u'487 East Middlefield Road'),), (('organizationName', u'VeriSign, Inc.'),), (('organizationalUnitName', u'Production Security Services'),), (('organizationalUnitName', u'Terms of use at www.verisign.com/rpa (c)06'),), (('commonName', u'www.verisign.com'),)), 'version': 2} Ugly, but accurate. Or is it? Do you really think that "serialNumber" is at the top of a naming tree somewhere? Bill From martin at v.loewis.de Wed Sep 5 20:31:27 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 05 Sep 2007 20:31:27 +0200 Subject: [Python-Dev] frozenset C API? 
In-Reply-To: <07Sep5.104910pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <46DE3DB8.6000004@v.loewis.de> <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com> <46DECFF6.4040107@v.loewis.de> <07Sep5.091238pdt."57996"@synergy1.parc.xerox.com> <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com> <07Sep5.104910pdt."57996"@synergy1.parc.xerox.com> Message-ID: <46DEF5FF.8040602@v.loewis.de> > 'subject': ((('serialNumber', u'2497886'),), > (('1.3.6.1.4.1.311.60.2.1.3', u'US'),), > (('1.3.6.1.4.1.311.60.2.1.2', u'Delaware'),), > (('countryName', u'US'),), > (('postalCode', u'94043'),), > (('stateOrProvinceName', u'California'),), > (('localityName', u'Mountain View'),), > (('streetAddress', u'487 East Middlefield Road'),), > (('organizationName', u'VeriSign, Inc.'),), > (('organizationalUnitName', u'Production Security Services'),), > (('organizationalUnitName', > u'Terms of use at www.verisign.com/rpa (c)06'),), > (('commonName', u'www.verisign.com'),)), > 'version': 2} > > Ugly, but accurate. Or is it? Do you really think that > "serialNumber" is at the top of a naming tree somewhere? Firefox claims the same order. Too bad Verisign hasn't grasped the concept of distinguished names :-( Had they done it right, incorporationStateId, incorporationLocalityId, streetAddress, localityName, postalCode would all have been in the RDN with organizationName - they are all attributes of that organization (or the address attributes perhaps belong to the OU). Also, I doubt they have an organizationalUnit "Terms of use at ...". Regards, Martin From skip at pobox.com Wed Sep 5 20:47:17 2007 From: skip at pobox.com (skip at pobox.com) Date: Wed, 5 Sep 2007 13:47:17 -0500 Subject: [Python-Dev] Errors in the csv module reader/writer methods - new w/ change to rst? 
Message-ID: <18142.63925.900009.894021@montanaro.dyndns.org> I was just looking for some csv DictWriter examples for a colleague at work and was myself confused by the apparent transformation which took place in the Reader Objects and Writer Objects sections. Each of the methods is now prefixed by "csv.csvreader." or "csv.csvwriter." Neither expression was previously defined in that section. Those undefined expressions are not in the old versions of the documentation, e.g.: http://www.python.org/doc/2.4.4/lib/node634.html http://www.python.org/doc/2.4.4/lib/node635.html http://www.python.org/doc/2.5/lib/node265.html http://www.python.org/doc/2.5/lib/node266.html I don't think they add anything useful to the documentation. Were they added just for the csv module (and possibly a few others) or was this a change to the entire libref documentation? That is, was this some sort of policy change? Can they be removed? Skip From janssen at parc.com Wed Sep 5 20:58:16 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 5 Sep 2007 11:58:16 PDT Subject: [Python-Dev] frozenset C API? In-Reply-To: <46DEF5FF.8040602@v.loewis.de> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <46DE3DB8.6000004@v.loewis.de> <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com> <46DECFF6.4040107@v.loewis.de> <07Sep5.091238pdt."57996"@synergy1.parc.xerox.com> <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com> <07Sep5.104910pdt."57996"@synergy1.parc.xerox.com> <46DEF5FF.8040602@v.loewis.de> Message-ID: <07Sep5.115820pdt."57996"@synergy1.parc.xerox.com> All of this makes me think that some folks may want to do more processing on certificates with more advanced tools, and for that they will need access to the full bits of the certificate. I'll add the ability to retrieve that as well. 
I'm wondering if I should try to pull some extension attributes out of the cert, and add them to the dict, as well. Like subjectAltName, for instance. Or should we just wait till someone wants it and files a bug report? Bill From martin at v.loewis.de Wed Sep 5 21:10:52 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 05 Sep 2007 21:10:52 +0200 Subject: [Python-Dev] frozenset C API? In-Reply-To: <07Sep5.115820pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <46DE3DB8.6000004@v.loewis.de> <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com> <46DECFF6.4040107@v.loewis.de> <07Sep5.091238pdt."57996"@synergy1.parc.xerox.com> <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com> <07Sep5.104910pdt."57996"@synergy1.parc.xerox.com> <46DEF5FF.8040602@v.loewis.de> <07Sep5.115820pdt."57996"@synergy1.parc.xerox.com> Message-ID: <46DEFF3C.90306@v.loewis.de> > I'm wondering if I should try to pull some extension attributes out of > the cert, and add them to the dict, as well. Like subjectAltName, for > instance. Or should we just wait till someone wants it and files a > bug report? If you have the time and inclination to do that, feel free to. Covering some of the most widely used extensions could be useful: subjectAltName, key usage, extended key usage. If you set up a framework for that, people will contribute others they like to see supported. Regards, Martin From db3l.net at gmail.com Wed Sep 5 21:48:04 2007 From: db3l.net at gmail.com (David Bolen) Date: Wed, 05 Sep 2007 15:48:04 -0400 Subject: [Python-Dev] x86 XP trunk failure References: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com> <46DEB04B.3090500@v.loewis.de> Message-ID: "Martin v. 
Löwis" writes: >> My build slave (http://www.python.org/dev/buildbot/trunk/x86%20XP%20trunk) >> keeps failing because of a crash that appears to be in the bsddb >> module. I assume the master deems the slave to be lost because it's >> sitting there waiting on me to make a choice on the "debug/abort" >> dialog box. > > What branch, and for how long has this dialog been sitting around? > > For crashes in 3.0, there should not be any such dialogs anymore, but > there may have been before I turned them off. I think there may actually be an issue here, if only with the tests, even though 3.0 does suppress the dialog. I think I started noticing this in the first build after bringing my buildbot online, so I think on Sep 1. I had manually done a build on Aug 28 (running the buildbot batch file interactively) without the problem, but I haven't been able to find any relevant source tree changes in that interval. Re-fetching from that date has the problem, and I had blown away my older tree when starting up the buildbot officially (of course :-(). At least for me, it's happening on 2.5 and trunk (hard to tell about 3.0, but that's dying without a dialog), so I thought it might have been something backported. But it also appears common to more platforms than just Windows - it's just Windows that pops up that dialog. In my case, the actual dialog doesn't pop up until the end of the tests, and it seems to be occurring only if test_bsddb3 has run during the tests. On other platforms, it just shows up as a warning message, which doesn't serve to mark the tests as failing (e.g., OS X and FreeBSD) - at the end of the test you get a message of: warning: DBTxn aborted in destructor. No prior commit() or abort(). which I tracked back to an abort() call within the bsddb library as final destruction is happening at Python exit. (When clearing the test_bsddb module, and the bsddb wrapper tries to access a log file related to an open transaction). 
So perhaps there's an issue with how one or more of the tests are constructed, or cleanup or something. I haven't narrowed it down further yet though. As with Alan, more details are available as needed. While it seems to show up in the full test run on more platforms, I have a harder time forcing it by just running test_bsddb3 on FreeBSD, for example, while I get the dialog consistently on Windows. -- David From db3l.net at gmail.com Wed Sep 5 22:25:05 2007 From: db3l.net at gmail.com (David Bolen) Date: Wed, 05 Sep 2007 16:25:05 -0400 Subject: [Python-Dev] x86 XP trunk failure References: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com> <46DEB04B.3090500@v.loewis.de> Message-ID: I previously wrote: > (...) > which I tracked back to an abort() call within the bsddb library as > final destruction is happening at Python exit. (When clearing the > test_bsddb module, and the bsddb wrapper tries to access a log file > related to an open transaction). (...) For those more familiar with bsddb, it's the test_1413192.py module in lib/bsddb/test that tickles the problem. It should have been more obvious, since I saw the 1413192 in the module name during exit cleanup, but mentally ignored it as an internal identifier of some sort. The test module clearly leaves an open transaction, but also purges its working directory, so maybe that's why the log file is missing. But since the test was specifically against object destruction, I'm not sure how best to restructure (maybe make env_name into a class that only prunes the directory in __del__? Although that would affect GC and thus destruction order too). This test has been around a bit, but the pruning of the directory was backported recently, which is probably the source of the problems. 
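The env_name idea floated above could look roughly like the following. This is only a sketch under assumptions, not the code in test_1413192.py: the class name ScratchDir and its prefix argument are made up for illustration, and it uses tempfile and shutil rather than whatever the test really does. Since tying removal to __del__ changes destruction order, as noted, the sketch also exposes an explicit cleanup() that can be called at a known point.

```python
import os
import shutil
import tempfile


class ScratchDir:
    """Hypothetical owner of a test working directory.

    The directory is removed only when cleanup() is called or when the
    owner object itself is destroyed, not while other objects may still
    need files inside it.
    """

    def __init__(self, prefix="bsddb_test_"):
        # prefix is an illustrative default, not taken from the real test
        self.path = tempfile.mkdtemp(prefix=prefix)

    def cleanup(self):
        # Safe to call more than once.
        if self.path is not None and os.path.isdir(self.path):
            shutil.rmtree(self.path, ignore_errors=True)
        self.path = None

    def __del__(self):
        self.cleanup()
```

Calling cleanup() explicitly after the transaction has been committed or aborted avoids depending on the order in which the interpreter tears objects down at exit.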
-- David From greg.ewing at canterbury.ac.nz Wed Sep 5 22:40:22 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 06 Sep 2007 08:40:22 +1200 Subject: [Python-Dev] Product function patch [issue 1093] In-Reply-To: References: <318072440709022134m52cc729fi575a3fb99fb10b70@mail.gmail.com> <46DC91D2.7060407@canterbury.ac.nz> <46DD22F3.6070701@canterbury.ac.nz> <46DD482A.1000801@shrogers.com> <46DDCD88.7080009@canterbury.ac.nz> Message-ID: <46DF1436.4060404@canterbury.ac.nz> Guido van Rossum wrote: > By all means do write up a PEP -- it's hard to generalize from that one example. I'll write a PEP as soon as I get a chance. But the generalisation is pretty straightforward -- just replicate that signature for each of the binary operations. -- Greg From martin at v.loewis.de Wed Sep 5 23:18:34 2007 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Wed, 05 Sep 2007 23:18:34 +0200 Subject: [Python-Dev] x86 XP trunk failure In-Reply-To: References: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com> <46DEB04B.3090500@v.loewis.de> Message-ID: <46DF1D2A.8090506@v.loewis.de> > warning: DBTxn aborted in destructor. No prior commit() or abort(). I have seen these as well. bsddb isn't very forgiving when you have a Python exception inside a bsddb transaction, in the test suite. IIRC, the exception will abort the transaction, then the unittest fixture teardown will close the environment, and that will cause a bsddb crash because something is getting released that does not exist anymore. When I last looked at it, I did not see an easy way to fix it; contributions are welcome. Regards, Martin From db3l.net at gmail.com Thu Sep 6 00:35:10 2007 From: db3l.net at gmail.com (David Bolen) Date: Wed, 05 Sep 2007 18:35:10 -0400 Subject: [Python-Dev] x86 XP trunk failure References: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com> <46DEB04B.3090500@v.loewis.de> <46DF1D2A.8090506@v.loewis.de> Message-ID: "Martin v. 
Löwis" writes: >> warning: DBTxn aborted in destructor. No prior commit() or abort(). > > I have seen these as well. bsddb isn't very forgiving when you have > a Python exception inside a bsddb transaction, in the test suite. > IIRC, the exception will abort the transaction, then the unittest > fixture teardown will close the environment, and that will cause > a bsddb crash because something is getting released that does not > exist anymore. When I last looked at it, I did not see an easy way to > fix it; contributions are welcome. One thing I tried that seems to work fairly well for this case is to encapsulate much of the module-level code in the test into a class instance. That way the module-level code can instantiate and destroy the class instance rather than waiting for the interpreter exit for the latter. It definitely resolves this current issue, but when I reverted the changes to _bsddb.c that were originally made in conjunction with this test, it still seemed to pass the test. So I tried the reverted module with the original test code and it still passes. So I'm not entirely sure that the test is enforcing anything at this point, or at least I'm not sure how to be absolutely positive that the change will continue to enforce what the existing code used to test. But I can open a ticket with the proposed changes if that would help. -- David From db3l.net at gmail.com Thu Sep 6 01:01:43 2007 From: db3l.net at gmail.com (David Bolen) Date: Wed, 05 Sep 2007 19:01:43 -0400 Subject: [Python-Dev] x86 XP trunk failure References: <1d36917a0709050609q2d8d9a24s9c3b0f214762943d@mail.gmail.com> <46DEB04B.3090500@v.loewis.de> <46DF1D2A.8090506@v.loewis.de> Message-ID: I wrote: > But I can open a ticket with the proposed changes if that would help. Figure it can't hurt - I've created issue 1112 with the proposed patch to the test_1413192.py module. 
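The restructuring described above (moving module-level test code into a class instance that is created and destroyed explicitly) might be sketched like this. It is not the patch attached to issue 1112: FakeTxn, Context and run are made-up names, FakeTxn merely stands in for a transaction object and is not the bsddb API, and the sketch relies on CPython's reference counting to run destructors at the `del` statement.

```python
class FakeTxn:
    """Stand-in for an open transaction; records when it is destroyed."""

    def __init__(self, log):
        self.log = log
        self.log.append("begin")

    def __del__(self):
        self.log.append("txn destroyed")


class Context:
    """Holds the objects that would otherwise live at module level."""

    def __init__(self, log):
        self.log = log
        self.txn = FakeTxn(log)

    def __del__(self):
        self.log.append("context destroyed")


def run(log):
    ctx = Context(log)
    # Dropping the only reference destroys the context (and the
    # transaction it owns) here, not at interpreter exit.
    del ctx
    log.append("after del")
    return log
```

With this shape, both destructors have already run by the time "after del" is logged, so any cleanup of the working directory can safely happen afterwards.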
-- David From brett at python.org Thu Sep 6 01:38:06 2007 From: brett at python.org (Brett Cannon) Date: Wed, 5 Sep 2007 16:38:06 -0700 Subject: [Python-Dev] Google spreadsheet to collaborate on backporting Py3K stuff to 2.6 Message-ID: Neal, Anthony, Thomas W., and I have a spreadsheet that was started to keep track of what needs to be done in 2.6 for Py3K transitioning: http://spreadsheets.google.com/pub?key=pCKY4oaXnT81FrGo3ShGHGg . I am opening the spreadsheet up to everyone so that others can help maintain it. There is a sheet in the Python 3000 Tasks spreadsheet that should be merged into this spreadsheet and then deleted. If anyone wants to help with that it would be great (once something has been moved from "Python 3000 Tasks" to "Python 2 -> 3 transition" just delete it from "Python 3000 Tasks"). Because Neal created this spreadsheet he is the only one who can open editing to everyone. If you would like to have edit abilities to the spreadsheet just reply to this email saying you want an invite and I will add you manually (and if you want a different address added just say so). -Brett From janssen at parc.com Thu Sep 6 05:03:58 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 5 Sep 2007 20:03:58 PDT Subject: [Python-Dev] frozenset C API? 
In-Reply-To: <46DEFF3C.90306@v.loewis.de> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <46DE3DB8.6000004@v.loewis.de> <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com> <46DECFF6.4040107@v.loewis.de> <07Sep5.091238pdt."57996"@synergy1.parc.xerox.com> <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com> <07Sep5.104910pdt."57996"@synergy1.parc.xerox.com> <46DEF5FF.8040602@v.loewis.de> <07Sep5.115820pdt."57996"@synergy1.parc.xerox.com> <46DEFF3C.90306@v.loewis.de> Message-ID: <07Sep5.200401pdt."57996"@synergy1.parc.xerox.com> > > I'm wondering if I should try to pull some extension attributes out of > > the cert, and add them to the dict, as well. Like subjectAltName, for > > instance. Or should we just wait till someone wants it and files a > > bug report? > > If you have the time and inclination to do that, feel free to. Covering > some of the most widely used extensions could be useful: subjectAltName, > key usage, extended key usage. If you set up a framework for that, > people will contribute others they like to see supported. It's actually easier to do all or nothing. I'm tempted to just report 'critical' extensions. Bill From janssen at parc.com Thu Sep 6 05:52:08 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 5 Sep 2007 20:52:08 PDT Subject: [Python-Dev] frozenset C API? 
In-Reply-To: <07Sep5.200401pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <46DE3DB8.6000004@v.loewis.de> <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com> <46DECFF6.4040107@v.loewis.de> <07Sep5.091238pdt."57996"@synergy1.parc.xerox.com> <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com> <07Sep5.104910pdt."57996"@synergy1.parc.xerox.com> <46DEF5FF.8040602@v.loewis.de> <07Sep5.115820pdt."57996"@synergy1.parc.xerox.com> <46DEFF3C.90306@v.loewis.de> <07Sep5.200401pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep5.205217pdt."57996"@synergy1.parc.xerox.com> > > > I'm wondering if I should try to pull some extension attributes out of > > > the cert, and add them to the dict, as well. Like subjectAltName, for > > > instance. Or should we just wait till someone wants it and files a > > > bug report? > > > > If you have the time and inclination to do that, feel free to. Covering > > some of the most widely used extensions could be useful: subjectAltName, > > key usage, extended key usage. If you set up a framework for that, > > people will contribute others they like to see supported. > > It's actually easier to do all or nothing. I'm tempted to just report > 'critical' extensions. Simpler to provide them all, though I should note that the purpose of the information provided here is mainly for authorization/accounting purposes, not for "other" use of the certificate. If that's desired, they should pull the binary form of the certificate (there's an interface for that), and use M2Crypto or PyOpenSSL to decode it in general. This certificate has already been validated; the issue is how to get critical information to the app so it can make authorization decisions (like subjectAltName when the subject field is empty). 
Reporting non-critical extensions like "extended key usage" is nifty, but seems pointless. Here's an example: {'extensions': {'Netscape Cert Type': u'SSL Server'}, 'issuer': ((('countryName', u'US'),), (('stateOrProvinceName', u'Delaware'),), (('localityName', u'Wilmington'),), (('organizationName', u'Python Software Foundation'),), (('organizationalUnitName', u'SSL'),), (('commonName', u'somemachine.python.org'),)), 'notAfter': 'Feb 16 16:54:50 2013 GMT', 'notBefore': 'Aug 27 16:54:50 2007 GMT', 'serialNumber': 'FFAA4ADBF570818D', 'subject': ((('countryName', u'US'),), (('stateOrProvinceName', u'Delaware'),), (('localityName', u'Wilmington'),), (('organizationName', u'Python Software Foundation'),), (('organizationalUnitName', u'SSL'),), (('commonName', u'somemachine.python.org'),)), 'version': 3} and {'extensions': {'1.3.6.1.5.5.7.1.12': u'', 'Authority Information Access': u'OCSP - URI:http://EVIntl-ocsp.verisign.com\n', 'X509v3 Authority Key Identifier': u'keyid:4E:43:C8:1D:76:EF:37:53:7A:4F:F2:58:6F:94:F3:38:E2:D5:BD:DF\n', 'X509v3 Basic Constraints': u'CA:FALSE', 'X509v3 CRL Distribution Points': u'URI:http://EVIntl-crl.verisign.com/EVIntl2006.crl\n', 'X509v3 Certificate Policies': u'Policy: 2.16.840.1.113733.1.7.23.6\n', 'X509v3 Extended Key Usage': u'TLS Web Server Authentication, TLS Web Client Authentication, Netscape Server Gated Crypto, Microsoft Server Gated Crypto', 'X509v3 Key Usage': u'Digital Signature, Key Encipherment', 'X509v3 Subject Key Identifier': u'F1:5A:89:93:55:47:4B:BA:51:F5:4E:E0:CB:16:55:F4:D7:CC:38:67'}, 'issuer': ((('countryName', u'US'),), (('organizationName', u'VeriSign, Inc.'),), (('organizationalUnitName', u'VeriSign Trust Network'),), (('organizationalUnitName', u'Terms of use at https://www.verisign.com/rpa (c)06'),), (('commonName', u'VeriSign Class 3 Extended Validation SSL SGC CA'),)), 'notAfter': 'May 8 23:59:59 2009 GMT', 'notBefore': 'May 9 00:00:00 2007 GMT', 'serialNumber': '6A4AC31B3110E6EB48F0FC51A39A171F', 
'subject': ((('serialNumber', u'2497886'),), (('1.3.6.1.4.1.311.60.2.1.3', u'US'),), (('1.3.6.1.4.1.311.60.2.1.2', u'Delaware'),), (('countryName', u'US'),), (('postalCode', u'94043'),), (('stateOrProvinceName', u'California'),), (('localityName', u'Mountain View'),), (('streetAddress', u'487 East Middlefield Road'),), (('organizationName', u'VeriSign, Inc.'),), (('organizationalUnitName', u'Production Security Services'),), (('organizationalUnitName', u'Terms of use at www.verisign.com/rpa (c)06'),), (('commonName', u'www.verisign.com'),)), 'version': 3} Probably another thing that *should* be reported is the cipher used to protect the information on the channel, so that the app can decide whether it's strong enough for its taste. (If it's not, it can presumably reconnect using a different variant of SSL to try for a better result, or decide not to use the server (or talk to the client) at all.) Bill From martin at v.loewis.de Thu Sep 6 08:46:50 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 06 Sep 2007 08:46:50 +0200 Subject: [Python-Dev] frozenset C API? In-Reply-To: <07Sep5.205217pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <46DE3DB8.6000004@v.loewis.de> <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com> <46DECFF6.4040107@v.loewis.de> <07Sep5.091238pdt."57996"@synergy1.parc.xerox.com> <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com> <07Sep5.104910pdt."57996"@synergy1.parc.xerox.com> <46DEF5FF.8040602@v.loewis.de> <07Sep5.115820pdt."57996"@synergy1.parc.xerox.com> <46DEFF3C.90306@v.loewis.de> <07Sep5.200401pdt."57996"@synergy1.parc.xerox.com> <07Sep5.205217pdt."57996"@synergy1.parc.xerox.com> Message-ID: <46DFA25A.1070901@v.loewis.de> >> It's actually easier to do all or nothing. I'm tempted to just report >> 'critical' extensions. 
> > Simpler to provide them all I very much doubt that, at least if you want to report decoded information. Conceptually, there is an infinite number of extensions, and when you are done, I can show you lots of certificates that have extensions that you don't support. > This certificate has already been validated; the issue is > how to get critical information to the app so it can make > authorization decisions (like subjectAltName when the subject field is > empty) > {'extensions': {'1.3.6.1.5.5.7.1.12': u'', > 'Authority Information Access': u'OCSP - URI:http://EVIntl-ocsp.verisign.com\n', > 'X509v3 Authority Key Identifier': u'keyid:4E:43:C8:1D:76:EF:37:53:7A:4F:F2:58:6F:94:F3:38:E2:D5:BD:DF\n', > 'X509v3 Basic Constraints': u'CA:FALSE', > 'X509v3 CRL Distribution Points': u'URI:http://EVIntl-crl.verisign.com/EVIntl2006.crl\n', > 'X509v3 Certificate Policies': u'Policy: 2.16.840.1.113733.1.7.23.6\n', > 'X509v3 Extended Key Usage': u'TLS Web Server Authentication, TLS Web Client Authentication, Netscape Server Gated Crypto, Microsoft Server Gated Crypto', > 'X509v3 Key Usage': u'Digital Signature, Key Encipherment', > 'X509v3 Subject Key Identifier': u'F1:5A:89:93:55:47:4B:BA:51:F5:4E:E0:CB:16:55:F4:D7:CC:38:67'}, Hmm. In this certificate, none of the extensions you report have been marked critical; they are all non-critical. Also, you are reporting the logotype (1.3.6.1.5.5.7.1.12) incorrectly; it's defined in RFC 3709, and it's definitely not an empty string in the certificate you've used. Regards, Martin From janssen at parc.com Thu Sep 6 18:11:57 2007 From: janssen at parc.com (Bill Janssen) Date: Thu, 6 Sep 2007 09:11:57 PDT Subject: [Python-Dev] frozenset C API? 
In-Reply-To: <46DFA25A.1070901@v.loewis.de> References: <-4762611594645938717@unknownmsgid> <07Sep4.122146pdt."57996"@synergy1.parc.xerox.com> <46DDCD7C.40004@v.loewis.de> <07Sep4.152117pdt."57996"@synergy1.parc.xerox.com> <46DE3DB8.6000004@v.loewis.de> <07Sep5.081717pdt."57996"@synergy1.parc.xerox.com> <46DECFF6.4040107@v.loewis.de> <07Sep5.091238pdt."57996"@synergy1.parc.xerox.com> <07Sep5.092643pdt."57996"@synergy1.parc.xerox.com> <07Sep5.104910pdt."57996"@synergy1.parc.xerox.com> <46DEF5FF.8040602@v.loewis.de> <07Sep5.115820pdt."57996"@synergy1.parc.xerox.com> <46DEFF3C.90306@v.loewis.de> <07Sep5.200401pdt."57996"@synergy1.parc.xerox.com> <07Sep5.205217pdt."57996"@synergy1.parc.xerox.com> <46DFA25A.1070901@v.loewis.de> Message-ID: <07Sep6.091202pdt."57996"@synergy1.parc.xerox.com> > I very much doubt that, at least if you want to report decoded > information. Conceptually, there is an infinite number of extensions, > and when you are done, I can show you lots of certificates that > have extensions that you don't support. I'm not going to decode anything; I'm just using the OpenSSL functionality and providing whatever it provides. > Hmm. In this certificate, none of the extensions you report have been > marked critical; they are all non-critical. That's what I meant by "simpler to show everything". > Also, you are reporting the logotype (1.3.6.1.5.5.7.1.12) incorrectly; > it's defined in RFC 3709, and it's definitely not an empty string in > the certificate you've used. Yes, I see. I'll poke at the OpenSSL code harder :-). Bill From radix at twistedmatrix.com Thu Sep 6 18:50:57 2007 From: radix at twistedmatrix.com (Christopher Armstrong) Date: Thu, 6 Sep 2007 12:50:57 -0400 Subject: [Python-Dev] frozenset C API? 
In-Reply-To: <-1936579380892715012@unknownmsgid> References: <-4762611594645938717@unknownmsgid> <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> Message-ID: <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> On 9/5/07, Bill Janssen wrote: > > It's actually easier to do all or nothing. I'm tempted to just report > > 'critical' extensions. > > Simpler to provide them all, though I should note that the purpose of > the information provided here is mainly for authorization/accounting > purposes, not for "other" use of the certificate. If that's desired, > they should pull the binary form of the certificate (there's an > interface for that), and use M2Crypto or PyOpenSSL to decode it in > general. This certificate has already been validated; the issue is > how to get critical information to the app so it can make > authorization decisions (like subjectAltName when the subject field is > empty). Reporting non-critical extensions like "extended key usage" > is nifty, but seems pointless. RFC 2818 """If a subjectAltName extension of type dNSName is present, that MUST be used as the identity. Otherwise, the (most specific) Common Name field in the Subject field of the certificate MUST be used. Although the use of the Common Name is existing practice, it is deprecated and Certification Authorities are encouraged to use the dNSName instead. """ This is from an explanation of how to do hostname verification when doing HTTPS requests. HTTPS clients MUST do this in order to be compliant. Is an HTTPS client not in your list of use cases? """In general, HTTP/TLS requests are generated by dereferencing a URI. As a consequence, the hostname for the server is known to the client. 
If the hostname is available, the client MUST check it against the server's identity as presented in the server's Certificate message, in order to prevent man-in-the-middle attacks.""" I really don't understand why you would not expose all data in the certificate. It seems totally obvious. The data is there for a reason. I want the subjectAltName. Probably other people want other stuff. Why cripple it? Please include it all. -- Christopher Armstrong International Man of Twistery http://radix.twistedmatrix.com/ http://twistedmatrix.com/ http://canonical.com/ From martin at v.loewis.de Thu Sep 6 19:03:32 2007 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 06 Sep 2007 19:03:32 +0200 Subject: [Python-Dev] frozenset C API? In-Reply-To: <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> References: <-4762611594645938717@unknownmsgid> <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> Message-ID: <46E032E4.6050300@v.loewis.de> > I really don't understand why you would not expose all data in the > certificate. You mean, providing the entire certificate as a blob? That is planned (although perhaps not implemented). Or do you mean "expose all data in a structured manner". BECAUSE IT'S NOT POSSIBLE. Sorry for shouting, but people don't ever get the notion of "extension". > It seems totally obvious. The data is there for a reason. > I want the subjectAltName. Probably other people want other stuff. Why > cripple it? Please include it all. That's not possible. You can get the whole thing as a blob, and then you have to decode it yourself if something you want is not decoded. 
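For the subjectAltName case raised in this thread, the RFC 2818 rule quoted above could be sketched against the dict format posted earlier. Several things here are assumptions rather than anything settled in the thread: the 'subjectAltName' key and its (type, value) pair layout are hypothetical, get_attr_by_type takes the attribute name as a plain string, server_identities is a made-up helper, and "most specific" Common Name is taken to mean the last one in DN order.

```python
def get_attr_by_type(dn, attr_type):
    """Return all values of attr_type across all RDNs in dn.

    dn is a tuple of RDNs, each RDN a tuple of (name, value) pairs.
    The result is empty if no such attribute is present.
    """
    return [value for rdn in dn for (name, value) in rdn if name == attr_type]


def server_identities(cert):
    """Names to compare against the hostname, following RFC 2818."""
    # Assumed layout: (('DNS', 'host.example.org'), ('email', ...), ...)
    alt_names = cert.get('subjectAltName', ())
    dns_names = [value for (kind, value) in alt_names if kind == 'DNS']
    if dns_names:
        # dNSName entries, when present, take precedence over the CN.
        return dns_names
    common_names = get_attr_by_type(cert.get('subject', ()), 'commonName')
    # Assumption: the last CN in the DN is the most specific one.
    return common_names[-1:]


cert = {
    'subject': ((('countryName', 'US'),),
                (('organizationName', 'Python Software Foundation'),),
                (('commonName', 'somemachine.python.org'),)),
    'version': 2,
}
print(server_identities(cert))
```

Matching the returned names against the requested hostname (including wildcards) is a separate step and is left out of this sketch.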
Regards, Martin From janssen at parc.com Thu Sep 6 19:15:41 2007 From: janssen at parc.com (Bill Janssen) Date: Thu, 6 Sep 2007 10:15:41 PDT Subject: [Python-Dev] frozenset C API? In-Reply-To: <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> References: <-4762611594645938717@unknownmsgid> <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> Message-ID: <07Sep6.101542pdt."57996"@synergy1.parc.xerox.com> > RFC 2818 > > """If a subjectAltName extension of type dNSName is present, that MUST > be used as the identity. Otherwise, the (most specific) Common Name > field in the Subject field of the certificate MUST be used. Although > the use of the Common Name is existing practice, it is deprecated and > Certification Authorities are encouraged to use the dNSName instead. > """ Yes, subjectAltName is a big one. But I think it may be the only extension I'll expose. The issue is that I don't see a generic way of mapping extension X into Python data structure Y; each one needs to be handled specially. If you can see a way around this, please speak up! > I really don't understand why you would not expose all data in the > certificate. It seems totally obvious. The data is there for a reason. > I want the subjectAltName. Probably other people want other stuff. Why > cripple it? Please include it all. I intend to "include it all", by giving you a way to pull the full DER form of the certificate into Python. But a number of fields in the certificate have nothing to do with authorization, like the signature, which has already been used for validation. So I don't intend to try to convert them into Python-friendly forms. 
Applications which want to use that information already need to have a more powerful library, like M2Crypto or PyOpenSSL, available; they can simply work with the DER form of the certificate. Bill From janssen at parc.com Thu Sep 6 19:25:39 2007 From: janssen at parc.com (Bill Janssen) Date: Thu, 6 Sep 2007 10:25:39 PDT Subject: [Python-Dev] frozenset C API? In-Reply-To: <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> References: <-4762611594645938717@unknownmsgid> <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> Message-ID: <07Sep6.102547pdt."57996"@synergy1.parc.xerox.com> By the way, I think the hostname matching provisions of 2818 (which is, after all, only an informational RFC, not a standard) are poorly thought out. Many machines have more hostnames than you can shake a stick at, and often provide certs with the wrong hostname in them (usually because they have no way to determine what the *right* hostname is, from inside that machine). Bill From glyph at divmod.com Thu Sep 6 19:31:55 2007 From: glyph at divmod.com (glyph at divmod.com) Date: Thu, 06 Sep 2007 17:31:55 -0000 Subject: [Python-Dev] frozenset C API? In-Reply-To: <46E032E4.6050300@v.loewis.de> References: <-4762611594645938717@unknownmsgid> <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> <46E032E4.6050300@v.loewis.de> Message-ID: <20070906173155.21185.610867885.divmod.xquotient.7040@joule.divmod.com> On 05:03 pm, martin at v.loewis.de wrote: >>I really don't understand why you would not expose all data in the >>certificate. > >You mean, providing the entire certificate as a blob? 
That is planned >(although perhaps not implemented). > >Or do you mean "expose all data in a structured manner". BECAUSE >IT'S NOT POSSIBLE. Sorry for shouting, but people don't ever get the >notion of "extension". "structure" is a relative term. A typical way to deal with extensions unknown to the implementation is to provide ways to deal with the *extension-specific* parts of the data in question, c.f. http://java.sun.com/j2se/1.4.2/docs/api/java/security/cert/X509Extension.html Exposing the entire certificate object as a blob so that some *other* library could parse it *again* seems like just giving up. However, as to the specific issue of subjectAltName which Chris first mentioned: if HTTPS isn't an important specification to take into account while designing an SSL layer for Python, then I can't imagine what is. subjectAltName should be directly supported regardless of how it deals with unknown extensions. >>It seems totally obvious. The data is there for a reason. >>I want the subjectAltName. Probably other people want other stuff. Why >>cripple it? Please include it all. >That's not possible. You can get the whole thing as a blob, and then >you have to decode it yourself if something you want is not decoded. Something very much like that is certainly possible, and has been done in numerous other places (including the Java implementation linked above). Providing a semantically rich interface to every possible X509 extension is of course ridiculous, but I don't think that's what anyone is actually proposing here. From glyph at divmod.com Thu Sep 6 19:45:18 2007 From: glyph at divmod.com (glyph at divmod.com) Date: Thu, 06 Sep 2007 17:45:18 -0000 Subject: [Python-Dev] frozenset C API? 
In-Reply-To: <07Sep6.101542pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> <07Sep6.101542pdt."57996"@synergy1.parc.xerox.com> Message-ID: <20070906174518.21185.1342025567.divmod.xquotient.7082@joule.divmod.com> On 05:15 pm, janssen at parc.com wrote: >>RFC 2818 >> >>"""If a subjectAltName extension of type dNSName is present, that MUST >>be used as the identity. Otherwise, the (most specific) Common Name >>field in the Subject field of the certificate MUST be used. Although >>the use of the Common Name is existing practice, it is deprecated and >>Certification Authorities are encouraged to use the dNSName instead. >>""" >Yes, subjectAltName is a big one. But I think it may be the only >extension I'll expose. The issue is that I don't see a generic way >of mapping extension X into Python data structure Y; each one needs to >be handled specially. If you can see a way around this, please speak >up! Well, I can't speak for Chris, but that will certainly make *me* happier :). >I intend to "include it all", by giving you a way to pull the full DER >form of the certificate into Python. But a number of fields in the >certificate have nothing to do with authorization, like the signature, >which has already been used for validation. So I don't intend to try >to convert them into Python-friendly forms. Applications which want to >use that information already need to have a more powerful library, like >M2Crypto or PyOpenSSL, available; they can simply work with the DER >form >of the certificate. When you say "the full DER form", are you simply referring to the full blob, or a broken-down representation by key and by extension? 
This begs the question: M2Crypto and PyOpenSSL already do what you're proposing to do, as far as I can tell, and are, as you say, "more powerful". There are issues with each (and issues with the GNU TLS bindings too, which I notice you didn't mention...) Speaking of issues, PyOpenSSL, for example, does not expose subjectAltName :). This has been a long thread, so I may have missed posts where this was already discussed, but even if I'm repeating this, I think it deserves to be beaten to death. *Why* are you trying to bring the number of (potentially buggy, incomplete) Python SSL bindings to 4, rather than adopting one of the existing ones and implementing a simple wrapper on top of it? PyOpenSSL, in particular, is both a popular de-facto standard *and* almost completely unmaintained; python's standard library could absorb/improve it with little fuss. From janssen at parc.com Thu Sep 6 20:15:16 2007 From: janssen at parc.com (Bill Janssen) Date: Thu, 6 Sep 2007 11:15:16 PDT Subject: [Python-Dev] frozenset C API? In-Reply-To: <20070906174518.21185.1342025567.divmod.xquotient.7082@joule.divmod.com> References: <-4762611594645938717@unknownmsgid> <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> <07Sep6.101542pdt."57996"@synergy1.parc.xerox.com> <20070906174518.21185.1342025567.divmod.xquotient.7082@joule.divmod.com> Message-ID: <07Sep6.111518pdt."57996"@synergy1.parc.xerox.com> > When you say "the full DER form", are you simply referring to the full > blob, or a broken-down representation by key and by extension? The full blob. > This begs the question: M2Crypto and PyOpenSSL already do what you're > proposing to do, as far as I can tell, and are, as you say, "more > powerful". 
I'm trying to give the application the ability to do some level of authorization without requiring either of those packages. Like being able to tell who's on the other side of the connection :-). Right now, I think the right fields to expose are "subject" (I see little point to exposing "issuer"), "notAfter" (you're always guaranteed to be after "notBefore", or the cert wouldn't validate, so I see little point to exposing that, but "notAfter" can be used after the connection has been established), subjectAltName if present, and perhaps the certificate's serial number. I don't see how the other fields in the cert can be profitably used. Anything else you want, you can pull over the DER blob and look into it. > PyOpenSSL, in particular, is both a popular de-facto > standard *and* almost completely unmaintained; python's standard library > could absorb/improve it with little fuss. Good idea, go for it! A full wrapper for OpenSSL is beyond the scope of my ambition; I'm simply trying to add a simple fix to what's already in the standard library. Bill From radix at twistedmatrix.com Thu Sep 6 20:18:21 2007 From: radix at twistedmatrix.com (Christopher Armstrong) Date: Thu, 6 Sep 2007 14:18:21 -0400 Subject: [Python-Dev] frozenset C API? In-Reply-To: <46E032E4.6050300@v.loewis.de> References: <-4762611594645938717@unknownmsgid> <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> <46E032E4.6050300@v.loewis.de> Message-ID: <60ed19d40709061118k2755bbfdkecb3eae94cf22f93@mail.gmail.com> On 9/6/07, "Martin v. Löwis" wrote: > You mean, providing the entire certificate as a blob? That is planned > (although perhaps not implemented). > > Or do you mean "expose all data in a structured manner". BECAUSE > IT'S NOT POSSIBLE.
Sorry for shouting, but people don't ever get the > notion of "extension". > > > It seems totally obvious. The data is there for a reason. > > I want the subjectAltName. Probably other people want other stuff. Why > > cripple it? Please include it all. > > That's not possible. You can get the whole thing as a blob, and then > you have to decode it yourself if something you want is not decoded. Sorry, I guess I thought it was obvious. Please let me get at the bytes of just the unknown-to-ssl-module extension without forcing me to write an entire general ASN.1 certificate parser or use another (incomplete) one. Many extensions have simple data in them that is trivial to parse alone. -- Christopher Armstrong International Man of Twistery http://radix.twistedmatrix.com/ http://twistedmatrix.com/ http://canonical.com/ From guido at python.org Thu Sep 6 23:10:44 2007 From: guido at python.org (Guido van Rossum) Date: Thu, 6 Sep 2007 14:10:44 -0700 Subject: [Python-Dev] Google spreadsheet to collaborate on backporting Py3K stuff to 2.6 In-Reply-To: References: Message-ID: I've transferred everything from my spreadsheet to Neal's. On 9/5/07, Brett Cannon wrote: > Neal, Anthony, Thomas W., and I have a spreadsheet that was started to > keep track of what needs to be done in 2.6 > for Py3K transitioning: > http://spreadsheets.google.com/pub?key=pCKY4oaXnT81FrGo3ShGHGg . I am > opening the spreadsheet up to everyone so that others can help > maintain it. > > There is a sheet in the Python 3000 Tasks spreadsheet that should be > merged into this spreadsheet and then deleted. If anyone wants to > help with that it would be great (once something has been moved from > "Python 3000 Tasks" to "Python 2 -> 3 transition" just delete it from > "Python 3000 Tasks"). > > Because Neal created this spreadsheet he is the only one who can open > editing to everyone.
If you would like to have edit abilities to the > spreadsheet just reply to this email saying you want an invite and I > will add you manually (and if you want a different address added just > say so). > > -Brett > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From janssen at parc.com Thu Sep 6 23:55:06 2007 From: janssen at parc.com (Bill Janssen) Date: Thu, 6 Sep 2007 14:55:06 PDT Subject: [Python-Dev] Python access to data fields of SSL connection peer certificate Message-ID: <07Sep6.145509pdt."57996"@synergy1.parc.xerox.com> After a great deal of discussion, under the Subject line of "frozenset C API?" (you may have missed it :-), I'm coming to the conclusion that in revealing the fields of an SSL certificate, less is more. From one of the messages in that thread: I'm trying to give the application the ability to do some level of authorization without requiring either of those packages. Like being able to tell who's on the other side of the connection :-). Right now, I think the right fields to expose are "subject" (I see little point to exposing "issuer"), "notAfter" (you're always guaranteed to be after "notBefore", or the cert wouldn't validate, so I see little point to exposing that, but "notAfter" can be used after the connection has been established), subjectAltName if present, and perhaps the certificate's serial number. Remember that the cert has already been validated, so I don't see how the other fields in the cert can be profitably used for authorization and/or accounting, which is the purpose of this interface.
Anything else you want, you can pull over the DER blob and look into it with some other crypto package; I'll provide a way to pull the full binary form of the certificate into Python as a bytes string (as soon as the bytes API gets backported into the trunk). Under those rules, the samples in the current documentation would look like {'notAfter': 'May 8 23:59:59 2009 GMT', 'serialNumber': '6A4AC31B3110E6EB48F0FC51A39A171F', 'subject': ((('serialNumber', u'2497886'),), (('1.3.6.1.4.1.311.60.2.1.3', u'US'),), (('1.3.6.1.4.1.311.60.2.1.2', u'Delaware'),), (('countryName', u'US'),), (('postalCode', u'94043'),), (('stateOrProvinceName', u'California'),), (('localityName', u'Mountain View'),), (('streetAddress', u'487 East Middlefield Road'),), (('organizationName', u'VeriSign, Inc.'),), (('organizationalUnitName', u'Production Security Services'),), (('organizationalUnitName', u'Terms of use at www.verisign.com/rpa (c)06'),), (('commonName', u'www.verisign.com'),))} and {'notAfter': 'Feb 16 16:54:50 2013 GMT', 'serialNumber': 'FFAA4ADBF570818D', 'subject': ((('countryName', u'US'),), (('stateOrProvinceName', u'Delaware'),), (('localityName', u'Wilmington'),), (('organizationName', u'Python Software Foundation'),), (('organizationalUnitName', u'SSL'),), (('commonName', u'somemachine.python.org'),))} The server cert at https://www.dcl.hpi.uni-potsdam.de/ would look like {'notAfter': 'Mar 17 13:02:27 2008 GMT', 'serialNumber': '2567F168000300000678', 'subject': ((('countryName', u'DE'),), (('stateOrProvinceName', u'Brandenburg'),), (('localityName', u'Potsdam'),), (('organizationName', u'Hasso-Plattner-Institut'),), (('organizationalUnitName', u'Operating Systems & Middleware'),), (('commonName', u'www.dcl.hpi.uni-potsdam.de'),)), 'subjectAltName': ('DNS:www.dcl.hpi.uni-potsdam.de', 'DNS:www', 'DNS:dfw', 'DNS:dfw.dcl.hpi.uni-potsdam.de', 'IP Address:141.89.224.164')} Thanks to Martin for suggesting it. 
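For illustration only: a minimal sketch of how an application might read host names out of a dictionary shaped like the samples above, preferring subjectAltName and falling back to commonName as RFC 2818 describes. The helper name dns_names and the trimmed sample dicts are made up for this example, and the 'DNS:...' string layout is taken from the draft output shown here, not from any released API.

```python
def dns_names(cert):
    """Return the DNS host names found in a certificate dict.

    Assumes the draft layout shown in the samples above: 'subjectAltName'
    is a tuple of 'TYPE:value' strings, and 'subject' is a tuple of RDNs,
    each of which is a tuple of (key, value) pairs.
    """
    prefix = 'DNS:'
    names = [entry[len(prefix):]
             for entry in cert.get('subjectAltName', ())
             if entry.startswith(prefix)]
    if names:
        return names
    # No dNSName entries: fall back to the commonName field(s) of the subject.
    return [value
            for rdn in cert.get('subject', ())
            for key, value in rdn
            if key == 'commonName']


# Trimmed, hypothetical copies of two of the sample dicts above.
psf_cert = {
    'notAfter': 'Feb 16 16:54:50 2013 GMT',
    'subject': ((('countryName', 'US'),),
                (('commonName', 'somemachine.python.org'),)),
}
hpi_cert = {
    'notAfter': 'Mar 17 13:02:27 2008 GMT',
    'subject': ((('countryName', 'DE'),),
                (('commonName', 'www.dcl.hpi.uni-potsdam.de'),)),
    'subjectAltName': ('DNS:www.dcl.hpi.uni-potsdam.de', 'DNS:www',
                       'IP Address:141.89.224.164'),
}

print(dns_names(psf_cert))
print(dns_names(hpi_cert))
```

Entries that are not dNSName values (such as the 'IP Address:' item) are skipped by this sketch; an application that cares about them would need its own handling.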
Bill From brett at python.org Fri Sep 7 04:34:42 2007 From: brett at python.org (Brett Cannon) Date: Thu, 6 Sep 2007 19:34:42 -0700 Subject: [Python-Dev] PEP 362: Signature objects Message-ID: Neal Becker over on python-3000 said that the Boost people could use this. Figured it was time to present it officially to the list to see if I can get it added for 2.6/3.0. The implementation in the sandbox works in both 2.6 and 3.0 out of the box (no 2to3 necessary) so feel free to play with it. --------------------------------------------------------- Abstract ======== Python has always supported powerful introspection capabilities, including that for functions and methods (for the rest of this PEP the word "function" refers to both functions and methods). Taking a function object, you can fully reconstruct the function's signature. Unfortunately it is a little unruly having to look at all the different attributes to pull together complete information for a function's signature. This PEP proposes an object representation for function signatures. This should help facilitate introspection on functions for various uses (e.g., decorators). The introspection information contains all possible information about the parameters in a signature (including Python 3.0 features). This object, though, is not meant to replace existing ways of introspection on a function's signature. The current solutions are there to make Python's execution work in an efficient manner. The proposed object representation is only meant to help make application code have an easier time to query a function on its signature. Signature Object ================ The overall signature of an object is represented by the Signature object. This object is to store a `Parameter object`_ for each parameter in the signature. It is also to store any information about the function itself that is pertinent to the signature. A Signature object has the following structure attributes: * name : str Name of the function. 
This is not fully qualified because function objects for methods do not know the class they are contained within. This makes functions and methods indistinguishable from one another when passed to decorators, preventing proper creation of a fully qualified name. * var_args : str Name of the variable positional parameter (i.e., ``*args``), if present, or the empty string. * var_kw_args : str Name of the variable keyword parameter (i.e., ``**kwargs``), if present, or the empty string. * var_annotations: dict(str, object) Dict that contains the annotations for the variable parameters. The keys are the names of the variable parameters, with the annotations as values. If an annotation does not exist for a variable parameter then the key does not exist in the dict. * parameters : list(Parameter) List of the parameters of the function as represented by Parameter objects in the order of its definition (keyword-only arguments are in the order listed by ``code.co_varnames``). * bind(\*args, \*\*kwargs) -> dict(str, Parameter) Create a mapping from arguments to parameters. The keys are the names of the parameter that an argument maps to with the value being the value the parameter would have if this function was called with the given arguments. The Signature object is stored in the ``__signature__`` attribute of a function. When it is to be created is discussed in `Open Issues`_. Parameter Object ================ A function's signature is made up of several parameters. The variety of parameter kinds in Python is large and rich and continues to grow. Parameter objects represent any possible parameter. Originally the plan was to represent parameters using a list of parameter names on the Signature object along with various dicts keyed on parameter names to disseminate the various pieces of information one can know about a parameter. But the decision was made to incorporate all information about a parameter in a single object so as to make extending the information easier.
This was originally put forth by Talin and the preferred form of Guido (as discussed at the 2006 Google Sprint). The structure of the Parameter object is: * name : (str | tuple(str)) The name of the parameter as a string if it is not a tuple. If the argument is a tuple then a tuple of strings is used. * position : int The position of the parameter within the signature of the function (zero-indexed). For keyword-only parameters the position value is arbitrary while not conflicting with positional parameters. The suggestion of setting the attribute to None or -1 to represent keyword-only parameters was rejected to prevent variable type usage and as a possible point of errors, respectively. * has_default : bool True if the parameter has a default value, else False. * default_value : object The default value for the parameter, if present, else the attribute does not exist. This is done so that the attribute is not accidentally used if no default value is set as any default value could be a legitimate default value itself. * keyword_only : bool True if the parameter is keyword-only, else False. * has_annotation : bool True if the parameter has an annotation, else False. * annotation Set to the annotation for the parameter. If ``has_annotation`` is False then the attribute does not exist to prevent accidental use. Implementation ============== An implementation can be found in Python's sandbox [#impl]_. There is a function named ``signature()`` which returns the value stored on the ``__signature__`` attribute if it exists, else it creates the Signature object for the function and sets ``__signature__``. For methods this is stored directly on the im_func function object since that is what decorators work with. Open Issues =========== When to construct the Signature object? --------------------------------------- The Signature object can either be created in an eager or lazy fashion. In the eager situation, the object can be created during creation of the function object. 
In the lazy situation, one would pass a function object to a function and that would generate the Signature object and store it to ``__signature__`` if needed, and then return the value of ``__signature__``. Should ``Signature.bind`` return Parameter objects as keys? ----------------------------------------------------------- Instead of returning a dict with keys consisting of the names of the parameters, would it be more useful to instead use Parameter objects? The name of the argument can easily be retrieved from the key (and the name would be used as the hash for a Parameter object). Provide a mapping of parameter name to Parameter object? -------------------------------------------------------- While providing access to the parameters in order is handy, it might also be beneficial to provide a way to retrieve Parameter objects from a Signature object based on the parameter's name. The choice of access style (sequential/iteration or mapping) will influence how the parameters are stored internally and whether ``__getitem__`` accepts strings or integers. One possible compromise is to have ``__getitem__`` provide mapping support and have ``__iter__`` return Parameter objects based on their ``position`` attribute. This allows for getting the sequence of Parameter objects easily by using the ``__iter__`` method on a Signature object along with the sequence constructor (e.g., ``list`` or ``tuple``). Remove ``has_*`` attributes? ---------------------------- If an EAFP approach to the API is taken, both ``has_annotation`` and ``has_default`` are unneeded as the respective ``annotation`` and ``default_value`` attributes are simply not set. It's simply a question of whether to have an EAFP or LBYL interface. Have ``var_args`` and ``var_kw_args`` default to ``None``? ------------------------------------------------------------ It has been suggested by Fred Drake that these two attributes have a value of ``None`` instead of empty strings when they do not exist.
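As a rough illustration of the EAFP versus LBYL question raised under "Remove ``has_*`` attributes?", here is a minimal sketch. The Parameter class below is a hypothetical stand-in written for this example, not the sandbox implementation; it only mimics the rule that ``default_value`` exists solely when a default was given, and the helper name default_or_marker is likewise invented.

```python
class Parameter:
    """Hypothetical stand-in for the Parameter object described above."""

    def __init__(self, name, position, **optional):
        self.name = name
        self.position = position
        self.has_default = 'default_value' in optional
        if self.has_default:
            # The attribute is only set when a default exists, because any
            # value (including None) could be a legitimate default.
            self.default_value = optional['default_value']


params = [Parameter('a', 0), Parameter('b', 1, default_value=None)]

# LBYL: the flag can be tested inline, for instance in a list comprehension.
with_defaults = [p.name for p in params if p.has_default]


# EAFP: rely on the attribute being absent and catch the AttributeError.
def default_or_marker(param, marker='<required>'):
    try:
        return param.default_value
    except AttributeError:
        return marker


print(with_defaults)
print([default_or_marker(p) for p in params])
```

The second parameter uses None as its default on purpose: it shows why a sentinel value stored in ``default_value`` could not stand in for "no default".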
Deprecate ``inspect.getargspec()`` and ``.formatargspec()``? ------------------------------------------------------------- Since the Signature object replicates the use of ``getargspec()`` from the ``inspect`` module it might make sense to deprecate it in 2.6. ``formatargspec()`` could also go if Signature objects gained a __str__ representation. The issue with that is that types such as ``int``, when used as annotations, do not lend themselves to output (e.g., ``"<type 'int'>"`` is the string representation for ``int``). The repr representation of types would need to change in order to make this reasonable. References ========== .. [#impl] pep362 directory in Python's sandbox (http://svn.python.org/view/sandbox/trunk/pep362/) Copyright ========= This document has been placed in the public domain. From oliphant.travis at ieee.org Fri Sep 7 07:18:07 2007 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri, 07 Sep 2007 00:18:07 -0500 Subject: [Python-Dev] Google spreadsheet to collaborate on backporting Py3K stuff to 2.6 In-Reply-To: References: Message-ID: Brett Cannon wrote: > Neal, Anthony, Thomas W., and I have a spreadsheet that was started to > keep track of what needs to be done in 2.6 > for Py3K transitioning: > http://spreadsheets.google.com/pub?key=pCKY4oaXnT81FrGo3ShGHGg . I am > opening the spreadsheet up to everyone so that others can help > maintain it. > > There is a sheet in the Python 3000 Tasks spreadsheet that should be > merged into this spreadsheet and then deleted. If anyone wants to > help with that it would be great (once something has been moved from > "Python 3000 Tasks" to "Python 2 -> 3 transition" just delete it from > "Python 3000 Tasks"). > > Because Neal created this spreadsheet he is the only one who can open > editing to everyone.
If you would like to have edit abilities to the > spreadsheet just reply to this email saying you want an invite and I > will add you manually (and if you want a different address added just > say so). I would like an invite. Thanks. -Travis From janssen at parc.com Fri Sep 7 22:55:57 2007 From: janssen at parc.com (Bill Janssen) Date: Fri, 7 Sep 2007 13:55:57 PDT Subject: [Python-Dev] any tips on malloc debugging? Message-ID: <07Sep7.135559pdt."57996"@synergy1.parc.xerox.com> I've been expanding the SSL test suite, and found something like this cropping up, not always, but maybe 30% of the time. So I run it under gdb, but the "szone_error" breakpoint never gets hit. Any other malloc debugging tips I should know about? (gdb) info break Num Type Disp Enb Address What 1 breakpoint keep y 0x900f2e56 (gdb) (gdb) run ./Lib/test/regrtest.py -R :4: -u all test_ssl Starting program: /local/python/trunk/src/python.exe ./Lib/test/regrtest.py -R :4: -u all test_ssl test_ssl [...] python.exe(22696,0xa000d000) malloc: *** error for object 0x650800: double free python.exe(22696,0xa000d000) malloc: *** set a breakpoint in szone_error to debug test test_ssl failed -- Traceback (most recent call last): File "/local/python/trunk/src/Lib/test/test_ssl.py", line 304, in testSSL3 CERTFILE2, CERTFILE3) File "/local/python/trunk/src/Lib/test/test_ssl.py", line 203, in serverParamsTest raise test_support.TestFailed("Unexpected SSL error: " + str(x)) TestFailed: Unexpected SSL error: (8, '_ssl.c:394: EOF occurred in violation of protocol') 1 test failed: test_ssl [23436 refs] Program exited with code 01. (gdb) Bill From guido at python.org Fri Sep 7 23:16:05 2007 From: guido at python.org (Guido van Rossum) Date: Fri, 7 Sep 2007 14:16:05 -0700 Subject: [Python-Dev] any tips on malloc debugging? In-Reply-To: <7018697645894217094@unknownmsgid> References: <7018697645894217094@unknownmsgid> Message-ID: I think there's a way to enable heavier malloc debugging than the normal --with-pydebug. 
You'll have to enable it manually by editing Python.h I believe. Though it may already be on if you define Py_DEBUG. (Is WITH_PYMALLOC always on?) There may also be a libmalloc that enables heavier debugging; the malloc man page would have info. On 9/7/07, Bill Janssen wrote: > I've been expanding the SSL test suite, and found something like this > cropping up, not always, but maybe 30% of the time. So I run it under > gdb, but the "szone_error" breakpoint never gets hit. Any other > malloc debugging tips I should know about? > > (gdb) info break > Num Type Disp Enb Address What > 1 breakpoint keep y 0x900f2e56 > (gdb) (gdb) run ./Lib/test/regrtest.py -R :4: -u all test_ssl > Starting program: /local/python/trunk/src/python.exe ./Lib/test/regrtest.py -R :4: -u all test_ssl > test_ssl > [...] > python.exe(22696,0xa000d000) malloc: *** error for object 0x650800: double free > python.exe(22696,0xa000d000) malloc: *** set a breakpoint in szone_error to debug > test test_ssl failed -- Traceback (most recent call last): > File "/local/python/trunk/src/Lib/test/test_ssl.py", line 304, in testSSL3 > CERTFILE2, CERTFILE3) > File "/local/python/trunk/src/Lib/test/test_ssl.py", line 203, in serverParamsTest > raise test_support.TestFailed("Unexpected SSL error: " + str(x)) > TestFailed: Unexpected SSL error: (8, '_ssl.c:394: EOF occurred in violation of protocol') > > 1 test failed: > test_ssl > [23436 refs] > > Program exited with code 01. > (gdb) > > Bill > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Fri Sep 7 23:19:52 2007 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 07 Sep 2007 23:19:52 +0200 Subject: [Python-Dev] any tips on malloc debugging?
In-Reply-To: <07Sep7.135559pdt."57996"@synergy1.parc.xerox.com> References: <07Sep7.135559pdt."57996"@synergy1.parc.xerox.com> Message-ID: <46E1C078.2030501@v.loewis.de> > I've been expanding the SSL test suite, and found something like this > cropping up, not always, but maybe 30% of the time. So I run it under > gdb, but the "szone_error" breakpoint never gets hit. Any other > malloc debugging tips I should know about? Is this a --with-pydebug build? If not, it should be. If that still does not give insights, I usually try valgrind (although usually with little success). Regards, Martin From jimjjewett at gmail.com Fri Sep 7 23:43:59 2007 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 7 Sep 2007 17:43:59 -0400 Subject: [Python-Dev] PEP 362: Signature objects In-Reply-To: References: Message-ID: Brett Cannon wrote: > A Signature object has the following structure attributes: > * name : str > Name of the function. This is not fully qualified because > function objects for methods do not know the class they are > contained within. This makes functions and methods > indistinguishable from one another when passed to decorators, > preventing proper creation of a fully qualified name. (1) Would this change with the new static __class__ attribute used for the new super? (2) What about functions without a name? Do you want to say str or NoneType, or is that assumed? (3) Is the Signature object live or frozen? (name is writable ... will the Signature object reflect the new name, or the name in use at the time it was created?) > * var_annotations: dict(str, object) > Dict that contains the annotations for the variable parameters. > The keys are of the variable parameter with values of the Is there a special key for the "->" returns annotation, or is that available as a separate property? > The structure of the Parameter object is: > * name : (str | tuple(str)) > The name of the parameter as a string if it is not a tuple. 
If > the argument is a tuple then a tuple of strings is used. What is used for unnamed arguments (typically provided by C)? I like None, but I see the arguments for both "" and missing attribute. > * position : int > The position of the parameter within the signature of the > function (zero-indexed). For keyword-only parameters the position > value is arbitrary while not conflicting with positional > parameters. Is this just a property/alias for signature.parameters.index(self) ? What should a "parameter" object not associated with a specific signature return? -1, None, or missing attribute? Is there a way to get the associated Signature, or is it "compiled out" when the Signature and its child Parameters are first constructed? (I think the position property is the only attribute that would use it, unless you want some of the other attributes -- like annotations -- to be live.) ... I would also like to see a * value : object attribute; this would be missing on most functions, but might be filled in on a Signature representing a closure, or an execution frame. > When to construct the Signature object? > --------------------------------------- > The Signature object can either be created in an eager or lazy > fashion. In the eager situation, the object can be created during > creation of the function object. Since most code doesn't need it, I would expect it to be optimized out at least as often as docstrings are. > In the lazy situation, one would > pass a function object to a function and that would generate the > Signature object and store it to ``__signature__`` if > needed, and then return the value of ``__signature__``. Why store it? Do you expect many use cases to need the signature more than once (but not to save it themselves)? If there is a __signature__ attribute on a object, you have to specify whether it can be replaced, which parts of it are writable, how that will affect the function's own behavior, etc. 
I also suspect it might become a source of heisenbugs, like the "reference leaks" that were really DUMMY items in a dict. If the Signature is just a snapshot no longer attached to the original function, then people won't expect changes to the Signature to affect the callable. > Should ``Signature.bind`` return Parameter objects as keys? (see above) If a Signature is a snapshot (rather than a live part of the function), then it might make more sense to just add a value attribute to Parameter objects. > Provide a mapping of parameter name to Parameter object? > -------------------------------------------------------- > While providing access to the parameters in order is handy, it might > also be beneficial to provide a way to retrieve Parameter objects from > a Signature object based on the parameter's name. Which style of > access (sequential/iteration or mapping) will influence how the > parameters are stored internally and whether __getitem__ accepts > strings or integers. I think it should accept both. What storage mechanism to use is an internal detail that should be left to the implementation. I wouldn't expect Signature inspection to be inside a tight loop anyhow, unless it were part of a Generic Function dispatch engine ... and those authors (just PJE?) can optimize on what they actually need. > Remove ``has_*`` attributes? > ---------------------------- > If an EAFP approach to the API is taken, Please leave them; it is difficult to catch Exceptions in a list comprehension. > Have ``var_args`` and ``_var_kw_args`` default to ``None``? Makes sense to me, particularly since it should probably be consistent with function name, and that should probably be None. -jJ From janssen at parc.com Fri Sep 7 23:44:21 2007 From: janssen at parc.com (Bill Janssen) Date: Fri, 7 Sep 2007 14:44:21 PDT Subject: [Python-Dev] any tips on malloc debugging? 
In-Reply-To: <46E1C078.2030501@v.loewis.de> References: <07Sep7.135559pdt."57996"@synergy1.parc.xerox.com> <46E1C078.2030501@v.loewis.de> Message-ID: <07Sep7.144424pdt."57996"@synergy1.parc.xerox.com> > Is this a --with-pydebug build? If not, it should be. Yes. > If that still does not give insights, I usually try valgrind > (although usually with little success). Actually, Google is your friend here. The message in malloc is misleading; set a breakpoint in malloc_printf instead. Bill From janssen at parc.com Fri Sep 7 23:53:36 2007 From: janssen at parc.com (Bill Janssen) Date: Fri, 7 Sep 2007 14:53:36 PDT Subject: [Python-Dev] OpenSSL thread safety when reading files? Message-ID: <07Sep7.145340pdt."57996"@synergy1.parc.xerox.com> I'm seeing a number of malloc (actully, free) errors, now that I'm pounding on the OpenSSL server/client setup with lots of server threads and client threads. They all look like either (gdb) bt #0 0x9010b807 in malloc_printf () #1 0x900058ad in szone_free () #2 0x90005588 in free () #3 0x9194e508 in CRYPTO_free () #4 0x91993e77 in ERR_clear_error () #5 0x919b1884 in PEM_X509_INFO_read_bio () #6 0x9197a692 in X509_load_cert_crl_file () #7 0x9197a80e in by_file_ctrl () #8 0x919d6e2e in X509_STORE_load_locations () [...] or (much more frequently) (gdb) bt #0 0x9010b807 in malloc_printf () #1 0x900058ad in szone_free () #2 0x90005588 in free () #3 0x9194e508 in CRYPTO_free () #4 0x91993e77 in ERR_clear_error () #5 0x949fcf11 in SSL_CTX_use_certificate_chain_file () [...] Always in ERR_clear_error(), always from some frame that's reading a certificate file for some purpose. If I disable Py_BEGIN_ALLOW_THREADS/Py_END_ALLOW_THREADS around the places where the C code reads the certificate files, all these free errors go away. ERR_clear_error() is supposed to be thread-safe; it operates on a per-thread error state structure (which I make sure is initialized in my C code). 
But it sure looks like the client and server threads are both working with the same error state. Bill From janssen at parc.com Sat Sep 8 00:10:10 2007 From: janssen at parc.com (Bill Janssen) Date: Fri, 7 Sep 2007 15:10:10 PDT Subject: [Python-Dev] OpenSSL thread safety when reading files? In-Reply-To: <07Sep7.145340pdt."57996"@synergy1.parc.xerox.com> References: <07Sep7.145340pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep7.151015pdt."57996"@synergy1.parc.xerox.com> > I'm seeing a number of malloc (actully, free) errors, now that I'm > pounding on the OpenSSL server/client setup with lots of server > threads and client threads. They all look like either The issue seems to be that we assume OpenSSL is thread-safe (that is, we call Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS), but the _ssl.c code never did what was necessary to support that assumption. See http://www.openssl.org/docs/crypto/threads.html#DESCRIPTION. My analysis is that we need to add lock and unlock functions to the OpenSSL initialization code we currently use, which looks like this: /* Init OpenSSL */ SSL_load_error_strings(); SSLeay_add_ssl_algorithms(); Or, just not allow threads, which seems wrong. Bill From trentm at activestate.com Sat Sep 8 00:37:55 2007 From: trentm at activestate.com (Trent Mick) Date: Fri, 07 Sep 2007 15:37:55 -0700 Subject: [Python-Dev] [PEPs] Email addresses in PEPs? In-Reply-To: <4335d2c40708201232s19b10c69ye44d39351a4da97d@mail.gmail.com> References: <18121.47310.218893.540750@montanaro.dyndns.org> <4335d2c40708201232s19b10c69ye44d39351a4da97d@mail.gmail.com> Message-ID: <46E1D2C3.5030705@activestate.com> David Goodger wrote: > On 8/20/07, Brett Cannon wrote: >> I believe email addresses are automatically obfuscated as part of the >> HTML generation process, but one of the PEP editors can correct me if >> I am wrong. > > Yes, email addresses are obfuscated in PEPs. 
>
> For example, in PEPs 0 & 12, my address is encoded as
> "goodger at python.org" (the "@" is changed to " at " and
> further obfuscated from there). More tricks could be played, but that
> would only decrease the usefulness of addresses for legitimate
> purposes.

If some would find it useful, here is a snippet of code that
obfuscates email addresses for HTML as done by Markdown (a
text-to-html markup translator). It randomly encodes each character as
a hex or decimal HTML entity (roughly 10% raw, 45% hex, 45% dec). The
email still appears normally in the browser, but is pretty obtuse when
slicing and dicing the raw HTML.

Would others find this useful in pep2html.py?

-------------------
from random import random

def _encode_email_address(self, addr):
    # Input: an email address, e.g. "foo at example.com"
    #
    # Output: the email address as a mailto link, with each character
    #   of the address encoded as either a decimal or hex entity, in
    #   the hopes of foiling most address harvesting spam bots. E.g.:
    #
    #   foo@exa
    #   mple.com
    #
    # Based on a filter by Matthew Wickline, posted to the BBEdit-Talk
    # mailing list:
    chars = [_xml_encode_email_char_at_random(ch)
             for ch in "mailto:" + addr]
    # Strip the mailto: from the visible part.
    addr = '<a href="%s">%s</a>' \
           % (''.join(chars), ''.join(chars[7:]))
    return addr

def _xml_encode_email_char_at_random(ch):
    r = random()
    # Roughly 10% raw, 45% hex, 45% dec.
    # '@' *must* be encoded. I [John Gruber] insist.
    if r > 0.9 and ch != "@":
        return ch
    elif r < 0.45:
        # The [1:] is to drop leading '0': 0x63 -> x63
        return '&#%s;' % hex(ord(ch))[1:]
    else:
        return '&#%s;' % ord(ch)
-------------------

--
Trent Mick
trentm at activestate.com
From janssen at parc.com Sat Sep 8 00:56:08 2007
From: janssen at parc.com (Bill Janssen)
Date: Fri, 7 Sep 2007 15:56:08 PDT
Subject: [Python-Dev] OpenSSL thread safety when reading files?
In-Reply-To: <07Sep7.151015pdt."57996"@synergy1.parc.xerox.com> References: <07Sep7.145340pdt."57996"@synergy1.parc.xerox.com> <07Sep7.151015pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep7.155615pdt."57996"@synergy1.parc.xerox.com> > My analysis is that we need to add lock and unlock functions to the > OpenSSL initialization code we currently use Yep, this seems to fix the problem. I'm now able to re-enable Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS, and still get a clean run: (gdb) run Starting program: /local/python/trunk/src/python.exe ./Lib/test/regrtest.py -R :4: -u all test_ssl test_ssl /local/python/trunk/src/Lib/test/test_ssl.py:247: DeprecationWarning: socket.ssl() is deprecated. Use ssl.sslsocket() instead. ssl_sock = socket.ssl(s) beginning 9 repetitions 123456789 ......... 1 test OK. [30009 refs] Program exited normally. (gdb) Bill From janssen at parc.com Sat Sep 8 01:09:09 2007 From: janssen at parc.com (Bill Janssen) Date: Fri, 7 Sep 2007 16:09:09 PDT Subject: [Python-Dev] working with Python threads from C extension module? Message-ID: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com> Reading through the C API documentation, I find: ``This is done so that dynamically loaded extensions compiled with thread support enabled can be loaded by an interpreter that was compiled with disabled thread support.'' I've currently got the set-up-SSL-threading code in _ssl.c surrounded by a "#ifdef HAVE_THREAD" bracket. It sounds like that might not be sufficient. It sounds like I need a runtime test for thread availability, instead, like this: #ifdef HAVE_THREAD if (PyEval_ThreadsInitialized()) _setup_ssl_threads(); #endif Seem right? So what happens when someone loads the _ssl module, initializes the threads, and tries to use SSL? It's going to start failing again. I think I need my own version of Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS, don't I? 
Which also checks to see if the SSL threading support has been initialized, in addition to the Python threading support. Something like #define SSL_ALLOW_THREADS {if (_ssl_locks != NULL) { Py_BEGIN_ALLOW_THREADS }} #define SSL_DISALLOW_THREADS {if (_ssl_locks != NULL) { Py_BEGIN_ALLOW_THREADS }} Any comments? Bill From janssen at parc.com Sat Sep 8 01:20:35 2007 From: janssen at parc.com (Bill Janssen) Date: Fri, 7 Sep 2007 16:20:35 PDT Subject: [Python-Dev] working with Python threads from C extension module? In-Reply-To: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com> References: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep7.162040pdt."57996"@synergy1.parc.xerox.com> > #define SSL_ALLOW_THREADS {if (_ssl_locks != NULL) { Py_BEGIN_ALLOW_THREADS }} > #define SSL_DISALLOW_THREADS {if (_ssl_locks != NULL) { Py_BEGIN_ALLOW_THREADS }} I'd forgotten how convoluted Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS were. Anyone have any other suggestions about how to do this? Raise an error if loaded in a non-threaded environment, then used in a threaded environment? Dynamic initialization of threading? Bill From janssen at parc.com Sat Sep 8 01:31:06 2007 From: janssen at parc.com (Bill Janssen) Date: Fri, 7 Sep 2007 16:31:06 PDT Subject: [Python-Dev] working with Python threads from C extension module? In-Reply-To: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com> References: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep7.163111pdt."57996"@synergy1.parc.xerox.com> > So what happens when someone loads the _ssl module, initializes the > threads, and tries to use SSL? It's going to start failing again. I Which turns out to be exactly what test_ssl.py does. I'm tempted to have the _ssl module call PyEval_InitThreads(). Would that be kosher? 
Bill From guido at python.org Sat Sep 8 01:46:13 2007 From: guido at python.org (Guido van Rossum) Date: Fri, 7 Sep 2007 16:46:13 -0700 Subject: [Python-Dev] working with Python threads from C extension module? In-Reply-To: <6121031531930291931@unknownmsgid> References: <6121031531930291931@unknownmsgid> Message-ID: Well, one shouldn't be bothering with threads unless the user intends to create threads. So I think it's not kosher. Once threads are initialized, everything runs a tad slower because the GIL manipulations actually cost time (even if there are no other threads). On 9/7/07, Bill Janssen wrote: > > So what happens when someone loads the _ssl module, initializes the > > threads, and tries to use SSL? It's going to start failing again. I > > Which turns out to be exactly what test_ssl.py does. I'm tempted > to have the _ssl module call PyEval_InitThreads(). Would that be kosher? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From janssen at parc.com Sat Sep 8 01:57:40 2007 From: janssen at parc.com (Bill Janssen) Date: Fri, 7 Sep 2007 16:57:40 PDT Subject: [Python-Dev] working with Python threads from C extension module? In-Reply-To: References: <6121031531930291931@unknownmsgid> Message-ID: <07Sep7.165746pdt."57996"@synergy1.parc.xerox.com> > Well, one shouldn't be bothering with threads unless the user intends > to create threads. So I think it's not kosher. Once threads are > initialized, everything runs a tad slower because the GIL > manipulations actually cost time (even if there are no other threads). I think that doing it in _ssl.c might be OK; it would only happen when the user loaded that extension module. In any case, I'm going to do it that way till we figure out a better solution. The alternatives right now are (1) let OpenSSL step all over itself (and potentially other things), or (2) remove the Py_BEGIN_ALLOW_THREADS on SSL context reads and writes. 
> On 9/7/07, Bill Janssen wrote: > > > So what happens when someone loads the _ssl module, initializes the > > > threads, and tries to use SSL? It's going to start failing again. I > > > > Which turns out to be exactly what test_ssl.py does. I'm tempted > > to have the _ssl module call PyEval_InitThreads(). Would that be kosher? The problem is the sequencing of the loading of the extension module, compared to when the user gets around to initializing threading. If we want to keep it kosher, we need a way to hook into PyEval_InitThreads() so that it will call the thread initialization routines of other dynamically loaded libraries that have already been loaded. Or a way to have Py_BEGIN_ALLOW_THREADS take into account that there may be more than one thread-dependent thing to check on. Bill From skip at pobox.com Sat Sep 8 02:04:11 2007 From: skip at pobox.com (skip at pobox.com) Date: Fri, 7 Sep 2007 19:04:11 -0500 Subject: [Python-Dev] [PEPs] Email addresses in PEPs? In-Reply-To: <46E1D2C3.5030705@activestate.com> References: <18121.47310.218893.540750@montanaro.dyndns.org> <4335d2c40708201232s19b10c69ye44d39351a4da97d@mail.gmail.com> <46E1D2C3.5030705@activestate.com> Message-ID: <18145.59131.323002.910688@montanaro.dyndns.org> Trent> If some would find it useful, here is a snippet of code that Trent> obfuscates email addresses for HTML as done by Markdown (a Trent> text-to-html markup translator). It randomly encodes each Trent> charater as a hex or decimal HTML entity (roughly 10% raw, 45% Trent> hex, 45% dec). Aren't most spammers' scrapers going to be intelligent enough by now (several years since they first arrived on the scene) to "see through" these sorts of common obfuscations? 
Skip From brett at python.org Sat Sep 8 02:59:35 2007 From: brett at python.org (Brett Cannon) Date: Fri, 7 Sep 2007 17:59:35 -0700 Subject: [Python-Dev] PEP 362: Signature objects In-Reply-To: References: Message-ID: On 9/7/07, Jim Jewett wrote: > > A Signature object has the following structure attributes: > > > * name : str > > Name of the function. This is not fully qualified because > > function objects for methods do not know the class they are > > contained within. This makes functions and methods > > indistinguishable from one another when passed to decorators, > > preventing proper creation of a fully qualified name. > > (1) Would this change with the new static __class__ attribute used > for the new super? > I don't know enough about the super implementation to know. If you can figure out the class from the function object alone then sure, this can change. > (2) What about functions without a name? Do you want to say str or > NoneType, or is that assumed? > What functions don't have a name? Even lambdas have the name ''. > (3) Is the Signature object live or frozen? (name is writable ... > will the Signature object reflect the new name, or the name in use at > the time it was created?) > They are currently one-time creation objects. One could change it to use properties and do the look up dynamically by caching the function object. But I currently have it implemented as all created in __init__ and then just left alone. > > * var_annotations: dict(str, object) > > Dict that contains the annotations for the variable parameters. > > The keys are of the variable parameter with values of the > > Is there a special key for the "->" returns annotation, or is that > available as a separate property? > Oops, that didn't get into the PEP for some reason. The Signature object has ``has_annotation``/``annotation`` attributes for the 'return' annotation. 
> > The structure of the Parameter object is: > > > * name : (str | tuple(str)) > > The name of the parameter as a string if it is not a tuple. If > > the argument is a tuple then a tuple of strings is used. > > What is used for unnamed arguments (typically provide by C)? I like > None, but I see the arguments for both "" and missing attribute. > It's open for debate. I didn't even think about functions not having __name__ set. Basically whatever people want to go with for var_args and var_kw_args. > > * position : int > > The position of the parameter within the signature of the > > function (zero-indexed). For keyword-only parameters the position > > value is arbitrary while not conflicting with positional > > parameters. > > Is this just a property/alias for signature.parameters.index(self) ? > Assuming that 'self' refers to some parameter, yes. > What should a "parameter" object not associated with a specific > signature return? -1, None, or missing attribute? > This is not an option as it must be specified by the Parameter constructor. A Parameter object should not exist without belonging to a Signature object. That's why neither Signature nor Parameter have their constructors specified; the signature() function is the only way you should cause the construction of either object. > Is there a way to get the associated Signature, or is it "compiled > out" when the Signature and its child Parameters are first > constructed? (I think the position property is the only attribute > that would use it, unless you want some of the other attributes -- > like annotations -- to be live.) There is currently no way to work backwards from a Parameter object to its parent Signature. It could be added if people wanted. > > ... > > I would also like to see a > > * value : object > > attribute; this would be missing on most functions, but might be > filled in on a Signature representing a closure, or an execution > frame. What for? 
How does either have bearing on the call signature of a function? > > > > When to construct the Signature object? > > --------------------------------------- > > > The Signature object can either be created in an eager or lazy > > fashion. In the eager situation, the object can be created during > > creation of the function object. > > Since most code doesn't need it, I would expect it to be optimized out > at least as often as docstrings are. > > > In the lazy situation, one would > > pass a function object to a function and that would generate the > > Signature object and store it to ``__signature__`` if > > needed, and then return the value of ``__signature__``. > > Why store it? Do you expect many use cases to need the signature more > than once (but not to save it themselves)? Because you can use these with decorators to allow introspection redirection:: def dec(fxn): def inner(*args, **kwargs): return fxn(*args, **kwargs) sig = signature(fxn) inner.__signature__ = sig return inner > > If there is a __signature__ attribute on a object, you have to specify > whether it can be replaced, It can. > which parts of it are writable, Any of it. > how that > will affect the function's own behavior, etc. It won't. > I also suspect it might > become a source of heisenbugs, like the "reference leaks" that were > really DUMMY items in a dict. > > If the Signature is just a snapshot no longer attached to the original > function, then people won't expect changes to the Signature to affect > the callable. > They are just snapshots unless people really want them to be live for some reason. > > Should ``Signature.bind`` return Parameter objects as keys? > > (see above) If a Signature is a snapshot (rather than a live part of > the function), then it might make more sense to just add a value > attribute to Parameter objects. > Why? You might make several calls to bind() and thus setting what a Parameter object would be bound to should be considered a temporary thing. 
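For anyone reading this thread later: PEP 362 eventually shipped as inspect.signature() in Python 3.3. The sketch below assumes that later API, not the 2007 prototype under discussion, and illustrates the two points just made: a decorator redirecting introspection through __signature__, and bind() handing back a fresh mapping on every call rather than storing values on Parameter objects. The names greet and passthrough are made up for illustration.

```python
import inspect


def greet(name, greeting="hello"):
    # Hypothetical function used only for illustration.
    return "%s, %s" % (greeting, name)


def passthrough(fxn):
    # Same shape as the decorator in the message above, but written
    # against the inspect.signature() API that shipped in Python 3.3+.
    def inner(*args, **kwargs):
        return fxn(*args, **kwargs)
    inner.__signature__ = inspect.signature(fxn)
    return inner


wrapped = passthrough(greet)
sig = inspect.signature(wrapped)  # honours wrapped.__signature__

# Each bind() call returns its own BoundArguments object; nothing is
# stored on the Signature or on its Parameter objects.
first = sig.bind("world")
second = sig.bind("python", greeting="hi")

print(sig)
print(dict(first.arguments))
print(dict(second.arguments))
```

Without the __signature__ assignment, introspecting the wrapper would report only (*args, **kwargs), which is the redirection problem the decorator example is meant to address.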
> > Provide a mapping of parameter name to Parameter object? > > -------------------------------------------------------- > > > While providing access to the parameters in order is handy, it might > > also be beneficial to provide a way to retrieve Parameter objects from > > a Signature object based on the parameter's name. Which style of > > access (sequential/iteration or mapping) will influence how the > > parameters are stored internally and whether __getitem__ accepts > > strings or integers. > > I think it should accept both. > > What storage mechanism to use is an internal detail that should be > left to the implementation. I wouldn't expect Signature inspection to > be inside a tight loop anyhow, unless it were part of a Generic > Function dispatch engine ... and those authors (just PJE?) can > optimize on what they actually need. > I guess I can just try to do ``item.__index__()`` and if that triggers an AttributeError assume it is a name. > > Remove ``has_*`` attributes? > > ---------------------------- > > > If an EAFP approach to the API is taken, > > Please leave them; it is difficult to catch Exceptions in a list comprehension. > You can also just use hasattr() if needed. > > Have ``var_args`` and ``_var_kw_args`` default to ``None``? > > Makes sense to me, particularly since it should probably be consistent > with function name, and that should probably be None. So another vote for None. Thanks for the feedback, Jim! From guido at python.org Sat Sep 8 05:15:18 2007 From: guido at python.org (Guido van Rossum) Date: Fri, 7 Sep 2007 20:15:18 -0700 Subject: [Python-Dev] PEP 362: Signature objects In-Reply-To: References: Message-ID: On 9/7/07, Brett Cannon wrote: > On 9/7/07, Jim Jewett wrote: > > > A Signature object has the following structure attributes: > > > > > * name : str > > > Name of the function. This is not fully qualified because > > > function objects for methods do not know the class they are > > > contained within. 
This makes functions and methods > > > indistinguishable from one another when passed to decorators, > > > preventing proper creation of a fully qualified name. > > > > (1) Would this change with the new static __class__ attribute used > > for the new super? > > I don't know enough about the super implementation to know. If you > can figure out the class from the function object alone then sure, > this can change. I don't think it'll work -- the __class__ variable is only available *within* the function, not when one is introspecting the function object. Also, it is only available for functions that reference 'super' (or __class__ directly). As __class__ is passed into the function call as a "cell" variable (like references to variables from outer scopes), its mere presense slows down the call somewhat, hence it is only present when used. (BTW, it is not an attribute.) BTW there's a good reason why functions don't have easier access to the class in which they are defined: functions can easily be moved or shared between classes. The __class__ variable only records the class inside which the function is defined lexically, if any. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From hrvoje.niksic at avl.com Sat Sep 8 12:36:48 2007 From: hrvoje.niksic at avl.com (Hrvoje =?UTF-8?Q?Nik=C5=A1i=C4=87?=) Date: Sat, 08 Sep 2007 12:36:48 +0200 Subject: [Python-Dev] working with Python threads from C extension module? In-Reply-To: <07Sep7.162040pdt."57996"@synergy1.parc.xerox.com> References: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com> <07Sep7.162040pdt."57996"@synergy1.parc.xerox.com> Message-ID: <1189247808.11322.212.camel@localhost> On Fri, 2007-09-07 at 16:20 -0700, Bill Janssen wrote: > > #define SSL_ALLOW_THREADS {if (_ssl_locks != NULL) { Py_BEGIN_ALLOW_THREADS }} > > #define SSL_DISALLOW_THREADS {if (_ssl_locks != NULL) { Py_BEGIN_ALLOW_THREADS }} > > I'd forgotten how convoluted Py_BEGIN_ALLOW_THREADS and > Py_END_ALLOW_THREADS were. 
Anyone have any other suggestions about > how to do this? Be convoluted yourself and do this: #define PySSL_BEGIN_ALLOW_THREADS { if (_ssl_locks) { Py_BEGIN_ALLOW_THREADS #define PySSL_END_ALLOW_THREADS Py_END_ALLOW_THREADS } } (Untested, but I think it should work.) From martin at v.loewis.de Sat Sep 8 14:41:38 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 08 Sep 2007 14:41:38 +0200 Subject: [Python-Dev] Buildbot upgraded to 0.7.5 Message-ID: <46E29882.8080707@v.loewis.de> I just upgraded the buildbot master to 0.7.5. If you see any problems, please let me know. Neal: buildbot now supports reloading of configurations, without interrupting builds. Try "buildbot reconfig" when you make a change (certain changes would still require a restart). Regards, Martin From janssen at parc.com Sat Sep 8 17:18:35 2007 From: janssen at parc.com (Bill Janssen) Date: Sat, 8 Sep 2007 08:18:35 PDT Subject: [Python-Dev] working with Python threads from C extension module? In-Reply-To: <1189247808.11322.212.camel@localhost> References: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com> <07Sep7.162040pdt."57996"@synergy1.parc.xerox.com> <1189247808.11322.212.camel@localhost> Message-ID: <07Sep8.081836pdt."57996"@synergy1.parc.xerox.com> > Be convoluted yourself and do this: > > #define PySSL_BEGIN_ALLOW_THREADS { if (_ssl_locks) { Py_BEGIN_ALLOW_THREADS > #define PySSL_END_ALLOW_THREADS Py_END_ALLOW_THREADS } } > > (Untested, but I think it should work.) Yes, that had occurred to me. 
We want the code inside the braces still to run if the locks aren't held, so something more like #define PySSL_BEGIN_ALLOW_THREADS { \ PyThreadState *_save; \ if (_ssl_locks_count>0) {_save = PyEval_SaveThread();} #define PySSL_BLOCK_THREADS if (_ssl_locks_count>0){PyEval_RestoreThread(_save)}; #define PySSL_UNBLOCK_THREADS if (_ssl_locks_count>0){_save = PyEval_SaveThread()}; #define PySSL_END_ALLOW_THREADS if (_ssl_locks_count>0){PyEval_RestoreThread(_save);} \ } would do the trick. Unfortunately, this doesn't deal with the macro behaviour. The user has "turned on" threading; they expect reads and writes to yield the GIL so that other threads can make progress. But the fact that threading has been "turned on" after the SSL module has been initialized, means that threads don't work inside the SSL code. So the user's understanding of the system will be broken. No, I don't see any good way to fix this except to add a callback chain inside PyThread_init_thread, which is run down when threads are initialized. Any module which needs to set up threads registers itself on that chain, and gets called as part of PyThread_init_thread. But I'm far from the smartest person on this list :-), so perhaps someone else will see a good solution. This has got to be a problem with other extension modules linked to libraries which have their own threading abstractions. Bill From gjcarneiro at gmail.com Sat Sep 8 17:37:07 2007 From: gjcarneiro at gmail.com (Gustavo Carneiro) Date: Sat, 8 Sep 2007 16:37:07 +0100 Subject: [Python-Dev] working with Python threads from C extension module? 
In-Reply-To: <7088780289868241160@unknownmsgid> References: <1189247808.11322.212.camel@localhost> <7088780289868241160@unknownmsgid> Message-ID: On 08/09/2007, Bill Janssen wrote: > > > Be convoluted yourself and do this: > > > > #define PySSL_BEGIN_ALLOW_THREADS { if (_ssl_locks) { > Py_BEGIN_ALLOW_THREADS > > #define PySSL_END_ALLOW_THREADS Py_END_ALLOW_THREADS } } > > > > (Untested, but I think it should work.) > > Yes, that had occurred to me. We want the code inside the braces > still to run if the locks aren't held, so something more like > > #define PySSL_BEGIN_ALLOW_THREADS { \ > PyThreadState *_save; \ > if (_ssl_locks_count>0) {_save = > PyEval_SaveThread();} > #define PySSL_BLOCK_THREADS if > (_ssl_locks_count>0){PyEval_RestoreThread(_save)}; > #define PySSL_UNBLOCK_THREADS if (_ssl_locks_count>0){_save = > PyEval_SaveThread()}; > #define PySSL_END_ALLOW_THREADS if > (_ssl_locks_count>0){PyEval_RestoreThread(_save);} \ > } > > would do the trick. Unfortunately, this doesn't deal with the macro > behaviour. The user has "turned on" threading; they expect reads and > writes to yield the GIL so that other threads can make progress. But > the fact that threading has been "turned on" after the SSL module has > been initialized, means that threads don't work inside the SSL code. > So the user's understanding of the system will be broken. > > No, I don't see any good way to fix this except to add a callback > chain inside PyThread_init_thread, which is run down when threads are > initialized. Any module which needs to set up threads registers itself > on that chain, and gets called as part of PyThread_init_thread. But > I'm far from the smartest person on this list :-), so perhaps someone > else will see a good solution. I think this is a helpful additional tool to solve threading problems. 
Doesn't solve everything, but it certainly helps :-) For instance, one thing it doesn't solve is when a library being wrapped can be initialized with multithreading support, but only allows such initialization as a very first API call; you can't initialize threading at any arbitrary time during application runtime. Unfortunately I don't think there is any sane way to fix this problem :-( This has got to be a problem with other extension modules linked to > libraries which have their own threading abstractions. Yes. Another problem is that python extensions may not wish to incur performance penalty of python threading calls. For instance, pyorbit has these macros: #define pyorbit_gil_state_ensure() (PyEval_ThreadsInitialized()? (PyGILState_Ensure()) : 0) #define pyorbit_gil_state_release(state) G_STMT_START { \ if (PyEval_ThreadsInitialized()) \ PyGILState_Release(state); \ } G_STMT_END #define pyorbit_begin_allow_threads \ G_STMT_START { \ PyThreadState *_save = NULL; \ if (PyEval_ThreadsInitialized()) \ _save = PyEval_SaveThread(); #define pyorbit_end_allow_threads \ if (PyEval_ThreadsInitialized()) \ PyEval_RestoreThread(_save); \ } G_STMT_END They all call PyEval_ThreadsInitialized() before doing anything thread related to save some performance. The other reason to do it this way is that the Python API calls themselves abort if they are called with threading not initialized. It would be nice the upstream python GIL macros were more like pyorbit and became no-ops when threading is not enabled. -- Gustavo J. A. M. Carneiro INESC Porto, Telecommunications and Multimedia Unit "The universe is always one step beyond logic." -- Frank Herbert -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20070908/5d3a01f6/attachment.htm
From janssen at parc.com Sat Sep 8 18:51:41 2007
From: janssen at parc.com (Bill Janssen)
Date: Sat, 8 Sep 2007 09:51:41 PDT
Subject: [Python-Dev] which SSL client protocols work with which server protocols?
Message-ID: <07Sep8.095142pdt."57996"@synergy1.parc.xerox.com>

I've now built a framework in test_ssl to test all client protocols
(SSL2, SSL3, SSL23, TLS1) against all server protocols, and here's what
I've come up with. Servers are along the X axis, and clients are on the
Y axis. "Yes" means that that client protocol can talk to that server
protocol.

        SSL2    SSL3    SSL23   TLS1

SSL2    yes     no      no      no
SSL3    yes     yes     yes     no
SSL23   no      no      yes     no
TLS1    no      no      yes     yes

I'm a bit surprised by the facts that (1) an SSL2 client can't connect to
an SSL23 server, and (2) an SSL23 client can *only* connect to an SSL23
server.

Can anyone verify that these combos (the results of testing with the
Python framework) are indeed to be expected?

Bill
From janssen at parc.com Sat Sep 8 20:41:42 2007
From: janssen at parc.com (Bill Janssen)
Date: Sat, 8 Sep 2007 11:41:42 PDT
Subject: [Python-Dev] working with Python threads from C extension module?
In-Reply-To: <07Sep8.081836pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep7.160910pdt."57996"@synergy1.parc.xerox.com>
 <07Sep7.162040pdt."57996"@synergy1.parc.xerox.com>
 <1189247808.11322.212.camel@localhost>
 <07Sep8.081836pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep8.114147pdt."57996"@synergy1.parc.xerox.com>

> This has got to be a problem with other extension modules linked to
> libraries which have their own threading abstractions.

Sure enough, sqlite3 simply assumes threads (won't build without
them), and turns them on if it's used (by calling
PyThread_get_thread_ident(), which in turn calls
PyThread_init_thread()).
Bill From janssen at parc.com Sat Sep 8 20:57:33 2007 From: janssen at parc.com (Bill Janssen) Date: Sat, 8 Sep 2007 11:57:33 PDT Subject: [Python-Dev] testing in a Python --without-threads build Message-ID: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com> I can't seem to run the regression tests in a --without-threads build. Might be interesting to configure a buildbot this way to keep ourselves honest. Bill From janssen at parc.com Sat Sep 8 21:19:26 2007 From: janssen at parc.com (Bill Janssen) Date: Sat, 8 Sep 2007 12:19:26 PDT Subject: [Python-Dev] testing in a Python --without-threads build In-Reply-To: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com> References: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep8.121928pdt."57996"@synergy1.parc.xerox.com> > I can't seem to run the regression tests in a --without-threads build. > Might be interesting to configure a buildbot this way to keep > ourselves honest. Because regrtest.py was importing test_socket_ssl without catching the ImportError exception: % ./python.exe ./Lib/test/regrtest.py test_socket_ssl test_socket_ssl test_socket_ssl skipped -- No module named thread 1 test skipped: test_socket_ssl Traceback (most recent call last): File "./Lib/test/regrtest.py", line 1190, in main() File "./Lib/test/regrtest.py", line 416, in main e = _ExpectedSkips() File "./Lib/test/regrtest.py", line 1111, in __init__ from test import test_socket_ssl File "/local/python/trunk/src/Lib/test/test_socket_ssl.py", line 8, in import threading File "/local/python/trunk/src/Lib/threading.py", line 6, in import thread ImportError: No module named thread % So, is this an "expected skip" or not? 
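The traceback above comes from a bare top-level import of a test module that itself needs thread. A minimal sketch of guarding that kind of import, assuming a hypothetical helper named optional_import (it is not part of regrtest.py) and the modern importlib module rather than anything available in 2007:

```python
import importlib


def optional_import(name):
    """Return the named module, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def expected_skips(candidates):
    # Collect the tests whose module imports cleanly and reports
    # skip_expected; a module that fails to import is left out
    # instead of crashing the harness.
    expected = set()
    for test_name in candidates:
        module = optional_import(test_name)
        if module is not None and getattr(module, "skip_expected", False):
            expected.add(test_name)
    return expected
```

Whether a test that cannot even be imported should count as an expected skip is a separate policy question; the sketch only keeps the harness itself from falling over.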
Bill

From martin at v.loewis.de Sat Sep 8 21:40:52 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 08 Sep 2007 21:40:52 +0200
Subject: [Python-Dev] testing in a Python --without-threads build
In-Reply-To: <07Sep8.121928pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com> <07Sep8.121928pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46E2FAC4.2030900@v.loewis.de>

>> I can't seem to run the regression tests in a --without-threads build.
>> Might be interesting to configure a buildbot this way to keep
>> ourselves honest.
>
> Because regrtest.py was importing test_socket_ssl without catching the
> ImportError exception:

If that is the reason you cannot run it, then it seems it works just fine. There is nothing wrong with tests getting skipped.

> So, is this an "expected skip" or not?

No. IIUC, "expected skips" are a platform property. For your platform, support for threads is expected (whatever your platform is as long as it was built in this millennium).

Regards,
Martin

From janssen at parc.com Sat Sep 8 22:16:47 2007
From: janssen at parc.com (Bill Janssen)
Date: Sat, 8 Sep 2007 13:16:47 PDT
Subject: [Python-Dev] testing in a Python --without-threads build
In-Reply-To: <46E2FAC4.2030900@v.loewis.de>
References: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com> <07Sep8.121928pdt."57996"@synergy1.parc.xerox.com> <46E2FAC4.2030900@v.loewis.de>
Message-ID: <07Sep8.131648pdt."57996"@synergy1.parc.xerox.com>

> > Because regrtest.py was importing test_socket_ssl without catching the
> > ImportError exception:
>
> If that is the reason you cannot run it, then it seems it works just
> fine. There is nothing wrong with tests getting skipped.

It wasn't getting skipped, it was crashing the regression testing harness. test_unittest catches the ImportError, but this was imported directly from regrtest.py.

> > So, is this an "expected skip" or not?
>
> No.
IIUC, "expected skips" are a platform property. For your platform,
> support for threads is expected (whatever your platform is as long as
> it was built in this millennium).

OK. I'll put in a check for this. In fact, here's a patch:

Index: Lib/test/regrtest.py
===================================================================
--- Lib/test/regrtest.py (revision 58052)
+++ Lib/test/regrtest.py (working copy)
@@ -1108,7 +1108,6 @@
 class _ExpectedSkips:
     def __init__(self):
         import os.path
-        from test import test_socket_ssl
         from test import test_timeout
 
         self.valid = False
@@ -1122,8 +1121,13 @@
             if not os.path.supports_unicode_filenames:
                 self.expected.add('test_pep277')
 
-            if test_socket_ssl.skip_expected:
-                self.expected.add('test_socket_ssl')
+            try:
+                from test import test_socket_ssl
+            except ImportError:
+                pass
+            else:
+                if test_socket_ssl.skip_expected:
+                    self.expected.add('test_socket_ssl')
 
             if test_timeout.skip_expected:
                 self.expected.add('test_timeout')

Bill

From janssen at parc.com Sat Sep 8 22:28:22 2007
From: janssen at parc.com (Bill Janssen)
Date: Sat, 8 Sep 2007 13:28:22 PDT
Subject: [Python-Dev] what platforms require RAND_add() before using SSL?
Message-ID: <07Sep8.132823pdt."57996"@synergy1.parc.xerox.com>

There are some functions in _ssl.c for gathering randomness from a daemon, and adding that randomness to the pseudo-random number generator in SSL, before using SSL. There's a note there saying that "on some platform" this is necessary. Anyone know which platforms?

Bill

From janssen at parc.com Sat Sep 8 22:36:39 2007
From: janssen at parc.com (Bill Janssen)
Date: Sat, 8 Sep 2007 13:36:39 PDT
Subject: [Python-Dev] [Python-3000] 3.0 crypto
In-Reply-To: <07Sep8.123933pdt."58663"@synergy1.parc.xerox.com>
References: <07Sep8.123933pdt."58663"@synergy1.parc.xerox.com>
Message-ID: <07Sep8.133648pdt."57996"@synergy1.parc.xerox.com>

> We're already linking against the OpenSSL EVP libraries for hashlib
> (and against the OpenSSL SSL libraries for the SSL support).
It > wouldn't be hard to expose the EVP functions a bit more, essentially > as hash functions that return long (and reversible) hashes: > > encryptor = opensslevp.encryptor("AES-256-CBC", ...maybe some options...) > encryptor.update(...some plaintext...) Almost certainly this signature should be encryptor = opensslevp.encryptor("AES-256-CBC", KEY, ...options...) and correspondingly decryptor = opensslevp.decryptor("AES-256-CBC", KEY, ...options...) Bill From janssen at parc.com Sun Sep 9 02:19:30 2007 From: janssen at parc.com (Bill Janssen) Date: Sat, 8 Sep 2007 17:19:30 PDT Subject: [Python-Dev] can't run test_tcl remotely logged in on an OS X machine Message-ID: <07Sep8.171933pdt."57996"@synergy1.parc.xerox.com> "test_tcl" fails on me (OS X 10.4.10 on an Intel Mac, remotely logged in via SSH and X Windows): % test_tcl 2007-09-08 17:00:22.629 python.exe[4163] CFLog (0): CFMessagePort: bootstrap_register(): failed 1100 (0x44c), port = 0x3a03, name = 'Processes-0.58327041' See /usr/include/servers/bootstrap_defs.h for the error codes. 2007-09-08 17:00:22.630 python.exe[4163] CFLog (99): CFMessagePortCreateLocal(): failed to name Mach port (Processes-0.58327041) CFMessagePortCreateLocal failed (name = Processes-0.58327041 error = 0) Abort % This is on the trunk. Bill From nick.bastin at gmail.com Sun Sep 9 05:05:23 2007 From: nick.bastin at gmail.com (Nicholas Bastin) Date: Sat, 8 Sep 2007 23:05:23 -0400 Subject: [Python-Dev] testing in a Python --without-threads build In-Reply-To: <46E2FAC4.2030900@v.loewis.de> References: <46E2FAC4.2030900@v.loewis.de> Message-ID: <66d0a6e10709082005u4af353ebpbd0b7cd6c27db242@mail.gmail.com> Might expected skips instead be based on your current configuration instead of what someone statically decided what would be appropriate for your platform? Every new release I have to go through the 'unexpected skips' to determine that they're perfectly fine for how I configured python. 
It seems that we ought to provide a mechanism for querying python for how the build was configured (although for non-unittest cases, failing to import some modules is usually sufficient information - knowing why they fail probably doesn't matter).

On 9/8/07, "Martin v. Löwis" wrote:
> >> I can't seem to run the regression tests in a --without-threads build.
> >> Might be interesting to configure a buildbot this way to keep
> >> ourselves honest.
> >
> > Because regrtest.py was importing test_socket_ssl without catching the
> > ImportError exception:
>
> If that is the reason you cannot run it, then it seems it works just
> fine. There is nothing wrong with tests getting skipped.
>
> > So, is this an "expected skip" or not?
>
> No. IIUC, "expected skips" are a platform property. For your platform,
> support for threads is expected (whatever your platform is as long as
> it was built in this millennium).
>
> Regards,
> Martin

From martin at v.loewis.de Sun Sep 9 09:38:29 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 09 Sep 2007 09:38:29 +0200
Subject: [Python-Dev] what platforms require RAND_add() before using SSL?
In-Reply-To: <07Sep8.132823pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep8.132823pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46E3A2F5.9020305@v.loewis.de>

> There are some functions in _ssl.c for gathering randomness from a
> daemon, and adding that randomness to the pseudo-random number
> generator in SSL, before using SSL. There's a note there saying that
> "on some platform" this is necessary. Anyone know which platforms?

In general, anything that does not have /dev/[u]random; older Solaris releases and HP-UX in particular.
Regards,
Martin

From martin at v.loewis.de Sun Sep 9 09:41:30 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 09 Sep 2007 09:41:30 +0200
Subject: [Python-Dev] can't run test_tcl remotely logged in on an OS X machine
In-Reply-To: <07Sep8.171933pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep8.171933pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46E3A3AA.6030103@v.loewis.de>

> "test_tcl" fails on me (OS X 10.4.10 on an Intel Mac, remotely logged
> in via SSH and X Windows):
>
> % test_tcl
> 2007-09-08 17:00:22.629 python.exe[4163] CFLog (0): CFMessagePort: bootstrap_register(): failed 1100 (0x44c), port = 0x3a03, name = 'Processes-0.58327041'
> See /usr/include/servers/bootstrap_defs.h for the error codes.
> 2007-09-08 17:00:22.630 python.exe[4163] CFLog (99): CFMessagePortCreateLocal(): failed to name Mach port (Processes-0.58327041)
> CFMessagePortCreateLocal failed (name = Processes-0.58327041 error = 0)
> Abort
> %
>
> This is on the trunk.

That's no surprise, I would say: it seems you link against TkAqua (not X11 Tk); for that to work, you need a reference to WindowServer, which won't be available when logged in through SSL.

Regards,
Martin

From giszo at nyomi.hu Sun Sep 9 11:17:31 2007
From: giszo at nyomi.hu (Giszo)
Date: Sun, 9 Sep 2007 11:17:31 +0200
Subject: [Python-Dev] Porting python
Message-ID:

Hi!

I've tried to port Python (2.3.6 and 2.5.1) to my own OS. The compilation of the Python library was done after a few hours of work. When I try to run the compiled executable I get the error shown in the following screenshot: http://giszo.lame.hu/jshot/screens/screen31.png

After a little while of debugging I know that it fails while bootstrapping the exceptions, because the initializer function fails to get the "__builtin__" module. After adding debug printfs to the bltinmodule.c init code, it looks like the builtin module is initialized properly.
I'd like to ask for some help on where I should start checking the code to fix the error.

Thanks!

From martin at v.loewis.de Sun Sep 9 11:46:27 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 09 Sep 2007 11:46:27 +0200
Subject: [Python-Dev] Porting python
In-Reply-To:
References:
Message-ID: <46E3C0F3.6080701@v.loewis.de>

> I'd like to ask for some help on where I should start checking the code to fix the
> error.

Python searches possible candidate locations of the standard library for a landmark, see getpath.c; currently, the landmark is os.py. If it doesn't find the landmark, it complains.

Regards,
Martin

From janssen at parc.com Sun Sep 9 17:44:17 2007
From: janssen at parc.com (Bill Janssen)
Date: Sun, 9 Sep 2007 08:44:17 PDT
Subject: [Python-Dev] what platforms require RAND_add() before using SSL?
In-Reply-To: <46E3A2F5.9020305@v.loewis.de>
References: <07Sep8.132823pdt."57996"@synergy1.parc.xerox.com> <46E3A2F5.9020305@v.loewis.de>
Message-ID: <07Sep9.084424pdt."57996"@synergy1.parc.xerox.com>

> > There are some functions in _ssl.c for gathering randomness from a
> > daemon, and adding that randomness to the pseudo-random number
> > generator in SSL, before using SSL. There's a note there saying that
> > "on some platform" this is necessary. Anyone know which platforms?
>
> In general, anything that does not have /dev/[u]random;
> older Solaris releases and HP-UX in particular.

Thanks, I'll add that to the documentation.

Any ideas what the values of the "entropy" parameter to RAND_add() are like, or how they are derived? I did a rapid skim of RFC 1750, but didn't see it there.
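
As a rough illustration of the calls under discussion, here is a sketch using the `ssl` module as it exists in current Python 3 releases, which is an assumption and may not match the 2007 code. There, `RAND_add()` takes the seed bytes plus a float giving a lower bound on the entropy they contain, so 0.0 is always a safe value. The 32-byte seed and the use of `os.urandom()` as its source are arbitrary stand-ins, not a recommendation for platforms that lack /dev/[u]random.

```python
import os
import ssl

# Stand-in seed material; on a platform without /dev/[u]random the bytes
# would have to come from somewhere else (an entropy daemon, for example).
seed = os.urandom(32)

# 0.0 claims no entropy at all for the seed, the most conservative choice.
ssl.RAND_add(seed, 0.0)

# Truthy once the SSL pseudo-random number generator has enough randomness.
seeded = ssl.RAND_status()
print("PRNG sufficiently seeded:", seeded)
```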
Bill From janssen at parc.com Sun Sep 9 17:46:34 2007 From: janssen at parc.com (Bill Janssen) Date: Sun, 9 Sep 2007 08:46:34 PDT Subject: [Python-Dev] can't run test_tcl remotely logged in on an OS X machine In-Reply-To: <46E3A3AA.6030103@v.loewis.de> References: <07Sep8.171933pdt."57996"@synergy1.parc.xerox.com> <46E3A3AA.6030103@v.loewis.de> Message-ID: <07Sep9.084639pdt."57996"@synergy1.parc.xerox.com> > > "test_tcl" fails on me (OS X 10.4.10 on an Intel Mac, remotely logged > > in via SSH and X Windows): > > That's no surprise, I would say: it seems you link against TkAqua > (not X11 Tk); for that to work, you need a reference to WindowServer, > which won't be available when logged in through SSL. Actually, I think it literally *is* a surprise; if it were truly "no surprise", the testing harness would have caught it and moved on to the other tests. But if you mean, "no big deal", I agree. Bill From lukem at NetBSD.org Mon Sep 10 01:54:30 2007 From: lukem at NetBSD.org (Luke Mewburn) Date: Mon, 10 Sep 2007 09:54:30 +1000 Subject: [Python-Dev] Word size inconsistencies in C extension modules Message-ID: <20070909235430.GV25031@mewburn.net> Hi folks. While working on an in-house application that uses the curses module, we noticed that it didn't work as expected on an AIX system (powerpc 64-bit big-endian LP64), using python 2.3.5. On a hunch, I took a look through the _cursesmodule.c code and noticed the use of PyArg_ParseTuple()'s "l" decoding mode to retrieve a "long" from python into a C type (attr_t) that on AIX is an int. On 64-bit LP64 platforms, sizeof(long) > sizeof(int), so this doesn't quite work, especially on big-endian systems. Further research into curses shows that different platforms use a different underlying C type for the attr_t type (int, unsigned int, long, unsigned long), so changing the PyArg_ParseTuple() to using the "i" decoding mode probably wasn't portable. 
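
The failure mode described above can be simulated without a 64-bit big-endian machine. The sketch below uses `struct` to mimic storing an 8-byte C long through a pointer to a 4-byte int. The helper name `store_long_read_int` is made up for illustration, and the fixed sizes (8 and 4 bytes) are assumptions matching an LP64 platform.

```python
import struct


def store_long_read_int(value, byteorder):
    """Simulate writing an 8-byte long where a 4-byte int is expected.

    Hypothetical helper for illustration. byteorder is "big" or "little".
    """
    prefix = {"big": ">", "little": "<"}[byteorder]
    # What the "l" format unit writes on LP64: eight bytes.
    raw = struct.pack(prefix + "q", value)
    # The int variable only covers the first four of those bytes; the
    # other four land in whatever memory happens to follow it.
    return struct.unpack(prefix + "i", raw[:4])[0]


little = store_long_read_int(0x100, "little")
big = store_long_read_int(0x100, "big")
print(little, big)
```

On the simulated little-endian layout the low-order bytes come first, so small values appear to survive; on the big-endian layout the int sees only the high-order half, which is zero for any small value. That matches the observation that the bug is most visible on big-endian LP64 systems.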
I documented this problem and provided a patch that fixes it against the head of the svn trunk in http://bugs.python.org/issue1114 (because the problem appears to still exist in the latest code.) My workaround was to use a separate explicit C "long" to decode the value from python into, and then just assign that to the final value and hope that the type promotion does the right thing on the native platform.

My questions are:

(a) What's the "preferred" style in python extension modules of parsing a number from python into a C type, where the C type size may change on different platforms? Is my method of guessing what the largest common size will be (long, unsigned long, ...), reading into that, and assigning to the final type, acceptable?

(b) Is there a desire to see the standard python C extension modules cleaned up to use the answer to (a), especially where said modules may be susceptible to the word size problems I mentioned?

(64bit big-endian platforms such as powerpc and sparc64 are good for detecting word-size lossage)

cheers,
Luke.

From janssen at parc.com Mon Sep 10 03:41:32 2007
From: janssen at parc.com (Bill Janssen)
Date: Sun, 9 Sep 2007 18:41:32 PDT
Subject: [Python-Dev] tests expanded for SSL module -- other suggestions?
Message-ID: <07Sep9.184134pdt."57996"@synergy1.parc.xerox.com>

I'm looking for suggestions for other SSL module tests. Here's the result of running my (not yet checked-in) test_ssl.py module in verbose mode. I'm pretty happy with the codebase right now, and barring other tests, I'm ready to check it in and start on the 3.x patch (or perhaps the 2.3 package).

In the client/server tests, a new server thread is created for each test.
In the STARTTLS test, several messages are exchanged in the clear, then the client sends a STARTTLS message and after the server replies "OK", initiates the TLS handshake.

It would be nice to have an external HTTPS server on python.org that could be used for an HTTPS connection test. Is there one?

Bill

% ./python.exe ./Lib/test/regrtest.py -u all -v test_ssl
test_ssl
testCrucialConstants (test.test_ssl.BasicTests) ... ok
testParseCert (test.test_ssl.BasicTests) ...
{'notAfter': 'Feb 16 16:54:50 2013 GMT', 'subject': ((('countryName', u'US'),), (('stateOrProvinceName', u'Delaware'),), (('localityName', u'Wilmington'),), (('organizationName', u'Python Software Foundation'),), (('organizationalUnitName', u'SSL'),), (('commonName', u'somemachine.python.org'),))}
ok
testRAND (test.test_ssl.BasicTests) ...
RAND_status is 1 (sufficient randomness)
ok
testSSLconnect (test.test_ssl.BasicTests) ... ok
testEcho (test.test_ssl.ConnectedTests) ...
server: new connection from ('127.0.0.1', 51840)
server: connection cipher is now ('AES256-SHA', 'TLSv1/SSLv3', 256)
client: sending 'FOO\n'...
server: read 'FOO\n', sending back 'foo\n'...
client: read 'foo\n'
client: closing connection.
server: client closed connection
ok
testMalformedCert (test.test_ssl.ConnectedTests) ... ok
testMalformedKey (test.test_ssl.ConnectedTests) ... ok
testNULLcert (test.test_ssl.ConnectedTests) ... ok
testReadCert (test.test_ssl.ConnectedTests) ...
{'notAfter': 'Feb 16 16:54:50 2013 GMT', 'subject': ((('countryName', u'US'),), (('stateOrProvinceName', u'Delaware'),), (('localityName', u'Wilmington'),), (('organizationName', u'Python Software Foundation'),), (('organizationalUnitName', u'SSL'),), (('commonName', u'somemachine.python.org'),))}
Connection cipher is ('AES256-SHA', 'TLSv1/SSLv3', 256).
ok
testRudeShutdown (test.test_ssl.ConnectedTests) ... ok
testSSL2 (test.test_ssl.ConnectedTests) ...
SSLv2->SSLv2 CERT_NONE
SSLv2->SSLv2 CERT_OPTIONAL
SSLv2->SSLv2 CERT_REQUIRED
SSLv23->SSLv2 CERT_NONE
{SSLv3->SSLv2} CERT_NONE
{TLSv1->SSLv2} CERT_NONE
ok
testSSL23 (test.test_ssl.ConnectedTests) ...
{SSLv2->SSLv23} CERT_NONE
SSLv3->SSLv23 CERT_NONE
SSLv23->SSLv23 CERT_NONE
TLSv1->SSLv23 CERT_NONE
{SSLv2->SSLv23} CERT_OPTIONAL
SSLv3->SSLv23 CERT_OPTIONAL
SSLv23->SSLv23 CERT_OPTIONAL
TLSv1->SSLv23 CERT_OPTIONAL
{SSLv2->SSLv23} CERT_REQUIRED
SSLv3->SSLv23 CERT_REQUIRED
SSLv23->SSLv23 CERT_REQUIRED
TLSv1->SSLv23 CERT_REQUIRED
ok
testSSL3 (test.test_ssl.ConnectedTests) ...
SSLv3->SSLv3 CERT_NONE
SSLv3->SSLv3 CERT_OPTIONAL
SSLv3->SSLv3 CERT_REQUIRED
{SSLv2->SSLv3} CERT_NONE
{SSLv23->SSLv3} CERT_NONE
{TLSv1->SSLv3} CERT_NONE
ok
testSTARTTLS (test.test_ssl.ConnectedTests) ...
client: sending 'msg 1'...
server: new connection from ('127.0.0.1', 51870)
server: read 'msg 1', sending back 'msg 1'...
client: read 'msg 1' from server
client: sending 'MSG 2'...
server: read 'MSG 2', sending back 'msg 2'...
client: read 'msg 2' from server
client: sending 'STARTTLS'...
server: read STARTTLS from client, sending OK...
client: read 'OK\n' from server, starting TLS...
server: connection cipher is now ('AES256-SHA', 'TLSv1/SSLv3', 256)
client: sending 'MSG 3'...
server: read 'MSG 3', sending back 'msg 3'...
client: read 'msg 3' from server
client: sending 'msg 4'...
server: read 'msg 4', sending back 'msg 4'...
client: read 'msg 4' from server
client: closing connection.
server: client closed connection
ok
testTLS1 (test.test_ssl.ConnectedTests) ...
TLSv1->TLSv1 CERT_NONE
TLSv1->TLSv1 CERT_OPTIONAL
TLSv1->TLSv1 CERT_REQUIRED
{SSLv2->TLSv1} CERT_NONE
{SSLv3->TLSv1} CERT_NONE
{SSLv23->TLSv1} CERT_NONE
ok

----------------------------------------------------------------------
Ran 15 tests in 6.866s

OK
1 test OK.
CAUTION: stdout isn't compared in verbose mode: a test that passes in verbose mode may fail without it.
[23679 refs]

From pfdubois at gmail.com Mon Sep 10 07:30:14 2007
From: pfdubois at gmail.com (Paul Dubois)
Date: Sun, 9 Sep 2007 22:30:14 -0700
Subject: [Python-Dev] summaries not arriving
Message-ID:

The weekly summaries from the new bug tracker are disappearing somewhere between the tracker and python-dev. My attempt to post one by hand was rejected by python-dev-owner (Barry Warsaw?) without explanation. Perhaps he has bounced the others; emails to python-dev-owner result in an automated message suggesting that my mail may never be read so I don't know how to ask him.

As a small boy I once knew wrote, I must not use bad words. (:->

Paul Dubois

From martin at v.loewis.de Mon Sep 10 07:37:02 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 10 Sep 2007 07:37:02 +0200
Subject: [Python-Dev] Word size inconsistencies in C extension modules
In-Reply-To: <20070909235430.GV25031@mewburn.net>
References: <20070909235430.GV25031@mewburn.net>
Message-ID: <46E4D7FE.5090905@v.loewis.de>

> (a) What's the "preferred" style in python extension modules
> of parsing a number from python into a C type, where the
> C type size may change on different platforms?
> Is my method of guessing what the largest common size
> will be (long, unsigned long, ...), reading into that,
> and assigning to the final type, acceptable?

Yes, that's the best thing we have come up with.

You then have the issue of potential truncation on assignment: if the value passed fits into a long (say) but not an attr_t, it would be good if an error was raised. In the past, we have typically coded that ValueError explicitly after the ParseTuple call.

In principle, it is possible to deal with these in ParseTuple.
To do so: a) in configure.in, make a configure-time check to compute the size of the type, and possibly its signedness. b) in _cursesmodule.c, make a conditional define of ATTR_T_FMT, which would be either "i" or "l" (or #error if it's neither the size of int nor the size of long). Then rely on string concatenation in using that define. > (b) Is there a desire to see the standard python C extension > modules cleaned up to use the answer to (a), especially > where said modules may be susceptable to the word > size problems I mentioned? Most certainly. There shouldn't be that many places left, though; most have been fixed over the years already. I have a GCC patch which checks for correctness of ParseTuple calls (in terms of data size) if you are interested. Regards, Martin From glyph at divmod.com Mon Sep 10 07:58:52 2007 From: glyph at divmod.com (glyph at divmod.com) Date: Mon, 10 Sep 2007 05:58:52 -0000 Subject: [Python-Dev] Design and direction of the SSL module (was Re: frozenset C API?) In-Reply-To: <07Sep6.111518pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> <07Sep6.101542pdt."57996"@synergy1.parc.xerox.com> <20070906174518.21185.1342025567.divmod.xquotient.7082@joule.divmod.com> <07Sep6.111518pdt."57996"@synergy1.parc.xerox.com> Message-ID: <20070910055852.5579.246755534.divmod.xquotient.735@joule.divmod.com> Sorry for the late response. As always, I have a lot of other stuff going on at the moment, but I'm very interested in this subject. On 6 Sep, 06:15 pm, janssen at parc.com wrote: >>PyOpenSSL, in particular, is both a popular de-facto >>standard *and* almost completely unmaintained; python's standard >>library >>could absorb/improve it with little fuss. 
>
>Good idea, go for it! A full wrapper for OpenSSL is beyond the scope
>of my ambition; I'm simply trying to add a simple fix to what's
>already in the standard library.

I guess I'd like to know two things. One, what *is* the scope of your ambition? I feel silly for asking, because I am pretty sure that somewhere in the beginning of this thread I missed either a proposal, a PEP reference, or a ticket number, but I've poked around a little and I can't seem to find it. Can you provide a reference, or describe what it is you're trying to do?

Two, what's the scope of "the" plans for the SSL module in general for Python? I think I misinterpreted several things that you said as "the plan" rather than your own personal requirements: but if in reality, I can "go for it", I'd really like to help make the stdlib SSL module a really good, full-featured OpenSSL implementation for Python so we can have it elsewhere. (If I recall correctly you mentioned you'd like to use it with earlier Python versions as well...?)

Many of the things that you recommend using another SSL library for, like pulling out arbitrary extensions, are incredibly unwieldy or flat-out broken in these libraries. It's not that I mind going to a different source for this functionality; it's that in many cases, there *isn't* another source :). I think I might have said this already, but subjectAltName, for example, isn't exposed in any way by PyOpenSSL.

I didn't particularly want to start my own brand-new SSL wrapper project, and contributing to the actively-maintained stdlib implementation is a lot more appealing than forking the moribund PyOpenSSL. However, even with lots of help on the maintenance, converting the current SSL module into a complete SSL library is a lot of work. Here are the questions that I'd like answers to before starting to think seriously about it:

* Is this idea even congruent with the overall goals of other developers interested in SSL for Python?
  If not, I'm obviously barking up the wrong tree.
* Would it be possible to distribute as a separate library? (I think I remember Bill saying something about that already...)
* When would such work have to be completed by to fit into the 2.6 release? (I just want a rough estimate, here.)
* Should someone - and I guess by someone I mean me - write up a PEP describing this?

My own design for an SSL wrapper - although this is simply a Python layer around PyOpenSSL - is here:

http://twistedmatrix.com/trac/browser/trunk/twisted/internet/_sslverify.py

This isn't really complete - in particular, the documentation is lacking, and it can't implement the stuff PyOpenSSL is missing - but I definitely like the idea of having objects for DNs, certificates, CRs, keys, key pairs, and the ubiquitous certificate-plus-matching-private-key-in-one-file that you need to run an HTTPS server :). If I am going to write a PEP, it will look a lot like that file.

_sslverify was originally designed for a system that does lots of automatic signing, so I am particularly interested in it being easy to implement a method like PrivateCertificate.signCertificateRequest - it's always such a pain to get all the calls for signing a CR in any given library *just so*.

>>This begs the question: M2Crypto and PyOpenSSL already do what you're
>>proposing to do, as far as I can tell, and are, as you say, "more
>>powerful".

To clarify my point here, when I say that they "already do" what you're doing, what I mean is, they already wrap SSL, and you are trying to wrap SSL :).

>I'm trying to give the application the ability to do some level of
>authorization without requiring either of those packages.

I'd say "why wouldn't you want to require either of those packages?" but actually, I know why you wouldn't want to, and it's that they're bad. So, given that we don't want to require them, wouldn't it be nice if we didn't need to require them at all? :).
>Like being >able to tell who's on the other side of the connection :-). Right >now, I think the right fields to expose are I don't quite understand what you mean by "right" fields. Right fields for what use case? This definitely isn't "right" for what I want to use SSL for. > "subject" (I see little point to exposing "issuer"), This is a good example of what I mean. For HTTPS, the relationship between the subject and the issuer is moot, but in my own projects, the relationship is very interesting. Specifically, properties of the issuer define what properties the subject may have, in the verification scheme for Vertex ( http://divmod.org/trac/wiki/DivmodVertex ). (On the other hand, Vertex requires STARTTLS, so it itself can't be an *actual* use-case for this SSL library until it also starts supporting mid- connection TLS startup.) I can understand that you might not have use-cases for exposing these features, but your phrasing suggests that it would be a bad idea to expose them, not just that it's too much work. Am I misinterpreting? Are you just saying it isn't worth the work at this point? > "notAfter" (you're always guaranteed to be after "notBefore", or the > cert wouldn't validate, so I see little point to exposing that, but > "notAfter" can be used after the connection has been established), Wouldn't it be nice to know *why* the cert didn't validate? To provide the user with a message including the notBefore date, in case their clock is set wrong or something? >I don't see how the other fields in the cert can be profitably used. The entire idea of "extensions" is pretty direct about the fact that the original implementor need not understand their profitable use :). >>When you say "the full DER form", are you simply referring to the full >>blob, or a broken-down representation by key and by extension? > >The full blob. Obviously, I think the broken-down representation would be nicer :). 
I know I'll have to wrangle with a bit of ASN.1 if I want to get anything useful out of most extensions, but if it's just the extension data there are a lot of cases where I think I could fake it. Re-parsing the whole DER is going to require a real, full-on ASN.1 library.

From greg at krypto.org Mon Sep 10 08:40:05 2007
From: greg at krypto.org (Gregory P. Smith)
Date: Sun, 9 Sep 2007 23:40:05 -0700
Subject: [Python-Dev] BerkeleyDB 4.6.19 is buggy and causes test_bsddb3 to hang
Message-ID: <52dc1c820709092340t39986c5er3a6d782409849a03@mail.gmail.com>

BerkeleyDB 4.6.19 is a buggy release: its DB_HASH access method databases can lock up the process. This is why several of the bleeding edge distro buildbots are timing out while running test_bsddb3. I've created a simple C test case and made sleepycat^Woracle aware of the problem.

I have a change in my sandbox to explicitly avoid linking with 4.6.19 but it seems like committing it would just pollute setup.py with vague notions of what versions of a specific library are bad. I'd prefer to just disallow use of libdb 4.6 completely in setup.py until oracle fixes this and we're sure no OS release ships with 4.6.19.

Thoughts?

-gps

http://groups.google.com/group/comp.databases.berkeley-db/browse_thread/thread/abf12452613ca7ec

From lukem at NetBSD.org Mon Sep 10 09:11:59 2007
From: lukem at NetBSD.org (Luke Mewburn)
Date: Mon, 10 Sep 2007 17:11:59 +1000
Subject: [Python-Dev] Word size inconsistencies in C extension modules
In-Reply-To: <46E4D7FE.5090905@v.loewis.de>
References: <20070909235430.GV25031@mewburn.net> <46E4D7FE.5090905@v.loewis.de>
Message-ID: <20070910071159.GA27320@mewburn.net>

On Mon, Sep 10, 2007 at 07:37:02AM +0200, "Martin v. Löwis" wrote:
| In principle, it is possible to deal with these in ParseTuple.
| To do so:
| a) in configure.in, make a configure-time check to compute the
| size of the type, and possibly its signedness.
| b) in _cursesmodule.c, make a conditional define of ATTR_T_FMT,
| which would be either "i" or "l" (or #error if it's neither
| the size of int nor the size of long). Then rely on string
| concatenation in using that define.

Are there some good examples in the Python source where this technique has been used already? Or were you proposing a cleaner solution that could be experimented with?

| I have a GCC patch which checks for correctness of ParseTuple
| calls (in terms of data size) if you are interested.

Sounds like a useful variation of the standard -Wformat stuff. This probably wouldn't have helped in the AIX situation I experienced (because the IBM compiler was used in that situation), but it could be useful on other BE LP64 platforms that are more gcc-friendly (e.g., NetBSD/sparc64).

From martin at v.loewis.de Mon Sep 10 09:54:08 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 10 Sep 2007 09:54:08 +0200
Subject: [Python-Dev] Word size inconsistencies in C extension modules
In-Reply-To: <20070910071159.GA27320@mewburn.net>
References: <20070909235430.GV25031@mewburn.net> <46E4D7FE.5090905@v.loewis.de> <20070910071159.GA27320@mewburn.net>
Message-ID: <46E4F820.1050204@v.loewis.de>

Luke Mewburn wrote:
> On Mon, Sep 10, 2007 at 07:37:02AM +0200, "Martin v. Löwis" wrote:
> | In principle, it is possible to deal with these in ParseTuple.
> | To do so:
> | a) in configure.in, make a configure-time check to compute the
> | size of the type, and possibly its signedness.
> | b) in _cursesmodule.c, make a conditional define of ATTR_T_FMT,
> |    which would be either "i" or "l" (or #error if it's neither
> |    the size of int nor the size of long). Then rely on string
> |    concatenation in using that define.
>
> Are there some good examples in the Python source where
> this technique has been used already?

Not directly. A check for the size of a library type can be found
for fpos_t, but there, no ParseTuple depends on it. An example for
using variable formatters (though again not for ParseTuple) is
PY_FORMAT_SIZE_T.

> Or were you proposing a cleaner solution that could be
> experimented with?

More that, yes.

> | I have a GCC patch which checks for correctness of ParseTuple
> | calls (in terms of data size) if you are interested.
>
> Sounds like a useful variation of the standard -Wformat stuff.

Indeed, it's an extension to it. Unfortunately, introducing new kinds
of formats is only possible by editing GCC (and then, the existing
framework is focussed on %-style patterns, so I had to bypass that
framework as well - but there were hooks for doing so).

Regards,
Martin

From barry at python.org  Mon Sep 10 13:14:30 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 10 Sep 2007 07:14:30 -0400
Subject: [Python-Dev] summaries not arriving
In-Reply-To:
References:
Message-ID: <48199DF6-10D5-413D-9DAD-F7F1FD849072@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 10, 2007, at 1:30 AM, Paul Dubois wrote:

> The weekly summaries from the new bug tracker are disappearing
> somewhere between the tracker and python-dev. My attempt to post one
> by hand was rejected by python-dev-owner (Barry Warsaw?) without
> explanation. Perhaps he has bounced the others; emails to
> python-dev-owner result in an automated message suggesting that my
> mail may never be read so I don't know how to ask him.

Nope, I didn't bounce them.  I don't /think/ they'll bounce
automatically.  Can you forward a bounce message to me directly?

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRuUnF3EjvBPtnXfVAQKDvwQAnNbVXwY1Fc00TLLFrOffFwU+jRJVxSyJ
J0HV/+ssVuX85+LfM7kAsbwZSAWM0PkTVrhfldtbD5j6D0x0II/C5GX21fQ0V+pg
fyf9HQ9i1LSUe7TvvCyXGSI7d8snNBqBpsyQ2EakQ3OGlcMjILPVVmyVSDFd2mLr
Z2VbrlinB58=
=HohF
-----END PGP SIGNATURE-----

From anthony at ekit-inc.com  Mon Sep 10 07:43:09 2007
From: anthony at ekit-inc.com (Anthony Baxter)
Date: Mon, 10 Sep 2007 15:43:09 +1000
Subject: [Python-Dev] summaries not arriving
In-Reply-To:
References:
Message-ID: <200709101543.10576.anthony@ekit-inc.com>

On Monday 10 September 2007, Paul Dubois wrote:
> As a small boy I once knew wrote, I must not use bad words. (:->

It's OK to use them about Barry, though, surely?

*wave* Hi Barry.

--
Anthony Baxter, ekit.   anthony at ekit-inc.com   (03) 9674 7015
Level 3 The Teahouse, 28 Clarendon St, Sth Melbourne Australia 3205

From barry at python.org  Mon Sep 10 13:39:25 2007
From: barry at python.org (Barry Warsaw)
Date: Mon, 10 Sep 2007 07:39:25 -0400
Subject: [Python-Dev] summaries not arriving
In-Reply-To: <200709101543.10576.anthony@ekit-inc.com>
References: <200709101543.10576.anthony@ekit-inc.com>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 10, 2007, at 1:43 AM, Anthony Baxter wrote:

> On Monday 10 September 2007, Paul Dubois wrote:
>> As a small boy I once knew wrote, I must not use bad words. (:->
>
> It's OK to use them about Barry, though, surely?
>
> *wave* Hi Barry.

It's okay from /you/ Anthony, because it's the only way I know you
still care.

baby-unicorn-hugs-ly y'rs,
- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRuUs7XEjvBPtnXfVAQJZSQP+JoApQY+tY4zkZDN2OlE+jFv8xdF0vqRW
LCK+p8yIQjlrkMC58c2CChvOsWTcH6tZMFAd0jK8d9q8NxyyN3tM7mbh25Rnm9fo
KC9uDt787fY8RpRC5YC+zEtM589Y6omL3S4XcqdkTS9UWg6S50e9EDkqrjKmE1gb
8/1LSynRnF8=
=W6ef
-----END PGP SIGNATURE-----

From martin at v.loewis.de  Mon Sep 10 16:21:39 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 10 Sep 2007 16:21:39 +0200
Subject: [Python-Dev] BerkeleyDB 4.6.19 is buggy and causes test_bsddb3 to hang
In-Reply-To: <52dc1c820709092340t39986c5er3a6d782409849a03@mail.gmail.com>
References: <52dc1c820709092340t39986c5er3a6d782409849a03@mail.gmail.com>
Message-ID: <46E552F3.10707@v.loewis.de>

> I have a change in my sandbox to explicitly avoid linking with 4.6.19
> but it seems like committing it would just pollute setup.py with vague
> notions of what versions of a specific library are bad.  I'd prefer to
> just disallow use of libdb 4.6 completely in setup.py until oracle fixes
> this and we're sure no OS release ships with 4.6.19.
>
> thoughts?

That sounds like the right solution to me. We should review it when/if
a patch is available. "No OS release ships" is a difficult-to-test-for
condition, given things like Gentoo and Debian unstable that
simultaneously ship in dozens of versions (i.e. with "releases" every
two hours or more often).

After a bug fix is available, and some time has passed, I'd rather
reallow 4.6.x, and put something into README about this bug.

Thanks for investigating it.

Regards,
Martin

From janssen at parc.com  Mon Sep 10 17:37:14 2007
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Sep 2007 08:37:14 PDT
Subject: [Python-Dev] Design and direction of the SSL module (was Re: frozenset C API?)
In-Reply-To: <20070910055852.5579.246755534.divmod.xquotient.735@joule.divmod.com>
References: <-4762611594645938717@unknownmsgid> <46DDCD7C.40004@v.loewis.de>
	<46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de>
	<46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de>
	<-1936579380892715012@unknownmsgid>
	<60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com>
	<07Sep6.101542pdt."57996"@synergy1.parc.xerox.com>
	<20070906174518.21185.1342025567.divmod.xquotient.7082@joule.divmod.com>
	<07Sep6.111518pdt."57996"@synergy1.parc.xerox.com>
	<20070910055852.5579.246755534.divmod.xquotient.735@joule.divmod.com>
Message-ID: <07Sep10.083715pdt."57996"@synergy1.parc.xerox.com>

> One, what *is* the scope of your
> ambition?  I feel silly for asking, because I am pretty sure that
> somewhere in the beginning of this thread I missed either a proposal, a
> PEP reference, or a ticket number, but I've poked around a little and I
> can't seem to find it.  Can you provide a reference, or describe what it
> is you're trying to do?

Sorry about that.  We kind of did this on the fly at the Python
sprint.  I was trying to fix two problems: One, that the current
socket.ssl support didn't validate certificates, and two, that you
couldn't do server-side SSL with it.

I'm only interested in that aspect, and in the simplest possible
solution to those problems.  I don't want to provide user validation
callbacks, or arbitrary certificate decoding, or general-purpose
crypto, or support for building automatic CA systems, or wrapping most
of that great grab-bag of useful stuff called OpenSSL.  Just fix the
core issues with socket.ssl.

Along the way I've found a nasty little threading/malloc bug in the
existing code, and fixed that.  I've added real documentation for the
existing functionality.
I've gone around with you and Martin, mainly, on what information to
expose from the validated certificate, to support authorization and
accounting (the answer so far: "notAfter", "subject", and
"subjectAltName", if it's there).

> I'd really like to help make the stdlib SSL module to
> be a really good, full-featured OpenSSL implementation for Python so we
> can have it elsewhere.

Well, remember, it's just a socket-layer wrapper for TLS, it's not an
"OpenSSL implementation", by which I suppose you mean a full wrapper
for OpenSSL, much like PyOpenSSL is supposed to be.  For that purpose,
doesn't it make more sense to extend/fix PyOpenSSL, rather than try to
grow the deliberately limited-purpose socket.ssl support into another
version of that?  Can't it be revived, if it is in fact moribund?

> * Would it be possible to distribute as a separate library?  (I think
> I remember Bill saying something about that already...)

Just to be clear that what you seem to want to work on and what I'm
working on seem to be two different things...  I plan to build a
back-port of the improved socket.ssl support as a standalone package
for 2.3 (because I need to use it on OS X 10.4).

> I'd say "why wouldn't you want to require either of those packages?" but
> actually, I know why you wouldn't want to, and it's that they're bad.

It's that they are too big and complicated to easily see how to fix.
But that seems to be a side-effect of trying to wrap all of OpenSSL,
which is a big, evolving project.

> Wouldn't it be nice to know *why* the cert didn't validate?  To provide

Yes, so I've put in a bit of work making sure the OpenSSL errors are
properly relayed back to the Python application.

> The entire idea of "extensions" is pretty direct about the fact that the
> original implementor need not understand their profitable use :).

Not really.  Each extension is proposed, debated, and approved before
it's added to the spec for extensions.
My idea is that as support for various extensions appears in OpenSSL,
we can evaluate them and see if they are worth supporting in Python.

> Specifically, properties of the
> issuer define what properties the subject may have, in the verification
> scheme for Vertex ( http://divmod.org/trac/wiki/DivmodVertex )

I didn't see a write-up of your scheme at that URL; can you point me
to a particular page in the Wiki which describes the use case?

I should point out that we (actually, Greg Smith) are also wrapping
another chunk of the OpenSSL library for hashing.  And last week I
suggested that we might wrap yet another chunk for doing cryptography.
This chunk-by-chunk approach might be a good way to go.  If a chunk
that did general X509 certificate munging did appear, I'd be happy to
change the SSL support to use it.

Bill

From janssen at parc.com  Mon Sep 10 19:30:54 2007
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Sep 2007 10:30:54 PDT
Subject: [Python-Dev] which SSL client protocols work with which server protocols?
In-Reply-To: <07Sep8.095142pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep8.095142pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep10.103100pdt."57996"@synergy1.parc.xerox.com>

> I've now built a framework in test_ssl to test all client protocols
> (SSL2, SSL3, SSL23, TLS1) against all server protocols, and here's
> what I've come up with.  Servers are along the X axis, and clients are
> on the Y axis.  "Yes" means that that client protocol can talk to that
> server protocol.
>
>          SSL2   SSL3   SSL23  TLS1
> SSL2     yes    no     no     no
> SSL3     yes    yes    yes    no
> SSL23    no     no     yes    no
> TLS1     no     no     yes    yes
>
> I'm a bit surprised by the facts that (1) an SSL2 client can't connect
> to an SSL23 server, and (2) an SSL23 client can *only* connect to an
> SSL23 server.  Can anyone verify that these combos (the results of
> testing with the Python framework) are indeed to be expected?
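
For anyone who wants to poke at the quoted table, it can be written
down as plain data.  This is only a sketch: PROTOCOLS, QUOTED_TABLE,
can_connect and servers_for are made-up names, not part of test_ssl,
and the entries are simply copied from the table as first quoted above.

```python
# Column order of the quoted table (server protocols along the X axis).
PROTOCOLS = ("SSL2", "SSL3", "SSL23", "TLS1")

# One row per client protocol, copied verbatim from the quoted table.
# These names are hypothetical; nothing here touches a real socket.
QUOTED_TABLE = {
    "SSL2":  ("yes", "no",  "no",  "no"),
    "SSL3":  ("yes", "yes", "yes", "no"),
    "SSL23": ("no",  "no",  "yes", "no"),
    "TLS1":  ("no",  "no",  "yes", "yes"),
}


def can_connect(client, server, table=QUOTED_TABLE):
    """Return True if the table says this client can talk to this server."""
    return table[client][PROTOCOLS.index(server)] == "yes"


def servers_for(client, table=QUOTED_TABLE):
    """List the server protocols the table says this client can reach."""
    return [server for server in PROTOCOLS if can_connect(client, server, table)]
```

With the table in this form, the two surprising cases are one-line
queries: servers_for("SSL23") lists only SSL23, and
can_connect("SSL2", "SSL23") is False.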

Sure enough, in testing on my FC7 platform, which has a more modern
version of OpenSSL (0.9.8e instead of the older 0.9.7l platform I was
using), an SSL2 client *can* connect to an SSL23 server.  And I got
one of the above entries wrong: an SSL23 client can connect to an
SSL2 server.

I guess in the test harness, I'll just note the discrepancy, but not
fail the test either way.  And I'll add a note to the documentation.

Bill

From pfdubois at gmail.com  Mon Sep 10 19:30:47 2007
From: pfdubois at gmail.com (Paul Dubois)
Date: Mon, 10 Sep 2007 10:30:47 -0700
Subject: [Python-Dev] Fwd: Summary of Tracker Issues
In-Reply-To:
References: <20070910030148.856EA78098@psf.upfronthosting.co.za>
Message-ID:

Something seems to have gone wrong with the automation of the weekly
reports. They were working, I think, until we went live. While I work
on finding out what the trouble is, here is a report covering the
period since we went live.

-- Paul Dubois

ACTIVITY SUMMARY (08/23/07 - 09/10/07)
Tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, simply click on
the issue ID. Do *not* respond to this message.

 1274 open (+64) / 11347 closed (+76) / 12621 total (+140)

Average duration of open issues: 673 days.
Median duration of open issues: 623 days.

Open Issues Breakdown
   STATUS  Number  Change
     open    1271     +64
  pending       3      +0

Issues Created Or Reopened (140)

Title  Status  Date  Action  By

Patch to rename *Server modules to lower-case  08/23/07  created  jasonpjason
2to3 crashes on input files with no trailing newlines  CLOSED  08/23/07  created  adrianholovaty
Patch to rename HTMLParser module to lower_case  08/23/07  created  paulsmith
zipfile password fails validation  08/23/07  created  djw
MultiMethods with type annotations in 3000  08/23/07  created  jasonpjason
Patches to rename Queue module to queue  08/23/07  created  paulsmith
Refactor test_winreg.py to use unittest.
CLOSED  08/24/07  created  varmaa
[py3k] Fix dumbdbm, which fixes test_shelve (for me); instrument other t  CLOSED  08/24/07  created  larryhastings
Refactor test_signal.py to use unittest.  CLOSED  08/24/07  created  varmaa
Implementation of PEP 3101, Advanced String Formatting  CLOSED  08/24/07  created  eric.smith
Broken bug tracker url  CLOSED  08/24/07  created  nirs
Wrong documentation for rfc822.Message.getheader  CLOSED  08/24/07  created  nirs
Broken URL at Doc/install/index.rst  CLOSED  08/24/07  created  orsenthil
eval error  CLOSED  08/24/07  created  Rayfward
cgi: parse_qs and parse_qsl misbehave on empty strings  08/24/07  created  dljessup
[PATCH] Updated patch for rich dict view (dict().keys()) comparisons  CLOSED  08/24/07  created  keir
[PATCH] Updated fix for string to unicode fixes in time and datetime  CLOSED  08/24/07  created  ero.carrera
[PATCH] Add set operations (and, or, xor, subtract) to dict views  CLOSED  08/24/07  created  keir
server-side ssl support  CLOSED  08/25/07  created  gvanrossum
Cleanup pass on _curses and _curses_panel  08/25/07  created  larryhastings
pydoc doesn't work on pyexpat  08/25/07  created  nnorwitz
logging.basicConfig does not allow to set NOTSET level  08/25/07  created  viper
use bytes for code objects  08/25/07  created  nnorwitz
[PATCH] Unicode fixes in floatobject and moduleobject  CLOSED  08/25/07  created  ero.carrera
documentation for new SSL module  CLOSED  08/26/07  created  janssen
tracebacks from list comps (probably other comps) don't show full stack  08/26/07  created  nnorwitz
Backport ABC to 2.6  08/26/07  created  baranguren
uudecoding (uu.py) does not supprt base64, patch attached  08/26/07  created  dudek
Tkinter binding involving Control-spacebar raises unicode error  CLOSED  08/26/07  created  kbk
py3k: io.StringIO.getvalue() returns \r\n  CLOSED  08/26/07  created  amaury.forgeotdarc
py3k: Adapt _winreg.c to the new buffer API  CLOSED  08/26/07  created  amaury.forgeotdarc
py3k: compilation with VC2005  CLOSED  08/26/07  created  amaury.forgeotdarc
Improve the hackish runtime_library_dirs support for gcc  08/26/07  created  alexandre.vassalotti
Support for newline and encoding in tempfile module  CLOSED  08/27/07  created  hupp
[patch] Add 2to3 support for displaying warnings as Python comments  08/27/07  created  adrianholovaty
bytes buffer API needs to support PyBUF_LOCKDATA  08/27/07  created  gregory.p.smith
py3k _bsddb.c patch to use the new buffer API  CLOSED  08/27/07  created  gregory.p.smith
Ill-coded identifier crashes python when coding spec is utf-8  CLOSED  08/27/07  created  hyeshik.chang
[py3k] pdb does not work in python 3000  08/27/07  created  gregory.p.smith
Asssertion in Windows debug build  CLOSED  08/28/07  created  theller
Unicode problem with TZ  CLOSED  08/28/07  created  theller
io.py problems on Windows  CLOSED  08/28/07  created  theller
test_glob fails with UnicodeDecodeError  08/28/07  created  theller
test_builtin failure on Windows  CLOSED  08/28/07  created  theller
tarfile insecure pathname extraction  CLOSED  08/28/07  created  lars.gustaebel
Performance regression in 2.5  08/28/07  created  inducer
HTMLCalendar.formatyearpage not behaving as documented  CLOSED  08/28/07  created  inefab
py3k: corrections for test_subprocess on windows  08/28/07  created  amaury.forgeotdarc
py3k: correction for test_float on Windows  CLOSED  08/28/07  created  amaury.forgeotdarc
socket.socket.getsockname() has inconsistent UNIX/Windows behavior  CLOSED  08/28/07  created  janssen
py3k: correction for test_marshal on Windows  CLOSED  08/29/07  created  amaury.forgeotdarc
certificate in Lib/test/test_ssl.py expires in February 2013  08/29/07  created  janssen
SSL patch for Windows buildbots problem  CLOSED  08/29/07  created  janssen
bogus attributes reported in asyncore doc  08/29/07  created  billiejoex
scriptsinstall target fails in alternate build dir  08/29/07  created  skip.montanaro
argument parsing in datetime_strptime  CLOSED  08/29/07  created  loewis
test_cmd_line starts python without -E  08/29/07  created  twouters
Incorrect URL with webbrowser and firefox under Gnome  CLOSED  08/29/07  created  bingham
Code Example for 'property' bug  CLOSED  08/29/07  created  KennethLove
*args and **kwargs in function definitions  CLOSED  08/29/07  created  lars.gustaebel
zipfile cannot handle files larger than 2GB (inside archive)  08/29/07  created  Kevin Ar18
ABC caches should use weak refs  08/30/07  created  gvanrossum
nice to have a way to tell if a socket is bound  08/30/07  created  janssen
Small typo in properties example  CLOSED  08/30/07  created  cgrohmann
Test issue  CLOSED  08/30/07  created  loewis
ssl.py shouldn't change class names from 2.6 to 3.x  08/30/07  created  janssen
Implement PEPs 3109, 3134  CLOSED  08/30/07  created  collinwinter
test_smtplib failures (caused by asyncore)  09/01/07  reopened  gvanrossum
Documentation Updates for PEP 3101 string formatting  CLOSED  08/31/07  created  talin
invalid file encoding results in "SyntaxError: None"  CLOSED  08/31/07  created  georg.brandl
unicode identifiers in error messages  CLOSED  08/31/07  created  georg.brandl
unicode.translate() doesn't error out on invalid translation table  08/31/07  created  georg.brandl
Documentaion font size too small  CLOSED  08/31/07  created  nirs
Mysterious failure under Windows  CLOSED  08/31/07  created  akineko
python3.0-config script does not run on py3k  CLOSED  08/31/07  created  koen
py3k: Unicode error in os.stat on Windows  CLOSED  08/31/07  created  amaury.forgeotdarc
py3 patch: full Unicode version for winreg module  CLOSED  08/31/07  created  amaury.forgeotdarc
itertools missing, causes interactive help to break  09/01/07  created  mattrussell
cachersrc.py using tuple unpacking args  CLOSED  09/01/07  created  jinok
decode_header does not follow RFC 2047  09/01/07  created  kael
Search broken  CLOSED  09/01/07  created  nirs
file.seek allows float arguments  09/01/07  created  georg.brandl
platform system may be Windows or Microsoft since Vista  09/01/07  created  p.lavarre at ieee.org
Confusing error message when dividing timedelta using /  09/01/07  created  skip.montanaro
''.find() gives wrong result in Python built with ICC  CLOSED  09/01/07  created  sanders_muc
OS X 10.5.x Build Problems  CLOSED  09/02/07  created  noahgift
test_email failed  09/02/07  created  xyb
py3k os.popen result is not iterable, patch attached  09/02/07  created  carsten.haese
News page broken link to 3.0a1  CLOSED  09/02/07  created  grahamh
ever considered adding static typing to python?  CLOSED  09/02/07  created  adamjw
doctools/sphinx/web/application.py does not start on windows  CLOSED  09/03/07  created  osuchw
py3k Mac installation errors  CLOSED  09/03/07  created  hdiogenes
Unexpected results in Tutorial about Unicode  CLOSED  09/03/07  created  Viscaynot
product function patch  CLOSED  09/03/07  created  ryan.freckleton
TypeError in poplib.py  CLOSED  09/03/07  created  serge.julien
make install failed  CLOSED  09/03/07  created  akineko
Deeply recursive repr segfault  09/04/07  created  rhamphoryncus
input() should respect sys.stdin.encoding when in interactive mode  CLOSED  09/04/07  created  philyoon
decode_unicode doesn't nul-terminate  09/04/07  created  rhamphoryncus
Mac compile fails with pydebug and framework enabled  09/04/07  created  hdiogenes
Can't input non-ascii characters in interactive mode  CLOSED  09/04/07  created  philyoon
Is there just no PRINT statement any more? Or it just doesn't work.  CLOSED  09/06/07  reopened  loewis
Add support for _msi.Record.GetString() and _msi.Record.GetInteger()  09/04/07  created  atuining
Typo in dummy_threading documentation  CLOSED  09/04/07  created  dthomasset
msilib.SummaryInfo.GetProperty() truncates the string by one character  09/04/07  created  atuining
patch for readme.txt in PCbuild8  CLOSED  09/05/07  created  chipped
Error in random.shuffle  CLOSED  09/05/07  created  Viscaynot
2to3, lambda with non-tuple argument inside parenthesis  CLOSED  09/05/07  created  falsetru
Problem with doctest and decorated functions  09/05/07  created  danilo
Warning required when calling register() on an ABCMeta subclass  09/05/07  created  mark
Problems with the msi installer - python-3.0a1.msi  09/05/07  created  vbr
Users' directories information  CLOSED  09/05/07  created  uzytkownik
Test debug assertion in bsddb test_1413192.py  CLOSED  09/05/07  created  db3l
interrupt_main() fails to interrupt raw_input()  CLOSED  09/05/07  created  anand
_curses issues on 64-bit big-endian (e.g, AIX)  09/06/07  created  lukemewburn
Minor Change For Better cross compile  09/06/07  created  zengbo
reference in extending doc to non-existing file  CLOSED  09/06/07  created  anthon
Spurious warning about missing _sha256 and _sha512 when not needed  CLOSED  09/06/07  created  dripton
hashlib module fails with TypeError  CLOSED  09/06/07  created  dripton
Search index is messed up after partial rebuilding  CLOSED  09/06/07  created  lars.gustaebel
"make altinstall" installs pydoc, idle, smtpd.py with broken shebang lin  09/06/07  created  dripton
Document inspect.getfullargspec()  09/06/07  created  brett.cannon
PyTuple_Size and PyTuple_GET_SIZE return type documentation incorrect  09/07/07  created  gaul
split(None, maxsplit) does not strip whitespace correctly  09/07/07  created  nirs
Webchecker not parsing css "@import url"  09/07/07  created  ready.eddy
bytes.split shold have same interface as str.split, or different name  09/07/07  created  nirs
file.fileno and file.isatty() should be implementable by any file like o  09/07/07  created  nirs
No tests for inspect.getfullargspec()  09/07/07  created  brett.cannon
msilib.Directory.make_short only handles file names with a single dot in  09/07/07  created  atuining
OpenSSL detection broken for Python 3.0a1  CLOSED  09/07/07  created  pythonmeister
Idle - Save (buffer) - closes IDLE and does not save file (Windows XP)  09/08/07  created  infixum
Reference Manual: "for statement" links to "break statement"  09/08/07  created  Martoon
compile error in poplib.py  CLOSED  09/08/07  created  andre
python3.0-config raises SyntaxError  CLOSED  09/09/07  created  complex
Parsing a simple script eats all of your memory  09/09/07  created  complex
xview/yview of Tix.Grid is broken  09/09/07  created  ocean-city
Bdb documentation  09/09/07  created  arklad
pyexpat patch for changing buffer_size  09/09/07  created  AchimGaedke
Fixer needed for __future__ imports  09/09/07  created  collinwinter
Make python build with gcc-4.2 on OS X 10.4.9  08/23/07  created  jyasskin

Issues Now Closed (188)

Title  By  Duration

2to3 crashes on input files with no trailing newlines  collinwinter  14 days
Refactor test_winreg.py to use unittest.  loewis  10 days
[py3k] Fix dumbdbm, which fixes test_shelve (for me); instrument other t  loewis  10 days
Refactor test_signal.py to use unittest.  georg.brandl  1 days
Implementation of PEP 3101, Advanced String Formatting  gvanrossum  6 days
Broken bug tracker url  georg.brandl  0 days
Wrong documentation for rfc822.Message.getheader  georg.brandl  0 days
Broken URL at Doc/install/index.rst  loewis  9 days
eval error  georg.brandl  0 days
[PATCH] Updated patch for rich dict view (dict().keys()) comparisons  loewis  9 days
[PATCH] Updated fix for string to unicode fixes in time and datetime  loewis  9 days
[PATCH] Add set operations (and, or, xor, subtract) to dict views  loewis  9 days
server-side ssl support  janssen  1 days
[PATCH] Unicode fixes in floatobject and moduleobject  loewis  8 days
documentation for new SSL module  gvanrossum  2 days
Tkinter binding involving Control-spacebar raises unicode error  kbk  0 days
py3k: io.StringIO.getvalue() returns \r\n  loewis  7 days
py3k: Adapt _winreg.c to the new buffer API  loewis  7 days
py3k: compilation with VC2005  nnorwitz  0 days
Support for newline and encoding in tempfile module  gvanrossum  1 days
py3k _bsddb.c patch to use the new buffer API  gregory.p.smith  1 days
Ill-coded identifier crashes python when coding spec is utf-8  gvanrossum  2 days
Asssertion in Windows debug build  theller  2 days
Unicode problem with TZ  loewis  2 days
io.py problems on Windows  gvanrossum  2 days
test_builtin failure on Windows  georg.brandl  6 days
tarfile insecure pathname extraction  lars.gustaebel  2 days
HTMLCalendar.formatyearpage not behaving as documented  doerwalter  0 days
py3k: correction for test_float on Windows  gvanrossum  1 days
socket.socket.getsockname() has inconsistent UNIX/Windows behavior  loewis  1 days
py3k: correction for test_marshal on Windows  gvanrossum  1 days
SSL patch for Windows buildbots problem  janssen  1 days
argument parsing in datetime_strptime  gvanrossum  1 days
Incorrect URL with webbrowser and firefox under Gnome  orsenthil  1 days
Code Example for 'property' bug  georg.brandl  0 days
*args and **kwargs in function definitions  georg.brandl  1 days
Small typo in properties example  georg.brandl  0 days
Test issue  georg.brandl  0 days
Implement PEPs 3109, 3134  collinwinter  0 days
Documentation Updates for PEP 3101 string formatting  georg.brandl  0 days
invalid file encoding results in "SyntaxError: None"  loewis  0 days
unicode identifiers in error messages  loewis  0 days
Documentaion font size too small  georg.brandl  1 days
Mysterious failure under Windows  loewis  0 days
python3.0-config script does not run on py3k  georg.brandl  0 days
py3k: Unicode error in os.stat on Windows  loewis  2 days
py3 patch: full Unicode version for winreg module  loewis  2 days
cachersrc.py using tuple unpacking args  georg.brandl  2 days
Search broken  georg.brandl  0 days
''.find() gives wrong result in Python built with ICC  sanders_muc  0 days
OS X 10.5.x Build Problems  loewis  0 days
News page broken link to 3.0a1  loewis  0 days
ever considered adding static typing to python?  loewis  0 days
doctools/sphinx/web/application.py does not start on windows  georg.brandl  0 days
py3k Mac installation errors  hdiogenes  0 days
Unexpected results in Tutorial about Unicode  georg.brandl  2 days
product function patch  gvanrossum  0 days
TypeError in poplib.py  gvanrossum  7 days
make install failed  georg.brandl  4 days
input() should respect sys.stdin.encoding when in interactive mode  loewis  0 days
Can't input non-ascii characters in interactive mode  philyoon  0 days
Is there just no PRINT statement any more? Or it just doesn't work.  gvanrossum  0 days
Typo in dummy_threading documentation  dthomasset  0 days
patch for readme.txt in PCbuild8  loewis  0 days
Error in random.shuffle  georg.brandl  0 days
2to3, lambda with non-tuple argument inside parenthesis  collinwinter  4 days
Users' directories information  uzytkownik  1 days
Test debug assertion in bsddb test_1413192.py  gregory.p.smith  1 days
interrupt_main() fails to interrupt raw_input()  anand  2 days
reference in extending doc to non-existing file  georg.brandl  0 days
Spurious warning about missing _sha256 and _sha512 when not needed  skip.montanaro  0 days
hashlib module fails with TypeError  georg.brandl  0 days
Search index is messed up after partial rebuilding  georg.brandl  0 days
OpenSSL detection broken for Python 3.0a1  georg.brandl  0 days
compile error in poplib.py  georg.brandl  1 days
python3.0-config raises SyntaxError  loewis  0 days
Need user-centered info for Windows users.  georg.brandl  2460 days
Codec naming scheme and aliasing support  lemburg  2445 days
exception item from mapped function  georg.brandl  2212 days
include SQL interface module  georg.brandl  2183 days
Add "eu#" parser marker  georg.brandl  2023 days
Missing docs for module imputil  jafo  2024 days
pydoc should respect __all__  skip.montanaro  1999 days
bsddb185 module needs iterators  gregory.p.smith  1981 days
cStringIO should provide a binary option  georg.brandl  1948 days
Docs in DocBook format  georg.brandl  1884 days
pygettext should be installed  skip.montanaro  1740 days
email: minimal header encoding  skip.montanaro  1708 days
raw_input defers alarm signal  georg.brandl  1666 days
Provide "plucker" format docs.  georg.brandl  1631 days
Port tests to unittest (Part 2)  georg.brandl  1564 days
Compile error messages and PEP-263  georg.brandl  1485 days
patch for build with read-only $srcdir  loewis  1486 days
robotparser interactively prompts for username and password  skip.montanaro  1430 days
Modules/Setup needs a suppress flag?  skip.montanaro  1358 days
configure links unnecessary library libdl  loewis  1322 days
"ez" format code for ParseTuple()  lemburg  1311 days
configure warning / sys/un.h: present but cannot be compiled  georg.brandl  1310 days
quopri encoding & Unicode  georg.brandl  1318 days
2.3.3 str & list still use __getslice__ / __setslice__  georg.brandl  1302 days
making the version of SSL configurable when creating sockets  janssen  1303 days
work around to compile \r\n file  georg.brandl  1244 days
SSL-ed sockets don't close correct?  loewis  1168 days
Adding missing ISO 8859 codecs, especially Thai  lemburg  1116 days
socket.ssl should explain that it is a 2/3 connection  janssen  1080 days
Use correct encoding for printing SyntaxErrors  gvanrossum  1079 days
Irregular behavior of datetime.__str__()  skip.montanaro  1008 days
Add 'update FAQ' to release checklist  georg.brandl  962 days
Add SSL certificate validation  janssen  944 days
enable time + timedelta  skip.montanaro  935 days
Python Programming FAQ should be updated for Python 2.4  georg.brandl  925 days
zipfile UnicodeDecodeError  georg.brandl  888 days
Python and Turkish Locale  georg.brandl  852 days
a bunch of infinite C recursions  brett.cannon  844 days
add single html files  georg.brandl  833 days
crash recursive __getattr__  brett.cannon  744 days
tarfile: adding filed that use direct device addressing  lars.gustaebel  724 days
python.sty correction - verbatim environment  georg.brandl  705 days
python.sty: \py at sigline correction  georg.brandl  705 days
Use 'seealso' to add examples to LibRef  loewis  635 days
add more readline support  loewis  621 days
2.3.5 source RPM install fails w/o tk-devel  jafo  594 days
Inconsistency in Programming FAQ  georg.brandl  568 days
Add .format() method to str and unicode  loewis  514 days
datetime.time and datetime.timedelta  skip.montanaro  478 days
winerror module  loewis  450 days
Turkish Character  georg.brandl  400 days
sqlite3 documentation on rowcount is contradictory  georg.brandl  318 days
doctest simple usage recipe is misleading  georg.brandl  285 days
specialcase simple sliceobj in tuple/str/unicode  twouters  254 days
specialcase simple sliceobj in list (and bugfixes)  twouters  254 days
Extended slicing for UserString  twouters  253 days
Extended slicing for array objects  twouters  253 days
slice-object support for ctypes Pointer/Array  twouters  256 days
slice-object support for mmap  twouters  253 days
extended slicing for structseq  twouters  253 days
extended slicing for buffer objects  twouters  253 days
sys.intern() 2to3 fixer  collinwinter  261 days
webbrowser.open_new() suggestion  georg.brandl  237 days
re module documentation on search/match is unclear  georg.brandl  235 days
posixmodule.c leaks crypto context on Windows  loewis  244 days
doc misleading in re.compile  georg.brandl  227 days
Refactor test_class to use unittest lib  georg.brandl  177 days
Add tests for pipes module (test_pipes)  georg.brandl  169 days
xreload.py won't update class docstrings  gvanrossum  164 days
Explain __method__ lookup semantics for new-style classes  georg.brandl  168 days
descrintro: error describing __new__ behavior  georg.brandl  154 days
os.path.join.__doc__ should mention absolute paths  georg.brandl  150 days
Python 2.5 installer ended prematurely  loewis  151 days
Bad documentation for existing imp methods  georg.brandl  141 days
__getslice__ still used in built-in types  georg.brandl  135 days
pickle example contains errors  georg.brandl  133 days
Refactor test_frozen.py to use unittest.  georg.brandl  128 days
socket.error exceptions not subclass of StandardError  gregory.p.smith  138 days
generation errors in PDF-A4 tags for lib.pdf  georg.brandl  120 days
imp.find_module doc ambiguity  georg.brandl  134 days
run test_1565150(test_os.py) only on NTFS  loewis  123 days
IDLE hangs in popup method completion  kbk  96 days
bsddb.btopen . del of record doesn't update index  gregory.p.smith  90 days
-q (quiet) option for python interpreter  georg.brandl  86 days
_lsprof.c:ptrace_enter_call assumes PyErr_* is clean  arigo  89 days
struct.Struct.size is not documented  georg.brandl  75 days
Add/Remove programs shows Martin v Löwis  loewis  79 days
Add reduce to functools in 2.6  gvanrossum  69 days
Incorrect docs for optparse OptionParser add_help_option  georg.brandl  61 days
Pickling of exceptions broken  georg.brandl  59 days
Examples dropped from PDF version of SQLite docs  georg.brandl  58 days
Improve exception pickling support  georg.brandl  57 days
AMD64 installer does not place python25.dll in system dir  loewis  62 days
expanduser("~") on Windows looks for HOME first  georg.brandl  62 days
fixing 2.5.1 build with unicode and dynamic loading disabled  georg.brandl  43 days
getaddrinfo no longer used in httplib  georg.brandl  43 days
chown() does not handle UID > INT_MAX  georg.brandl  42 days
struni: assertion in Windows debug build  georg.brandl  48 days
reference count discrepancy, PyErr_Print vs.
PyErr_Clear georg.brandl 36 days
utilize 2.5 try/except/finally in contextlib georg.brandl 35 days
Documentation of descriptors needs more detail georg.brandl 32 days
The -m switch does not use the builtin __main__ module ncoghlan 25 days
setup.py trashes LDFLAGS georg.brandl 23 days
poll() returns "status code", not "return code" georg.brandl 21 days
os.chmod failure georg.brandl 35 days
Byte code WITH_CLEANUP missing, MAKE_CLOSURE wrong georg.brandl 18 days
Misc improvements for the io module gvanrossum 18 days
bsddb can't use unicode keys gregory.p.smith 14 days
Unify __builtins__ -> __builtin__ gvanrossum 15 days
tempfile.TemporaryFile differs between platforms georg.brandl 6 days
Segfault closing a file from concurrent threads georg.brandl 7 days

References: <07Sep8.095142pdt."57996"@synergy1.parc.xerox.com> <07Sep10.103100pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep10.114436pdt."57996"@synergy1.parc.xerox.com>

Here's the updated connection table:

         SSL2   SSL3   SSL23  TLS1
SSL2     yes    no     yes    no
SSL3     yes    yes    yes    no
SSL23    yes    no     yes    no
TLS1     no     no     yes    yes

Given this, I think the client-side default should be changed from
SSLv23 to SSLv3, and the server-side default should be SSLv23.

Bill

From trentm at activestate.com Mon Sep 10 21:06:03 2007
From: trentm at activestate.com (Trent Mick)
Date: Mon, 10 Sep 2007 12:06:03 -0700
Subject: [Python-Dev] [PEPs] Email addresses in PEPs?
In-Reply-To: <18145.59131.323002.910688@montanaro.dyndns.org>
References: <18121.47310.218893.540750@montanaro.dyndns.org> <4335d2c40708201232s19b10c69ye44d39351a4da97d@mail.gmail.com> <46E1D2C3.5030705@activestate.com> <18145.59131.323002.910688@montanaro.dyndns.org>
Message-ID: <46E5959B.5060200@activestate.com>

skip at pobox.com wrote:
> Trent> If some would find it useful, here is a snippet of code that
> Trent> obfuscates email addresses for HTML as done by Markdown (a
> Trent> text-to-html markup translator).
> Trent> It randomly encodes each
> Trent> character as a hex or decimal HTML entity (roughly 10% raw, 45%
> Trent> hex, 45% dec).
>
> Aren't most spammers' scrapers going to be intelligent enough by now
> (several years since they first arrived on the scene) to "see through" these
> sorts of common obfuscations?

Perhaps, yes. No way of really knowing.

Trent

--
Trent Mick
trentm at activestate.com

From facundobatista at gmail.com Mon Sep 10 22:25:11 2007
From: facundobatista at gmail.com (Facundo Batista)
Date: Mon, 10 Sep 2007 17:25:11 -0300
Subject: [Python-Dev] Python tickets summary
Message-ID:

People:

I modified my tool, which makes a summary of all the Python tickets (I
moved the source the info is taken from, from SF to our Roundup).

As a result, the summary is now, again, updated daily:

http://www.taniquetil.com.ar/facundo/py_tickets.html

Enjoy it.

Regards,

--
. Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From janssen at parc.com Tue Sep 11 02:20:14 2007
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Sep 2007 17:20:14 PDT
Subject: [Python-Dev] Alpha/Tru64 buildbot and SSL compile
Message-ID: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com>

The Alpha/Tru64 buildbot seems to be having difficulty compiling the
_ssl.c file. Looks like missing header files. Anyone know what the
configuration of OpenSSL on that machine is like?
Bill From janssen at parc.com Tue Sep 11 02:59:00 2007 From: janssen at parc.com (Bill Janssen) Date: Mon, 10 Sep 2007 17:59:00 PDT Subject: [Python-Dev] Solaris 10 buildbot test_ssl failures In-Reply-To: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com> References: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep10.175910pdt."57996"@synergy1.parc.xerox.com> The Solaris 10 buildbot is complaining about test_ssl, and I think it's because some of the functions in it use constants from the ssl module at the top level, i.e., def tryProtocolCombo (server_protocol, client_protocol, expectedToWork, certsreqs=ssl.CERT_NONE): Is this verboten? Bill From janssen at parc.com Tue Sep 11 03:41:02 2007 From: janssen at parc.com (Bill Janssen) Date: Mon, 10 Sep 2007 18:41:02 PDT Subject: [Python-Dev] Design and direction of the SSL module (was Re: frozenset C API?) In-Reply-To: <20070910055852.5579.246755534.divmod.xquotient.735@joule.divmod.com> References: <-4762611594645938717@unknownmsgid> <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> <07Sep6.101542pdt."57996"@synergy1.parc.xerox.com> <20070906174518.21185.1342025567.divmod.xquotient.7082@joule.divmod.com> <07Sep6.111518pdt."57996"@synergy1.parc.xerox.com> <20070910055852.5579.246755534.divmod.xquotient.735@joule.divmod.com> Message-ID: <07Sep10.184105pdt."57996"@synergy1.parc.xerox.com> By the way, if you're offering to help with this, there are a couple of things I could use some help with. I scratched my head a bit about how to turn the "othername" possibility of a subjectAltName into a Python data structure, using the OpenSSL C code, and finally gave up. If you could provide a C function that does that, I'd be very grateful. 
And there's a similar issue with the "permanent identifier" defined in
RFC 4043. I don't see how to iterate over an ASN1 sequence using the
OpenSSL C code -- if you can figure out how to do that and provide a C
function to translate that field in a certificate into a Python data
structure, it would also be a great help.

Bill

From janssen at parc.com Tue Sep 11 04:01:42 2007
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Sep 2007 19:01:42 PDT
Subject: [Python-Dev] Solaris 10 buildbot test_ssl failures
In-Reply-To: <07Sep10.175910pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com> <07Sep10.175910pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep10.190146pdt."57996"@synergy1.parc.xerox.com>

> The Solaris 10 buildbot is complaining about test_ssl, and I think
> it's because some of the functions in it use constants from the ssl
> module at the top level, i.e.,
>
>   def tryProtocolCombo (server_protocol,
>                         client_protocol,
>                         expectedToWork,
>                         certsreqs=ssl.CERT_NONE):
>
> Is this verboten?

Of course it is. Yep, that was it. Solaris 10 is green.

Bill

From aahz at pythoncraft.com Tue Sep 11 05:01:46 2007
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 10 Sep 2007 20:01:46 -0700
Subject: [Python-Dev] testing in a Python --without-threads build
In-Reply-To: <46E2FAC4.2030900@v.loewis.de>
References: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com> <07Sep8.121928pdt."57996"@synergy1.parc.xerox.com> <46E2FAC4.2030900@v.loewis.de>
Message-ID: <20070911030146.GA28351@panix.com>

On Sat, Sep 08, 2007, "Martin v. Löwis" wrote:
>
> No. IIUC, "expected skips" are a platform property. For your platform,
> support for threads is expected (whatever your platform is as long as
> it was built in this millennium).

Really? I thought NetBSD was still iffy WRT threading.
--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote
productivity as a result of universal agreement, and whether they are
precisely the optimal choices is much less important." --Henry Spencer
http://www.lysator.liu.se/c/ten-commandments.html

From martin at v.loewis.de Tue Sep 11 07:36:20 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 11 Sep 2007 07:36:20 +0200
Subject: [Python-Dev] Alpha/Tru64 buildbot and SSL compile
In-Reply-To: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46E62954.8010208@v.loewis.de>

> The Alpha/Tru64 buildbot seems to be having difficulty compiling
> the _ssl.c file. Looks like missing header files. Anyone know what
> the configuration of OpenSSL on that machine is like?

Neal Norwitz and Ralf Grosse-Kunstleve have access to that machine.

Regards,
Martin

From anthony at ekit-inc.com Tue Sep 11 07:42:01 2007
From: anthony at ekit-inc.com (Anthony Baxter)
Date: Tue, 11 Sep 2007 15:42:01 +1000
Subject: [Python-Dev] Alpha/Tru64 buildbot and SSL compile
In-Reply-To: <46E62954.8010208@v.loewis.de>
References: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com> <46E62954.8010208@v.loewis.de>
Message-ID: <200709111542.05696.anthony@ekit-inc.com>

On Tuesday 11 September 2007, Martin v. Löwis wrote:
> > The Alpha/Tru64 buildbot seems to be having difficulty
> > compiling the _ssl.c file. Looks like missing header files.
> > Anyone know what the configuration of OpenSSL on that machine
> > is like?
>
> Neal Norwitz and Ralf Grosse-Kunstleve have access to that
> machine.

Neal's on leave all this month, I believe.

--
Anthony Baxter, ekit.
anthony at ekit-inc.com (03) 9674 7015
Level 3 The Teahouse, 28 Clarendon St, Sth Melbourne Australia 3205

From martin at v.loewis.de Tue Sep 11 07:43:16 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 11 Sep 2007 07:43:16 +0200
Subject: [Python-Dev] testing in a Python --without-threads build
In-Reply-To: <20070911030146.GA28351@panix.com>
References: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com> <07Sep8.121928pdt."57996"@synergy1.parc.xerox.com> <46E2FAC4.2030900@v.loewis.de> <20070911030146.GA28351@panix.com>
Message-ID: <46E62AF4.4040207@v.loewis.de>

>> No. IIUC, "expected skips" are a platform property. For your platform,
>> support for threads is expected (whatever your platform is as long as
>> it was built in this millennium).
>
> Really? I thought NetBSD was still iffy WRT threading.

Ah, right. Still, it seems that people expect that thread support is
available on NetBSD. The list of expected skips does not mention
test_thread for 'netbsd3' (it only does so for 'sco_sv3' and 'riscos').

Regards,
Martin

From janssen at parc.com Tue Sep 11 07:59:02 2007
From: janssen at parc.com (Bill Janssen)
Date: Mon, 10 Sep 2007 22:59:02 PDT
Subject: [Python-Dev] Alpha/Tru64 buildbot and SSL compile
In-Reply-To: <200709111542.05696.anthony@ekit-inc.com>
References: <07Sep10.172015pdt."57996"@synergy1.parc.xerox.com> <46E62954.8010208@v.loewis.de> <200709111542.05696.anthony@ekit-inc.com>
Message-ID: <07Sep10.225906pdt."57996"@synergy1.parc.xerox.com>

> > Neal Norwitz and Ralf Grosse-Kunstleve have access to that
> > machine.
>
> Neal's on leave all this month, I believe.

Well, I'm not sure it's urgent. Are there lots of Alphas still
running? And Tru64 is in end-of-life mode.
Bill

From tleeuwenburg at gmail.com Tue Sep 11 08:41:30 2007
From: tleeuwenburg at gmail.com (Tennessee Leeuwenburg)
Date: Tue, 11 Sep 2007 16:41:30 +1000
Subject: [Python-Dev] Compiling Python 2.5 and settinf UCS2 flag
Message-ID: <43c8685c0709102341o1f9323f5hc5525460cbd8b12a@mail.gmail.com>

Hi all,

I have an unusual use case in which some software I work on compiles a
version of Python for distribution. I'm not 100% across this as it's not at
all my area of responsibility, but I have been having some issues lately.

My hand-compiled version of Python 2.5 works just fine, and in turn uses a
hand-compiled Tcl/Tk with threading disabled.

The system then re-compiles Python2.5 for its own purposes. At this point,
it appears to ignore some of the options originally set using configure for
Python.

I have enough knowledge/control over the system to pass in some additional
compiler flags. I would like to try to force some behaviour normally set as
a flag to the configure script.

Is there a C compiler flag I can use to force the use of UCS2 unicode?

Thanks,
-Tennessee

From tulloss2 at uiuc.edu Tue Sep 11 09:27:58 2007
From: tulloss2 at uiuc.edu (Justin Tulloss)
Date: Tue, 11 Sep 2007 02:27:58 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
Message-ID: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>

Hi,

I had a whole long email about exactly what I was doing, but I think I'll
get to the point instead. I'm trying to implement a python concurrency API
and would like to use cpython to do it. To do that, I would like to remove
the GIL.

So, since I'm new to interpreter hacking, some help would be appreciated.
I've listed what I think the GIL does; if you guys could add to this list or
refine it, I would very much appreciate it.

Roles of the GIL:
1.
   Some global interpreter state/modules are protected (where are these
   globals at?)
2. When writing C extensions I can change the state of my python object
   without worrying about synchronization
3. When writing C extensions I can change my own internal C state without
   worrying about synchronization (unless I have other, non-python threads
   running)

Does anyone know of a place where the GIL is required when not operating on
a python object? It seems like this would never happen, and would make
replacing the GIL somewhat easier.

I've only started looking at the code recently, so please forgive my
naivety. I'm still learning how the interpreter works on a high level, let
alone all the nitty gritty details!

Thanks,
Justin

From martin at v.loewis.de Tue Sep 11 10:33:17 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 11 Sep 2007 10:33:17 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
Message-ID: <46E652CD.1070901@v.loewis.de>

> 1. Some global interpreter state/modules are protected (where are these
> globals at?)

It's the interpreter and thread state itself (pystate.h), for the thread
state, also _PyThreadState_Current. Then there is the GC state, in
particular "generations". There are various caches and counters also.

> 2. When writing C extensions I can change the state of my python object
> without worrying about synchronization
> 3. When writing C extensions I can change my own internal C state
> without worrying about synchronization (unless I have other, non-python
> threads running)

4. The builtin container types are protected by the GIL, and various
   other builtin objects
5.
   Reference counting is protected by the GIL
6. PyMalloc is protected by the GIL.

> Does anyone know of a place where the GIL is required when not operating
> on a python object?

See 6 above, also (obviously) 1.

> I've only started looking at the code recently, so please forgive my
> naivety. I'm still learning how the interpreter works on a high level,
> let alone all the nitty gritty details!

Good luck!

Martin

From thomas at python.org Tue Sep 11 11:33:22 2007
From: thomas at python.org (Thomas Wouters)
Date: Tue, 11 Sep 2007 11:33:22 +0200
Subject: [Python-Dev] Compiling Python 2.5 and settinf UCS2 flag
In-Reply-To: <43c8685c0709102341o1f9323f5hc5525460cbd8b12a@mail.gmail.com>
References: <43c8685c0709102341o1f9323f5hc5525460cbd8b12a@mail.gmail.com>
Message-ID: <9e804ac0709110233s3bba472ctc5e5f57044103215@mail.gmail.com>

On 9/11/07, Tennessee Leeuwenburg wrote:
>
> Hi all,
>
> I have an unusual use case in which some software I work on compiles a
> version of Python for distribution. I'm not 100% across this as it's not at
> all my area of responsibility, but I have been having some issues lately.
>
> My hand-compiled version of Python 2.5 works just fine, and in turn uses a
> hand-compiled Tcl/Tk with threading disabled.
>
> The system then re-compiles Python2.5 for its own purposes. At this point,
> it appears to ignore some of the options originally set using configure for
> Python.
>
> I have enough knowledge/control over the system to pass in some additional
> compiler flags. I would like to try to force some behaviour normally set as
> a flag to the configure script.
>
> Is there a C compiler flag I can use to force the use of UCS2 unicode?

This isn't really a python-dev question, more a python-list one.
Python dev is for the development of Python, not with Python or your
system ;)

The choice between UCS2 and UCS4 is made by configure, based on two things:
what you pass with the --enable-unicode argument (if anything), and what
the version of Tcl you're linking against seems to use. Tcl's choice is
only used if no explicit choice is given. configure also determines the
proper type for the actual UCS2/UCS4 data -- wchar_t, unsigned short or
unsigned long. Both choices are stored in pyconfig.h as is usual for
configure. You can't override them with C compiler flags, but you can edit
pyconfig.h if you can't change the configure flags. Keep in mind that you
need to change both of those (you probably want to just diff the two
pyconfig.h's to see what else is different.)

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From matt at pollenation.net Tue Sep 11 13:59:51 2007
From: matt at pollenation.net (Matt Goodall)
Date: Tue, 11 Sep 2007 12:59:51 +0100
Subject: [Python-Dev] which SSL client protocols work with which server protocols?
In-Reply-To: <07Sep10.114436pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep8.095142pdt."57996"@synergy1.parc.xerox.com> <07Sep10.103100pdt."57996"@synergy1.parc.xerox.com> <07Sep10.114436pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <46E68337.5050307@pollenation.net>

Bill Janssen wrote:
> Here's the updated connection table:
>
>          SSL2   SSL3   SSL23  TLS1
> SSL2     yes    no     yes    no
> SSL3     yes    yes    yes    no
> SSL23    yes    no     yes    no
> TLS1     no     no     yes    yes
>
> Given this, I think the client-side default should be changed from
> SSLv23 to SSLv3, and the server-side default should be SSLv23.

I believe you are correct.
I did some experiments with this a while ago after hitting problems
connecting to some SSL servers although I can't remember the exact
results now.

More importantly, what you recommend is what Twisted does and I'd
believe them more than me any time ;-). See Twisted's
DefaultOpenSSLContextFactory [1] for the server side and
ClientContextFactory [2] for the client side.

Cheers, Matt

[1] DefaultOpenSSLContextFactory,
http://twistedmatrix.com/trac/browser/trunk/twisted/internet/ssl.py#L67
[2] ClientContextFactory,
http://twistedmatrix.com/trac/browser/trunk/twisted/internet/ssl.py#L102

--
Matt Goodall, Pollenation Internet Ltd
Technology House, 237 Lidgett Lane, Leeds LS17 6QR
Registered No 4382123
A member of the Brunswick MCL Group of Companies

w: http://www.pollenation.net/
e: matt at pollenation.net
t: +44 113 2252500

From aahz at pythoncraft.com Tue Sep 11 15:12:54 2007
From: aahz at pythoncraft.com (Aahz)
Date: Tue, 11 Sep 2007 06:12:54 -0700
Subject: [Python-Dev] testing in a Python --without-threads build
In-Reply-To: <46E62AF4.4040207@v.loewis.de>
References: <07Sep8.115742pdt."57996"@synergy1.parc.xerox.com> <07Sep8.121928pdt."57996"@synergy1.parc.xerox.com> <46E2FAC4.2030900@v.loewis.de> <20070911030146.GA28351@panix.com> <46E62AF4.4040207@v.loewis.de>
Message-ID: <20070911131254.GA21952@panix.com>

On Tue, Sep 11, 2007, "Martin v. Löwis" wrote:
>
>>> No. IIUC, "expected skips" are a platform property. For your platform,
>>> support for threads is expected (whatever your platform is as long as
>>> it was built in this millennium).
>>
>> Really? I thought NetBSD was still iffy WRT threading.
>
> Ah, right. Still, it seems that people expect that thread support is
> available on NetBSD. The list of expected skips does not mention
> test_thread for 'netbsd3' (it only does so for 'sco_sv3' and 'riscos')

I'm assuming that's because NetBSD has threads, they just don't work.
So we don't want that to put NetBSD on the list of expected skips so that we find out when threads do work. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "Many customs in this life persist because they ease friction and promote productivity as a result of universal agreement, and whether they are precisely the optimal choices is much less important." --Henry Spencer http://www.lysator.liu.se/c/ten-commandments.html From aahz at pythoncraft.com Tue Sep 11 15:18:36 2007 From: aahz at pythoncraft.com (Aahz) Date: Tue, 11 Sep 2007 06:18:36 -0700 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> Message-ID: <20070911131836.GB21952@panix.com> On Tue, Sep 11, 2007, Justin Tulloss wrote: > > I had a whole long email about exactly what I was doing, but I think > I'll get to the point instead. I'm trying to implement a python > concurrency API and would like to use cpython to do it. To do that, I > would like to remove the GIL. You should review the work Greg Stein did to remove the GIL in 1.5.2; although the interpreter core has changed considerably since then, I believe the basic principles of the GIL are the same. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "Many customs in this life persist because they ease friction and promote productivity as a result of universal agreement, and whether they are precisely the optimal choices is much less important." 
--Henry Spencer http://www.lysator.liu.se/c/ten-commandments.html From ncoghlan at gmail.com Tue Sep 11 16:38:45 2007 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 12 Sep 2007 00:38:45 +1000 Subject: [Python-Dev] Making directories and zip files executable Message-ID: <46E6A875.9020208@gmail.com> The local patch I have for PEP 366 is somewhat stale, and before I bring it up to date with SVN head, I'd like to close out the issue raised a while back regarding making zip files executable [1]. The original proposal was for a new command line switch, but PJE came up with a patch (attached to the roundup tracker item) that uses the existing import machinery to avoid the need for the extra command line switch (by checking if the argument is a valid sys.path entry before checking to see if it is an executable script). I personally like the idea (and PJE's approach), and the performance impact on script startup time appears to be negligible (although I haven't performed any high precision measurements - I'm just using the Linux time utility on a short test script with and without the patch). Are there any objections to my committing this? Cheers, Nick. [1] http://bugs.python.org/issue1739468 -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From thomas at python.org Tue Sep 11 16:54:30 2007 From: thomas at python.org (Thomas Wouters) Date: Tue, 11 Sep 2007 16:54:30 +0200 Subject: [Python-Dev] [issue1056] test_cmd_line starts python without -E In-Reply-To: <1189519439.77.0.401459254933.issue1056@psf.upfronthosting.co.za> References: <1189519439.77.0.401459254933.issue1056@psf.upfronthosting.co.za> Message-ID: <9e804ac0709110754h28912b2cr53863d46d9e2bf9f@mail.gmail.com> On 9/11/07, Nick Coghlan wrote: > (Is the head still being merged to the py3k branch? Or does this need to > be forward-ported manually?) No worries, the trunk is still being merged to py3k. 
I doubt we'll ever stop doing that, unless the trunk becomes py3k and 2.x development is done on a branch (in which case I imagine we'd merge between those two in some way :-) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070911/67b9e740/attachment.htm From tulloss2 at uiuc.edu Tue Sep 11 17:07:34 2007 From: tulloss2 at uiuc.edu (Justin Tulloss) Date: Tue, 11 Sep 2007 10:07:34 -0500 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <46E652CD.1070901@v.loewis.de> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> Message-ID: <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> On 9/11/07, "Martin v. L?wis" wrote: > > > 1. Some global interpreter state/modules are protected (where are these > > globals at?) > > It's the interpreter and thread state itself (pystate.h), for the thread > state, also _PyThreadState_Current. Then there is the GC state, in > particular "generations". There are various caches and counters also. Caches seem like they definitely might be a problem. Would you mind expanding on this a little? What gets cached and why? Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070911/b4c00904/attachment.htm From skip at pobox.com Tue Sep 11 17:21:07 2007 From: skip at pobox.com (skip at pobox.com) Date: Tue, 11 Sep 2007 10:21:07 -0500 Subject: [Python-Dev] Removing the GIL (Me, not you!) 
In-Reply-To: <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
Message-ID: <18150.45667.738340.378354@montanaro.dyndns.org>

    Justin> Caches seem like they definitely might be a problem. Would you
    Justin> mind expanding on this a little? What gets cached and why?

I believe the integer free list falls into this category.

Skip

From martin at v.loewis.de Tue Sep 11 17:50:00 2007
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 11 Sep 2007 17:50:00 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
Message-ID: <46E6B928.1090603@v.loewis.de>

> It's the interpreter and thread state itself (pystate.h), for the thread
> state, also _PyThreadState_Current. Then there is the GC state, in
> particular "generations". There are various caches and counters also.
>
>
> Caches seem like they definitely might be a problem. Would you mind
> expanding on this a little? What gets cached and why?

Depends on the Python version what precisely gets cached. Several types
preserve a pool of preallocated objects, to speed up allocation. Examples
are intobject.c (block_list, free_list), frameobject.c (free_list),
listobject.c (free_list), methodobject.c (free_list), floatobject.c
(block_list, free_list), classobject.c (free_list).

Plus there are tons of variables caching string objects.
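
To make the free-list idea concrete, here is a rough sketch in Python
rather than C. It is not CPython code: the names Frame and FreeList are
hypothetical, and the explicit lock is only an assumption standing in for
the protection the GIL gives the C-level pools mentioned above.

```python
import threading


class Frame:
    """Hypothetical stand-in for an object type that is allocated very often."""

    __slots__ = ("payload",)

    def __init__(self):
        self.payload = None


class FreeList:
    """Keep up to max_size released objects around for reuse."""

    def __init__(self, max_size=8):
        self._free = []
        self._max_size = max_size
        # Assumption: this lock plays the role the GIL plays for the
        # C-level free lists. Without it, two threads could pop or push
        # at the same time and corrupt the pool.
        self._lock = threading.Lock()

    def allocate(self):
        with self._lock:
            if self._free:
                return self._free.pop()   # reuse a cached object
        return Frame()                    # pool empty: really allocate

    def release(self, obj):
        obj.payload = None                # reset state before caching
        with self._lock:
            if len(self._free) < self._max_size:
                self._free.append(obj)    # otherwise the object is dropped


pool = FreeList(max_size=2)
first = pool.allocate()
pool.release(first)
second = pool.allocate()
print(second is first)   # the released object was handed out again
```

If the GIL went away, each such pool would presumably need its own lock
as in this sketch, or would have to become per-thread; which of the two
is cheaper is an open question here, not something the sketch settles.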
From classobject.c alone: getattrstr, setattrstr, delattrs, docstr, modstr, namestr, initstr, delstr, reprstr, strstr, hashstr, eqstr, cmpstr, getitemstr, setitemstr, delitemstr, lenstr, iterstr, nextstr, getslicestr, setslicestr, delslicestr, __contains__, all arguments to UNARY, UNARY_FB, BINARY, BINARY_INPLACE (e.g. instance_neg, instance_or, instance_ior, then cmp_obj, nonzerostr, indexstr. (admittedly, classobject.c is extreme here). There are probably more classes which I just forgot. Regards, Martin From status at bugs.python.org Tue Sep 11 18:22:59 2007 From: status at bugs.python.org (Tracker) Date: Tue, 11 Sep 2007 16:22:59 +0000 (UTC) Subject: [Python-Dev] Summary of Tracker Issues Message-ID: <20070911162259.50E71782C1@psf.upfronthosting.co.za> ACTIVITY SUMMARY (09/04/07 - 09/11/07) Tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue number. Do NOT respond to this message. 1274 open (+31) / 11356 closed (+16) / 12630 total (+47) Average duration of open issues: 673 days. Median duration of open issues: 635 days. Open Issues Breakdown open 1271 (+31) pending 3 ( +0) Issues Created Or Reopened (47) _______________________________ Is there just no PRINT statement any more? 
Or it just doesn't wo 09/06/07 CLOSED http://bugs.python.org/issue1101 reopened loewis Add support for _msi.Record.GetString() and _msi.Record.GetInteg 09/04/07 http://bugs.python.org/issue1102 created atuining Typo in dummy_threading documentation 09/04/07 CLOSED http://bugs.python.org/issue1103 created dthomasset msilib.SummaryInfo.GetProperty() truncates the string by one cha 09/04/07 http://bugs.python.org/issue1104 created atuining patch for readme.txt in PCbuild8 09/05/07 CLOSED http://bugs.python.org/issue1105 created chipped Error in random.shuffle 09/05/07 CLOSED http://bugs.python.org/issue1106 created Viscaynot 2to3, lambda with non-tuple argument inside parenthesis 09/05/07 CLOSED http://bugs.python.org/issue1107 created falsetru Problem with doctest and decorated functions 09/05/07 http://bugs.python.org/issue1108 created danilo Warning required when calling register() on an ABCMeta subclass 09/05/07 http://bugs.python.org/issue1109 created mark Problems with the msi installer - python-3.0a1.msi 09/05/07 http://bugs.python.org/issue1110 created vbr Users' directories information 09/05/07 CLOSED http://bugs.python.org/issue1111 created uzytkownik Test debug assertion in bsddb test_1413192.py 09/05/07 CLOSED http://bugs.python.org/issue1112 created db3l interrupt_main() fails to interrupt raw_input() 09/05/07 CLOSED http://bugs.python.org/issue1113 created anand _curses issues on 64-bit big-endian (e.g, AIX) 09/06/07 http://bugs.python.org/issue1114 created lukemewburn Minor Change For Better cross compile 09/06/07 http://bugs.python.org/issue1115 created zengbo reference in extending doc to non-existing file 09/06/07 CLOSED http://bugs.python.org/issue1116 created anthon Spurious warning about missing _sha256 and _sha512 when not need 09/06/07 CLOSED http://bugs.python.org/issue1117 created dripton hashlib module fails with TypeError 09/06/07 CLOSED http://bugs.python.org/issue1118 created dripton Search index is messed up after partial rebuilding 
09/06/07 CLOSED http://bugs.python.org/issue1119 created lars.gustaebel "make altinstall" installs pydoc, idle, smtpd.py with broken she 09/06/07 http://bugs.python.org/issue1120 created dripton Document inspect.getfullargspec() 09/06/07 http://bugs.python.org/issue1121 created brett.cannon PyTuple_Size and PyTuple_GET_SIZE return type documentation inco 09/07/07 http://bugs.python.org/issue1122 created gaul split(None, maxsplit) does not strip whitespace correctly 09/07/07 http://bugs.python.org/issue1123 created nirs Webchecker not parsing css "@import url" 09/07/07 http://bugs.python.org/issue1124 created ready.eddy bytes.split shold have same interface as str.split, or different 09/07/07 CLOSED http://bugs.python.org/issue1125 created nirs file.fileno and file.isatty() should be implementable by any fil 09/07/07 http://bugs.python.org/issue1126 created nirs No tests for inspect.getfullargspec() 09/07/07 http://bugs.python.org/issue1127 created brett.cannon msilib.Directory.make_short only handles file names with a singl 09/07/07 http://bugs.python.org/issue1128 created atuining OpenSSL detection broken for Python 3.0a1 09/07/07 CLOSED http://bugs.python.org/issue1129 created pythonmeister Idle - Save (buffer) - closes IDLE and does not save file (Windo 09/08/07 http://bugs.python.org/issue1130 created infixum Reference Manual: "for statement" links to "break statement" 09/08/07 http://bugs.python.org/issue1131 created Martoon compile error in poplib.py 09/08/07 CLOSED http://bugs.python.org/issue1132 created andre python3.0-config raises SyntaxError 09/09/07 CLOSED http://bugs.python.org/issue1133 created complex Parsing a simple script eats all of your memory 09/09/07 http://bugs.python.org/issue1134 created complex xview/yview of Tix.Grid is broken 09/09/07 http://bugs.python.org/issue1135 created ocean-city Bdb documentation 09/09/07 http://bugs.python.org/issue1136 created arklad pyexpat patch for changing buffer_size 09/09/07 
http://bugs.python.org/issue1137 created AchimGaedke Fixer needed for __future__ imports 09/09/07 http://bugs.python.org/issue1138 created collinwinter PyFile_Encoding should be PyFile_SetEncoding 09/10/07 http://bugs.python.org/issue1139 created gagenellina re.sub returns str when processing empty unicode string 09/10/07 http://bugs.python.org/issue1140 created beda reading large files 09/10/07 http://bugs.python.org/issue1141 created Richard.Christen at unice.fr code sample showing errors reading large files with py 2.5/3.0 09/10/07 http://bugs.python.org/issue1142 created Richard.Christen at unice.fr Update to latest ElementTree in Python 2.6 09/11/07 http://bugs.python.org/issue1143 created effbot parsermodule validation out of sync with Grammar 09/11/07 http://bugs.python.org/issue1144 created dbinger Allow str.join to join non-string types (as per PEP 3100) 09/11/07 http://bugs.python.org/issue1145 created thomas.lee TextWrap vs words 1-character shorter than the width 09/11/07 http://bugs.python.org/issue1146 created sam string exceptions inconsistently deprecated/disabled 09/11/07 http://bugs.python.org/issue1147 created exarkun Issues Now Closed (43) ______________________ 2to3 crashes on input files with no trailing newlines 14 days http://bugs.python.org/issue1001 collinwinter Backport ABC to 2.6 16 days http://bugs.python.org/issue1026 baranguren test_cmd_line starts python without -E 13 days http://bugs.python.org/issue1056 ncoghlan ssl.py shouldn't change class names from 2.6 to 3.x 11 days http://bugs.python.org/issue1065 janssen Unexpected results in Tutorial about Unicode 2 days http://bugs.python.org/issue1092 georg.brandl TypeError in poplib.py 7 days http://bugs.python.org/issue1094 gvanrossum make install failed 4 days http://bugs.python.org/issue1095 georg.brandl Deeply recursive repr segfault 7 days http://bugs.python.org/issue1096 brett.cannon Is there just no PRINT statement any more? 
Or it just doesn't w 0 days http://bugs.python.org/issue1101 gvanrossum Typo in dummy_threading documentation 0 days http://bugs.python.org/issue1103 dthomasset patch for readme.txt in PCbuild8 0 days http://bugs.python.org/issue1105 loewis Error in random.shuffle 0 days http://bugs.python.org/issue1106 georg.brandl 2to3, lambda with non-tuple argument inside parenthesis 4 days http://bugs.python.org/issue1107 collinwinter Users' directories information 1 days http://bugs.python.org/issue1111 uzytkownik Test debug assertion in bsddb test_1413192.py 1 days http://bugs.python.org/issue1112 gregory.p.smith interrupt_main() fails to interrupt raw_input() 2 days http://bugs.python.org/issue1113 anand reference in extending doc to non-existing file 0 days http://bugs.python.org/issue1116 georg.brandl Spurious warning about missing _sha256 and _sha512 when not nee 0 days http://bugs.python.org/issue1117 skip.montanaro hashlib module fails with TypeError 0 days http://bugs.python.org/issue1118 georg.brandl Search index is messed up after partial rebuilding 0 days http://bugs.python.org/issue1119 georg.brandl bytes.split shold have same interface as str.split, or differen 4 days http://bugs.python.org/issue1125 gvanrossum OpenSSL detection broken for Python 3.0a1 0 days http://bugs.python.org/issue1129 georg.brandl compile error in poplib.py 1 days http://bugs.python.org/issue1132 georg.brandl python3.0-config raises SyntaxError 0 days http://bugs.python.org/issue1133 loewis raw_input defers alarm signal 1666 days http://bugs.python.org/issue685846 georg.brandl support for server side transactions in _ssl 1498 days http://bugs.python.org/issue783188 loewis patch for build with read-only $srcdir 1486 days http://bugs.python.org/issue786737 loewis SSL-ed sockets don't close correct? 
1168 days http://bugs.python.org/issue978833 loewis a bunch of infinite C recursions 844 days http://bugs.python.org/issue1202533 brett.cannon add single html files 833 days http://bugs.python.org/issue1209562 georg.brandl crash recursive __getattr__ 744 days http://bugs.python.org/issue1267884 brett.cannon Traceback error when compiling Regex 537 days http://bugs.python.org/issue1456280 brett.cannon winerror module 450 days http://bugs.python.org/issue1505257 loewis SSL "issuer" and "server" names cannot be parsed 321 days http://bugs.python.org/issue1583946 janssen Suggest a textlist() method for ElementTree 291 days http://bugs.python.org/issue1602189 effbot sys.intern() 2to3 fixer 261 days http://bugs.python.org/issue1619049 collinwinter Explain __method__ lookup semantics for new-style classes 168 days http://bugs.python.org/issue1684991 georg.brandl socket.error exceptions not subclass of StandardError 138 days http://bugs.python.org/issue1706815 gregory.p.smith imp.find_module doc ambiguity 134 days http://bugs.python.org/issue1708326 georg.brandl _lsprof.c:ptrace_enter_call assumes PyErr_* is clean 89 days http://bugs.python.org/issue1733973 arigo expanduser("~") on Windows looks for HOME first 62 days http://bugs.python.org/issue1749583 georg.brandl os.chmod failure 35 days http://bugs.python.org/issue1767242 georg.brandl Binding fails 26 days http://bugs.python.org/issue1774736 loewis Top Issues Most Discussed (10) ______________________________ 11 re.sub returns str when processing empty unicode string 1 days open http://bugs.python.org/issue1140 11 bytes.split shold have same interface as str.split, or differen 4 days closed http://bugs.python.org/issue1125 8 split(None, maxsplit) does not strip whitespace correctly 5 days open http://bugs.python.org/issue1123 7 Spurious warning about missing _sha256 and _sha512 when not nee 0 days closed http://bugs.python.org/issue1117 7 make install failed 4 days closed http://bugs.python.org/issue1095 6 reading 
large files 1 days open http://bugs.python.org/issue1141 5 code sample showing errors reading large files with py 2.5/3.0 1 days open http://bugs.python.org/issue1142 5 Parsing a simple script eats all of your memory 3 days open http://bugs.python.org/issue1134 4 logging: delay_fh option and configuration kwargs 41 days open http://bugs.python.org/issue1765140 4 interrupt_main() fails to interrupt raw_input() 2 days closed http://bugs.python.org/issue1113 From guido at python.org Tue Sep 11 19:48:57 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 11 Sep 2007 10:48:57 -0700 Subject: [Python-Dev] Making directories and zip files executable In-Reply-To: <46E6A875.9020208@gmail.com> References: <46E6A875.9020208@gmail.com> Message-ID: I could use a refresher on how PJE's patch solves Andy's problem. On 9/11/07, Nick Coghlan wrote: > The local patch I have for PEP 366 is somewhat stale, and before I bring > it up to date with SVN head, I'd like to close out the issue raised a > while back regarding making zip files executable [1]. > > The original proposal was for a new command line switch, but PJE came up > with a patch (attached to the roundup tracker item) that uses the > existing import machinery to avoid the need for the extra command line > switch (by checking if the argument is a valid sys.path entry before > checking to see if it is an executable script). > > I personally like the idea (and PJE's approach), and the performance > impact on script startup time appears to be negligible (although I > haven't performed any high precision measurements - I'm just using the > Linux time utility on a short test script with and without the patch). > > Are there any objections to my committing this? > > Cheers, > Nick. 
> > [1] http://bugs.python.org/issue1739468 > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > --------------------------------------------------------------- > http://www.boredomandlaziness.org > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From janssen at parc.com Tue Sep 11 20:12:09 2007 From: janssen at parc.com (Bill Janssen) Date: Tue, 11 Sep 2007 11:12:09 PDT Subject: [Python-Dev] adding a "test" fork to a setup.py package Message-ID: <07Sep11.111218pdt."57996"@synergy1.parc.xerox.com> I'm packaging up the SSL support for Python 2.3, and I'd like to be able to include the unit test for it along with the package. Ideally, I'd like to be able to say % python setup.py build % python setup.py test and have the regrtest.py module run my tests. Any ideas (examples) of how to do that? Bill From anomyo2 at gmail.com Tue Sep 11 20:33:00 2007 From: anomyo2 at gmail.com (=?iso-8859-1?Q?Carlos_Mart=EDnez?=) Date: Tue, 11 Sep 2007 20:33:00 +0200 Subject: [Python-Dev] Compiler Python Message-ID: <000001c7f4a2$307fa720$917ef560$@com> Hello Someone knows since as I can obtain the information detailed about the compiler of Python? (Table of tokens, lists of productions of the syntactic one , semantic restrictions...) Thanks. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20070911/22f26470/attachment.htm

From barry at python.org Tue Sep 11 20:38:44 2007
From: barry at python.org (Barry Warsaw)
Date: Tue, 11 Sep 2007 14:38:44 -0400
Subject: [Python-Dev] adding a "test" fork to a setup.py package
In-Reply-To: <07Sep11.111218pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep11.111218pdt."57996"@synergy1.parc.xerox.com>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 11, 2007, at 2:12 PM, Bill Janssen wrote:

> I'm packaging up the SSL support for Python 2.3, and I'd like to be
> able to include the unit test for it along with the package. Ideally,
> I'd like to be able to say
>
> % python setup.py build
> % python setup.py test
>
> and have the regrtest.py module run my tests. Any ideas (examples) of
> how to do that?

The email package does something like this by having most of the
tests in a subpackage of email, with a shim in Lib/test for regrtest.
The standalone package has a testall script, but that should really
be converted to nosetests or some such.

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRubgtXEjvBPtnXfVAQIGJgP/aeCOlybJj0sA3k6WOWMCOhugggLHHtO2
lu5v7hZZG5nqe5iApwxjbiylxvMRfRB6HS7dgEABx1D5OC3uFssn3kUzokfBtsxy
I/e4qYiTSCG3WZacqytAqmjKt3FkceIo+l6YKx29FjPlaoHHz0UzCJIdW9AuJp4a
Ussk9AOPIXo=
=FMDE
-----END PGP SIGNATURE-----

From tjreedy at udel.edu Tue Sep 11 21:01:33 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 11 Sep 2007 15:01:33 -0400
Subject: [Python-Dev] Compiler Python
References: <000001c7f4a2$307fa720$917ef560$@com>
Message-ID:

"Carlos Martínez" wrote in message
news:000001c7f4a2$307fa720$917ef560$@com...
| Someone knows since as I can obtain the information detailed about the
| compiler of Python? (Table of tokens, lists of productions of the syntactic
| one , semantic restrictions...)

Ask this sort of question on the python-list mailing list or the
comp.lang.python or gmane.comp.python.general newsgroups.
And check the source code.

From theller at ctypes.org Tue Sep 11 21:04:31 2007
From: theller at ctypes.org (Thomas Heller)
Date: Tue, 11 Sep 2007 21:04:31 +0200
Subject: [Python-Dev] adding a "test" fork to a setup.py package
In-Reply-To:
References: <07Sep11.111218pdt."57996"@synergy1.parc.xerox.com>
Message-ID:

Barry Warsaw wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On Sep 11, 2007, at 2:12 PM, Bill Janssen wrote:
>
>> I'm packaging up the SSL support for Python 2.3, and I'd like to be
>> able to include the unit test for it along with the package. Ideally,
>> I'd like to be able to say
>>
>> % python setup.py build
>> % python setup.py test
>>
>> and have the regrtest.py module run my tests. Any ideas (examples) of
>> how to do that?
>
> The email package does something like this by having most of the
> tests in a subpackage of email, with a shim in Lib/test for regrtest.
> The standalone package has a testall script, but that should really
> be converted to nosetests or some such.

ctypes does it in a similar way. Tests are in the Lib/ctypes/test package.

Thomas

From pje at telecommunity.com Tue Sep 11 21:19:46 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 11 Sep 2007 15:19:46 -0400
Subject: [Python-Dev] Making directories and zip files executable
In-Reply-To:
References: <46E6A875.9020208@gmail.com>
Message-ID: <20070911191716.06F913A40D7@sparrow.telecommunity.com>

At 10:48 AM 9/11/2007 -0700, Guido van Rossum wrote:
>I could use a refresher on how PJE's patch solves Andy's problem.

It does the same thing, but with __main__ instead of __zipmain__, and
without needing the -z. So instead of "python -z foo.zip" you can
just do "python foo.zip". This means you can use a reasonably
cross-platform #! header that invokes Python via "env".
If you tried to use -z with env, Andy's approach would either only work
on BSDish platforms that support multiple options to a #! command, or
else you'd have to ditch the "env" and hardcode the path to Python. So
not needing a command-line option improves the effective
portability/usability of the executable zip file.

Being able to run "python directory_to_be_zipped_later" also lets you
test/develop your program without it needing to be zipped first.

>On 9/11/07, Nick Coghlan wrote:
> > The local patch I have for PEP 366 is somewhat stale, and before I bring
> > it up to date with SVN head, I'd like to close out the issue raised a
> > while back regarding making zip files executable [1].
> >
> > The original proposal was for a new command line switch, but PJE came up
> > with a patch (attached to the roundup tracker item) that uses the
> > existing import machinery to avoid the need for the extra command line
> > switch (by checking if the argument is a valid sys.path entry before
> > checking to see if it is an executable script).
> >
> > I personally like the idea (and PJE's approach), and the performance
> > impact on script startup time appears to be negligible (although I
> > haven't performed any high precision measurements - I'm just using the
> > Linux time utility on a short test script with and without the patch).
> >
> > Are there any objections to my committing this?
> >
> > Cheers,
> > Nick.
> >
> > [1] http://bugs.python.org/issue1739468

From pje at telecommunity.com Tue Sep 11 21:23:04 2007
From: pje at telecommunity.com (Phillip J. Eby)
Date: Tue, 11 Sep 2007 15:23:04 -0400
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
 <46E652CD.1070901@v.loewis.de>
 <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
Message-ID: <20070911192035.94CE33A40D7@sparrow.telecommunity.com>

At 10:07 AM 9/11/2007 -0500, Justin Tulloss wrote:
>On 9/11/07, "Martin v. Löwis"
><martin at v.loewis.de> wrote:
> > 1. Some global interpreter state/modules are protected (where are these
> > globals at?)
>
>It's the interpreter and thread state itself (pystate.h), for the thread
>state, also _PyThreadState_Current. Then there is the GC state, in
>particular "generations". There are various caches and counters also.
>
>
>Caches seem like they definitely might be a problem. Would you mind
>expanding on this a little? What gets cached and why?

It's not just caches and counters. It's also every built-in type
structure, builtin module, builtin function... any Python object that's
a built-in, period. That includes things like None, True, and False.

Caches would include such things as the pre-created integers -100
through 255, the 1-byte character strings for chr(0)-chr(255), and the
interned strings cache, to name a few.

Most of these things I've mentioned are truly global, and not specific
to an individual interpreter.

From brett at python.org Tue Sep 11 21:30:40 2007
From: brett at python.org (Brett Cannon)
Date: Tue, 11 Sep 2007 12:30:40 -0700
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E6B928.1090603@v.loewis.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
 <46E652CD.1070901@v.loewis.de>
 <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
 <46E6B928.1090603@v.loewis.de>
Message-ID:

On 9/11/07, "Martin v. Löwis" wrote:
> > It's the interpreter and thread state itself (pystate.h), for the thread
> > state, also _PyThreadState_Current. Then there is the GC state, in
> > particular "generations".
There are various caches and counters also.
> >
> >
> > Caches seem like they definitely might be a problem. Would you mind
> > expanding on this a little? What gets cached and why?
>
> Depends on the Python version what precisely gets cached. Several types
> preserve a pool of preallocated objects, to speed up allocation.
> Examples are intobject.c (block_list, free_list), frameobject.c
> (free_list), listobject.c (free_list), methodobject.c (free_list),
> floatobject.c (block_list, free_list), classobject.c (free_list).
>
> Plus there are tons of variables caching string objects. From
> classobject.c alone: getattrstr, setattrstr, delattrstr, docstr,
> modstr, namestr, initstr, delstr, reprstr, strstr, hashstr, eqstr,
> cmpstr, getitemstr, setitemstr, delitemstr, lenstr, iterstr, nextstr,
> getslicestr, setslicestr, delslicestr, __contains__, all arguments
> to UNARY, UNARY_FB, BINARY, BINARY_INPLACE (e.g. instance_neg,
> instance_or, instance_ior), then cmp_obj, nonzerostr, indexstr.
> (admittedly, classobject.c is extreme here).
>
> There are probably more classes which I just forgot.

We should probably document where all of these globals lists are
instead of relying on looking for all file level static declarations
or something. Or would there be benefit to moving things like this to
the interpreter struct so that threads within a single interpreter
call are locked but interpreters can act much more independently?

-Brett

From janssen at parc.com Tue Sep 11 22:02:36 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 11 Sep 2007 13:02:36 PDT
Subject: [Python-Dev] adding a "test" fork to a setup.py package
In-Reply-To:
References: <07Sep11.111218pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <07Sep11.130243pdt."57996"@synergy1.parc.xerox.com>

It's actually not bad.
I put the test code and the data files in a "test"
subdirectory of my distribution, then added the following to the
setup.py file:

    class Test (Command):

        user_options = []

        def initialize_options(self):
            pass

        def finalize_options(self):
            pass

        def run (self):

            """Run the regrtest module appropriately"""

            # figure out where the _ssl2 extension will be put
            b = build(self.distribution)
            b.initialize_options()
            b.finalize_options()
            extdir = os.path.abspath(b.build_platlib)

            # now set up the load path
            topdir = os.path.dirname(os.path.abspath(__file__))
            localtestdir = os.path.join(topdir, "test")
            sys.path.insert(0, topdir)        # for ssl package
            sys.path.insert(0, localtestdir)  # for test module
            sys.path.insert(0, extdir)        # for _ssl2 extension

            # make sure the network is enabled
            import test.test_support
            test.test_support.use_resources = ["network"]

            # and load the test and run it
            os.chdir(localtestdir)
            the_module = __import__("test_ssl", globals(), locals(), [])
            # Most tests run to completion simply as a side-effect of
            # being imported. For the benefit of tests that can't run
            # that way (like test_threaded_import), explicitly invoke
            # their test_main() function (if it exists).
            indirect_test = getattr(the_module, "test_main", None)
            if indirect_test is not None:
                indirect_test()

and added

    cmdclass={'test': Test},

to the setup call.

Irritating that you have to manually install the test files as
data_files. Also irritating that data_files aren't automatically added
to the manifest, and that Test has to have null initialize_options and
finalize_options. And that there's no easy way to figure out where the
build process left the extension.

Bill

From janssen at parc.com Tue Sep 11 22:36:31 2007
From: janssen at parc.com (Bill Janssen)
Date: Tue, 11 Sep 2007 13:36:31 PDT
Subject: [Python-Dev] re-using the Python setup.py file?
Message-ID: <07Sep11.133638pdt."57996"@synergy1.parc.xerox.com> I see that the setup.py at the top level of the Python distribution does a lot of things wrt sensing compiler options, etc, that I'd like to re-use in my SSL setup.py distribution file. I'm a bit curious as to why this framework isn't in the distutils package? Bill From martin at v.loewis.de Tue Sep 11 22:53:01 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 11 Sep 2007 22:53:01 +0200 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <20070911192035.94CE33A40D7@sparrow.telecommunity.com> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> Message-ID: <46E7002D.6050005@v.loewis.de> > It's not just caches and counters. It's also every built-in type > structure, builtin module, builtin function... any Python object that's > a built-in, period. That includes things like None, True, and False. Sure - but those things don't get modified that often, except for their reference count. In addition, they are objects, and Justin seems to believe that things are easier if they are objects. Regards, Martin From foom at fuhm.net Tue Sep 11 22:54:58 2007 From: foom at fuhm.net (James Y Knight) Date: Tue, 11 Sep 2007 16:54:58 -0400 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <46E6B928.1090603@v.loewis.de> Message-ID: On Sep 11, 2007, at 3:30 PM, Brett Cannon wrote: > We should probably document where all of these globals lists are > instead of relying on looking for all file level static declarations > or something. 
Or would there be benefit to moving things like this to > the interpreter struct so that threads within a single interpreter > call are locked but interpreters can act much more independently? This would be nice. It would be really nice if python was embeddable more like TCL: separate interpreters really are separate, and don't share state. That means basically everything has to be stored in an interp-specific data structure, not in static variables. But this has been raised before, and was rejected as not worth the amount of work that would be required to achieve it. (it's certainly not worth it enough for *me* to do the work, so I can't blame anyone else for making the same determination.) James From martin at v.loewis.de Tue Sep 11 23:00:34 2007 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Tue, 11 Sep 2007 23:00:34 +0200 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <46E6B928.1090603@v.loewis.de> Message-ID: <46E701F2.3060606@v.loewis.de> > We should probably document where all of these globals lists are > instead of relying on looking for all file level static declarations > or something. I'm not sure what would be gained here, except for people occasionally (i.e. every three years) asking how they can best get rid of the GIL. > Or would there be benefit to moving things like this to > the interpreter struct so that threads within a single interpreter > call are locked but interpreters can act much more independently? The "multiple interpreter" feature doesn't quite work, and likely won't for a foreseeable future; specifically, objects can easily leak across interpreters. That's actually not a problem for immutable objects, like the strings, but here come the global objects into play which PJE mentions: types, including exceptions. 
Making them per-interpreter would probably break quite some code. As for the cached strings - it would be easy to make a global table of these, e.g. calling them _PyS__init__, _PyS__add__, and so on. These could be initialized at startup, simplifying the code that uses them because they don't have to worry about failures. Regards, Martin From janssen at parc.com Tue Sep 11 23:19:36 2007 From: janssen at parc.com (Bill Janssen) Date: Tue, 11 Sep 2007 14:19:36 PDT Subject: [Python-Dev] SSL package for Python 2.3 to 2.5 Message-ID: <07Sep11.141943pdt."57996"@synergy1.parc.xerox.com> I've put up an initial source package at http://www.parc.com/janssen/transient/ssl-1.0.tar.gz which I've tested with Python 2.3.5 on Mac OS X 10.4.10 (Intel) and Python 2.5 on Fedora Core 7. Please send bugs you find to me at janssen at parc.com. Try "python setup.py build", then "python setup.py test". Bill From greg at krypto.org Tue Sep 11 23:43:45 2007 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 11 Sep 2007 14:43:45 -0700 Subject: [Python-Dev] re-using the Python setup.py file? In-Reply-To: <-148172167956746104@unknownmsgid> References: <-148172167956746104@unknownmsgid> Message-ID: <52dc1c820709111443x54837f66i6f20b4f581962eff@mail.gmail.com> On 9/11/07, Bill Janssen wrote: > > I see that the setup.py at the top level of the Python distribution > does a lot of things wrt sensing compiler options, etc, that I'd like > to re-use in my SSL setup.py distribution file. I'm a bit curious > as to why this framework isn't in the distutils package? I suspect a combo of (a) nobody has done it yet and (b) many of the things done there felt too hackish to the person writing them. Regardless of (b) I'd place my money on (a). In maintaining external bsddb and hashlib module distributions for use on older pythons I have so far just pasted code as appropriate to/from the python setup.py and the separate distribution ones. 
Not ideal, but trivial, since once settled upon, the setup didn't change much.

From greg.ewing at canterbury.ac.nz Wed Sep 12 00:55:20 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 12 Sep 2007 10:55:20 +1200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <20070911192035.94CE33A40D7@sparrow.telecommunity.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
 <46E652CD.1070901@v.loewis.de>
 <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
 <20070911192035.94CE33A40D7@sparrow.telecommunity.com>
Message-ID: <46E71CD8.5070608@canterbury.ac.nz>

Phillip J. Eby wrote:
> It's also every built-in type
> structure, builtin module, builtin function... any Python object
> that's a built-in, period.

Where "built-in" in this context means anything implemented
in C (i.e. it includes extension modules).

--
Greg

From greg.ewing at canterbury.ac.nz Wed Sep 12 01:20:31 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 12 Sep 2007 11:20:31 +1200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E7002D.6050005@v.loewis.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
 <46E652CD.1070901@v.loewis.de>
 <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com>
 <20070911192035.94CE33A40D7@sparrow.telecommunity.com>
 <46E7002D.6050005@v.loewis.de>
Message-ID: <46E722BF.8000807@canterbury.ac.nz>

Martin v. Löwis wrote:
> Sure - but those things don't get modified that often, except for their
> reference count.

The reference count is the killer, though -- you have
to lock the object even to do that. And it happens
a LOT, to all objects, including immutable ones.

-- Greg From janssen at parc.com Wed Sep 12 02:30:01 2007 From: janssen at parc.com (Bill Janssen) Date: Tue, 11 Sep 2007 17:30:01 PDT Subject: [Python-Dev] what versions of Python don't have the "addr" field in the socket object? Message-ID: <07Sep11.173003pdt."57996"@synergy1.parc.xerox.com> Looks like this change is bothering the error returns from my backported SSL code. I believe this is only in 2.5.1 and later -- can anyone confirm that? Bill ------------------------------------------------------------ r52906 | martin.v.loewis | 2006-12-03 03:23:45 -0800 (Sun, 03 Dec 2006) | 4 lines Patch #1544279: Improve thread-safety of the socket module by moving the sock_addr_t storage out of the socket object. Will backport to 2.5. From janssen at parc.com Wed Sep 12 03:02:54 2007 From: janssen at parc.com (Bill Janssen) Date: Tue, 11 Sep 2007 18:02:54 PDT Subject: [Python-Dev] SSL package for Python 2.3 to 2.5 In-Reply-To: <07Sep11.141943pdt."57996"@synergy1.parc.xerox.com> References: <07Sep11.141943pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep11.180303pdt."57996"@synergy1.parc.xerox.com> > I've put up an initial source package at > http://www.parc.com/janssen/transient/ssl-1.0.tar.gz which I've tested > with Python 2.3.5 on Mac OS X 10.4.10 (Intel) and Python 2.5 on Fedora > Core 7. Please send bugs you find to me at janssen at parc.com. > > Try "python setup.py build", then "python setup.py test". There was a bug for 2.5.1 in the package (the socket data structure changed with 2.5.1), so I've put up a different version, http://www.parc.com/janssen/transient/ssl-1.1.tar.gz which I've tested with 2.3.5 and 2.5.1 on OS X. Same drill: try "python setup.py build", then "python setup.py test". Report bugs to janssen at parc.com. Thanks to Collin Winter for reporting this problem with 2.5.1. Bill From aahz at pythoncraft.com Wed Sep 12 03:34:12 2007 From: aahz at pythoncraft.com (Aahz) Date: Tue, 11 Sep 2007 18:34:12 -0700 Subject: [Python-Dev] frozenset C API? 
In-Reply-To: <07Sep6.102547pdt."57996"@synergy1.parc.xerox.com> References: <-4762611594645938717@unknownmsgid> <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> <07Sep6.102547pdt."57996"@synergy1.parc.xerox.com> Message-ID: <20070912013412.GB14034@panix.com> On Thu, Sep 06, 2007, Bill Janssen wrote: > > By the way, I think the hostname matching provisions of 2818 (which > is, after all, only an informational RFC, not a standard) are poorly > thought out. Many machines have more hostnames than you can shake a > stick at, and often provide certs with the wrong hostname in them > (usually because they have no way to determine what the *right* > hostname is, from inside that machine). ...which is why you pretty much need to have a canonical hostname mapped to each IP you're using on a machine. Basically, you need to map the hostname you intend to use to an IP, then do reverse-DNS to find out whether the hostname is in fact the canonical hostname. If not, you're using the wrong hostname on your cert. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "Many customs in this life persist because they ease friction and promote productivity as a result of universal agreement, and whether they are precisely the optimal choices is much less important." --Henry Spencer http://www.lysator.liu.se/c/ten-commandments.html From martin at v.loewis.de Wed Sep 12 09:30:10 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 12 Sep 2007 09:30:10 +0200 Subject: [Python-Dev] Removing the GIL (Me, not you!) 
In-Reply-To: References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <46E6B928.1090603@v.loewis.de> Message-ID: <46E79582.2080300@v.loewis.de> > But this has been raised before, and was rejected as not worth the > amount of work that would be required to achieve it. In my understanding, there is an important difference between "it was rejected", and "it was not done". Regards, Martin From martin at v.loewis.de Wed Sep 12 09:32:13 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 12 Sep 2007 09:32:13 +0200 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <46E722BF.8000807@canterbury.ac.nz> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> Message-ID: <46E795FD.1070103@v.loewis.de> >> Sure - but those things don't get modified that often, except for their >> reference count. > > The reference count is the killer, though -- you have > to lock the object even to do that. And it happens > a LOT, to all objects, including immutable ones. Now we are getting into details: you do NOT have to lock an object to modify its reference count. An atomic increment/decrement operation is enough. Regards, Martin From martin at v.loewis.de Wed Sep 12 09:36:30 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 12 Sep 2007 09:36:30 +0200 Subject: [Python-Dev] what versions of Python don't have the "addr" field in the socket object? 
In-Reply-To: <07Sep11.173003pdt."57996"@synergy1.parc.xerox.com> References: <07Sep11.173003pdt."57996"@synergy1.parc.xerox.com> Message-ID: <46E796FE.3060409@v.loewis.de> > I believe this is only in 2.5.1 and later -- can > anyone confirm that? That's correct. Martin From ncoghlan at gmail.com Wed Sep 12 11:19:35 2007 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 12 Sep 2007 19:19:35 +1000 Subject: [Python-Dev] Making directories and zip files executable In-Reply-To: References: <46E6A875.9020208@gmail.com> Message-ID: <46E7AF27.6090300@gmail.com> Guido van Rossum wrote: > I could use a refresher on how PJE's patch solves Andy's problem. I'm not sure if you're asking about how you would execute a zip file after the patch has been applied, or about the mechanics of how the patch works. PJE's last post covered the former question, so I'll cover the gist of the latter. The patch works by passing the script argument to the import machinery to see if it is recognised as a valid sys.path entry (i.e. either a directory or a zip file in a default Python installation). If it is, then add that location to the front of sys.path and use the -m switch support to execute the "__main__" module directly. If the filename passed in isn't recognised as a sys.path entry, then it is executed as a script as normal. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From jason.orendorff at gmail.com Wed Sep 12 16:47:33 2007 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Wed, 12 Sep 2007 10:47:33 -0400 Subject: [Python-Dev] Removing the GIL (Me, not you!) 
In-Reply-To: <46E795FD.1070103@v.loewis.de> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> Message-ID: On 9/12/07, "Martin v. Löwis" wrote: > Now we are getting into details: you do NOT have to lock > an object to modify its reference count. An atomic > increment/decrement operation is enough. One could measure the performance hit incurred by using atomic operations for refcounting by hacking a few macros -- right? Deferred reference counting (DRC for short) might help... http://www.memorymanagement.org/glossary/d.html#deferred.reference.counting I can explain a little more how this works if anyone's interested. DRC basically eliminates reference counting for locals--that is, pointers from the stack to an object. An object becomes refcounted only when some other object gets a pointer to it. The drawback is that destructors aren't called quite as promptly as in true refcounting. (They're still called in the right order, though--barring cycles, an object's destructor is called before its children's destructors.) What counts as "stack" is up to the implementation; typically it means "the C stack". This could be used to eliminate most refcounting in C code, although listobject.c would keep it. The amount of per-platform assembly code needed is surprisingly small (and won't change, once you've written it--the Tamarin JavaScript VM does this). You could go further and treat the Python f_locals and interpreter stack as "stack". I think this would eliminate all refcounting in the interpreter. Of course, it complicates matters that f_locals is actually an object visible from Python.
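A toy model may make the bookkeeping concrete. The sketch below is plain Python, not anything resembling CPython's C internals, and every name in it (Obj, Heap, zct, collect) is made up for illustration. It assumes the classic scheme: only pointers stored in other objects are counted, objects whose count is zero sit in a "zero-count table", and a periodic scan of the stack decides which of those are really dead.

```python
class Obj:
    """A heap object; heap_refs counts only pointers held by other objects."""

    def __init__(self, name):
        self.name = name
        self.heap_refs = 0      # references from the stack are NOT counted
        self.children = []
        self.freed = False


class Heap:
    """Hypothetical allocator illustrating deferred reference counting."""

    def __init__(self):
        self.zct = set()        # zero-count table: objects with heap_refs == 0
        self.freed_order = []   # names, in the order "destructors" ran

    def new(self, name):
        obj = Obj(name)
        self.zct.add(obj)       # nothing on the heap points at it yet
        return obj

    def store(self, parent, child):
        """parent gains a pointer to child; this is the only counted case."""
        parent.children.append(child)
        child.heap_refs += 1
        self.zct.discard(child)

    def collect(self, stack_roots):
        """Free zero-count objects that the stack does not point at."""
        roots = set(stack_roots)
        pending = [obj for obj in self.zct if obj not in roots]
        while pending:
            obj = pending.pop()
            if obj.freed:
                continue
            obj.freed = True
            self.zct.discard(obj)
            self.freed_order.append(obj.name)   # parent before its children
            for child in obj.children:
                child.heap_refs -= 1
                if child.heap_refs == 0:
                    if child in roots:
                        self.zct.add(child)     # still alive via the stack
                    else:
                        pending.append(child)
            obj.children = []


heap = Heap()
a = heap.new("a")
b = heap.new("b")
heap.store(a, b)                    # b is now refcounted, a is not
heap.collect(stack_roots=[a])       # a is still "on the stack"
survivors = list(heap.freed_order)
heap.collect(stack_roots=[])        # the stack no longer points at a
print(survivors, heap.freed_order)  # prints [] ['a', 'b']
```

Note that creating a and passing it around costs no count updates at all; the price is the stack scan in collect(), which is where the delayed destructor calls come from.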
Just a thought, not a demand, please don't flame me, -j From skip at pobox.com Wed Sep 12 17:31:59 2007 From: skip at pobox.com (skip at pobox.com) Date: Wed, 12 Sep 2007 10:31:59 -0500 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <46E6B928.1090603@v.loewis.de> Message-ID: <18152.1647.855781.953782@montanaro.dyndns.org> Brett> We should probably document where all of these globals lists are Brett> instead of relying on looking for all file level static Brett> declarations or something. I smell a wiki page. Skip Brett> Or would there be benefit to moving things like this to the Brett> interpreter struct so that threads within a single interpreter Brett> call are locked but interpreters can act much more independently? Would that simplify things all that much? All containers would probably still rely on the GIL. Also, all objects rely on it to do reference count increment/decrement as I recall. Skip From skip at pobox.com Wed Sep 12 17:38:47 2007 From: skip at pobox.com (skip at pobox.com) Date: Wed, 12 Sep 2007 10:38:47 -0500 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <46E795FD.1070103@v.loewis.de> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> Message-ID: <18152.2055.258930.576257@montanaro.dyndns.org> Martin> Now we are getting into details: you do NOT have to lock an Martin> object to modify its reference count. An atomic Martin> increment/decrement operation is enough. Implemented in asm I suspect? For common CPUs this could just be part of the normal Python distribution. 
For uncommon ones this could use a lock until someone gets around to writing the necessary couple lines of assembler. Skip From janssen at parc.com Wed Sep 12 20:03:24 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 12 Sep 2007 11:03:24 PDT Subject: [Python-Dev] Windows package for new SSL package? Message-ID: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com> I can't figure out how to build a Windows package for ssl-1.1.tar.gz, and probably don't have the tools to do it anyway. I presume that both a Windows machine and Visual Studio (because there's a C extension) is required? Anyone out there who's interested in the challenge? It's at http://www.parc.com/janssen/transient/ssl-1.1.tar.gz. Incidentally, there's no documentation in the package; instead, just use the development documentation at http://docs.python.org/dev/library/ssl.html. Bill From janssen at parc.com Wed Sep 12 20:05:54 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 12 Sep 2007 11:05:54 PDT Subject: [Python-Dev] SSL-protected server on python.org for testing? Message-ID: <07Sep12.110600pdt."57996"@synergy1.parc.xerox.com> The SSL tests currently use SSL-protected ports on gmail.com and Verisign for testing. That's not what they are for; I think we should shift to using SSL-protected ports on python.org somewhere. Are there any HTTPS servers, or SSL-protected POP or IMAP servers, currently running on python.org already that I could use? The "use" is an SSL handshake with the server, once or twice per test run. Bill From janssen at parc.com Wed Sep 12 20:12:24 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 12 Sep 2007 11:12:24 PDT Subject: [Python-Dev] frozenset C API? 
In-Reply-To: <20070912013412.GB14034@panix.com> References: <-4762611594645938717@unknownmsgid> <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> <07Sep6.102547pdt."57996"@synergy1.parc.xerox.com> <20070912013412.GB14034@panix.com> Message-ID: <07Sep12.111225pdt."57996"@synergy1.parc.xerox.com> > > By the way, I think the hostname matching provisions of 2818 (which > > is, after all, only an informational RFC, not a standard) are poorly > > thought out. Many machines have more hostnames than you can shake a > > stick at, and often provide certs with the wrong hostname in them > > (usually because they have no way to determine what the *right* > > hostname is, from inside that machine). > > ...which is why you pretty much need to have a canonical hostname mapped > to each IP you're using on a machine. Basically, you need to map the > hostname you intend to use to an IP, then do reverse-DNS to find out > whether the hostname is in fact the canonical hostname. If not, you're > using the wrong hostname on your cert. Yep. The problem is having a particular service know which certificate it should choose to use, and also to know when the network connectivity has changed. Usually, server ports are bound to wildcard IP addresses, so that they can still be reached even if the network connectivity changes (particularly true for servers running on laptops, or the Python server I'm running on my iPhone). The server has no way of knowing which IP address the client knows it as, and no way of knowing which of its multiple certificates to present, so that the name in the cert will match the name the client thought it was using. Or am I wrong? Is there some interface in the socket API which gives this information? 
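One candidate, offered as a sketch rather than an answer: calling getsockname() on the accepted connection (rather than on the listening socket) reports the local address the client actually connected to, even when the listener is bound to the wildcard address. That yields an IP address, not the hostname the client used, so it would only help if certificates were chosen per IP. The function name and the loopback demo below are purely illustrative.

```python
import socket


def local_address_seen_by_client(client_target_ip="127.0.0.1"):
    """Return (listener address, per-connection local address).

    Illustrative helper: binds to the wildcard address, connects to
    itself over loopback, and compares what the two sockets report.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("", 0))            # wildcard address, OS-chosen port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect((client_target_ip, port))
    conn, _peer = listener.accept()
    try:
        # The listener only knows the wildcard; the accepted socket
        # knows which local IP this particular client reached.
        return listener.getsockname()[0], conn.getsockname()[0]
    finally:
        conn.close()
        client.close()
        listener.close()


wildcard_ip, reached_ip = local_address_seen_by_client()
print(wildcard_ip, reached_ip)
```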
Bill From barry at python.org Wed Sep 12 20:25:59 2007 From: barry at python.org (Barry Warsaw) Date: Wed, 12 Sep 2007 14:25:59 -0400 Subject: [Python-Dev] SSL-protected server on python.org for testing? In-Reply-To: <07Sep12.110600pdt."57996"@synergy1.parc.xerox.com> References: <07Sep12.110600pdt."57996"@synergy1.parc.xerox.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 12, 2007, at 2:05 PM, Bill Janssen wrote: > The SSL tests currently use SSL-protected ports on gmail.com and > Verisign for testing. That's not what they are for; I think we should > shift to using SSL-protected ports on python.org somewhere. Are there > any HTTPS servers, or SSL-protected POP or IMAP servers, currently > running on python.org already that I could use? The "use" is an SSL > handshake with the server, once or twice per test run. svn.python.org? - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (Darwin) iQCVAwUBRugvN3EjvBPtnXfVAQJejQP+JdsEJroDOWdN53cDvtdahJ/2AheObhhb UEdOaucxW3i+odPEUmjLncVq70IQJt1T4YQuZ835iT+k6OkIoB+eaTU3OqslB6bv JKMYsb0Jxdl/plqWld/6WBSH+fCGB5x+JrxelKdu2xVdF8i1YHU+FehK2y1k1kZi Bc9hZ7kONN8= =Uamc -----END PGP SIGNATURE----- From janssen at parc.com Wed Sep 12 20:38:23 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 12 Sep 2007 11:38:23 PDT Subject: [Python-Dev] SSL-protected server on python.org for testing? In-Reply-To: References: <07Sep12.110600pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep12.113830pdt."57996"@synergy1.parc.xerox.com> Yes, port 443 on svn.python.org seems to work for this purpose. Everyone OK with that? If so, I'll change the SSL test code. 
Bill From guido at python.org Wed Sep 12 20:39:36 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 12 Sep 2007 11:39:36 -0700 Subject: [Python-Dev] Making directories and zip files executable In-Reply-To: <46E7AF27.6090300@gmail.com> References: <46E6A875.9020208@gmail.com> <46E7AF27.6090300@gmail.com> Message-ID: On 9/12/07, Nick Coghlan wrote: > Guido van Rossum wrote: > > I could use a refresher on how PJE's patch solves Andy's problem. > > I'm not sure if you're asking about how you would execute a zip file > after the patch has been applied, or about the mechanics of how the > patch works. PJE's last post covered the former question, so I'll cover > the gist of the latter. > > The patch works by passing the script argument to the import machinery > to see if it is recognised as a valid sys.path entry (i.e. either a > directory or a zip file in a default Python installation). Ah, this is the crux! I didn't understand Phillip's wording of "an importable path". I still didn't understand your wording "recognised as a valid sys.path entry"; both wordings suggest a link between sys.argv[0] and the current value of sys.path, which isn't the case -- it is whether they are recognized by the "meta import hook"! This only became clear after I re-read the patch with Phillip's and your words in the back of my head. I now like and approve of the patch, and said so on the tracker. --Guido > If it is, > then add that location to the front of sys.path and use the -m switch > support to execute the "__main__" module directly. > > If the filename passed in isn't recognised as a sys.path entry, then it > is executed as a script as normal. > > Cheers, > Nick.
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > --------------------------------------------------------------- > http://www.boredomandlaziness.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From mhammond at skippinet.com.au Thu Sep 13 01:22:48 2007 From: mhammond at skippinet.com.au (Mark Hammond) Date: Thu, 13 Sep 2007 09:22:48 +1000 Subject: [Python-Dev] Windows package for new SSL package? In-Reply-To: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com> References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com> Message-ID: <01bf01c7f593$d7beecc0$873cc640$@com.au> > I can't figure out how to build a Windows package for ssl-1.1.tar.gz, > and probably don't have the tools to do it anyway. I presume that > both a Windows machine and Visual Studio (because there's a C > extension) is required? > > Anyone out there who's interested in the challenge? > > It's at http://www.parc.com/janssen/transient/ssl-1.1.tar.gz. > I had a bit of a look at this. I think I managed to get it building: * find_ssl() is along way from working on Windows. Python itself uses magic to locate an SSL directory in the main Python directory's parent. On my system, this is c:\src\openssl-0.9.7e, but obviously that could be almost anywhere, and with almost any name. See PCBuild\build_ssl.py and PCBuild\_ssl.mak for the gory details. I'm not sure how you would like to approach this (insist on an environment variable for the top-level SSL dir name?), but in the meantime I hacked find_ssl() to: ssl_incs = [r"\src\openssl-0.9.7e\inc32",] ssl_libs = [r"\src\openssl-0.9.7e\out32"] return ssl_incs, ssl_libs, ["libeay32", "ssleay32", "gdi32", "wsock32"] * The call to find_ssl() appears to discard the 3rd param: ssl_incs, ssl_libs, libs = find_ssl() ... 
ext_modules=[Extension('ssl._ssl2', ['ssl/_ssl2.c'], include_dirs = ssl_incs + [socket_inc], library_dirs = ssl_libs, libraries = ['ssl', 'crypto'], depends = ['ssl/socketmodule.h'])], The 'libraries =' line probably means to pass 'libs' rather than the literal. * The "depends = ['ssl/socketmodule.h']" fails for me - no header of that name exists in the ssl directory in your archive. After those changes I was able to get it built and tested: """ Ran 15 tests in 3.157s OK """ Hope this helps, Mark From janssen at parc.com Thu Sep 13 03:59:54 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 12 Sep 2007 18:59:54 PDT Subject: [Python-Dev] Windows package for new SSL package? In-Reply-To: <01bf01c7f593$d7beecc0$873cc640$@com.au> References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com> <01bf01c7f593$d7beecc0$873cc640$@com.au> Message-ID: <07Sep12.190001pdt."57996"@synergy1.parc.xerox.com> Thanks, Mark (and David, who replied to me personally). I'll update the setup.py files with your suggestions and do a 1.2 (with more metadata in it, too). Looks like the functionality is working for people, even if the build is still a bit flakey. Bill From janssen at parc.com Thu Sep 13 04:57:17 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 12 Sep 2007 19:57:17 PDT Subject: [Python-Dev] Windows package for new SSL package? In-Reply-To: <01bf01c7f593$d7beecc0$873cc640$@com.au> References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com> <01bf01c7f593$d7beecc0$873cc640$@com.au> Message-ID: <07Sep12.195721pdt."57996"@synergy1.parc.xerox.com> > * find_ssl() is along way from working on Windows. Python itself uses magic > to locate an SSL directory in the main Python directory's parent. On my > system, this is c:\src\openssl-0.9.7e, but obviously that could be almost > anywhere, and with almost any name. See PCBuild\build_ssl.py and > PCBuild\_ssl.mak for the gory details. 
I'm not sure how you would like to > approach this (insist on an environment variable for the top-level SSL dir > name?) Can't we look in the registry for this? We have a working Python; perhaps we can just use a Windows-specific registry lookup to find OpenSSL? (I'm just blue-skying here; I have no clue how things work on Windows.) Bill From mhammond at skippinet.com.au Thu Sep 13 05:03:50 2007 From: mhammond at skippinet.com.au (Mark Hammond) Date: Thu, 13 Sep 2007 13:03:50 +1000 Subject: [Python-Dev] Windows package for new SSL package? In-Reply-To: <07Sep12.195721pdt."57996"@synergy1.parc.xerox.com> References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com> <01bf01c7f593$d7beecc0$873cc640$@com.au> <07Sep12.195721pdt."57996"@synergy1.parc.xerox.com> Message-ID: <01da01c7f5b2$cadca4b0$6095ee10$@com.au> > > * find_ssl() is along way from working on Windows. Python itself > uses magic > > to locate an SSL directory in the main Python directory's parent. On > my > > system, this is c:\src\openssl-0.9.7e, but obviously that could be > almost > > anywhere, and with almost any name. See PCBuild\build_ssl.py and > > PCBuild\_ssl.mak for the gory details. I'm not sure how you would > like to > > approach this (insist on an environment variable for the top-level > SSL dir > > name?) > > Can't we look in the registry for this? We have a working Python; > perhaps we can just use a Windows-specific registry lookup to find > OpenSSL? (I'm just blue-skying here; I have no clue how things work > on Windows.) Not really. Python itself, when building _ssl, doesn't look for a binary install of openssl, but instead a source directory and a working perl interpreter so an openssl can be built suitable for linking with Python. 
This source directory is just downloaded and unzipped - no registration takes place, and any binaries that may be built are ignored (we just want the .h and .lib files) It might be possible to try and use build_ssl.py to locate the openssl directory, but this will still require that someone building it has Python built from source - I'm fairly sure that someone installing a Python binary will not have build_ssl.py, nor are they likely to have a suitable openssl directory or installation just "hanging around" either. Mark From janssen at parc.com Thu Sep 13 05:27:58 2007 From: janssen at parc.com (Bill Janssen) Date: Wed, 12 Sep 2007 20:27:58 PDT Subject: [Python-Dev] Windows package for new SSL package? In-Reply-To: <01da01c7f5b2$cadca4b0$6095ee10$@com.au> References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com> <01bf01c7f593$d7beecc0$873cc640$@com.au> <07Sep12.195721pdt."57996"@synergy1.parc.xerox.com> <01da01c7f5b2$cadca4b0$6095ee10$@com.au> Message-ID: <07Sep12.202807pdt."57996"@synergy1.parc.xerox.com> > > Can't we look in the registry for this? We have a working Python; > > perhaps we can just use a Windows-specific registry lookup to find > > OpenSSL? (I'm just blue-skying here; I have no clue how things work > > on Windows.) > > Not really. Python itself, when building _ssl, doesn't look for a binary > install of openssl, but instead a source directory and a working perl > interpreter so an openssl can be built suitable for linking with Python. > This source directory is just downloaded and unzipped - no registration > takes place, and any binaries that may be built are ignored (we just want > the .h and .lib files) In that case, I think your idea of just hard-coding a path is probably the right thing to do. I'll add a note that this is how you need to do it if you are going to try "python setup.py build". Presumably the binary then built with "python setup.py bdist" will install on a Windows machine regardless of where OpenSSL is installed? 
Bill From db3l.net at gmail.com Thu Sep 13 05:41:57 2007 From: db3l.net at gmail.com (David Bolen) Date: Wed, 12 Sep 2007 23:41:57 -0400 Subject: [Python-Dev] Windows package for new SSL package? References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com> <01bf01c7f593$d7beecc0$873cc640$@com.au> <07Sep12.195721pdt."57996"@synergy1.parc.xerox.com> <01da01c7f5b2$cadca4b0$6095ee10$@com.au> Message-ID: "Mark Hammond" writes: > It might be possible to try and use build_ssl.py to locate the openssl > directory, but this will still require that someone building it has Python > built from source - I'm fairly sure that someone installing a Python binary > will not have build_ssl.py, nor are they likely to have a suitable openssl > directory or installation just "hanging around" either. Yep - even if a Windows user has an appropriate development environment in general (and can build most standalone extensions with just a binary Python install), as you say the odds are pretty small they'd have an OpenSSL source tree around, with libraries built. At the same time, I suspect that only a small percentage of Windows users will want to rebuild the extension - rather they'll just want a binary installer, something not uncommon to be published for Windows users of many extension modules. So that pushes the problem upstream a bit where having a Python development tree might be more common or familiar. Rather than a lot of complexity to cater to that small percentage, I'd probably just make setup.py need an explicit configuration - editing, or perhaps environment variable - for the location of the root of the OpenSSL source tree. As you say, there's no guaranteed way to find it otherwise, although I suppose it might try checking relative to the Python executable (along the same lines as build_ssl.py) in case it's being built from within the source tree. 
Adding some comments that following instructions to build Python from source (or at least the standard _ssl module) will yield just such a tree should be a simple enough as a reference for those who need it. The setup.py does also need to understand the different library names (and required system libraries) to build properly under Windows, as you've already highlighted, but that should be relatively easy to vary by platform. -- David From aahz at pythoncraft.com Thu Sep 13 06:26:06 2007 From: aahz at pythoncraft.com (Aahz) Date: Wed, 12 Sep 2007 21:26:06 -0700 Subject: [Python-Dev] SSL certs In-Reply-To: <07Sep12.111225pdt."57996"@synergy1.parc.xerox.com> References: <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> <07Sep6.102547pdt."57996"@synergy1.parc.xerox.com> <20070912013412.GB14034@panix.com> <07Sep12.111225pdt."57996"@synergy1.parc.xerox.com> Message-ID: <20070913042606.GB27547@panix.com> On Wed, Sep 12, 2007, Bill Janssen wrote: > >>> By the way, I think the hostname matching provisions of 2818 (which >>> is, after all, only an informational RFC, not a standard) are poorly >>> thought out. Many machines have more hostnames than you can shake a >>> stick at, and often provide certs with the wrong hostname in them >>> (usually because they have no way to determine what the *right* >>> hostname is, from inside that machine). >> >> ...which is why you pretty much need to have a canonical hostname mapped >> to each IP you're using on a machine. Basically, you need to map the >> hostname you intend to use to an IP, then do reverse-DNS to find out >> whether the hostname is in fact the canonical hostname. If not, you're >> using the wrong hostname on your cert. > > Yep. 
The problem is having a particular service know which > certificate it should choose to use, and also to know when the network > connectivity has changed. Usually, server ports are bound to wildcard > IP addresses, so that they can still be reached even if the network > connectivity changes (particularly true for servers running on > laptops, or the Python server I'm running on my iPhone). The server > has no way of knowing which IP address the client knows it as, and no > way of knowing which of its multiple certificates to present, so that > the name in the cert will match the name the client thought it was > using. My understanding is that the client tells the server which hostname it wants to use; the server should then pass down that information. That's how virtual hosting works in the first place. The only difference with SSL is that the hostname must have a unique IP address, so that when the client does a reverse DNS to validate the IP address presented by the SSL certificate, it all comes together correctly. There are, of course, wildcard certs; I don't understand how those work. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "Many customs in this life persist because they ease friction and promote productivity as a result of universal agreement, and whether they are precisely the optimal choices is much less important." --Henry Spencer http://www.lysator.liu.se/c/ten-commandments.html From surekap at gmail.com Thu Sep 13 06:30:43 2007 From: surekap at gmail.com (Prateek Sureka) Date: Thu, 13 Sep 2007 10:00:43 +0530 Subject: [Python-Dev] Removing the GIL (Me, not you!) 
In-Reply-To: <18152.2055.258930.576257@montanaro.dyndns.org> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> <18152.2055.258930.576257@montanaro.dyndns.org> Message-ID: <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> I was reading GvR's post on this and came up with a theory on how to tackle the problem. I ended up putting it in a blog post. http://www.brainwavelive.com/blog/index.php?/archives/12-Suggestion- for-removing-the-Python-Global-Interpreter-Lock.html What do you think? Prateek On Sep 12, 2007, at 9:08 PM, skip at pobox.com wrote: > > Martin> Now we are getting into details: you do NOT have to > lock an > Martin> object to modify its reference count. An atomic > Martin> increment/decrement operation is enough. > > Implemented in asm I suspect? For common CPUs this could just be > part of > the normal Python distribution. For uncommon ones this could use a > lock > until someone gets around to writing the necessary couple lines of > assembler. > > Skip From martin at v.loewis.de Thu Sep 13 06:42:18 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 13 Sep 2007 06:42:18 +0200 Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> <18152.2055.258930.576257@montanaro.dyndns.org> <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> Message-ID: <46E8BFAA.5090008@v.loewis.de> > What do you think? I think what you are describing is the situation of today, except in a less-performant way. The kernel *already* implements such a "synchronization server", except that all CPUs can act as such. You write "Since we are guaranteeing that synchronized code is running on a single core, it is the equivalent of a lock at the cost of a context switch." This is precisely what a lock costs today: a context switch. Since the Python interpreter is synchronized all of the time, it would completely run on the synchronization server all of the time. As you identify, that single CPU might get overloaded, so your scheme would give no benefits (since Python code could never run in parallel), and only disadvantages (since multiple Python interpreters today can run on multiple CPUs, but could not anymore under your scheme). Regards, Martin From db3l.net at gmail.com Thu Sep 13 06:43:44 2007 From: db3l.net at gmail.com (David Bolen) Date: Thu, 13 Sep 2007 00:43:44 -0400 Subject: [Python-Dev] Windows package for new SSL package? References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com> <01bf01c7f593$d7beecc0$873cc640$@com.au> <07Sep12.195721pdt."57996"@synergy1.parc.xerox.com> <01da01c7f5b2$cadca4b0$6095ee10$@com.au> <07Sep12.202807pdt."57996"@synergy1.parc.xerox.com> Message-ID: Bill Janssen writes: > In that case, I think your idea of just hard-coding a path is probably > the right thing to do. 
I'll add a note that this is how you need to do > it if you are going to try "python setup.py build". Presumably the > binary then built with "python setup.py bdist" will install on a Windows > machine regardless of where OpenSSL is installed? Yes (though typically bdist_wininst for the Windows installer), but perhaps not for the reason you think. I think where there's probably a small disconnect here is that, there really isn't an OpenSSL "installed" on the end user's machine. Well, there could be, but Python isn't using it. The OpenSSL library is statically linked as part of the _ssl.pyd module, as it will be with your _ssl2.pyd module. (That's also why there is no OpenSSL to "find" in your setup even with Python installed - at least not any libraries you can use). In other words, both the standard and your extension module on Windows bring along their own OpenSSL. -- David From tulloss2 at uiuc.edu Thu Sep 13 09:08:35 2007 From: tulloss2 at uiuc.edu (Justin Tulloss) Date: Thu, 13 Sep 2007 02:08:35 -0500 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> <18152.2055.258930.576257@montanaro.dyndns.org> <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> Message-ID: <2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com> > What do you think? > I'm going to have to agree with Martin here, although I'm not sure I understand what you're saying entirely. Perhaps if you explained where the benefits of this approach come from, it would clear up what you're thinking. 
After a few days of thought, I'm starting to realize the difficulty of maintaining compatibility with existing C extensions after removing the GIL. The possible C-level side effects are very difficult to work around without kernel or hardware level transaction support. I see a couple of approaches that might work (though I probably haven't thought of everything). 1. Use message passing and transactions. Put every module into its own tasklet that ends up getting owned by one thread or another. Every call to an object that is owned by that module is put into a module-wide message queue and delivered sequentially to its objects. All this does is serialize requests to objects implemented in C to slightly mitigate the need to lock. Then use transactions to protect any Python object. You still have the problem of C side effects going unnoticed (i.e., thread A executes a function, the object sets C state in a certain way, thread B calls the same function and changes all the C state, and A reacts to a return value that no longer reflects the actual state). So, this doesn't actually work, but it's close, since Python objects will remain consistent with transactions and conflicting C code won't execute simultaneously. 2. Do it Perl style. Perl just spawns off multiple interpreters and doesn't share state between them. This would require cleaning up what state belongs where, and probably providing some global state lock-free. For instance, all the numbers, letters, and None are read-only, so we could probably work out a way to share them between threads. In fact, any Python global could be read-only until it is written to. Then it belongs to the thread that wrote to it and is updated in the other threads via some sort of cache-coherency protocol. I haven't really wrapped my head around how C extensions would play with this yet, but essentially code operating in different threads would be operating on different copies of the modules. That seems fair to me. 3. 
Come up with an elegant way of handling multiple Python processes. Of course, this has some downsides. I don't really want to pickle Python objects around if I decide they need to be in another address space, which I would probably occasionally need to do if I abstracted away the fact that a bunch of interpreters had been spawned off.

4. Remove the GIL, use transactions for Python objects, and adapt all C extensions to be thread-safe. Woo.

I'll keep kicking around ideas for a while; hopefully they'll become more refined as I explore the code more.

Justin

PS. A good paper on how hardware transactional memory could help us out:
http://www-faculty.cs.uiuc.edu/~zilles/papers/python_htm.dls2006.pdf
A few of you have probably read this already. Martin is even acknowledged, but it was news to me!

From martin at v.loewis.de Thu Sep 13 09:20:16 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 13 Sep 2007 09:20:16 +0200
Subject: [Python-Dev] SSL certs
In-Reply-To: <20070913042606.GB27547@panix.com>
References: <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> <07Sep6.102547pdt."57996"@synergy1.parc.xerox.com> <20070912013412.GB14034@panix.com> <07Sep12.111225pdt."57996"@synergy1.parc.xerox.com> <20070913042606.GB27547@panix.com>
Message-ID: <46E8E4B0.60909@v.loewis.de>

> My understanding is that the client tells the server which hostname it
> wants to use; the server should then pass down that information. That's
> how virtual hosting works in the first place.
The only difference with > SSL is that the hostname must have a unique IP address, so that when the > client does a reverse DNS to validate the IP address presented by the SSL > certificate, it all comes together correctly. Unfortunately, it does not quite work that way. The client tells the server what hostname to use only *after* the SSL connection has been established, and certificates being exchanged (in the Host: header). So the Host: header cannot be used for selecting what certificate to present to the client. *That* is the reason why people typically assume they have to have different IP addresses for different SSL hosts: certificate selection must be done based on IP address (which is already known before the SSL handshaking starts). There is no need for the client to do a reverse name lookup, and indeed, the client should *not* do a reverse DNS lookup to check the server's identity. Instead, it should check the host name it wants to talk to against the certificate. However, there is an alternative to using multiple IP addresses: one could also use multiple "subject alternative names", and create a certificate that lists them all. > There are, of course, wildcard certs; I don't understand how those work. The same way: the client does *not* perform a reverse name lookup. Instead, it just matches the hostname against the name in the certificate; if the certificate is for *.python.org (say) and the client wants to talk to pypi.python.org, it matches, and hostname verification passes. It would also pass if the client wanted to talk to cheeseshop.python.org, or wiki.python.org (which all have the same IP address). Regards, Martin From lists at cheimes.de Thu Sep 13 12:11:21 2007 From: lists at cheimes.de (Christian Heimes) Date: Thu, 13 Sep 2007 12:11:21 +0200 Subject: [Python-Dev] Removing the GIL (Me, not you!) 
In-Reply-To: <20070911192035.94CE33A40D7@sparrow.telecommunity.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com>
Message-ID:

Phillip J. Eby wrote:
> It's not just caches and counters. It's also every built-in type
> structure, builtin module, builtin function... any Python object
> that's a built-in, period. That includes things like None, True, and False.
>
> Caches would include such things as the pre-created integers -100
> through 255, the 1-byte character strings for chr(0)-chr(255), and
> the interned strings cache, to name a few.
>
> Most of these things I've mentioned are truly global, and not
> specific to an individual interpreter.

Pardon my ignorance but why does Python do reference counting for truly global and static objects like None, True, False, small and cached integers, sys and other builtins? If I understand it correctly these objects are never garbage collected (at least they shouldn't be) until the interpreter exits. Wouldn't it decrease the overhead and increase speed if Py_INCREF and Py_DECREF were no-ops for static and immutable objects?

Christian

From nd at perlig.de Thu Sep 13 12:19:21 2007
From: nd at perlig.de (André Malo)
Date: Thu, 13 Sep 2007 12:19:21 +0200
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To:
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com>
Message-ID: <200709131219.21152.nd@perlig.de>

* Christian Heimes wrote:
> Pardon my ignorance but why does Python do reference counting for truly
> global and static objects like None, True, False, small and cached
> integers, sys and other builtins?
If I understand it correctly these > objects are never garbaged collected (at least they shouldn't) until the > interpreter exits. Wouldn't it decrease the overhead and increase speed > when Py_INCREF and Py_DECREF are NOOPs for static and immutable objects? The check what kind of object you have takes time, too. Right now, just counting up or down is most likely faster than that check on every refcount operation. nd From p.f.moore at gmail.com Thu Sep 13 12:58:44 2007 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 13 Sep 2007 11:58:44 +0100 Subject: [Python-Dev] Windows package for new SSL package? In-Reply-To: References: <01bf01c7f593$d7beecc0$873cc640$@com.au> <01da01c7f5b2$cadca4b0$6095ee10$@com.au> Message-ID: <79990c6b0709130358w54da4bag516d10ef301ed16a@mail.gmail.com> On 13/09/2007, David Bolen wrote: > "Mark Hammond" writes: > > > It might be possible to try and use build_ssl.py to locate the openssl > > directory, but this will still require that someone building it has Python > > built from source - I'm fairly sure that someone installing a Python binary > > will not have build_ssl.py, nor are they likely to have a suitable openssl > > directory or installation just "hanging around" either. > > Yep - even if a Windows user has an appropriate development > environment in general (and can build most standalone extensions with > just a binary Python install), as you say the odds are pretty small > they'd have an OpenSSL source tree around, with libraries built. It is possible to build extensions on Windows using the mingw gcc toolchain. Users doing this may well have some or all of the gnuwin32 (http://gnuwin32.sf.net) utilities installed. Gnuwin32 includes openssl (both headers, link libraries, and DLLs). It seems to me a perfectly reasonable option for someone wanting to build the SSL extension to grab mingw and gnuwin32 openssl. 
I tried building with this config last night, but didn't have the time to deal with hacking the setup.py - I see someone else has covered this. I'll have another go with the new version when I get a chance. > At the same time, I suspect that only a small percentage of Windows > users will want to rebuild the extension - rather they'll just want a > binary installer, something not uncommon to be published for Windows > users of many extension modules. So that pushes the problem upstream > a bit where having a Python development tree might be more common or > familiar. Agreed. I assume Windows binary builds will be published, so it's only early adopters, or people who want to work with their own builds for some other reason, who might care. Paul. From p.f.moore at gmail.com Thu Sep 13 13:02:55 2007 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 13 Sep 2007 12:02:55 +0100 Subject: [Python-Dev] Windows package for new SSL package? In-Reply-To: References: <01bf01c7f593$d7beecc0$873cc640$@com.au> <01da01c7f5b2$cadca4b0$6095ee10$@com.au> Message-ID: <79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com> On 13/09/2007, David Bolen wrote: > Bill Janssen writes: > > > In that case, I think your idea of just hard-coding a path is probably > > the right thing to do. I'll add a note that this is how you need to do > > it if you are going to try "python setup.py build". Presumably the > > binary then built with "python setup.py bdist" will install on a Windows > > machine regardless of where OpenSSL is installed? > > Yes (though typically bdist_wininst for the Windows installer), but > perhaps not for the reason you think. > > I think where there's probably a small disconnect here is that, there > really isn't an OpenSSL "installed" on the end user's machine. Well, > there could be, but Python isn't using it. The OpenSSL library is > statically linked as part of the _ssl.pyd module, as it will be with > your _ssl2.pyd module. 
(That's also why there is no OpenSSL to "find" > in your setup even with Python installed - at least not any libraries > you can use). That's not 100% true, is it? If I use mingw and Gnuwin32 openssl, I believe the default is a dynamic link of openssl (it depends on the import library used, and gnuwin32 supplies dynamic libs by default). So the openssl DLLs need to be on the user's PATH for the extension module to work. > In other words, both the standard and your extension module on Windows > bring along their own OpenSSL. For the extension, you may need to (1) document that the user needs to have the openssl DLLs on their PATH and possibly (1a) supply a zipfile with the necessary DLLs as a supplemental download, or (2) arrange for the openssl DLLs to be included in the extension installer, and installed alongside the .pyd file. Alternatively, it *may* be possible with setup.py magic to force a static openssl link (but would that need hard coding for the gnuwin32 naming conventions?) Paul. From jon+python-dev at unequivocal.co.uk Thu Sep 13 12:55:27 2007 From: jon+python-dev at unequivocal.co.uk (Jon Ribbens) Date: Thu, 13 Sep 2007 11:55:27 +0100 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <200709131219.21152.nd@perlig.de> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> Message-ID: <20070913105527.GH32061@snowy.squish.net> On Thu, Sep 13, 2007 at 12:19:21PM +0200, Andr? Malo wrote: > > Pardon my ignorance but why does Python do reference counting for truly > > global and static objects like None, True, False, small and cached > > integers, sys and other builtins? If I understand it correctly these > > objects are never garbaged collected (at least they shouldn't) until the > > interpreter exits. Wouldn't it decrease the overhead and increase speed > > when Py_INCREF and Py_DECREF are NOOPs for static and immutable objects? 
> > The check what kind of object you have takes time, too. Right now, just > counting up or down is most likely faster than that check on every refcount > operation. To put it another way, would it actually matter if the reference counts for such objects became hopelessly wrong due to non-atomic adjustments? From db3l.net at gmail.com Thu Sep 13 13:14:26 2007 From: db3l.net at gmail.com (David Bolen) Date: Thu, 13 Sep 2007 07:14:26 -0400 Subject: [Python-Dev] Windows package for new SSL package? In-Reply-To: <79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com> References: <01bf01c7f593$d7beecc0$873cc640$@com.au> <01da01c7f5b2$cadca4b0$6095ee10$@com.au> <79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com> Message-ID: <9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com> On 9/13/07, Paul Moore wrote: > On 13/09/2007, David Bolen wrote: > > I think where there's probably a small disconnect here is that, there > > really isn't an OpenSSL "installed" on the end user's machine. Well, > > there could be, but Python isn't using it. The OpenSSL library is > > statically linked as part of the _ssl.pyd module, as it will be with > > your _ssl2.pyd module. (That's also why there is no OpenSSL to "find" > > in your setup even with Python installed - at least not any libraries > > you can use). > > That's not 100% true, is it? If I use mingw and Gnuwin32 openssl, I > believe the default is a dynamic link of openssl (it depends on the > import library used, and gnuwin32 supplies dynamic libs by default). > So the openssl DLLs need to be on the user's PATH for the extension > module to work. That's a fair point - my comments are all related to the standard Python distribution and building extensions with the VS.NET compiler (including the binary installer I had built for Bill). 
> For the extension, you may need to (1) document that the user needs to > have the openssl DLLs on their PATH and possibly (1a) supply a zipfile > with the necessary DLLs as a supplemental download, or (2) arrange for > the openssl DLLs to be included in the extension installer, and > installed alongside the .pyd file. If we're talking about the construction of binary Windows installers, I'd just suggest that they get built as the built-in SSL module does, including the static linking with a pure Windows OpenSSL build, which is a bit simpler for the typical end user and has no other external requirements. Of course, that certainly doesn't stop a person who has already set up their system for using mingw for extensions from doing their own compilation, although it does raise a question as to whether the setup.py would need some further adjustments to cover that case most cleanly. -- David From martin at v.loewis.de Thu Sep 13 13:15:39 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 13 Sep 2007 13:15:39 +0200 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <20070913105527.GH32061@snowy.squish.net> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> Message-ID: <46E91BDB.7070601@v.loewis.de> > To put it another way, would it actually matter if the reference > counts for such objects became hopelessly wrong due to non-atomic > adjustments? If they drop to zero (which may happen due to non-atomic adjustments), Python will try to release the static memory, which will crash the malloc implementation. Regards, Martin From jon+python-dev at unequivocal.co.uk Thu Sep 13 13:55:38 2007 From: jon+python-dev at unequivocal.co.uk (Jon Ribbens) Date: Thu, 13 Sep 2007 12:55:38 +0100 Subject: [Python-Dev] Removing the GIL (Me, not you!) 
In-Reply-To: <46E91BDB.7070601@v.loewis.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <46E91BDB.7070601@v.loewis.de>
Message-ID: <20070913115538.GJ32061@snowy.squish.net>

On Thu, Sep 13, 2007 at 01:15:39PM +0200, "Martin v. Löwis" wrote:
> > To put it another way, would it actually matter if the reference
> > counts for such objects became hopelessly wrong due to non-atomic
> > adjustments?
>
> If they drop to zero (which may happen due to non-atomic adjustments),
> Python will try to release the static memory, which will crash the
> malloc implementation.

That could be avoided by a flag on the object which is checked in free(). I'm just suggesting it as an alternative as it sounds like it might be more efficient than either locking or avoiding having reference counts on these objects (especially if the reference count is initialised to MAX_INT/2 or whatever).

From p.f.moore at gmail.com Thu Sep 13 14:21:47 2007
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 13 Sep 2007 13:21:47 +0100
Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com>
References: <01bf01c7f593$d7beecc0$873cc640$@com.au> <01da01c7f5b2$cadca4b0$6095ee10$@com.au> <79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com> <9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com>
Message-ID: <79990c6b0709130521h2a73115ai281fa881e37c9dd8@mail.gmail.com>

On 13/09/2007, David Bolen wrote:
> That's a fair point - my comments are all related to the standard
> Python distribution and building extensions with the VS.NET compiler
> (including the binary installer I had built for Bill).

[...]
> If we're talking about the construction of binary Windows installers, > I'd just suggest that they get built as the built-in SSL module does, > including the static linking with a pure Windows OpenSSL build, which > is a bit simpler for the typical end user and has no other external > requirements. OK. Building with mingw is a bit of a hobby horse of mine, as the requirement for the (expensive) VS.NET compiler forces many users to rely on binary builds. I know other alternatives can be made to work, but it's often too much pain to bother. I'd much rather extensions which *can* be built using free tools, support actually doing so out of the box. > Of course, that certainly doesn't stop a person who has already set up > their system for using mingw for extensions from doing their own > compilation, although it does raise a question as to whether the > setup.py would need some further adjustments to cover that case most > cleanly. And that's my point. I'd rather work to ensure that mingw works out of the box, than leave things requiring VS.NET for a clean build. It's not relevant here, but I've certainly been in a situation with other extensions where I can't upgrade a Python install because the distributors of a particular extension haven't produced a build for the new version yet (*cough* mod_python *cough*) - and I live in fear of support for some extensions dying, as I'll then have to move away from them or stop upgrading Python. Anyway, philosophy aside, I'll try to make some time in the next few days to get a working setup.py for the SSL package using mingw. Hopefully, Bill will then integrate this and we'll have mingw as a supported option. Paul. From skip at pobox.com Thu Sep 13 15:26:50 2007 From: skip at pobox.com (skip at pobox.com) Date: Thu, 13 Sep 2007 08:26:50 -0500 Subject: [Python-Dev] Removing the GIL (Me, not you!) 
In-Reply-To: <20070913105527.GH32061@snowy.squish.net> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> Message-ID: <18153.15002.76898.448843@montanaro.dyndns.org> Jon> To put it another way, would it actually matter if the reference Jon> counts for such objects became hopelessly wrong due to non-atomic Jon> adjustments? I believe this was suggested and tried by someone (within the last few years). It wasn't any benefit. The costs of special-casing outweighed the costs of uniform reference counting, not to mention the code got more complex. Or something like that. Anyway, it didn't work. Just thinking out loud here, what if ... we use atomic test-and-set to handle reference counting (with a lock for those CPU architectures where we haven't written the necessary assembler fragment), then implement a lock for each mutable type and another for global state (thread state, interpreter state, etc)? Might that be close enough to free threading to provide some benefits, but not so fine-grained that lock contention becomes a bottleneck? Skip From hrvoje.niksic at avl.com Thu Sep 13 17:13:11 2007 From: hrvoje.niksic at avl.com (Hrvoje =?UTF-8?Q?Nik=C5=A1i=C4=87?=) Date: Thu, 13 Sep 2007 17:13:11 +0200 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <46E91BDB.7070601@v.loewis.de> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <46E91BDB.7070601@v.loewis.de> Message-ID: <1189696391.11322.275.camel@localhost> On Thu, 2007-09-13 at 13:15 +0200, "Martin v. L?wis" wrote: > > To put it another way, would it actually matter if the reference > > counts for such objects became hopelessly wrong due to non-atomic > > adjustments? 
>
> If they drop to zero (which may happen due to non-atomic adjustments),
> Python will try to release the static memory, which will crash the
> malloc implementation.

More precisely, Python will call the deallocator appropriate for the object type. If that deallocator does nothing, the object continues to live. Such objects could also start out with a refcount of sys.maxint or so to ensure that calls to the no-op deallocator are unlikely.

The part I don't understand is how Python would know which objects are global/static. Testing for such a thing sounds like something that would be slower than atomic incref/decref.

From surekap at gmail.com Thu Sep 13 17:30:47 2007
From: surekap at gmail.com (Prateek Sureka)
Date: Thu, 13 Sep 2007 21:00:47 +0530
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <46E8BFAA.5090008@v.loewis.de>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> <18152.2055.258930.576257@montanaro.dyndns.org> <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> <46E8BFAA.5090008@v.loewis.de>
Message-ID:

On Sep 13, 2007, at 10:12 AM, Martin v. Löwis wrote:
>> What do you think?
>
> I think what you are describing is the situation of today,
> except in a less-performant way. The kernel *already*
> implements such a "synchronization server", except that
> all CPUs can act as such. You write
>
> "Since we are guaranteeing that synchronized code is running on a single
> core, it is the equivalent of a lock at the cost of a context switch."
>
> This is precisely what a lock costs today: a context switch.
>

Really?
Wouldn't we save some memory allocation overhead (since in my design, the "lock" is really just a simple kernel instruction as opposed to a full-blown object), thereby lowering lock overhead (and allowing us to go with finer-grained "locks")? Since we're using an asynch message queue for the synch-server, it sounds like a standard lock-free algorithm.

> Since the Python interpreter is synchronized all of the time, it
> would completely run on the synchronization server all of the
> time. As you identify, that single CPU might get overloaded, so
> your scheme would give no benefits (since Python code could never
> run in parallel),

I think I neglected to mention that the locking would still need to be more fine-grained - perhaps only do the context switch around refcounts (and the other places where the GIL is critical). If we can do this in a way that allows simple list comprehensions to run in parallel, that would be really helpful (like a truly parallel map function).

> and only disadvantages (since multiple Python
> interpreters today can run on multiple CPUs, but could not
> anymore under your scheme).
>

Well, you could still run Python code in parallel if you used multiple processes (each process having its own 'synchronization server'). Is that what you meant?

On Sep 13, 2007, at 12:38 PM, Justin Tulloss wrote:
> > What do you think?
>
> I'm going to have to agree with Martin here, although I'm not sure
> I understand what you're saying entirely. Perhaps if you explained
> where the benefits of this approach come from, it would clear up
> what you're thinking.

Well, my interpretation of the current problem is that removing the GIL has not been productive because of problems with lock contention on multi-core machines. Naturally, we need to make the locking more fine-grained to resolve this. Hopefully we can do so in a way that does not increase the lock overhead (hence my suggestion for a lock-free approach using an asynch queue and a core as a dedicated server).
If we can somehow guarantee all GC operations (which is why the GIL is needed in the first place) run on a single core, we get locking for free without actually having to have threads spinning. regards, Prateek From martin at v.loewis.de Thu Sep 13 17:55:24 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 13 Sep 2007 17:55:24 +0200 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> <18152.2055.258930.576257@montanaro.dyndns.org> <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> <46E8BFAA.5090008@v.loewis.de> Message-ID: <46E95D6C.1060704@v.loewis.de> >> "Since we are guaranteeing that synchronized code is running on a single >> core, it is the equivalent of a lock at the cost of a context switch." >> >> This is precisely what a lock costs today: a context switch. >> > > Really? Wouldn't we save some memory allocation overhead (since in my > design, the "lock" is a really just simple kernel instruction as opposed > to a full blown object) The GIL is a single variable, not larger than 50 Bytes or so. Locking it requires no memory at all in user-space, and might require 8 bytes or so per waiting thread in kernel-space. > thereby lowering lock overhead Why do you think "lock overhead" is related to memory consumption? > Since we're using an asynch message queue for the synch-server, it > sounds like a standard lock-free algorithm. You lost me here. What are you trying to achieve? It's not the lock that people complain about, but that Python runs serially most of the time. 
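
As a rough illustration of "runs serially most of the time", here is a toy model in Python. It is only a sketch under stated assumptions: `gil` is an ordinary `threading.Lock` standing in for the single global lock, and `run_bytecodes` is a made-up name for "execute some interpreter steps". It is not how CPython implements the GIL; it just shows that with one lock around every step, no two threads are ever inside the interpreter at once, however many threads exist.

```python
import threading

# Toy stand-in for the single global interpreter lock (an assumption for
# illustration; CPython's real GIL lives in the C runtime).
gil = threading.Lock()

inside = 0        # number of threads currently "executing bytecode"
max_inside = 0    # highest value of `inside` ever observed


def run_bytecodes(n):
    """Hypothetical worker: perform n interpreter steps, each under the lock."""
    global inside, max_inside
    for _ in range(n):
        with gil:                      # only the lock holder may proceed
            inside += 1
            max_inside = max(max_inside, inside)
            inside -= 1


threads = [threading.Thread(target=run_bytecodes, args=(10000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Four threads ran, but never more than one at a time held the lock.
print(max_inside)
```

The complaint in the thread is about exactly this property: the lock itself is cheap, but the work it guards cannot overlap.
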
> I think I neglected to mention that the locking would still need to be > more fine grained - perhaps only do the context switch around refcounts > (and the other places where the GIL is critical). I think this is the point where I need to say "good luck implementing it". > Well, my interpretation of the current problem is that removing the GIL > has not been productive because of problems with lock contention on > multi-core machines. My guess is that this interpretation is wrong. It was reported that there was a slowdown by a factor of 2 in a single-threaded application. That can't be due to lock contention. > If we can somehow guarantee all GC operations (which is why the GIL is > needed in the first place) No, unless we disagree on what a "GC operation" is. Regards, Martin From janssen at parc.com Thu Sep 13 18:04:00 2007 From: janssen at parc.com (Bill Janssen) Date: Thu, 13 Sep 2007 09:04:00 PDT Subject: [Python-Dev] Windows package for new SSL package? In-Reply-To: References: <07Sep12.110325pdt."57996"@synergy1.parc.xerox.com> <01bf01c7f593$d7beecc0$873cc640$@com.au> <07Sep12.195721pdt."57996"@synergy1.parc.xerox.com> <01da01c7f5b2$cadca4b0$6095ee10$@com.au> <07Sep12.202807pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep13.090402pdt."57996"@synergy1.parc.xerox.com> > In other words, both the standard and your extension module on Windows > bring along their own OpenSSL. I see -- thanks. Bill From janssen at parc.com Thu Sep 13 18:08:23 2007 From: janssen at parc.com (Bill Janssen) Date: Thu, 13 Sep 2007 09:08:23 PDT Subject: [Python-Dev] Windows package for new SSL package? 
In-Reply-To: <79990c6b0709130521h2a73115ai281fa881e37c9dd8@mail.gmail.com> References: <01bf01c7f593$d7beecc0$873cc640$@com.au> <01da01c7f5b2$cadca4b0$6095ee10$@com.au> <79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com> <9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com> <79990c6b0709130521h2a73115ai281fa881e37c9dd8@mail.gmail.com> Message-ID: <07Sep13.090827pdt."57996"@synergy1.parc.xerox.com> > Anyway, philosophy aside, I'll try to make some time in the next few > days to get a working setup.py for the SSL package using mingw. > Hopefully, Bill will then integrate this and we'll have mingw as a > supported option. I'll be happy to do that! Bill From surekap at gmail.com Thu Sep 13 18:29:15 2007 From: surekap at gmail.com (Prateek Sureka) Date: Thu, 13 Sep 2007 21:59:15 +0530 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <46E95D6C.1060704@v.loewis.de> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> <18152.2055.258930.576257@montanaro.dyndns.org> <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> <46E8BFAA.5090008@v.loewis.de> <46E95D6C.1060704@v.loewis.de> Message-ID: On Sep 13, 2007, at 9:25 PM, Martin v. L?wis wrote: >>> "Since we are guaranteeing that synchronized code is running on a >>> single >>> core, it is the equivalent of a lock at the cost of a context >>> switch." >>> >>> This is precisely what a lock costs today: a context switch. >>> >> >> Really? Wouldn't we save some memory allocation overhead (since in my >> design, the "lock" is a really just simple kernel instruction as >> opposed >> to a full blown object) > > The GIL is a single variable, not larger than 50 Bytes or so. 
Locking
> it requires no memory at all in user-space, and might require 8 bytes
> or so per waiting thread in kernel-space.
>
>> thereby lowering lock overhead
>
> Why do you think "lock overhead" is related to memory consumption?

Well, it can be one (or both) of two things - 1) memory consumption, 2) cost of acquiring and releasing the locks (which you said is the same as a context switch).

Since we've also identified (according to GvR's post: http://www.artima.com/weblogs/viewpost.jsp?thread=214235) that the slowdown was 2x in a single-threaded application (which couldn't be due to lock contention), it must be due to lock overhead (unless the programming was otherwise faulty or there is something else about locks that I don't know about - Martin?). Hence I'm assuming that we need to reduce lock overhead. If acquiring and releasing locks (part of lock overhead) is a simple context switch (and I don't doubt you here), then the only remaining thing to optimize is memory operations related to lock objects.

>
>> Since we're using an asynch message queue for the synch-server, it
>> sounds like a standard lock-free algorithm.
>
> You lost me here. What are you trying to achieve? It's not the lock
> that people complain about, but that Python runs serially most
> of the time.

http://en.wikipedia.org/wiki/Lock-free_and_wait-free_algorithms#The_lock-free_approach

Specifically, I'm trying to achieve the approach using a "deposit request".

>> I think I neglected to mention that the locking would still need to be
>> more fine grained - perhaps only do the context switch around refcounts
>> (and the other places where the GIL is critical).
>
> I think this is the point where I need to say "good luck implementing
> it".

I don't mean to be unhelpful. It's just that this discussion started because people (not me - although I would definitely benefit) showed interest in removing the GIL.
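
To make the "deposit request" idea mentioned above a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `RefcountServer`, `deposit` and `demo` are hypothetical names, the counters are a plain dict rather than real object refcounts, and `queue.Queue` uses locks internally, so this shows only the shape of the scheme (one owner thread applies all adjustments, other threads just enqueue requests), not a genuinely lock-free implementation.

```python
import queue
import threading


class RefcountServer:
    """Hypothetical single owner of all counters; others deposit requests."""

    def __init__(self):
        self.counts = {}                      # touched only by the server thread
        self.requests = queue.Queue()         # the "deposit box"
        self.thread = threading.Thread(target=self._serve, daemon=True)
        self.thread.start()

    def _serve(self):
        # Apply requests strictly one at a time, in arrival order.
        while True:
            item = self.requests.get()
            if item is None:                  # sentinel: stop serving
                break
            name, delta = item
            self.counts[name] = self.counts.get(name, 0) + delta

    def deposit(self, name, delta):
        # Fire and forget: the caller does not wait for the adjustment.
        self.requests.put((name, delta))

    def shutdown(self):
        self.requests.put(None)
        self.thread.join()


def demo(n_threads=4, n_requests=1000):
    """Several threads incref the same (made-up) object concurrently."""
    server = RefcountServer()

    def worker():
        for _ in range(n_requests):
            server.deposit("obj", +1)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    server.shutdown()                         # drains every pending request
    return server.counts["obj"]


print(demo())
```

Note that the fire-and-forget call only works for requests whose result the caller does not need; a caller that needs the answer would still have to block until the server replies.
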
>> Well, my interpretation of the current problem is that removing >> the GIL >> has not been productive because of problems with lock contention on >> multi-core machines. > > My guess is that this interpretation is wrong. It was reported that > there was a slowdown by a factor of 2 in a single-threaded > application. > That can't be due to lock contention. I agree with your point Martin (see my analysis above). Regarding lock contention: I'm guessing that if single threaded applications are so badly affected, then the cumulative overhead on multithreaded applications will be even worse. So we need to reduce the overhead. But then since all Python code runs under the GIL - which is a pretty coarse lock, we have to make the new locking more fine-grained (which is what I think the original patch by Greg Stein did). I'm also guessing that if you do that then for each refcount you're going to have to acquire a lock... which happens *very* frequently (and I think by your earlier responses you concur). So that means anytime multiple threads try to access the same object, they will need to do an incref/decref. e.g. If you access a global variable inside a for- loop from multiple threads. >> If we can somehow guarantee all GC operations (which is why the >> GIL is >> needed in the first place) > > No, unless we disagree on what a "GC operation" is. Ok. Other people know more about the specifics of the GIL than I do. However, the main issue with removing the GIL seems to be the reference counting algorithm. That is what I was alluding to. In any case, it is not relevant for the rest of the discussion. regards, Prateek From martin at v.loewis.de Thu Sep 13 18:51:42 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 13 Sep 2007 18:51:42 +0200 Subject: [Python-Dev] Removing the GIL (Me, not you!) 
In-Reply-To: References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> <18152.2055.258930.576257@montanaro.dyndns.org> <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> <46E8BFAA.5090008@v.loewis.de> <46E95D6C.1060704@v.loewis.de> Message-ID: <46E96A9E.8080305@v.loewis.de> > http://www.artima.com/weblogs/viewpost.jsp?thread=214235) that the > slowdown was 2x in a single threaded application (which couldn't be due > to lock contention), it must be due to lock overhead (unless the > programming was otherwise faulty or there is something else about locks > that I don't know about - Martin?). Hence I'm assuming that we need to > reduce lock overhead. If acquiring and releasing locks (part of lock > overhead) is a simple context switch (and I don't doubt you here), then > the only remaining thing to optimize is memory operations related to > lock objects. I think you are putting too many assumptions on top of each other. It might also have been that the locks in the slow implementation were too fine-grained, and that some performance could have been regained by making them coarser again. >> You lost me here. What are you trying to achieve? It's not the lock >> that people complain about, but that Python runs serially most >> of the time. > > http://en.wikipedia.org/wiki/Lock-free_and_wait-free_algorithms#The_lock-free_approach The asynchronous model assumes that the sender can continue to process data without needing a reply. This is not true for the Python threading model: if the thread needs access to some data structure, it really needs to wait for the result of that access, because that's the semantics of the operations. > Specifically, i'm trying to achieve the approach using a "deposit request". 
For that to work, you need to produce a list of requests that can be processed asynchronously. I can't see any in the Python interpreter. > I'm also guessing that if > you do that then for each refcount you're going to have to acquire a > lock... which happens *very* frequently (and I think by your earlier > responses you concur). In that it occurs frequently - not in that you have to acquire a lock to modify the refcount. You don't. > Ok. Other people know more about the specifics of the GIL than I do. > However, the main issue with removing the GIL seems to be the reference > counting algorithm. It isn't. Reference counting could be done easily without the GIL. It's rather the container objects, and the global variables, that need protection. Regards, Martin From rhamph at gmail.com Thu Sep 13 19:08:40 2007 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 13 Sep 2007 11:08:40 -0600 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <1189696391.11322.275.camel@localhost> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <46E91BDB.7070601@v.loewis.de> <1189696391.11322.275.camel@localhost> Message-ID: On 9/13/07, Hrvoje Nikšić wrote: > On Thu, 2007-09-13 at 13:15 +0200, "Martin v. Löwis" wrote: > > > To put it another way, would it actually matter if the reference > > > counts for such objects became hopelessly wrong due to non-atomic > > > adjustments? > > > > If they drop to zero (which may happen due to non-atomic adjustments), > > Python will try to release the static memory, which will crash the > > malloc implementation. > > More precisely, Python will call the deallocator appropriate for the > object type. If that deallocator does nothing, the object continues to > live.
Such objects could also start out with a refcount of sys.maxint > or so to ensure that calls to the no-op deallocator are unlikely. > > The part I don't understand is how Python would know which objects are > global/static. Testing for such a thing sounds like something that > would be slower than atomic incref/decref. I've explained my experiments here: http://www.artima.com/forums/flat.jsp?forum=106&thread=214235&start=30&msRange=15#279978 Basically though, atomic incref/decref won't work. Once you've got two threads modifying the same location, the costs skyrocket. Even without being properly atomic you'll get the same slowdown on x86 (whose cache coherency is fairly strict). The only two options are: A) Don't modify an object on every incref/decref. Deletion must be delayed. This lets you share (thread-safe) objects. B) Don't share *any* objects. This is a process model (even if they're lightweight like Erlang). For the near future, it's much easier to do this using real processes though. Threading is much more powerful, but it remains to be proven that it can be done efficiently.
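Option A can be pictured with a toy model. What follows is only an illustrative sketch in pure Python, not how CPython does or would implement it: the class name DeferredRefcounts, its methods, and the "literal_1.0" key are all made up for this example. Each thread records its incref/decref deltas in thread-local storage and only touches the shared table when it flushes, so the hot path never writes to memory that another thread is also writing to.

```python
import threading
from collections import Counter


class DeferredRefcounts:
    """Toy model of option A (hypothetical, for illustration only).

    Threads log refcount changes locally; the shared table is only
    written when a thread flushes its log.
    """

    def __init__(self):
        self._shared = Counter()         # object name -> settled refcount
        self._lock = threading.Lock()    # taken only while flushing/reading
        self._local = threading.local()  # per-thread pending deltas

    def _pending(self):
        # Lazily create this thread's private delta table.
        if not hasattr(self._local, "deltas"):
            self._local.deltas = Counter()
        return self._local.deltas

    def incref(self, name):
        self._pending()[name] += 1       # no shared memory is written here

    def decref(self, name):
        self._pending()[name] -= 1       # likewise thread-local only

    def flush(self):
        """Merge the calling thread's pending deltas into the shared table."""
        pending = self._pending()
        with self._lock:
            for name, delta in pending.items():
                self._shared[name] += delta
        pending.clear()

    def settled(self, name):
        """Return the refcount as of the flushes seen so far."""
        with self._lock:
            return self._shared[name]


def worker(table, rounds):
    # Hammer one shared "object", as a shared literal would be hammered.
    for _ in range(rounds):
        table.incref("literal_1.0")
        table.decref("literal_1.0")
    table.incref("literal_1.0")          # each thread keeps one reference
    table.flush()


table = DeferredRefcounts()
threads = [threading.Thread(target=worker, args=(table, 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(table.settled("literal_1.0"))
```

The sketch leaves out the hard part: a real design would need some safe point at which a settled count of zero is known to be final before the object may be deleted, which is what "deletion must be delayed" refers to.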
-- Adam Olsen, aka Rhamphoryncus From janssen at parc.com Thu Sep 13 19:15:32 2007 From: janssen at parc.com (Bill Janssen) Date: Thu, 13 Sep 2007 10:15:32 PDT Subject: [Python-Dev] SSL certs In-Reply-To: <46E8E4B0.60909@v.loewis.de> References: <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> <07Sep6.102547pdt."57996"@synergy1.parc.xerox.com> <20070912013412.GB14034@panix.com> <07Sep12.111225pdt."57996"@synergy1.parc.xerox.com> <20070913042606.GB27547@panix.com> <46E8E4B0.60909@v.loewis.de> Message-ID: <07Sep13.101532pdt."57996"@synergy1.parc.xerox.com> > However, there is an alternative to using multiple IP addresses: > one could also use multiple "subject alternative names", and create > a certificate that lists them all. Unfortunately, much of the client code that does the hostname verification is wrapped up in gullible Web browsers or Java HTTPS libraries that swallowed RFC 2818 whole, and not easily accessible by applications. Does any of it recognize and accept "subject alternative name"? It's possible to at least override the default Java client-side hostname verification handling in a new application. And Python is lucky; because there was no client-side hostname verification possible, RFC 2818 hasn't been plastered into the Python standard library :-). 
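For illustration only, the "multiple subject alternative names" idea quoted above amounts to a client-side check along the following lines. This is a minimal sketch under stated assumptions, not the API of any existing library: the function name hostname_matches and the plain-dict certificate layout (keys "subjectAltName" and "commonName") are invented for the example. The rules it follows are the RFC 2818 ones: use the DNS alternative names if any are present, fall back to the common name otherwise, and let "*.a.com" match "foo.a.com" but not "bar.foo.a.com".

```python
def hostname_matches(cert, hostname):
    """Return True if hostname is covered by the certificate.

    cert is a plain dict with the (assumed, hypothetical) keys
    "subjectAltName", a list of DNS names, and "commonName",
    a single fallback name.
    """
    hostname = hostname.lower()
    names = [name.lower() for name in cert.get("subjectAltName", [])]
    if not names and cert.get("commonName"):
        # Fall back to the common name only when the certificate
        # lists no subject alternative names at all.
        names = [cert["commonName"].lower()]
    for name in names:
        if name == hostname:
            return True
        # A leading "*." stands for exactly one extra label.
        if name.startswith("*.") and "." in hostname:
            if hostname.split(".", 1)[1] == name[2:]:
                return True
    return False


# One certificate serving several virtual hosts on a single IP address.
cert = {
    "commonName": "www.example.org",
    "subjectAltName": ["www.example.org", "mail.example.org", "*.dev.example.org"],
}
print(hostname_matches(cert, "mail.example.org"))
print(hostname_matches(cert, "evil.example.com"))
```

Whether a given client actually performs such a check is the open question here; the sketch only shows what the check would look like if the application did it itself.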
Bill From martin at v.loewis.de Thu Sep 13 19:18:43 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 13 Sep 2007 19:18:43 +0200 Subject: [Python-Dev] SSL certs In-Reply-To: <07Sep13.101532pdt."57996"@synergy1.parc.xerox.com> References: <46DDCD7C.40004@v.loewis.de> <46DE3DB8.6000004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> <07Sep6.102547pdt."57996"@synergy1.parc.xerox.com> <20070912013412.GB14034@panix.com> <07Sep12.111225pdt."57996"@synergy1.parc.xerox.com> <20070913042606.GB27547@panix.com> <46E8E4B0.60909@v.loewis.de> <07Sep13.101532pdt."57996"@synergy1.parc.xerox.com> Message-ID: <46E970F3.2080304@v.loewis.de> >> However, there is an alternative to using multiple IP addresses: >> one could also use multiple "subject alternative names", and create >> a certificate that lists them all. > > Unfortunately, much of the client code that does the hostname > verification is wrapped up in gullible Web browsers or Java HTTPS > libraries that swallowed RFC 2818 whole, and not easily accessible by > applications. Does any of it recognize and accept "subject > alternative name"? Works fine with Firefox and MSIE. Regards, Martin From jason.orendorff at gmail.com Thu Sep 13 19:29:23 2007 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Thu, 13 Sep 2007 13:29:23 -0400 Subject: [Python-Dev] Removing the GIL (Me, not you!) 
In-Reply-To: <2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> <18152.2055.258930.576257@montanaro.dyndns.org> <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> <2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com> Message-ID: On 9/13/07, Justin Tulloss wrote: > 1. Use message passing and transactions. [...] > 2. Do it perl style. [...] > 3. Come up with an elegant way of handling multiple python processes. [...] > 4. Remove the GIL, use transactions for python objects, [...] The SpiderMonkey JavaScript engine takes a very different approach, described here: http://developer.mozilla.org/en/docs/SpiderMonkey_Internals:_Thread_Safety The SpiderMonkey C API threading model should sound familiar: C code can assume that simple operations, like dictionary lookups, are atomic and thread-safe. C code must explicitly JS_SuspendRequest() before doing blocking I/O or number-crunching (just like Py_BEGIN_ALLOW_THREADS). The main difference is that SpiderMonkey's "requests" are not mutually exclusive, the way the GIL is. SpiderMonkey does fine-grained locking for mutable objects to avoid race conditions. The clever bit is that SpiderMonkey's per-object locking does *not* require a context switch or even an atomic instruction, in the usual case where an object is *not* shared among threads. (Programs that embed SpiderMonkey therefore run faster if they manage to ensure that threads share relatively few mutable objects. JavaScript doesn't have modules.) Suppose Python went this route. There would still have to be a "stop-the-world" global lock, because the cycle collector won't work if other threads are going about changing pointers. 
(SpiderMonkey's GC does the same thing.) Retaining such a lock has another advantage: this change could be completely backward-compatible to extensions. Just use this global lock as the GIL when entering a non-thread-safe extension (all existing extensions would be considered non-thread-safe). This means non-thread-safe extensions would be hoggish (but not much worse than they are already!). Making an existing extension thread-safe would require some thought, but it wouldn't be terribly hard. In the simplest cases, the extension writer could just add a flag to the type saying "ok, I'm thread-safe". Refcounting is another major issue. SpiderMonkey uses GC instead. CPython would need to do atomic increfs/decrefs. (Deferred refcounting could mitigate the cost.) The main drawback (aside from the amount of work) is the patent. SpiderMonkey's license grants a worldwide, royalty-free license, but not under the Python license. I think this could be wrangled, if the technical approach looks worthwhile. -j From janssen at parc.com Thu Sep 13 19:55:36 2007 From: janssen at parc.com (Bill Janssen) Date: Thu, 13 Sep 2007 10:55:36 PDT Subject: [Python-Dev] base64 -- should b64encode introduce line breaks? Message-ID: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com> I see that base64.b64encode and base64.standard_b64encode no longer introduce line breaks into the output strings, as base64.encodestring does. Shouldn't there be an option on one of them to do this? Bill From janssen at parc.com Thu Sep 13 20:43:01 2007 From: janssen at parc.com (Bill Janssen) Date: Thu, 13 Sep 2007 11:43:01 PDT Subject: [Python-Dev] base64 -- should b64encode introduce line breaks? 
In-Reply-To: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com> References: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep13.114308pdt."57996"@synergy1.parc.xerox.com> > I see that base64.b64encode and base64.standard_b64encode no longer > introduce line breaks into the output strings, as base64.encodestring > does. Shouldn't there be an option on one of them to do this? See: http://mail.python.org/pipermail/python-bugs-list/2001-October/007856.html and section 2.1 of http://www.faqs.org/rfcs/rfc3548.html Perhaps adding MIME_b64encode() and PEM_b64encode() routines? Or just an optional parameter to standard_b64encode, called "max_line_length", defaulting to 0, meaning no max? Bill From facundobatista at gmail.com Thu Sep 13 21:08:48 2007 From: facundobatista at gmail.com (Facundo Batista) Date: Thu, 13 Sep 2007 16:08:48 -0300 Subject: [Python-Dev] Python tickets summary In-Reply-To: References: Message-ID: 2007/9/10, Facundo Batista : > I modified my tool, which makes a summary of all the Python tickets > (I moved the source where the info is taken from SF to our Roundup). > > As a result, the summary is now, again, updated daily: Taking an idea from Jeff Rush, there are now separate listings by ticket keyword. This way, you can see only the Py3k tickets, or the patches, etc. All the listings are accessible from the same pages; start here: http://www.taniquetil.com.ar/facundo/py_tickets.html (remember to refresh) Any idea to improve these pages is welcome. Regards, -- .
Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From draghuram at gmail.com Thu Sep 13 21:14:41 2007 From: draghuram at gmail.com (Raghuram Devarakonda) Date: Thu, 13 Sep 2007 15:14:41 -0400 Subject: [Python-Dev] [Tracker-discuss] Python tickets summary In-Reply-To: References: Message-ID: <2c51ecee0709131214q14bcce2du1b3cb077088f063d@mail.gmail.com> On 9/13/07, Facundo Batista wrote: > http://www.taniquetil.com.ar/facundo/py_tickets.html It looks like the column "Opened by" contains information for "Last update by" and vice versa. At least, that is the case with issue 1159. Thanks, Raghu From rhamph at gmail.com Thu Sep 13 21:43:28 2007 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 13 Sep 2007 13:43:28 -0600 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <2cfeb93c0709131145p5ef9aea6geeb3f6d03c8227c7@mail.gmail.com> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <46E91BDB.7070601@v.loewis.de> <1189696391.11322.275.camel@localhost> <2cfeb93c0709131145p5ef9aea6geeb3f6d03c8227c7@mail.gmail.com> Message-ID: On 9/13/07, Justin Tulloss wrote: > > > On 9/13/07, Adam Olsen wrote: > > > > Basically though, atomic incref/decref won't work. Once you've got > > two threads modifying the same location the costs skyrocket. Even > > without being properly atomic you'll get the same slowdown on x86 > > (who's cache coherency is fairly strict.) > > > I'm a bit skeptical of the actual costs of atomic incref. For there to be > contention, you would need to have to be modifying the same memory location > at the exact same time. That seems unlikely to ever happen. We can't bank on > it never happening, but an occasionally expensive operation is ok. After > all, it's occasional. That was my initial expectation too. However, the incref *is* a modification. 
It's not simply an issue of the "exact same time", but anything that causes the cache entries to bounce back and forth and delay the rest of the pipeline. If you have a simple loop like "for i in range(count): 1.0+n", then the 1.0 literal will get shared between threads, and the refcount will get hammered. Is it reasonable to expect that much sharing? I think it is. Literals are an obvious example, but there's also configuration data passed between threads. Pystone seems to have enough sharing to kill performance. And after all, isn't sharing the whole point (even in the definition) of threads? -- Adam Olsen, aka Rhamphoryncus From p.f.moore at gmail.com Thu Sep 13 21:46:10 2007 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 13 Sep 2007 20:46:10 +0100 Subject: [Python-Dev] Windows package for new SSL package? In-Reply-To: <-3701875869749923189@unknownmsgid> References: <01bf01c7f593$d7beecc0$873cc640$@com.au> <01da01c7f5b2$cadca4b0$6095ee10$@com.au> <79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com> <9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com> <79990c6b0709130521h2a73115ai281fa881e37c9dd8@mail.gmail.com> <-3701875869749923189@unknownmsgid> Message-ID: <79990c6b0709131246k4d442362y2f684bd5342e1a06@mail.gmail.com> On 13/09/2007, Bill Janssen wrote: > > Anyway, philosophy aside, I'll try to make some time in the next few > > days to get a working setup.py for the SSL package using mingw. > > Hopefully, Bill will then integrate this and we'll have mingw as a > > supported option. > > I'll be happy to do that! OK, the following patch to setup.py works for mingw32. You need to set 2 variables - 1. The location where you installed gnuwin32 2. Whether you want a static or dynamic build I've checked both versions on Python 2.5.1 and they pass all tests. Static build is 670k, dynamic is 26k (but depends on the openssl DLLs libssl32.dll and libeay32.dll). Ideally, these should be settable via command line options or something. 
Also, it would be nice to detect the use of MSVC and do something equivalent (but presumably somewhat different), but I don't know how to detect the type of compiler the user has selected :-( Anyway, I hope it's useful. If nothing else, it offers a way for people to build the module with free software on Windows. I could build some Windows installers if you want, but I'd need to download and install some extra versions of Python, so you'd have to tell me which you want doing (and I can't offer to commit to doing this on a regular basis...) Paul. -------------- next part -------------- A non-text attachment was scrubbed... Name: mingw.patch Type: application/octet-stream Size: 2071 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20070913/4a0a3595/attachment-0001.obj From tulloss2 at uiuc.edu Thu Sep 13 22:16:57 2007 From: tulloss2 at uiuc.edu (Justin Tulloss) Date: Thu, 13 Sep 2007 15:16:57 -0500 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> <18152.2055.258930.576257@montanaro.dyndns.org> <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> <2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com> Message-ID: <2cfeb93c0709131316t23297e4doee08d46601cbfb2c@mail.gmail.com> On 9/13/07, Jason Orendorff wrote: > > On 9/13/07, Justin Tulloss wrote: > > 1. Use message passing and transactions. [...] > > 2. Do it perl style. [...] > > 3. Come up with an elegant way of handling multiple python processes. > [...] > > 4. Remove the GIL, use transactions for python objects, [...] 
> > The SpiderMonkey JavaScript engine takes a very different approach, > described here: > http://developer.mozilla.org/en/docs/SpiderMonkey_Internals:_Thread_Safety This is basically the same as what perl does, as far as I understand it. There are differences, but they're not that substantial. It's basically the idea of keeping all state separate and treating global access as a special case. I think this is a pretty solid approach, since globals shouldn't be accessed that often. What we would want to do differently is make sure that read-only globals can be cheaply accessed from any thread. Otherwise we lose the performance benefit of having them in the first place. Refcounting is another major issue. SpiderMonkey uses GC instead. > CPython would need to do atomic increfs/decrefs. (Deferred > refcounting could mitigate the cost.) This is definitely something to think about. I don't really have an answer straight off, but there are several things we could try. The main drawback (aside from the amount of work) is the patent. > SpiderMonkey's license grants a worldwide, royalty-free license, but > not under the Python license. I think this could be wrangled, if the > technical approach looks worthwhile. I'm not sure this is an issue. It's not like we would be using the code, just the patented algorithm. Any code we wrote to implement the algorithm would of course be covered under the python license. I'm not a legal guy though. Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070913/3ea47b36/attachment.htm From janssen at parc.com Thu Sep 13 22:24:01 2007 From: janssen at parc.com (Bill Janssen) Date: Thu, 13 Sep 2007 13:24:01 PDT Subject: [Python-Dev] Windows package for new SSL package? 
In-Reply-To: <79990c6b0709131246k4d442362y2f684bd5342e1a06@mail.gmail.com> References: <01bf01c7f593$d7beecc0$873cc640$@com.au> <01da01c7f5b2$cadca4b0$6095ee10$@com.au> <79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com> <9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com> <79990c6b0709130521h2a73115ai281fa881e37c9dd8@mail.gmail.com> <-3701875869749923189@unknownmsgid> <79990c6b0709131246k4d442362y2f684bd5342e1a06@mail.gmail.com> Message-ID: <07Sep13.132403pdt."57996"@synergy1.parc.xerox.com> > I could build some Windows installers if you want, but I'd need to > download and install some extra versions of Python, so you'd have to > tell me which you want doing (and I can't offer to commit to doing > this on a regular basis...) Thanks, but let's wait till this settles down a bit (say, a week passes without me saying anything about it :-). Then I'll definitely want both VS and mingw versions to upload to the Cheeseshop. But it's not quite ready yet. Bill From p.f.moore at gmail.com Thu Sep 13 22:28:34 2007 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 13 Sep 2007 21:28:34 +0100 Subject: [Python-Dev] Windows package for new SSL package? In-Reply-To: <7888070288790377344@unknownmsgid> References: <01bf01c7f593$d7beecc0$873cc640$@com.au> <01da01c7f5b2$cadca4b0$6095ee10$@com.au> <79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com> <9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com> <79990c6b0709130521h2a73115ai281fa881e37c9dd8@mail.gmail.com> <-3701875869749923189@unknownmsgid> <79990c6b0709131246k4d442362y2f684bd5342e1a06@mail.gmail.com> <7888070288790377344@unknownmsgid> Message-ID: <79990c6b0709131328s1d042881wf4a4c396b82b720d@mail.gmail.com> On 13/09/2007, Bill Janssen wrote: > > I could build some Windows installers if you want, but I'd need to > > download and install some extra versions of Python, so you'd have to > > tell me which you want doing (and I can't offer to commit to doing > > this on a regular basis...) 
> > Thanks, but let's wait till this settles down a bit (say, a week > passes without me saying anything about it :-). Then I'll definitely > want both VS and mingw versions to upload to the Cheeseshop. But > it's not quite ready yet. OK, ignore my other message then (except as an indication that I can build them when you're ready :-)). You don't need VS and mingw binary installers, though - the mingw ones will work for any Python (ignoring specialised custom builds, and anyone doing one of them is probably capable of building the ssl module!). Paul. From jmtulloss at gmail.com Thu Sep 13 20:45:09 2007 From: jmtulloss at gmail.com (Justin Tulloss) Date: Thu, 13 Sep 2007 13:45:09 -0500 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <46E91BDB.7070601@v.loewis.de> <1189696391.11322.275.camel@localhost> Message-ID: <2cfeb93c0709131145p5ef9aea6geeb3f6d03c8227c7@mail.gmail.com> On 9/13/07, Adam Olsen wrote: > > > Basically though, atomic incref/decref won't work. Once you've got > two threads modifying the same location the costs skyrocket. Even > without being properly atomic you'll get the same slowdown on x86 > (who's cache coherency is fairly strict.) I'm a bit skeptical of the actual costs of atomic incref. For there to be contention, you would need to have to be modifying the same memory location at the exact same time. That seems unlikely to ever happen. We can't bank on it never happening, but an occasionally expensive operation is ok. After all, it's occasional. Justin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20070913/f373a0ec/attachment.htm From greg.ewing at canterbury.ac.nz Fri Sep 14 00:59:36 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Sep 2007 10:59:36 +1200 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <46E795FD.1070103@v.loewis.de> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> Message-ID: <46E9C0D8.5050100@canterbury.ac.nz> Martin v. Löwis wrote: > Now we are getting into details: you do NOT have to lock > an object to modify its reference count. An atomic > increment/decrement operation is enough. I stand corrected. But if it were as simple as that, I think it would have been done by now. I got the impression that this had already been tried, and it was still too slow. -- Greg From mhammond at skippinet.com.au Fri Sep 14 01:18:12 2007 From: mhammond at skippinet.com.au (Mark Hammond) Date: Fri, 14 Sep 2007 09:18:12 +1000 Subject: [Python-Dev] Windows package for new SSL package?
In-Reply-To: <79990c6b0709131328s1d042881wf4a4c396b82b720d@mail.gmail.com> References: <01bf01c7f593$d7beecc0$873cc640$@com.au> <01da01c7f5b2$cadca4b0$6095ee10$@com.au> <79990c6b0709130402l28f62b5dr73be665d927c65d0@mail.gmail.com> <9f94e2360709130414n4817b94eufcbdc8829c069d38@mail.gmail.com> <79990c6b0709130521h2a73115ai281fa881e37c9dd8@mail.gmail.com> <-3701875869749923189@unknownmsgid> <79990c6b0709131246k4d442362y2f684bd5342e1a06@mail.gmail.com> <7888070288790377344@unknownmsgid> <79990c6b0709131328s1d042881wf4a4c396b82b720d@mail.gmail.com> Message-ID: <028f01c7f65c$671741b0$3545c510$@com.au> > You don't need VS and mingw binary installers, though - the mingw ones > will work for any Python (ignoring specialised custom builds, and > anyone doing one of them is probably capable of building the ssl > module!). While I appreciate your points about building the extension with free tools, wouldn't it be prudent to release binaries using the same compiler as Python itself, assuming that option is available? If I read this thread correctly, a mingw build will rely on an openssl DLL being available or installed, which would seem to be less desirable than building against the same openssl that Python itself builds with. Mark From skip at pobox.com Fri Sep 14 01:38:05 2007 From: skip at pobox.com (skip at pobox.com) Date: Thu, 13 Sep 2007 18:38:05 -0500 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <1189696391.11322.275.camel@localhost> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <46E91BDB.7070601@v.loewis.de> <1189696391.11322.275.camel@localhost> Message-ID: <18153.51677.587497.132597@montanaro.dyndns.org> Hrvoje> More precisely, Python will call the deallocator appropriate for Hrvoje> the object type. If that deallocator does nothing, the object Hrvoje> continues to live.
Such objects could also start out with a Hrvoje> refcount of sys.maxint or so to ensure that calls to the no-op Hrvoje> deallocator are unlikely. Maybe sys.maxint/2? You'd hate for the first incref to invoke the deallocator even if it was a no-op. Skip From jon+python-dev at unequivocal.co.uk Fri Sep 14 03:01:22 2007 From: jon+python-dev at unequivocal.co.uk (Jon Ribbens) Date: Fri, 14 Sep 2007 02:01:22 +0100 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <18153.51677.587497.132597@montanaro.dyndns.org> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <46E91BDB.7070601@v.loewis.de> <1189696391.11322.275.camel@localhost> <18153.51677.587497.132597@montanaro.dyndns.org> Message-ID: <20070914010122.GN32061@snowy.squish.net> On Thu, Sep 13, 2007 at 06:38:05PM -0500, skip at pobox.com wrote: > Hrvoje> More precisely, Python will call the deallocator appropriate for > Hrvoje> the object type. If that deallocator does nothing, the object > Hrvoje> continues to live. Such objects could also start out with a > Hrvoje> refcount of sys.maxint or so to ensure that calls to the no-op > Hrvoje> deallocator are unlikely. > > Maybe sys.maxint/2? You'd hate for the first incref to invoke the > deallocator even if it was a no-op. I do believe I already suggested that ;-) From greg.ewing at canterbury.ac.nz Fri Sep 14 05:11:00 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Sep 2007 15:11:00 +1200 Subject: [Python-Dev] Removing the GIL (Me, not you!) 
In-Reply-To: References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> Message-ID: <46E9FBC4.1020901@canterbury.ac.nz> Christian Heimes wrote: > Pardon my ignorance but why does Python do reference counting for truly > global and static objects Because it would cost more time to check whether the reference counting needed to be done than to just do it anyway. Remember that *most* refcount operations are on non-global objects. Putting in a test would slow all of them down, but only speed a few of them up. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Fri Sep 14 05:15:23 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Sep 2007 15:15:23 +1200 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <20070913105527.GH32061@snowy.squish.net> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> Message-ID: <46E9FCCB.1050105@canterbury.ac.nz> Jon Ribbens wrote: > To put it another way, would it actually matter if the reference > counts for such objects became hopelessly wrong due to non-atomic > adjustments? Again, it would cost time to check whether you could get away with doing non-atomic refcounting. If you're thinking that no check would be needed because only things like True, False and None would be shared between threads, that's quite wrong. If the threads are to communicate at all, they need to share some kind of data somewhere. 
Also keep in mind that there is one case of "wrong" refcounting that would be disastrous, which is the case where the refcount becomes zero prematurely. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Fri Sep 14 05:19:04 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Sep 2007 15:19:04 +1200 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <18153.15002.76898.448843@montanaro.dyndns.org> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <18153.15002.76898.448843@montanaro.dyndns.org> Message-ID: <46E9FDA8.6010303@canterbury.ac.nz> skip at pobox.com wrote: > what if ... we use atomic test-and-set to > handle reference counting (with a lock for those CPU architectures where we > haven't written the necessary assembler fragment), then implement a lock for > each mutable type and another for global state (thread state, interpreter > state, etc)? Could be worth a try. A first step might be to just implement the atomic refcounting, and run that single-threaded to see if it has terribly bad effects on performance. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Fri Sep 14 05:43:57 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Sep 2007 15:43:57 +1200 Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> <18152.2055.258930.576257@montanaro.dyndns.org> <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> <46E8BFAA.5090008@v.loewis.de> Message-ID: <46EA037D.5030909@canterbury.ac.nz> Prateek Sureka wrote: > Naturally, we need to make the locking more > fine-grained to resolve this. Hopefully we can do so in a way that > does not increase the lock overhead (hence my suggestion for a lock > free approach using an asynch queue and a core as dedicated server). What you don't seem to see is that this would have no less overhead, and probably a lot *more*, than a mutex or other standard synchronisation mechanism. Certainly a lot more than an atomic instruction for the incref/decref. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg.ewing at canterbury.ac.nz Fri Sep 14 05:55:30 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 14 Sep 2007 15:55:30 +1200 Subject: [Python-Dev] Removing the GIL (Me, not you!) 
In-Reply-To: References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <46E652CD.1070901@v.loewis.de> <2cfeb93c0709110807t4d49f720l996710f5fe4ee3de@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> <18152.2055.258930.576257@montanaro.dyndns.org> <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> <2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com> Message-ID: <46EA0632.5060603@canterbury.ac.nz> Jason Orendorff wrote: > The clever bit is that SpiderMonkey's per-object > locking does *not* require a context switch or even an atomic > instruction, in the usual case where an object is *not* shared among > threads. How does it tell whether an object is shared between threads? That sounds like the really clever bit to me. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From tleeuwenburg at gmail.com Fri Sep 14 06:13:12 2007 From: tleeuwenburg at gmail.com (Tennessee Leeuwenburg) Date: Fri, 14 Sep 2007 14:13:12 +1000 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <46EA0632.5060603@canterbury.ac.nz> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> <18152.2055.258930.576257@montanaro.dyndns.org> <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> <2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com> <46EA0632.5060603@canterbury.ac.nz> Message-ID: <43c8685c0709132113u2282bb7bg848ed10c7d80f640@mail.gmail.com> Pardon me for talking with no experience in such matters, but... 
Okay, incrementing a reference counter is atomic, therefore the cheapest possible operation. Is it possible to keep reference counting atomic in a multi-thread model? Could you do the following... let's consider two threads, "A" and "B". Each time an object is created, a reference count is created in both "A" and "B". Let's suppose "A" has a real reference and "B" has none. Couldn't the GC check two reference registers for a reference count? The object would then be cleaned up only if both registers were 0. To exploit multiple CPUs, you could have two persistent Python processes, one on each CPU, each with its own mini-GIL. Object creation would then involve a call to each process to create the reference, and GC would involve checking each process to see what its count is. However, it would mean that within each process, threads could create additional references or remove references in an atomic way. In a single-CPU system, this would cost the same as it does currently, since I think that situation would devolve to having just one place to check for references. So it should be no more expensive on a single-CPU system. For a two-CPU system, I have no expertise in the actual call overheads of object creation and garbage collection, but logically it would double the effort of object creation and destruction (all such operations now need to occur on both processes) while keeping reference increments and decrements atomic. Once again, I'm really sorry if I'm completely off-base since I have never done any actual coding in this area, but I thought I'd make the suggestion just in case it happened to have relevance. Thanks, -Tennessee -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20070914/e11d56bb/attachment.htm From tulloss2 at uiuc.edu Fri Sep 14 07:10:34 2007 From: tulloss2 at uiuc.edu (Justin Tulloss) Date: Fri, 14 Sep 2007 00:10:34 -0500 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <46EA0632.5060603@canterbury.ac.nz> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <46E7002D.6050005@v.loewis.de> <46E722BF.8000807@canterbury.ac.nz> <46E795FD.1070103@v.loewis.de> <18152.2055.258930.576257@montanaro.dyndns.org> <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com> <2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com> <46EA0632.5060603@canterbury.ac.nz> Message-ID: <2cfeb93c0709132210o4c5f6e56va0c9e2d9ebf1be27@mail.gmail.com> On 9/13/07, Greg Ewing wrote: > > Jason Orendorff wrote: > > The clever bit is that SpiderMonkey's per-object > > locking does *not* require a context switch or even an atomic > > instruction, in the usual case where an object is *not* shared among > > threads. > > How does it tell whether an object is shared between > threads? That sounds like the really clever bit to me. If you look at the article, they have a code sample. Basically a global is "owned" by the first thread that touches it. That thread can do whatever it wants with that global. If another thread wants to touch the global, it locks everything to do so. This is a pretty good idea except that in Python there are so many globals that all threads benefit from having access to. Luckily, except for their reference counts, they're mostly read-only. Therefore, if we can work out this reference count, we can probably use a similar concept. Justin -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20070914/53f674b4/attachment.htm From rhamph at gmail.com Fri Sep 14 08:10:17 2007 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 14 Sep 2007 00:10:17 -0600 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <46E9FDA8.6010303@canterbury.ac.nz> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <18153.15002.76898.448843@montanaro.dyndns.org> <46E9FDA8.6010303@canterbury.ac.nz> Message-ID: On 9/13/07, Greg Ewing wrote: > skip at pobox.com wrote: > > what if ... we use atomic test-and-set to > > handle reference counting (with a lock for those CPU architectures where we > > haven't written the necessary assembler fragment), then implement a lock for > > each mutable type and another for global state (thread state, interpreter > > state, etc)? > > Could be worth a try. A first step might be to just implement > the atomic refcounting, and run that single-threaded to see > if it has terribly bad effects on performance. I've done this experiment. It was about 12% on my box. Later, once I had everything else setup so I could run two threads simultaneously, I found much worse costs. All those literals become shared objects that create contention. I'm now working on an approach that writes out refcounts in batches to reduce contention. The initial cost is much higher, but it scales better too. I've currently got it to just under 50% cost, meaning two threads is a slight net gain. -- Adam Olsen, aka Rhamphoryncus From steve at holdenweb.com Fri Sep 14 08:15:41 2007 From: steve at holdenweb.com (Steve Holden) Date: Fri, 14 Sep 2007 02:15:41 -0400 Subject: [Python-Dev] Removing the GIL (Me, not you!) 
In-Reply-To: <1189696391.11322.275.camel@localhost> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <46E91BDB.7070601@v.loewis.de> <1189696391.11322.275.camel@localhost> Message-ID: Hrvoje Nikšić wrote: > On Thu, 2007-09-13 at 13:15 +0200, "Martin v. Löwis" wrote: >>> To put it another way, would it actually matter if the reference >>> counts for such objects became hopelessly wrong due to non-atomic >>> adjustments? >> If they drop to zero (which may happen due to non-atomic adjustments), >> Python will try to release the static memory, which will crash the >> malloc implementation. > > More precisely, Python will call the deallocator appropriate for the > object type. If that deallocator does nothing, the object continues to > live. Such objects could also start out with a refcount of sys.maxint > or so to ensure that calls to the no-op deallocator are unlikely. > The thought of adding references is amusing. What happens when a refcount becomes negative by overflow? I know, I should read the source ... regards Steve -- Steve Holden +1 571 484 6266 +1 800 494 3119 Holden Web LLC/Ltd http://www.holdenweb.com Skype: holdenweb http://del.icio.us/steve.holden Sorry, the dog ate my .sigline From tulloss2 at uiuc.edu Fri Sep 14 08:51:35 2007 From: tulloss2 at uiuc.edu (Justin Tulloss) Date: Fri, 14 Sep 2007 01:51:35 -0500 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <18153.15002.76898.448843@montanaro.dyndns.org> <46E9FDA8.6010303@canterbury.ac.nz> Message-ID: <2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com> On 9/14/07, Adam Olsen wrote: > > Could be worth a try.
A first step might be to just implement > > the atomic refcounting, and run that single-threaded to see > > if it has terribly bad effects on performance. > > I've done this experiment. It was about 12% on my box. Later, once I > had everything else setup so I could run two threads simultaneously, I > found much worse costs. All those literals become shared objects that > create contention. It's hard to argue with cold hard facts when all we have is raw speculation. What do you think of a model where there is a global "thread count" that keeps track of how many threads reference an object? Then there are thread-specific reference counters for each object. When a thread's refcount goes to 0, it decrefs the object's thread count. If you did this right, hopefully there would only be cache updates when you update the thread count, which will only be when a thread first references an object and when it last references an object. I mentioned this idea earlier and it's growing on me. Since you've actually messed around with the code, do you think this would alleviate some of the contention issues? Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070914/9b77d4b1/attachment.htm From hrvoje.niksic at avl.com Fri Sep 14 09:25:24 2007 From: hrvoje.niksic at avl.com (Hrvoje =?UTF-8?Q?Nik=C5=A1i=C4=87?=) Date: Fri, 14 Sep 2007 09:25:24 +0200 Subject: [Python-Dev] Removing the GIL (Me, not you!) 
In-Reply-To: <18153.51677.587497.132597@montanaro.dyndns.org> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <46E91BDB.7070601@v.loewis.de> <1189696391.11322.275.camel@localhost> <18153.51677.587497.132597@montanaro.dyndns.org> Message-ID: <1189754724.11322.279.camel@localhost> On Thu, 2007-09-13 at 18:38 -0500, skip at pobox.com wrote: > Hrvoje> More precisely, Python will call the deallocator appropriate for > Hrvoje> the object type. If that deallocator does nothing, the object > Hrvoje> continues to live. Such objects could also start out with a > Hrvoje> refcount of sys.maxint or so to ensure that calls to the no-op > Hrvoje> deallocator are unlikely. > > Maybe sys.maxint/2? You'd hate for the first incref to invoke the > deallocator even if it was a no-op. ob_refcnt is signed. :-) From martin at v.loewis.de Fri Sep 14 14:24:12 2007 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 14 Sep 2007 14:24:12 +0200 Subject: [Python-Dev] Daily Windows Installers Message-ID: <46EA7D6C.8010600@v.loewis.de> Together with David Bolen, I set up a series of buildbot slaves that create an MSI installer from the 2.5, 2.6, and 3.0 branches every day. The result files are available from http://www.python.org/dev/daily-msi/ The buildbot pages themselves are at http://www.python.org/dev/buildbot/msi/ There are still some glitches with that installation (in particular, the Microsoft help compiler seems to crash occasionally). If you find any problems with the MSI files themselves, please report them to this list, or to the bug tracker. Regards, Martin From facundobatista at gmail.com Thu Sep 13 21:19:02 2007 From: facundobatista at gmail.com (Facundo Batista) Date: Thu, 13 Sep 2007 16:19:02 -0300 Subject: [Python-Dev] Decimal news Message-ID: Hi people! 
After some months, Decimal is now in the trunk again. It's fully updated to the latest Cowlishaw specification, and complies with the latest test cases (from a few days ago, which even take into consideration some of our feedback). I want to thank Mark Dickinson very much; he did *a lot* of this work, not only the math part (he's a mathematician himself), but also a lot of cleaning and speeding up. Now we will turn to the documentation, so that it is 100% OK well before 2.6 arrives. Py3 will come after that. Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From g.brandl at gmx.net Fri Sep 14 17:43:17 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 14 Sep 2007 17:43:17 +0200 Subject: [Python-Dev] Daily Windows Installers In-Reply-To: <46EA7D6C.8010600@v.loewis.de> References: <46EA7D6C.8010600@v.loewis.de> Message-ID: Martin v. Löwis wrote: > Together with David Bolen, I set up a series of buildbot > slaves that create an MSI installer from the 2.5, 2.6, > and 3.0 branches every day. The result files are available > from > > http://www.python.org/dev/daily-msi/ > > The buildbot pages themselves are at > > http://www.python.org/dev/buildbot/msi/ > > There are still some glitches with that installation > (in particular, the Microsoft help compiler seems to > crash occasionally). I hope this isn't due to the files that Sphinx creates. I had a nasty crash with HTML Help Workshop when I generated an "invalid" index file -- but this was reproducible of course. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
From p.f.moore at gmail.com Fri Sep 14 17:46:11 2007 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 14 Sep 2007 16:46:11 +0100 Subject: [Python-Dev] Daily Windows Installers In-Reply-To: <46EA7D6C.8010600@v.loewis.de> References: <46EA7D6C.8010600@v.loewis.de> Message-ID: <79990c6b0709140846g26a99c33hd39e60eb226edc1a@mail.gmail.com> On 14/09/2007, "Martin v. Löwis" wrote: > Together with David Bolen, I set up a series of buildbot > slaves that create an MSI installer from the 2.5, 2.6, > and 3.0 branches every day. That's good news. Thanks for doing this. Paul. From rhamph at gmail.com Fri Sep 14 18:33:09 2007 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 14 Sep 2007 10:33:09 -0600 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <18153.15002.76898.448843@montanaro.dyndns.org> <46E9FDA8.6010303@canterbury.ac.nz> <2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com> Message-ID: On 9/14/07, Justin Tulloss wrote: > > On 9/14/07, Adam Olsen wrote: > > > Could be worth a try. A first step might be to just implement > > > the atomic refcounting, and run that single-threaded to see > > > if it has terribly bad effects on performance. > > > > I've done this experiment. It was about 12% on my box. Later, once I > > had everything else setup so I could run two threads simultaneously, I > > found much worse costs. All those literals become shared objects that > > create contention. > > It's hard to argue with cold hard facts when all we have is raw speculation. > What do you think of a model where there is a global "thread count" that > keeps track of how many threads reference an object? Then there are > thread-specific reference counters for each object.
When a thread's refcount > goes to 0, it decrefs the object's thread count. If you did this right, > hopefully there would only be cache updates when you update the thread > count, which will only be when a thread first references an object and when > it last references an object. > > I mentioned this idea earlier and it's growing on me. Since you've actually > messed around with the code, do you think this would alleviate some of the > contention issues? There would be some poor worst-case behaviour. In the case of literals you'd start referencing them when you call a function, then stop when the function returns. Same for any shared datastructure. I think caching/buffering refcounts in general holds promise though. My current approach uses a crude hash table as a cache and only flushes when there's a collision or when the tracing GC starts up. So far I've only got about 50% of the normal performance, but that's with 90% or more scalability, and I'm hoping to keep improving it. -- Adam Olsen, aka Rhamphoryncus From martin at v.loewis.de Fri Sep 14 18:45:29 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 14 Sep 2007 18:45:29 +0200 Subject: [Python-Dev] Daily Windows Installers In-Reply-To: References: <46EA7D6C.8010600@v.loewis.de> Message-ID: <46EABAA9.40407@v.loewis.de> > I hope this isn't due to the files that Sphinx creates. > I had a nasty crash with HTML Help Workshop when I generated > an "invalid" index file -- but this was reproducible of course. It's not clear what precisely the problem is, but yes, it must have to do with the input :-) If you fixed that problem fairly recently (within the last 48 hours), this may have been the one we were seeing. Unfortunately, this is again one of the Windows problems which make buildbot on Windows so difficult: it brings up an error window, and then hangs. 
Regards, Martin From tonynelson at georgeanelson.com Fri Sep 14 18:44:25 2007 From: tonynelson at georgeanelson.com (Tony Nelson) Date: Fri, 14 Sep 2007 12:44:25 -0400 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <18153.15002.76898.448843@montanaro.dyndns.org> <46E9FDA8.6010303@canterbury.ac.nz> <2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com> Message-ID: At 1:51 AM -0500 9/14/07, Justin Tulloss wrote: >On 9/14/07, Adam Olsen <rhamph at gmail.com> wrote: > >> Could be worth a try. A first step might be to just implement >> the atomic refcounting, and run that single-threaded to see >> if it has terribly bad effects on performance. > >I've done this experiment. It was about 12% on my box. Later, once I >had everything else setup so I could run two threads simultaneously, I >found much worse costs. All those literals become shared objects that >create contention. > > >It's hard to argue with cold hard facts when all we have is raw >speculation. What do you think of a model where there is a global "thread >count" that keeps track of how many threads reference an object? Then >there are thread-specific reference counters for each object. When a >thread's refcount goes to 0, it decrefs the object's thread count. If you >did this right, hopefully there would only be cache updates when you >update the thread count, which will only be when a thread first references >an object and when it last references an object. It's likely that cache line contention is the issue, so don't glom all the different threads' refcount for an object into one vector. 
Keep each thread's refcounts in a per-thread vector of objects, so only that thread will cache that vector, or make refcounts so large that each will be in its own cache line (usu. 64 bytes, not too horrible for testing purposes). I don't know all what would be required for separate vectors of refcounts, but each object could contain its index into the vectors, which would all be the same size (Go Virtual Memory!). >I mentioned this idea earlier and it's growing on me. Since you've >actually messed around with the code, do you think this would alleviate >some of the contention issues? > >Justin Your idea can be combined with the maxint/2 initial refcount for non-disposable objects, which should about eliminate thread-count updates for them. -- ____________________________________________________________________ TonyN.:' ' -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070914/70897382/attachment.htm From status at bugs.python.org Fri Sep 14 19:36:49 2007 From: status at bugs.python.org (Tracker) Date: Fri, 14 Sep 2007 17:36:49 +0000 (UTC) Subject: [Python-Dev] Summary of Tracker Issues Message-ID: <20070914173649.C02167815C@psf.upfronthosting.co.za> ACTIVITY SUMMARY (09/07/07 - 09/14/07) Tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue number. Do NOT respond to this message. 1274 open (+24) / 11372 closed (+11) / 12646 total (+35) Average duration of open issues: 672 days. Median duration of open issues: 640 days. 
Open Issues Breakdown open 1270 (+24) pending 4 ( +0) Issues Created Or Reopened (35) _______________________________ OpenSSL detection broken for Python 3.0a1 09/07/07 CLOSED http://bugs.python.org/issue1129 created pythonmeister Idle - Save (buffer) - closes IDLE and does not save file (Windo 09/08/07 http://bugs.python.org/issue1130 created infixum Reference Manual: "for statement" links to "break statement" 09/08/07 http://bugs.python.org/issue1131 created Martoon compile error in poplib.py 09/08/07 CLOSED http://bugs.python.org/issue1132 created andre python3.0-config raises SyntaxError 09/09/07 CLOSED http://bugs.python.org/issue1133 created complex Parsing a simple script eats all of your memory 09/09/07 http://bugs.python.org/issue1134 created complex xview/yview of Tix.Grid is broken 09/09/07 http://bugs.python.org/issue1135 created ocean-city Bdb documentation 09/09/07 CLOSED http://bugs.python.org/issue1136 created arklad pyexpat patch for changing buffer_size 09/09/07 http://bugs.python.org/issue1137 created AchimGaedke Fixer needed for __future__ imports 09/09/07 http://bugs.python.org/issue1138 created collinwinter PyFile_Encoding should be PyFile_SetEncoding 09/10/07 CLOSED http://bugs.python.org/issue1139 created gagenellina re.sub returns str when processing empty unicode string 09/10/07 http://bugs.python.org/issue1140 created beda reading large files 09/10/07 http://bugs.python.org/issue1141 created Richard.Christen at unice.fr code sample showing errors reading large files with py 2.5/3.0 09/10/07 http://bugs.python.org/issue1142 created Richard.Christen at unice.fr Update to latest ElementTree in Python 2.6 09/11/07 http://bugs.python.org/issue1143 created effbot parsermodule validation out of sync with Grammar 09/11/07 http://bugs.python.org/issue1144 created dbinger Allow str.join to join non-string types (as per PEP 3100) 09/11/07 http://bugs.python.org/issue1145 created thomas.lee TextWrap vs words 1-character shorter than the width 
09/11/07 http://bugs.python.org/issue1146 created sam string exceptions inconsistently deprecated/disabled 09/11/07 CLOSED http://bugs.python.org/issue1147 created exarkun TypeError on join - httplib mixing str and bytes 09/11/07 CLOSED http://bugs.python.org/issue1148 created eopadoan fdopen does not work as expected 09/11/07 http://bugs.python.org/issue1149 created luis at luispedro.org Rename PyBUF_WRITEABLE to PyBUF_WRITABLE 09/11/07 http://bugs.python.org/issue1150 created gvanrossum "TypeError: expected string, bytes found" instead of KeyboardInt 09/11/07 http://bugs.python.org/issue1151 created eopadoan Bug in documentation for SimpleXMLRPCServer 09/12/07 CLOSED http://bugs.python.org/issue1152 created FrankMillman help(pickle) fails: unorderable types: type() < type() 09/12/07 CLOSED http://bugs.python.org/issue1153 created Qrczak Carbon.CF memory leak 09/12/07 CLOSED http://bugs.python.org/issue1154 created hhas Carbon.CF memory management problem 09/12/07 http://bugs.python.org/issue1155 created hhas Suggested change to _exit function description in os module docu 09/12/07 http://bugs.python.org/issue1156 created jtonsing test_urllib2net fails on test_ftp 09/12/07 http://bugs.python.org/issue1157 created gvanrossum %f format for datetime objects 09/13/07 http://bugs.python.org/issue1158 created skip.montanaro os.getenv() not updated after external module uses C putenv() 09/13/07 http://bugs.python.org/issue1159 created robert.ancell Medium size regexp crashes python 09/13/07 http://bugs.python.org/issue1160 created ostkamp Garbled chars in offending line of SyntaxError traceback 09/13/07 http://bugs.python.org/issue1161 created eopadoan Python doesn't compile on Microsoft Visual Studio 2008 "Orcas" B 09/13/07 CLOSED http://bugs.python.org/issue1162 created swaroopch Patch to make py3k/Lib/test/test_thread.py use unittest 09/13/07 http://bugs.python.org/issue1163 created JonoDiCarlo Issues Now Closed (33) ______________________ Backport ABC to 2.6 16 days 
http://bugs.python.org/issue1026    baranguren

[py3k] pdb does not work in python 3000    16 days
http://bugs.python.org/issue1038    georg.brandl

test_cmd_line starts python without -E    13 days
http://bugs.python.org/issue1056    ncoghlan

ssl.py shouldn't change class names from 2.6 to 3.x    11 days
http://bugs.python.org/issue1065    janssen

TypeError in poplib.py    7 days
http://bugs.python.org/issue1094    gvanrossum

make install failed    4 days
http://bugs.python.org/issue1095    georg.brandl

Deeply recursive repr segfault    7 days
http://bugs.python.org/issue1096    brett.cannon

2to3, lambda with non-tuple argument inside parenthesis    4 days
http://bugs.python.org/issue1107    collinwinter

"make altinstall" installs pydoc, idle, smtpd.py with broken sh    6 days
http://bugs.python.org/issue1120    georg.brandl

Document inspect.getfullargspec()    6 days
http://bugs.python.org/issue1121    georg.brandl

PyTuple_Size and PyTuple_GET_SIZE return type documentation inc    6 days
http://bugs.python.org/issue1122    georg.brandl

bytes.split shold have same interface as str.split, or differen    4 days
http://bugs.python.org/issue1125    gvanrossum

OpenSSL detection broken for Python 3.0a1    0 days
http://bugs.python.org/issue1129    georg.brandl

compile error in poplib.py    1 days
http://bugs.python.org/issue1132    georg.brandl

python3.0-config raises SyntaxError    0 days
http://bugs.python.org/issue1133    loewis

Bdb documentation    3 days
http://bugs.python.org/issue1136    georg.brandl

PyFile_Encoding should be PyFile_SetEncoding    3 days
http://bugs.python.org/issue1139    georg.brandl

string exceptions inconsistently deprecated/disabled    0 days
http://bugs.python.org/issue1147    brett.cannon

TypeError on join - httplib mixing str and bytes    1 days
http://bugs.python.org/issue1148    gvanrossum

Bug in documentation for SimpleXMLRPCServer    1 days
http://bugs.python.org/issue1152    georg.brandl

help(pickle) fails: unorderable types: type() < type()    0 days
http://bugs.python.org/issue1153    georg.brandl

Carbon.CF memory leak    0 days
http://bugs.python.org/issue1154    georg.brandl

Python doesn't compile on Microsoft Visual Studio 2008 "Orcas"    1 days
http://bugs.python.org/issue1162    georg.brandl

time mod's timezone doesn't honor TZ var    2114 days
http://bugs.python.org/issue487331    brett.cannon

asyncore file wrapper & os.error    1988 days
http://bugs.python.org/issue539444    brett.cannon

support for server side transactions in _ssl    1498 days
http://bugs.python.org/issue783188    loewis

class property fset not working    842 days
http://bugs.python.org/issue1207379    georg.brandl

Traceback error when compiling Regex    537 days
http://bugs.python.org/issue1456280    brett.cannon

NNTPS support in nntplib    402 days
http://bugs.python.org/issue1535659    janssen

SSL "issuer" and "server" names cannot be parsed    321 days
http://bugs.python.org/issue1583946    janssen

Suggest a textlist() method for ElementTree    291 days
http://bugs.python.org/issue1602189    effbot

socket.error exceptions not subclass of StandardError    138 days
http://bugs.python.org/issue1706815    gregory.p.smith

Binding fails    26 days
http://bugs.python.org/issue1774736    loewis


Top Issues Most Discussed (10)
______________________________

 11 re.sub returns str when processing empty unicode string    4 days
open    http://bugs.python.org/issue1140

  9 os.getenv() not updated after external module uses C putenv()    1 days
open    http://bugs.python.org/issue1159

  8 code sample showing errors reading large files with py 2.5/3.0    4 days
open    http://bugs.python.org/issue1142

  8 reading large files    4 days
open    http://bugs.python.org/issue1141

  6 Allow str.join to join non-string types (as per PEP 3100)    3 days
open    http://bugs.python.org/issue1145

  5 %f format for datetime objects    2 days
open    http://bugs.python.org/issue1158

  5 Parsing a simple script eats all of your memory    6 days
open    http://bugs.python.org/issue1134

  5 bytes.split shold have same interface as str.split, or differen    4 days
closed    http://bugs.python.org/issue1125

  4 logging: delay_fh option and configuration kwargs    44 days
open    http://bugs.python.org/issue1765140

  4 Python SEGFAULT on tuple.__repr__ and str()    176 days
pending    http://bugs.python.org/issue1686386


From barry at python.org Fri Sep 14 20:34:44 2007
From: barry at python.org (Barry Warsaw)
Date: Fri, 14 Sep 2007 14:34:44 -0400
Subject: [Python-Dev] base64 -- should b64encode introduce line breaks?
In-Reply-To: <07Sep13.114308pdt."57996"@synergy1.parc.xerox.com>
References: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com>
 <07Sep13.114308pdt."57996"@synergy1.parc.xerox.com>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 13, 2007, at 2:43 PM, Bill Janssen wrote:

>> I see that base64.b64encode and base64.standard_b64encode no longer
>> introduce line breaks into the output strings, as base64.encodestring
>> does. Shouldn't there be an option on one of them to do this?
>
> See:
>
> http://mail.python.org/pipermail/python-bugs-list/2001-October/007856.html
>
> section 2.1 of http://www.faqs.org/rfcs/rfc3548.html
>
> Perhaps adding MIME_b64encode() and PEM_b64encode() routines? Or just
> an optional parameter to standard_b64encode, called "max_line_length",
> defaulting to 0, meaning no max?

It turns out to be inconvenient in other contexts to do the line
splitting at this lower level, so I would prefer to leave the current
methods as is (that means, no change in semantics or arguments).

I wouldn't necessarily be opposed to new functions that did the line
splitting, but ideally, you could design an API that provided that
behavior for any of the existing alternatives in the base64 module,
without duplicating them all. It's not clear to me how you'd do that
though (or if it's worth it).
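
One way to get that behavior for every existing encoder without
duplicating them might be to wrap the encoder's output after the fact.
What follows is only a minimal sketch under stated assumptions: the
helper names wrap_lines and encode_wrapped are hypothetical (neither
exists in the base64 module), the code uses Python 3 bytes semantics,
and the widths 64 and 76 are the usual PEM and MIME line limits.

```python
import base64


def wrap_lines(data, width=64, newline=b"\n"):
    """Split already-encoded bytes into lines of at most `width` bytes.

    Hypothetical helper; a width of 0 or less means "no maximum".
    """
    if width <= 0:
        return data
    chunks = [data[i:i + width] for i in range(0, len(data), width)]
    return newline.join(chunks)


def encode_wrapped(encoder, raw, width=64):
    """Apply any base64-module encoder, then wrap its output.

    `encoder` is any callable that takes bytes and returns bytes,
    e.g. base64.standard_b64encode or base64.urlsafe_b64encode.
    """
    return wrap_lines(encoder(raw), width)


payload = bytes(range(100))
# PEM-style wrapping at 64 characters per line.
pem_style = encode_wrapped(base64.standard_b64encode, payload, width=64)
# MIME-style wrapping at 76 characters, using a different encoder.
url_safe = encode_wrapped(base64.urlsafe_b64encode, payload, width=76)
```

Keeping the wrapping outside the encoders leaves their semantics and
arguments unchanged, which is the constraint stated above.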
- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCUAwUBRurURHEjvBPtnXfVAQLSZAP4pdn3lUPvVfqSl+RT4GBzYmL1uUTMrmJx
+lc+7SEaOj0sphfQbTmN9kKlwS2cJQ7UdZQzXM6t5+zlM+b4GRl6pA0CEk/M3PUI
VWs3JkxgMRQA0CoeF5AflLru7ZxEL7pYej88y9KPAZCQ7H6e0+b8TCr/6Qj0YiYw
c2eLfZoSAA==
=klKj
-----END PGP SIGNATURE-----

From tulloss2 at uiuc.edu Fri Sep 14 21:13:47 2007
From: tulloss2 at uiuc.edu (Justin Tulloss)
Date: Fri, 14 Sep 2007 14:13:47 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To:
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
 <20070911192035.94CE33A40D7@sparrow.telecommunity.com>
 <200709131219.21152.nd@perlig.de>
 <20070913105527.GH32061@snowy.squish.net>
 <18153.15002.76898.448843@montanaro.dyndns.org>
 <46E9FDA8.6010303@canterbury.ac.nz>
 <2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com>
Message-ID: <2cfeb93c0709141213t6727efack5db90dc706e2b95f@mail.gmail.com>

Your idea can be combined with the maxint/2 initial refcount for
> non-disposable objects, which should about eliminate thread-count updates
> for them.
> --
>

I don't really like the maxint/2 idea because it requires us to
differentiate between globals and everything else. Plus, it's a hack. I'd
like a more elegant solution if possible.

Justin

From janssen at parc.com Fri Sep 14 21:20:32 2007
From: janssen at parc.com (Bill Janssen)
Date: Fri, 14 Sep 2007 12:20:32 PDT
Subject: [Python-Dev] base64 -- should b64encode introduce line breaks?
In-Reply-To: References: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com> <07Sep13.114308pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep14.122040pdt."57996"@synergy1.parc.xerox.com> > >> I see that base64.b64encode and base64.standard_b64encode no longer > >> introduce line breaks into the output strings, as base64.encodestring > >> does. Shouldn't there be an option on one of them to do this? > > > > See: > > > > http://mail.python.org/pipermail/python-bugs-list/2001-October/ > > 007856.html > > > > section 2.1 of http://www.faqs.org/rfcs/rfc3548.html > > > > Perhaps adding MIME_b64encode() and PEM_b64encode() routines? Or just > > an optional parameter to standard_b64encode, called "max_line_length", > > defaulting to 0, meaning no max? > > It turns out to be inconvenient in other contexts to do the line > splitting at this lower level, so I would prefer to leave the current > methods as is (that means, no change in semantics or arguments). > > I wouldn't necessarily be opposed to new functions that did the line > splitting, but ideally, you could design an API that provided that > behavior for any of the existing alternatives in the base64 module, > without duplicating them all. It's not clear to me how you'd do that > though (or if it's worth it). I think that's probably right. I just added the PEM line-wrapping to the code in the ssl module. Though I hate to keep adding line-wrapping code here and there... Perhaps just adding a utility function, wrap_lines(), or some such to the module would suffice. Bill From db3l.net at gmail.com Fri Sep 14 21:26:25 2007 From: db3l.net at gmail.com (David Bolen) Date: Fri, 14 Sep 2007 15:26:25 -0400 Subject: [Python-Dev] Daily Windows Installers References: <46EA7D6C.8010600@v.loewis.de> Message-ID: Georg Brandl writes: > I hope this isn't due to the files that Sphinx creates. > I had a nasty crash with HTML Help Workshop when I generated > an "invalid" index file -- but this was reproducible of course. 

The really annoying thing is that this only occurs (so far) in the 3.0
tree when run beneath the buildbot, although it seems consistent
there. Using the same tree right after a crash, and running the same
build command interactively always seems to work fine. I thought it
might be a stdout/console thing but redirecting the compiler's output
to a file still crashes.

I think, but can't prove it has parsed all the input files, since the
last bit of output even in verbose mode is still buffered in its
process when it crashes.

I did determine that genindex.html is being created with malformed
HTML (< and > in operators aren't being quoted as &lt; and &gt;), but
manually fixing that didn't resolve the crash. And even in the 2.6
branch (which builds fine) genindex.html has erroneous uses of
"" that isn't quoted either.

For the moment I'm probably going to work to ensure we don't get the
pop-up box (which blocks the rest of the processing) so at least an
MSI can get created even if the chm is bad.

-- David

From exarkun at divmod.com Fri Sep 14 21:30:49 2007
From: exarkun at divmod.com (Jean-Paul Calderone)
Date: Fri, 14 Sep 2007 15:30:49 -0400
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <2cfeb93c0709141213t6727efack5db90dc706e2b95f@mail.gmail.com>
Message-ID: <20070914193049.8162.648500711.divmod.quotient.8674@ohm>

On Fri, 14 Sep 2007 14:13:47 -0500, Justin Tulloss wrote:
>Your idea can be combined with the maxint/2 initial refcount for
>> non-disposable objects, which should about eliminate thread-count updates
>> for them.
>> --
>>
>
> I don't really like the maxint/2 idea because it requires us to
>differentiate between globals and everything else. Plus, it's a hack. I'd
>like a more elegant solution if possible.

It's not really a solution either.
If your program runs for a couple minutes and then exits, maybe it won't trigger some catastrophic behavior from this hack, but if you have a long running process then you're almost certain to be screwed over by this (it wouldn't even have to be *very* long running - a month or two could do it on a 32bit platform). Jean-Paul From janssen at parc.com Fri Sep 14 21:36:22 2007 From: janssen at parc.com (Bill Janssen) Date: Fri, 14 Sep 2007 12:36:22 PDT Subject: [Python-Dev] SSL and asyncore update (and SSL_shutdown) Message-ID: <07Sep14.123630pdt."57996"@synergy1.parc.xerox.com> I've now got an HTTPS server (more importantly, one built on asyncore and SocketServer and BaseHTTPServer), running in the test suite. Also, I think that, for the moment, I'm going to take ssl.ssl_shutdown() out of the library. The state machine implemented at the GoogleSprint really only does the client side of SSL_shutdown(); it will take a bit more work for the server side. I'll check this in this weekend, and update the 2.3-compatible package. Bill From barry at python.org Fri Sep 14 21:54:23 2007 From: barry at python.org (Barry Warsaw) Date: Fri, 14 Sep 2007 15:54:23 -0400 Subject: [Python-Dev] base64 -- should b64encode introduce line breaks? In-Reply-To: <07Sep14.122040pdt."57996"@synergy1.parc.xerox.com> References: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com> <07Sep13.114308pdt."57996"@synergy1.parc.xerox.com> <07Sep14.122040pdt."57996"@synergy1.parc.xerox.com> Message-ID: <5331CD54-34F4-41D9-8731-D7E0E4CF96C0@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 14, 2007, at 3:20 PM, Bill Janssen wrote: > I think that's probably right. I just added the PEM line-wrapping to > the code in the ssl module. Though I hate to keep adding > line-wrapping code here and there... Perhaps just adding a utility > function, wrap_lines(), or some such to the module would suffice. Does anything in textwrap already do the trick? 
If not, that might be the best place to refactor similar code to. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (Darwin) iQCVAwUBRurm8HEjvBPtnXfVAQIGFwP9G/hYbZ/cR7n9X4NtKATFqm/Mp+q8SH3b jFUEuvf/y0/0Ri6aKpC9QJzLNg+ZlgthmaYNRT488SXPplbB4mtysFbJg+A9x3d3 fi4rkqXnrvJt6Msqbti7wt6sGYZRisDveztuKM5Sh8t+die+55e3bZg7ght6Vyuk +N6V9lg2/3A= =iknB -----END PGP SIGNATURE----- From steve at holdenweb.com Fri Sep 14 21:58:57 2007 From: steve at holdenweb.com (Steve Holden) Date: Fri, 14 Sep 2007 15:58:57 -0400 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <20070914193049.8162.648500711.divmod.quotient.8674@ohm> References: <2cfeb93c0709141213t6727efack5db90dc706e2b95f@mail.gmail.com> <20070914193049.8162.648500711.divmod.quotient.8674@ohm> Message-ID: Jean-Paul Calderone wrote: > On Fri, 14 Sep 2007 14:13:47 -0500, Justin Tulloss wrote: >> Your idea can be combined with the maxint/2 initial refcount for >>> non-disposable objects, which should about eliminate thread-count updates >>> for them. >>> -- >>> >> I don't really like the maxint/2 idea because it requires us to >> differentiate between globals and everything else. Plus, it's a hack. I'd >> like a more elegant solution if possible. > > It's not really a solution either. If your program runs for a couple > minutes and then exits, maybe it won't trigger some catastrophic behavior > from this hack, but if you have a long running process then you're almost > certain to be screwed over by this (it wouldn't even have to be *very* > long running - a month or two could do it on a 32bit platform). > Could each class define the value to be added to or subtracted from the refcount? We'd only need a bit to store the value (since it would always be zero or one), but the execution time might increase quite a lot if there's no nifty way to conditionally add or subtract one. 

regards
 Steve
--
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd           http://www.holdenweb.com
Skype: holdenweb      http://del.icio.us/steve.holden

Sorry, the dog ate my .sigline

From g.brandl at gmx.net Fri Sep 14 22:37:47 2007
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 14 Sep 2007 22:37:47 +0200
Subject: [Python-Dev] Daily Windows Installers
In-Reply-To:
References: <46EA7D6C.8010600@v.loewis.de>
Message-ID:

David Bolen wrote:
> Georg Brandl writes:
>
>> I hope this isn't due to the files that Sphinx creates.
>> I had a nasty crash with HTML Help Workshop when I generated
>> an "invalid" index file -- but this was reproducible of course.
>
> The really annoying thing is that this only occurs (so far) in the 3.0
> tree when run beneath the buildbot, although it seems consistent
> there. Using the same tree right after a crash, and running the same
> build command interactively always seems to work fine. I thought it
> might be a stdout/console thing but redirecting the compiler's output
> to a file still crashes.
>
> I think, but can't prove it has parsed all the input files, since the
> last bit of output even in verbose mode is still buffered in its
> process when it crashes.

Can't help you there, just notice that this is the same point where
I saw "my" crash.

> I did determine that genindex.html is being created with malformed
> HTML (< and > in operators aren't being quoted as &lt; and &gt;), but
> manually fixing that didn't resolve the crash. And even in the 2.6
> branch (which builds fine) genindex.html has erroneous uses of
> "" that isn't quoted either.

Okay, I should really fix this. Added a todo-list item.

> For the moment I'm probably going to work to ensure we don't get the
> pop-up box (which blocks the rest of the processing) so at least an
> MSI can get created even if the chm is bad.

Thanks for handling this!

Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From tomerfiliba at gmail.com Fri Sep 14 22:30:20 2007 From: tomerfiliba at gmail.com (tomer filiba) Date: Fri, 14 Sep 2007 20:30:20 -0000 Subject: [Python-Dev] import file extensions Message-ID: <1189801820.747989.270020@d55g2000hsg.googlegroups.com> a quick question: i'm working on a pythonic build system, where the build scripts are plain python files. but i want to differentiate them from normal python files (.py) by a different suffix (say .pyy), but then i can't import them. so i'm wondering, is there a quick way to just add another extension to import mechanism? or do i have to write a fully fledged import hook? -tomer From guido at python.org Fri Sep 14 23:11:41 2007 From: guido at python.org (Guido van Rossum) Date: Fri, 14 Sep 2007 14:11:41 -0700 Subject: [Python-Dev] import file extensions In-Reply-To: <1189801820.747989.270020@d55g2000hsg.googlegroups.com> References: <1189801820.747989.270020@d55g2000hsg.googlegroups.com> Message-ID: I think you're looking for a PEP 302 style meta hook. On 9/14/07, tomer filiba wrote: > a quick question: i'm working on a pythonic build system, where the > build > scripts are plain python files. but i want to differentiate them from > normal > python files (.py) by a different suffix (say .pyy), but then i can't > import > them. > > so i'm wondering, is there a quick way to just add another extension > to > import mechanism? or do i have to write a fully fledged import hook? 
> > > -tomer > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Fri Sep 14 23:19:49 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 14 Sep 2007 23:19:49 +0200 Subject: [Python-Dev] import file extensions In-Reply-To: <1189801820.747989.270020@d55g2000hsg.googlegroups.com> References: <1189801820.747989.270020@d55g2000hsg.googlegroups.com> Message-ID: <46EAFAF5.8030007@v.loewis.de> > so i'm wondering, is there a quick way to just add another extension > to import mechanism? or do i have to write a fully fledged import > hook? [this question is off-topic for python-dev] If recompiling Python is an option, the quick way is to edit the interpreter, and add that extension. If that is not an option, but it is an option to put all .pyy files in a single directory, the quick way is to add an entry to sys.path_hooks. If that is also not an option, the quick way is to add an entry to sys.meta_path. The best way would be to not use import, but provide a separate function (e.g. calling it "require"). Regards, Martin From fuzzyman at voidspace.org.uk Fri Sep 14 23:17:06 2007 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 14 Sep 2007 22:17:06 +0100 Subject: [Python-Dev] [python] Re: import file extensions In-Reply-To: References: <1189801820.747989.270020@d55g2000hsg.googlegroups.com> Message-ID: <46EAFA52.8020103@voidspace.org.uk> Guido van Rossum wrote: > I think you're looking for a PEP 302 style meta hook. > Or even execfile in a context... Michael Foord http://www.manning.com/foord > On 9/14/07, tomer filiba wrote: > >> a quick question: i'm working on a pythonic build system, where the >> build >> scripts are plain python files. 
but i want to differentiate them from
>> normal
>> python files (.py) by a different suffix (say .pyy), but then i can't
>> import
>> them.
>>
>> so i'm wondering, is there a quick way to just add another extension
>> to
>> import mechanism? or do i have to write a fully fledged import hook?
>>
>>
>> -tomer
>>
>
>

From tomerfiliba at gmail.com Fri Sep 14 23:34:16 2007
From: tomerfiliba at gmail.com (tomer filiba)
Date: Fri, 14 Sep 2007 23:34:16 +0200
Subject: [Python-Dev] import file extensions
In-Reply-To: <46EAFAF5.8030007@v.loewis.de>
References: <1189801820.747989.270020@d55g2000hsg.googlegroups.com>
 <46EAFAF5.8030007@v.loewis.de>
Message-ID: <1d85506f0709141434t4bc4ec89ub43bc66dde81f897@mail.gmail.com>

On 9/14/07, "Martin v. Löwis" wrote:
> The best way would be to not use import, but provide a separate
> function (e.g. calling it "require").
>

yepp, that's probably the cleanest and quickest solution.
i needed to see all the alternatives to realize this though. sorry.

--
An NCO and a Gentleman

From db3l.net at gmail.com Fri Sep 14 23:51:26 2007
From: db3l.net at gmail.com (David Bolen)
Date: Fri, 14 Sep 2007 17:51:26 -0400
Subject: [Python-Dev] Daily Windows Installers
References: <46EA7D6C.8010600@v.loewis.de>
Message-ID:

Georg Brandl writes:

> David Bolen wrote:
>> Georg Brandl writes:
(...)
>> For the moment I'm probably going to work to ensure we don't get the
>> pop-up box (which blocks the rest of the processing) so at least an
>> MSI can get created even if the chm is bad.
>
> Thanks for handling this!

I hit it with a sledge-hammer and modified my build slave to disable
error boxes for anything it runs, so we'll get the 3.0 MSI now but
with a bad chm until it gets figured out.
-- David From exarkun at divmod.com Fri Sep 14 23:59:06 2007 From: exarkun at divmod.com (Jean-Paul Calderone) Date: Fri, 14 Sep 2007 17:59:06 -0400 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: Message-ID: <20070914215906.8162.2036179390.divmod.quotient.8742@ohm> On Fri, 14 Sep 2007 17:43:39 -0400, James Y Knight wrote: > >On Sep 14, 2007, at 3:30 PM, Jean-Paul Calderone wrote: >>On Fri, 14 Sep 2007 14:13:47 -0500, Justin Tulloss >>wrote: >>>Your idea can be combined with the maxint/2 initial refcount for >>>>non-disposable objects, which should about eliminate thread-count >>>>updates >>>>for them. >>>>-- >>> >>>I don't really like the maxint/2 idea because it requires us to >>>differentiate between globals and everything else. Plus, it's a hack. I'd >>>like a more elegant solution if possible. >> >>It's not really a solution either. If your program runs for a couple >>minutes and then exits, maybe it won't trigger some catastrophic behavior >>from this hack, but if you have a long running process then you're almost >>certain to be screwed over by this (it wouldn't even have to be *very* >>long running - a month or two could do it on a 32bit platform). > >Not true: the refcount becoming 0 only calls a dealloc function.. For >objects which are not deletable, the dealloc function should simply set the >refcount back to maxint/2. Done. > So, eg, replace the Py_FatalError in none_dealloc with an assignment to ob_refcnt? Good point, sounds like it could work (I'm pretty sure you know more about deallocation in CPython than I :). Jean-Paul From janssen at parc.com Sat Sep 15 00:01:49 2007 From: janssen at parc.com (Bill Janssen) Date: Fri, 14 Sep 2007 15:01:49 PDT Subject: [Python-Dev] base64 -- should b64encode introduce line breaks? 
In-Reply-To: <5331CD54-34F4-41D9-8731-D7E0E4CF96C0@python.org> References: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com> <07Sep13.114308pdt."57996"@synergy1.parc.xerox.com> <07Sep14.122040pdt."57996"@synergy1.parc.xerox.com> <5331CD54-34F4-41D9-8731-D7E0E4CF96C0@python.org> Message-ID: <07Sep14.150150pdt."57996"@synergy1.parc.xerox.com> > Does anything in textwrap already do the trick? If not, that might > be the best place to refactor similar code to. Yes, textwrap.fill. Thanks for pointing it out. Bill From rhamph at gmail.com Sat Sep 15 00:21:57 2007 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 14 Sep 2007 16:21:57 -0600 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <20070914215906.8162.2036179390.divmod.quotient.8742@ohm> References: <20070914215906.8162.2036179390.divmod.quotient.8742@ohm> Message-ID: On 9/14/07, Jean-Paul Calderone wrote: > On Fri, 14 Sep 2007 17:43:39 -0400, James Y Knight wrote: > > > >On Sep 14, 2007, at 3:30 PM, Jean-Paul Calderone wrote: > >>On Fri, 14 Sep 2007 14:13:47 -0500, Justin Tulloss > >>wrote: > >>>Your idea can be combined with the maxint/2 initial refcount for > >>>>non-disposable objects, which should about eliminate thread-count > >>>>updates > >>>>for them. > >>>>-- > >>> > >>>I don't really like the maxint/2 idea because it requires us to > >>>differentiate between globals and everything else. Plus, it's a hack. I'd > >>>like a more elegant solution if possible. > >> > >>It's not really a solution either. If your program runs for a couple > >>minutes and then exits, maybe it won't trigger some catastrophic behavior > >>from this hack, but if you have a long running process then you're almost > >>certain to be screwed over by this (it wouldn't even have to be *very* > >>long running - a month or two could do it on a 32bit platform). > > > >Not true: the refcount becoming 0 only calls a dealloc function.. 
For > >objects which are not deletable, the dealloc function should simply set the > >refcount back to maxint/2. Done. > > > > So, eg, replace the Py_FatalError in none_dealloc with an assignment to > ob_refcnt? Good point, sounds like it could work (I'm pretty sure you > know more about deallocation in CPython than I :). As I've said, this is all moot. The cache coherence protocols on x86 means this will be nearly as slow as proper atomic refcounting, and will not scale if multiple threads regularly touch the object. My experience is that they will touch it regularly. -- Adam Olsen, aka Rhamphoryncus From greg.ewing at canterbury.ac.nz Sat Sep 15 00:23:39 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 15 Sep 2007 10:23:39 +1200 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com> References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <18153.15002.76898.448843@montanaro.dyndns.org> <46E9FDA8.6010303@canterbury.ac.nz> <2cfeb93c0709132351r30193614k6e3b90b6da515fbb@mail.gmail.com> Message-ID: <46EB09EB.7070600@canterbury.ac.nz> Justin Tulloss wrote: > > What do you think of a model where there is a global > "thread count" that keeps track of how many threads reference an object? I've thought about that sort of thing before. The problem is how you keep track of how many threads reference an object, without introducing far more overhead than you're trying to eliminate. > Then there are thread-specific reference counters for each object. What happens when a new thread comes into existence? Do you go through all existing objects and add another element to their refcount arrays? 
-- Greg From barry at python.org Sat Sep 15 00:37:57 2007 From: barry at python.org (Barry Warsaw) Date: Fri, 14 Sep 2007 18:37:57 -0400 Subject: [Python-Dev] base64 -- should b64encode introduce line breaks? In-Reply-To: <07Sep14.150150pdt."57996"@synergy1.parc.xerox.com> References: <07Sep13.105543pdt."57996"@synergy1.parc.xerox.com> <07Sep13.114308pdt."57996"@synergy1.parc.xerox.com> <07Sep14.122040pdt."57996"@synergy1.parc.xerox.com> <5331CD54-34F4-41D9-8731-D7E0E4CF96C0@python.org> <07Sep14.150150pdt."57996"@synergy1.parc.xerox.com> Message-ID: <29E110FC-FC99-40C6-8775-085B29EEC0A9@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Sep 14, 2007, at 6:01 PM, Bill Janssen wrote: >> Does anything in textwrap already do the trick? If not, that might >> be the best place to refactor similar code to. > > Yes, textwrap.fill. Thanks for pointing it out. /me tries to remember that for Py3k's email package. ;) - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (Darwin) iQCVAwUBRusNRnEjvBPtnXfVAQIpNgP/SosDYX/GDMolxcv3U2WrGzMjQa+gd8ai J/Oaw2vdSf8H84eU9ziKaWHQtK0obS9XrnUTLUDyfAKObNffZVvldG1KUV9vAKhr 3JuNJ3xiIk7RKXdkKd5mA7SXXqRd80NVN26Za0H8bkl16mhdpZM7OqJmhaIkCkXr AJtjJP5esWQ= =v/C8 -----END PGP SIGNATURE----- From tonynelson at georgeanelson.com Sat Sep 15 00:55:01 2007 From: tonynelson at georgeanelson.com (Tony Nelson) Date: Fri, 14 Sep 2007 18:55:01 -0400 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: <20070914193049.8162.648500711.divmod.quotient.8674@ohm> References: <20070914193049.8162.648500711.divmod.quotient.8674@ohm> Message-ID: At 3:30 PM -0400 9/14/07, Jean-Paul Calderone wrote: >On Fri, 14 Sep 2007 14:13:47 -0500, Justin Tulloss wrote: >>Your idea can be combined with the maxint/2 initial refcount for >>> non-disposable objects, which should about eliminate thread-count updates >>> for them. 
>>> -- >>> >> >> I don't really like the maxint/2 idea because it requires us to >>differentiate between globals and everything else. Plus, it's a hack. I'd >>like a more elegant solution if possible. > >It's not really a solution either. If your program runs for a couple >minutes and then exits, maybe it won't trigger some catastrophic behavior >from this hack, but if you have a long running process then you're almost >certain to be screwed over by this (it wouldn't even have to be *very* >long running - a month or two could do it on a 32bit platform). I don't think either of you understand what setting the initial refcount to maxint/2 for global objects in a thread's refcount vector would do. It has /no/ effect on refcounting. It only prevents the refcount from becoming zero for objects that can never be released, but which would always have a zero thread refcount on thread exit, which would cause a useless and frequent thread count decrement for the object. As the object can never be released, its thread count would be initially non-zero, so the thread count won't be made zero when the thread refcount becomes zero. The thread count is shared in the object. The thread refcount is per thread, and should not be shared, even at the physical cache line level, if good performance is desired. When a new thread is created, part of the thread state would be the refcount vector. Hopefully it would mostly be just VM magic, but the initial part of the vector would contain the immortal objects' refcount, and those would be set to maxint/2. Or 1, for that matter. -- ____________________________________________________________________ TonyN.:' ' From jon+python-dev at unequivocal.co.uk Sat Sep 15 02:50:58 2007 From: jon+python-dev at unequivocal.co.uk (Jon Ribbens) Date: Sat, 15 Sep 2007 01:50:58 +0100 Subject: [Python-Dev] Removing the GIL (Me, not you!) 
In-Reply-To: <20070914193049.8162.648500711.divmod.quotient.8674@ohm> References: <2cfeb93c0709141213t6727efack5db90dc706e2b95f@mail.gmail.com> <20070914193049.8162.648500711.divmod.quotient.8674@ohm> Message-ID: <20070915005058.GS32061@snowy.squish.net> On Fri, Sep 14, 2007 at 03:30:49PM -0400, Jean-Paul Calderone wrote: > > I don't really like the maxint/2 idea because it requires us to > >differentiate between globals and everything else. Plus, it's a hack. I'd > >like a more elegant solution if possible. > > It's not really a solution either. If your program runs for a couple > minutes and then exits, maybe it won't trigger some catastrophic behavior > from this hack, but if you have a long running process then you're almost > certain to be screwed over by this You misunderstand - the point of the 'maxint/2' thing isn't to prevent something from happening at all, it's to prevent it from happening *frequently*. From talin at acm.org Sat Sep 15 07:48:40 2007 From: talin at acm.org (Talin) Date: Fri, 14 Sep 2007 22:48:40 -0700 Subject: [Python-Dev] Removing the GIL (Me, not you!) In-Reply-To: References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com> <20070911192035.94CE33A40D7@sparrow.telecommunity.com> <200709131219.21152.nd@perlig.de> <20070913105527.GH32061@snowy.squish.net> <18153.15002.76898.448843@montanaro.dyndns.org> <46E9FDA8.6010303@canterbury.ac.nz> Message-ID: <46EB7238.5040104@acm.org> Adam Olsen wrote: > I'm now working on an approach that writes out refcounts in batches to > reduce contention. The initial cost is much higher, but it scales > better too. I've currently got it to just under 50% cost, meaning two > threads is a slight net gain. http://www.research.ibm.com/people/d/dfb/publications.html Look at the various papers on 'Recycler'. The way it works is that for each thread, there is an addref buffer and a decref buffer. The buffers are arrays of pointers. 
Each time a reference is addref'd, it's appended to the addref buffer,
likewise for decref. When a buffer gets full, it is added to a queue
and then a new buffer is allocated.

There is a background thread that actually applies the refcounts from
the buffers and frees the objects. Since this background thread is the
only thread that ever touches the actual refcount field of the object,
there's no need for locking.

-- Talin

From jmtulloss at gmail.com Fri Sep 14 07:08:13 2007
From: jmtulloss at gmail.com (Justin Tulloss)
Date: Fri, 14 Sep 2007 00:08:13 -0500
Subject: [Python-Dev] Removing the GIL (Me, not you!)
In-Reply-To: <43c8685c0709132113u2282bb7bg848ed10c7d80f640@mail.gmail.com>
References: <2cfeb93c0709110027x43a142c6gf97724c85d02ab06@mail.gmail.com>
 <46E7002D.6050005@v.loewis.de>
 <46E722BF.8000807@canterbury.ac.nz>
 <46E795FD.1070103@v.loewis.de>
 <18152.2055.258930.576257@montanaro.dyndns.org>
 <741C7AC6-55CF-40A0-BB0B-DE418AE2CD88@gmail.com>
 <2cfeb93c0709130008u28634a6dmaef370b970a0a6a5@mail.gmail.com>
 <46EA0632.5060603@canterbury.ac.nz>
 <43c8685c0709132113u2282bb7bg848ed10c7d80f640@mail.gmail.com>
Message-ID: <2cfeb93c0709132208g5198439ds3415c0f9a7689174@mail.gmail.com>

I'm not sure I understand entirely what you're saying, but it sounds like
you want multiple reference counts. A reference count per thread might not
be a bad idea, but I can't think of how it would work without locks. If
every object has an array of reference counts, then the GC would need to
lock that array to check to see if they're all 0. That means the
incref/decref operations would need to acquire this lock or risk messing
up the GC.

Perhaps you could have something where you have a reference count per
thread and then a thread count per object. Then you would only need to
lock the thread count for the first and last reference a thread makes to
an object. Once there are no threads referencing an object, it's obviously
safe for cleanup.
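
The thread-count scheme described above can be sketched as a toy model
in Python. This is only an illustration under stated assumptions, not
a proposal for how CPython would do it: the class CountedObject and
its attribute names are hypothetical, threading.local stands in for a
per-thread refcount slot, and setting a "freed" flag stands in for
deallocation.

```python
import threading


class CountedObject:
    """Toy model: per-thread refcounts plus one shared thread count."""

    def __init__(self):
        self._local = threading.local()   # per-thread refcount, never shared
        self._lock = threading.Lock()     # guards only the shared thread count
        self.thread_count = 0
        self.freed = False

    def incref(self):
        count = getattr(self._local, "refs", 0)
        if count == 0:
            # First reference from this thread: touch the shared counter.
            with self._lock:
                self.thread_count += 1
        self._local.refs = count + 1      # unlocked, thread-private update

    def decref(self):
        count = self._local.refs - 1
        self._local.refs = count          # unlocked, thread-private update
        if count == 0:
            # Last reference from this thread: touch the shared counter.
            with self._lock:
                self.thread_count -= 1
                if self.thread_count == 0:
                    self.freed = True     # stand-in for deallocation


def worker(obj, n=1000):
    # Many increfs/decrefs, but only the first and last take the lock.
    for _ in range(n):
        obj.incref()
    for _ in range(n):
        obj.decref()


obj = CountedObject()
obj.incref()                              # main thread holds one reference
threads = [threading.Thread(target=worker, args=(obj,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
freed_while_held = obj.freed              # main thread still holds its reference
obj.decref()                              # last thread lets go
```

The model assumes a reference taken by one thread is always released by
the same thread; handing a reference from one thread to another would
need extra bookkeeping that this sketch does not attempt.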
Of course, I'm not convinced atomic ops are really so expensive you can't have every thread doing it at once, but Adam says that the caches will be thrashed if we have a bunch of threads continuously updating the same memory address. I can see the possibility. Perhaps once we have a version that actually demonstrates this thrashing, we can alleviate it with some sort of multiple reference count scheme. Justin On 9/13/07, Tennessee Leeuwenburg wrote: > > Pardon me for talking with no experience in such matters, but... > > Okay, incrementing a reference counter is atomic, therefore the cheapest > possible operation. Is it possible to keep reference counting atomic in a > multi-thread model? > > Could you do the following... let's consider two threads, "A" and "B". > Each time an object is created, a reference count is created in both "A" and > "B". Let's suppose "A" has a real reference and "B" has no reference really. > Couldn't the GC check two reference registers for a reference count? The > object would then be cleaned up only if both registers were 0. > > To exploit multiple CPUs, you could have two persistent Python processes > on each CPU with its own mini-GIL. Object creation would then involve a call > to each process to create the reference and GC would involve checking each > process to see what their count is. However, it would mean that within each > process, threads could create additional references or remove references in > an atomic way. > > In a single-CPU system, this would be the same cost as currently, since I > think that situation would devolve to having just one place to check for > references. This seems to mean that it is the case that it would be no more > expensive for a single-CPU system. 
> > In a two-CPU system, I'm no expert on the actual call overheads of > object creation and garbage collection, but logically it would double the > effort of object creation and destruction (all such operations now need to > occur on both processes) but would keep reference increments and decrements > atomic. > > Once again, I'm really sorry if I'm completely off-base since I have never > done any actual coding in this area, but I thought I'd make the suggestion > just in case it happened to have relevance. > > Thanks, > -Tennessee > > From Martin.Drautzburg at web.de Fri Sep 14 22:48:15 2007 From: Martin.Drautzburg at web.de (Martin Drautzburg) Date: Fri, 14 Sep 2007 22:48:15 +0200 Subject: [Python-Dev] How to pickle class derived from c++ extension Message-ID: <200709142248.16539.Martin.Drautzburg@web.de> I understand that I can pickle an extension class written in C/C++ by providing a __reduce__() method along with __getstate__()/__setstate__(). While I still haven't gotten this to work, my main question is: How could I possibly pickle an object of a python class which is derived from the C++ extension? It seems that I can define >>> class Bar(list): ... pass and add more attributes >>> l=Bar() >>> l.x=11 and __reduce__() will show the "x" attribute >>> l.__reduce__() (, (, , []), {'x': 11}) But this does not seem to work with my extension class Foo. I defined a __getstate__() method and __reduce__() indeed shows me some state.
But if I create a derived class Bar on the Python side and an object bar as an instance of that class, and add an "x" attribute to that bar object, then __reduce__ing that object shows nothing about the "x" attribute. This is in a way understandable, as __reduce__() eventually just calls __getstate__() and the only implementation it can find is in my Foo extension class, which knows nothing about the Bar derived class let alone its "x" attribute. I would like to have __reduce__() do it the Python way as far as it can get, and then call some magic method of my C++ class to pickle the "C++ part" of an object. Is there a way to achieve this? The "list" class seems to have something that my Foo class does not have. What is this? Or of course if there is a better way to pickle objects of classes which are derived from C++ extensions, I'd be happy to hear about it. From aahz at pythoncraft.com Sat Sep 15 18:50:44 2007 From: aahz at pythoncraft.com (Aahz) Date: Sat, 15 Sep 2007 09:50:44 -0700 Subject: [Python-Dev] How to pickle class derived from c++ extension In-Reply-To: <200709142248.16539.Martin.Drautzburg@web.de> References: <200709142248.16539.Martin.Drautzburg@web.de> Message-ID: <20070915165044.GA17750@panix.com> On Fri, Sep 14, 2007, Martin Drautzburg wrote: > > I understand that I can pickle an extension class written > in C/C++ by providing a __reduce__() method along with > __getstate__()/__setstate__(). While I still haven't gotten this to > work, my main question is: > > How could I possibly pickle an object of a python class which is > derived from the C++ extension? python-dev is not an appropriate place to ask questions about writing your own applications. I suggest the C++-sig or capi-sig lists or the comp.lang.python newsgroup. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The best way to get information on Usenet is not to ask a question, but to post the wrong information.
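One possible approach to the pickling question in this thread, sketched under assumptions: once a class defines __getstate__(), the default reduce machinery uses its return value in place of the instance __dict__, so the base class's __getstate__() can bundle the __dict__ itself. Foo below is a pure-Python stand-in for the C++ extension type (using __slots__ so its own state is not in a __dict__), and _native, roundtrip and the "native"/"python" keys are hypothetical names. It is written for Python 3 and assumes the extension's __getstate__/__setstate__ can be changed.

```python
import pickle


class Foo:
    """Pure-Python stand-in for the extension type; _native plays the C++ state."""

    __slots__ = ("_native",)   # like an extension type, Foo itself has no __dict__

    def __init__(self, native=0):
        self._native = native

    def __getstate__(self):
        # Bundle the "C++ part" with whatever a Python subclass instance
        # has stored in its __dict__ (empty for plain Foo instances).
        return {
            "native": self._native,
            "python": dict(getattr(self, "__dict__", {})),
        }

    def __setstate__(self, state):
        self._native = state["native"]
        if state["python"]:
            self.__dict__.update(state["python"])


class Bar(Foo):
    """Python-side subclass; it gains a __dict__, so bar.x = 11 works."""


def roundtrip(obj):
    """Pickle and unpickle obj (classes must be importable for this to work)."""
    return pickle.loads(pickle.dumps(obj))
```

With this in place, an instance such as bar = Bar(7) with bar.x = 11 reports both parts in the state returned by bar.__reduce_ex__(2), and __setstate__() restores both on the other side.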
From pfdubois at gmail.com Sun Sep 16 04:46:11 2007 From: pfdubois at gmail.com (Paul Dubois) Date: Sat, 15 Sep 2007 19:46:11 -0700 Subject: [Python-Dev] Eric Raymond account on bug tracker locked Message-ID: Eric Raymond (esr)'s account on bugs.python.org has been misused. Since this may mean that his password on sf.net is also compromised, I cannot trust that address to notify him. I have changed the password to prevent further mischief. If someone knows a bona-fide way to contact him let me know it and I'll inform him, if he doesn't see this himself. From guido at python.org Sun Sep 16 16:51:42 2007 From: guido at python.org (Guido van Rossum) Date: Sun, 16 Sep 2007 07:51:42 -0700 Subject: [Python-Dev] Eric Raymond account on bug tracker locked In-Reply-To: References: Message-ID: He's probably still esr at thyrsus.com. But he has long stopped being an active developer so I doubt that informing him matters much. On 9/15/07, Paul Dubois wrote: > Eric Raymond (esr)'s account on bugs.python.org has been misused. Since this > may mean that his password on sf.net is also compromised, I cannot trust > that address to notify him. I have changed the password to prevent further > mischief. If someone knows a bona-fide way to contact him let me know it and > I'll inform him, if he doesn't see this himself. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From janssen at parc.com Mon Sep 17 01:01:29 2007 From: janssen at parc.com (Bill Janssen) Date: Sun, 16 Sep 2007 16:01:29 PDT Subject: [Python-Dev] 'text' mode rears its ugly head again Message-ID: <07Sep16.160132pdt."57996"@synergy1.parc.xerox.com> I've checked in the asyncore SSL patch, and the Windows buildbots are failing on the HTTPS test.
I believe it's due to this insane differentiation between text files and binary files, a bad idea introduced by Windows and perpetuated (apparently) by Python. I can't believe this wasn't eliminated in py3k! Anyway, I think what's going on is that the two data blobs the test compares, one read from a file opened with "open(filename, 'r')", and the other a data stream read from an HTTP response "file" returned from urllib.urlopen(), have different line-endings. Of course, this only matters on Windows; on UNIX, the faux differentiation doesn't exist. Bill From janssen at parc.com Mon Sep 17 02:16:51 2007 From: janssen at parc.com (Bill Janssen) Date: Sun, 16 Sep 2007 17:16:51 PDT Subject: [Python-Dev] 'text' mode rears its ugly head again In-Reply-To: <07Sep16.160132pdt."57996"@synergy1.parc.xerox.com> References: <07Sep16.160132pdt."57996"@synergy1.parc.xerox.com> Message-ID: <07Sep16.171657pdt."57996"@synergy1.parc.xerox.com> > I've checked in the asyncore SSL patch, and the Windows buildbots are > failing on the HTTPS test. Fixed. Bill From ncoghlan at gmail.com Mon Sep 17 12:53:08 2007 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 17 Sep 2007 20:53:08 +1000 Subject: [Python-Dev] 'text' mode rears its ugly head again In-Reply-To: <07Sep16.160132pdt."57996"@synergy1.parc.xerox.com> References: <07Sep16.160132pdt."57996"@synergy1.parc.xerox.com> Message-ID: <46EE5C94.2050008@gmail.com> Bill Janssen wrote: > I've checked in the asyncore SSL patch, and the Windows buildbots are > failing on the HTTPS test. I believe it's due to this insane > differentiation between text files and binary files, a bad > idea introduced by Windows and perpetuated (apparently) by Python. I > can't believe this wasn't eliminated in py3k! The binary/text distinction is being increased in Py3k rather than reduced (the API for binary files uses bytes, the API for text files uses Unicode strings). Cheers, Nick.
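The line-ending mismatch discussed in this thread can be reproduced in a few lines. This is a minimal sketch assuming Python 3 semantics as eventually released, where text mode translates "\r\n" to "\n" on read by default and binary mode returns bytes untouched; read_both_ways is a hypothetical helper name.

```python
import os
import tempfile


def read_both_ways(payload):
    """Write payload (bytes) to a temp file, then read it in binary and text mode."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
        with open(path, "rb") as handle:                    # bytes, exactly as stored
            raw = handle.read()
        with open(path, "r", encoding="ascii") as handle:   # str, newlines translated
            text = handle.read()
    finally:
        os.remove(path)
    return raw, text


raw, text = read_both_ways(b"line one\r\nline two\r\n")
print(repr(raw))    # the "\r\n" pairs survive
print(repr(text))   # each "\r\n" has become "\n"
```

A test that compares data read one way against data read the other way will see different line endings, which is why such comparisons are usually made on bytes read in 'rb' mode.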
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- http://www.boredomandlaziness.org From orsenthil at gmail.com Mon Sep 17 18:37:36 2007 From: orsenthil at gmail.com (O.R.Senthil Kumaran) Date: Mon, 17 Sep 2007 22:07:36 +0530 Subject: [Python-Dev] IPv6 hostname resolution using socket.getaddrinfo() Message-ID: <20070917163736.GA5434@gmail.com> To get the hostname, I can use socket.gethostbyname() but that has an inherent limitation in that it does not support IPv6 name resolution, and getaddrinfo() should be used instead. Looking up the socket.getaddrinfo() documentation, I come to know that The getaddrinfo() function returns a list of 5-tuples with the following structure: (family, socktype, proto, canonname, sockaddr) family, socktype, proto are all integer and are meant to be passed to the socket() function. canonname is a string representing the canonical name of the host. It can be a numeric IPv4/v6 address when AI_CANONNAME is specified for a numeric host. With this information, if I try something like this: >>> for res in socket.getaddrinfo('goofy.goofy.com', None, socket.AI_CANONNAME): print res (2, 1, 6, '', ('10.98.1.6', 0)) (2, 2, 17, '', ('10.98.1.6', 0)) (2, 3, 0, '', ('10.98.1.6', 0)) In the output, I see the canonname to be always blank ''. I am not getting the IPv4 or IPv6 address as a result of using getaddrinfo(). Am I making any mistake? What I am trying is a replacement function for socket.gethostbyname(hostname) which will work for both IPv4 and IPv6 (and make changes in urllib2 to support that) # return hostbyname for either IPv4 or IPv6 address. Common function. def ipv6_gethostbyname(hostname): for res in socket.getaddrinfo(hostname,None, socket.AI_CANONNAME): fa, socktype, proto, canonname, sa = res return canonname The above function does not seem to work. It returns blank value only. Any help/ pointers?
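A possible explanation for the blank canonname, offered as a sketch rather than a confirmed diagnosis: the positional signature is getaddrinfo(host, port, family, socktype, proto, flags), so in the call above socket.AI_CANONNAME lands in the family slot and no flags are passed at all. On at least some platforms (Linux among them) AI_CANONNAME and AF_INET share the value 2, which would account for the IPv4-only results. The helper name resolve_host below is hypothetical; it is written for Python 3.

```python
import socket


def resolve_host(hostname):
    """Return (canonical_name, addresses) covering both IPv4 and IPv6."""
    infos = socket.getaddrinfo(
        hostname,
        None,                  # port: not needed for a plain lookup
        socket.AF_UNSPEC,      # family: accept IPv4 and IPv6 results
        socket.SOCK_STREAM,    # socktype: avoid one entry per socket type
        0,                     # proto: any
        socket.AI_CANONNAME,   # flags: this is where AI_CANONNAME belongs
    )
    canonname = ""
    addresses = []
    for family, socktype, proto, name, sockaddr in infos:
        # The canonical name is typically filled in on the first entry only.
        if name and not canonname:
            canonname = name
        # sockaddr[0] is the address string for both AF_INET and AF_INET6.
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return canonname, addresses
```

Note that the addresses come from the sockaddr element of each tuple, not from canonname, so a gethostbyname() replacement would return an entry from the second value.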
-- O.R.Senthil Kumaran http://uthcode.sarovar.org From janssen at parc.com Mon Sep 17 20:08:27 2007 From: janssen at parc.com (Bill Janssen) Date: Mon, 17 Sep 2007 11:08:27 PDT Subject: [Python-Dev] 'text' mode rears its ugly head again In-Reply-To: <46EE5C94.2050008@gmail.com> References: <07Sep16.160132pdt."57996"@synergy1.parc.xerox.com> <46EE5C94.2050008@gmail.com> Message-ID: <07Sep17.110832pdt."57996"@synergy1.parc.xerox.com> > > differentiation between between text files and binary files, a bad > > idea introduced by Windows and perpetuated (apparently) by Python. I > > can't believe this wasn't eliminated in py3k! > > The binary/text distinction is being increased in Py3k rather than > reduced (the API for binary files uses bytes, the API for text files > uses Unicode strings). Actually, it's not so much the differentiation that bothers me, as it is the default of assuming "text". I think the default should be "binary", and getting the file in "text" mode should require extra effort. It should be 'rt', not 'rb' -- an extra qualifier for text mode, not for binary mode. That would eliminate a lot of the little bugs like this one that crop up in ports to the ineffable assemblage that is Windows. Bill From facundobatista at gmail.com Mon Sep 17 22:58:44 2007 From: facundobatista at gmail.com (Facundo Batista) Date: Mon, 17 Sep 2007 17:58:44 -0300 Subject: [Python-Dev] Hash to longs, and Decimal Message-ID: Hi everybody! In the Tracker Issue... http://bugs.python.org/issue1772851 ... Mark Dickinson came with a patch that alters in a very corner case how the hash is calculated to a long integer. This allows changes in Decimal that lead to a better hashing behaviour for big, big, really big numbers. The patch applies cleanly, all the tests pass ok (Mark also provided more tests for the hash function). I won't commit this right now; I'll delay the change for a couple of days in case somebody wants to take a look at it. Thanks! -- . 
Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From skip at pobox.com Mon Sep 17 23:05:15 2007 From: skip at pobox.com (skip at pobox.com) Date: Mon, 17 Sep 2007 16:05:15 -0500 Subject: [Python-Dev] IPv6 hostname resolution using socket.getaddrinfo() In-Reply-To: <20070917163736.GA5434@gmail.com> References: <20070917163736.GA5434@gmail.com> Message-ID: <18158.60427.849358.838885@montanaro.dyndns.org> Senthil> To get the hostname, I can use socket.gethostbyname() but that Senthil> has an inherent limitation wherein does it not support IPv6 Senthil> name resolution, and getaddrinfo() should be used instead. ... For those who would ask Senthil to take this to comp.lang.python, he already did and got no response. He's working on fixes to urllib2, so this seems to me to be a python-dev question and I suggested he post here. I tried it with 2.5, 2.6 and 3.0 and got blanks for the canonical name as well. Hopefully someone with more network-fu can steer him in the right direction. Skip From guido at python.org Mon Sep 17 23:17:19 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 17 Sep 2007 14:17:19 -0700 Subject: [Python-Dev] Hash to longs, and Decimal In-Reply-To: References: Message-ID: Seems a fine idea. I don't have the time for a code review but I'll leave that up to you all. --Guido On 9/17/07, Facundo Batista wrote: > Hi everybody! > > In the Tracker Issue... > > http://bugs.python.org/issue1772851 > > ... Mark Dickinson came with a patch that alters in a very corner case > how the hash is calculated to a long integer. > > This allows changes in Decimal that lead to a better hashing behaviour > for big, big, really big numbers. > > The patch applies cleanly, all the tests pass ok (Mark also provided > more tests for the hash function). > > I won't commit this right now; I'll delay the change for a couple of > days in case somebody wants to take a look at it. > > Thanks! > > -- > . 
Facundo > > Blog: http://www.taniquetil.com.ar/plog/ > PyAr: http://www.python.org/ar/ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From dickinsm at gmail.com Mon Sep 17 23:50:45 2007 From: dickinsm at gmail.com (Mark Dickinson) Date: Mon, 17 Sep 2007 17:50:45 -0400 Subject: [Python-Dev] Hash to longs, and Decimal In-Reply-To: References: Message-ID: <5c6f2a5d0709171450p7feb91f5oc55371824a53108e@mail.gmail.com> On 9/17/07, Facundo Batista wrote: > > In the Tracker Issue... > > http://bugs.python.org/issue1772851 > > ... Mark Dickinson came with a patch that alters in a very corner case > how the hash is calculated to a long integer. > Much as I'd like this patch to be applied, I feel compelled to point out that it does have a significant(?) downside: it slows down hashing of large integers to some degree. On my machine (Dual Xeon 2.8Ghz/SuSE Linux 10.2/gcc 4.1 with -O3), using timeit.timeit('hash(n)') to get timings, the new hash function takes 70% more time for 1000 digit integers, 20% longer for 100 digit integers, but has no measurable performance impact for small (int-sized) longs. I don't know how significant this performance hit is in the larger scheme of things. Mark -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20070917/9a327041/attachment.htm From martin at v.loewis.de Tue Sep 18 00:00:53 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 18 Sep 2007 00:00:53 +0200 Subject: [Python-Dev] IPv6 hostname resolution using socket.getaddrinfo() In-Reply-To: <20070917163736.GA5434@gmail.com> References: <20070917163736.GA5434@gmail.com> Message-ID: <46EEF915.9060301@v.loewis.de> > Any help/ pointers? Did you read the man page of getaddrinfo, or the RFC? Regards, Martin From trentm at activestate.com Tue Sep 18 00:13:51 2007 From: trentm at activestate.com (Trent Mick) Date: Mon, 17 Sep 2007 15:13:51 -0700 Subject: [Python-Dev] Daily Windows Installers In-Reply-To: References: <46EA7D6C.8010600@v.loewis.de> Message-ID: <46EEFC1F.5020004@activestate.com> David Bolen wrote: > I hit it with a sledge-hammer and modified my build slave to disable > error boxes for anything it runs, so we'll get the 3.0 MSI now but > with a bad chm until it gets figured out. How do you tell Windows to do that? Trent -- Trent Mick trentm at activestate.com From db3l.net at gmail.com Tue Sep 18 00:19:07 2007 From: db3l.net at gmail.com (David Bolen) Date: Mon, 17 Sep 2007 18:19:07 -0400 Subject: [Python-Dev] Daily Windows Installers In-Reply-To: <46EEFC1F.5020004@activestate.com> References: <46EA7D6C.8010600@v.loewis.de> <46EEFC1F.5020004@activestate.com> Message-ID: <9f94e2360709171519m14c1bbd5uda975ed704974230@mail.gmail.com> On 9/17/07, Trent Mick wrote: > How do you tell Windows to do that? Via the SetErrorMode call. Since the Windows buildbot already uses the win32 extensions, I just used the existing win32api wrapper (although through ctypes is very easy too). 
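For reference, the ctypes route mentioned above might look like the following. This is a sketch under assumptions, not the buildbot change itself: the context manager name error_dialogs_disabled is made up, the flag values are the documented Win32 SEM_* constants (the first three add up to 7, and adding SEM_NOOPENFILEERRORBOX gives 0x8007), and on non-Windows platforms the manager deliberately does nothing.

```python
import contextlib
import sys

# Win32 SetErrorMode flags
SEM_FAILCRITICALERRORS = 0x0001
SEM_NOGPFAULTERRORBOX = 0x0002
SEM_NOALIGNMENTFAULTEXCEPT = 0x0004
SEM_NOOPENFILEERRORBOX = 0x8000

QUIET_MODE = (
    SEM_FAILCRITICALERRORS
    | SEM_NOGPFAULTERRORBOX
    | SEM_NOALIGNMENTFAULTEXCEPT
    | SEM_NOOPENFILEERRORBOX
)  # 0x8007


@contextlib.contextmanager
def error_dialogs_disabled(mode=QUIET_MODE):
    """Suppress Windows error boxes for this process and children spawned inside."""
    if sys.platform != "win32":
        yield                      # nothing to do on other platforms
        return
    import ctypes
    kernel32 = ctypes.windll.kernel32
    old_mode = kernel32.SetErrorMode(mode)   # returns the previous mode
    try:
        yield
    finally:
        kernel32.SetErrorMode(old_mode)      # restore whatever was set before
```

A caller would wrap the process-spawning call in "with error_dialogs_disabled():" so that the child inherits the quiet error mode.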
In my case I just surrounded the reactor.spawnProcess call in buildbot/slave/commands.py with: old_err_mode = win32api.SetErrorMode(7) and win32api.SetErrorMode(old_err_mode) I suppose I should really tweak that to 0x8007 rather than just 7 to include missing file dialogs (like when a removeable device is not available). Since the error mode is inherited by child processes (unless explicitly overridden in the CreateProcess call), this effectively covers the primary child process and any others it may spawn during execution, so it works even though buildbot uses an intermediate command interpreter to execute whatever command is requested. We had a bit of discussion about this recently on the py3k devel list, in regards to failures in the python buildbot tests, in regards to more local changes within Python itself. -- David From Blinston_Fernandes at Dell.com Tue Sep 18 04:14:46 2007 From: Blinston_Fernandes at Dell.com (Blinston_Fernandes at Dell.com) Date: Tue, 18 Sep 2007 07:44:46 +0530 Subject: [Python-Dev] IPv6 hostname resolution using socket.getaddrinfo() In-Reply-To: <20070917163736.GA5434@gmail.com> References: <20070917163736.GA5434@gmail.com> Message-ID: On python2.4.1 >>> socket.getaddrinfo('www.python.org', None, socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_IP, socket.AI_CANONNAME) [(2, 2, 17, 'dinsdale.python.org', ('82.94.237.218', 0))] >>> Blinston. -----Original Message----- From: python-dev-bounces+blinston_fernandes=dell.com at python.org [mailto:python-dev-bounces+blinston_fernandes=dell.com at python.org] On Behalf Of O.R.Senthil Kumaran Sent: Monday, September 17, 2007 10:08 PM To: python-dev at python.org Subject: [Python-Dev] IPv6 hostname resolution using socket.getaddrinfo() To get the hostname, I can use socket.gethostbyname() but that has an inherent limitation wherein does it not support IPv6 name resolution, and getaddrinfo() should be used instead. 
Looking up the socket.getaddrinfo() documentation, I come to know that The getaddrinfo() function returns a list of 5-tuples with the following structure: (family, socktype, proto, canonname, sockaddr) family, socktype, proto are all integer and are meant to be passed to the socket() function. canonname is a string representing the canonical name of the host. It can be a numeric IPv4/v6 address when AI_CANONNAME is specified for a numeric host. With this information, if I try something like this: >>> for res in socket.getaddrinfo('goofy.goofy.com', None, socket.AI_CANONNAME): print res (2, 1, 6, '', ('10.98.1.6', 0)) (2, 2, 17, '', ('10.98.1.6', 0)) (2, 3, 0, '', ('10.98.1.6', 0)) In the output, I see the cannoname to be always blank ''. I am not getting the IPv4 or IPv6 address as a result of using getaddrinfo(). Am I making any mistake? What i am trying is a replacement function for socket.gethostbyname(hostname) which will work for both IPv4 and IPv6 (and make changes in urllib2 to support that) # return hostbyname for either IPv4 or IPv6 address. Common function. def ipv6_gethostbyname(hostname): for res in socket.getaddrinfo(hostname,None, socket.AI_CANONNAME): fa, socktype, proto, canonname, sa = res return cannoname The above function does not seem to work. It returns blank value only. Any help/ pointers? 
-- O.R.Senthil Kumaran http://uthcode.sarovar.org _______________________________________________ Python-Dev mailing list Python-Dev at python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/blinston_fernandes%40d ell.com From ksankar at doubleclix.net Tue Sep 18 20:45:25 2007 From: ksankar at doubleclix.net (Krishna Sankar) Date: Tue, 18 Sep 2007 11:45:25 -0700 Subject: [Python-Dev] Exploration PEP : Concurrency for moderately massive (4 to 32 cores) multi-core architectures Message-ID: <46F01CC5.3000603@doubleclix.net> Folks, As a follow-up to the py3k discussions started by Bruce and Guido, I pinged Brett and he suggested I submit an exploratory proposal. Would appreciate insights, wisdom, the good, the bad and the ugly. A) Does it make sense ? B) Which application sets should we consider in designing the interfaces and implementations C) In this proposal, parallelism and concurrency are used in an interchangeable fashion. Thoughts ? D) Please suggest pertinent links, discussions and insights. E) I have kept the proposal to a minimum to start the discussions and to explore if this is the right thing to do. Collaboratively, as we zero-in on one or two approaches, the idea is to expand it to a crisp and clear PEP. Need to do some more formatting as well. Cheers P.S : I had sent this to python-ideas couple of days ago and received two comments (Thanks Leonardo, Thanks Adam) I haven't incorporated their comments yet. Folks who are on both lists, pardon me for the spam. ------------------------------------------------------------------------------------------------------------ PEP: xxxxxxxx Title: Concurrency for moderately massive (4 to 32 cores) multi-core architectures Version: $Revision$ Last-Modified: $Date$ Author: Krishna Sankar , Status: Wandering ! (as in "Not all those who wander are lost ..." 
-J.R.R.Tolkien) Type: Process Content-Type: text/x-rst Created: 15-Sep-2007 Abstract -------- This proposal aims at leveraging the multi-core capability as an embedded mechanism in python. It is not whether python is slow or fast, but of performance and control of parallelism/concurrency in a moderately massive parallelism world. The aim is 4 to 32 cores. The proposal advocates two mechanisms - one for task parallelism and another for data intensive parallelism. Scientific computing and web 2.0 frameworks are the forefront users for this proposal. Other applications would benefit as well. Rationale --------- Multicore architectures need no introductions and their ubiquity is evident. It is imperative that Python has one or more standard ways of leveraging multi-core architectures. OTOH, traditional thread based concurrency and lock based exclusions are becoming more and more difficult to program correctly. First of all, the question is not whether py is slow or fast but performance of a system written in py. Which means, ability to leverage multi-core architectures as well as control. Control in term of things like ability to pin one process/task to a core, ability to pin one or more homogeneous tasks to specific cores et al, as well as not wait for a global lock and similar primitives. (Before anybody jumps into a conclusion, this is not about GIL by any means ;o)) Second, it is clear that we need a good solution (not THE solution) for moderately massive parallelism in multi-core architectures (i.e. 8-32 cores). Share nothing might not be optimal; we need some form of memory sharing, not just copy all data via messages. May be functional programming based on the blackboard pattern would work, who knows. I have seen systems saturated still having only ~25% of CPU utilization (in a 4 core system!). It is because we didn't leverage multi-cores and parallelism. 
So while py3k will not be slow, lack of a cohesive multi-core strategy will show up in system performance and byte us later(pun intended!). At least, in my mind, this is not an exercise about exposing locks and mutexes or threads in Python. I do believe that the GIL will be refactored to more granularity in the coming months (similar to the Global Locks in Linux) and most probably we will get microThreads et al. As we all know, architecture is constraining as well as liberating. The language primitives influence greatly how we think about a problem. In the discussions, Guido is right in insisting on speed, and Bruce is right in asking for language constructs. Without pragmatic speed, folks won't use it; same is the case without the required constructs. Both are barriers to adoption. We have an opportunity to offer a solution for multi-core architectures and let us seize it - we will rush in where angels fear to tread! Programming Models ------------------ There are at least 3 possible paradigms A. conventional threading model B. Functional model, Erlang being the most appropriate C. Some form of limited shared memory model (message passing but pass pointers, blackboard model) D. Others, like Transactional Memory [2] There is enough literature out there, so do not plan to explain these here. ( Do we need more explanation? ) Pragmatic proposal ------------------ May I suggest we embed two primitives in Python 3K: A) A functional style share-nothing set of interfaces (and implementations thereof) - provides the task parallelism/concurrency capability, "small messages, big computations" as Joe Armstrong calls it[3] B) A limited shared memory based model for data intensive parallelism Most probably this would be part of stdlib. While Guido is almost right in saying that this is a (std)library problem, it is not fully so. We would need a few primitives from the underlying PVM substrate. 
Possibly one reason for Guido's position is the lack of clarity as to what needs to be changed and why. IMHO, just saying take the GIL off does not solve the problem either. The Zen of Python parallelism ----------------------------- I draw inspiration from the very timely article by James Reinders in DDJ [1]. It embodies what we should be doing viz.: 1. Refactor the problem into parallel tasks. We cannot help if the domain is sequential 2. Program to abstraction & program chores not cores. Writing a correct program using raw threads et al is difficult. Let the underlying substrate decide how best to optimize 3. Design for scale 4. Have an option to turn concurrency off, for debugging 5. Declarative parallelism based mechanisms (?) Related Efforts --------------- The good news is there are at least 2 or 3 paradigms with implementations and rough benchmarks. Parallel python http://www.artima.com/weblogs/viewpost.jsp?thread=214303 http://cheeseshop.python.org/pypi/parallel Processing http://cheeseshop.python.org/pypi/processing http://code.google.com/p/papyros/ Discussions ----------- There are at least four thread sets (pardon the pun !) I am aware of: 1. The GIL discussions in python-dev and Guido's blog on GIL http://www.artima.com/weblogs/viewpost.jsp?thread=214235 2. The py3k topics started by Bruce http://www.artima.com/weblogs/viewpost.jsp?thread=214112, response by Guido http://www.artima.com/weblogs/viewpost.jsp?thread=214325 and reply to reply by Bruce http://www.artima.com/weblogs/viewpost.jsp?thread=214480 3. Python and concurrency http://mail.python.org/pipermail/python-ideas/2007-March/000338.html 4.
Adam's reply in python-ideas http://mail.python.org/pipermail/python-ideas/2007-September/000972.html References [1]http://www.ddj.com/architect/201804248 [2]Transaction http://acmqueue.com/modules.php?name=Content&pa=showpage&pid=444 [3]Programming Erlang by Joe Armstrong From guido at python.org Tue Sep 18 21:15:40 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 18 Sep 2007 12:15:40 -0700 Subject: [Python-Dev] Exploration PEP : Concurrency for moderately massive (4 to 32 cores) multi-core architectures In-Reply-To: <46F01CC5.3000603@doubleclix.net> References: <46F01CC5.3000603@doubleclix.net> Message-ID: On 9/18/07, Krishna Sankar wrote: > Folks, > As a follow-up to the py3k discussions started by Bruce and Guido, I > pinged Brett and he suggested I submit an exploratory proposal. Would > appreciate insights, wisdom, the good, the bad and the ugly. > > A) Does it make sense ? > B) Which application sets should we consider in designing the > interfaces and implementations > C) In this proposal, parallelism and concurrency are used in an > interchangeable fashion. Thoughts ? > D) Please suggest pertinent links, discussions and insights. > E) I have kept the proposal to a minimum to start the discussions and > to explore if this is the right thing to do. Collaboratively, as we > zero-in on one or two approaches, the idea is to expand it to a crisp > and clear PEP. Need to do some more formatting as well. I'd say it is a little light on specific proposals. The only section that actually proposes anything is this: > Pragmatic proposal > ------------------ > May I suggest we embed two primitives in Python 3K: > A) A functional style share-nothing set of interfaces (and > implementations thereof) - provides the task parallelism/concurrency > capability, "small messages, big computations" as Joe Armstrong calls it[3] > B) A limited shared memory based model for data intensive parallelism > > Most probably this would be part of stdlib. 
While Guido is almost right > in saying that this is a (std)library problem, it is not fully so. We > would need a few primitives from the underlying PVM substrate. Possibly > one reason for Guido's position is the lack of clarity as to what needs > to be changed and why. IMHO, just saying take GIL off does not solve the > problem either. Before I can meaningfully comment I think I'd like to hear more about what specifically you are thinking of. I don't mind the necessary changes to the PVM. I do like to see how this affects existing C extensions though. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Tue Sep 18 21:26:14 2007 From: aahz at pythoncraft.com (Aahz) Date: Tue, 18 Sep 2007 12:26:14 -0700 Subject: [Python-Dev] Exploration PEP : Concurrency for moderately massive (4 to 32 cores) multi-core architectures In-Reply-To: <46F01CC5.3000603@doubleclix.net> References: <46F01CC5.3000603@doubleclix.net> Message-ID: <20070918192614.GA6757@panix.com> On Tue, Sep 18, 2007, Krishna Sankar wrote: > > As a follow-up to the py3k discussions started by Bruce and Guido, I > pinged Brett and he suggested I submit an exploratory proposal. Would > appreciate insights, wisdom, the good, the bad and the ugly. This should probably start in python-ideas. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The best way to get information on Usenet is not to ask a question, but to post the wrong information. 
From tulloss2 at uiuc.edu Tue Sep 18 21:54:43 2007 From: tulloss2 at uiuc.edu (Justin Tulloss) Date: Tue, 18 Sep 2007 14:54:43 -0500 Subject: [Python-Dev] Exploration PEP : Concurrency for moderately massive (4 to 32 cores) multi-core architectures In-Reply-To: <46F01CC5.3000603@doubleclix.net> References: <46F01CC5.3000603@doubleclix.net> Message-ID: <2cfeb93c0709181254v50f26800h3d484b50e208db67@mail.gmail.com> On 9/18/07, Krishna Sankar wrote: > > Folks, > As a follow-up to the py3k discussions started by Bruce and Guido, I > pinged Brett and he suggested I submit an exploratory proposal. Would > appreciate insights, wisdom, the good, the bad and the ugly. I am currently working on parallelizing python as an undergraduate independent study. I plan on first removing the GIL with as little overall effect as possible and then implementing a task-oriented threading API on top, probably based on Stackless (since they already do a great job with concurrency in a single thread). If you're interested in all the details, I'd be happy to share. I haven't gotten far yet (the semester just started!), but I feel that actually implementing these things would be the best way to get a PEP through. Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070918/cc6ff97f/attachment.htm From weilawei at gmail.com Tue Sep 18 23:46:58 2007 From: weilawei at gmail.com (Rob Crowther) Date: Tue, 18 Sep 2007 17:46:58 -0400 Subject: [Python-Dev] Extending Python 3000 Message-ID: I'm attempting to wrap the GNU MP mpf_t type and related functions, as I need the extra precision and speed for a package I'm writing. The available wrappers are either unmaintained or undocumented and incompatible with Python 3000. 
Following the docs in the Python 3000 section of the website, I've started off with this:

*mpfmodule.c*

    #include
    #include
    #include

    typedef struct {
        PyObject_HEAD
        mpf_t ob_val;
    } MPFObject;

    static PyTypeObject MPFType = {
        PyObject_HEAD_INIT(NULL)
        0,                         /*ob_size*/
        "mpf.MPF",                 /*tp_name*/
        sizeof(MPFObject),         /*tp_basicsize*/
        0,                         /*tp_itemsize*/
        0,                         /*tp_dealloc*/
        0,                         /*tp_print*/
        0,                         /*tp_getattr*/
        0,                         /*tp_setattr*/
        0,                         /*tp_compare*/
        0,                         /*tp_repr*/
        0,                         /*tp_as_number*/
        0,                         /*tp_as_sequence*/
        0,                         /*tp_as_mapping*/
        0,                         /*tp_hash */
        0,                         /*tp_call*/
        0,                         /*tp_str*/
        0,                         /*tp_getattro*/
        0,                         /*tp_setattro*/
        0,                         /*tp_as_buffer*/
        Py_TPFLAGS_DEFAULT,        /*tp_flags*/
        "GNU MP mpf_t objects",    /* tp_doc */
    };

    static PyMethodDef mpf_methods[] = {
        {NULL}
    };

    #ifndef PyMODINIT_FUNC
    #define PyMODINIT_FUNC void
    #endif

    PyMODINIT_FUNC
    initmpf(void)
    {
        PyObject* m;

        MPFType.tp_new = PyType_GenericNew;
        if (PyType_Ready(&MPFType) < 0)
            return;

        m = Py_InitModule3("mpf", mpf_methods, "Wrapper around GNU MP mpf_t and related methods");

        Py_INCREF(&MPFType);
        PyModule_AddObject(m, "MPF", (PyObject *) &MPFType);
    }

Upon running my setup.py script, it gives me numerous warnings. These do not occur if I attempt to use Python 2.5. It also works fine under Python 2.5.

    weilawei at archeron:~/Code/mpf$ python setup.py build
    running build
    running build_ext
    building 'mpf' extension
    gcc -pthread -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -DMAJOR_VERSION=1 -DMINOR_VERSION=0 -I/usr/local/include -I/usr/local/include/python3.0 -c mpfmodule.c -o build/temp.linux-i686-3.0/mpfmodule.o
    *mpfmodule.c:11: warning: missing braces around initializer
    mpfmodule.c:11: warning: (near initialization for ‘MPFType.ob_base.ob_base’)
    mpfmodule.c:13: warning: initialization makes integer from pointer without a cast
    mpfmodule.c:32: warning: initialization from incompatible pointer type*
    gcc -pthread -shared build/temp.linux-i686-3.0/mpfmodule.o -L/usr/local/lib -lgmp -o build/lib.linux-i686-3.0/mpf.so

Inside the Python interpreter:

    weilawei at archeron:~/Code/mpf/build/lib.linux-i686-3.0$ python
    Python 3.0a1 (py3k, Sep 15 2007, 00:33:44)
    [GCC 4.1.3 20070831 (prerelease) (Ubuntu 4.1.2-16ubuntu1)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import mpf
    >>> test = mpf.MPF()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    *MemoryError*
    >>> mpf.MPF
    *Segmentation fault*

Pointers as to what has changed and what I need to do to compile an extension for Py3k would be very much appreciated. Thank you all in advance for your time.

- Rob

From alexandre at peadrop.com Wed Sep 19 00:09:00 2007
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Tue, 18 Sep 2007 18:09:00 -0400
Subject: [Python-Dev] Extending Python 3000
In-Reply-To:
References:
Message-ID:

PyObject_HEAD was changed in Py3k to make it conform to C's strict aliasing rules (See PEP 3123 [1]). In your code, you need to change:

    static PyTypeObject MPFType = {
        PyObject_HEAD_INIT(NULL)
        0, /*ob_size*/
        ...
} to this: static PyTypeObject MPFType = { PyVarObject_HEAD_INIT(NULL, 0) ... } Good luck, -- Alexandre [1]: http://www.python.org/dev/peps/pep-3123/ From thomas at python.org Wed Sep 19 02:02:59 2007 From: thomas at python.org (Thomas Wouters) Date: Tue, 18 Sep 2007 17:02:59 -0700 Subject: [Python-Dev] SSL certs In-Reply-To: <-6753447202579215070@unknownmsgid> References: <46DDCD7C.40004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> <20070912013412.GB14034@panix.com> <20070913042606.GB27547@panix.com> <46E8E4B0.60909@v.loewis.de> <-6753447202579215070@unknownmsgid> Message-ID: <9e804ac0709181702v25bffd77laef8cc70127d5e11@mail.gmail.com> On 9/13/07, Bill Janssen wrote: > > > However, there is an alternative to using multiple IP addresses: > > one could also use multiple "subject alternative names", and create > > a certificate that lists them all. > > Unfortunately, much of the client code that does the hostname > verification is wrapped up in gullible Web browsers or Java HTTPS > libraries that swallowed RFC 2818 whole, and not easily accessible by > applications. Does any of it recognize and accept "subject > alternative name"? For what it's worth, when I last looked at this (a year or so ago), only a few fringe browsers on mobile phones had issues with accepting our wildcard certificate, and some of those only because they didn't trust the root authority. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20070918/300bb343/attachment.htm From thomas at python.org Wed Sep 19 02:19:25 2007 From: thomas at python.org (Thomas Wouters) Date: Tue, 18 Sep 2007 17:19:25 -0700 Subject: [Python-Dev] Decimal news In-Reply-To: References: Message-ID: <9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com> On 9/13/07, Facundo Batista wrote: > > Hi people! > > After some months, Decimal is now in the trunk again. > > It's fully updated to the latest Cowlishaw specification, and > complying with the latest test cases (from a few days ago, which even > take in consideration some feedback from ours). > > I want to thank so much to Mark Dickinson, who made *a lot* of this > work, not only the math part (he's a mathematician himself), but also > a lot of cleaning and speeding up. > > Now we will put our hands in the documentation, for it to be 100% OK > way before 2.6 arrives. > > Py3 will come after that. Unfortunately, that's not how it works :-) If you check something into the trunk, it will be merged into Py3k sooner or later. I may ask the original submitter for assistance if it's incredibly hard to figure out the changes, but so far, I only had to do that with the SSL changes. The decimal changes are being merged as I write this (tests running now.) Is there anything in particular that needs to be done for decimal in Py3k, besides renaming __div__ to __truediv__? If you re-eally need to check something into the trunk that re-eally must not be merged into py3k, but you're afraid it's not going to be obvious to the merger, please record the change as 'merged' using "svnmerge merge -M -r". Please take care when picking the revision ;) You can also just email me or someone else you see doing merges, as I doubt this will be a common occurance. -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20070918/c1ae54cf/attachment.htm From orsenthil at gmail.com Wed Sep 19 03:27:40 2007 From: orsenthil at gmail.com (O.R.Senthil Kumaran) Date: Wed, 19 Sep 2007 06:57:40 +0530 Subject: [Python-Dev] IPv6 hostname resolution using socket.getaddrinfo() In-Reply-To: References: <20070917163736.GA5434@gmail.com> Message-ID: <20070919012740.GA3410@gmail.com> * Blinston_Fernandes at Dell.com : > On python2.4.1 > > >>> socket.getaddrinfo('www.python.org', None, socket.AF_INET, > socket.SOCK_DGRAM, socket.IPPROTO_IP, socket.AI_CANONNAME) > [(2, 2, 17, 'dinsdale.python.org', ('82.94.237.218', 0))] > >>> > > Blinston. Thanks a lot, Blinston. That helped. I just have to take care of socket.AF_INET6 flag for IPv6 now. >>>socket.getaddrinfo('localhost', None, socket.AF_INET6, socket.SOCK_DGRAM, socket.IPPROTO_IP, socket.AI_CANONNAME) [(10, 2, 17, 'localhost', ('fe80::219:5bff:fefd:6270', 0, 0, 0))] Shall do a little more research on flags and see if documentation needs any update. Because current one speaks only about AI_CANONNAME being set. Thank you. :) Senthil > > -----Original Message----- > From: python-dev-bounces+blinston_fernandes=dell.com at python.org > [mailto:python-dev-bounces+blinston_fernandes=dell.com at python.org] On > Behalf Of O.R.Senthil Kumaran > Sent: Monday, September 17, 2007 10:08 PM > To: python-dev at python.org > Subject: [Python-Dev] IPv6 hostname resolution using > socket.getaddrinfo() > > To get the hostname, I can use socket.gethostbyname() but that has an > inherent limitation wherein does it not support IPv6 name resolution, > and > getaddrinfo() should be used instead. > > Looking up the socket.getaddrinfo() documentation, I come to know that > > The getaddrinfo() function returns a list of 5-tuples with the following > structure: > > (family, socktype, proto, canonname, sockaddr) > > family, socktype, proto are all integer and are meant to be passed to > the socket() function. 
canonname is a string representing the canonical > name of the host. It can be a numeric IPv4/v6 address when AI_CANONNAME > is specified for a numeric host. > > With this information, if I try something like this: > > >>> for res in socket.getaddrinfo('goofy.goofy.com', None, > socket.AI_CANONNAME): > > print res > > (2, 1, 6, '', ('10.98.1.6', 0)) > (2, 2, 17, '', ('10.98.1.6', 0)) > (2, 3, 0, '', ('10.98.1.6', 0)) > > In the output, I see the cannoname to be always blank ''. I am not > getting the IPv4 or IPv6 address as a result of using getaddrinfo(). > > Am I making any mistake? > > What i am trying is a replacement function for > socket.gethostbyname(hostname) which will work for both IPv4 and IPv6 > (and make changes in urllib2 to support that) > > # return hostbyname for either IPv4 or IPv6 address. Common function. > > def ipv6_gethostbyname(hostname): > for res in socket.getaddrinfo(hostname,None, > socket.AI_CANONNAME): > fa, socktype, proto, canonname, sa = res > return cannoname > > The above function does not seem to work. It returns blank value only. > > Any help/ pointers? 
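
A possible explanation for the blank canonname above: getaddrinfo() takes its arguments in the order (host, port, family, socktype, proto, flags), so passing socket.AI_CANONNAME as the third positional argument sets the address family rather than the flags. On Linux both AF_INET and AI_CANONNAME have the value 2, which would match the family value 2 in the output shown. The following is only a minimal sketch under that assumption, not code from this thread: the name resolve_host is hypothetical, and it relies on the keyword arguments accepted by socket.getaddrinfo() in current Python 3.

```python
import socket


def resolve_host(hostname, family=socket.AF_UNSPEC):
    """Return (canonname, addresses) for hostname, for IPv4 and/or IPv6.

    resolve_host is a hypothetical helper name used for illustration.
    """
    infos = socket.getaddrinfo(
        hostname,
        None,                        # no service/port lookup needed
        family=family,               # AF_UNSPEC: accept IPv4 and IPv6 results
        type=socket.SOCK_STREAM,     # avoid one result per socket type
        flags=socket.AI_CANONNAME,   # ask the resolver for the canonical name
    )
    # The canonical name is typically filled in on the first result only.
    canonname = infos[0][3]
    addresses = []
    for _family, _socktype, _proto, _canon, sockaddr in infos:
        address = sockaddr[0]        # first sockaddr element for both v4 and v6
        if address not in addresses:
            addresses.append(address)
    return canonname, addresses


if __name__ == "__main__":
    print(resolve_host("127.0.0.1"))
```

Passing family=socket.AF_INET6 would restrict the lookup to IPv6 addresses; a failed lookup raises socket.gaierror, which a caller may want to catch.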
> > -- > O.R.Senthil Kumaran > http://uthcode.sarovar.org
-- O.R.Senthil Kumaran http://uthcode.sarovar.org
From janssen at parc.com Wed Sep 19 03:29:23 2007 From: janssen at parc.com (Bill Janssen) Date: Tue, 18 Sep 2007 18:29:23 PDT Subject: [Python-Dev] SSL certs In-Reply-To: <9e804ac0709181702v25bffd77laef8cc70127d5e11@mail.gmail.com> References: <46DDCD7C.40004@v.loewis.de> <46DECFF6.4040107@v.loewis.de> <46DEF5FF.8040602@v.loewis.de> <46DEFF3C.90306@v.loewis.de> <-1936579380892715012@unknownmsgid> <60ed19d40709060950qe3ea6eft27b0276768ffa7bb@mail.gmail.com> <20070912013412.GB14034@panix.com> <20070913042606.GB27547@panix.com> <46E8E4B0.60909@v.loewis.de> <-6753447202579215070@unknownmsgid> <9e804ac0709181702v25bffd77laef8cc70127d5e11@mail.gmail.com> Message-ID: <07Sep18.182929pdt."57996"@synergy1.parc.xerox.com> I guess something we should think about is whether to introduce RFC 2818 hostname checking into urllib.urlopen() and similar utilities. Presumably one would add an optional arg specifying a file full of root certs to validate against, and if that arg was present, would retrieve the hostname info from the validated cert, and do the client-side check. Bill
From ksankar at doubleclix.net Wed Sep 19 04:36:25 2007 From: ksankar at doubleclix.net (Krishna Sankar) Date: Tue, 18 Sep 2007 19:36:25 -0700 Subject: [Python-Dev] Exploration PEP : Concurrency for moderately massive (4 to 32 cores) multi-core architectures In-Reply-To: References: <46F01CC5.3000603@doubleclix.net> Message-ID: <46F08B29.6050206@doubleclix.net> Guido, The vagueness is deliberate, to keep the options open until we have some form of convergence.
Parallelism/concurrency is a vast and important domain that I do not want to develop a hasty proposal. But I did want to start thinking in terms of concrete proposals, not pontifying, hence the "pragmatic" section. Happy to hear that you are open to PVM changes. It will not be asked unless and until we all are crisp about it. Cheers Guido van Rossum wrote: > On 9/18/07, Krishna Sankar wrote: > >> Folks, >> As a follow-up to the py3k discussions started by Bruce and Guido, I >> pinged Brett and he suggested I submit an exploratory proposal. Would >> appreciate insights, wisdom, the good, the bad and the ugly. >> >> A) Does it make sense ? >> B) Which application sets should we consider in designing the >> interfaces and implementations >> C) In this proposal, parallelism and concurrency are used in an >> interchangeable fashion. Thoughts ? >> D) Please suggest pertinent links, discussions and insights. >> E) I have kept the proposal to a minimum to start the discussions and >> to explore if this is the right thing to do. Collaboratively, as we >> zero-in on one or two approaches, the idea is to expand it to a crisp >> and clear PEP. Need to do some more formatting as well. >> > > I'd say it is a little light on specific proposals. The only section > that actually proposes anything is this: > > >> Pragmatic proposal >> ------------------ >> May I suggest we embed two primitives in Python 3K: >> A) A functional style share-nothing set of interfaces (and >> implementations thereof) - provides the task parallelism/concurrency >> capability, "small messages, big computations" as Joe Armstrong calls it[3] >> B) A limited shared memory based model for data intensive parallelism >> >> Most probably this would be part of stdlib. While Guido is almost right >> in saying that this is a (std)library problem, it is not fully so. We >> would need a few primitives from the underlying PVM substrate. 
Possibly >> one reason for Guido's position is the lack of clarity as to what needs >> to be changed and why. IMHO, just saying take GIL off does not solve the >> problem either. >> > > Before I can meaningfully comment I think I'd like to hear more about > what specifically you are thinking of. > > I don't mind the necessary changes to the PVM. I do like to see how > this affects existing C extensions though. > > From ksankar at doubleclix.net Wed Sep 19 04:43:18 2007 From: ksankar at doubleclix.net (Krishna Sankar) Date: Tue, 18 Sep 2007 19:43:18 -0700 Subject: [Python-Dev] Exploration PEP : Concurrency for moderately massive (4 to 32 cores) multi-core architectures In-Reply-To: <2cfeb93c0709181254v50f26800h3d484b50e208db67@mail.gmail.com> References: <46F01CC5.3000603@doubleclix.net> <2cfeb93c0709181254v50f26800h3d484b50e208db67@mail.gmail.com> Message-ID: <46F08CC6.2060208@doubleclix.net> Justin, Yep, trying out an implementation is a good way. Please share your thoughts as and when you are ready. Cheers & good luck Justin Tulloss wrote: > > > On 9/18/07, *Krishna Sankar* > wrote: > > Folks, > As a follow-up to the py3k discussions started by Bruce and > Guido, I > pinged Brett and he suggested I submit an exploratory proposal. Would > appreciate insights, wisdom, the good, the bad and the ugly. > > > I am currently working on parallelizing python as an undergraduate > independent study. I plan on first removing the GIL with as little > overall effect as possible and then implementing a task-oriented > threading API on top, probably based on Stackless (since they already > do a great job with concurrency in a single thread). > > If you're interested in all the details, I'd be happy to share. I > haven't gotten far yet (the semester just started!), but I feel that > actually implementing these things would be the best way to get a PEP > through. 
> > Justin > > From guido at python.org Wed Sep 19 05:24:28 2007 From: guido at python.org (Guido van Rossum) Date: Tue, 18 Sep 2007 20:24:28 -0700 Subject: [Python-Dev] Exploration PEP : Concurrency for moderately massive (4 to 32 cores) multi-core architectures In-Reply-To: <46F08B29.6050206@doubleclix.net> References: <46F01CC5.3000603@doubleclix.net> <46F08B29.6050206@doubleclix.net> Message-ID: On 9/18/07, Krishna Sankar wrote: > The vagueness is deliberate, to keep the options open until we have > some form o convergence. Parallelism/concurrency is a vast and important > domain that I do not want to develop a hasty proposal. But I did want to > start thinking in terms of concrete proposals, not pontifying, hence the > "pragmatic" section. As long as it's this vague it doesn't deserve to be called a PEP though. PEPs can't be vague, they must make specific proposals. As long as this is intentionally half-baked it belongs back in python-ideas and there's no point in pretending to be writing a "PEP". > Happy to hear that you are open to PVM changes. It will not be asked > unless and until we all are crisp about it. > Cheers > > > Guido van Rossum wrote: > > On 9/18/07, Krishna Sankar wrote: > > > >> Folks, > >> As a follow-up to the py3k discussions started by Bruce and Guido, I > >> pinged Brett and he suggested I submit an exploratory proposal. Would > >> appreciate insights, wisdom, the good, the bad and the ugly. > >> > >> A) Does it make sense ? > >> B) Which application sets should we consider in designing the > >> interfaces and implementations > >> C) In this proposal, parallelism and concurrency are used in an > >> interchangeable fashion. Thoughts ? > >> D) Please suggest pertinent links, discussions and insights. > >> E) I have kept the proposal to a minimum to start the discussions and > >> to explore if this is the right thing to do. Collaboratively, as we > >> zero-in on one or two approaches, the idea is to expand it to a crisp > >> and clear PEP. 
Need to do some more formatting as well. > >> > > > > I'd say it is a little light on specific proposals. The only section > > that actually proposes anything is this: > > > > > >> Pragmatic proposal > >> ------------------ > >> May I suggest we embed two primitives in Python 3K: > >> A) A functional style share-nothing set of interfaces (and > >> implementations thereof) - provides the task parallelism/concurrency > >> capability, "small messages, big computations" as Joe Armstrong calls it[3] > >> B) A limited shared memory based model for data intensive parallelism > >> > >> Most probably this would be part of stdlib. While Guido is almost right > >> in saying that this is a (std)library problem, it is not fully so. We > >> would need a few primitives from the underlying PVM substrate. Possibly > >> one reason for Guido's position is the lack of clarity as to what needs > >> to be changed and why. IMHO, just saying take GIL off does not solve the > >> problem either. > >> > > > > Before I can meaningfully comment I think I'd like to hear more about > > what specifically you are thinking of. > > > > I don't mind the necessary changes to the PVM. I do like to see how > > this affects existing C extensions though. > > > > > > -- --Guido van Rossum (home page: http://www.python.org/~guido/)
From ksankar at doubleclix.net Wed Sep 19 05:43:20 2007 From: ksankar at doubleclix.net (Krishna Sankar) Date: Tue, 18 Sep 2007 20:43:20 -0700 Subject: [Python-Dev] Exploration PEP : Concurrency for moderately massive (4 to 32 cores) multi-core architectures In-Reply-To: References: <46F01CC5.3000603@doubleclix.net> <46F08B29.6050206@doubleclix.net> Message-ID: <46F09AD8.10005@doubleclix.net> Agreed it is not a PEP yet. Hence the word "Exploration" in front of it. This ia domain which needs some discussions before developing a good PEP. May be I should call it a PEPlet ;o) Cheers Guido van Rossum wrote: > On 9/18/07, Krishna Sankar wrote: > >> The vagueness is deliberate, to keep the options open until we have >> some form o convergence. Parallelism/concurrency is a vast and important >> domain that I do not want to develop a hasty proposal. But I did want to >> start thinking in terms of concrete proposals, not pontifying, hence the >> "pragmatic" section. >> > > As long as it's this vague it doesn't deserve to be called a PEP > though. PEPs can't be vague, they must make specific proposals. As > long as this is intentionally half-baked it belongs back in > python-ideas and there's no point in pretending to be writing a "PEP". > > >> Happy to hear that you are open to PVM changes.
It will not be asked >> unless and until we all are crisp about it. >> Cheers >> >> >> Guido van Rossum wrote: >> >>> On 9/18/07, Krishna Sankar wrote: >>> >>> >>>> Folks, >>>> As a follow-up to the py3k discussions started by Bruce and Guido, I >>>> pinged Brett and he suggested I submit an exploratory proposal. Would >>>> appreciate insights, wisdom, the good, the bad and the ugly. >>>> >>>> A) Does it make sense ? >>>> B) Which application sets should we consider in designing the >>>> interfaces and implementations >>>> C) In this proposal, parallelism and concurrency are used in an >>>> interchangeable fashion. Thoughts ? >>>> D) Please suggest pertinent links, discussions and insights. >>>> E) I have kept the proposal to a minimum to start the discussions and >>>> to explore if this is the right thing to do. Collaboratively, as we >>>> zero-in on one or two approaches, the idea is to expand it to a crisp >>>> and clear PEP. Need to do some more formatting as well. >>>> >>>> >>> I'd say it is a little light on specific proposals. The only section >>> that actually proposes anything is this: >>> >>> >>> >>>> Pragmatic proposal >>>> ------------------ >>>> May I suggest we embed two primitives in Python 3K: >>>> A) A functional style share-nothing set of interfaces (and >>>> implementations thereof) - provides the task parallelism/concurrency >>>> capability, "small messages, big computations" as Joe Armstrong calls it[3] >>>> B) A limited shared memory based model for data intensive parallelism >>>> >>>> Most probably this would be part of stdlib. While Guido is almost right >>>> in saying that this is a (std)library problem, it is not fully so. We >>>> would need a few primitives from the underlying PVM substrate. Possibly >>>> one reason for Guido's position is the lack of clarity as to what needs >>>> to be changed and why. IMHO, just saying take GIL off does not solve the >>>> problem either. 
>>>> >>>> >>> Before I can meaningfully comment I think I'd like to hear more about >>> what specifically you are thinking of. >>> >>> I don't mind the necessary changes to the PVM. I do like to see how >>> this affects existing C extensions though. >>> >>> >>> >> > > >
From jdsw2002 at yahoo.com Wed Sep 19 10:40:30 2007
From: jdsw2002 at yahoo.com (jd)
Date: Wed, 19 Sep 2007 01:40:30 -0700 (PDT)
Subject: [Python-Dev] Pygtk app and hangs.
Message-ID: <183106.76731.qm@web35813.mail.mud.yahoo.com>

Hi

I have a non-trivial pygtk app running into hangs/freezes. Overall, here is how the program is structured:

gobject.threads_init() is called.
gtk.main runs within threads_enter/threads_leave.
All UI operations happen in the main thread.
Some callbacks create UIWorker threads; a UI worker thread does some work and then calls gobject.idle_add to schedule a function that updates the UI.
I have a timer that uses gobject.timeout_add.
The idle callbacks and timeout callbacks run within threads_enter/threads_leave.
I use a library that creates its own threads and does socket operations.

Q. Is there anything I missed, or any suggestions? Is there a comprehensive list/scheme on how to write a multi-threaded pygtk app?

Q. I tried to set up a debug build of Python 2.5, but it fails with:

    import gtk
      File "/usr/lib/python2.5/site-packages/gtk-2.0/gtk/__init__.py", line 38, in <module>
        import gobject as _gobject
      File "/usr/lib/python2.5/site-packages/gtk-2.0/gobject/__init__.py", line 30, in <module>
        from _gobject import *
    ImportError: /usr/lib/python2.5/site-packages/gtk-2.0/gobject/_gobject.so: undefined symbol: PyUnicodeUCS4_FromObject

What special switch do I need to give to configure while building Python?

Q. I have attached thread dumps. Any input on what might be going on?

Q. Does event processing for modal dialog boxes happen in the main thread?

Sorry for sending this to both lists, but the app is pygtk while the stack *seems* fairly clean (other than the main thread).

Thanks a ton, in advance.
/Jd ____________________________________________________________________________________ Pinpoint customers who are looking for what you sell. http://searchmarketing.yahoo.com/ -------------- next part -------------- A non-text attachment was scrubbed... Name: gdb_hang_problem Type: application/octet-stream Size: 18347 bytes Desc: 130815649-gdb_hang_problem Url : http://mail.python.org/pipermail/python-dev/attachments/20070919/11299660/attachment-0001.obj From steve at shrogers.com Wed Sep 19 14:39:33 2007 From: steve at shrogers.com (Steven H. Rogers) Date: Wed, 19 Sep 2007 06:39:33 -0600 Subject: [Python-Dev] Exploration PEP : Concurrency for moderately massive (4 to 32 cores) multi-core architectures In-Reply-To: <46F01CC5.3000603@doubleclix.net> References: <46F01CC5.3000603@doubleclix.net> Message-ID: <46F11885.7000406@shrogers.com> Krishna Sankar wrote: > Folks, > As a follow-up to the py3k discussions started by Bruce and Guido, I > pinged Brett and he suggested I submit an exploratory proposal. Would > appreciate insights, wisdom, the good, the bad and the ugly. > > A) Does it make sense ? > B) Which application sets should we consider in designing the > interfaces and implementations > C) In this proposal, parallelism and concurrency are used in an > interchangeable fashion. Thoughts ? > D) Please suggest pertinent links, discussions and insights. > E) I have kept the proposal to a minimum to start the discussions and > to explore if this is the right thing to do. Collaboratively, as we > zero-in on one or two approaches, the idea is to expand it to a crisp > and clear PEP. Need to do some more formatting as well. > Cheers > > P.S : I had sent this to python-ideas couple of days ago and received > two comments (Thanks Leonardo, Thanks Adam) I haven't incorporated their > comments yet. Folks who are on both lists, pardon me for the spam. # Proto-PEP elided. Other than number of cores, you don't mention hardware architecture. 
I presume that you're thinking of symmetric multiprocessor architectures. If so, this should be explicit. You should also consider that SMP may not be the predominant multi-core architecture in the future, the Cell processor has one general purpose processor and eight more specialized processors. You might not want to limit the PEP to 32 cores, I know of startups working on 40 and 64 core chips. Shared memory may be necessary for good performance, but it doesn't have to be exposed at the language level. While Erlang has strictly message passing semantics, I believe that it uses shared memory in the low level implementation. # Steve From facundobatista at gmail.com Wed Sep 19 15:01:32 2007 From: facundobatista at gmail.com (Facundo Batista) Date: Wed, 19 Sep 2007 10:01:32 -0300 Subject: [Python-Dev] Decimal news In-Reply-To: <9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com> References: <9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com> Message-ID: 2007/9/18, Thomas Wouters : > Unfortunately, that's not how it works :-) If you check something into the > trunk, it will be merged into Py3k sooner or later. I may ask the original > submitter for assistance if it's incredibly hard to figure out the changes, > but so far, I only had to do that with the SSL changes. The decimal changes > are being merged as I write this (tests running now.) Is there anything in > particular that needs to be done for decimal in Py3k, besides renaming > __div__ to __truediv__? There isn't nothing really special to do, but my plan was because I didn't know how the mechanism worked, ;) It'd be great if all the changes that I'm making to Decimal are automatically, at some point, merged into Py3k (I guess that using the conversion tool). But at some point, both codes may start to diverge, because Py3k-specific optimizations could be done there... but this could be done in an year or two, ;). So, how is this handled? 
Until which moment can I expect that the changes in the trunk are merged to Py3k? Thank you very much! Regards, -- . Facundo Blog: http://www.taniquetil.com.ar/plog/ PyAr: http://www.python.org/ar/ From thomas at python.org Wed Sep 19 16:30:18 2007 From: thomas at python.org (Thomas Wouters) Date: Wed, 19 Sep 2007 07:30:18 -0700 Subject: [Python-Dev] Decimal news In-Reply-To: References: <9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com> Message-ID: <9e804ac0709190730m2ff4a290x7601c74d18e06237@mail.gmail.com> On 9/19/07, Facundo Batista wrote: > > 2007/9/18, Thomas Wouters : > > > Unfortunately, that's not how it works :-) If you check something into > the > > trunk, it will be merged into Py3k sooner or later. I may ask the > original > > submitter for assistance if it's incredibly hard to figure out the > changes, > > but so far, I only had to do that with the SSL changes. The decimal > changes > > are being merged as I write this (tests running now.) Is there anything > in > > particular that needs to be done for decimal in Py3k, besides renaming > > __div__ to __truediv__? > > There isn't nothing really special to do, but my plan was because I > didn't know how the mechanism worked, ;) > > It'd be great if all the changes that I'm making to Decimal are > automatically, at some point, merged into Py3k (I guess that using the > conversion tool). I don't usually have to use the 2to3 tool, but sometimes, yes. But at some point, both codes may start to diverge, because > Py3k-specific optimizations could be done there... but this could be > done in an year or two, ;). > > So, how is this handled? Until which moment can I expect that the > changes in the trunk are merged to Py3k? Until you hear otherwise :) You can commit py3k-specific changes to the py3k branch, the merges shouldn't lose them. 
(Of course, mistakes in merging are possible, which is why tests are good :)
If I do base the merge on the 2to3 output of the trunk version, I'd be
careful not to lose changes made in the branch.

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From facundobatista at gmail.com Wed Sep 19 16:33:48 2007
From: facundobatista at gmail.com (Facundo Batista)
Date: Wed, 19 Sep 2007 11:33:48 -0300
Subject: [Python-Dev] Decimal news
In-Reply-To: <9e804ac0709190730m2ff4a290x7601c74d18e06237@mail.gmail.com>
References: <9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com>
 <9e804ac0709190730m2ff4a290x7601c74d18e06237@mail.gmail.com>
Message-ID:

2007/9/19, Thomas Wouters :

> > So, how is this handled? Until which moment can I expect that the
> > changes in the trunk are merged to Py3k?
>
> Until you hear otherwise :) You can commit py3k-specific changes to the py3k
> branch, the merges shouldn't lose them. (Of course, mistakes in merging are
> possible, which is why tests are good :) If I do base the merge on the 2to3
> output of the trunk version, I'd be careful not to lose changes made in the
> branch.

Ok, thank you very much!!

Regards,

-- 
. Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From aahz at pythoncraft.com Wed Sep 19 17:13:07 2007
From: aahz at pythoncraft.com (Aahz)
Date: Wed, 19 Sep 2007 08:13:07 -0700
Subject: [Python-Dev] Pygtk app and hangs.
In-Reply-To: <183106.76731.qm@web35813.mail.mud.yahoo.com>
References: <183106.76731.qm@web35813.mail.mud.yahoo.com>
Message-ID: <20070919151307.GA7802@panix.com>

On Wed, Sep 19, 2007, jd wrote:
>
> I have a non-trivial pygtk running in to hangs/freezes.

python-dev is not an appropriate place to ask for help with debugging your
programs.
It is only for people working on the Python package itself. Please use the pygtk list (which you already did) or the newsgroup comp.lang.python. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ The best way to get information on Usenet is not to ask a question, but to post the wrong information. From status at bugs.python.org Wed Sep 19 20:14:03 2007 From: status at bugs.python.org (Tracker) Date: Wed, 19 Sep 2007 18:14:03 +0000 (UTC) Subject: [Python-Dev] Summary of Tracker Issues Message-ID: <20070919181403.346B9782C1@psf.upfronthosting.co.za> ACTIVITY SUMMARY (09/12/07 - 09/19/07) Tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue number. Do NOT respond to this message. 1266 open (+13) / 11396 closed (+11) / 12662 total (+24) Open issues with patches: 408 Average duration of open issues: 678 days. Median duration of open issues: 643 days. Open Issues Breakdown open 1262 (+13) pending 4 ( +0) Issues Created Or Reopened (24) _______________________________ Suggested change to _exit function description in os module docu 09/12/07 CLOSED http://bugs.python.org/issue1156 created jtonsing test_urllib2net fails on test_ftp 09/12/07 http://bugs.python.org/issue1157 created gvanrossum %f format for datetime objects 09/13/07 http://bugs.python.org/issue1158 created skip.montanaro py3k, patch os.getenv() not updated after external module uses C putenv() 09/13/07 http://bugs.python.org/issue1159 created robert.ancell Medium size regexp crashes python 09/13/07 http://bugs.python.org/issue1160 created ostkamp Garbled chars in offending line of SyntaxError traceback 09/13/07 http://bugs.python.org/issue1161 created eopadoan Python doesn't compile on Microsoft Visual Studio 2008 "Orcas" B 09/13/07 CLOSED http://bugs.python.org/issue1162 created swaroopch Patch to make py3k/Lib/test/test_thread.py use unittest 09/13/07 http://bugs.python.org/issue1163 created JonoDiCarlo patch tp_print slots don't release 
the GIL 09/14/07 CLOSED http://bugs.python.org/issue1164 created arigo patch Should itertools.count work for arbitrary integers? 09/15/07 http://bugs.python.org/issue1165 created eopadoan py3k NameError when calling malloc 09/15/07 CLOSED http://bugs.python.org/issue1166 created esr gdbm/ndbm 1.8.1+ needs libgdbm_compat.so 09/16/07 http://bugs.python.org/issue1167 created ikelly patch complex arithmetic: strange results with "imag" 09/16/07 CLOSED http://bugs.python.org/issue1168 created newman Option -OO doesn't remove docstrings from functions 09/16/07 CLOSED http://bugs.python.org/issue1169 created piro patch shlex have problems with parsing unicode 09/17/07 http://bugs.python.org/issue1170 created dexen allow subclassing of bytes type 09/17/07 http://bugs.python.org/issue1171 created mfenniak py3k, patch Documentation for done attribute of FieldStorage class 09/17/07 CLOSED http://bugs.python.org/issue1172 created bkline patch yield expressions not documented in Language Reference 09/17/07 CLOSED http://bugs.python.org/issue1173 created dangyogi new generator methods not documented in Library Reference 09/17/07 CLOSED http://bugs.python.org/issue1174 created dangyogi .readline() has bug WRT nonblocking files 09/18/07 CLOSED http://bugs.python.org/issue1175 created ajb str.split() takes no keyword arguments (Should this be expected? 09/18/07 http://bugs.python.org/issue1176 created sergioc urllib* 20x responses not OK? 
09/19/07 CLOSED http://bugs.python.org/issue1177 reopened jafo patch IDLE - add "paste code" functionality 09/18/07 http://bugs.python.org/issue1178 created taleinat patch [CVE-2007-4965] Integer overflow in imageop module 09/19/07 http://bugs.python.org/issue1179 created cartman Issues Now Closed (34) ______________________ cgi: parse_qs and parse_qsl misbehave on empty strings 24 days http://bugs.python.org/issue1014 gvanrossum [py3k] pdb does not work in python 3000 16 days http://bugs.python.org/issue1038 georg.brandl py3k platform system may be Windows or Microsoft since Vista 17 days http://bugs.python.org/issue1082 p.lavarre at ieee.org patch "make altinstall" installs pydoc, idle, smtpd.py with broken she 6 days http://bugs.python.org/issue1120 georg.brandl Document inspect.getfullargspec() 6 days http://bugs.python.org/issue1121 georg.brandl py3k split(None, maxsplit) does not strip whitespace correctly 12 days http://bugs.python.org/issue1123 nirs file.fileno and file.isatty() should be implementable by any fil 11 days http://bugs.python.org/issue1126 jafo Reference Manual: "for statement" links to "break statement" 10 days http://bugs.python.org/issue1131 georg.brandl re.sub returns str when processing empty unicode string 7 days http://bugs.python.org/issue1140 jafo reading large files 8 days http://bugs.python.org/issue1141 jafo TypeError on join - httplib mixing str and bytes 1 days http://bugs.python.org/issue1148 gvanrossum fdopen does not work as expected 6 days http://bugs.python.org/issue1149 jafo Rename PyBUF_WRITEABLE to PyBUF_WRITABLE 6 days http://bugs.python.org/issue1150 jafo patch help(pickle) fails: unorderable types: type() < type() 0 days http://bugs.python.org/issue1153 georg.brandl Carbon.CF memory leak 0 days http://bugs.python.org/issue1154 georg.brandl Suggested change to _exit function description in os module docu 2 days http://bugs.python.org/issue1156 jtonsing Python doesn't compile on Microsoft Visual Studio 2008 "Orcas" B 1 
days http://bugs.python.org/issue1162 georg.brandl tp_print slots don't release the GIL 2 days http://bugs.python.org/issue1164 brett.cannon patch NameError when calling malloc 0 days http://bugs.python.org/issue1166 loewis complex arithmetic: strange results with "imag" 0 days http://bugs.python.org/issue1168 georg.brandl Option -OO doesn't remove docstrings from functions 3 days http://bugs.python.org/issue1169 georg.brandl patch Documentation for done attribute of FieldStorage class 1 days http://bugs.python.org/issue1172 jafo patch yield expressions not documented in Language Reference 0 days http://bugs.python.org/issue1173 georg.brandl new generator methods not documented in Library Reference 0 days http://bugs.python.org/issue1174 georg.brandl .readline() has bug WRT nonblocking files 1 days http://bugs.python.org/issue1175 gvanrossum urllib* 20x responses not OK? 0 days http://bugs.python.org/issue1177 facundobatista patch time mod's timezone doesn't honor TZ var 2114 days http://bugs.python.org/issue487331 brett.cannon asyncore file wrapper & os.error 1988 days http://bugs.python.org/issue539444 brett.cannon long file name support broken in windows 1983 days http://bugs.python.org/issue542314 mhammond urllib2 raises exception with non-200 success codes. 
1193 days http://bugs.python.org/issue971965 georg.brandl class property fset not working 842 days http://bugs.python.org/issue1207379 georg.brandl Reading with bz2.BZ2File() returns one garbage character 306 days http://bugs.python.org/issue1597011 jafo Decimal and long hash, compatibly and efficiently 38 days http://bugs.python.org/issue1772851 facundobatista patch ctypes on Solaris 25 days http://bugs.python.org/issue1777530 theller Top Issues Most Discussed (10) ______________________________ 12 platform system may be Windows or Microsoft since Vista 17 days closed http://bugs.python.org/issue1082 9 tp_print slots don't release the GIL 2 days closed http://bugs.python.org/issue1164 9 os.getenv() not updated after external module uses C putenv() 6 days open http://bugs.python.org/issue1159 9 %f format for datetime objects 7 days open http://bugs.python.org/issue1158 7 urllib* 20x responses not OK? 0 days closed http://bugs.python.org/issue1177 6 Optimizations for cgi.FieldStorage methods 399 days open http://bugs.python.org/issue1541463 6 .readline() has bug WRT nonblocking files 1 days closed http://bugs.python.org/issue1175 6 Allow str.join to join non-string types (as per PEP 3100) 8 days open http://bugs.python.org/issue1145 5 Decimal and long hash, compatibly and efficiently 38 days closed http://bugs.python.org/issue1772851 5 Documentation for done attribute of FieldStorage class 1 days closed http://bugs.python.org/issue1172 From facundobatista at gmail.com Wed Sep 19 22:41:49 2007 From: facundobatista at gmail.com (Facundo Batista) Date: Wed, 19 Sep 2007 17:41:49 -0300 Subject: [Python-Dev] Python tickets summary In-Reply-To: References: Message-ID: 2007/9/10, Facundo Batista : > I modified my tool, whichs makes a summary of all the Python tickets > (I moved the source where the info is taken from SF to our Roundup). 
Based on an idea from Dennis Benzinger, now the temporal bars show the
moments where each comment was made, so it's easy to see the "rhythm"
of the ticket activity:

  http://www.taniquetil.com.ar/facundo/py_tickets.html

Regards,

-- 
. Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From ksankar at doubleclix.net Wed Sep 19 23:00:48 2007
From: ksankar at doubleclix.net (Krishna Sankar)
Date: Wed, 19 Sep 2007 14:00:48 -0700
Subject: [Python-Dev] Exploration PEP : Concurrency for moderately massive (4 to 32 cores) multi-core architectures
In-Reply-To: <46F11885.7000406@shrogers.com>
References: <46F01CC5.3000603@doubleclix.net> <46F11885.7000406@shrogers.com>
Message-ID: <46F18E00.40700@doubleclix.net>

Steve,
    Thanks.

    a) Yep, SMP for now. Agreed on the need for asymmetric architectures
       like the Cell processor. We need to start somewhere and can then
       extend to more exotic realms.
    b) Yep, we need to scale to an arbitrary number of cores. But as a
       start, I wanted to differentiate from massive parallelism.
    c) Yep, we can have message passing semantics at the interface level
       and then share the memory underneath (even optimize with the
       copy-on-write pattern). I was thinking that we would need to cross
       process space; for example, federate 8 separate py processes (in an
       8 core machine) and have a shared data path between them, based on
       shared memory allocated at configuration time.

Cheers

Steven H. Rogers wrote:
> Krishna Sankar wrote:
>
>> Folks,
>> As a follow-up to the py3k discussions started by Bruce and Guido, I
>> pinged Brett and he suggested I submit an exploratory proposal. Would
>> appreciate insights, wisdom, the good, the bad and the ugly.
>>
>> A) Does it make sense ?
>> B) Which application sets should we consider in designing the
>> interfaces and implementations
>> C) In this proposal, parallelism and concurrency are used in an
>> interchangeable fashion. Thoughts ?
>> D) Please suggest pertinent links, discussions and insights.
>> E) I have kept the proposal to a minimum to start the discussions and >> to explore if this is the right thing to do. Collaboratively, as we >> zero-in on one or two approaches, the idea is to expand it to a crisp >> and clear PEP. Need to do some more formatting as well. >> Cheers >> >> P.S : I had sent this to python-ideas couple of days ago and received >> two comments (Thanks Leonardo, Thanks Adam) I haven't incorporated their >> comments yet. Folks who are on both lists, pardon me for the spam. >> > # Proto-PEP elided. > > Other than number of cores, you don't mention hardware architecture. I > presume that you're thinking of symmetric multiprocessor architectures. > If so, this should be explicit. You should also consider that SMP may > not be the predominant multi-core architecture in the future, the Cell > processor has one general purpose processor and eight more specialized > processors. You might not want to limit the PEP to 32 cores, I know of > startups working on 40 and 64 core chips. > > Shared memory may be necessary for good performance, but it doesn't have > to be exposed at the language level. While Erlang has strictly message > passing semantics, I believe that it uses shared memory in the low level > implementation. > > # Steve > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ksankar%40doubleclix.net > > > From rrr at ronadam.com Thu Sep 20 01:55:41 2007 From: rrr at ronadam.com (Ron Adam) Date: Wed, 19 Sep 2007 18:55:41 -0500 Subject: [Python-Dev] Python tickets summary In-Reply-To: References: Message-ID: <46F1B6FD.70007@ronadam.com> Facundo Batista wrote: > 2007/9/10, Facundo Batista : > >> I modified my tool, whichs makes a summary of all the Python tickets >> (I moved the source where the info is taken from SF to our Roundup). 
> > Based on an idea from Dennis Benzinger, now the temporal bars show the > moments where each comment was made, so it's easy to see the "rhythm" > of the ticket activity: > > http://www.taniquetil.com.ar/facundo/py_tickets.html Looks good. :-) I noticed that there is a background of light blue between marks. That is hard to see on my computer because it is so close to the grey tone. Also shouldn't the light blue background bar extend all the way to the end for all open items? Cheers, Ron From janssen at parc.com Thu Sep 20 19:29:30 2007 From: janssen at parc.com (Bill Janssen) Date: Thu, 20 Sep 2007 10:29:30 PDT Subject: [Python-Dev] SSL module backport package ready for more testing Message-ID: <07Sep20.102937pdt."57996"@synergy1.parc.xerox.com> I've posted an sdist version of the 'ssl' module for Pythons 2.3.5 to 2.5.x, at http://www.parc.com/janssen/transient/ssl-1.3.tar.gz. I think this is 'gold master', but before I upload it to the Cheeseshop, I'd like to get more testing on a broader variety of platforms. The intent of this package is to allow code development with older versions of Python that will continue to work on Python 2.6 and 3.x. To build, python setup.py build To test, python setup.py test I'd appreciate feedback on testing results; please send to janssen at parc.com. Thanks! Bill From alex.neundorf at kitware.com Thu Sep 20 22:30:36 2007 From: alex.neundorf at kitware.com (Alexander Neundorf) Date: Thu, 20 Sep 2007 16:30:36 -0400 Subject: [Python-Dev] Building Python with CMake In-Reply-To: <200708301628.57127.alex.neundorf@kitware.com> References: <200707131359.17030.alex.neundorf@kitware.com> <200708301628.57127.alex.neundorf@kitware.com> Message-ID: <200709201630.36733.alex.neundorf@kitware.com> Hi, On Thursday 30 August 2007 16:28, Alexander Neundorf wrote: ... 
> The cmake files for building python are now in a cvs repository:
> http://www.cmake.org/cgi-bin/viewcvs.cgi/Utilities/CMakeBuildForPython/?root=ParaView3
>
> This is inside the ParaView3 repository:
> http://www.paraview.org/New/download.html
>
> I used them today to build Python from svn trunk.
>
> I'll add some documentation how to use them, how to get them and what works
> and what doesn't work tomorrow.

Ok, it took a bit longer.
The wiki page is here:
http://paraview.org/ParaView3/index.php/BuildingPythonWithCMake

With the cmake files from cvs you can build Python svn, which will become
Python 2.6.
I use it for Linux, IBM BlueGene/L and Cray XT3 (in both of the latter cases
for the compute nodes, not the front end nodes). It also works on Windows,
but I didn't take the time to check that all the configure checks deliver the
correct results, so I just reused the premade pyconfig.h there.

Most modules are built now. For every module you can select whether to build
it statically, dynamically, or not at all.
Source and binary packages can be created using "make packages".

These files don't conflict with any files in Python svn, so if somebody is
interested, adding them to Python svn shouldn't cause any problems.

Bye
Alex

P.S. due to moving I'll be mainly offline in the next weeks

From steven.bethard at gmail.com Thu Sep 20 22:58:54 2007
From: steven.bethard at gmail.com (Steven Bethard)
Date: Thu, 20 Sep 2007 14:58:54 -0600
Subject: [Python-Dev] Building Python with CMake
In-Reply-To: <200709201630.36733.alex.neundorf@kitware.com>
References: <200707131359.17030.alex.neundorf@kitware.com>
 <200708301628.57127.alex.neundorf@kitware.com>
 <200709201630.36733.alex.neundorf@kitware.com>
Message-ID:

On 9/20/07, Alexander Neundorf wrote:
> On Thursday 30 August 2007 16:28, Alexander Neundorf wrote:
> ...
> > The cmake files for building python are now in a cvs repository: > > http://www.cmake.org/cgi-bin/viewcvs.cgi/Utilities/CMakeBuildForPython/?roo > >t=ParaView3 Thanks for your work on this! That page seems to require a login. Any chance you could post it to something like:: http://wiki.python.org/moin/BuildingPythonWithCMake STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy From alex.neundorf at kitware.com Thu Sep 20 23:08:35 2007 From: alex.neundorf at kitware.com (Alexander Neundorf) Date: Thu, 20 Sep 2007 17:08:35 -0400 Subject: [Python-Dev] Building Python with CMake In-Reply-To: References: <200707131359.17030.alex.neundorf@kitware.com> <200709201630.36733.alex.neundorf@kitware.com> Message-ID: <200709201708.36006.alex.neundorf@kitware.com> On Thursday 20 September 2007 16:58, Steven Bethard wrote: > On 9/20/07, Alexander Neundorf wrote: > > On Thursday 30 August 2007 16:28, Alexander Neundorf wrote: > > ... > > > > > The cmake files for building python are now in a cvs repository: > > > http://www.cmake.org/cgi-bin/viewcvs.cgi/Utilities/CMakeBuildForPython/ > > >?roo t=ParaView3 > > Thanks for your work on this! That page seems to require a login. > Any chance you could post it to something like:: > > http://wiki.python.org/moin/BuildingPythonWithCMake I guess I need a login there too, so I put it somewhere where I already have one: http://www.cmake.org/Wiki/BuildingPythonWithCMake Alex From rrr at ronadam.com Fri Sep 21 00:03:19 2007 From: rrr at ronadam.com (Ron Adam) Date: Thu, 20 Sep 2007 17:03:19 -0500 Subject: [Python-Dev] Better unittest failures Message-ID: <46F2EE27.7020507@ronadam.com> The value of a unittest test is not in how well they pass, but in how well they fail. 

While looking at possibly helping with the str_uni branch when that was going
on, I found that in some cases unittest failure results can take a little bit
(or a lot) of work to figure out just what was failing, where and why.

While helping Eric test the new format function and class I came up with a
partial solution which may be a basis for further improvements. Eric told me
it did help quite a bit, so I think it's worth looking into.

Since we were running over a hundred different options over several different
implementations to make sure they all passed and failed in the same way, we
were using data based test cases so we could easily test the same data with
each version. Unfortunately that has the drawback that the traceback doesn't
show what data was used when testing exceptions. Additionally, when something
did fail it was not always obvious what was failing and why.

One of the conclusions I came to is that it would be better if tests did not
raise standard python exceptions unless the test itself has a problem. By
having tests raise special *Test_Only* exceptions, the output of the test
can be made very much clearer.

Here are the added Test_Only Exceptions. These would only be in the unittest
module to catch the following situations.

    Wrong_Result_Returned
    Unexpected_Exception_Raised
    No_Exception_Raised
    Wrong_Exception_Raised

And two new functions that use them.

    assertTestReturns(expect, test, message)
    assertTestRaises(expect, test, message)

These additions would not affect any existing tests. Using them requires the
code under test to be wrapped in a function with no arguments. The format is
the same for both assertTestReturns and assertTestRaises.

    for data in testdata:
        expect, a, b, c = data
        def test():
            return foo(a, b, c)
        assertTestReturns(expect, test, repr(data))

Replacing all existing tests with this form isn't reasonable, but adding this
as an option for those who want to use it is very easy to do.

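
For readers without the attached ut_test.py, here is a minimal sketch of how the
two helpers described above might look. It is not the attached implementation:
the base class TestOnlyFailure, the mixin name TestOnlyAssertions, the message
layout, and the body of foo() are assumptions made for illustration. The sketch
derives the test-only exceptions from AssertionError so that the runner reports
them as failures rather than errors.

```python
import unittest


class TestOnlyFailure(AssertionError):
    """Hypothetical base class for the test-only exceptions.

    Deriving from AssertionError (unittest's default failureException)
    makes the runner report these as failures, not errors.
    """

    def __init__(self, detail, reference):
        # Keep the offending value and the caller's reference data together
        # so the report shows which data set was being tested.
        super().__init__(f"{detail!r}\nReference: {reference}")


class Wrong_Result_Returned(TestOnlyFailure):
    """The wrapped code returned something other than the expected value."""


class Unexpected_Exception_Raised(TestOnlyFailure):
    """The wrapped code raised although a return value was expected."""


class No_Exception_Raised(TestOnlyFailure):
    """The wrapped code returned although an exception was expected."""


class Wrong_Exception_Raised(TestOnlyFailure):
    """The wrapped code raised a different exception than expected."""


class TestOnlyAssertions:
    """Mixin (assumed name) providing the two proposed assertion helpers."""

    def assertTestReturns(self, expect, test, message):
        try:
            result = test()
        except Exception as exc:
            raise Unexpected_Exception_Raised(exc, message) from exc
        if result != expect:
            raise Wrong_Result_Returned(result, message)

    def assertTestRaises(self, expect, test, message):
        try:
            result = test()
        except expect:
            return  # the expected exception type was raised
        except Exception as exc:
            raise Wrong_Exception_Raised(exc, message) from exc
        raise No_Exception_Raised(result, message)


def foo(a, b, c):
    # Stand-in for the code under test; the real foo() is not shown.
    return a + b + c


class FooTests(TestOnlyAssertions, unittest.TestCase):
    testdata = [(6, 1, 2, 3), (0, -1, 0, 1)]

    def test_returns(self):
        for data in self.testdata:
            expect, a, b, c = data

            def test():
                return foo(a, b, c)

            self.assertTestReturns(expect, test, repr(data))
```

Passing repr(data) as the message is what puts the failing data set into the
report, which is the information the plain assertRaises traceback lacks.
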
The test file I used to generate the following output is attached.

Cheers,
    Ron


###
#
#  Test output using standard assertEquals and assertRaises.
#

* The data has the form [(ref#, expect, args, kwds), ...]

* The ref# is there to help find the failing test in situations where you
  may have dozens of almost identical data. It's not required but helpful
  to have.

* I didn't include actual bad testcase tests in these examples, but if some
  generated exceptions similar to those of the failing tests, I think it
  could add a bit more confusion to the situation than the not too confusing
  example here.


$ python ut_test.py
EEFFFFFF
======================================================================
ERROR: test_A (__main__.test1_normal_failures)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "ut_test.py", line 100, in test_A
    result = some_function(*args, **kwds)
  File "ut_test.py", line 62, in some_function
    baz = kwds['baz']
KeyError: 'baz'

#
# This fails as a test "error" instead of a test "fail".
# What were args and kwds here?
#

======================================================================
ERROR: test_B (__main__.test1_normal_failures)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "ut_test.py", line 108, in test_B
    self.assertRaises(expect, test, args, kwds)
  File "unittest.py", line 320, in failUnlessRaises
    callableObj(*args, **kwargs)
  File "ut_test.py", line 107, in test
    return some_function(*args, **kwds)
  File "ut_test.py", line 62, in some_function
    baz = kwds['baz']
KeyError: 'baz'

#
# Same as above.  Fails as a test "error", unknown argument
# values for some_function().
# ====================================================================== FAIL: test_C (__main__.test1_normal_failures) ---------------------------------------------------------------------- Traceback (most recent call last): File "ut_test.py", line 114, in test_C self.assertRaises(expect, test, args, kwds) AssertionError: KeyError not raised # # What was args, and kwds values? # ====================================================================== FAIL: test_D (__main__.test1_normal_failures) ---------------------------------------------------------------------- Traceback (most recent call last): File "ut_test.py", line 120, in test_D repr((n, expect, args, kwds))) AssertionError: (8, ('Total baz:', 4), [1, 2], {'baz': 'Total baz:'}) # # This one is ok. # ### # # Test output using the added methods and test only exceptions with # the same test data. # * Test errors only occur on actual test "errors". * The reason for the fail is explained in all cases for test "fails". * The only time you get an actual python exception is when the test it self has a problem. Otherwise you get an test_exception that refers to the exception in the actual code. 
====================================================================== FAIL: test_A (__main__.test2_new_failures) ---------------------------------------------------------------------- Traceback (most recent call last): File "ut_test.py", line 131, in test_A repr((n, expect, args, kwds))) File "ut_test.py", line 36, in assertTestReturns result = test() File "ut_test.py", line 129, in test return some_function(*args, **kwds) File "ut_test.py", line 62, in some_function baz = kwds['baz'] Unexpected_Exception_Raised: KeyError('baz',) Reference: (2, ('Total baz:', 3), [1, 2], {'raz': 'Total baz:'}) ====================================================================== FAIL: test_B (__main__.test2_new_failures) ---------------------------------------------------------------------- Traceback (most recent call last): File "ut_test.py", line 138, in test_B repr((n, expect, args, kwds))) File "ut_test.py", line 45, in assertTestRaises result = test() File "ut_test.py", line 136, in test return some_function(*args, **kwds) File "ut_test.py", line 62, in some_function baz = kwds['baz'] Wrong_Exception_Raised: KeyError('baz',) Reference: (4, , [1, 2], {'raz': 'Total baz:'}) ====================================================================== FAIL: test_C (__main__.test2_new_failures) ---------------------------------------------------------------------- Traceback (most recent call last): File "ut_test.py", line 145, in test_C repr((n, expect, args, kwds))) File "ut_test.py", line 52, in assertTestRaises raise self.No_Exception_Raised(result, ref) No_Exception_Raised: returned -> ('Total baz:', 3) Reference: (6, , [1, 2], {'baz': 'Total baz:'}) ====================================================================== FAIL: test_D (__main__.test2_new_failures) ---------------------------------------------------------------------- Traceback (most recent call last): File "ut_test.py", line 152, in test_D repr((n, expect, args, kwds))) File "ut_test.py", line 41, in 
assertTestReturns raise self.Wrong_Result_Returned(result, ref) Wrong_Result_Returned: ('Total baz:', 3) Reference: (8, ('Total baz:', 4), [1, 2], {'baz': 'Total baz:'}) ---------------------------------------------------------------------- Ran 8 tests in 0.004s FAILED (failures=6, errors=2) -------------- next part -------------- A non-text attachment was scrubbed... Name: ut_test.py Type: text/x-python Size: 4850 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20070920/134d74ee/attachment.py From status at bugs.python.org Fri Sep 21 19:37:10 2007 From: status at bugs.python.org (Tracker) Date: Fri, 21 Sep 2007 17:37:10 +0000 (UTC) Subject: [Python-Dev] Summary of Tracker Issues Message-ID: <20070921173710.78B38782C1@psf.upfronthosting.co.za> ACTIVITY SUMMARY (09/14/07 - 09/21/07) Tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue number. Do NOT respond to this message. 1269 open (+12) / 11400 closed (+11) / 12669 total (+23) Open issues with patches: 412 Average duration of open issues: 678 days. Median duration of open issues: 655 days. Open Issues Breakdown open 1264 (+12) pending 5 ( +0) Issues Created Or Reopened (23) _______________________________ tp_print slots don't release the GIL 09/14/07 CLOSED http://bugs.python.org/issue1164 created arigo patch Should itertools.count work for arbitrary integers? 
09/15/07 http://bugs.python.org/issue1165 created eopadoan py3k NameError when calling malloc 09/15/07 CLOSED http://bugs.python.org/issue1166 created esr gdbm/ndbm 1.8.1+ needs libgdbm_compat.so 09/16/07 http://bugs.python.org/issue1167 created ikelly patch complex arithmetic: strange results with "imag" 09/16/07 CLOSED http://bugs.python.org/issue1168 created newman Option -OO doesn't remove docstrings from functions 09/16/07 CLOSED http://bugs.python.org/issue1169 created piro patch shlex have problems with parsing unicode 09/17/07 http://bugs.python.org/issue1170 created dexen allow subclassing of bytes type 09/17/07 http://bugs.python.org/issue1171 created mfenniak py3k, patch Documentation for done attribute of FieldStorage class 09/17/07 CLOSED http://bugs.python.org/issue1172 created bkline patch yield expressions not documented in Language Reference 09/17/07 CLOSED http://bugs.python.org/issue1173 created dangyogi new generator methods not documented in Library Reference 09/17/07 CLOSED http://bugs.python.org/issue1174 created dangyogi .readline() has bug WRT nonblocking files 09/18/07 CLOSED http://bugs.python.org/issue1175 created ajb str.split() takes no keyword arguments (Should this be expected? 09/18/07 CLOSED http://bugs.python.org/issue1176 created sergioc urllib* 20x responses not OK? 09/19/07 CLOSED http://bugs.python.org/issue1177 reopened jafo patch IDLE - add "paste code" functionality 09/18/07 http://bugs.python.org/issue1178 created taleinat patch [CVE-2007-4965] Integer overflow in imageop module 09/19/07 http://bugs.python.org/issue1179 created cartman Option to ignore or substitute ~/.pydistutils.cfg 09/19/07 http://bugs.python.org/issue1180 created hoffman Redefine clear() for os.environ to use unsetenv() if possible 09/19/07 CLOSED http://bugs.python.org/issue1181 created horcicka patch Paticular decimal mod operation wrongly output NaN. 
09/20/07 http://bugs.python.org/issue1182 created ocean-city race in SocketServer.ForkingMixIn.collect_children 09/20/07 http://bugs.python.org/issue1183 created dripton patch test fixes for immutable bytes change 09/20/07 http://bugs.python.org/issue1184 created hupp py3k, patch py3k: Completely remove nb_coerce slot 09/20/07 http://bugs.python.org/issue1185 created amaury.forgeotdarc patch optparse documentation: -- being collapsed to - in HTML 09/21/07 http://bugs.python.org/issue1186 created hoffman Issues Now Closed (29) ______________________ cgi: parse_qs and parse_qsl misbehave on empty strings 24 days http://bugs.python.org/issue1014 gvanrossum platform system may be Windows or Microsoft since Vista 17 days http://bugs.python.org/issue1082 p.lavarre at ieee.org patch split(None, maxsplit) does not strip whitespace correctly 12 days http://bugs.python.org/issue1123 nirs file.fileno and file.isatty() should be implementable by any fil 11 days http://bugs.python.org/issue1126 jafo Reference Manual: "for statement" links to "break statement" 10 days http://bugs.python.org/issue1131 georg.brandl re.sub returns str when processing empty unicode string 7 days http://bugs.python.org/issue1140 jafo reading large files 8 days http://bugs.python.org/issue1141 jafo fdopen does not work as expected 6 days http://bugs.python.org/issue1149 jafo Rename PyBUF_WRITEABLE to PyBUF_WRITABLE 6 days http://bugs.python.org/issue1150 jafo patch Suggested change to _exit function description in os module docu 2 days http://bugs.python.org/issue1156 jtonsing tp_print slots don't release the GIL 2 days http://bugs.python.org/issue1164 brett.cannon patch NameError when calling malloc 0 days http://bugs.python.org/issue1166 loewis complex arithmetic: strange results with "imag" 0 days http://bugs.python.org/issue1168 georg.brandl Option -OO doesn't remove docstrings from functions 3 days http://bugs.python.org/issue1169 georg.brandl patch Documentation for done attribute of 
FieldStorage class 1 days http://bugs.python.org/issue1172 jafo patch yield expressions not documented in Language Reference 0 days http://bugs.python.org/issue1173 georg.brandl new generator methods not documented in Library Reference 0 days http://bugs.python.org/issue1174 georg.brandl .readline() has bug WRT nonblocking files 1 days http://bugs.python.org/issue1175 gvanrossum str.split() takes no keyword arguments (Should this be expected? 2 days http://bugs.python.org/issue1176 sergioc urllib* 20x responses not OK? 0 days http://bugs.python.org/issue1177 facundobatista patch Redefine clear() for os.environ to use unsetenv() if possible 1 days http://bugs.python.org/issue1181 georg.brandl patch Need Windows os.link() support 2146 days http://bugs.python.org/issue478407 jafo patch long file name support broken in windows 1983 days http://bugs.python.org/issue542314 mhammond urllib2 raises exception with non-200 success codes. 1193 days http://bugs.python.org/issue971965 georg.brandl Optimizations for cgi.FieldStorage methods 400 days http://bugs.python.org/issue1541463 georg.brandl patch Reading with bz2.BZ2File() returns one garbage character 306 days http://bugs.python.org/issue1597011 jafo UnicodeError in compileall if "make install" is run before "make 154 days http://bugs.python.org/issue1704287 jafo patch Decimal and long hash, compatibly and efficiently 38 days http://bugs.python.org/issue1772851 facundobatista patch ctypes on Solaris 25 days http://bugs.python.org/issue1777530 theller Top Issues Most Discussed (10) ______________________________ 12 platform system may be Windows or Microsoft since Vista 17 days closed http://bugs.python.org/issue1082 9 [CVE-2007-4965] Integer overflow in imageop module 3 days open http://bugs.python.org/issue1179 9 tp_print slots don't release the GIL 2 days closed http://bugs.python.org/issue1164 7 Optimizations for cgi.FieldStorage methods 400 days closed http://bugs.python.org/issue1541463 7 urllib* 20x responses not 
OK? 0 days closed http://bugs.python.org/issue1177 6 .readline() has bug WRT nonblocking files 1 days closed http://bugs.python.org/issue1175 5 Documentation for done attribute of FieldStorage class 1 days closed http://bugs.python.org/issue1172 4 64/32-bit issue when unpickling random.Random 115 days open http://bugs.python.org/issue1727780 4 UnicodeError in compileall if "make install" is run before "mak 154 days closed http://bugs.python.org/issue1704287 4 Redefine clear() for os.environ to use unsetenv() if possible 1 days closed http://bugs.python.org/issue1181 From scav at blueyonder.co.uk Wed Sep 26 10:42:28 2007 From: scav at blueyonder.co.uk (scav at blueyonder.co.uk) Date: Wed, 26 Sep 2007 09:42:28 +0100 (BST) Subject: [Python-Dev] Python 3.0a documentation Message-ID: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82> I'd like to help out cleaning up the Python3.0 documentation. There are a lot of little leftovers from 2.x that are no longer true. (mentions of long, callable() etc.) Ideally (especially in the tutorial), we should only refer to 3.0 features and syntax, and keep the special cases and "other ways to do it" to a minimum. Before I dive in and start submitting patches, what does everyone else think? How much reference to previous python versions should be left in? Does it make sense to keep notes of the nature of "since version 2.3 ..." when there is an intentional discontinuity at 3.0? Peter Harris From guido at python.org Wed Sep 26 16:27:50 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 26 Sep 2007 07:27:50 -0700 Subject: [Python-Dev] Python 3.0a documentation In-Reply-To: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82> References: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82> Message-ID: I fully support removing all historic references from the 3.0 language manual. Please do help out! 
You can just start putting patches ("svn diff") into bugs.python.org; typically Georg gets to these very quickly. Do use subversion, not the distributed tarball (which was out of date by the time it was uploaded to python.org. :-). --Guido On 9/26/07, scav at blueyonder.co.uk wrote: > I'd like to help out cleaning up the Python3.0 documentation. There are a > lot of little leftovers from 2.x that are no longer true. (mentions of > long, callable() etc.) > > Ideally (especially in the tutorial), we should only refer to 3.0 features > and syntax, and keep the special cases and "other ways to do it" to a > minimum. > > Before I dive in and start submitting patches, what does everyone else > think? How much reference to previous python versions should be left in? > Does it make sense to keep notes of the nature of "since version 2.3 ..." > when there is an intentional discontinuity at 3.0? > > Peter Harris > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From phd at phd.pp.ru Wed Sep 26 17:24:08 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Wed, 26 Sep 2007 19:24:08 +0400 Subject: [Python-Dev] Iterating over objects of unknown length Message-ID: <20070926152408.GA24021@phd.pp.ru> Hello! (This seems like a "developing with Python" question and partially it is, but please read on.) I have a class that represents SQL queries. Instances of the class can be iterated over. As an SQL query doesn't know in advance if it will produce any rows, the class doesn't implement __len__(). Moreover, users of the class sometimes write if sqlQuery: for row in sqlQuery: ... 
else: # no rows which is a bug (the query doesn't know if it's True or False; to find out, the user has to execute the query by trying to iterate over it). To prevent users from writing such code the class implements __nonzero__() that always raises an exception. Unfortunately, I found that some libraries test the object in boolean context before iterating over it and that, of course, triggers the exception from __nonzero__(). Even worse, some libraries test the object in boolean context regardless of iterating over it. For example, the logging module (this is where my question becomes "developing for Python") triggers the exception in such a simple case: logging.debug("Query: %s", sqlQuery) Funny, the code logging.debug("Query: %s, another: %s", sqlQuery, another_value) doesn't trigger the exception. This is due to the code in logging/__init__.py: if args and (len(args) == 1) and args[0] and (type(args[0]) == types.DictType): args = args[0] (class LogRecord, method __init__). "and args[0]" triggers the exception. My questions are: 1. Should I consider this a bug in the logging module (and other libraries) and submit patches? 2. Or should I stop raising exceptions in __nonzero__()? In this particular case with logging the fix is simple - do "and args[0]" after the type check. Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From guido at python.org Wed Sep 26 18:29:10 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 26 Sep 2007 09:29:10 -0700 Subject: [Python-Dev] Iterating over objects of unknown length In-Reply-To: <20070926152408.GA24021@phd.pp.ru> References: <20070926152408.GA24021@phd.pp.ru> Message-ID: The logging code looks archaic: IMO it should be: if args and len(args) == 1 and isinstance(args[0], dict) and args[0]: But I also fail to see why you would be so draconian as to disallow truth testing of a query altogether. Your query looks like an iterator. 
There are tons of other iterators in the language, library and 3rd party code, and it would be madness to try to fix all of them in the way you suggest just because some users don't get the concept of iterators. So I'm for #1 *and* #2. --Guido On 9/26/07, Oleg Broytmann wrote: > Hello! > > (This seems like a "developing with Python" question and partially it is > but please read on.) > > I have a class that represents SQL queries. Instances of the class can > be iterated over. As an SQL query doesn't know in advance if it will > produce any row the class doesn't implement __len__(). Moreover, users of > the class sometimes write > > if sqlQuery: > for row in sqlQuery: ... > else: > # no rows > > which is a bug (the query doesn't know if it's True or False; to find it > out the user have to execute the query by trying to iterate over it). To > prevent users from writing such code the class implements __nonzero__() > that always raises an exception. > Unfortunately, I found some libraries test the object in boolean context > before iterating over it and that, of course, triggers the exception from > __nonzero__(). > Even worse, some libraries test the object in boolean context regardless > of iterating over it. For example, logging module (this is where my > question becomes "developing for Python") triggers the exception in such > simple case: > > logginig.debug("Query: %s", sqlQuery) > > Funny, the code > > logginig.debug("Query: %s, another: %s", sqlQuery, another_value) > > doesn't trigger the exception. This is due to the code in > logginig/__init__.py: > > if args and (len(args) == 1) and args[0] and (type(args[0]) == types.DictType): > args = args[0] > > (class LogRecord, method __init__). "and args[0]" triggers the exception. > > My questions are: > > 1. Should I consider this a bug in the logging module (and other libraries) > and submit patches? > 2. Or should I stop raising exceptions in __nonzero__()? 
> > In this particular case with logging the fix is simple - do "and args[0]" > after type check. > > Oleg. > -- > Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru > Programmers don't die, they just GOSUB without RETURN. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pje at telecommunity.com Wed Sep 26 18:33:33 2007 From: pje at telecommunity.com (Phillip J. Eby) Date: Wed, 26 Sep 2007 12:33:33 -0400 Subject: [Python-Dev] Iterating over objects of unknown length In-Reply-To: <20070926152408.GA24021@phd.pp.ru> References: <20070926152408.GA24021@phd.pp.ru> Message-ID: <20070926163100.5A1303A4045@sparrow.telecommunity.com> At 07:24 PM 9/26/2007 +0400, Oleg Broytmann wrote: >Hello! > > (This seems like a "developing with Python" question and partially it is >but please read on.) > > I have a class that represents SQL queries. Instances of the class can >be iterated over. As an SQL query doesn't know in advance if it will >produce any row the class doesn't implement __len__(). Moreover, users of >the class sometimes write > >if sqlQuery: > for row in sqlQuery: ... >else: > # no rows This isn't consistent with iterators; e.g.: >>> x=iter([]) >>> if x: print "yes" ... yes ISTM that you should be returning "True" from __nonzero__, since you don't implement len(). >1. Should I consider this a bug in the logging module (and other libraries) > and submit patches? >2. Or should I stop raising exceptions in __nonzero__()? #2 - Python objects should always be __nonzero__, unless they are empty containers, zeros, or otherwise specifically False. It's reasonable for libraries to expect that truth-testing an object is always safe. 
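The two suggestions in this thread can be made concrete with a minimal sketch. It is not the actual logging source: `Query`, `StrictQuery` and `unwrap_args` are hypothetical names made up for illustration, and the sketch assumes Python 3, where `__nonzero__` is spelled `__bool__`. It shows that an object defining neither `__len__` nor `__bool__` is simply true, and that running the type check before the truth test keeps the dict-unwrapping guard away from objects that refuse truth testing.

```python
class Query:
    """Hypothetical stand-in for an iterable SQL query (not a real API)."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __iter__(self):
        return iter(self._rows)


class StrictQuery(Query):
    """Variant that forbids truth testing, as in the original report."""

    def __bool__(self):  # spelled __nonzero__ in Python 2
        raise TypeError("truth value of a query is undefined")


def unwrap_args(args):
    """Return the lone non-empty dict argument if present, else args unchanged.

    The isinstance() check runs before the truth test, so only a real
    dict is ever evaluated in boolean context.
    """
    if args and len(args) == 1 and isinstance(args[0], dict) and args[0]:
        return args[0]
    return args
```

With this ordering, `unwrap_args((StrictQuery([]),))` hands the tuple back unchanged instead of raising, and `bool(Query([]))` is true even though the query yields no rows, which matches the `iter([])` example above.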
From skip at pobox.com Wed Sep 26 18:34:45 2007 From: skip at pobox.com (skip at pobox.com) Date: Wed, 26 Sep 2007 11:34:45 -0500 Subject: [Python-Dev] Python 3.0a documentation In-Reply-To: References: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82> Message-ID: <18170.35365.126292.41202@montanaro.dyndns.org> Guido> I fully support removing all historic references from the 3.0 Guido> language manual. By historic I assume you mean references to 2.x modules, classes, functions, etc which are no longer present. One thing I would suggest is that the more recent versionadded strings be kept. At the very least, if something is going to be new in 2.6, keep that. Maybe also keep the 2.5 versionadded references. Older references can probably be deleted. Skip From guido at python.org Wed Sep 26 18:37:17 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 26 Sep 2007 09:37:17 -0700 Subject: [Python-Dev] Python 3.0a documentation In-Reply-To: <18170.35365.126292.41202@montanaro.dyndns.org> References: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82> <18170.35365.126292.41202@montanaro.dyndns.org> Message-ID: On 9/26/07, skip at pobox.com wrote: > > Guido> I fully support removing all historic references from the 3.0 > Guido> language manual. > > By historic I assume you mean references to 2.x modules, classes, functions, > etc which are no longer present. One thing I would suggest is that the more > recent versionadded strings be kept. At the very least, if something is > going to be new in 2.6, keep that. Maybe also keep the 2.5 versionadded > references. Older references can probably be deleted. In the 2.x docs, all versionadded strings should stay. But IMO in the 3.0 docs we should get rid of them all. If you want compatibility information, look at the 2.6 docs (those should also mention things that are changing in 3.0). 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From phd at phd.pp.ru Wed Sep 26 18:37:28 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Wed, 26 Sep 2007 20:37:28 +0400 Subject: [Python-Dev] Iterating over objects of unknown length In-Reply-To: References: <20070926152408.GA24021@phd.pp.ru> Message-ID: <20070926163728.GB26579@phd.pp.ru> On Wed, Sep 26, 2007 at 09:29:10AM -0700, Guido van Rossum wrote: > But I also fail to see why you would be so draconian as to disallow > truth testing of a query altogether. Your query looks like an > iterator. There are tons of other iterators in the language, library > and 3rd party code, and it would be madness to try to fix all of them > in the way you suggest just because some users don't get the concept > of iterators. Seems me myself didn't get it: On Wed, Sep 26, 2007 at 12:33:33PM -0400, Phillip J. Eby wrote: > This isn't consistent with iterators; e.g.: > > >>> x=iter([]) > >>> if x: print "yes" > ... > yes On Wed, Sep 26, 2007 at 09:29:10AM -0700, Guido van Rossum wrote: > So I'm for #1 *and* #2. I see now. Thank you! Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From martin at v.loewis.de Wed Sep 26 20:14:09 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 26 Sep 2007 20:14:09 +0200 Subject: [Python-Dev] Python 3.0a documentation In-Reply-To: References: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82> <18170.35365.126292.41202@montanaro.dyndns.org> Message-ID: <46FAA171.2040104@v.loewis.de> > In the 2.x docs, all versionadded strings should stay. But IMO in the > 3.0 docs we should get rid of them all. If you want compatibility > information, look at the 2.6 docs (those should also mention things > that are changing in 3.0). I agree. 
People who target 3.x need to test anyway if they also want to support some 2.x version (if that is possible at all), so it does not help them to know what Python version introduced a certain feature they use. Regards, Martin From g.brandl at gmx.net Wed Sep 26 20:31:04 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 26 Sep 2007 20:31:04 +0200 Subject: [Python-Dev] Python 3.0a documentation In-Reply-To: <46FAA171.2040104@v.loewis.de> References: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82> <18170.35365.126292.41202@montanaro.dyndns.org> <46FAA171.2040104@v.loewis.de> Message-ID: Martin v. Löwis wrote: >> In the 2.x docs, all versionadded strings should stay. But IMO in the >> 3.0 docs we should get rid of them all. If you want compatibility >> information, look at the 2.6 docs (those should also mention things >> that are changing in 3.0). > > I agree. People who target 3.x need to test anyway if they also want > to support some 2.x version (if that is possible at all), so it does > not help them to know what Python version introduced a certain feature > they use. Also, it has already been done, and would be painful to undo :) Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From dinov at exchange.microsoft.com Wed Sep 26 22:23:58 2007 From: dinov at exchange.microsoft.com (Dino Viehland) Date: Wed, 26 Sep 2007 13:23:58 -0700 Subject: [Python-Dev] New lines, carriage returns, and Windows Message-ID: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> We ran into an interesting user-reported issue w/ IronPython and the way Python writes to files and I thought I'd get python-dev's opinion. 
When writing a string in text mode that contains \r\n we both write \r\r\n because the default write mode is to replace \n with \r\n. This works great as long as you stay within an entirely Python world. Because Python uses \n for everything internally you'll never end up writing out a \r\n that gets transformed into a \r\r\n. But when interoperating with other native code (or .NET code in our case) it's fairly easy to be exposed to a string which contains \r\n. Ultimately we see odd behavior when round tripping the contents of a multi-line text box through a file. So today users have to be aware of the fact that Python internally always uses \n. They also need to be aware of any APIs that they call that might return a string with an embedded \r\n inside of them and transform the string back into the Python version. It could be argued that there's little value in doing the simple transformation from \r\n -> \r\r\n. Ultimately that creates a file that has line endings which aren't good on any platform. On the other hand it could also be argued that Python defines new-lines as \n and there should be no deviation from that. And doing so could be considered a slippery slope, first file deals with it, and next the standard libraries, etc... Finally this might break some apps and if we changed IronPython to behave differently we could introduce incompatibilities which we don't want. So I'm curious: Is there a reason this behavior is useful that I'm missing? Would there be a possibility (or objections to) making \r\n be untransformed in the Py3k timeframe? Or should we just tell our users to open files in binary mode? 
:) From fuzzyman at voidspace.org.uk Wed Sep 26 22:42:09 2007 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 26 Sep 2007 21:42:09 +0100 Subject: [Python-Dev] [python] New lines, carriage returns, and Windows In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> Message-ID: <46FAC421.3020809@voidspace.org.uk> Dino Viehland wrote: > We ran into an interesting user-reported issue w/ IronPython and the way Python writes to files and I thought I'd get python-dev's opinion. > > When writing a string in text mode that contains \r\n we both write \r\r\n because the default write mode is to replace \n with \r\n. This works great as long as you stay within an entirely Python world. Because Python uses \n for everything internally you'll never end up writing out a \r\n that gets transformed into a \r\r\n. But when interoperating with other native code (or .NET code in our case) it's fairly easy to be exposed to a string which contains \r\n. Ultimately we see odd behavior when round tripping the contents of a multi-line text box through a file. > > So today users have to be aware of the fact that Python internally always uses \n. They also need to be aware of any APIs that they call that might return a string with an embedded \r\n inside of them and transform the string back into the Python version. > > It could be argued that there's little value in doing the simple transformation from \r\n -> \r\r\n. Ultimately that creates a file that has line endings which aren't good on any platform. On the other hand it could also be argued that Python defines new-lines as \n and there should be no deviation from that. And doing so could be considered a slippery slope, first file deals with it, and next the standard libraries, etc... 
Finally this might break some apps and if we changed IronPython to behave differently we could introduce incompatibilities which we don't want. > > So I'm curious: Is there a reason this behavior is useful that I'm missing? Would there be a possibility (or objections to) making \r\n be untransformed in the Py3k timeframe? Or should we just tell our users to open files in binary mode? :) > It is normal when working with Windows interaction in the Python world to be aware that you might receive strings with '\r\n' in and do the conversion yourself. We come across this a great deal with Resolver (when working with multi line text boxes for example) and quite happily replace '\r\n' with '\n' and vice versa as needed. As a developer who uses both Python and IronPython I say that this isn't a problem. I may be wrong or outvoted of course... Michael http://www.manning.com/foord > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > From martin at v.loewis.de Thu Sep 27 00:00:44 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 27 Sep 2007 00:00:44 +0200 Subject: [Python-Dev] New lines, carriage returns, and Windows In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> Message-ID: <46FAD68C.5030702@v.loewis.de> > This works great as long as you stay within an entirely Python world. > Because Python uses \n for everything internally I think you misunderstand fairly significantly how this all works together. Python does not use \n "for everything internally". Python is well capable of representing \r separately, and does so if you ask it to. 
> So I'm curious: Is there a reason this behavior is useful that I'm > missing? I think you are missing how it works in the first place (or else you failed to communicate to me what precise behavior you find puzzling). Regards, Martin From guido at python.org Thu Sep 27 00:04:26 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 26 Sep 2007 15:04:26 -0700 Subject: [Python-Dev] New lines, carriage returns, and Windows In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> Message-ID: On 9/26/07, Dino Viehland wrote: > We ran into an interesting user-reported issue w/ IronPython and the way Python writes to files and I thought I'd get python-dev's opinion. > > When writing a string in text mode that contains \r\n we both write \r\r\n because the default write mode is to replace \n with \r\n. This works great as long as you stay within an entirely Python world. Because Python uses \n for everything internally you'll never end up writing out a \r\n that gets transformed into a \r\r\n. But when interoperating with other native code (or .NET code in our case) it's fairly easy to be exposed to a string which contains \r\n. Ultimately we see odd behavior when round tripping the contents of a multi-line text box through a file. > > So today users have to be aware of the fact that Python internally always uses \n. They also need to be aware of any APIs that they call that might return a string with an embedded \r\n inside of them and transform the string back into the Python version. > > It could be argued that there's little value in doing the simple transformation from \r\n -> \r\r\n. Ultimately that creates a file that has line endings which aren't good on any platform. On the other hand it could also be argued that Python defines new-lines as \n and there should be no deviation from that. 
And doing so could be considered a slippery slope, first file deals with it, and next the standard libraries, etc... Finally this might break some apps and if we changed IronPython to behave differently we could introduce incompatibilities which we don't want. > > So I'm curious: Is there a reason this behavior is useful that I'm missing? No, it is simply the way Microsoft's C stdio library works. :-( A simple workaround would be to apply s.replace('\r', '') before writing anything of course. > Would there be a possibility (or objections to) making \r\n be untransformed in the Py3k timeframe? Or should we just tell our users to open files in binary mode? :) Py3k supports a number of different ways of working with newlines for text files, but not (yet) one that leaves \r\n alone while translating a lone \n into \r\n. It may not be too late to change that though. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From dinov at exchange.microsoft.com Thu Sep 27 00:09:04 2007 From: dinov at exchange.microsoft.com (Dino Viehland) Date: Wed, 26 Sep 2007 15:09:04 -0700 Subject: [Python-Dev] New lines, carriage returns, and Windows In-Reply-To: <46FAD68C.5030702@v.loewis.de> References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FAD68C.5030702@v.loewis.de> Message-ID: <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com> My understanding is that users can write code that uses only \n and Python will write the end-of-line character(s) that are appropriate for the platform when writing to a file. That's what I meant by uses \n for everything internally. But if you write \r\n to a file Python completely ignores the presence of the \r and transforms the \n into a \r\n anyway, hence the \r\r in the resulting stream. My last question is simply does anyone find writing \r\r\n when the original string contained \r\n a useful behavior - personally I don't see how it is. 
But Guido's response makes this sound like it's a problem w/ VC++ stdio implementation and not something that Python is explicitly doing. Anyway, it might be useful to have a text-mode file that you can write \r\n to and only get \r\n in the resulting file. But if the general sentiment is s.replace('\r', '') is the way to go we can advise our users of the behavior when interoperating w/ APIs that return \r\n in strings. -----Original Message----- From: "Martin v. Löwis" [mailto:martin at v.loewis.de] Sent: Wednesday, September 26, 2007 3:01 PM To: Dino Viehland Cc: python-dev at python.org Subject: Re: [Python-Dev] New lines, carriage returns, and Windows > This works great as long as you stay within an entirely Python world. > Because Python uses \n for everything internally I think you misunderstand fairly significantly how this all works together. Python does not use \n "for everything internally". Python is well capable of representing \r separately, and does so if you ask it to. > So I'm curious: Is there a reason this behavior is useful that I'm > missing? I think you are missing how it works in the first place (or else you failed to communicate to me what precise behavior you find puzzling). 
Regards, Martin From fuzzyman at voidspace.org.uk Thu Sep 27 00:14:58 2007 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 26 Sep 2007 23:14:58 +0100 Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com> References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FAD68C.5030702@v.loewis.de> <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com> Message-ID: <46FAD9E2.6000103@voidspace.org.uk> Dino Viehland wrote: > My understanding is that users can write code that uses only \n and Python will write the end-of-line character(s) that are appropriate for the platform when writing to a file. That's what I meant by uses \n for everything internally. > > But if you write \r\n to a file Python completely ignores the presence of the \r and transforms the \n into a \r\n anyway, hence the \r\r in the resulting stream. My last question is simply does anyone find writing \r\r\n when the original string contained \r\n a useful behavior - personally I don't see how it is. > > But Guido's response makes this sound like it's a problem w/ VC++ stdio implementation and not something that Python is explicitly doing. Anyway, it'd might be useful to have a text-mode file that you can write \r\n to and only get \r\n in the resulting file. But if the general sentiment is s.replace('\r', '') is the way to go we can advice our users of the behavior when interoperating w/ APIs that return \r\n in strings. > We always do replace('\r\n','\n') but same difference... Michael > > -----Original Message----- > From: "Martin v. 
Löwis" [mailto:martin at v.loewis.de] > Sent: Wednesday, September 26, 2007 3:01 PM > To: Dino Viehland > Cc: python-dev at python.org > Subject: Re: [Python-Dev] New lines, carriage returns, and Windows > > >> This works great as long as you stay within an entirely Python world. >> Because Python uses \n for everything internally >> > > I think you misunderstand fairly significantly how this all works > together. Python does not use \n "for everything internally". Python > is well capable of representing \r separately, and does so if you > ask it to. > > >> So I'm curious: Is there a reason this behavior is useful that I'm >> missing? >> > > I think you are missing how it works in the first place (or else > you failed to communicate to me what precise behavior you find > puzzling). > > Regards, > Martin > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > From dinov at exchange.microsoft.com Thu Sep 27 00:17:01 2007 From: dinov at exchange.microsoft.com (Dino Viehland) Date: Wed, 26 Sep 2007 15:17:01 -0700 Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows In-Reply-To: <46FAD9E2.6000103@voidspace.org.uk> References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FAD68C.5030702@v.loewis.de> <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FAD9E2.6000103@voidspace.org.uk> Message-ID: <7AD436E4270DD54A94238001769C2227CCBD18CBBE@DF-GRTDANE-MSG.exchange.corp.microsoft.com> And if this is fine for you, given that you may have the largest WinForms / IronPython code base, I tend to think the replace may be reasonable. But we have had someone get surprised by this behavior. 
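The doubled carriage return discussed in this thread can be reproduced without touching a real file. What follows is a small sketch that assumes Python 3's `io` module, where the `newline` argument of `io.TextIOWrapper` controls translation on write; the helper name `write_text` is made up for illustration. Passing `newline="\r\n"` mimics the Windows text-mode default on any platform.

```python
import io


def write_text(text, newline):
    """Encode text through a text-mode wrapper and return the raw bytes."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # the bytes are already copied; drop the wrapper cleanly
    return data


text = "one\r\ntwo\n"  # mixed endings, e.g. text taken from a Windows text box

# Every "\n" becomes "\r\n", including the one that already follows a "\r".
doubled = write_text(text, newline="\r\n")

# Normalising to "\n" first yields clean "\r\n" endings.
clean = write_text(text.replace("\r\n", "\n"), newline="\r\n")

# newline="" disables translation, so the text is written exactly as given.
untouched = write_text(text, newline="")
```

Here `doubled` contains `\r\r\n` after "one", `clean` contains a single `\r\n` after each word, and `untouched` is byte-for-byte the input, which is roughly the "leave \r\n alone" mode asked about earlier in the thread when the caller supplies all line endings itself.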
-----Original Message----- From: Michael Foord [mailto:fuzzyman at voidspace.org.uk] Sent: Wednesday, September 26, 2007 3:15 PM To: Dino Viehland Cc: python-dev at python.org Subject: Re: [python] Re: [Python-Dev] New lines, carriage returns, and Windows Dino Viehland wrote: > My understanding is that users can write code that uses only \n and Python will write the end-of-line character(s) that are appropriate for the platform when writing to a file. That's what I meant by uses \n for everything internally. > > But if you write \r\n to a file Python completely ignores the presence of the \r and transforms the \n into a \r\n anyway, hence the \r\r in the resulting stream. My last question is simply does anyone find writing \r\r\n when the original string contained \r\n a useful behavior - personally I don't see how it is. > > But Guido's response makes this sound like it's a problem w/ VC++ stdio implementation and not something that Python is explicitly doing. Anyway, it'd might be useful to have a text-mode file that you can write \r\n to and only get \r\n in the resulting file. But if the general sentiment is s.replace('\r', '') is the way to go we can advice our users of the behavior when interoperating w/ APIs that return \r\n in strings. > We always do replace('\r\n','\n') but same difference... Michael > > -----Original Message----- > From: "Martin v. Löwis" [mailto:martin at v.loewis.de] > Sent: Wednesday, September 26, 2007 3:01 PM > To: Dino Viehland > Cc: python-dev at python.org > Subject: Re: [Python-Dev] New lines, carriage returns, and Windows > > >> This works great as long as you stay within an entirely Python world. >> Because Python uses \n for everything internally >> > > I think you misunderstand fairly significantly how this all works > together. Python does not use \n "for everything internally". Python > is well capable of representing \r separately, and does so if you > ask it to. 
> > >> So I'm curious: Is there a reason this behavior is useful that I'm >> missing? >> > > I think you are missing how it works in the first place (or else > you failed to communicate to me what precise behavior you find > puzzling). > > Regards, > Martin > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > From fuzzyman at voidspace.org.uk Thu Sep 27 00:22:52 2007 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 26 Sep 2007 23:22:52 +0100 Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CBBE@DF-GRTDANE-MSG.exchange.corp.microsoft.com> References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FAD68C.5030702@v.loewis.de> <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FAD9E2.6000103@voidspace.org.uk> <7AD436E4270DD54A94238001769C2227CCBD18CBBE@DF-GRTDANE-MSG.exchange.corp.microsoft.com> Message-ID: <46FADBBC.5070407@voidspace.org.uk> Dino Viehland wrote: > And if this is fine for you, given that you may have the largest WinForms / IronPython code base, I tend to think the replace may be reasonable. But we have had someone get surprised by this behavior. > It is a slight impedance mismatch between Python and Windows - but isn't restricted to IronPython, so changing Python semantics doesn't seem like the right answer. Alternatively a more intelligent text mode (that writes '\n' as '\r\n' and '\r\n' as '\r\n' on Windows) doesn't sound like *such* a bad idea - but you will still get caught out by this. A string read in text mode will read '\r\n' as '\n'. Setting this on a winforms component will still do the wrong thing. Better to be aware of the difference and use binary mode. 
Michael > -----Original Message----- > From: Michael Foord [mailto:fuzzyman at voidspace.org.uk] > Sent: Wednesday, September 26, 2007 3:15 PM > To: Dino Viehland > Cc: python-dev at python.org > Subject: Re: [python] Re: [Python-Dev] New lines, carriage returns, and Windows > > Dino Viehland wrote: > >> My understanding is that users can write code that uses only \n and Python will write the end-of-line character(s) that are appropriate for the platform when writing to a file. That's what I meant by uses \n for everything internally. >> >> But if you write \r\n to a file Python completely ignores the presence of the \r and transforms the \n into a \r\n anyway, hence the \r\r in the resulting stream. My last question is simply does anyone find writing \r\r\n when the original string contained \r\n a useful behavior - personally I don't see how it is. >> >> But Guido's response makes this sound like it's a problem w/ VC++ stdio implementation and not something that Python is explicitly doing. Anyway, it'd might be useful to have a text-mode file that you can write \r\n to and only get \r\n in the resulting file. But if the general sentiment is s.replace('\r', '') is the way to go we can advice our users of the behavior when interoperating w/ APIs that return \r\n in strings. >> >> > > We always do replace('\r\n','\n') but same difference... > > Michael > > >> -----Original Message----- >> From: "Martin v. L?wis" [mailto:martin at v.loewis.de] >> Sent: Wednesday, September 26, 2007 3:01 PM >> To: Dino Viehland >> Cc: python-dev at python.org >> Subject: Re: [Python-Dev] New lines, carriage returns, and Windows >> >> >> >>> This works great as long as you stay within an entirely Python world. >>> Because Python uses \n for everything internally >>> >>> >> I think you misunderstand fairly significantly how this all works >> together. Python does not use \n "for everything internally". 
Python
>> is well capable of representing \r separately, and does so if you
>> ask it to.
>>
>>> So I'm curious: Is there a reason this behavior is useful that I'm
>>> missing?
>>
>> I think you are missing how it works in the first place (or else
>> you failed to communicate to me what precise behavior you find
>> puzzling).
>>
>> Regards,
>> Martin

From jimjjewett at gmail.com Thu Sep 27 02:35:36 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 26 Sep 2007 20:35:36 -0400
Subject: [Python-Dev] urllib exception compatibility
Message-ID:

urllib goes to some trouble to ensure that it raises IOError, even when the underlying exception comes from another module.[*] I'm wondering if it would make sense to just have those modules' exceptions inherit from IOError.

In particular, should socket.error, ftp.Error and httplib.HTTPException (used in Py3K) inherit from IOError?

I'm also wondering whether it would be acceptable to change the details of the exceptions. For example, could

    raise IOError, ('ftp error', msg), sys.exc_info()[2]

be reworded, or is there too much risk that someone is checking for an "errno" of 'ftp error'?

[*] This isn't a heavily tested path; some actually fail with a TypeError since 2.5, because IOError no longer accepts argument tuples longer than 3. http://bugs.python.org/issue1209 Fortunately, this makes me less worried about changing the contents of the specific attributes to something more useful...
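
A minimal sketch of the inheritance idea above, written in present-day Python 3 syntax rather than the 2.5 syntax of the thread. FTPError and fetch are hypothetical stand-ins, not names from ftplib or urllib; the only point is that a caller catching IOError also catches a subclass, so no re-wrapping is needed.

```python
class FTPError(IOError):
    """Hypothetical library error that inherits from IOError."""


def fetch(url):
    # Hypothetical transfer function; it always fails to keep the sketch short.
    raise FTPError("ftp error", "550 %s: no such file" % url)


def try_fetch(url):
    try:
        fetch(url)
    except IOError as exc:
        # The subclass is caught here without any re-wrapping.
        return type(exc).__name__, exc.args
    return None


result = try_fetch("ftp://example.invalid/missing")
print(result)
```

Code that inspects the first argument would still see 'ftp error' there, which is the compatibility risk the message asks about.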
-jJ From greg.ewing at canterbury.ac.nz Thu Sep 27 03:33:47 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 27 Sep 2007 13:33:47 +1200 Subject: [Python-Dev] Iterating over objects of unknown length In-Reply-To: <20070926152408.GA24021@phd.pp.ru> References: <20070926152408.GA24021@phd.pp.ru> Message-ID: <46FB087B.8080107@canterbury.ac.nz> Oleg Broytmann wrote: > Hello! > > (This seems like a "developing with Python" question and partially it is > but please read on.) > > I have a class that represents SQL queries. Instances of the class can > be iterated over. ... users of > the class sometimes write > > if sqlQuery: > for row in sqlQuery: ... > else: > # no rows > > To prevent users from writing such code the class implements __nonzero__() > that always raises an exception. I'm not sure I like that idea. It's common practice to write 'if x:' as a shorthand for 'if x is not None:' when it's known that x is an object that doesn't have a notion of emptiness. A __nonzero__ that always raises an exception just to spite you interferes with that. Another thing is that any code doing "if x" to test for emptiness is clearly expecting x to be a sequence, *not* an iterator, and you've violated the contract by passing it one. This is what you may be running into with the libraries you mention. Generally I think it's a bad idea to try to protect people from themselves when doing so can interfere with legitimate usage. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From guido at python.org Thu Sep 27 03:52:45 2007 From: guido at python.org (Guido van Rossum) Date: Wed, 26 Sep 2007 18:52:45 -0700 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: References: Message-ID: Shouldn't these all inherit from EnvironmentError? 
Or should EnvironmentError and IOError be the same thing perhaps? --Guido On 9/26/07, Jim Jewett wrote: > urllib goes to goes to some trouble to ensure that it raises IOError, > even when the underlying exception comes from another module.[*] I'm > wondering if it would make sense to just have those modules' > exceptions inherit from IOError. > > In particular, should socket.error, ftp.Error and > httplib.HTTPException (used in Py3K) inherit from IOError? > > I'm also wondering whether it would be acceptable to change the > details of the exceptions. For example, could > > raise IOError, ('ftp error', msg), sys.exc_info()[2] > > be reworded, or is there there too much risk that someone is checking > for an "errno" of 'ftp error'? > > > [*] This isn't a heavily tested path; some actually fail with a > TypeError since 2.5, because IOError no longer accepts argument tuples > longer than 3. http://bugs.python.org/issue1209 Fortunately, this > makes me less worried about changing the contents of the specific > attributes to something more useful... > > -jJ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From phd at phd.pp.ru Thu Sep 27 04:04:08 2007 From: phd at phd.pp.ru (Oleg Broytmann) Date: Thu, 27 Sep 2007 06:04:08 +0400 Subject: [Python-Dev] Iterating over objects of unknown length In-Reply-To: <46FB087B.8080107@canterbury.ac.nz> References: <20070926152408.GA24021@phd.pp.ru> <46FB087B.8080107@canterbury.ac.nz> Message-ID: <20070927020408.GC4287@phd.pp.ru> On Thu, Sep 27, 2007 at 01:33:47PM +1200, Greg Ewing wrote: > Oleg Broytmann wrote: > >if sqlQuery: > > for row in sqlQuery: ... 
> >else:
> >    # no rows
> >
> >To prevent users from writing such code the class implements __nonzero__()
> >that always raises an exception.
>
> I'm not sure I like that idea. It's common practice to write
> 'if x:' as a shorthand for 'if x is not None:' when it's known
> that x is an object that doesn't have a notion of emptiness.

> Another thing is that any code doing "if x" to test for
> emptiness is clearly expecting x to be a sequence, *not*
> an iterator, and you've violated the contract by passing
> it one. This is what you may be running into with the libraries
> you mention.

In most cases the code in those libraries is, to use Mr. van Rossum's word, "archaic". It was developed for old versions of Python (long before Python got the iterator protocol). I will file bug reports and patches (I have filed one about logging/__init__.py) to allow developers to either fix the code or document the fact that the code really requires a finite sequence.

Unfortunately, now that my code no longer raises an exception, it will be harder to spot the buggy libraries.

> Generally I think it's a bad idea to try to protect people
> from themselves when doing so can interfere with legitimate
> usage.

I agree. I admitted on the mailing list that it was my design mistake. The offending __nonzero__ was removed from SVN today.

Oleg.
-- 
Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru
Programmers don't die, they just GOSUB without RETURN.
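
A small sketch of the point under discussion, assuming Python 3 (where the hook is spelled __bool__ rather than __nonzero__). Query is a hypothetical iterable, not the SQL class from the thread. An object that defines neither __bool__ nor __len__ is always true, so "if query:" says nothing about emptiness; counting while iterating does.

```python
class Query:
    """Hypothetical iterable with no __len__ and no __bool__."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __iter__(self):
        return iter(self._rows)


def summarize(query):
    # Count rows while iterating instead of testing the object's truth value.
    count = 0
    for row in query:
        count += 1
    return "no rows" if count == 0 else "%d row(s)" % count


empty = Query([])
truthy = bool(empty)  # default truth value, unrelated to emptiness
summary_empty = summarize(empty)
summary_two = summarize(Query(["a", "b"]))
print(truthy, summary_empty, summary_two)
```

The same loop works unchanged for a real iterator, which is why it is a safer pattern for library code than a truth test.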
From greg.ewing at canterbury.ac.nz Thu Sep 27 04:04:13 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 27 Sep 2007 14:04:13 +1200 Subject: [Python-Dev] New lines, carriage returns, and Windows In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> Message-ID: <46FB0F9D.6010303@canterbury.ac.nz> Dino Viehland wrote: > When writing a string in text mode that contains \r\n we both write \r\r\n Maybe there should be a universal newlines mode defined for output as well as input, which translates any of "\r", "\n" or "\r\n" into the platform line ending. Although I suspect that a string containing "\r\n" is going to cause more problems for Python applications than this. E.g. consider what happens when you try to split a string on newlines. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From skip at pobox.com Thu Sep 27 04:46:07 2007 From: skip at pobox.com (skip at pobox.com) Date: Wed, 26 Sep 2007 21:46:07 -0500 Subject: [Python-Dev] New lines, carriage returns, and Windows In-Reply-To: <46FB0F9D.6010303@canterbury.ac.nz> References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FB0F9D.6010303@canterbury.ac.nz> Message-ID: <18171.6511.526695.684154@montanaro.dyndns.org> Greg> Maybe there should be a universal newlines mode defined for output Greg> as well as input, which translates any of "\r", "\n" or "\r\n" Greg> into the platform line ending. 
I thought that's the way it was supposed to work, but it clearly doesn't:

>>> f = open("test.txt", "wt")
>>> f.write("a\rb\rnc\n")
7
>>> f.close()
>>> open("test.txt", "rb").read()
b'a\rb\rnc\n'

I'd be open to such a change. Principle of least surprise?

Skip

From greg.ewing at canterbury.ac.nz Thu Sep 27 04:54:51 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 27 Sep 2007 14:54:51 +1200
Subject: [Python-Dev] urllib exception compatibility
In-Reply-To:
References:
Message-ID: <46FB1B7B.1020001@canterbury.ac.nz>

Jim Jewett wrote:
> In particular, should socket.error, ftp.Error and
> httplib.HTTPException (used in Py3K) inherit from IOError?

I'd say that if they incorporate a C library result code they should inherit from IOError, or if they incorporate a system call result code they should inherit from OSError. Otherwise they should inherit from EnvironmentError.

I don't think there's any point in simply catching one of these and re-wrapping it in the library's own exception class, but if such wrapping is done, it should inherit from EnvironmentError as well.

It's convenient to be able to catch EnvironmentError and get anything that is caused by circumstances outside the program's control.

-- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.)
| greg.ewing at canterbury.ac.nz +--------------------------------------+

From martin at v.loewis.de Thu Sep 27 07:24:41 2007
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Thu, 27 Sep 2007 07:24:41 +0200
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FAD68C.5030702@v.loewis.de> <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com>
Message-ID: <46FB3E99.9030100@v.loewis.de>

> My understanding is that users can write code that uses only \n and
> Python will write the end-of-line character(s) that are appropriate
> for the platform when writing to a file. That's what I meant by uses
> \n for everything internally.

Here you misunderstand - that's only the case when the file is opened in text mode. If the file is opened in binary mode, and you write \n, then it writes just a single byte (0xA).

> But if you write \r\n to a file Python completely ignores the
> presence of the \r and transforms the \n into a \r\n anyway, hence
> the \r\r in the resulting stream. My last question is simply does
> anyone find writing \r\r\n when the original string contained \r\n a
> useful behavior - personally I don't see how it is.

That's just for consistency.

> But Guido's response makes this sound like it's a problem w/ VC++
> stdio implementation and not something that Python is explicitly
> doing.

That's correct - it's the notion of "text mode" for file IO.

> Anyway, it'd might be useful to have a text-mode file that
> you can write \r\n to and only get \r\n in the resulting file.

This I don't understand. Why don't you just use binary mode then? At least for Python 2.x, the *only* difference between text and binary mode is the treatment of line endings.
For Python 3, things will be different as the distinction goes further; the precise API for line endings is still being considered there. Regards, Martin From guido at python.org Thu Sep 27 16:32:44 2007 From: guido at python.org (Guido van Rossum) Date: Thu, 27 Sep 2007 07:32:44 -0700 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: <46FB1B7B.1020001@canterbury.ac.nz> References: <46FB1B7B.1020001@canterbury.ac.nz> Message-ID: How about making IOError, OSError and EnvironmentError all aliases for the same thing? The distinction is really worthless historical baggage. On 9/26/07, Greg Ewing wrote: > Jim Jewett wrote: > > In particular, should socket.error, ftp.Error and > > httplib.HTTPException (used in Py3K) inherit from IOError? > > I'd say that if they incorporate a C library result code they > should inherit from IOError, or if they incorporate a system > call result code they should inherit from OSError. Otherwise > they should inherit from EnvironmentError. > > I don't think there's any point in simply catching one of > these and re-wrapping it in the library's own exeption > class, but if such wrapping is done, it should inherit > from EnvironmentError as well. > > It's convenient to be able to catch EnvironmentError and > get anything that is caused by circumstances outside the > program's control. > > -- > Greg Ewing, Computer Science Dept, +--------------------------------------+ > University of Canterbury, | Carpe post meridiem! | > Christchurch, New Zealand | (I'm not a morning person.) 
| > greg.ewing at canterbury.ac.nz +--------------------------------------+ > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From dinov at exchange.microsoft.com Thu Sep 27 18:49:07 2007 From: dinov at exchange.microsoft.com (Dino Viehland) Date: Thu, 27 Sep 2007 09:49:07 -0700 Subject: [Python-Dev] New lines, carriage returns, and Windows In-Reply-To: <46FB3E99.9030100@v.loewis.de> References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FAD68C.5030702@v.loewis.de> <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FB3E99.9030100@v.loewis.de> Message-ID: <7AD436E4270DD54A94238001769C2227CCBD18CD3B@DF-GRTDANE-MSG.exchange.corp.microsoft.com> > This I don't understand. Why don't you just use binary mode then? > At least for Python 2.x, the *only* difference between text and > binary mode is the treatment of line endings. That just flips the problem to the other side. Now if I have a Python library that I'm mixing w/ .NET code I need to be sure to transform the line endings, but now from \n -> \r\n (and hopefully they'd detect the new-line style they should use so it works correctly on Mono on *nix or Silverlight on OS X as well). 
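
A sketch that reproduces the doubled \r on any platform, assuming Python 3 and the newline argument of its built-in open(). Passing newline="\r\n" forces Windows-style translation on output, and newline="" turns translation off. The helper write_text is hypothetical and exists only for this illustration.

```python
import os
import tempfile


def write_text(data, newline):
    """Write data in text mode with the given newline setting; return raw bytes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", newline=newline) as f:
            f.write(data)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


text = "a\r\nb\n"

# Every "\n" becomes "\r\n", including the one that already follows a "\r".
doubled = write_text(text, "\r\n")

# Normalizing to "\n" first, as suggested in the thread, avoids the extra "\r".
normalized = write_text(text.replace("\r\n", "\n"), "\r\n")

# With translation disabled the characters are written exactly as given.
untouched = write_text(text, "")

print(doubled, normalized, untouched)
```

Normalizing on the way in and letting the file layer translate on the way out keeps "\n" as the only line ending inside the program, at the cost of a conversion at each boundary with code that expects "\r\n".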
From guido at python.org Thu Sep 27 19:35:18 2007 From: guido at python.org (Guido van Rossum) Date: Thu, 27 Sep 2007 10:35:18 -0700 Subject: [Python-Dev] New lines, carriage returns, and Windows In-Reply-To: <18171.6511.526695.684154@montanaro.dyndns.org> References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FB0F9D.6010303@canterbury.ac.nz> <18171.6511.526695.684154@montanaro.dyndns.org> Message-ID: [moving to python-3000] The symmetry isn't as strong as you suggest, but I agree it would be a useful feature. Would you mind filing a Py3k feature request so we don't forget? A proposal for an API given the existing newlines=... parameter (described in detail in PEP 3116) would be even better. And a patch would be best, but you know that. :-) --Guido On 9/26/07, skip at pobox.com wrote: > > Greg> Maybe there should be a universal newlines mode defined for output > Greg> as well as input, which translates any of "\r", "\n" or "\r\n" > Greg> into the platform line ending. > > I thought that's the way it was supposed to work, but it clearly doesn't: > > >>> f = open("test.txt", "wt") > >>> f.write("a\rb\rnc\n") > 7 > >>> f.close() > >>> open("test.txt", "rb").read() > b'a\rb\rnc\n' > > I'd be open to such a change. Principle of least surprise? > > Skip > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg at krypto.org Thu Sep 27 20:09:45 2007 From: greg at krypto.org (Gregory P. 
Smith) Date: Thu, 27 Sep 2007 11:09:45 -0700 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: References: <46FB1B7B.1020001@canterbury.ac.nz> Message-ID: <52dc1c820709271109j7d95b63chad74d2b615dd792e@mail.gmail.com> On 9/27/07, Guido van Rossum wrote: > > How about making IOError, OSError and EnvironmentError all aliases for > the same thing? The distinction is really worthless historical > baggage. > +1 on that. -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070927/5ba4dd70/attachment.htm From brett at python.org Thu Sep 27 22:23:58 2007 From: brett at python.org (Brett Cannon) Date: Thu, 27 Sep 2007 13:23:58 -0700 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: References: <46FB1B7B.1020001@canterbury.ac.nz> Message-ID: On 9/27/07, Guido van Rossum wrote: > How about making IOError, OSError and EnvironmentError all aliases for > the same thing? The distinction is really worthless historical > baggage. > +1 from me. Should OSError and IOError become aliases to EnvironmentError? I assume WindowsError and VMSError will just directly subclass which ever exception sticks around. And should we bother with a PendingDeprecationWarning for IOError or OSError? Or just have a Py3K warning for them and not worry about their removal in the 2.x series and just let 2to3 handle the transition? -Brett From dalcinl at gmail.com Fri Sep 28 00:18:09 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 27 Sep 2007 19:18:09 -0300 Subject: [Python-Dev] building with -Wwrite-strings Message-ID: I'm trying to build Python (2.6) with GCC the option -Wwrite-strings. 1 - Is there any interest on this? 2 - What should I do for the very common (taken from int_new): static char *kwlist[] = {"x", "base", 0}; I was able to remove all the warning in Objects/*, except those related to (2). 
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From graham.horler at gmail.com Fri Sep 28 00:35:36 2007
From: graham.horler at gmail.com (Graham Horler)
Date: Thu, 27 Sep 2007 23:35:36 +0100
Subject: [Python-Dev] urllib exception compatibility
References: <46FB1B7B.1020001@canterbury.ac.nz>
Message-ID: <63jqje$2dsube@venus.eclipse.kcom.com>

On 27 Sep 2007, 21:23:58, Brett Cannon wrote:
> Should OSError and IOError become aliases to EnvironmentError? I
> assume WindowsError and VMSError will just directly subclass which
> ever exception sticks around.
>
> And should we bother with a PendingDeprecationWarning for IOError or
> OSError? Or just have a Py3K warning for them and not worry about
> their removal in the 2.x series and just let 2to3 handle the
> transition?

Am I missing something, as I thought Py3K was supposed to throw backwards compatibility to the wind in favor of doing the "Right Thing"?

If so, can't we lose the proposed OSError and IOError aliases altogether, and just keep EnvironmentError?

Perhaps "EnvironmentError" is a bit long to type in all the places OSError and IOError are used; I personally like the look of OSError and IOError better in my code. I vote for a shorter name for EnvironmentError, e.g. EMError.
just my 2c, Graham > > -Brett > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/graham.horler%40gmail.com From guido at python.org Fri Sep 28 00:59:12 2007 From: guido at python.org (Guido van Rossum) Date: Thu, 27 Sep 2007 15:59:12 -0700 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: <63jqje$2dsube@venus.eclipse.kcom.com> References: <46FB1B7B.1020001@canterbury.ac.nz> <63jqje$2dsube@venus.eclipse.kcom.com> Message-ID: I'd be happy to make them all IOError. 2to3 can clean this up. On 9/27/07, Graham Horler wrote: > On 27 Sep 2007, 21:23:58, Brett Cannon wrote: > > Should OSError and IOError become aliases to EnvironmentError? I > > assume WindowsError and VMSError will just directly subclass which > > ever exception sticks around. > > > > And should we bother with a PendingDeprecationWarning for IOError or > > OSError? Or just have a Py3K warning for them and not worry about > > their removal in the 2.x series and just let 2to3 handle the > > transition? > > Am I missing something, as I thought Py2K was supposed to throw backwards > compatability to the wind in favor of doing the "Right Thing"? > > If so, can't we lose the proposed OSError and IOError aliases altogether, > and just keep EnvironmentError? > > Perhaps "EnvironmentError" is a bit long to type in all the places OSError > and IOError are used, I personally like the look of OSError and IOError better > in my code. I vote for a shorter name for EnvironmentError, e.g. EMError. 
> > just my 2c, Graham > > > > > -Brett > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: http://mail.python.org/mailman/options/python-dev/graham.horler%40gmail.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bjourne at gmail.com Fri Sep 28 01:35:27 2007 From: bjourne at gmail.com (=?ISO-8859-1?Q?BJ=F6rn_Lindqvist?=) Date: Fri, 28 Sep 2007 01:35:27 +0200 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: References: <46FB1B7B.1020001@canterbury.ac.nz> Message-ID: <740c3aec0709271635i4186cb73y84f905882f446f67@mail.gmail.com> On 9/27/07, Guido van Rossum wrote: > How about making IOError, OSError and EnvironmentError all aliases for > the same thing? The distinction is really worthless historical > baggage. Wouldn't it also be nice to have some subclasses of IOError like FileNotFoundError, IOPermissionError and EOFError? Often that would be easier than having to use the errno attribute to find out the exact cause. -- mvh Bj?rn From guido at python.org Fri Sep 28 01:41:57 2007 From: guido at python.org (Guido van Rossum) Date: Thu, 27 Sep 2007 16:41:57 -0700 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: <740c3aec0709271635i4186cb73y84f905882f446f67@mail.gmail.com> References: <46FB1B7B.1020001@canterbury.ac.nz> <740c3aec0709271635i4186cb73y84f905882f446f67@mail.gmail.com> Message-ID: I suspect that the use case for those errors is far less than you think. On 9/27/07, BJ?rn Lindqvist wrote: > On 9/27/07, Guido van Rossum wrote: > > How about making IOError, OSError and EnvironmentError all aliases for > > the same thing? 
The distinction is really worthless historical > > baggage. > > Wouldn't it also be nice to have some subclasses of IOError like > FileNotFoundError, IOPermissionError and EOFError? Often that would be > easier than having to use the errno attribute to find out the exact > cause. > > -- > mvh Bj?rn > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From eric+python-dev at trueblade.com Fri Sep 28 02:39:28 2007 From: eric+python-dev at trueblade.com (Eric Smith) Date: Thu, 27 Sep 2007 20:39:28 -0400 Subject: [Python-Dev] Decimal news In-Reply-To: <9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com> References: <9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com> Message-ID: <46FC4D40.4090808@trueblade.com> Thomas Wouters wrote: > Unfortunately, that's not how it works :-) If you check something into > the trunk, it will be merged into Py3k sooner or later. I may ask the > original submitter for assistance if it's incredibly hard to figure out > the changes, but so far, I only had to do that with the SSL changes. The > decimal changes are being merged as I write this (tests running now.) Is > there anything in particular that needs to be done for decimal in Py3k, > besides renaming __div__ to __truediv__? > > If you re-eally need to check something into the trunk that re-eally > must not be merged into py3k, but you're afraid it's not going to be > obvious to the merger, please record the change as 'merged' using > "svnmerge merge -M -r". Please take care when picking the > revision ;) You can also just email me or someone else you see doing > merges, as I doubt this will be a common occurance. I'm getting ready to port my PEP 3101 implementation (str.format() and friends) from py3k back to 2.6. How do I make it obvious that this is something that doesn't need to be ported to py3k? I'm not sure what "obvious to the merger" means. Is a "backported" checkin comment good enough? 
With any luck this will be done with a single checkin to the trunk, and another checkin to py3k so that the implementations can remain identical. I just want to make sure I don't make life more difficult than necessary for the folks doing the very valuable merge process. Eric. From greg.ewing at canterbury.ac.nz Fri Sep 28 03:30:58 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 28 Sep 2007 13:30:58 +1200 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: References: <46FB1B7B.1020001@canterbury.ac.nz> Message-ID: <46FC5952.7070707@canterbury.ac.nz> Guido van Rossum wrote: > How about making IOError, OSError and EnvironmentError all aliases for > the same thing? The distinction is really worthless historical > baggage. To my mind, the distinction is that IOError and OSError have an attribute for the error code, and the code found there has a well-defined meaning (C library error code and system call error code respectively), whereas EnvironmentError is more general. While it might be possible to merge them all together on Unix-like systems, that wouldn't necessarily be true on all platforms -- the IOError and OSError codes might belong to different domains. Although I suppose you could have another attribute to distinguish them if necessary. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) 
| greg.ewing at canterbury.ac.nz +--------------------------------------+ From thomas at python.org Fri Sep 28 03:32:41 2007 From: thomas at python.org (Thomas Wouters) Date: Thu, 27 Sep 2007 18:32:41 -0700 Subject: [Python-Dev] Decimal news In-Reply-To: <46FC4D40.4090808@trueblade.com> References: <9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com> <46FC4D40.4090808@trueblade.com> Message-ID: <9e804ac0709271832u306af234i13f656a35672c4ce@mail.gmail.com> On 9/27/07, Eric Smith wrote: > > Thomas Wouters wrote: > > > Unfortunately, that's not how it works :-) If you check something into > > the trunk, it will be merged into Py3k sooner or later. I may ask the > > original submitter for assistance if it's incredibly hard to figure out > > the changes, but so far, I only had to do that with the SSL changes. The > > decimal changes are being merged as I write this (tests running now.) Is > > there anything in particular that needs to be done for decimal in Py3k, > > besides renaming __div__ to __truediv__? > > > > If you re-eally need to check something into the trunk that re-eally > > must not be merged into py3k, but you're afraid it's not going to be > > obvious to the merger, please record the change as 'merged' using > > "svnmerge merge -M -r". Please take care when picking the > > revision ;) You can also just email me or someone else you see doing > > merges, as I doubt this will be a common occurance. > > I'm getting ready to port my PEP 3101 implementation (str.format() and > friends) from py3k back to 2.6. How do I make it obvious that this is > something that doesn't need to be ported to py3k? I'm not sure what > "obvious to the merger" means. Is a "backported" checkin comment good > enough? With any luck this will be done with a single checkin to the > trunk, and another checkin to py3k so that the implementations can > remain identical. 
Funny, just a few hours ago Guido mentioned we (meaning I) should write this up in a PEP :) I'll do that in the next few weeks. Obvious to the merger means whatever the merger expects it to mean ;) Yes, checkin comments are good. If an automatic merge fails, and the code isn't straightforward to merge from just looking at the files, looking at the actual changes in both branches is the next step. If the comment says 'backport from py3k' (preferably with which version was backported), that makes it easy to ignore the whole change (but perhaps not later checkins.) After you backport, maintenance should be done in the trunk, not the py3k branch (except of course, for parts that don't apply to the trunk.) I just want to make sure I don't make life more difficult than necessary > for the folks doing the very valuable merge process. As long as you commit any given thing only once, it's pretty easy to work out. As soon as you find yourself (more than once) committing things to py3k and then realizing it should go to the trunk, you may be making life harder. I appreciate that you're careful about this though, thanks :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.python.org/pipermail/python-dev/attachments/20070927/7cc0dd5c/attachment.htm From greg.ewing at canterbury.ac.nz Fri Sep 28 03:45:01 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 28 Sep 2007 13:45:01 +1200 Subject: [Python-Dev] New lines, carriage returns, and Windows In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CD3B@DF-GRTDANE-MSG.exchange.corp.microsoft.com> References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FAD68C.5030702@v.loewis.de> <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FB3E99.9030100@v.loewis.de> <7AD436E4270DD54A94238001769C2227CCBD18CD3B@DF-GRTDANE-MSG.exchange.corp.microsoft.com> Message-ID: <46FC5C9D.3030803@canterbury.ac.nz> Dino Viehland wrote: >>Why don't you just use binary mode then? > > That just flips the problem to the other side. Seems to me that IronPython really needs two string types, "Python string" and ".NET string", with automatic conversion when crossing a boundary between Python code and .NET code. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From greg at krypto.org Fri Sep 28 06:58:05 2007 From: greg at krypto.org (Gregory P. Smith) Date: Thu, 27 Sep 2007 21:58:05 -0700 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: References: <46FB1B7B.1020001@canterbury.ac.nz> <63jqje$2dsube@venus.eclipse.kcom.com> Message-ID: <52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com> Is IOError is the right name to use? OSError is raised for things that are not IO such as subprocess, dlopen, system. Nobody likes typing out EnvironmentError and dislike the suggestion of EMError, should it just be OSError? errno values are after all OS specific. 
-gps On 9/27/07, Guido van Rossum wrote: > > I'd be happy to make them all IOError. 2to3 can clean this up. > > On 9/27/07, Graham Horler wrote: > > On 27 Sep 2007, 21:23:58, Brett Cannon wrote: > > > Should OSError and IOError become aliases to EnvironmentError? I > > > assume WindowsError and VMSError will just directly subclass which > > > ever exception sticks around. > > > > > > And should we bother with a PendingDeprecationWarning for IOError or > > > OSError? Or just have a Py3K warning for them and not worry about > > > their removal in the 2.x series and just let 2to3 handle the > > > transition? > > > > Am I missing something, as I thought Py2K was supposed to throw > backwards > > compatability to the wind in favor of doing the "Right Thing"? > > > > If so, can't we lose the proposed OSError and IOError aliases > altogether, > > and just keep EnvironmentError? > > > > Perhaps "EnvironmentError" is a bit long to type in all the places > OSError > > and IOError are used, I personally like the look of OSError and IOError > better > > in my code. I vote for a shorter name for EnvironmentError, e.g. > EMError. 
> > > > just my 2c, Graham > > > > > > > > -Brett > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Fri Sep 28 07:50:17 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 28 Sep 2007 17:50:17 +1200 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: <52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com> References: <46FB1B7B.1020001@canterbury.ac.nz> <63jqje$2dsube@venus.eclipse.kcom.com> <52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com> Message-ID: <46FC9619.6020400@canterbury.ac.nz> Gregory P. Smith wrote: > Is IOError is the right name to use? OSError is raised for things that > are not IO such as subprocess, dlopen, system. The trouble with either of these is that the class of errors we're talking about don't necessarily come directly from the OS or I/O library. Often I raise my own EnvironmentError instances for things which don't have any associated OS error code but are nonetheless environment-related, such as an error in a file format. 
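A minimal sketch of the pattern described here: raising EnvironmentError for a file-format problem that has no OS error code behind it. The function name read_magic and the four-byte "DEMO" header are hypothetical and exist only for illustration; nothing like them appears in the thread.

```python
def read_magic(data):
    """Return the payload after a 4-byte magic header (hypothetical format)."""
    if data[:4] != b"DEMO":
        # No OS call failed here, so there is no errno to report.
        raise EnvironmentError("bad file format: missing DEMO header")
    return data[4:]


try:
    read_magic(b"oops....")
except EnvironmentError as exc:
    caught = exc

# The exception carries a message but no error code.
print(caught.errno, caught)
```

Built with a single message argument, the exception's errno attribute stays None, which is the situation this message describes.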
I don't reuse IOError or OSError because I feel as though I ought to supply an errno with these, but there isn't any. I suppose we could pick one of these and make it official that it's okay to instantiate it without an errno. But it's hard to decide which one, because they both sound too narrow in scope. I don't like EMError either, btw. Maybe EnvError? Although that sounds like it has something to do with the unix environment variables. -- Greg Ewing, Computer Science Dept, +--------------------------------------+ University of Canterbury, | Carpe post meridiem! | Christchurch, New Zealand | (I'm not a morning person.) | greg.ewing at canterbury.ac.nz +--------------------------------------+ From rhamph at gmail.com Fri Sep 28 08:03:17 2007 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 28 Sep 2007 00:03:17 -0600 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: <46FC9619.6020400@canterbury.ac.nz> References: <46FB1B7B.1020001@canterbury.ac.nz> <63jqje$2dsube@venus.eclipse.kcom.com> <52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com> <46FC9619.6020400@canterbury.ac.nz> Message-ID: On 9/27/07, Greg Ewing wrote: > Gregory P. Smith wrote: > > Is IOError is the right name to use? OSError is raised for things that > > are not IO such as subprocess, dlopen, system. > > The trouble with either of these is that the class > of errors we're talking about don't necessarily come > directly from the OS or I/O library. > > Often I raise my own EnvironmentError instances for > things which don't have any associated OS error code > but are nonetheless environment-related, such as an > error in a file format. > > I don't reuse IOError or OSError because I feel as > though I ought to supply an errno with these, but > there isn't any. > > I suppose we could pick one of these and make it > official that it's okay to instantiate it without > an errno. But it's hard to decide which one, > because they both sound too narrow in scope. 
> > I don't like EMError either, btw. Maybe EnvError? > Although that sounds like it has something to do > with the unix environment variables. ExternalError? Pretty vague though. -- Adam Olsen, aka Rhamphoryncus From theller at ctypes.org Fri Sep 28 13:24:42 2007 From: theller at ctypes.org (Thomas Heller) Date: Fri, 28 Sep 2007 13:24:42 +0200 Subject: [Python-Dev] Decimal news In-Reply-To: <9e804ac0709271832u306af234i13f656a35672c4ce@mail.gmail.com> References: <9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com> <46FC4D40.4090808@trueblade.com> <9e804ac0709271832u306af234i13f656a35672c4ce@mail.gmail.com> Message-ID: Thomas Wouters schrieb: > On 9/27/07, Eric Smith wrote: >> >> Thomas Wouters wrote: >> >> > Unfortunately, that's not how it works :-) If you check something into >> > the trunk, it will be merged into Py3k sooner or later. I may ask the >> > original submitter for assistance if it's incredibly hard to figure out >> > the changes, but so far, I only had to do that with the SSL changes. The >> > decimal changes are being merged as I write this (tests running now.) Is >> > there anything in particular that needs to be done for decimal in Py3k, >> > besides renaming __div__ to __truediv__? >> > >> > If you re-eally need to check something into the trunk that re-eally >> > must not be merged into py3k, but you're afraid it's not going to be >> > obvious to the merger, please record the change as 'merged' using >> > "svnmerge merge -M -r". Please take care when picking the >> > revision ;) You can also just email me or someone else you see doing >> > merges, as I doubt this will be a common occurance. I think that the 'svnmerge block -r' command should be used. Or not? 
Thomas From g.brandl at gmx.net Fri Sep 28 15:42:09 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 28 Sep 2007 15:42:09 +0200 Subject: [Python-Dev] Python 3.0a documentation In-Reply-To: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82> References: <11667.84.19.238.82.1190796148.VFkUQmFaS098Sh0W.squirrel@84.19.238.82> Message-ID: scav at blueyonder.co.uk schrieb: > I'd like to help out cleaning up the Python3.0 documentation. There are a > lot of little leftovers from 2.x that are no longer true. (mentions of > long, callable() etc.) I've applied the first four patches, thank you! Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From dangyogi at gmail.com Sun Sep 23 02:03:30 2007 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Sat, 22 Sep 2007 20:03:30 -0400 Subject: [Python-Dev] Adding concat function to itertools Message-ID: <46F5AD52.5040407@gmail.com> An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070922/daea9fb4/attachment-0001.htm -------------- next part -------------- A non-text attachment was scrubbed... Name: itertoolsmodule.c Type: text/x-csrc Size: 62269 bytes Desc: not available Url : http://mail.python.org/pipermail/python-dev/attachments/20070922/daea9fb4/attachment-0001.c From bioinformed at gmail.com Fri Sep 28 17:50:27 2007 From: bioinformed at gmail.com (Kevin Jacobs ) Date: Fri, 28 Sep 2007 11:50:27 -0400 Subject: [Python-Dev] Adding concat function to itertools In-Reply-To: <46F5AD52.5040407@gmail.com> References: <46F5AD52.5040407@gmail.com> Message-ID: <2e1434c10709280850j513ddf6s6eaf748a1d1cc90@mail.gmail.com> On 9/22/07, Bruce Frederiksen wrote: > > I've added a new function to itertools called 'concat'. 
This function is > much like *chain*, but takes all of the iterables as a single argument. > I've needed this once or twice, though my implementation was called 'starchain', in line with 'starmap'. I'm not a big fan of either name, though -- 'chainstar' and 'mapstar' are only marginally better (though it makes me want to come up with 'saw' and 'chainsaw' functions). Nor can I comment on the general applicability of such a function, other than to say that it was useful in some of my applications that utilize iterators of iterators of indeterminate length. -Kevin From brett at python.org Fri Sep 28 19:07:40 2007 From: brett at python.org (Brett Cannon) Date: Fri, 28 Sep 2007 10:07:40 -0700 Subject: [Python-Dev] Adding concat function to itertools In-Reply-To: <46F5AD52.5040407@gmail.com> References: <46F5AD52.5040407@gmail.com> Message-ID: On 9/22/07, Bruce Frederiksen wrote: > > I've added a new function to itertools called 'concat'. This function is > much like chain, but takes all of the iterables as a single argument. Thus > concat(some_iterables) is logically equivalent to chain(*some_iterables); > the difference being that chain(*some_iterables) results in some_iterables > being fully expanded before the call to chain, while concat(some_iterables) > only iterates on some_iterables as needed. This makes concat more > attractive when some_iterables is either expensive to expand or "infinite" > in length. > > Thus, concat(iterable) is like: > > def concat(iterables): > for it in iterables: > for element in it: > yield element > > > > I've attached an updated itertoolsmodule.c file to this email with concat > added to it. This was based on the 2.5.1 source. > > I ask that this be considered for adoption into standard python. > > Thanks in advance! 
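The pure-Python equivalent quoted above, laid out as runnable code, together with a rough illustration of the "infinite" case it mentions. The concat generator is copied from the quoted message; the pairs() helper is an assumption added only to supply an endless outer iterable.

```python
from itertools import count, islice


def concat(iterables):
    # Pure-Python equivalent as given in the quoted proposal.
    for it in iterables:
        for element in it:
            yield element


def pairs():
    # Hypothetical endless stream of short lists: [0, 1], [2, 3], ...
    for n in count(0, 2):
        yield [n, n + 1]


# concat() pulls from pairs() only as needed, so taking a prefix terminates.
first_five = list(islice(concat(pairs()), 5))
print(first_five)
```

Writing chain(*pairs()) instead would never return, because argument unpacking has to exhaust pairs() before chain is even called.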
> Best thing to do is to put this up on the bug tracker (bugs.python.org) and assign it to Raymond Hettinger as itertools is his baby. -Brett From status at bugs.python.org Fri Sep 28 19:37:06 2007 From: status at bugs.python.org (Tracker) Date: Fri, 28 Sep 2007 17:37:06 +0000 (UTC) Subject: [Python-Dev] Summary of Tracker Issues Message-ID: <20070928173706.BFD71782DC@psf.upfronthosting.co.za> ACTIVITY SUMMARY (09/21/07 - 09/28/07) Tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue number. Do NOT respond to this message. 1278 open (+16) / 11424 closed (+17) / 12702 total (+33) Open issues with patches: 415 Average duration of open issues: 680 days. Median duration of open issues: 670 days. Open Issues Breakdown open 1273 (+16) pending 5 ( +0) Issues Created Or Reopened (33) _______________________________ pipe fd handling issues in subprocess.py on POSIX 09/21/07 http://bugs.python.org/issue1187 created anissen patch universal newlines doesn't identify CRLF during tell() 09/22/07 CLOSED http://bugs.python.org/issue1188 created pjenvey Documentation for tp_as_number tp_as_sequence tp_as_mapping 09/22/07 CLOSED http://bugs.python.org/issue1189 created amaury.forgeotdarc patch Windows rants& sugestions. 09/22/07 CLOSED http://bugs.python.org/issue1190 created wolfstar359 bsddb does not build with bsddb lib v3.1. 
09/22/07 http://bugs.python.org/issue1191 created giraffedata Python 3 documents crash Firefox 09/24/07 CLOSED http://bugs.python.org/issue1192 created rtmq os.system() encoding bug on Windows 09/24/07 http://bugs.python.org/issue1193 created r_mosaic The reduce() documentation is lost in Python 3.0a1 09/24/07 CLOSED http://bugs.python.org/issue1194 created r_mosaic Problems on Linux with Ctrl-D and Ctrl-C during raw_input 09/24/07 http://bugs.python.org/issue1195 created Rebecca int() documentation does not specify default radix 09/24/07 CLOSED http://bugs.python.org/issue1196 created tcdelaney logging: formatter does not accept %(funcName)s properly 09/24/07 CLOSED http://bugs.python.org/issue1197 created CM Test of 2to3 component auditor 09/24/07 CLOSED http://bugs.python.org/issue1198 created dubois Documentation for tp_as_number... version 2.6 09/25/07 http://bugs.python.org/issue1199 created amaury.forgeotdarc patch Allow array.array to be parsed by the t# format unit. 09/25/07 http://bugs.python.org/issue1200 created jyasskin Error in array concept 09/25/07 CLOSED http://bugs.python.org/issue1201 created zip zlib.crc32() and adler32() return value 09/25/07 http://bugs.python.org/issue1202 created arigo ctypes doesn't work on Mac with --disable-toolbox-glue 09/25/07 http://bugs.python.org/issue1203 created janssen readline configuration for shared libs w/o curses dependencies 09/25/07 http://bugs.python.org/issue1204 created mbeachy patch urllib fail to read URL contents, urllib2 crash Python 09/26/07 http://bugs.python.org/issue1205 created cosoleto logging/__init__.py 09/26/07 CLOSED http://bugs.python.org/issue1206 created phd patch Load tests from path (patch included) 09/26/07 http://bugs.python.org/issue1207 created bluebird Match object should be guaranteed to always be true 09/26/07 CLOSED http://bugs.python.org/issue1208 created horcicka IOError won't accept tuples longer than 3 09/27/07 CLOSED http://bugs.python.org/issue1209 created jimjjewett 
imaplib does not run under Python 3 09/27/07 http://bugs.python.org/issue1210 created rtmq cleanup patch for 3.0 tutorial/interpreter.rst 09/27/07 CLOSED http://bugs.python.org/issue1211 created scav patch 3.0 tutorial/introduction.rst mentions 'long' 09/27/07 CLOSED http://bugs.python.org/issue1212 created scav patch 3.0 tutorial/classes.rst patch 09/27/07 CLOSED http://bugs.python.org/issue1213 created scav patch Timeout in CGIXMLRPCRequestHandler under IIS 09/27/07 http://bugs.python.org/issue1214 created steenie patch Python hang when catching a segfault 09/27/07 http://bugs.python.org/issue1215 created tebeka Python2.5.1 fails to compile under VC.NET2002 ( 7.0 ) 09/27/07 http://bugs.python.org/issue1216 created kartlee infinite loop in re module 09/27/07 CLOSED http://bugs.python.org/issue1217 created andresriancho Restrict Google search to docs when in the docs subtree? 09/27/07 http://bugs.python.org/issue1218 created skip.montanaro 3.0 library/stdtypes.rst patch 09/28/07 CLOSED http://bugs.python.org/issue1219 created scav patch Issues Now Closed (26) ______________________ logging.basicConfig does not allow to set NOTSET level 33 days http://bugs.python.org/issue1021 vsajip Test issue 23 days http://bugs.python.org/issue1064 loewis Allow str.join to join non-string types (as per PEP 3100) 16 days http://bugs.python.org/issue1145 gvanrossum patch urllib* 20x responses not OK? 5 days http://bugs.python.org/issue1177 georg.brandl patch py3k: Completely remove nb_coerce slot 1 days http://bugs.python.org/issue1185 gvanrossum patch optparse documentation: -- being collapsed to - in HTML 3 days http://bugs.python.org/issue1186 georg.brandl universal newlines doesn't identify CRLF during tell() 1 days http://bugs.python.org/issue1188 gvanrossum Documentation for tp_as_number tp_as_sequence tp_as_mapping 3 days http://bugs.python.org/issue1189 georg.brandl patch Windows rants& sugestions. 
0 days http://bugs.python.org/issue1190 loewis Python 3 documents crash Firefox 0 days http://bugs.python.org/issue1192 loewis The reduce() documentation is lost in Python 3.0a1 0 days http://bugs.python.org/issue1194 georg.brandl int() documentation does not specify default radix 0 days http://bugs.python.org/issue1196 georg.brandl logging: formatter does not accept %(funcName)s properly 1 days http://bugs.python.org/issue1197 vsajip Test of 2to3 component auditor 0 days http://bugs.python.org/issue1198 dubois Error in array concept 0 days http://bugs.python.org/issue1201 gvanrossum logging/__init__.py 1 days http://bugs.python.org/issue1206 vsajip patch Match object should be guaranteed to always be true 0 days http://bugs.python.org/issue1208 georg.brandl IOError won't accept tuples longer than 3 0 days http://bugs.python.org/issue1209 georg.brandl cleanup patch for 3.0 tutorial/interpreter.rst 1 days http://bugs.python.org/issue1211 georg.brandl patch 3.0 tutorial/introduction.rst mentions 'long' 1 days http://bugs.python.org/issue1212 georg.brandl patch 3.0 tutorial/classes.rst patch 1 days http://bugs.python.org/issue1213 georg.brandl patch infinite loop in re module 0 days http://bugs.python.org/issue1217 brett.cannon 3.0 library/stdtypes.rst patch 0 days http://bugs.python.org/issue1219 georg.brandl patch syslog syscall support for SysLogLogger 147 days http://bugs.python.org/issue1711603 luke-jr patch RotatingFileHandler.doRollover behave wrong vs. 
log4j's 77 days http://bugs.python.org/issue1752539 vsajip logging.FileHandler may throw exception in flush() 63 days http://bugs.python.org/issue1760556 vsajip Top Issues Most Discussed (6) _____________________________ 5 urllib fail to read URL contents, urllib2 crash Python 2 days open http://bugs.python.org/issue1205 5 ctypes doesn't work on Mac with --disable-toolbox-glue 3 days open http://bugs.python.org/issue1203 5 Patch to rename HTMLParser module to lower_case 36 days open http://bugs.python.org/issue1002 4 infinite loop in re module 0 days closed http://bugs.python.org/issue1217 4 optparse documentation: -- being collapsed to - in HTML 3 days closed http://bugs.python.org/issue1186 3 Paticular decimal mod operation wrongly output NaN. 8 days open http://bugs.python.org/issue1182 From python at rcn.com Fri Sep 28 19:45:19 2007 From: python at rcn.com (Raymond Hettinger) Date: Fri, 28 Sep 2007 10:45:19 -0700 Subject: [Python-Dev] Adding concat function to itertools References: <46F5AD52.5040407@gmail.com> Message-ID: <001e01c801f7$56bdfa60$69196b0a@RaymondLaptop1> [Bruce Frederiksen] >> I've added a new function to itertools called 'concat'. This function is >> much like chain, but takes all of the iterables as a single argument. Any practical use cases or is this just a theoretical improvement? For Py2.x, I'm not willing to unnecessarily expand the module. However, for Py3k, I'm open to changing the signature for chain(). 
Raymond From p.f.moore at gmail.com Fri Sep 28 20:28:10 2007 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 28 Sep 2007 19:28:10 +0100 Subject: [Python-Dev] New lines, carriage returns, and Windows In-Reply-To: <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com> References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FAD68C.5030702@v.loewis.de> <7AD436E4270DD54A94238001769C2227CCBD18CBB8@DF-GRTDANE-MSG.exchange.corp.microsoft.com> Message-ID: <79990c6b0709281128j71e2ff6ep3601c6ffbb270836@mail.gmail.com> On 26/09/2007, Dino Viehland wrote: > My understanding is that users can write code that uses only \n and Python will write the > end-of-line character(s) that are appropriate for the platform when writing to a file. That's > what I meant by uses \n for everything internally. OK, so far so good - although I'm not *quite* sure there's a self-consistent definition of "code that only uses \n". I'll assume you mean code that has a concept of lines, that lines never contain anything other than text (specifically, neither \r or \n can appear in a line, I'll punt on whether other weird stuff like form feed are legal), and that whenever your code needs to write data to a file, it writes lines with \n alone between them. > But if you write \r\n to a file Python completely ignores the presence of the \r and > transforms the \n into a \r\n anyway, hence the \r\r in the resulting stream. My last > question is simply does anyone find writing \r\r\n when the original string contained \r\n a > useful behavior - personally I don't see how it is. In the above model, lines can't contain \r and between lines you only ever write \n - so where did the \r\n come from? If you receive what you think are lines from an outside source, and they contain \r, then you didn't sanity check your data. If you receive a block of raw (effectively binary!) 
data which you want to translate into your model, it's up to you how you cut it up into lines. If you read data using one of Python's text modes, it's up to you to understand how it works. > But Guido's response makes this sound like it's a problem w/ VC++ stdio implementation > and not something that Python is explicitly doing. I'm not sure it's a CRT issue. Certainly the \r\n vs \n confusion comes from the CRT - the underlying OS (just like Unix!!!!) only deals in files as streams of bytes. But ultimately, "lines" are an abstraction in your code. All the CRT (and Python) do is help (or maybe hinder) you with the "normal" cases. > Anyway, it'd might be useful to have a text-mode file that you can write \r\n to and only > get \r\n in the resulting file. I can't comment on that, other than to say that if you better defined the semantic model (lines, how things are encoded/decoded to files, etc, somewhat like I tried to above) it would be more obvious what use case this was trying to address. > But if the general sentiment is s.replace('\r', '') is the way to go we can advice our users > of the behavior when interoperating w/ APIs that return \r\n in strings. I'd say users of the relevant APIs need to understand how the APIs represent "lines", so that they can convert the received data to their program's model of lines. Of course, that probably corresponds to something like s.replace('\r','') or likely more correctly data_lines = s.split('\r\n'). A "rule of thumb" that doesn't make it clear that the concept of "line" has 2 different binary representations in 2 different areas (data back from APIs vs data from files) is likely to ultimately lead to mistakes and confusion. If you think this is bad, wait until you have to deal with Unicode issues like what *encoding* the data is being supplied to you in. Makes guessing newline conventions seem simple (at least to this parochial English-speaker :-)) Although as this is IronPython, you may already have that covered... 
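As a rough illustration of the \r\r\n behaviour discussed earlier in this message, the sketch below emulates a Windows text-mode write using the io module from later Python versions, not the C stdio layer the thread is about. The helper text_mode_bytes and the sample string are assumptions made for the example.

```python
import io


def text_mode_bytes(text):
    # Emulate a Windows text-mode write: each LF is expanded to CR LF.
    raw = io.BytesIO()
    writer = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
    writer.write(text)
    writer.flush()
    return raw.getvalue()


# A string as a CRLF-based API might hand it back (made-up sample data).
from_api = "first\r\nsecond\r\n"

# Written as-is, the existing CR survives and the LF gains another CR.
doubled = text_mode_bytes(from_api)

# Converting to the program's own "\n only" model first avoids that.
cleaned = text_mode_bytes(from_api.replace("\r\n", "\n"))

print(repr(doubled))
print(repr(cleaned))
```

Doing the conversion at the boundary where the data arrives keeps a single representation of "line" inside the program, which is the point being argued here.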
Paul. PS In real life, you often just want a cheap and cheerful answer. For that, "strip out spurious \r characters" may be fine. From stephen at xemacs.org Fri Sep 28 22:11:33 2007 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 29 Sep 2007 05:11:33 +0900 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: <46FC9619.6020400@canterbury.ac.nz> References: <46FB1B7B.1020001@canterbury.ac.nz> <63jqje$2dsube@venus.eclipse.kcom.com> <52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com> <46FC9619.6020400@canterbury.ac.nz> Message-ID: <87ejgible2.fsf@uwakimon.sk.tsukuba.ac.jp> Greg Ewing writes: > Gregory P. Smith wrote: > > Is IOError is the right name to use? OSError is raised for things that > > are not IO such as subprocess, dlopen, system. > > The trouble with either of these is that the class > of errors we're talking about don't necessarily come > directly from the OS or I/O library. Agree, but I think this is a case where practicality beats purity. +1 for OSerror. From guido at python.org Fri Sep 28 22:27:38 2007 From: guido at python.org (Guido van Rossum) Date: Fri, 28 Sep 2007 13:27:38 -0700 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: <87ejgible2.fsf@uwakimon.sk.tsukuba.ac.jp> References: <46FB1B7B.1020001@canterbury.ac.nz> <63jqje$2dsube@venus.eclipse.kcom.com> <52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com> <46FC9619.6020400@canterbury.ac.nz> <87ejgible2.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 9/28/07, Stephen J. Turnbull wrote: > Greg Ewing writes: > > Gregory P. Smith wrote: > > > Is IOError is the right name to use? OSError is raised for things that > > > are not IO such as subprocess, dlopen, system. > > > > The trouble with either of these is that the class > > of errors we're talking about don't necessarily come > > directly from the OS or I/O library. > > Agree, but I think this is a case where practicality beats purity. > > +1 for OSerror. 
The OS is a somewhat troublesome abstraction boundary. I/O is a more general concept (and PPBP). +1 for IOError. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From martin at v.loewis.de Fri Sep 28 23:09:54 2007 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 28 Sep 2007 23:09:54 +0200 Subject: [Python-Dev] building with -Wwrite-strings In-Reply-To: References: Message-ID: <46FD6DA2.1060107@v.loewis.de> > I'm trying to build Python (2.6) with GCC the option -Wwrite-strings. > > 1 - Is there any interest on this? It might be nice to have, but will certainly come at a cost. So feel free to try this out; at the end, we might agree that this change is too intrusive. > 2 - What should I do for the very common (taken from int_new): > static char *kwlist[] = {"x", "base", 0}; What's wrong with static const char *kwlist[] = {"x", "base", 0}; Regards, Martin From mike.klaas at gmail.com Fri Sep 28 23:40:08 2007 From: mike.klaas at gmail.com (Mike Klaas) Date: Fri, 28 Sep 2007 14:40:08 -0700 Subject: [Python-Dev] Adding concat function to itertools In-Reply-To: <001e01c801f7$56bdfa60$69196b0a@RaymondLaptop1> References: <46F5AD52.5040407@gmail.com> <001e01c801f7$56bdfa60$69196b0a@RaymondLaptop1> Message-ID: On 28-Sep-07, at 10:45 AM, Raymond Hettinger wrote: > [Bruce Frederiksen] >>> I've added a new function to itertools called 'concat'. This >>> function is >>> much like chain, but takes all of the iterables as a single >>> argument. > > Any practical use cases or is this just a theoretical improvement? > > For Py2.x, I'm not willing to unnecessarily expand the module. > However, for Py3k, I'm open to changing the signature for chain(). For me, a fraction of chain() uses are of the * variety: d = defaultdict(list) allvals = chain(*d.values()) return chain(*imap(cache.__getitem__, keylist)) Interestingly, they seem to all have something to do with dictionary values() that are themselves iterable. 
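A small sketch of the one-level flattening shown in these snippets. The dictionary contents are made up. chain.from_iterable is not mentioned in the thread; it entered itertools in Python 2.6 and is shown here only as the later standard-library spelling of the same idea.

```python
from collections import defaultdict
from itertools import chain

# Made-up sample data: a dict whose values are themselves iterable.
d = defaultdict(list)
d["x"].extend([1, 2])
d["y"].append(3)

# Unpacking builds the full argument tuple before chain() is called.
unpacked = list(chain(*d.values()))

# chain.from_iterable walks d.values() one item at a time instead.
flattened = list(chain.from_iterable(d.values()))

print(sorted(unpacked), sorted(flattened))
```

For a small dictionary the two forms give the same elements; the difference only matters when the outer iterable is large, expensive, or unbounded.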
-Mike From python at rcn.com Fri Sep 28 23:53:42 2007 From: python at rcn.com (Raymond Hettinger) Date: Fri, 28 Sep 2007 14:53:42 -0700 Subject: [Python-Dev] Adding concat function to itertools References: <46F5AD52.5040407@gmail.com> <001e01c801f7$56bdfa60$69196b0a@RaymondLaptop1> Message-ID: <002301c8021a$0a05d9e0$69196b0a@RaymondLaptop1> [Bruce Frederiksen] >>>> I've added a new function to itertools called 'concat'. This >>>> function is >>>> much like chain, but takes all of the iterables as a single >>>> argument. [Raymond] >> Any practical use cases or is this just a theoretical improvement? >> >> For Py2.x, I'm not willing to unnecessarily expand the module. >> However, for Py3k, I'm open to changing the signature for chain(). [Bruce] > For me, a fraction of chain() uses are of the * variety: > > d = defaultdict(list) > allvals = chain(*d.values()) > > return chain(*imap(cache.__getitem__, keylist)) > > Interestingly, they seem to all have something to do with dictionary > values() that are themselves iterable. I see. These are instances of a recurring general use case of chain() as a one-level flattener. Will give consideration to changing the signature of chain() for Py3.0. Besides the concat() variation using a single iterable input, another alternative is the min()/max() style signature where one input is interpreted as iterable and multiple arguments as comprising an input tuple. Raymond From djm at mindrot.org Sat Sep 29 00:09:32 2007 From: djm at mindrot.org (Damien Miller) Date: Sat, 29 Sep 2007 08:09:32 +1000 (EST) Subject: [Python-Dev] Adding concat function to itertools In-Reply-To: <002301c8021a$0a05d9e0$69196b0a@RaymondLaptop1> References: <46F5AD52.5040407@gmail.com> <001e01c801f7$56bdfa60$69196b0a@RaymondLaptop1> <002301c8021a$0a05d9e0$69196b0a@RaymondLaptop1> Message-ID: On Fri, 28 Sep 2007, Raymond Hettinger wrote: > > Interestingly, they seem to all have something to do with dictionary > > values() that are themselves iterable. > > I see. 
These are instances of a recurring general use case of > chain() as a one-level flattener. > > Will give consideration to changing the signature of chain() for Py3.0. > Besides the concat() variation using a single iterable input, another > alternative is the min()/max() style signature where one input is > interpreted as iterable and multiple arguments as comprising an > input tuple. Has anyone considered making the iterator __add__ operator perform something similar to chain? I.e. list(a + b) => [ a0, a1, ... an, b0, b1, bn] (where "a" and "b" are iterables) -d From brett at python.org Sat Sep 29 02:23:24 2007 From: brett at python.org (Brett Cannon) Date: Fri, 28 Sep 2007 17:23:24 -0700 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: References: <46FB1B7B.1020001@canterbury.ac.nz> <63jqje$2dsube@venus.eclipse.kcom.com> <52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com> <46FC9619.6020400@canterbury.ac.nz> <87ejgible2.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 9/28/07, Guido van Rossum wrote: > On 9/28/07, Stephen J. Turnbull wrote: > > Greg Ewing writes: > > > Gregory P. Smith wrote: > > > > Is IOError is the right name to use? OSError is raised for things that > > > > are not IO such as subprocess, dlopen, system. > > > > > > The trouble with either of these is that the class > > > of errors we're talking about don't necessarily come > > > directly from the OS or I/O library. > > > > Agree, but I think this is a case where practicality beats purity. > > > > +1 for OSerror. > > The OS is a somewhat troublesome abstraction boundary. I/O is a more > general concept (and PPBP). +1 for IOError. What is PPBP? -Brett From tjreedy at udel.edu Sat Sep 29 03:43:27 2007 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 28 Sep 2007 21:43:27 -0400 Subject: [Python-Dev] Adding concat function to itertools References: <46F5AD52.5040407@gmail.com> Message-ID: "Bruce Frederiksen" wrote in message news:46F5AD52.5040407 at gmail.com... 
A 64K attachment. Please do not do such a worse-than-useless thing again. Especially when only 1K is original. From guido at python.org Sat Sep 29 05:10:52 2007 From: guido at python.org (Guido van Rossum) Date: Fri, 28 Sep 2007 20:10:52 -0700 Subject: [Python-Dev] urllib exception compatibility In-Reply-To: References: <63jqje$2dsube@venus.eclipse.kcom.com> <52dc1c820709272158hda4e1e9o601a98eb5982cd23@mail.gmail.com> <46FC9619.6020400@canterbury.ac.nz> <87ejgible2.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 9/28/07, Brett Cannon wrote: > On 9/28/07, Guido van Rossum wrote: > > On 9/28/07, Stephen J. Turnbull wrote: > > > Greg Ewing writes: > > > > Gregory P. Smith wrote: > > > > > Is IOError is the right name to use? OSError is raised for things that > > > > > are not IO such as subprocess, dlopen, system. > > > > > > > > The trouble with either of these is that the class > > > > of errors we're talking about don't necessarily come > > > > directly from the OS or I/O library. > > > > > > Agree, but I think this is a case where practicality beats purity. > > > > > > +1 for OSerror. > > > > The OS is a somewhat troublesome abstraction boundary. I/O is a more > > general concept (and PPBP). +1 for IOError. > > What is PPBP? Typo for PBP : Practicality Beats Purity. :) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From nmm1 at cus.cam.ac.uk Sat Sep 29 11:25:29 2007 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Sat, 29 Sep 2007 10:25:29 +0100 Subject: [Python-Dev] New lines, carriage returns, and Windows Message-ID: "Paul Moore" wrote: > > OK, so far so good - although I'm not *quite* sure there's a > self-consistent definition of "code that only uses \n". 
I'll assume > you mean code that has a concept of lines, that lines never contain > anything other than text (specifically, neither \r or \n can appear in > a line, I'll punt on whether other weird stuff like form feed are > legal), and that whenever your code needs to write data to a file, it > writes lines with \n alone between them. I won't. There are a few of us still left who know how this started, and here is a simplified description. Unix was a computer scientist's workbench, and made no attempt to be general. In particular, its text datastream model was appropriate for the important devices of the day - teletypes and similar. So far, so good. But what was forgotten later is that the model does NOT extend to other systems and, in particular, made no sense on the record-oriented models generally used by mainframes (see Fortran for an example). When C was standardised, this was fudged. I tried to get it improved, but it is one of the many things I failed to do. The handling of ALL of the control characters in text I/O is non-portable (even \t, despite what the standard says), and you have to follow the system's constraints if things are to work. Unfortunately, the kludging that the compiler does to map C to the operating system confuses things still further - though it is essential. Now, BCPL was an ancestor of C, but always was a more portable language (i.e. it didn't start with a specific operating system in mind), and used/uses a rather better model. In this, line separators are atomic - e.g. '\f' is newline-with-form-feed and '\r' is "newline-with-overprinting". Now, THAT model is more generic. Not fully generic, of course, but it would cater for all of Unix, CPM and its derivatives (yes, Microsoft), MacOS and most mainframes (with some reservations). So, until and unless Python chooses to define its own I/O model, these problems will continue to arise.
Whether this one is a simple bug or an avoidable feature, I can't say without looking harder, but bugs are often caused by attempting to implement impossible or confusing specifications. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From guido at python.org Sat Sep 29 17:07:18 2007 From: guido at python.org (Guido van Rossum) Date: Sat, 29 Sep 2007 08:07:18 -0700 Subject: [Python-Dev] New lines, carriage returns, and Windows In-Reply-To: References: Message-ID: On 9/29/07, Nick Maclaren wrote: > "Paul Moore" wrote: > > > > OK, so far so good - although I'm not *quite* sure there's a > > self-consistent definition of "code that only uses \n". I'll assume > > you mean code that has a concept of lines, that lines never contain > > anything other than text (specifically, neither \r or \n can appear in > > a line, I'll punt on whether other weird stuff like form feed are > > legal), and that whenever your code needs to write data to a file, it > > writes lines with \n alone between them. > > I won't. There are a few of us still left who know how this started, > and here is a simplified description. > > Unix was a computer scientist's workbench, and made no attempt to be > general. In particular, its text datastream model was appropriate > for the imnportant devices of the day - teletypes and similar. So > far, so good. But what was forgotten later is that the model does > NOT extend to other systems and, in particular, made no sense on the > record-oriented models generally used by mainframes (see Fortran for > an example). > > When C was standardised, this was fudged. I tried to get it improved, > but it is one of the many things I failed to do. 
The handling of > ALL of the control characters in text I/O is non-portable (even \t, > despite what the satndard says), and you have to follow the system's > constraints if things are to work. Unfortunately, the kludging that > the compiler does to map C to the operating system confuses things > still further - though it is essential. > > Now, BCPL was an ancestor of C, but always was a more portable > language (i.e. it didn't start with a specific operating system in > mind), and used/uses a rather better model. In this, line separators > are atomic - e.g. '\f' is newline-with-form-feed and '\r' is > "newline-with-overprinting". Now, THAT model is more generic. > Not fully generic, of course, but it would cater for all of Unix, > CPM and its derivatives (yes, Microsoft), MacOS and most mainframes > (with some reservations). > > So, until and unless Python chooses to define its own I/O model, > these problems will continue to arise. Whether this one is a simple > bug or an avoidable feature, I can't say without looking harder, > but bugs are often caused by attempting to implement impossible > or confusing specifications. Have you looked at Py3k at all, especially PEP 3116 (new I/O)? Python *does* have its own I/O model. There are binary files and text files. For binary files, you write bytes and the semantic model is that of an array of bytes; byte indices are seek positions. For text files, the contents is considered to be Unicode, encoded as bytes in a binary file. So text file always has an underlying binary file. Two translations take place, both of which have defaults varying by platform. One translation is encoding Unicode text into bytes upon output, and decoding bytes to Unicode text upon input. This can use any encoding supported by the encodings package. The other translation deals with line endings. Upon input, any of \r\n, \r, or \n is translated to a single \n by default (this is nhe "universal newlines" algorithm from Python 2.x). 
This can be tweaked or disabled. Upon output, \n is translated into a platform specific string chosen from \r\n, \r, or \n. This can also be disabled or overridden. Note that \r, when written, is never treated specially; if you want special processing for \r on output, you can write your own translation layer. That's all. There is nothing unimplementable or confusing in these specifications. Python doesn't care about record I/O on legacy OSes; it does care about variability found in practice between popular OSes. Note that \r, \n and friends in Python 3000 are either ASCII (in bytes literals) or Unicode (in text literals). Again, no support for legacy systems that don't use ASCII or a superset. Legacy OSes are called that for a reason. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From thomas at python.org Sat Sep 29 17:26:35 2007 From: thomas at python.org (Thomas Wouters) Date: Sat, 29 Sep 2007 17:26:35 +0200 Subject: [Python-Dev] Decimal news In-Reply-To: References: <9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com> <46FC4D40.4090808@trueblade.com> <9e804ac0709271832u306af234i13f656a35672c4ce@mail.gmail.com> Message-ID: <9e804ac0709290826g69b7aedk31167d9d81cefd1b@mail.gmail.com> On 9/28/07, Thomas Heller wrote: > > Thomas Wouters schrieb: > >> > If you re-eally need to check something into the trunk that re-eally > >> > must not be merged into py3k, but you're afraid it's not going to be > >> > obvious to the merger, please record the change as 'merged' using > >> > "svnmerge merge -M -r". Please take care when picking the > >> > revision ;) You can also just email me or someone else you see doing > >> > merges, as I doubt this will be a common occurance. > > I think that the 'svnmerge block -r' command should be used. Or > not? If you're comfortable with using svnmerge yourself, sure. If you're worried that you might mess up the state of the branch, you can leave it up to us (me.) -- Thomas Wouters Hi! I'm a .signature virus! 
copy me into your .signature file to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.python.org/pipermail/python-dev/attachments/20070929/7ea5fdb3/attachment.htm From fuzzyman at voidspace.org.uk Sat Sep 29 17:30:26 2007 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sat, 29 Sep 2007 16:30:26 +0100 Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows In-Reply-To: References: Message-ID: <46FE6F92.40601@voidspace.org.uk> Guido van Rossum wrote: > [snip..] > Python *does* have its own I/O model. There are binary files and text > files. For binary files, you write bytes and the semantic model is > that of an array of bytes; byte indices are seek positions. > > For text files, the contents is considered to be Unicode, encoded as > bytes in a binary file. So text file always has an underlying binary > file. Two translations take place, both of which have defaults varying > by platform. One translation is encoding Unicode text into bytes upon > output, and decoding bytes to Unicode text upon input. This can use > any encoding supported by the encodings package. > > The other translation deals with line endings. Upon input, any of > \r\n, \r, or \n is translated to a single \n by default (this is nhe > "universal newlines" algorithm from Python 2.x). This can be tweaked > or disabled. Upon output, \n is translated into a platform specific > string chosen from \r\n, \r, or \n. This can also be disabled or > overridden. Note that \r, when written, is never treated specially; if > you want special processing for \r on output, you can write your own > translation layer. > So the question is, that when a string containing '\r\n' is written to a file in text mode on a Windows platform, should it be written with the encoded representation of '\r\n' or '\r\r\n'? Purity would dictate the latter and practicality the former (IMO)... 
However, that would mean that round tripping a string would change it ('\r\n' would be written as '\r\n' and then read as '\n') - on the other hand (particularly given that we are treating the data as text and not a binary blob) I don't see how writing '\r\r\n' would ever actually be useful in text. +1 on just writing '\r\n' from me. Michael Foord http://www.manning.com/foord > That's all. There is nothing unimplementable or confusing in these > specifications. > > Python doesn't care about record I/O on legacy OSes; it does care > about variability found in practice between popular OSes. > > Note that \r, \n and friends in Python 3000 are either ASCII (in bytes > literals) or Unicode (in text literals). Again, no support for legacy > systems that don't use ASCII or a superset. > > Legacy OSes are called that for a reason. > > From python at rcn.com Sat Sep 29 18:46:58 2007 From: python at rcn.com (Raymond Hettinger) Date: Sat, 29 Sep 2007 09:46:58 -0700 Subject: [Python-Dev] Decimal news In-Reply-To: <9e804ac0709290826g69b7aedk31167d9d81cefd1b@mail.gmail.com> References: <9e804ac0709181719l4df483eeg9c6a1a4accaadc8e@mail.gmail.com> <46FC4D40.4090808@trueblade.com> <9e804ac0709271832u306af234i13f656a35672c4ce@mail.gmail.com> <9e804ac0709290826g69b7aedk31167d9d81cefd1b@mail.gmail.com> Message-ID: <14790F8C-DCFB-43E7-BA28-1AB3EF80EEFC@rcn.com> If the differences are few, I prefer that you insert some conditionals that attach different functions based on the version number. That way we can keep a single version of the source that works on all of the pythons. Raymond On Sep 29, 2007, at 8:26 AM, "Thomas Wouters" wrote: > > > On 9/28/07, Thomas Heller wrote: > Thomas Wouters schrieb: > >> > If you re-eally need to check something into the trunk that re- > eally > >> > must not be merged into py3k, but you're afraid it's not going > to be > >> > obvious to the merger, please record the change as 'merged' using > >> > "svnmerge merge -M -r". 
Please take care when picking > the > >> > revision ;) You can also just email me or someone else you see > doing > >> > merges, as I doubt this will be a common occurance. > > I think that the 'svnmerge block -r' command should be > used. Or not? > > If you're comfortable with using svnmerge yourself, sure. If you're > worried that you might mess up the state of the branch, you can > leave it up to us (me.) > > > -- > Thomas Wouters > > Hi! I'm a .signature virus! copy me into your .signature file to > help me spread! > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/python%40rcn.com From tjreedy at udel.edu Sat Sep 29 20:30:59 2007 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 29 Sep 2007 14:30:59 -0400 Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows References: <46FE6F92.40601@voidspace.org.uk> Message-ID: "Michael Foord" wrote in message news:46FE6F92.40601 at voidspace.org.uk... | Guido van Rossum wrote: [snip first part of nice summary of Python i/o model] | > The other translation deals with line endings. Upon input, any of | > \r\n, \r, or \n is translated to a single \n by default (this is nhe [sic] | > "universal newlines" algorithm from Python 2.x). This can be tweaked | > or disabled. Upon output, \n is translated into a platform specific | > string chosen from \r\n, \r, or \n. This can also be disabled or | > overridden. Note that \r, when written, is never treated specially; if | > you want special processing for \r on output, you can write your own | > translation layer.
| So the question is, that when a string containing '\r\n' is written to a | file in text mode on a Windows platform, should it be written with the | encoded representation of '\r\n' or '\r\r\n'? I think Guido pretty clearly said that on output, the default behavior is that \r is nothing special. If you want a special case exception, write a special case translator. +1 from me. To propose otherwise is to propose that the default semantic meaning of Python text objects depend on the platform that it might be output-translated for. I believe the point of universal newline support was to get away from this. | Purity would dictate the latter and practicality the former (IMO)... I disagree. Special case exceptions complicate both learnability and code readability and maintainability. Simplicity is practicality. The symmetry of 'platform-line-endings =input> \n =output> platform-line-endings' is both pure and practical. | However, that would mean that round tripping a string would change it | ('\r\n' would be written as '\r\n' and then read as '\n') Whereas \r\r\n would be read back as \r\n, which is what should happen. Round-trip-ability is practical to me. | - on the other | hand (particularly given that we are treating the data as text and not a | binary blob) I don't see how writing '\r\r\n' would ever actually be | useful in text. There are two normal ways for internal Python text to have \r\n: 1. Read from a file with \r\r\n. Then \r\r\n is correct output (on the same platform). 2. Intentionally put there by a programmer. If s/he also chooses default \n translation on output, \r is correct. That leaves 1. Bugs due to ignorance or accident. These should be repaired. 2. Other special situations, which can be handled by disabling, overriding, and layering the defaults. This seems enough flexibility to me.
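The input/output symmetry for text that uses only \n can be sketched with the io module. This is a minimal illustration, assuming the PEP 3116 text layer as it later shipped in Python 3 (io.TextIOWrapper and its newline argument); the helper names write_text and read_text are made up for the example, and newline='\r\n' is passed explicitly so the result does not depend on the platform running it.

```python
import io

def write_text(text, newline):
    """Push text through the text layer and return the raw bytes written."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # stop the wrapper from closing raw when it is collected
    return data

def read_text(data, newline=None):
    """Decode raw bytes through the text layer (newline=None: universal newlines)."""
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=newline)
    return wrapper.read()

# Text that only uses '\n' internally, written with Windows line endings.
on_disk = write_text("spam\neggs\n", newline="\r\n")

# Reading the bytes back with universal newlines restores the original text.
round_tripped = read_text(on_disk)
```

Under these assumptions on_disk holds \r\n pairs and round_tripped equals the text that was written, which is the symmetric case; text that already contains \r is a separate matter.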
Terry Jan Reedy From fuzzyman at voidspace.org.uk Sat Sep 29 20:35:53 2007 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sat, 29 Sep 2007 19:35:53 +0100 Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows In-Reply-To: References: <46FE6F92.40601@voidspace.org.uk> Message-ID: <46FE9B09.8000800@voidspace.org.uk> Terry Reedy wrote: > "Michael Foord" wrote in message > news:46FE6F92.40601 at voidspace.org.uk... > | Guido van Rossum wrote: > > [snip first part of nice summary of Python i/o model] > > | > The other translation deals with line endings. Upon input, any of > | > \r\n, \r, or \n is translated to a single \n by default (this is nhe > [sic] > | > "universal newlines" algorithm from Python 2.x). This can be tweaked > | > or disabled. Upon output, \n is translated into a platform specific > | > string chosen from \r\n, \r, or \n. This can also be disabled or > | > overridden. Note that \r, when written, is never treated specially; if > | > you want special processing for \r on output, you can write your own > | > translation layer. > > | So the question is, that when a string containing '\r\n' is written to a > | file in text mode on a Windows platform, should it be written with the > | encoded representation of '\r\n' or '\r\r\n'? > > I think Guido pretty clearly said that on output, the default behavior is > that \r is nothing special. If you want a special case exception, write a > special case translator. +1 from me. > > To propose otherwise is to propose that the default semantic meaning of > Python text objects depend on the platform that it might be > output-translated for. I believe the point of universal newline support > was to get away from this. > > | Purity would dictate the latter and practicality the former (IMO)... > > I disagree. Special case exceptions complicate both learnability and code > readability and maintainability. Simplicity is practicality. 
The symmetry > of 'platform-line-endings =input> \n =output> plaform-line-endings' is both > pure and practical. > > | However, that would mean that round tripping a string would change it > | ('\r\n' would be written as '\r\n' and then read as '\n') > > Whereas \r\r\n would be read back as \r\n, which is what should happen. > Round-trip-ability is practical to me. > > | - on the other > | hand (particularly given that we are treating the data as text and not a > | binary blob) I don't see how writing '\r\r\n' would ever actually be > | useful in text. > > There are two normal ways for internal Python text to have \r\n: > 1. Read from a file with \r\r\n. Then \r\r\n is correct output (on the > same platform). > 2. Intentially put there by a programmer. If s/he also chooses default \n > translation on output, \r is correct. > Actually, I usually get these strings from Windows UI components. A file containing '\r\n' is read in with '\r\n' being translated to '\n'. New user input is added containing '\r\n' line endings. The file is written out and now contains a mix of '\r\n' and '\r\r\n'. Michael From steven.bethard at gmail.com Sat Sep 29 20:42:38 2007 From: steven.bethard at gmail.com (Steven Bethard) Date: Sat, 29 Sep 2007 12:42:38 -0600 Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows In-Reply-To: <46FE9B09.8000800@voidspace.org.uk> References: <46FE6F92.40601@voidspace.org.uk> <46FE9B09.8000800@voidspace.org.uk> Message-ID: On 9/29/07, Michael Foord wrote: > Terry Reedy wrote: > > There are two normal ways for internal Python text to have \r\n: > > 1. Read from a file with \r\r\n. Then \r\r\n is correct output (on the > > same platform). > > 2. Intentially put there by a programmer. If s/he also chooses default \n > > translation on output, \r is correct. > > > Actually, I usually get these strings from Windows UI components. A file > containing '\r\n' is read in with '\r\n' being translated to '\n'. 
New > user input is added containing '\r\n' line endings. The file is written > out and now contains a mix of '\r\n' and '\r\r\n'. Out of curiosity, why don't the Python wrappers for your Windows UI components do the appropriate '\r\n' -> '\n' conversions? STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy From fuzzyman at voidspace.org.uk Sat Sep 29 20:47:20 2007 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sat, 29 Sep 2007 19:47:20 +0100 Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows In-Reply-To: References: <46FE6F92.40601@voidspace.org.uk> <46FE9B09.8000800@voidspace.org.uk> Message-ID: <46FE9DB8.9000604@voidspace.org.uk> Steven Bethard wrote: > On 9/29/07, Michael Foord wrote: > >> Terry Reedy wrote: >> >>> There are two normal ways for internal Python text to have \r\n: >>> 1. Read from a file with \r\r\n. Then \r\r\n is correct output (on the >>> same platform). >>> 2. Intentially put there by a programmer. If s/he also chooses default \n >>> translation on output, \r is correct. >>> >>> >> Actually, I usually get these strings from Windows UI components. A file >> containing '\r\n' is read in with '\r\n' being translated to '\n'. New >> user input is added containing '\r\n' line endings. The file is written >> out and now contains a mix of '\r\n' and '\r\r\n'. >> > > Out of curiosity, why don't the Python wrappers for your Windows UI > components do the appropriate '\r\n' -> '\n' conversions? > One of the great things about IronPython is that you don't *need* any wrappers - you access .NET objects natively (which in fact wrap the lower level win32 API) - and the .NET APIs are usually not as bad as you probably assume. ;-) You just have to be aware that line endings are '\r\n'. I'm not sure how or if pywin32 handles this. 
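The mixed '\r\n' and '\r\r\n' output described in this thread can be reproduced in a few lines. This is only a sketch: it assumes the Python 3 io.TextIOWrapper with newline='\r\n' standing in for a Windows text-mode file, and the strings from_file and from_ui are invented stand-ins for file contents and UI input.

```python
import io

def windows_text_bytes(text):
    """Return the bytes a text-mode write with '\\r\\n' translation would produce."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # leave raw open; we only wanted its contents
    return data

from_file = "old line\n"    # read earlier in text mode, so '\r\n' became '\n'
from_ui = "new line\r\n"    # hypothetical text from a UI component

# Writing the concatenation as-is doubles the '\r' on the UI-supplied line.
mixed = windows_text_bytes(from_file + from_ui)

# Normalising the UI text to '\n' before writing gives uniform line endings.
clean = windows_text_bytes(from_file + from_ui.replace("\r\n", "\n"))
```

The only difference between the two calls is the replace() on the UI text, which is the ad hoc fix the thread goes on to discuss.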
Michael > STeVe > From nmm1 at cus.cam.ac.uk Sat Sep 29 20:48:20 2007 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Sat, 29 Sep 2007 19:48:20 +0100 Subject: [Python-Dev] New lines, carriage returns, and Windows Message-ID: "Guido van Rossum" wrote: > > Have you looked at Py3k at all, especially PEP 3116 (new I/O)? No. > Python *does* have its own I/O model. There are binary files and text > files. For binary files, you write bytes and the semantic model is > that of an array of bytes; byte indices are seek positions. That is the same model as C and Unix. It is text files that we are discussing. > For text files, the contents is considered to be Unicode, encoded as > bytes in a binary file. So text file always has an underlying binary > file. Two translations take place, both of which have defaults varying > by platform. One translation is encoding Unicode text into bytes upon > output, and decoding bytes to Unicode text upon input. This can use > any encoding supported by the encodings package. The character code isn't the issue here, and is almost completely irrelevant. > The other translation deals with line endings. Upon input, any of > \r\n, \r, or \n is translated to a single \n by default (this is nhe > "universal newlines" algorithm from Python 2.x). This can be tweaked > or disabled. Upon output, \n is translated into a platform specific > string chosen from \r\n, \r, or \n. This can also be disabled or > overridden. Note that \r, when written, is never treated specially; if > you want special processing for \r on output, you can write your own > translation layer. Grrk. That's the problem. You don't get back what you have written, for a start, which isn't nice. There are other issues, too. > That's all. There is nothing unimplementable or confusing in these > specifications. Nothing unimplementable, I agree. Nothing confusing? Not in the experience of the users I have dealt with. 
> Python doesn't care about record I/O on legacy OSes; it does care > about variability found in practice between popular OSes. As a short-term solution, that is fine. But I have seen the wheel turn a couple of times in 40 years, and expect it to continue after I am safely 6' under .... > Note that \r, \n and friends in Python 3000 are either ASCII (in bytes > literals) or Unicode (in text literals). Again, no support for legacy > systems that don't use ASCII or a superset. That's not a problem. I don't see that changing in the foreseeable future. > Legacy OSes are called that for a reason. Well, I remember when the text I/O model that C, Unix and Python use WAS a feature of legacy OSs :-) Seriously. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From steven.bethard at gmail.com Sat Sep 29 20:59:28 2007 From: steven.bethard at gmail.com (Steven Bethard) Date: Sat, 29 Sep 2007 12:59:28 -0600 Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows In-Reply-To: <46FE9DB8.9000604@voidspace.org.uk> References: <46FE6F92.40601@voidspace.org.uk> <46FE9B09.8000800@voidspace.org.uk> <46FE9DB8.9000604@voidspace.org.uk> Message-ID: On 9/29/07, Michael Foord wrote: > Steven Bethard wrote: > > On 9/29/07, Michael Foord wrote: > > > >> Terry Reedy wrote: > >> > >>> There are two normal ways for internal Python text to have \r\n: > >>> 1. Read from a file with \r\r\n. Then \r\r\n is correct output (on the > >>> same platform). > >>> 2. Intentially put there by a programmer. If s/he also chooses default \n > >>> translation on output, \r is correct. > >>> > >>> > >> Actually, I usually get these strings from Windows UI components. A file > >> containing '\r\n' is read in with '\r\n' being translated to '\n'. New > >> user input is added containing '\r\n' line endings.
The file is written > >> out and now contains a mix of '\r\n' and '\r\r\n'. > > > > Out of curiosity, why don't the Python wrappers for your Windows UI > > components do the appropriate '\r\n' -> '\n' conversions? > > One of the great things about IronPython is that you don't *need* any > wrappers - you access .NET objects natively (which in fact wrap the > lower level win32 API) - and the .NET APIs are usually not as bad as you > probably assume. ;-) > > You just have to be aware that line endings are '\r\n'. Ahh, I see. So all the .NET components function like Python 3.0's io.open(..., newline='\n'), where no translation of \n (to or from \r\n) is performed. STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy From fuzzyman at voidspace.org.uk Sat Sep 29 21:19:24 2007 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sat, 29 Sep 2007 20:19:24 +0100 Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows In-Reply-To: References: <46FE6F92.40601@voidspace.org.uk> <46FE9B09.8000800@voidspace.org.uk> <46FE9DB8.9000604@voidspace.org.uk> Message-ID: <46FEA53C.4070406@voidspace.org.uk> Steven Bethard wrote: > On 9/29/07, Michael Foord wrote: > >> Steven Bethard wrote: >> >>> On 9/29/07, Michael Foord wrote: >>> >>> >>>> Terry Reedy wrote: >>>> >>>> >>>>> There are two normal ways for internal Python text to have \r\n: >>>>> 1. Read from a file with \r\r\n. Then \r\r\n is correct output (on the >>>>> same platform). >>>>> 2. Intentially put there by a programmer. If s/he also chooses default \n >>>>> translation on output, \r is correct. >>>>> >>>>> >>>>> >>>> Actually, I usually get these strings from Windows UI components. A file >>>> containing '\r\n' is read in with '\r\n' being translated to '\n'. New >>>> user input is added containing '\r\n' line endings. The file is written >>>> out and now contains a mix of '\r\n' and '\r\r\n'. 
>>>> >>> Out of curiosity, why don't the Python wrappers for your Windows UI >>> components do the appropriate '\r\n' -> '\n' conversions? >>> >> One of the great things about IronPython is that you don't *need* any >> wrappers - you access .NET objects natively (which in fact wrap the >> lower level win32 API) - and the .NET APIs are usually not as bad as you >> probably assume. ;-) >> >> You just have to be aware that line endings are '\r\n'. >> > > Ahh, I see. So all the .NET components function like Python 3.0's > io.open(..., newline='\n'), where no translation of \n (to or from > \r\n) is performed. > Effectively yes. Although for Python compatibility, opening a file in text mode using the python 'open' or 'file' will behave in the usual way. Michael > STeVe > From p.f.moore at gmail.com Sat Sep 29 21:47:35 2007 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 29 Sep 2007 20:47:35 +0100 Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows In-Reply-To: <46FEA53C.4070406@voidspace.org.uk> References: <46FE6F92.40601@voidspace.org.uk> <46FE9B09.8000800@voidspace.org.uk> <46FE9DB8.9000604@voidspace.org.uk> <46FEA53C.4070406@voidspace.org.uk> Message-ID: <79990c6b0709291247i45cc37cdl9a0b56b29053bbe6@mail.gmail.com> >>> Actually, I usually get these strings from Windows UI components. A file >>> containing '\r\n' is read in with '\r\n' being translated to '\n'. New >>> user input is added containing '\r\n' line endings. The file is written >>> out and now contains a mix of '\r\n' and '\r\r\n'. >>> >> Out of curiosity, why don't the Python wrappers for your Windows UI >> components do the appropriate '\r\n' -> '\n' conversions? >> > One of the great things about IronPython is that you don't *need* any > wrappers - you access .NET objects natively (which in fact wrap the > lower level win32 API) - and the .NET APIs are usually not as bad as you > probably assume. 
;-)

Given the current lengthy discussion about newline translation, maybe
it isn't such a great thing :-)

Seriously, you do need a wrapper in this particular case - to convert
the .NET line ending convention to Python's. The issue here is that
such a wrapper is so trivial that it's usually easier to simply do the
translation with ad hoc .replace('\r\n', '\n') calls.

The problem comes when you accidentally forget a translation - then
you get the clash between the .NET (\r\n) and Python (\n) models. But
of course, the solution in that case is to simply add the omitted
translation, not to change Python's IO model.

Of course, all this grand theory is just that - theory. In my case, it
helped me understand what's going on, but that's all. For real life
code, you just add the appropriate replace() calls. Whether theory
helps you keep track of where replace() is needed, or whether you just
know, doesn't really matter much.

But regardless - the Python IO model doesn't need changing. (Not even
2.x, and the py3k model is even better in this regard).

Paul.

From tjreedy at udel.edu Sat Sep 29 23:53:49 2007
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 29 Sep 2007 17:53:49 -0400
Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
References: <46FE6F92.40601@voidspace.org.uk> <46FE9B09.8000800@voidspace.org.uk>
Message-ID:

"Michael Foord" wrote in message
news:46FE9B09.8000800 at voidspace.org.uk...
| Terry Reedy wrote:
| > There are two normal ways for internal Python text to have \r\n:
| > 1. Read from a file with \r\r\n. Then \r\r\n is correct output (on the
| > same platform).
| > 2. Intentially put there by a programmer. If s/he also chooses default \n
| > translation on output, \r is correct.
| >
| Actually, I usually get these strings from Windows UI components. A file
| containing '\r\n' is read in with '\r\n' being translated to '\n'. New
| user input is added containing '\r\n' line endings.
| The file is written out and now contains a mix of '\r\n' and '\r\r\n'.

I covered this in the part you snipped:

"2. Other special situations, which can be handled by disabling,
overriding, and layering the defaults. This seems enough flexibility
to me."

While mixing input like this may seem 'normal' to you, I believe it is
'special' considering the total Python community. I can think of at
least 4 decent solutions, depending on the details of the input and
what you do with it.

tjr

From greg.ewing at canterbury.ac.nz Sun Sep 30 01:46:19 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 30 Sep 2007 11:46:19 +1200
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To:
References:
Message-ID: <46FEE3CB.1010002@canterbury.ac.nz>

On 9/29/07, Nick Maclaren wrote:
> Now, BCPL was an ancestor of C, but always was a more portable
> language (i.e. it didn't start with a specific operating system in
> mind), and used/uses a rather better model. In this, line separators
> are atomic - e.g. '\f' is newline-with-form-feed and '\r' is
> "newline-with-overprinting".

I don't see how this is different from Unix/C "\n" being an atomic
newline character.

If you're saying that BCPL is better because it defines standard
semantics for more control characters than just "\n", that may be
true, but C is doing about the best it can with "\n" as far as I can
see, given all the crazy things that different OSes want to do with
line endings.

In any case, the problem which started all this isn't really an I/O
problem at all, it's a mismatch between the world of Python strings
which use "\n" and .NET library code expecting strings which use
"\r\n".

The correct thing to do with that is to translate whenever a string
crosses a boundary between Python code and .NET code. This is
something that ought to be done automatically by the Python/.NET
interfacing machinery, maybe by having a different type for .NET
strings.
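
A minimal sketch of the boundary translation described above, assuming
a current Python 3 in which open() accepts a newline= argument. The
helper names from_dotnet, to_dotnet and write_as_windows_text are
hypothetical, and newline="\r\n" is used only to reproduce Windows
text-mode output on any platform. It shows how untranslated UI text
ends up as '\r\r\n' on disk, and how translating at the boundary
avoids that.

```python
import os
import tempfile


def from_dotnet(text):
    # Hypothetical helper: turn .NET-style "\r\n" line endings into "\n".
    return text.replace("\r\n", "\n")


def to_dotnet(text):
    # Hypothetical helper: normalise first so an existing "\r\n" is not doubled.
    return from_dotnet(text).replace("\n", "\r\n")


def write_as_windows_text(text):
    # Write in text mode with "\n" -> "\r\n" translation, return the raw bytes.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # newline="\r\n" mimics Windows text-mode output on any platform.
        with open(path, "w", newline="\r\n") as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


file_text = "first line\n"      # text already read in by Python
ui_input = "second line\r\n"    # made-up text from a .NET UI component

# Forgetting the translation reproduces the reported mix of line endings.
raw_unfixed = write_as_windows_text(file_text + ui_input)

# Translating when the string crosses the boundary keeps the file consistent.
raw_fixed = write_as_windows_text(file_text + from_dotnet(ui_input))

print(repr(raw_unfixed))
print(repr(raw_fixed))
```

Whether such calls are made by hand or hidden inside the interfacing
layer is a design choice; the sketch only illustrates where the
translation has to happen.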
-- Greg

From greg.ewing at canterbury.ac.nz Sun Sep 30 02:22:19 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 30 Sep 2007 12:22:19 +1200
Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
In-Reply-To: <46FE9DB8.9000604@voidspace.org.uk>
References: <46FE6F92.40601@voidspace.org.uk> <46FE9B09.8000800@voidspace.org.uk> <46FE9DB8.9000604@voidspace.org.uk>
Message-ID: <46FEEC3B.90006@canterbury.ac.nz>

Michael Foord wrote:
> One of the great things about IronPython is that you don't *need* any
> wrappers - you access .NET objects natively

But it seems that you really *do* need wrappers to deal with the line
endings problem, whether they're provided automatically or you do it
yourself manually.

This is reminiscent of the C-string vs. Pascal-string fiasco when
Apple switched from Pascal to C as their main application programming
language. Some development environments provided glue code that did
the translation automatically; others required you to do it yourself,
which was a huge nuisance.

-- Greg

From greg.ewing at canterbury.ac.nz Sun Sep 30 02:30:45 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 30 Sep 2007 12:30:45 +1200
Subject: [Python-Dev] New lines, carriage returns, and Windows
In-Reply-To:
References:
Message-ID: <46FEEE35.7060007@canterbury.ac.nz>

Nick Maclaren wrote:
> Grrk. That's the problem. You don't get back what you have written

You do as long as you *don't* use universal newlines mode for reading.
This is the best that can be done, because universal newlines are
inherently ambiguous.

If you want universal newlines, you just have to accept that you can't
also have \r characters meaning something other than newlines in your
files. This is true regardless of what programming language or I/O
model is being used.
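
The ambiguity can be seen in a small sketch, assuming Python 3's
io.TextIOWrapper and made-up sample bytes; read_text and write_text
are hypothetical helpers. With newline=None (universal newlines) a
bare '\r' is read as a line ending, so the original bytes cannot be
recovered afterwards; with newline="" nothing is translated and the
data round-trips.

```python
import io

# Made-up data containing a bare "\r", a "\r\n" and a plain "\n".
data = b"10%\r20%\r\ndone\n"


def read_text(raw_bytes, newline):
    # Decode bytes as text using the given newline mode.
    wrapper = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="ascii",
                               newline=newline)
    return wrapper.read()


def write_text(text, newline):
    # Encode text to bytes using the given newline mode.
    buf = io.BytesIO()
    wrapper = io.TextIOWrapper(buf, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    result = buf.getvalue()
    wrapper.detach()  # keep the wrapper from closing buf on cleanup
    return result


# Universal newlines: "\r", "\r\n" and "\n" all become "\n" on input.
universal = read_text(data, None)

# No translation: the text is exactly what was stored.
untranslated = read_text(data, "")

print(repr(universal))
print(repr(untranslated))

# Only the untranslated text writes back to the original bytes.
print(write_text(untranslated, "") == data)
print(write_text(universal, "") == data)
```

Once the three endings have been folded into '\n', no output mode can
tell which one each line originally had, which is the sense in which
universal newlines are ambiguous.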
-- Greg

From nmm1 at cus.cam.ac.uk Sun Sep 30 11:34:58 2007
From: nmm1 at cus.cam.ac.uk (Nick Maclaren)
Date: Sun, 30 Sep 2007 10:34:58 +0100
Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
Message-ID:

Greg Ewing wrote:
>
> > Grrk. That's the problem. You don't get back what you have written
>
> You do as long as you *don't* use universal newlines mode
> for reading. This is the best that can be done, because
> universal newlines are inherently ambiguous.

I don't know PRECISELY what you mean by "universal newlines mode", and
this issue is all about the details, so any response would merely
enhance the confusion.

> If you want universal newlines, you just have to accept
> that you can't also have \r characters meaning something
> other than newlines in your files. This is true regardless
> of what programming language or I/O model is being used.

No, that is not true, and I have used more than one model where it
wasn't. Let's stick to models where newlines are special characters -
I prefer the ones where they are not, but that is by the way.

Model 1: certain characters can be used only in combination. E.g. \f
must occur immediately before (or after) a \n, which it modifies. \r
is either a newline-with-overprint or must be associated with a \n.
In both cases, only ONE of the alternatives is permitted in the chosen
model - the other use then becomes an error (and raises an exception).

Model 2: (BCPL) there are a variety of newline characters, \n for
plain newline, \f for newline-with-form-feed and \r for
newline-with-overprint. ALL cause a newline, with the associated
property.

Note that the above is what the program sees - what is written to the
outside world and how input is read is another matter. But I can
assure you, from my own and many other people's experience, that
neither of the above models causes the confusion being shown by the
postings in this thread.
Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From nmm1 at cus.cam.ac.uk Sun Sep 30 11:49:56 2007 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Sun, 30 Sep 2007 10:49:56 +0100 Subject: [Python-Dev] New lines, carriage returns, and Windows Message-ID: Greg Ewing wrote: > > I don't see how this is different from Unix/C "\n" being > an atomic newline character. Have you used systems with the I/O models I referred to (or ones with newlines being out-of-bound data)? > If you're saying that BCPL is better because it defines > standard semantics for more control characters than just > "\n", that may be true, but C is doing about the best it > can with "\n" as far as I can see, given all the crazy > things that different OSes want to do with line endings. I am afraid that you are wrong - see my other posting for how to do it better. Look, I have implemented both of those two models on systems that are FAR more different than most people can imagine. Both work, and neither causes confusion. The C/Unix/Python one does. > In any case, the problem which started all this isn't > really an I/O problem at all, it's a mismatch between > the world of Python strings which use "\n" and .NET > library code expecting strings which use "\r\n". That's an I/O problem :-) > The correct thing to do with that is to translate whenever > a string crosses a boundary between Python code and > .NET code. This is something that ought to be done > automatically by the Python/.NET interfacing machinery, > maybe by having a different type for .NET strings. Agreed. But the REASON it causes trouble is the inconsistency in the basic C/Unix/Python text I/O model. Let's consider just \f, \r and \n, and a few questions: Exactly what does a free-standing \f mean? Does \n\f\n mean starting at the top of a page or one line down? 
How do \r and \f interact with line-buffering? Think about MacOS here. I could go on, but those are enough to indicate that the problem is insoluble. The answer "Undefined but not even explicitly discouraged" is a recipe for confusion. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From skip at pobox.com Sun Sep 30 15:28:28 2007 From: skip at pobox.com (skip at pobox.com) Date: Sun, 30 Sep 2007 08:28:28 -0500 Subject: [Python-Dev] New lines, carriage returns, and Windows In-Reply-To: References: <7AD436E4270DD54A94238001769C2227CCBD18CB33@DF-GRTDANE-MSG.exchange.corp.microsoft.com> <46FB0F9D.6010303@canterbury.ac.nz> <18171.6511.526695.684154@montanaro.dyndns.org> Message-ID: <18175.42108.40732.660470@montanaro.dyndns.org> Greg> Maybe there should be a universal newlines mode defined for output Greg> as well as input, which translates any of "\r", "\n" or "\r\n" Greg> into the platform line ending. Skip> I'd be open to such a change. Principle of least surprise? Guido> The symmetry isn't as strong as you suggest, but I agree it would Guido> be a useful feature. Would you mind filing a Py3k feature request Guido> so we don't forget? Guido> A proposal for an API given the existing newlines=... parameter Guido> (described in detail in PEP 3116) would be even better. I've been thinking about this some more (in lieu of actually writing up any sort of proposal ;-) and I'm not so sure it would be all that useful. If you've opened a file in text mode you should only be writing newlines as '\n' anyway. If you want to translate a text file imported from another system to use the current system's line ending just open both the input and output files in text mode. With universal newlines mode for output, should writing '\r\n' result in one or two newlines (or one-and-a-half)? 
Depending on the platform you can argue that it should write out '\r\r', '\r\n\r\n' or '\n\n' or if on Windows that it should be left alone as '\r\n'. There is, of course, the current '\r\r\n' behavior as well. I don't think there's obviously one best answer. If you want to do something esoteric, open the file in binary mode and do whatever you like. Skip From nmm1 at cus.cam.ac.uk Sun Sep 30 15:38:12 2007 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Sun, 30 Sep 2007 14:38:12 +0100 Subject: [Python-Dev] New lines, carriage returns, and Windows Message-ID: skip at pobox.com wrote: > > I've been thinking about this some more (in lieu of actually writing up any > sort of proposal ;-) and I'm not so sure it would be all that useful. If > you've opened a file in text mode you should only be writing newlines as > '\n' anyway. If you want to translate a text file imported from another > system to use the current system's line ending just open both the input and > output files in text mode. I.e. at least \r, \f and \v are discouraged - i.e. system-dependent, at best. That works. > With universal newlines mode for output, should writing '\r\n' result in one > or two newlines (or one-and-a-half)? Depending on the platform you can > argue that it should write out '\r\r', '\r\n\r\n' or '\n\n' or if on Windows > that it should be left alone as '\r\n'. There is, of course, the current > '\r\r\n' behavior as well. I don't think there's obviously one best answer. Quite. And it has nothing to do with the format the outside system uses - your first question is purely a matter of what the semantics of the Python program are. The question applies as much to zOS as to any of the systems Python supports. > If you want to do something esoteric, open the file in binary mode and do > whatever you like. Er, no. That's the Unix mistake. It works, provided two things are true: 1) You don't need to write portable formatting. 
2) The 'outside system' uses the control characters of a byte stream for formatting. Let's skip (1) - but (2) is universally true, nowadays, isn't it? Er, no. Consider reading and writing to an X window (NOT an xterm). Such formatting is out-of-band (sorry, I used out-of-bound in a previous posting). Ouch. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679 From skip at pobox.com Sun Sep 30 15:39:42 2007 From: skip at pobox.com (skip at pobox.com) Date: Sun, 30 Sep 2007 08:39:42 -0500 Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows In-Reply-To: <46FE9B09.8000800@voidspace.org.uk> References: <46FE6F92.40601@voidspace.org.uk> <46FE9B09.8000800@voidspace.org.uk> Message-ID: <18175.42782.873060.154910@montanaro.dyndns.org> Michael> Actually, I usually get these strings from Windows UI Michael> components. A file containing '\r\n' is read in with '\r\n' Michael> being translated to '\n'. New user input is added containing Michael> '\r\n' line endings. The file is written out and now contains a Michael> mix of '\r\n' and '\r\r\n'. So you need a translation layer between the UI component and your code. Treat the component as a text file and perform the desired mapping. Yes? Skip From fuzzyman at voidspace.org.uk Sun Sep 30 15:49:56 2007 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sun, 30 Sep 2007 14:49:56 +0100 Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows In-Reply-To: <18175.42782.873060.154910@montanaro.dyndns.org> References: <46FE6F92.40601@voidspace.org.uk> <46FE9B09.8000800@voidspace.org.uk> <18175.42782.873060.154910@montanaro.dyndns.org> Message-ID: <46FFA984.9060602@voidspace.org.uk> skip at pobox.com wrote: > Michael> Actually, I usually get these strings from Windows UI > Michael> components. 
A file containing '\r\n' is read in with '\r\n' > Michael> being translated to '\n'. New user input is added containing > Michael> '\r\n' line endings. The file is written out and now contains a > Michael> mix of '\r\n' and '\r\r\n'. > > So you need a translation layer between the UI component and your code. > Treat the component as a text file and perform the desired mapping. Yes? > > Actually the problem was reported by one of the IronPython developers on behalf of another user. We stick to using the .NET file I/O and so don't have a problem. The only time it is an issue for us is our tests, where we have string literals in our test code (where new lines are obviously '\n') and we do a manual 'replace'. Not very difficult. It is just slightly ironic that the time Python 'gets it wrong' (for some value of wrong) is when you are using text mode for I/O :-) Michael > Skip > > From nmm1 at cus.cam.ac.uk Sun Sep 30 16:12:00 2007 From: nmm1 at cus.cam.ac.uk (Nick Maclaren) Date: Sun, 30 Sep 2007 15:12:00 +0100 Subject: [Python-Dev] [python] Re: New lines, carriage returns, and Windows Message-ID: Michael Foord wrote: > skip at pobox.com wrote: > > > Michael> Actually, I usually get these strings from Windows UI > > Michael> components. A file containing '\r\n' is read in with '\r\n' > > Michael> being translated to '\n'. New user input is added containing > > Michael> '\r\n' line endings. The file is written out and now contains a > > Michael> mix of '\r\n' and '\r\r\n'. > > > > So you need a translation layer between the UI component and your code. > > Treat the component as a text file and perform the desired mapping. Yes? > > Actually the problem was reported by one of the IronPython developers on > behalf of another user. We stick to using the .NET file I/O and so don't > have a problem. The only time it is an issue for us is our tests, where > we have string literals in our test code (where new lines are obviously > '\n') and we do a manual 'replace'. 
Not very difficult. > > It is just slightly ironic that the time Python 'gets it wrong' (for > some value of wrong) is when you are using text mode for I/O :-) Plus ca change, .... That has been the problem for as long as I have been using the byte stream model (nearly 40 years now). Provided that you can get control, OR there are well-defined semantics, you can sort things out. The semantics "we define only the trivial case, and the programmer must do something arcane, undefined and system-dependent for the rest" means that it is impossible for an interface to do the 'right' translation unless it knows what each side of it is assuming. As I say, there are solutions. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: nmm1 at cam.ac.uk Tel.: +44 1223 334761 Fax: +44 1223 334679