From scott+python-dev at scottdial.com Mon Nov 1 00:08:00 2010
From: scott+python-dev at scottdial.com (Scott Dial)
Date: Sun, 31 Oct 2010 19:08:00 -0400
Subject: [Python-Dev] [Python-checkins] r85934 - in python/branches/py3k: Misc/NEWS Modules/socketmodule.c
In-Reply-To: <4CCC7B59.2030408@v.loewis.de>
References: <20101029182008.DFE35EEBE1@mail.python.org> <4CCC3034.20008@m2.ccsnet.ne.jp> <4CCC7B59.2030408@v.loewis.de>
Message-ID: <4CCDF6D0.3020409@scottdial.com>

On 10/30/2010 4:08 PM, Martin v. Löwis wrote:
>> I think size should be in TCHARs, not in bytes. (MSDN says so)
>> And GetComputerName's signature differs from MSDN. (Maybe should
>> use GetComputerNameExW again?)
>
> You are right. So how about this patch?

Still not quite right. The call to GetComputerNameExW after
ERROR_MORE_DATA (which gives the number of *bytes* needed) still needs
to pass "size/sizeof(wchar_t)" back into GetComputerNameExW since it
wants the number of TCHARs. I don't think the +1 is needed either (MSDN
says it already included the null-terminator in the byte count).

--
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu

From benjamin at python.org Mon Nov 1 00:20:07 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 31 Oct 2010 18:20:07 -0500
Subject: [Python-Dev] str.format_from_mapping
In-Reply-To: <4CCDF3A0.4020107@g.nevcal.com>
References: <4CCDD410.6020104@trueblade.com> <20101031215551.3d4c2aec@pitrou.net> <4CCDED7A.6080706@g.nevcal.com> <4CCDEE6C.1010901@trueblade.com> <4CCDF3A0.4020107@g.nevcal.com>
Message-ID:

2010/10/31 Glenn Linderman :
> On 10/31/2010 3:32 PM, Eric Smith wrote:
>
> On 10/31/2010 6:28 PM, Glenn Linderman wrote:
>
> On 10/31/2010 2:02 PM, Benjamin Peterson wrote:
>
> 2010/10/31 Antoine Pitrou:
>
>> On Sun, 31 Oct 2010 16:39:44 -0400
>> Eric Smith wrote:
>>
>>> What are your thoughts on adding a str.format_from_mapping (or similar
>>> name, maybe the suggested "format_map") to 3.2? See
>>> http://bugs.python.org/issue6081 .
>>> This method would be similar to
>>> "%(foo)s %(bar)s" % d, where d is a dict (or rather any mapping object),
>>> but of course would use str.format syntax: "{foo}
>>> {bar}".format_from_mapping(d).
>
>>
>> I must be missing something, but what's the difference with
>> XXX.format(**d)?
>
> It allows arbitrary mappings.
>
> Other than the language moratorium, why are arbitrary mappings not
> allowed for the (**d) syntax?
>
> An arbitrary mapping would be converted to a dict.
>
> Yes, but why convert?

Because callees always get a dictionary *copy* of the arguments.

--
Regards,
Benjamin

From barry at python.org Mon Nov 1 01:45:21 2010
From: barry at python.org (Barry Warsaw)
Date: Sun, 31 Oct 2010 20:45:21 -0400
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To:
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <20101031121150.3dee7990@mission>
Message-ID: <20101031204521.263769bf@mission>

On Oct 31, 2010, at 09:54 AM, Gregory P. Smith wrote:

>> - moving the documentation to an "advanced" or "complete reference" section
>>
>
>Agreed, I prefer simply deemphasizing these methods by reorganizing the
>documentation and mentioning in their documentation to, "just use
>assertEqual." De-documenting them is the first step towards causing
>unnecessary pain by taking either of the next two steps:
>
>- make the methods non-public by prepending an underscore
>> - leaving them public but adding deprecation warnings to the code
>>
>
>Please do not make any existing released methods from the unittest module
>non-public or add any deprecation warnings. That will simply cause
>unnecessary code churn and pain for people porting their code from one
>version to the next without benefiting anyone.

I was hoping someone would get my not-too-subtle hint. :)

-Barry

From barry at python.org Mon Nov 1 01:53:05 2010
From: barry at python.org (Barry Warsaw)
Date: Sun, 31 Oct 2010 20:53:05 -0400
Subject: [Python-Dev] str.format_from_mapping
In-Reply-To: <4CCDD410.6020104@trueblade.com>
References: <4CCDD410.6020104@trueblade.com>
Message-ID: <20101031205305.012eb563@snowdog>

On Oct 31, 2010, at 04:39 PM, Eric Smith wrote:

>What are your thoughts on adding a str.format_from_mapping (or similar
>name, maybe the suggested "format_map") to 3.2?

+1 for the shorter name.

-Barry

From kristjan at ccpgames.com Mon Nov 1 03:32:01 2010
From: kristjan at ccpgames.com (Kristján Valur Jónsson)
Date: Mon, 1 Nov 2010 10:32:01 +0800
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <4CCA9EA8.5060300@v.loewis.de>
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB17@exchcn.ccp.ad.local> <201010270933.18420.eckhardt@satorlaser.com> <4CCA9EA8.5060300@v.loewis.de>
Message-ID: <2E034B571A5CE44E949B9FCC3B6D24EE5761FE42@exchcn.ccp.ad.local>

You just moved your copying down one level into stream.read(). This
magic function must be implemented by possibly concatenating several
"socket.recv()" calls.
This invariably involves data copying, either by "".join() or
stringio.write()
K

-----Original Message-----
From: python-dev-bounces+kristjan=ccpgames.com at python.org [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] On Behalf Of "Martin v.
Löwis"
Sent: Friday, October 29, 2010 18:15
To: python-dev at python.org
Subject: Re: [Python-Dev] new buffer in python2.7

That is easy to achieve using the existing API:

def read_and_unpack(stream, format):
    data = stream.read(struct.calcsize(format))
    return struct.unpack(format, data)

> Otherwise, I'm +1 on your suggestion, avoiding copying is a good thing.

I believe my function also doesn't involve any unnecessary copies.

From fuzzyman at voidspace.org.uk Mon Nov 1 03:55:35 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 01 Nov 2010 02:55:35 +0000
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To:
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk>
Message-ID: <4CCE2C27.5000706@voidspace.org.uk>

On 30/10/2010 06:56, Raymond Hettinger wrote:
>
> On Oct 29, 2010, at 9:11 PM, Michael Foord wrote:
>>> Just to clarify. The following fails in Python 3:
>>>
>>> sorted([3, 1, 2, None])
>>>
>>> If you want to compare that two iterables containing heterogeneous
>>> types have the same members then it is tricky to implement correctly
>>> and assertItemsEqual does it for you.
>>>
>>> I agree that the name is not ideal and would be happy to change the
>>> name (deprecating the old name as it was released in 2.7). API churn
>>> is as bad as API bloat, but at least changing the name is something
>>> only done once.
>>
>> Sorry for the noise. Suggested alternative name:
>>
>> assertElementsEqual
>>
>> The docs need updating to make it clear that the method isn't just a
>> synonym for assertEqual(sorted(iter1), sorted(iter2)) and that it
>> works with unorderable types.
>
> I looked at this again and think we should just remove
> assertItemsEqual() from Py3.2 and dedocument it in Py2.7. It is listed
> as being new in 3.2 so nothing is lost.

As it has been released in 2.7 (and in unittest2 for earlier versions of
Python) removing it would add another pain point for those porting from
Python 2 to 3. From a backwards compatibility point of view this method
has been released (it is only new in 3.2 for the Python 3 series).

Note that for this issue plus the other cleanup related topics we have
been discussing Raymond has created issue 10273:

    http://bugs.python.org/issue10273

>
> A new name like assertElementsEqual is an improvement because it
> doesn't suggest something like assertEqual(d.items(), d.items()), but
> it falls short in describing its key features:
>
> * the method doesn't care about order

Something that implied order would be good but we shouldn't let the
perfect be the enemy of the good.

> * it does care about duplicates

Both the old name and the new one imply that it does care about
duplicates (to me at least).

> * it doesn't need hashability
> * it can handle sequences of non-comparable types

The name doesn't imply that it needs hashability or comparable types
either (although the latter needs to be documented as the current
documentation could be read as saying that comparable types are needed).

The name doesn't need to include all its *non-requirements*, it just
needs to describe what it does.

>
> Also, I think the O(n**2) behavior is unexpected.

I agree that this should be fixed.

> There is a O(n log n) fast-path but it has a bug and needs to be
> removed. See issue 10242.
>

Having a more efficient 'slow-path' and moving to that by default would
fix it. The bug is only a duplicate of the bug in sorted - caused by the
fact that sets / frozensets can't be sorted in the standard Python way
(their less than comparison adheres to the set definition).
This is something that will probably surprise many Python developers:

 >>> a = [{2,4}, {1,2}]
 >>> b = a[::-1]
 >>> sorted(a)
[set([2, 4]), set([1, 2])]
 >>> sorted(b)
[set([1, 2]), set([2, 4])]

(Fixing the bug in sorted would fix assertItemsEqual ;-)

As I stated in my previous email, the functionality is still useful. Add
on the fact that this has already been released I'm -1 on removing, +1
on fixing O(n**2) behaviour and +0 on an alternative name.

> The sole benefit over the more explicit variants like
> assertEqual(set(a), set(b)) and assertEqual(sorted(a), sorted(b)) is
> that it handles a somewhat rare corner case where neither of those
> work (unordered comparison of non-comparable types when you do care
> about duplicates). That particular case doesn't come-up much and isn't
> suggested by either the current name or its proposed replacement.
>

I have test suites littered with self.assertEqual(sorted(expected),
sorted(actual)) - anywhere I care about the contents of a sequence but
not about the order it is generated in (perhaps created by iteration
over a set or dictionary).

It is not uncommon for these lists to contain None which makes them
un-sortable in Python 3. Decorating the members with something that
allows a stable sort would fix that - and that is one possible fix for
the efficiency issue. It would probably propagate the issue that sets /
frozensets don't work with sorted.

> FWIW, I checked-out some other unittest suites in other languages and
> did not find an equivalent. That strongly suggests this is YAGNI and
> it shouldn't be added in Py3.2. There needs to be more evidence of
> need before putting this in. And if it goes in, it needs a really good
> name that tells what operations are hidden behind the abstraction.
> When reading test assertion, it is vital that the reader understand
> exactly what is being tested. It's an API fail if a reader guesses
> that assertElementsEqual(a,b) means list(a)==list(b); the test will
> pass unintentionally.
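
The behaviour being argued over (order ignored, duplicates counted, no hashing or ordering required) can be sketched in a few lines. This is only an illustration, not the unittest implementation: `same_elements` is a hypothetical helper name, and it deliberately uses the naive O(n**2) strategy that the thread complains about, relying on nothing but `==`.

```python
def same_elements(expected, actual):
    """Return True if both iterables hold the same items, ignoring order.

    Duplicates matter, and items only need to support ==, so unhashable
    and unorderable values are fine. Cost is O(n**2) comparisons.
    """
    remaining = list(actual)
    for item in expected:
        try:
            # list.remove() finds the first element equal to item using ==
            remaining.remove(item)
        except ValueError:
            return False          # item is missing from actual
    return not remaining          # leftovers mean actual had extra items


# Mixed types that sorted() refuses to order in Python 3
print(same_elements([3, 1, 2, None], [None, 2, 1, 3]))
# Same members, different multiplicities
print(same_elements([1, 1, 2], [1, 2, 2]))
```

By contrast, `list(a) == list(b)` is order-sensitive, and `set(a) == set(b)` silently discards duplicates and requires hashable items.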

I agree very much that asserts need to be readable. I think
assertSameElements is "good enough" on this score though.

All the best,

Michael

>
> See:
> http://www.phpunit.de/manual/3.4/en/api.html
> http://kentbeck.github.com/junit/javadoc/latest/
>
>
> Raymond
>
>
>
>

--
http://www.voidspace.org.uk/

READ CAREFULLY. By accepting and reading this email you agree, on behalf
of your employer, to release me from all obligations and waivers arising
from any and all NON-NEGOTIATED agreements, licenses, terms-of-service,
shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure,
non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have
entered into with your employer, its partners, licensors, agents and
assigns, in perpetuity, without prejudice to my ongoing rights and
privileges. You further represent that you have the authority to release
me from any BOGUS AGREEMENTS on behalf of your employer.

From martin at v.loewis.de Mon Nov 1 07:22:14 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 01 Nov 2010 07:22:14 +0100
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <2E034B571A5CE44E949B9FCC3B6D24EE5761FE42@exchcn.ccp.ad.local>
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB17@exchcn.ccp.ad.local> <201010270933.18420.eckhardt@satorlaser.com> <4CCA9EA8.5060300@v.loewis.de> <2E034B571A5CE44E949B9FCC3B6D24EE5761FE42@exchcn.ccp.ad.local>
Message-ID: <4CCE5C96.305@v.loewis.de>

>> def read_and_unpack(stream, format):
>>     data = stream.read(struct.calcsize(format))
>>     return struct.unpack(format, data)
>>
>>> Otherwise, I'm +1 on your suggestion, avoiding copying is a good
>>> thing.
>>
>> I believe my function also doesn't involve any unnecessary copies.

> You just moved your copying down one level into stream.read(). This
> magic function must be implemented by possibly concatenating
> several "socket.recv()" calls.
> This invariably involves data copying, either by "".join() or
> stringio.write()

Assuming there are multiple recv calls. For a typical struct, all data
will come out of the stream with a single recv, so no join will be
necessary.

Regards,
Martin

From stefan_ml at behnel.de Mon Nov 1 09:35:41 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 01 Nov 2010 09:35:41 +0100
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <2E034B571A5CE44E949B9FCC3B6D24EE5761FC46@exchcn.ccp.ad.local>
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB17@exchcn.ccp.ad.local> <20101027123622.0ab194f9@pitrou.net> <2E034B571A5CE44E949B9FCC3B6D24EE5761FC37@exchcn.ccp.ad.local> <1288178142.3533.9.camel@localhost.localdomain> <2E034B571A5CE44E949B9FCC3B6D24EE5761FC40@exchcn.ccp.ad.local> <2E034B571A5CE44E949B9FCC3B6D24EE5761FC46@exchcn.ccp.ad.local>
Message-ID:

Kristján Valur Jónsson, 27.10.2010 16:32:
> Sorry, here the tables properly formatted:

Certainly looked better on your first try.

Stefan

From stefan_ml at behnel.de Mon Nov 1 09:45:06 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 01 Nov 2010 09:45:06 +0100
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <2E034B571A5CE44E949B9FCC3B6D24EE5761FC45@exchcn.ccp.ad.local>
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB17@exchcn.ccp.ad.local> <20101027123622.0ab194f9@pitrou.net> <2E034B571A5CE44E949B9FCC3B6D24EE5761FC37@exchcn.ccp.ad.local> <1288178142.3533.9.camel@localhost.localdomain> <2E034B571A5CE44E949B9FCC3B6D24EE5761FC40@exchcn.ccp.ad.local> <2E034B571A5CE44E949B9FCC3B6D24EE5761FC45@exchcn.ccp.ad.local>
Message-ID:

Kristján Valur Jónsson, 27.10.2010 16:28:
> Notice how a Slice object is generated. Then a PyObject_GetItem() is
> done. The salient code path is from apply_slice(). A slice object must
> be constructed and destroyed.

If slice object creation bothers you here, it might be worth using a free
list in PySlice_New() instead of creating new slice objects on request.

Creating a slice of something is not necessarily such a costly operation
that it dominates creating the slice object, so optimising the slice
request itself sounds like a good idea.

You can take a look at how it's done in tupleobject.c if you want to provide
a patch. Then, please open a bug tracker ticket and attach the patch there
(and post a link to the ticket in this thread).

Stefan

From kristjan at ccpgames.com Mon Nov 1 09:57:09 2010
From: kristjan at ccpgames.com (Kristján Valur Jónsson)
Date: Mon, 1 Nov 2010 16:57:09 +0800
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To:
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB17@exchcn.ccp.ad.local> <20101027123622.0ab194f9@pitrou.net> <2E034B571A5CE44E949B9FCC3B6D24EE5761FC37@exchcn.ccp.ad.local> <1288178142.3533.9.camel@localhost.localdomain> <2E034B571A5CE44E949B9FCC3B6D24EE5761FC40@exchcn.ccp.ad.local> <2E034B571A5CE44E949B9FCC3B6D24EE5761FC45@exchcn.ccp.ad.local>
Message-ID: <2E034B571A5CE44E949B9FCC3B6D24EE5761FF19@exchcn.ccp.ad.local>

I've already created a patch. See http://bugs.python.org/issue10227.
I was working with 2.7 where slicing sequences is done differently than
in 3.2, so the difference is not that great.
I'm going to have another go at profiling the 3.2 version later and see
why slicing a bytearray is so much more expensive than slicing a bytes
object.

K

> -----Original Message-----
> From: python-dev-bounces+kristjan=ccpgames.com at python.org
> [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] On Behalf
> Of Stefan Behnel
> Sent: 1. nóvember 2010 16:45
> To: python-dev at python.org
> Subject: Re: [Python-Dev] new buffer in python2.7
>
> Kristján Valur Jónsson, 27.10.2010 16:28:
> > Notice how a Slice object is generated. Then a PyObject_GetItem() is
> > done. The salient code path is from apply_slice(). A slice object must
> > be constructed and destroyed.
>
> If slice object creation bothers you here, it might be worth using a free
> list in PySlice_New() instead of creating new slice objects on request.
>
> Creating a slice of something is not necessarily such a costly operation
> that it dominates creating the slice object, so optimising the slice
> request itself sounds like a good idea.
>
> You can take a look at how it's done in tupleobject.c if you want to provide
> a patch. Then, please open a bug tracker ticket and attach the patch there
> (and post a link to the ticket in this thread).
>
> Stefan
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/kristjan%40ccpgames.com

From stefan_ml at behnel.de Mon Nov 1 09:58:22 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Mon, 01 Nov 2010 09:58:22 +0100
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To:
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB17@exchcn.ccp.ad.local> <20101027123622.0ab194f9@pitrou.net> <2E034B571A5CE44E949B9FCC3B6D24EE5761FC37@exchcn.ccp.ad.local> <1288178142.3533.9.camel@localhost.localdomain> <2E034B571A5CE44E949B9FCC3B6D24EE5761FC40@exchcn.ccp.ad.local> <2E034B571A5CE44E949B9FCC3B6D24EE5761FC45@exchcn.ccp.ad.local>
Message-ID:

Stefan Behnel, 01.11.2010 09:45:
> If slice object creation bothers you here, it might be worth using a
> free list in PySlice_New() instead of creating new slice objects on
> request.
>[...]
> You can take a look at how it's done in tupleobject.c if you want to
> provide a patch.

Hmm, that's actually a particularly bad place to look. The implementation
in listobject.c is much simpler.

Stefan

From kristjan at ccpgames.com Mon Nov 1 10:09:31 2010
From: kristjan at ccpgames.com (Kristján Valur Jónsson)
Date: Mon, 1 Nov 2010 17:09:31 +0800
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <4CCE5C96.305@v.loewis.de>
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB17@exchcn.ccp.ad.local> <201010270933.18420.eckhardt@satorlaser.com> <4CCA9EA8.5060300@v.loewis.de> <2E034B571A5CE44E949B9FCC3B6D24EE5761FE42@exchcn.ccp.ad.local> <4CCE5C96.305@v.loewis.de>
Message-ID: <2E034B571A5CE44E949B9FCC3B6D24EE5761FF1D@exchcn.ccp.ad.local>

Ah, yes. There are, in my case. (why do I always seem to be doing stuff
that is different from what you all are doing:)
The particular piece of code is from the chunked reader. It may be
reading rather large chunks at a time (several lots of Kb.):

import struct

def recvchunk(sock):
    # 4-byte length prefix; struct.unpack() returns a tuple
    length, = struct.unpack('i', recv_exactly(sock, 4))
    return recv_exactly(sock, length)

#old style
def recv_exactly(sock, length):
    data = []
    while length:
        got = sock.recv(length)
        if not got:
            raise EOFError
        data.append(got)
        length -= len(got)
    return "".join(data)

#new style
def recv_exactly(sock, length):
    data = bytearray(length)
    view = memoryview(data)
    while length:
        # recv_into() returns the number of bytes written into the buffer
        got = sock.recv_into(view[-length:])
        if not got:
            raise EOFError
        length -= got
    return data

Here I spot another optimization opportunity: let memoryview[:] return
self, since the object is immutable, I believe.

K

> -----Original Message-----
> From: "Martin v. Löwis" [mailto:martin at v.loewis.de]
> Sent: 1. nóvember 2010 14:22
> To: Kristján Valur Jónsson
> Cc: python-dev at python.org
> Subject: Re: [Python-Dev] new buffer in python2.7
>
>
> Assuming there are multiple recv calls. For a typical struct, all data
> will come out of the stream with a single recv. so no join will be
> necessary.
>
> Regards,
> Martin

From ncoghlan at gmail.com Mon Nov 1 12:25:22 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 1 Nov 2010 21:25:22 +1000
Subject: [Python-Dev] str.format_from_mapping
In-Reply-To: <20101031205305.012eb563@snowdog>
References: <4CCDD410.6020104@trueblade.com> <20101031205305.012eb563@snowdog>
Message-ID:

On Mon, Nov 1, 2010 at 10:53 AM, Barry Warsaw wrote:
> On Oct 31, 2010, at 04:39 PM, Eric Smith wrote:
>
>>What are your thoughts on adding a str.format_from_mapping (or similar
>>name, maybe the suggested "format_map") to 3.2?
>
> +1 for the shorter name.

+1 for a format_map() method that takes a single mapping argument
(Eric's patch on the issue).

-1 for the most recent patch attached to that issue that allows
further positional arguments after the mapping object.

(Raymond mentioned it on the issue, but I'll mention it here as well:
this addition would fall under the "case-by-case exemption" clause in
the moratorium PEP, since it adds a new method to a builtin type)

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Mon Nov 1 12:32:51 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 1 Nov 2010 21:32:51 +1000
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <2E034B571A5CE44E949B9FCC3B6D24EE5761FF1D@exchcn.ccp.ad.local>
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB17@exchcn.ccp.ad.local> <201010270933.18420.eckhardt@satorlaser.com> <4CCA9EA8.5060300@v.loewis.de> <2E034B571A5CE44E949B9FCC3B6D24EE5761FE42@exchcn.ccp.ad.local> <4CCE5C96.305@v.loewis.de> <2E034B571A5CE44E949B9FCC3B6D24EE5761FF1D@exchcn.ccp.ad.local>
Message-ID:

2010/11/1 Kristján Valur Jónsson :
> Ah, yes. There are, in my case. (why do I always seem to be doing stuff that is different from what you all are doing:)

I would guess that most of us aren't writing MMOs for a living. Gamers
seem to be a particularly demanding breed of user :)

Cheers,
Nick.

--
Nick Coghlan   |
ncoghlan at gmail.com   |   Brisbane, Australia

From solipsis at pitrou.net Mon Nov 1 12:33:31 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 1 Nov 2010 12:33:31 +0100
Subject: [Python-Dev] Stable sort and partial order
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk>
Message-ID: <20101101123331.444668c9@pitrou.net>

On Mon, 01 Nov 2010 02:55:35 +0000
Michael Foord wrote:
> Having a more efficient 'slow-path' and moving to that by default would
> fix it. The bug is only a duplicate of the bug in sorted - caused by the
> fact that sets / frozensets can't be sorted in the standard Python way
> (their less than comparison adheres to the set definition). This is
> something that will probably surprise many Python developers:
>
> >>> a = [{2,4}, {1,2}]
> >>> b = a[::-1]
> >>> sorted(a)
> [set([2, 4]), set([1, 2])]
> >>> sorted(b)
> [set([1, 2]), set([2, 4])]
>
> (Fixing the bug in sorted would fix assertItemsEqual ;-)

How is this a bug? The sort algorithm is stable, which means the above
behaviour is a feature.
I see no easy way of eliminating the O(n*n) issue. Custom key functions
can't work in all cases.

Regards

Antoine.

From rdmurray at bitdance.com Mon Nov 1 14:06:40 2010
From: rdmurray at bitdance.com (R.
David Murray)
Date: Mon, 01 Nov 2010 09:06:40 -0400
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <20101101123331.444668c9@pitrou.net>
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101101123331.444668c9@pitrou.net>
Message-ID: <20101101130640.66424218E38@kimball.webabinitio.net>

On Mon, 01 Nov 2010 12:33:31 +0100, Antoine Pitrou wrote:
> On Mon, 01 Nov 2010 02:55:35 +0000
> Michael Foord wrote:
> > Having a more efficient 'slow-path' and moving to that by default would
> > fix it. The bug is only a duplicate of the bug in sorted - caused by the
> > fact that sets / frozensets can't be sorted in the standard Python way
> > (their less than comparison adheres to the set definition). This is
> > something that will probably surprise many Python developers:
> >
> > >>> a = [{2,4}, {1,2}]
> > >>> b = a[::-1]
> > >>> sorted(a)
> > [set([2, 4]), set([1, 2])]
> > >>> sorted(b)
> > [set([1, 2]), set([2, 4])]
> >
> > (Fixing the bug in sorted would fix assertItemsEqual ;-)
>
> How is this a bug? The sort algorithm is stable, which means the above
> behaviour is a feature. I see no easy way of eliminating the O(n*n)
> issue. Custom key functions can't work in all cases.

Even granting some theoretical way to sort sets by their contents,
it still wouldn't be a bug in sorted. Sorted is just using the
results returned by '__lt__', which is what it should do. Special
casing sets in sorted would be wrong.

--
R.
David Murray                                      www.bitdance.com

From kristjan at ccpgames.com Mon Nov 1 14:34:09 2010
From: kristjan at ccpgames.com (Kristján Valur Jónsson)
Date: Mon, 1 Nov 2010 21:34:09 +0800
Subject: [Python-Dev] Continuing 2.x
In-Reply-To: <4CCAD656.7070408@v.loewis.de>
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FC57@exchcn.ccp.ad.local> <4CC9749E.1030200@v.loewis.de> <20101028120414.1f4f3016@mission> <2E034B571A5CE44E949B9FCC3B6D24EE5761FD38@exchcn.ccp.ad.local> <4CCACF52.8090202@egenix.com> <4CCAD656.7070408@v.loewis.de>
Message-ID: <2E034B571A5CE44E949B9FCC3B6D24EE5761FF4B@exchcn.ccp.ad.local>

I've been sitting on the sideline seeing this unfold.
We've seen some different viewpoints on the matter and I'm happy to see
that I'm not the only one lamenting the proclaimed death of the 2.x
lineage. However, as correctly stated by Martin, I merely voiced a
suggestion and I have gotten helpful counter-suggestions. A private
branch is fine (more correctly a fork, even, as people have pointed out)
and Hg is going to support user-branches.

In the meantime, however, unless someone strongly objects, I'm probably
going to set up a temporary branch off /release27-maint under
/stackless/sandboxes/ until the Hg switchover. Name undecided yet.

Cheers,

Kristján

> -----Original Message-----
> From: python-dev-bounces+kristjan=ccpgames.com at python.org
> [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] On Behalf
> Of "Martin v. Löwis"
> Sent: 29. október 2010 22:13

> This thread was started by a specific proposal from Kristjan, and
> Kristjan got a specific suggestion on how to proceed (namely, wait
> for the Mercurial switchover, then publish his changes in a branch).
> So despite the more general subject (which I think is still mostly
> hypothetical), the real issue Kristjan raised has been resolved,
> AFAICT (although Kristjan has not yet voiced an opinion of whether
> he finds that resolution acceptable).

From kristjan at ccpgames.com Mon Nov 1 15:00:59 2010
From: kristjan at ccpgames.com (Kristján Valur Jónsson)
Date: Mon, 1 Nov 2010 22:00:59 +0800
Subject: [Python-Dev] time.wallclock()
Message-ID: <2E034B571A5CE44E949B9FCC3B6D24EE5761FF4D@exchcn.ccp.ad.local>

Working on Condition variables and semaphores (see
http://bugs.python.org/issue10260) I noticed that time.time() was being
used to correctly time blocking system calls. On windows, I would have
used time.clock() but reading the documentation made me realize that on
Unix that would return CPU seconds which are useless when blocking.
However, on Windows, time.clock() has a much higher resolution, apart
from being a "wallclock" time, and is thus better suited to timing than
time.time(). In addition, time.time() has the potential of giving
unexpected results if someone messes with the system clock.

I was wondering if it were helpful to have a function such as
time.wallclock() which is specified to give relative wallclock time
between invocations or an approximation thereof, to the system's best
ability?

We could then choose this to be an alias of time.clock() on windows and
time.time() on any other machine, or even have custom implementations on
machines that support such a notion.

Kristján

From olemis at gmail.com Mon Nov 1 15:04:10 2010
From: olemis at gmail.com (Olemis Lang)
Date: Mon, 1 Nov 2010 09:04:10 -0500
Subject: [Python-Dev] Change to logging Formatters: support for alternative format styles
In-Reply-To:
References: <20101029110702.414c298d@mission>
Message-ID:

On Sun, Oct 31, 2010 at 9:55 AM, Vinay Sajip wrote:
> Olemis Lang gmail.com> writes:
>>
>> On Fri, Oct 29, 2010 at 10:07 AM, Barry Warsaw python.org> wrote:
>> > I haven't played with it yet, but do you think it makes sense to add a
>> > 'style' keyword argument to basicConfig()?
>> > That would make it pretty easy
>> > to get the formatting style you want without having to explicitly
>> > instantiate a Formatter, at least for simple logging clients.
>> >
>>
>> Since this may be considered as a little sophisticated, I'd rather
>> prefer these new classes to be added to configuration sections using
>> fileConfig (and default behavior if missing), and still leave
>> `basicConfig` unchanged (i.e. *basic*) .
>>
>
> Actually it's no biggie to have an optional style argument for basicConfig.
> People who don't use it don't have to specify it; the style argument would only
> apply if format was specified.
>

ok

> For some people, use of {} over % is more about personal taste than about the
> actual usage of str.format's flexibility;

Thought you were talking about me, you only needed to say "he has
black hair and blue eyes" ... ;o)

> we may as well accommodate that
> preference, as it encourages in a small way the use of {}-formatting.
>

ok , nevermind , it's ok for me anyway (provided that sections for
`fileConfig` will be available) .

--
Regards,

Olemis.

Blog ES: http://simelo-es.blogspot.com/
Blog EN: http://simelo-en.blogspot.com/

From fuzzyman at voidspace.org.uk Mon Nov 1 15:05:14 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 01 Nov 2010 14:05:14 +0000
Subject: [Python-Dev] time.wallclock()
In-Reply-To: <2E034B571A5CE44E949B9FCC3B6D24EE5761FF4D@exchcn.ccp.ad.local>
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FF4D@exchcn.ccp.ad.local>
Message-ID: <4CCEC91A.5010807@voidspace.org.uk>

On 01/11/2010 14:00, Kristján Valur Jónsson wrote:
>
> Working on Condition variables and semaphores (see
> http://bugs.python.org/issue10260) I noticed that time.time() was
> being used to correctly time blocking system calls. On windows, I
> would have used time.clock() but reading the documentation made me
> realize that on Unix that would return CPU seconds which are useless
> when blocking.
> However, on Windows, time.clock() has a much higher
> resolution, apart from being a "wallclock" time, and is thus better
> suited to timing than time.time(). In addition, time.time() has the
> potential of giving unexpected results if someone messes with the
> system clock.
>
> I was wondering if it were helpful to have a function such as
> time.wallclock() which is specified to give relative wallclock time
> between invocations or an approximation thereof, to the system's best
> ability?
>
> We could then choose this to be an alias of time.clock() on windows
> and time.time() on any other machine, or even have custom
> implementations on machines that support such a notion.
>

I think this would be helpful. Having to do platform specific checks to
choose which time function to use is annoying.

Michael

> Kristján

--
http://www.voidspace.org.uk/
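
As a rough illustration of the proposal in this thread, the platform check could be wrapped once so that callers never repeat it. This is a sketch under stated assumptions, not an API that existed at the time: `pick_wallclock` and `wallclock` are hypothetical names, and the `hasattr` guard is there because `time.clock()` is not available in every Python version (it was removed in Python 3.8).

```python
import sys
import time


def pick_wallclock():
    """Return a function that yields wall-clock seconds on this platform."""
    # Rule proposed in the thread: time.clock() on Windows (high resolution,
    # wall-clock based there), time.time() everywhere else.
    if sys.platform == "win32" and hasattr(time, "clock"):
        return time.clock
    return time.time


wallclock = pick_wallclock()

start = wallclock()
time.sleep(0.05)                 # stands in for a blocking system call
elapsed = wallclock() - start    # relative wall-clock time, in seconds
print("blocked for about %.3f s" % elapsed)
```

Only differences between two readings are meaningful here; the absolute value returned depends on which underlying clock was selected.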
URL:

From fuzzyman at voidspace.org.uk Mon Nov 1 15:26:19 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 01 Nov 2010 14:26:19 +0000
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <20101101123331.444668c9@pitrou.net>
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101101123331.444668c9@pitrou.net>
Message-ID: <4CCECE0B.40400@voidspace.org.uk>

On 01/11/2010 11:33, Antoine Pitrou wrote:
> On Mon, 01 Nov 2010 02:55:35 +0000
> Michael Foord wrote:
>> Having a more efficient 'slow-path' and moving to that by default would
>> fix it. The bug is only a duplicate of the bug in sorted - caused by the
>> fact that sets / frozensets can't be sorted in the standard Python way
>> (their less than comparison adheres to the set definition). This is
>> something that will probably surprise many Python developers:
>>
>> >>> a = [{2,4}, {1,2}]
>> >>> b = a[::-1]
>> >>> sorted(a)
>> [set([2, 4]), set([1, 2])]
>> >>> sorted(b)
>> [set([1, 2]), set([2, 4])]
>>
>> (Fixing the bug in sorted would fix assertItemsEqual ;-)
> How is this a bug? The sort algorithm is stable, which means the above
> behaviour is a feature.

Well, bug is the wrong word as it is obviously an intended feature (or
consequence of a feature). I still think, given the general behaviour of
Python sorting, that it is unexpected. It breaks what is usually an
invariant for sorting without an explicit key that sortable types
sorted(l) == sorted(l[::-1]).

There is however a note in the set documentation:

    Since sets only define partial ordering (subset relationships), the
    output of the list.sort() method is undefined for lists of sets.

> I see no easy way of eliminating the O(n*n) issue. Custom key functions
> can't work in all cases.
>

Right. Special casing sets and frozensets would be one (particularly
inelegant) way however.
All the best,

Michael

> Regards
>
> Antoine.
>
>

--
http://www.voidspace.org.uk/

From steven.bethard at gmail.com Mon Nov 1 15:48:48 2010
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 1 Nov 2010 14:48:48 +0000
Subject: [Python-Dev] okay to remove argparse.__all__?
Message-ID:

I think the easiest and most sensible way to address
http://bugs.python.org/issue9353 is to simply remove the __all__
definition from argparse - everything that doesn't start with an
underscore in the module is already meant to be exposed.

But then I wonder - is __all__ considered part of the public API of a
module? Or is it okay to just remove it and assume that no one should
have been accessing it directly anyway?

Steve
--
Where did you get that preposterous hypothesis?
Did Steve tell you that?
        --- The Hiphopopotamus

From fuzzyman at voidspace.org.uk Mon Nov 1 15:53:24 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 01 Nov 2010 14:53:24 +0000
Subject: [Python-Dev] okay to remove argparse.__all__?
In-Reply-To:
References:
Message-ID: <4CCED464.80202@voidspace.org.uk>

On 01/11/2010 14:48, Steven Bethard wrote:
> I think the easiest and most sensible way to address
> http://bugs.python.org/issue9353 is to simply remove the __all__
> definition from argparse - everything that doesn't start with an
> underscore in the module is already meant to be exposed.
>
> But then I wonder - is __all__ considered part of the public API of a
> module? Or is it okay to just remove it and assume that no one should
> have been accessing it directly anyway?

Isn't it better to add the missing elements - what is the problem with
that approach?

Not defining __all__ will mean that "from argparse import *" will also
export all the modules you import (copy, os, re, sys, textwrap).

All the best,

Michael

> Steve

--
http://www.voidspace.org.uk/

From steven.bethard at gmail.com Mon Nov 1 15:55:25 2010
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 1 Nov 2010 14:55:25 +0000
Subject: [Python-Dev] okay to remove argparse.__all__?
In-Reply-To: <4CCED464.80202@voidspace.org.uk>
References: <4CCED464.80202@voidspace.org.uk>
Message-ID:

On Mon, Nov 1, 2010 at 2:53 PM, Michael Foord wrote:
> On 01/11/2010 14:48, Steven Bethard wrote:
>>
>> I think the easiest and most sensible way to address
>> http://bugs.python.org/issue9353 is to simply remove the __all__
>> definition from argparse - everything that doesn't start with an
>> underscore in the module is already meant to be exposed.
>>
>> But then I wonder - is __all__ considered part of the public API of a
>> module? Or is it okay to just remove it and assume that no one should
>> have been accessing it directly anyway?
>
> Isn't it better to add the missing elements - what is the problem with that
> approach?

It just requires extra synchronization, and history shows that I
always forget to add them. ;-)

> Not defining __all__ will mean that "from argparse import *" will also
> export all the modules you import (copy, os, re, sys, textwrap).

That won't happen in the case of argparse - all modules are imported
like "import os as _os".

Steve
--
Where did you get that preposterous hypothesis?
Did Steve tell you that?
        --- The Hiphopopotamus

From guido at python.org Mon Nov 1 15:57:13 2010
From: guido at python.org (Guido van Rossum)
Date: Mon, 1 Nov 2010 07:57:13 -0700
Subject: [Python-Dev] okay to remove argparse.__all__?
In-Reply-To: <4CCED464.80202@voidspace.org.uk>
References: <4CCED464.80202@voidspace.org.uk>
Message-ID:

On Mon, Nov 1, 2010 at 7:53 AM, Michael Foord wrote:
> On 01/11/2010 14:48, Steven Bethard wrote:
>>
>> I think the easiest and most sensible way to address
>> http://bugs.python.org/issue9353 is to simply remove the __all__
>> definition from argparse - everything that doesn't start with an
>> underscore in the module is already meant to be exposed.
>>
>> But then I wonder - is __all__ considered part of the public API of a
>> module?
Or is it okay to just remove it and assume that no one should
>> have been accessing it directly anyway?
>
> Isn't it better to add the missing elements - what is the problem with that
> approach?

Agreed, that's what I would do.

> Not defining __all__ will mean that "from argparse import *" will also
> export all the modules you import (copy, os, re, sys, textwrap).

Well, the copy of argparse.py that I have carefully renames those to
_copy, _os etc. to avoid this. You never know.

It is also possible to write automated tests that flag likely missing
symbols in __all__ (as well as symbols in __all__ missing from the
module).

--
--Guido van Rossum (python.org/~guido)

From steven.bethard at gmail.com Mon Nov 1 15:59:03 2010
From: steven.bethard at gmail.com (Steven Bethard)
Date: Mon, 1 Nov 2010 14:59:03 +0000
Subject: [Python-Dev] okay to remove argparse.__all__?
In-Reply-To:
References: <4CCED464.80202@voidspace.org.uk>
Message-ID:

On Mon, Nov 1, 2010 at 2:57 PM, Guido van Rossum wrote:
> On Mon, Nov 1, 2010 at 7:53 AM, Michael Foord wrote:
>> On 01/11/2010 14:48, Steven Bethard wrote:
>>> But then I wonder - is __all__ considered part of the public API of a
>>> module? Or is it okay to just remove it and assume that no one should
>>> have been accessing it directly anyway?
>>
>> Isn't it better to add the missing elements - what is the problem with that
>> approach?
>
> Agreed, that's what I would do.

Ok, sounds good.

> It is also possible to write automated tests that flag likely missing
> symbols in __all__ (as well as symbols in __all__ missing from the
> module).

Yep, I plan on doing that. I already had a test something like this to
remind me how I broke __all__ before. ;-)

Steve
--
Where did you get that preposterous hypothesis?
Did Steve tell you that?
        --- The Hiphopopotamus

From fuzzyman at voidspace.org.uk Mon Nov 1 15:59:07 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 01 Nov 2010 14:59:07 +0000
Subject: [Python-Dev] okay to remove argparse.__all__?
In-Reply-To:
References: <4CCED464.80202@voidspace.org.uk>
Message-ID: <4CCED5BB.6060406@voidspace.org.uk>

On 01/11/2010 14:57, Guido van Rossum wrote:
> [snip...]
>> Not defining __all__ will mean that "from argparse import *" will also
>> export all the modules you import (copy, os, re, sys, textwrap).
> Well, the copy of argparse.py that I have carefully renames those to
> _copy, _os etc. to avoid this.

Bah.... Sorry about that.

Michael

--
http://www.voidspace.org.uk/

From ncoghlan at gmail.com Mon Nov 1 16:10:44 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 2 Nov 2010 01:10:44 +1000
Subject: [Python-Dev] okay to remove argparse.__all__?
In-Reply-To:
References: <4CCED464.80202@voidspace.org.uk>
Message-ID:

On Tue, Nov 2, 2010 at 12:57 AM, Guido van Rossum wrote:
> It is also possible to write automated tests that flag likely missing
> symbols in __all__ (as well as symbols in __all__ missing from the
> module).

These days, test___all__ checks that everything in __all__ exists in
standard library modules.
It is also possible for individual module tests to include a check that
goes the other way along the lines of:

    def test_all_is_complete(self):
        known_private = {"known", "unexported", "names"}
        expected_public = {k for k in mod.__dict__
                           if k not in known_private
                           and not k.startswith("_")}
        self.assertEqual(set(mod.__all__), expected_public)

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From kristjan at ccpgames.com Mon Nov 1 16:10:51 2010
From: kristjan at ccpgames.com (Kristján Valur Jónsson)
Date: Mon, 1 Nov 2010 23:10:51 +0800
Subject: [Python-Dev] time.wallclock()
In-Reply-To: <4CCEC91A.5010807@voidspace.org.uk>
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FF4D@exchcn.ccp.ad.local> <4CCEC91A.5010807@voidspace.org.uk>
Message-ID: <2E034B571A5CE44E949B9FCC3B6D24EE5761FF51@exchcn.ccp.ad.local>

Ok, please see http://bugs.python.org/issue10278

K

From: Michael Foord [mailto:fuzzyman at voidspace.org.uk]
Sent: 1. nóvember 2010 22:05
To: Kristján Valur Jónsson
Cc: python-dev at python.org
Subject: Re: [Python-Dev] time.wallclock()

I think this would be helpful. Having to do platform specific checks to
choose which time function to use is annoying.

Michael

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rdmurray at bitdance.com Mon Nov 1 16:10:53 2010
From: rdmurray at bitdance.com (R.
David Murray)
Date: Mon, 01 Nov 2010 11:10:53 -0400
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <4CCECE0B.40400@voidspace.org.uk>
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101101123331.444668c9@pitrou.net> <4CCECE0B.40400@voidspace.org.uk>
Message-ID: <20101101151053.B2180228063@kimball.webabinitio.net>

On Mon, 01 Nov 2010 14:26:19 -0000, Michael Foord wrote:
> On 01/11/2010 11:33, Antoine Pitrou wrote:
> > On Mon, 01 Nov 2010 02:55:35 +0000
> > Michael Foord wrote:
> >> Having a more efficient 'slow-path' and moving to that by default would
> >> fix it. The bug is only a duplicate of the bug in sorted - caused by the
> >> fact that sets / frozensets can't be sorted in the standard Python way
> >> (their less than comparison adheres to the set definition). This is
> >> something that will probably surprise many Python developers:
> >>
> >> >>> a = [{2,4}, {1,2}]
> >> >>> b = a[::-1]
> >> >>> sorted(a)
> >> [set([2, 4]), set([1, 2])]
> >> >>> sorted(b)
> >> [set([1, 2]), set([2, 4])]
> >>
> >> (Fixing the bug in sorted would fix assertItemsEqual ;-)
> > How is this a bug? The sort algorithm is stable, which means the above
> > behaviour is a feature.
>
> Well, bug is the wrong word as it is obviously an intended feature (or
> consequence of a feature). I still think, given the general behaviour of
> Python sorting, that it is unexpected. It breaks what is usually an
> invariant for sorting without an explicit key that sortable types
> sorted(l) = sorted(l[::-1]).
Well, as Antoine pointed out, Python's sorting algorithm is stable,
so that is in fact *not* an invariant:

>>> x = ['abcd', 'foo'*50, 'foo'*50, 'dkke']
>>> y = x[::-1]
>>> [id(n) for n in sorted(x)]
[3073747680, 3073747904, 3073747624, 3073747512]
>>> [id(n) for n in sorted(y)]
[3073747680, 3073747904, 3073747512, 3073747624]

Yes, == usually hides the fact that the *objects* are in a different
order, but obviously that doesn't apply to sets :)

--
R. David Murray                                      www.bitdance.com

From fuzzyman at voidspace.org.uk Mon Nov 1 16:14:36 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 01 Nov 2010 15:14:36 +0000
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <20101101151053.B2180228063@kimball.webabinitio.net>
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101101123331.444668c9@pitrou.net> <4CCECE0B.40400@voidspace.org.uk> <20101101151053.B2180228063@kimball.webabinitio.net>
Message-ID: <4CCED95C.8080302@voidspace.org.uk>

On 01/11/2010 15:10, R. David Murray wrote:
> On Mon, 01 Nov 2010 14:26:19 -0000, Michael Foord wrote:
>> On 01/11/2010 11:33, Antoine Pitrou wrote:
>>> On Mon, 01 Nov 2010 02:55:35 +0000
>>> Michael Foord wrote:
>>>> Having a more efficient 'slow-path' and moving to that by default would
>>>> fix it. The bug is only a duplicate of the bug in sorted - caused by the
>>>> fact that sets / frozensets can't be sorted in the standard Python way
>>>> (their less than comparison adheres to the set definition). This is
>>>> something that will probably surprise many Python developers:
>>>>
>>>> >>> a = [{2,4}, {1,2}]
>>>> >>> b = a[::-1]
>>>> >>> sorted(a)
>>>> [set([2, 4]), set([1, 2])]
>>>> >>> sorted(b)
>>>> [set([1, 2]), set([2, 4])]
>>>>
>>>> (Fixing the bug in sorted would fix assertItemsEqual ;-)
>>> How is this a bug?
The sort algorithm is stable, which means the above
>>> behaviour is a feature.
>> Well, bug is the wrong word as it is obviously an intended feature (or
>> consequence of a feature). I still think, given the general behaviour of
>> Python sorting, that it is unexpected. It breaks what is usually an
>> invariant for sorting without an explicit key that sortable types
>> sorted(l) = sorted(l[::-1]).
> Well, as Antoine pointed out, Python's sorting algorithm is stable,
> so that is in fact *not* an invariant:
>
>>>> x = ['abcd', 'foo'*50, 'foo'*50, 'dkke']
>>>> y = x[::-1]
>>>> [id(n) for n in sorted(x)]
> [3073747680, 3073747904, 3073747624, 3073747512]
>>>> [id(n) for n in sorted(y)]
> [3073747680, 3073747904, 3073747512, 3073747624]
>
> Yes, == usually hides the fact that the *objects* are in a different
> order, but obviously that doesn't apply to sets :)
>

Sorry, that should have been sorted(l) == sorted(l[::-1]) - which *is*
the case for your example above.

Michael

> --
> R. David Murray www.bitdance.com

--
http://www.voidspace.org.uk/
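
To make the invariant under discussion concrete, here is a minimal sketch that checks sorted(l) == sorted(l[::-1]) once for a list of strings (a total order) and once for a list of sets (ordered only by subset). The variable names are invented for the illustration, and the behaviour shown for the sets is what CPython's list sort happens to do, as reported in the thread; the set documentation itself calls the result undefined.

```python
# Strings have a total order, so the sorted result does not depend on
# the order in which the input list was given.
words = ["abcd", "foo" * 50, "foo" * 50, "dkke"]
strings_invariant = sorted(words) == sorted(words[::-1])

# Sets compare by subset/superset, which is only a partial order:
# neither {2, 4} < {1, 2} nor {1, 2} < {2, 4} is true.
sets = [{2, 4}, {1, 2}]
incomparable = not (sets[0] < sets[1]) and not (sets[1] < sets[0])

# With no "less than" relation to act on, the sort leaves each input
# in the order it arrived, so the two results differ.
sets_invariant = sorted(sets) == sorted(sets[::-1])

print(strings_invariant)  # True
print(incomparable)       # True
print(sets_invariant)     # False
```

The string case passes even though the two equal 'foo'*50 objects may end up in a different order, because == compares values rather than identities; for the sets there is no such equality to hide the difference.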

From ocean-city at m2.ccsnet.ne.jp Mon Nov 1 16:12:47 2010
From: ocean-city at m2.ccsnet.ne.jp (Hirokazu Yamamoto)
Date: Tue, 02 Nov 2010 00:12:47 +0900
Subject: [Python-Dev] PyMem_MALLOC vs PyMem_Malloc
In-Reply-To: <4CCC56BB.9010202@egenix.com>
References: <4CCC3309.1020104@m2.ccsnet.ne.jp> <4CCC55BD.5000606@egenix.com> <4CCC56BB.9010202@egenix.com>
Message-ID: <4CCED8EF.8060103@m2.ccsnet.ne.jp>

On 2010/10/31 2:32, M.-A. Lemburg wrote:
> M.-A. Lemburg wrote:
>> Hirokazu Yamamoto wrote:
>>> Hello. I found several codes using PyMem_Free to free
>>> allocated memory with PyMem_MALLOC (ie: PyUnicode_AsWideCharString)
>>>
>>> Is it safe?
>>
>> Within the interpreter: yes.
>>
>> In extensions: depends on the platform, but probably not.
>>
>> The macros provide faster access to the C lib malloc calls.
>>
>> The functions need to be used in extensions in case the interpreter will
>> free the resource or the extension wants to free an interpreter
>> allocated resource. They provide access to the malloc calls
>> used by the interpreter, which may operate on a different heap
>> than the extensions.
>>
>> Within an extension the macros use the extension heap.
>>
>> A subtle, but important difference.
>
> BTW: If you were referring to extensions using PyMem_Free()
> to deallocate memory allocated in the interpreter using
> PyMem_MALLOC(), then that's exactly how things should be
> done.
>
> I was referring to use of the two mentioned APIs within
> an extension.

Thank you for the reply. I think I understand now.

From phd at phd.pp.ru Mon Nov 1 16:08:27 2010
From: phd at phd.pp.ru (Oleg Broytman)
Date: Mon, 1 Nov 2010 18:08:27 +0300
Subject: [Python-Dev] okay to remove argparse.__all__?
In-Reply-To:
References: <4CCED464.80202@voidspace.org.uk>
Message-ID: <20101101150827.GB28735@phd.pp.ru>

On Mon, Nov 01, 2010 at 02:55:25PM +0000, Steven Bethard wrote:
> On Mon, Nov 1, 2010 at 2:53 PM, Michael Foord wrote:
> > Isn't it better to add the missing elements - what is the problem with that
> > approach?
>
> It just requires extra synchronization, and history shows that I
> always forget to add them. ;-)

   Automate:

# Iterate over a snapshot: the loop variables are themselves module
# globals, so the dict would otherwise change size during iteration.
for key, value in list(globals().items()):
    if not key.startswith('_'):
        __all__.append(key)

   Further filter (by key or value) to your needs.

Oleg.
--
     Oleg Broytman            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From rdmurray at bitdance.com Mon Nov 1 16:33:39 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Mon, 01 Nov 2010 11:33:39 -0400
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <4CCED95C.8080302@voidspace.org.uk>
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101101123331.444668c9@pitrou.net> <4CCECE0B.40400@voidspace.org.uk> <20101101151053.B2180228063@kimball.webabinitio.net> <4CCED95C.8080302@voidspace.org.uk>
Message-ID: <20101101153339.85A5F22805B@kimball.webabinitio.net>

On Mon, 01 Nov 2010 15:14:36 -0000, Michael Foord wrote:
> On 01/11/2010 15:10, R. David Murray wrote:
> > On Mon, 01 Nov 2010 14:26:19 -0000, Michael Foord wrote:
> >> Well, bug is the wrong word as it is obviously an intended feature (or
> >> consequence of a feature). I still think, given the general behaviour of
> >> Python sorting, that it is unexpected. It breaks what is usually an
> >> invariant for sorting without an explicit key that sortable types
> >> sorted(l) = sorted(l[::-1]).
> > Well, as Antoine pointed out, Python's sorting algorithm is stable,
> > so that is in fact *not* an invariant:
> >
> >>>> x = ['abcd', 'foo'*50, 'foo'*50, 'dkke']
> >>>> y = x[::-1]
> >>>> [id(n) for n in sorted(x)]
> > [3073747680, 3073747904, 3073747624, 3073747512]
> >>>> [id(n) for n in sorted(y)]
> > [3073747680, 3073747904, 3073747512, 3073747624]
> >
> > Yes, == usually hides the fact that the *objects* are in a different
> > order, but obviously that doesn't apply to sets :)
> >
>
> Sorry, that should have been sorted(l) == sorted(l[::-1]) - which *is*
> the case for your example above.

Yes, I know that's what you meant, that's why I said "== usually hides
this". If you are restricting yourself to built in types, then your
invariant is mostly true but (IMO) misleading, as set demonstrates. And
it certainly doesn't have to be true for custom types, even if they
don't redefine __lt__. You can argue that in a good design it should be,
but as the set example indicates, there are problem domains where it is
useful for it not to be.

Or, to put it another way, *if* there is a bug here it would be in set,
not sorted.

--
R. David Murray                                      www.bitdance.com

From ocean-city at m2.ccsnet.ne.jp Mon Nov 1 17:10:32 2010
From: ocean-city at m2.ccsnet.ne.jp (Hirokazu Yamamoto)
Date: Tue, 02 Nov 2010 01:10:32 +0900
Subject: [Python-Dev] [Python-checkins] r85987 - python/branches/py3k/Lib/test/test_os.py
In-Reply-To: <20101030212421.717CCEE986@mail.python.org>
References: <20101030212421.717CCEE986@mail.python.org>
Message-ID: <4CCEE678.7040705@m2.ccsnet.ne.jp>

On 2010/10/31 6:24, brian.curtin wrote:
> Author: brian.curtin
> Date: Sat Oct 30 23:24:21 2010
> New Revision: 85987
>
> Log:
> Fix #10257. Clear resource warnings by using os.popen's context manager.
>
>
> Modified:
> python/branches/py3k/Lib/test/test_os.py
>
> Modified: python/branches/py3k/Lib/test/test_os.py
> ==============================================================================
> --- python/branches/py3k/Lib/test/test_os.py (original)
> +++ python/branches/py3k/Lib/test/test_os.py Sat Oct 30 23:24:21 2010
> @@ -406,17 +406,19 @@
>          os.environ.clear()
>          if os.path.exists("/bin/sh"):
>              os.environ.update(HELLO="World")
> -            value = os.popen("/bin/sh -c 'echo $HELLO'").read().strip()
> -            self.assertEquals(value, "World")
> +            with os.popen("/bin/sh -c 'echo $HELLO'") as popen:
> +                value = popen.read().strip()
> +                self.assertEquals(value, "World")

Does this really cause resource warning? I think os.popen instance
won't be into traceback because it's not declared as variable. So I
suppose it will be deleted by reference count == 0 even when exception
occurs.

From ncoghlan at gmail.com Mon Nov 1 17:23:41 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 2 Nov 2010 02:23:41 +1000
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <20101101153339.85A5F22805B@kimball.webabinitio.net>
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101101123331.444668c9@pitrou.net> <4CCECE0B.40400@voidspace.org.uk> <20101101151053.B2180228063@kimball.webabinitio.net> <4CCED95C.8080302@voidspace.org.uk> <20101101153339.85A5F22805B@kimball.webabinitio.net>
Message-ID:

On Tue, Nov 2, 2010 at 1:33 AM, R. David Murray wrote:
> Or, to put it another way, *if* there is a bug here it would be in set,
> not sorted.

Put me in the "it's not a bug, it's a feature" camp. Providing a
"elements equal" check that doesn't rely on LT providing a total
ordering is a non-trivial exercise.

Looking at assertItemsEqual, I'd be inclined to insert a check that
falls back to the "unorderable_list_difference" approach in the case
where "expected != sorted(reversed(expected))" (only need to check the
one, since if the expected values are totally ordered, while the
actual values are not, this should show up when comparing the
elements). It slows down the fast path a bit, but the updated function
should at least handle partial orderings more correctly than it does
now.

Cheers,
Nick.

P.S. Late night post, so I may be missing something obvious...

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From fuzzyman at voidspace.org.uk Mon Nov 1 17:26:35 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 01 Nov 2010 16:26:35 +0000
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To:
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101101123331.444668c9@pitrou.net> <4CCECE0B.40400@voidspace.org.uk> <20101101151053.B2180228063@kimball.webabinitio.net> <4CCED95C.8080302@voidspace.org.uk> <20101101153339.85A5F22805B@kimball.webabinitio.net>
Message-ID: <4CCEEA3B.30200@voidspace.org.uk>

On 01/11/2010 16:23, Nick Coghlan wrote:
> On Tue, Nov 2, 2010 at 1:33 AM, R. David Murray wrote:
>> Or, to put it another way, *if* there is a bug here it would be in set,
>> not sorted.
> Put me in the "it's not a bug, it's a feature" camp. Providing a
> "elements equal" check that doesn't rely on LT providing a total
> ordering is a non-trivial exercise.
>
> Looking at assertItemsEqual, I'd be inclined to insert a check that
> falls back to the "unorderable_list_difference" approach in the case
> where "expected != sorted(reversed(expected))"

If that is sufficient then it would be a nice way of keeping the fast
path.

(I'm not arguing that Antoine and R.
David aren't correct in what they're saying about set ordering - I'm
just saying that I was surprised and bet I'm not the only one. Bit of a
dead end discussion. :-)

Michael

> (only need to check the
> one, since if the expected values are totally ordered, while the
> actual values are not, this should show up when comparing the
> elements). It slows down the fast path a bit, but the updated function
> should at least handle partial orderings more correctly than it does
> now.
>
> Cheers,
> Nick.
>
> P.S. Late night post, so I may be missing something obvious...
>

--
http://www.voidspace.org.uk/

From ncoghlan at gmail.com Mon Nov 1 17:30:09 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 2 Nov 2010 02:30:09 +1000
Subject: [Python-Dev] [Python-checkins] r85987 - python/branches/py3k/Lib/test/test_os.py
In-Reply-To: <4CCEE678.7040705@m2.ccsnet.ne.jp>
References: <20101030212421.717CCEE986@mail.python.org> <4CCEE678.7040705@m2.ccsnet.ne.jp>
Message-ID:

On Tue, Nov 2, 2010 at 2:10 AM, Hirokazu Yamamoto wrote:
> Does this really cause resource warning? I think os.popen instance
> won't be into traceback because it's not declared as variable. So I
> suppose it will be deleted by reference count == 0 even when exception
> occurs.

Any time __del__ has to close the resource triggers ResourceWarning,
regardless of whether that is due to the cyclic garbage collector or
the refcount naturally falling to zero.

In the past dealing with this was clumsy, so it made sense to rely on
CPython's refcounting to do the work. However, we have better tools for
deterministic resource management now (in the form of context
managers), so these updates help make the standard library and its test
suite more suitable for use with non-refcounting Python implementations
(such as PyPy, Jython and IronPython).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Mon Nov 1 17:38:40 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 2 Nov 2010 02:38:40 +1000
Subject: [Python-Dev] Stable sort and partial order
In-Reply-To: <4CCEEA3B.30200@voidspace.org.uk>
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101101123331.444668c9@pitrou.net> <4CCECE0B.40400@voidspace.org.uk> <20101101151053.B2180228063@kimball.webabinitio.net> <4CCED95C.8080302@voidspace.org.uk> <20101101153339.85A5F22805B@kimball.webabinitio.net> <4CCEEA3B.30200@voidspace.org.uk>
Message-ID:

On Tue, Nov 2, 2010 at 2:26 AM, Michael Foord wrote:
> On 01/11/2010 16:23, Nick Coghlan wrote:
>> Looking at assertItemsEqual, I'd be inclined to insert a check that
>> falls back to the "unorderable_list_difference" approach in the case
>> where "expected != sorted(reversed(expected))"
>
> If that is sufficient then it would be a nice way of keeping the fast path.

As far as I can tell, that check is a valid partial ordering detector
for any sequence that contains one or more pairs of items for which LT,
EQ and GE are all False:

>>> seq = [{1}, {2}]
>>> seq[0] < seq[1]
False
>>> seq[0] == seq[1]
False
>>> seq[0] > seq[1]
False
>>> sorted(seq)
[{1}, {2}]
>>> sorted(reversed(sorted(seq)))
[{2}, {1}]

Obviously, if the sequence doesn't contain any such items (e.g. all
subsets of each other), then it will look like a total ordering and use
the fast path. I see that as an upside :)

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From brett at python.org Mon Nov 1 17:50:12 2010
From: brett at python.org (Brett Cannon)
Date: Mon, 1 Nov 2010 09:50:12 -0700
Subject: [Python-Dev] Continuing 2.x
In-Reply-To: <2E034B571A5CE44E949B9FCC3B6D24EE5761FF4B@exchcn.ccp.ad.local>
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FC57@exchcn.ccp.ad.local> <4CC9749E.1030200@v.loewis.de> <20101028120414.1f4f3016@mission> <2E034B571A5CE44E949B9FCC3B6D24EE5761FD38@exchcn.ccp.ad.local> <4CCACF52.8090202@egenix.com> <4CCAD656.7070408@v.loewis.de> <2E034B571A5CE44E949B9FCC3B6D24EE5761FF4B@exchcn.ccp.ad.local>
Message-ID:

2010/11/1 Kristján Valur Jónsson :
> I've been sitting on the sideline seeing this unfold.
> We've seen some different viewpoints on the matter and I'm happy to see that I'm not the only one lamenting the proclaimed death of the 2.x lineage.
> However, as correctly stated by Martin, I merely voiced a suggestion and I have gotten helpful counter-suggestions.
> A private branch is fine (More correctly a fork, even, as people have pointed out) and Hg is going to support user-branches.
> In the meantime, however, unless someone strongly objects, I'm probably going to set up a temporary branch off /release27-maint under /stackless/sandboxes/ until the Hg switchover. Name undecided yet.

No objection from me; branches in svn are for experimental stuff and
this is what you are proposing.
-Brett

>
> Cheers,
> Kristján
>
>
>> -----Original Message-----
>> From: python-dev-bounces+kristjan=ccpgames.com at python.org
>> [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] On Behalf
>> Of "Martin v. Löwis"
>> Sent: 29. október 2010 22:13
>> This thread was started by a specific proposal from Kristjan, and
>> Kristjan got a specific suggestion on how to proceed (namely, wait
>> for the Mercurial switchover, then publish his changes in a branch).
>> So despite the more general subject (which I think is still mostly
>> hypothetical), the real issue Kristjan raised has been resolved,
>> AFAICT (although Kristjan has not yet voiced an opinion of whether
>> he finds that resolution acceptable).
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org
>

From vinay_sajip at yahoo.co.uk Mon Nov 1 18:49:36 2010
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Mon, 1 Nov 2010 17:49:36 +0000 (UTC)
Subject: [Python-Dev] Change to logging Formatters: support for alternative format styles
References: <20101029110702.414c298d@mission>
Message-ID:

Olemis Lang gmail.com> writes:

>
> > For some people, use of {} over % is more about personal taste than about the
> > actual usage of str.format's flexibility;
>
> Thought you were talking about me, you only needed to say "he has
> black hair and blue eyes" ... ;o)
>

No, it was a general comment; I don't know your preferences.

The basicConfig() change has now been checked into the py3k branch.
Regards,

Vinay Sajip

From tjreedy at udel.edu Mon Nov 1 19:37:33 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 01 Nov 2010 14:37:33 -0400
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <4CCE2C27.5000706@voidspace.org.uk>
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk>
Message-ID:

On 10/31/2010 10:55 PM, Michael Foord wrote:
> fact that sets / frozensets can't be sorted in the standard Python way
> (their less than comparison adheres to the set definition). This is
> something that will probably surprise many Python developers:

Any programmer who sorts (or uses functions that depend on proper sorting) should know and respect the difference between partial orders, such as set inclusion, and total orders, such as lex order of sequences. So I am surprised by the above claim ;-).

> >>> a = [{2,4}, {1,2}]
> >>> b = a[::-1]
> >>> sorted(a)
> [set([2, 4]), set([1, 2])]
> >>> sorted(b)
> [set([1, 2]), set([2, 4])]

The bug is not in the sort method, but the attempt to sort partially ordered items, which are not properly sortable.

a = [{2,4}, {1,2}]
b = a[::-1]
print(sorted(a,key=sorted))
#[{1, 2}, {2, 4}]
print(sorted(b,key=sorted))
#[{1, 2}, {2, 4}]

A test method (or internal branch) that depends on sorting to work properly could just refuse to work with sets (and frozensets).

--
Terry Jan Reedy

From ctb at msu.edu Tue Nov 2 02:40:05 2010
From: ctb at msu.edu (C.
Titus Brown) Date: Mon, 1 Nov 2010 18:40:05 -0700 Subject: [Python-Dev] Cleaning-up the new unittest API In-Reply-To: References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> Message-ID: <20101102014005.GA28322@idyll.org> On Mon, Nov 01, 2010 at 02:37:33PM -0400, Terry Reedy wrote: > On 10/31/2010 10:55 PM, Michael Foord wrote: > >> fact that sets / frozensets can't be sorted in the standard Python way >> (their less than comparison adheres to the set definition). This is >> something that will probably surprise many Python developers: > > Any programmer who sorts (or uses functions that depend on proper > sorting) should know and respect the difference between partial orders, > such as set inclusion, and total orders, such as lex order of sequences. > So I am surprised by the above claim ;-). Huh. Count me out. I guess I don't live up to your standards. --titus p.s. Seriously? I can accept that there's a rational minimalist argument for this "feature", but arguing that it's somehow the responsibility of a programmer to *expect* this seems kind of whack. -- C. 
Titus Brown, ctb at msu.edu

From brett at python.org Tue Nov 2 03:35:18 2010
From: brett at python.org (Brett Cannon)
Date: Mon, 1 Nov 2010 19:35:18 -0700
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <20101027124223.119ce5b1@pitrou.net>
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net>
Message-ID:

On Wed, Oct 27, 2010 at 03:42, Antoine Pitrou wrote:
> On Tue, 26 Oct 2010 22:06:37 -0400
> Alexander Belopolsky wrote:
>>
>> While I appreciate your and Michael's eloquence, I don't really see
>> why five 400-line modules are necessarily easier to maintain than one
>> 2000-line module. Splitting code into modules is certainly a good
>> thing when the resulting modules can be used independently. This
>> helps users write leaner programs, reduces mental footprint of
>> individual modules, etc, etc. The split unittest module does not
>> bring any such benefits. It still presents a single "big-ball-of-mud"
>> namespace, only rather than implemented in a single file, it is now
>> swept in from eight different files.
>
> Are you saying that it has become a pile of medium-sized balls of mud?
> I would like to say thanks for the mud, Michael! It's high quality mud
> for sure.

I realize I am a little late in this reply but issue 10273 linked to this and so now I am actually bothering to read this thread since it felt like bikeshedding when the thread began.

I think the issue here is that the file structure of the code no longer matches the public API documented by unittest. Personally I, like most people it seems, prefer source files to be structured in a way to match the public API. In the case of unittest Michael didn't.
He did ask python-dev if it was okay to do what he did, we all kept quiet, and now we have realized that most of us prefer to have files that mirror the API; lesson learned. But Python 2.7 shipped with this file layout so we have to stick with it lest we break any imports out there that use the package-like file structure Michael went with (which we could actually document and use if we wanted now that Michael has already broken things up). Reversing the trend by sticking all the code into unittest/__init__.py and then sticking import shims into the existing modules would be a stupid waste of time, especially considering the head maintainer of the package likes it the way it is. So basically it seems like we have learned a lesson: we prefer to have our code structured in files that match the public API. I think that is a legitimate design rule for the stdlib to follow from now on, but in the case of unittest it's too late to change it back (and it's a minor price to pay to learn this lesson and to have Michael maintaining unittest like he has been, plus we could consider using the new structure so that the public API matches the file structure when the need arises). From stephen at xemacs.org Tue Nov 2 09:28:43 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 02 Nov 2010 17:28:43 +0900 Subject: [Python-Dev] Cleaning-up the new unittest API In-Reply-To: <20101102014005.GA28322@idyll.org> References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101102014005.GA28322@idyll.org> Message-ID: <878w1cjfic.fsf@uwakimon.sk.tsukuba.ac.jp> C. Titus Brown writes: > p.s. Seriously? I can accept that there's a rational minimalist argument > for this "feature", It is a feature, even if you aren't gonna need it. I want it. 
Many programmers do know that sets are partially ordered by inclusion, preordered by size, and (in Python) totally ordered by memory address. There's nothing wrong with not knowing that -- these are rather abstract mathematical concepts. But it's very useful that sorted() or .sort() use <=, and it's very useful that Python so often obeys simple consistent rules, and it would be quite confusing to those who do understand that "in Python the set type is partially ordered by inclusion" if sorted() used some other relation to order collections of sets.

It's not so hard to change this:

    class SizedSet(set):
        def __lt__(a, b):
            return len(a) < len(b)
        def __le__(a, b):
            return len(a) <= len(b)
        def __gt__(a, b):
            return len(a) > len(b)
        def __ge__(a, b):
            return len(a) >= len(b)
        # These two are arguable, which makes size comparison not so
        # great as a candidate for the OOWTDI of set.__lt__().
        def __eq__(a, b):
            return len(a) == len(b)
        def __ne__(a, b):
            return len(a) != len(b)

If there were an obvious way to compare sets for use in sorting, that way would very likely be the most useful definition for <=, too. But there isn't, really (it's pretty obvious that comparing memory addresses is implausible, but otherwise, there are lots of candidates that are at least sometimes useful). Do you think otherwise? If so, what do you propose for the OOWTDI of sorting a collection of sets?

 > but arguing that it's somehow the responsibility of a programmer to
 > *expect* this seems kind of whack.

I don't quite agree that everyone should "expect exactly the implemented behavior", but I do think it's a Python *programmer's* responsibility to refrain from expecting something else in this case.
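As a rough illustration of the point in the message above (not something any poster proposed), one way to get a reproducible order for a collection of sets is to hand sorted() an explicit key that is totally ordered. The helper name sort_sets and the choice of key (size first, then the sorted elements) are assumptions made for this sketch; it also assumes the elements of each set can be compared with one another.

```python
def sort_sets(sets):
    """Sort sets by (size, sorted elements).

    The key is a tuple of an int and a list, so it is totally ordered
    as long as the set elements themselves are mutually comparable.
    This is one arbitrary choice of total order, picked for illustration.
    """
    return sorted(sets, key=lambda s: (len(s), sorted(s)))


a = [{2, 4}, {1, 2}]
b = a[::-1]

# With an explicit key, the input order no longer affects the result.
result_a = sort_sets(a)
result_b = sort_sets(b)
```

Because the key never falls back on subset comparison, the two inputs above yield the same ordering, which is the property a comparison helper in a test framework would want.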

From victor.stinner at haypocalc.com Tue Nov 2 13:55:40 2010
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 2 Nov 2010 13:55:40 +0100
Subject: [Python-Dev] [Python-checkins] r85902 - in python/branches/py3k/Lib: os.py test/test_os.py
In-Reply-To:
References: <20101029003858.7D584EEA52@mail.python.org>
Message-ID: <201011021355.40153.victor.stinner@haypocalc.com>

I don't know how to ignore the BytesWarning without importing warnings. How can I do that?

Victor

Le vendredi 29 octobre 2010 04:31:42, Benjamin Peterson a écrit :
> 2010/10/28 victor.stinner :
> > Author: victor.stinner
> > Date: Fri Oct 29 02:38:58 2010
> > New Revision: 85902
> >
> > Log:
> > Issue #10210: os.get_exec_path() ignores BytesWarning warnings
> >
> >
> > Modified:
> >   python/branches/py3k/Lib/os.py
> >   python/branches/py3k/Lib/test/test_os.py
> >
> > Modified: python/branches/py3k/Lib/os.py
> > ==============================================================================
> > --- python/branches/py3k/Lib/os.py (original)
> > +++ python/branches/py3k/Lib/os.py Fri Oct 29 02:38:58 2010
> > @@ -382,18 +382,32 @@
> >      *env* must be an environment variable dict or None. If *env* is
> >      None, os.environ will be used.
> >      """
> > +    # Use a local import instead of a global import to avoid bootstrap issue:
> > +    # the os module is used to build Python extensions.
> > +    import warnings
>
> This sort of function import should be avoided.

From ctb at msu.edu Tue Nov 2 15:05:55 2010
From: ctb at msu.edu (C.
Titus Brown) Date: Tue, 2 Nov 2010 07:05:55 -0700 Subject: [Python-Dev] Cleaning-up the new unittest API In-Reply-To: <878w1cjfic.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101102014005.GA28322@idyll.org> <878w1cjfic.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20101102140555.GB22027@idyll.org> On Tue, Nov 02, 2010 at 05:28:43PM +0900, Stephen J. Turnbull wrote: > C. Titus Brown writes: > > > p.s. Seriously? I can accept that there's a rational minimalist argument > > for this "feature", > > It is a feature, even if you aren't gonna need it. I want it. > > Many programmers do know that sets are partially ordered by inclusion, > preordered by size, and (in Python) totally ordered by memory address. > There's nothing wrong with not knowing that -- these are rather > abstract mathematical concepts. But it's very useful that sorted() or > .sort() use <=, and it's very useful that Python so often obeys simple > consistent rules, and it would be quite confusing to those who do > understand that "in Python the set type is partially ordered by > inclusion" if sorted() used some other relation to order collections > of sets. > > It's not so hard to change this: [ ... ] > If there were an obvious way to compare sets for use in sorting, that > way would very likely be the most useful definition for <=, too. But > there isn't, really (it's pretty obvious that comparing memory > addresses is implausible, but otherwise, there are lots of candidates > that are at least sometimes useful). Do you think otherwise? If so, > what do you propose for the OOWTDI of sorting a collection of sets? I don't have one... > > but arguing that it's somehow the responsibility of a programmer to > > *expect* this seems kind of whack. 
>
> I don't quite agree that everyone should "expect exactly the
> implemented behavior", but I do think it's a Python *programmer's*
> responsibility to refrain from expecting something else in this case.

...but, as someone who has to figure out how to teach stuff to CSE undergrads (and biology grads) I hate the statement "...any programmer should expect this..." because (unless you're going to disqualify a huge swathe of people from being programmers) it's *just not true*. I don't expect Python to cater to the lowest common denominator but we should be mindful of our audience, too.

I think Python has a great advantage in not being too surprising much of the time, which helps quite a bit with learning. I hope people keep that in mind for future features.

cheers,
--t

--
C. Titus Brown, ctb at msu.edu

From ocean-city at m2.ccsnet.ne.jp Tue Nov 2 16:03:33 2010
From: ocean-city at m2.ccsnet.ne.jp (Hirokazu Yamamoto)
Date: Wed, 03 Nov 2010 00:03:33 +0900
Subject: [Python-Dev] Resource leaks warnings
In-Reply-To: <20100929130156.29767af1@pitrou.net>
References: <20100808221846.CFD80EEA3F@mail.python.org> <20100929130156.29767af1@pitrou.net>
Message-ID: <4CD02845.2070107@m2.ccsnet.ne.jp>

Sorry for the late post.

On 2010/09/29 20:01, Antoine Pitrou wrote:
> Furthermore, it can produce real bugs, especially under Windows when
> coupled with reference cycles created by traceback objects

I think this can be relaxed with the patch in #9815.
;-) From tjreedy at udel.edu Tue Nov 2 17:23:10 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 02 Nov 2010 12:23:10 -0400 Subject: [Python-Dev] Cleaning-up the new unittest API In-Reply-To: <20101102140555.GB22027@idyll.org> References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101102014005.GA28322@idyll.org> <878w1cjfic.fsf@uwakimon.sk.tsukuba.ac.jp> <20101102140555.GB22027@idyll.org> Message-ID: On 11/2/2010 10:05 AM, C. Titus Brown wrote: > ...but, as someone who has to figure out how to teach stuff to CSE undergrads > (and biology grads) I hate the statement "...any programmer should > expect this..." And indeed I (intentionally) did not say that. People who are ignorant and inexperienced about something should avoid making expectations in any direction until they have read the doc and experimented a bit. What I did say in the post you responded to is "Any programmer who sorts (or uses functions that depend on proper sorting) should know and respect the difference between partial orders, such as set inclusion, and total orders, such as lex order of sequences." I should hope that you teach the difference, or rather, help students to notice what they already know. Tell them to consider that difference between sorting people by a totally ordered characteristic like height or weight and a characteristic that is at best partially ordered, like hair color or ethical character. Or have them consider the partial order dependencies between morning get-ready-for-class activities (socks before shoes versus pants and shirt in either order). They already do topological sorting every day, even if the name seems fancy. 
-- Terry Jan Reedy From fuzzyman at voidspace.org.uk Tue Nov 2 17:29:36 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 02 Nov 2010 16:29:36 +0000 Subject: [Python-Dev] Cleaning-up the new unittest API In-Reply-To: References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101102014005.GA28322@idyll.org> <878w1cjfic.fsf@uwakimon.sk.tsukuba.ac.jp> <20101102140555.GB22027@idyll.org> Message-ID: <4CD03C70.4040903@voidspace.org.uk> On 02/11/2010 16:23, Terry Reedy wrote: > On 11/2/2010 10:05 AM, C. Titus Brown wrote: > >> ...but, as someone who has to figure out how to teach stuff to CSE >> undergrads >> (and biology grads) I hate the statement "...any programmer should >> expect this..." > > And indeed I (intentionally) did not say that. People who are ignorant > and inexperienced about something should avoid making expectations in > any direction until they have read the doc and experimented a bit. Expectations come from consistent behaviour. sorted behaves consistently for *most* of the built-in types and will also work for custom types that provide a 'standard' (total ordering) implementation of __lt__. It is very easy to *not realise* that a consequence of sets (and frozensets) providing partial ordering through operator overloading is that sorting is undefined for them. Particularly as it still works for other mutable collections. Worth being aware that custom implementations of standard operators will break expectations of users who aren't intimately aware of the problem domains that the specific type may be created for. 
All the best, Michael Foord > > What I did say in the post you responded to is "Any programmer who > sorts (or uses functions that depend on proper sorting) should know > and respect the difference between partial orders, such as set > inclusion, and total orders, such as lex order of sequences." I should > hope that you teach the difference, or rather, help students to notice > what they already know. Tell them to consider that difference between > sorting people by a totally ordered characteristic like height or > weight and a characteristic that is at best partially ordered, like > hair color or ethical character. Or have them consider the partial > order dependencies between morning get-ready-for-class activities > (socks before shoes versus pants and shirt in either order). They > already do topological sorting every day, even if the name seems fancy. > -- http://www.voidspace.org.uk/ From jacob at jacobian.org Tue Nov 2 17:37:17 2010 From: jacob at jacobian.org (Jacob Kaplan-Moss) Date: Tue, 2 Nov 2010 11:37:17 -0500 Subject: [Python-Dev] Cleaning-up the new unittest API In-Reply-To: References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101102014005.GA28322@idyll.org> <878w1cjfic.fsf@uwakimon.sk.tsukuba.ac.jp> <20101102140555.GB22027@idyll.org> Message-ID: On Tue, Nov 2, 2010 at 11:23 AM, Terry Reedy wrote: > What I did say in the post you responded to is "Any programmer who sorts (or > uses functions that depend on proper sorting) should know and respect the > difference between partial orders, such as set inclusion, and total orders, > such as lex order of sequences." FWIW (i.e. not much): before this thread if you'd asked me about partial and total orders I'd have had to run to Wikipedia real quick to figure it out. Hopefully I'm still allowed to use Python. 
Jacob

From fdrake at acm.org Tue Nov 2 17:41:37 2010
From: fdrake at acm.org (Fred Drake)
Date: Tue, 2 Nov 2010 12:41:37 -0400
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To:
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101102014005.GA28322@idyll.org> <878w1cjfic.fsf@uwakimon.sk.tsukuba.ac.jp> <20101102140555.GB22027@idyll.org>
Message-ID:

On Tue, Nov 2, 2010 at 12:37 PM, Jacob Kaplan-Moss wrote:
> Hopefully I'm still allowed to use Python.

Definitely! Python's a great place to learn about all these things. :-)

  -Fred

--
Fred L. Drake, Jr.
"A storm broke loose in my mind." --Albert Einstein

From stephen at xemacs.org Tue Nov 2 18:00:22 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 03 Nov 2010 02:00:22 +0900
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To:
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101102014005.GA28322@idyll.org> <878w1cjfic.fsf@uwakimon.sk.tsukuba.ac.jp> <20101102140555.GB22027@idyll.org>
Message-ID: <87hbfzwti1.fsf@uwakimon.sk.tsukuba.ac.jp>

Terry Reedy writes:

 > ethical character. Or have them consider the partial order dependencies
 > between morning get-ready-for-class activities (socks before shoes
 > versus pants and shirt in either order). They already do topological
 > sorting every day, even if the name seems fancy.

Augment the example a bit, perhaps: socks and pants before shoes, socks and pants in either order.
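The getting-dressed example above can be written down as a small sketch. It assumes a Python recent enough to ship graphlib (3.9 or later, so much newer than the versions discussed in this thread); the activity names and the must_come_first mapping are made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Map each activity to the activities that must happen before it.
# Socks and pants both precede shoes; nothing constrains the shirt.
must_come_first = {
    "shoes": {"socks", "pants"},
    "socks": set(),
    "pants": set(),
    "shirt": set(),
}

# static_order() yields one valid order; several orders satisfy the
# constraints, which is exactly what makes the order only partial.
order = list(TopologicalSorter(must_come_first).static_order())
```

Any order that puts socks and pants ahead of shoes is acceptable here, so code consuming such a result should check the constraints rather than compare against one fixed list.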
From exarkun at twistedmatrix.com Tue Nov 2 18:17:38 2010 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Tue, 02 Nov 2010 17:17:38 -0000 Subject: [Python-Dev] Cleaning-up the new unittest API In-Reply-To: <4CD03C70.4040903@voidspace.org.uk> References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101102014005.GA28322@idyll.org> <878w1cjfic.fsf@uwakimon.sk.tsukuba.ac.jp> <20101102140555.GB22027@idyll.org> <4CD03C70.4040903@voidspace.org.uk> Message-ID: <20101102171738.2040.158093645.divmod.xquotient.507@localhost.localdomain> On 04:29 pm, fuzzyman at voidspace.org.uk wrote: >On 02/11/2010 16:23, Terry Reedy wrote: >>On 11/2/2010 10:05 AM, C. Titus Brown wrote: >>>...but, as someone who has to figure out how to teach stuff to CSE >>>undergrads >>>(and biology grads) I hate the statement "...any programmer should >>>expect this..." >> >>And indeed I (intentionally) did not say that. People who are ignorant >>and inexperienced about something should avoid making expectations in >>any direction until they have read the doc and experimented a bit. >Expectations come from consistent behaviour. sorted behaves >consistently for *most* of the built-in types and will also work for >custom types that provide a 'standard' (total ordering) implementation >of __lt__. > >It is very easy to *not realise* that a consequence of sets (and >frozensets) providing partial ordering through operator overloading is >that sorting is undefined for them. Perhaps. The documentation for sets says this, though: Since sets only define partial ordering (subset relationships), the output of the list.sort() method is undefined for lists of sets. >Particularly as it still works for other mutable collections. 
Worth >being aware that custom implementations of standard operators will >break expectations of users who aren't intimately aware of the problem >domains that the specific type may be created for. I can't help thinking that most of this confusion is caused by using < for determining subsets. If < were not defined for sets and people had to use "set.issubset" (which exists already), then sorting a list with sets would raise an exception, a much more understandable failure mode than getting back a list in arbitrary order. Jean-Paul From fuzzyman at voidspace.org.uk Tue Nov 2 18:23:45 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 02 Nov 2010 17:23:45 +0000 Subject: [Python-Dev] Cleaning-up the new unittest API In-Reply-To: <20101102171738.2040.158093645.divmod.xquotient.507@localhost.localdomain> References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101102014005.GA28322@idyll.org> <878w1cjfic.fsf@uwakimon.sk.tsukuba.ac.jp> <20101102140555.GB22027@idyll.org> <4CD03C70.4040903@voidspace.org.uk> <20101102171738.2040.158093645.divmod.xquotient.507@localhost.localdomain> Message-ID: <4CD04921.4040401@voidspace.org.uk> On 02/11/2010 17:17, exarkun at twistedmatrix.com wrote: > On 04:29 pm, fuzzyman at voidspace.org.uk wrote: >> On 02/11/2010 16:23, Terry Reedy wrote: >>> On 11/2/2010 10:05 AM, C. Titus Brown wrote: >>>> ...but, as someone who has to figure out how to teach stuff to CSE >>>> undergrads >>>> (and biology grads) I hate the statement "...any programmer should >>>> expect this..." >>> >>> And indeed I (intentionally) did not say that. People who are >>> ignorant and inexperienced about something should avoid making >>> expectations in any direction until they have read the doc and >>> experimented a bit. >> Expectations come from consistent behaviour. 
sorted behaves >> consistently for *most* of the built-in types and will also work for >> custom types that provide a 'standard' (total ordering) >> implementation of __lt__. >> >> It is very easy to *not realise* that a consequence of sets (and >> frozensets) providing partial ordering through operator overloading >> is that sorting is undefined for them. > > Perhaps. The documentation for sets says this, though: > > Since sets only define partial ordering (subset relationships), the > output of the list.sort() method is undefined for lists of sets. Right, I did quote that exact text earlier in the thread. False expectations come when there are exceptions to otherwise-consistent behaviour. >> Particularly as it still works for other mutable collections. Worth >> being aware that custom implementations of standard operators will >> break expectations of users who aren't intimately aware of the >> problem domains that the specific type may be created for. > > I can't help thinking that most of this confusion is caused by using < > for determining subsets. If < were not defined for sets and people had > to use "set.issubset" (which exists already), then sorting a list with > sets would raise an exception, a much more understandable failure mode > than getting back a list in arbitrary order. > I agree. This is a cost of overloading operators with domain specific meanings. All the best, Michael Foord > Jean-Paul > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. 
By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

From tjreedy at udel.edu Tue Nov 2 23:13:47 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 02 Nov 2010 18:13:47 -0400
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <4CD04921.4040401@voidspace.org.uk>
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101102014005.GA28322@idyll.org> <878w1cjfic.fsf@uwakimon.sk.tsukuba.ac.jp> <20101102140555.GB22027@idyll.org> <4CD03C70.4040903@voidspace.org.uk> <20101102171738.2040.158093645.divmod.xquotient.507@localhost.localdomain> <4CD04921.4040401@voidspace.org.uk>
Message-ID:

On 11/2/2010 1:23 PM, Michael Foord wrote:

> Right, I did quote that exact text earlier in the thread. False
> expectations come when there are exceptions to otherwise-consistent
> behaviour.
>
>>> Particularly as it still works for other mutable collections. Worth
>>> being aware that custom implementations of standard operators will
>>> break expectations of users who aren't intimately aware of the
>>> problem domains that the specific type may be created for.
>>
>> I can't help thinking that most of this confusion is caused by using <
>> for determining subsets.
>> If < were not defined for sets and people had
>> to use "set.issubset" (which exists already), then sorting a list with
>> sets would raise an exception, a much more understandable failure mode
>> than getting back a list in arbitrary order.
>>
> I agree. This is a cost of overloading operators with domain specific
> meanings.

I disagree. In mathematics, total ordering is a special case of partial ordering, not the other way around. Set inclusion is a standard example of non-total ordering. In everyday life, another example (other than action dependencies) is ancestry relationships.

In general, acyclic directed graphs model sets with partial orders. Totally ordered linear chains, as with integers, are a special case.

A Python program, for instance, is usually a non-unique topological sort of a set of statements with a non-total dependency order. This is related to a topological sort of a set of actions with a non-total dependency order. A NameError, if not due to a misspelling, is typically a result of violating one of the space or time order constraints.

So I stick with my statement that a programmer should have some understanding (at least at a gut level) of non-total orders and non-unique sorts. They are a major part of what programming is.

--
Terry Jan Reedy

From ncoghlan at gmail.com Tue Nov 2 23:33:28 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 3 Nov 2010 08:33:28 +1000
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
In-Reply-To:
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net>
Message-ID:

On Tue, Nov 2, 2010 at 12:35 PM, Brett Cannon wrote:
> So basically it seems like we have learned a lesson: we prefer to have
> our code structured in files that match the public API.
> I think that
> is a legitimate design rule for the stdlib to follow from now on, but
> in the case of unittest it's too late to change it back (and it's a
> minor price to pay to learn this lesson and to have Michael
> maintaining unittest like he has been, plus we could consider using
> the new structure so that the public API matches the file structure
> when the need arises).

Something to note in PEP 8, perhaps?

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From greg.ewing at canterbury.ac.nz Tue Nov 2 23:33:39 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 03 Nov 2010 11:33:39 +1300
Subject: [Python-Dev] Cleaning-up the new unittest API
In-Reply-To: <20101102171738.2040.158093645.divmod.xquotient.507@localhost.localdomain>
References: <20866584-B4E0-4F97-8086-87455A836053@gmail.com> <4CCB9106.4020302@voidspace.org.uk> <4CCB9754.5000508@voidspace.org.uk> <4CCB9AD7.8020504@voidspace.org.uk> <4CCE2C27.5000706@voidspace.org.uk> <20101102014005.GA28322@idyll.org> <878w1cjfic.fsf@uwakimon.sk.tsukuba.ac.jp> <20101102140555.GB22027@idyll.org> <4CD03C70.4040903@voidspace.org.uk> <20101102171738.2040.158093645.divmod.xquotient.507@localhost.localdomain>
Message-ID: <4CD091C3.5030009@canterbury.ac.nz>

exarkun at twistedmatrix.com wrote:

> I can't help thinking that most of this confusion is caused by using <
> for determining subsets. If < were not defined for sets and people had
> to use "set.issubset" (which exists already), then sorting a list with
> sets would raise an exception, a much more understandable failure mode
> than getting back a list in arbitrary order.

Personally I think it was premature to throw out __cmp__. What should have happened instead is for __cmp__ to be augmented with a fourth outcome, "not equal but unordered". Then operations such as sorting that require a total ordering could use __cmp__ and complain if they get an unordered result.
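The "fourth outcome" idea in the message above can be approximated in user code today. The following is only a sketch of that idea, not an existing or proposed API: partial_cmp and strict_sorted are hypothetical names, and None stands in for "not equal but unordered". functools.cmp_to_key is the real standard-library adapter used to plug the comparison into sorted().

```python
from functools import cmp_to_key


def partial_cmp(a, b):
    """Return -1, 0 or 1 when a and b are ordered, or None when they are not."""
    if a == b:
        return 0
    if a < b:
        return -1
    if a > b:
        return 1
    return None  # not equal, but neither is less than the other


def strict_sorted(items):
    """Sort items, raising TypeError if an unordered pair is compared."""
    def cmp(a, b):
        result = partial_cmp(a, b)
        if result is None:
            raise TypeError(f"unordered items: {a!r} and {b!r}")
        return result
    return sorted(items, key=cmp_to_key(cmp))
```

One caveat: the sort only examines the pairs its algorithm happens to compare, so in a longer list an unordered pair could go unnoticed. A check that must be exhaustive would need to compare every pair explicitly.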
-- Greg From ncoghlan at gmail.com Tue Nov 2 23:38:12 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 3 Nov 2010 08:38:12 +1000 Subject: [Python-Dev] [Python-checkins] r85902 - in python/branches/py3k/Lib: os.py test/test_os.py In-Reply-To: <201011021355.40153.victor.stinner@haypocalc.com> References: <20101029003858.7D584EEA52@mail.python.org> <201011021355.40153.victor.stinner@haypocalc.com> Message-ID: On Tue, Nov 2, 2010 at 10:55 PM, Victor Stinner wrote: > I don't know how to ignore the BytesWarning without importing warnings. How can > I do that? I was suggesting trying to fix the bootstrap issue so you could use a top-level import, instead of working around it with a function level import (which we've learned from experience is a recipe for later reports from users of programs deadlocking on the import lock - we've made lots of improvements to avoid such deadlocks, but still prefer to avoid function level imports anyway). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From brett at python.org Tue Nov 2 23:43:29 2010 From: brett at python.org (Brett Cannon) Date: Tue, 2 Nov 2010 15:43:29 -0700 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> Message-ID: On Tue, Nov 2, 2010 at 15:33, Nick Coghlan wrote: > On Tue, Nov 2, 2010 at 12:35 PM, Brett Cannon wrote: >> So basically it seems like we have learned a lesson: we prefer to have >> our code structured in files that match the public API.
I think that >> is a legitimate design rule for the stdlib to follow from now on, but >> in the case of unittest it's too late to change it back (and it's a >> minor price to pay to learn this lesson and to have Michael >> maintaining unittest like he has been, plus we could consider using >> the new structure so that the public API matches the file structure >> when the need arises). > > Something to note in PEP 8, perhaps? If everyone agrees with making this policy, then yes. -Brett > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > From raymond.hettinger at gmail.com Tue Nov 2 23:47:58 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 2 Nov 2010 15:47:58 -0700 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> Message-ID: <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> On Nov 1, 2010, at 7:35 PM, Brett Cannon wrote: > > I think the issue here is that the file structure of the code no > longer matches the public API documented by unittest. Personally I, > like most people it seems, prefer source files to be structured in a > way to match the public API. In the case of unittest Michael didn't. > He did ask python-dev if it was okay to do what he did, we all kept > quiet, and now we have realized that most of us prefer to have files > that mirror the API; lesson learned. But Python 2.7 shipped with this > file layout so we have to stick with it lest we break any imports out > there that use the package-like file structure Michael went with > (which we could actually document and use if we wanted now that > Michael has already broken things up).
Reversing the trend by sticking > all the code into unittest/__init__.py and then sticking import shims > into the existing modules would be a stupid waste of time, especially > considering the head maintainer of the package likes it the way it is. I'm not sure I follow where we're stuck with the current package. AFAICT, the module is still used with "import unittest". The file splitting was done badly, so I don't think any of the components are usable directly, i.e. "from unittest.case import SkipTest". Also, I don't think the package structure was documented or announced. This is in contrast to the logging module which does have a clean separation of components and where it isn't unusual to import just part of the package. What is it you're seeing as a risk that I'm not seeing? Are we permanently locked into the exact ten filenames that are currently used: utils, suite, loader, case, result, main, signals, etc? Is the file structure now frozen? Raymond From raymond.hettinger at gmail.com Tue Nov 2 23:52:11 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 2 Nov 2010 15:52:11 -0700 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> Message-ID: <6F0FAD7F-FF5E-466D-8EA5-33B70D0D4218@gmail.com> On Nov 2, 2010, at 3:33 PM, Nick Coghlan wrote: > On Tue, Nov 2, 2010 at 12:35 PM, Brett Cannon wrote: >> So basically it seems like we have learned a lesson: we prefer to have >> our code structured in files that match the public API.
I think that >> is a legitimate design rule for the stdlib to follow from now on, but >> in the case of unittest it's too late to change it back (and it's a >> minor price to pay to learn this lesson and to have Michael >> maintaining unittest like he has been, plus we could consider using >> the new structure so that the public API matches the file structure >> when the need arises). > > Something to note in PEP 8, perhaps? I'll propose some PEP 8 wording in the bug tracker (essentially advice on when and how to use packaging), and everyone can offer their assent, dissent, and word-smithing. Raymond From barry at python.org Tue Nov 2 23:58:05 2010 From: barry at python.org (Barry Warsaw) Date: Tue, 2 Nov 2010 18:58:05 -0400 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> Message-ID: <20101102185805.08ca8ca8@mission> On Nov 02, 2010, at 03:43 PM, Brett Cannon wrote: >On Tue, Nov 2, 2010 at 15:33, Nick Coghlan wrote: >> On Tue, Nov 2, 2010 at 12:35 PM, Brett Cannon wrote: >>> So basically it seems like we have learned a lesson: we prefer to have >>> our code structured in files that match the public API. I think that >>> is a legitimate design rule for the stdlib to follow from now on, but >>> in the case of unittest it's too late to change it back (and it's a >>> minor price to pay to learn this lesson and to have Michael >>> maintaining unittest like he has been, plus we could consider using >>> the new structure so that the public API matches the file structure >>> when the need arises). >> >> Something to note in PEP 8, perhaps? > >If everyone agrees with making this policy, then yes. 
If SHOULD not MUST, then +0 -Barry From guido at python.org Tue Nov 2 23:58:51 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 2 Nov 2010 15:58:51 -0700 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> Message-ID: On Tue, Nov 2, 2010 at 3:47 PM, Raymond Hettinger wrote: > I'm not sure I follow where we're stuck with the current package. > AFAICT, the module is still used with "import unittest". > The file splitting was done badly, so I don't think any of the > components are usable directly, i.e. "from unittest.case import SkipTest". > Also, I don't think the package structure was documented or announced. > > This is in contrast to the logging module which does have a > clean separation of components and where it isn't unusual > to import just part of the package. > > What is it you're seeing as a risk that I'm not seeing? > Are we permanently locked into the exact ten filenames > that are currently used: utils, suite, loader, case, result, main, signals, > etc? > Is the file structure now frozen? To spout a somewhat contrarian opinion, I just browsed the new unittest package, and the structure seems reasonable to me, even if its submodules are not particularly reusable. I've used this kind of style for development myself. What is so offensive about it?
-- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Tue Nov 2 23:59:28 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 02 Nov 2010 23:59:28 +0100 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> Message-ID: <1288738768.3541.12.camel@localhost.localdomain> On Tuesday 02 November 2010 at 15:47 -0700, Raymond Hettinger wrote: > > What is it you're seeing as a risk that I'm not seeing? > Are we permanently locked into the exact ten filenames > that are currently used: utils, suite, loader, case, result, main, > signals, etc? > Is the file structure now frozen? I don't think it's frozen. It's just that Michael and Benjamin (perhaps others too) prefer it like that, and given who does most of the maintenance and improvement work it is reasonable to respect that decision. If one day someone else becomes maintainer of unittest, then, sure, they can undo the splitting or do it differently. But right now there's no reason to change. Oh, and I much prefer a splitting without any impact on the public API. I *hate* writing "urllib.request.urlopen" and I really wish we hadn't done that; "urllib.urlopen" was so much easier to remember it isn't funny :/ Regards Antoine.
From brett at python.org Wed Nov 3 00:00:40 2010 From: brett at python.org (Brett Cannon) Date: Tue, 2 Nov 2010 16:00:40 -0700 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> Message-ID: On Tue, Nov 2, 2010 at 15:47, Raymond Hettinger wrote: > On Nov 1, 2010, at 7:35 PM, Brett Cannon wrote: > > I think the issue here is that the file structure of the code no > longer matches the public API documented by unittest. Personally I, > like most people it seems, prefer source files to be structured in a > way to match the public API. In the case of unittest Michael didn't. > He did ask python-dev if it was okay to do what he did, we all kept > quiet, and now we have realized that most of us prefer to have files > that mirror the API; lesson learned. But Python 2.7 shipped with this > file layout so we have to stick with it lest we break any imports out > there that use the package-like file structure Michael went with > (which we could actually document and use if we wanted now that > Michael has already broken things up). Reversing the trend by sticking > all the code into unittest/__init__.py and then sticking import shims > into the existing modules would be a stupid waste of time, especially > considering the head maintainer of the package likes it the way it is. > > I'm not sure I follow where we're stuck with the current package. > AFAICT, the module is still used with "import unittest". Yes, as far as you can tell, but who the hell knows what someone is doing with code you are *not* aware of. 
As I said, Python 2.7 shipped with the code structured like this, so it's possible someone is importing unittest.case.TestCase instead of unittest.TestCase. > The file splitting was done badly, so I don't think any of the > components are usable directly, i.e. "from unittest.case import SkipTest". I wouldn't say it was done badly, just non-standard. I was able to figure out where all the key classes were based on the file names. I personally would have no trouble doing ``from unittest.case import TestCase`` if more test case classes came along for various needs. > Also, I don't think the package structure was documented or announced. Announced publicly? No, not that I know of. > This is in contrast to the logging module which does have a > clean separation of components and where it isn't unusual > to import just part of the package. > What is it you're seeing as a risk that I'm not seeing? Broken imports between Python 2.7 code and any version of Python where unittest is merged back into a single module. > Are we permanently locked into the exact ten filenames > that are currently used: utils, suite, loader, case, result, main, signals, > etc? > Is the file structure now frozen? Somewhat, yes. Screwing with unittest is always touchy as absolutely no one wants their tests to break, and that includes messing with imports. We could stick in import shims to shift everything into unittest/__init__.py, but the benefits you have outlined already don't strike me as worth the hassle, especially since you won't ever get your unittest.py format back.
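To make the risk concrete, here is a small check, offered as a sketch rather than anything from the thread. It assumes a Python in which unittest has the package layout discussed here (2.7 or later); the helper name same_objects is made up for the example. It shows that the submodule spellings and the documented top-level names refer to the very same objects, so code written with either spelling runs today, and only the submodule spelling would break if the files were merged or renamed.

```python
import unittest
import unittest.case
import unittest.loader
import unittest.result
import unittest.runner
import unittest.suite

# Documented top-level name -> the object as reached through its submodule.
SUBMODULE_OBJECTS = {
    "TestCase": unittest.case.TestCase,
    "TestLoader": unittest.loader.TestLoader,
    "TestResult": unittest.result.TestResult,
    "TextTestRunner": unittest.runner.TextTestRunner,
    "TestSuite": unittest.suite.TestSuite,
}


def same_objects():
    """Return True if every top-level name is the object defined in its submodule."""
    return all(
        getattr(unittest, name) is obj
        for name, obj in SUBMODULE_OBJECTS.items()
    )


print(same_objects())
```

An import shim of the kind mentioned above would keep both spellings working after a merge, at the cost of carrying the extra stub modules indefinitely.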
From benjamin at python.org Wed Nov 3 00:08:51 2010 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 2 Nov 2010 18:08:51 -0500 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> Message-ID: 2010/11/2 Raymond Hettinger : > On Nov 1, 2010, at 7:35 PM, Brett Cannon wrote: > > I think the issue here is that the file structure of the code no > longer matches the public API documented by unittest. Personally I, > like most people it seems, prefer source files to be structured in a > way to match the public API. In the case of unittest Michael didn't. > He did ask python-dev if it was okay to do what he did, we all kept > quiet, and now we have realized that most of us prefer to have files > that mirror the API; lesson learned. But Python 2.7 shipped with this > file layout so we have to stick with it lest we break any imports out > there that use the package-like file structure Michael went with > (which we could actually document and use if we wanted now that > Michael has already broken things up). Reversing the trend by sticking > all the code into unittest/__init__.py and then sticking import shims > into the existing modules would be a stupid waste of time, especially > considering the head maintainer of the package likes it the way it is. > > I'm not sure I follow where we're stuck with the current package. > AFAICT, the module is still used with "import unittest". > The file splitting was done badly, so I don't think there any of the > components are usable directly, i.e. "from unitest.case import SkipTest". 
> Also, I don't think the package structure was documented or announced. > This is in contrast to the logging module which does have a > clean separation of components and where it isn't unusual > to import just part of the package. See http://docs.python.org/whatsnew/2.7.html#updated-module-unittest -- Regards, Benjamin From fuzzyman at voidspace.org.uk Wed Nov 3 00:11:57 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 02 Nov 2010 23:11:57 +0000 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> Message-ID: <4CD09ABD.9020908@voidspace.org.uk> On 02/11/2010 22:43, Brett Cannon wrote: > On Tue, Nov 2, 2010 at 15:33, Nick Coghlan wrote: >> On Tue, Nov 2, 2010 at 12:35 PM, Brett Cannon wrote: >>> So basically it seems like we have learned a lesson: we prefer to have >>> our code structured in files that match the public API. I think that >>> is a legitimate design rule for the stdlib to follow from now on, but >>> in the case of unittest it's too late to change it back (and it's a >>> minor price to pay to learn this lesson and to have Michael >>> maintaining unittest like he has been, plus we could consider using >>> the new structure so that the public API matches the file structure >>> when the need arises). >> Something to note in PEP 8, perhaps? > If everyone agrees with making this policy, then yes. > I'd like to reply a bit further, I'll do it as a reply to your earlier email though. Michael > -Brett > >> Cheers, >> Nick. 
>> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From raymond.hettinger at gmail.com Wed Nov 3 00:20:38 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 2 Nov 2010 16:20:38 -0700 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> Message-ID: <411A82ED-8A80-4901-B304-8E4D5A17B7C4@gmail.com> On Nov 2, 2010, at 3:58 PM, Guido van Rossum wrote: > To spout a somewhat contrarian opinion, I just browsed the new > unittest package, and the structure seems reasonable to me, even if > its submodules are not particularly reusable. I've used this kind of > style for development myself. What is so offensive about it? I don't find anything offensive about it.
The issues have to do with being able to find and analyze code. For example, to find out what assertItemsEqual does, I have to figure out that it was put in the case.py file. In Py2.6, you could use IDLE's Open Module tool to immediately bring up all the source for unittest. Then you could fire up the class browser to quickly see and navigate the structure, but that also no longer works in Py2.7. Also, it used to be that just knowing the module name was sufficient to find the code with http://svn.python.org/view/python/branches/release26-maint/Lib/unittest.py?view=markup All you needed to study the code was a web browser and its find function. Now you need to open ten tabs to be able to browse this code. IOW, the packaging broke a read-the-source-luke style of research that I've been teaching people to use for years. I probably didn't articulate the above very well, but I think Martin said it more succinctly in this same thread. The other issue that Brett pointed out is that the file names now become part of the API, "from unittest.utils import safe_repr". In the logging module, packaging was done well. The files fell along natural lines in the API, and some of the components were usable separately and testable separately. Likewise with the xml packages. In contrast, the unittest module is full of cross-imports, and tightly coupled pieces (like suite and case) have been separated. Raymond
From solipsis at pitrou.net Wed Nov 3 00:24:46 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 03 Nov 2010 00:24:46 +0100 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <411A82ED-8A80-4901-B304-8E4D5A17B7C4@gmail.com> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <411A82ED-8A80-4901-B304-8E4D5A17B7C4@gmail.com> Message-ID: <1288740286.3541.22.camel@localhost.localdomain> On Tuesday 02 November 2010 at 16:20 -0700, Raymond Hettinger wrote: > > For example, to find out what assertItemsEqual does, I have > to figure out that it was put in the case.py file. Well, it's a TestCase method, so it seems rather intuitive to look for it in case.py. Regards Antoine. From raymond.hettinger at gmail.com Wed Nov 3 00:32:09 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 2 Nov 2010 16:32:09 -0700 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> Message-ID: On Nov 2, 2010, at 4:00 PM, Brett Cannon wrote: >> Are we permanently locked into the exact ten filenames >> that are currently used: utils, suite, loader, case, result, main, signals, >> etc? >> Is the file structure now frozen? > > Somewhat, yes. That's a bummer. Sounds like a decision to split a module into a package is a big commitment.
Each of the individual file names becomes a permanent part of the API. Even future additional splits are precluded because it might break someone's dotted import (i.e. not a single function can be moved between those files -- once in unittest.utils, always in unittest.utils). Raymond From solipsis at pitrou.net Wed Nov 3 00:34:15 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 03 Nov 2010 00:34:15 +0100 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> Message-ID: <1288740855.3541.24.camel@localhost.localdomain> On Tuesday 02 November 2010 at 16:32 -0700, Raymond Hettinger wrote: > On Nov 2, 2010, at 4:00 PM, Brett Cannon wrote: > >> Are we permanently locked into the exact ten filenames > >> that are currently used: utils, suite, loader, case, result, main, signals, > >> etc? > >> Is the file structure now frozen? > > > > Somewhat, yes. > > That's a bummer. > > Sounds like a decision to split a module into a package is a big > commitment. Each of the individual file names becomes a permanent > part of the API. Even future additional splits are precluded because > it might break someone's dotted import (i.e. not a single function can > be moved between those files -- once in unittest.utils, always in > unittest.utils). I don't agree with this. Until it's documented, it's an implementation detail and should be able to change without notice. If someone wants to depend on some undocumented detail of the directory layout it's their problem (like people depending on bytecode and other stuff).
From fuzzyman at voidspace.org.uk Wed Nov 3 00:34:38 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 02 Nov 2010 23:34:38 +0000 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> Message-ID: <4CD0A00E.2010709@voidspace.org.uk> On 02/11/2010 23:00, Brett Cannon wrote: > On Tue, Nov 2, 2010 at 15:47, Raymond Hettinger > wrote: >> On Nov 1, 2010, at 7:35 PM, Brett Cannon wrote: >> >> I think the issue here is that the file structure of the code no >> longer matches the public API documented by unittest. Personally I, >> like most people it seems, prefer source files to be structured in a >> way to match the public API. In the case of unittest Michael didn't. >> He did ask python-dev if it was okay to do what he did, we all kept >> quiet, and now we have realized that most of us prefer to have files >> that mirror the API; lesson learned. But Python 2.7 shipped with this >> file layout so we have to stick with it lest we break any imports out >> there that use the package-like file structure Michael went with >> (which we could actually document and use if we wanted now that >> Michael has already broken things up). Reversing the trend by sticking >> all the code into unittest/__init__.py and then sticking import shims >> into the existing modules would be a stupid waste of time, especially >> considering the head maintainer of the package likes it the way it is. >> >> I'm not sure I follow where we're stuck with the current package. >> AFAICT, the module is still used with "import unittest". 
> Yes, as far as you can tell, but who the hell knows what someone is > doing with code you are *not* aware of. As I said, Python 2.7 shipped > with the code structured like this, so it's possible someone is > importing unittest.case.TestCase instead of unittest.TestCase. > It is also shipped in unittest (and unittest2py3k I might add) so that users of earlier versions of Python can use the new features seamlessly. (unittest2 will be in Django 1.3.) Much better times to discuss this would be when it was proposed or when it was done, not months after it has been shipped in a production release. > [snip...] >> This is in contrast to the logging module which does have a >> clean separation of components and where it isn't unusual >> to import just part of the package. >> What is it you're seeing as a risk that I'm not seeing? > Broken imports between Python 2.7 code and any version of Python where > unittest is re-merged back into a single module. > I *know* that some people are using the new package structure directly, because the topic has come up on the Testing in Python mailing list. >> Are we permanently locked into the exact ten filenames >> that are currently used: utils, suite, loader, case, result, main, signals, >> etc? >> Is the file structure now frozen? > Somewhat, yes. Screwing with unittest is always touchy as absolutely > no one wants their tests to break, and that includes messing with > imports. We could stick in import shims to shift everything into > unittest/__init__.py, but the benefits you have outlined already don't > strike me as not worth the hassle especially since you won't ever get > your unittest.py format back. Absolutely, that would be the worst of all possible worlds - a monolithic huge module but still embedded in a package. People *are* using the existing package structure to import directly from (against my advice!) as this particular topic has been discussed on the Testing In Python mailing list. 
Although Raymond has been vociferous in his detestation of this particular split he does not represent a "clear consensus" in favour of merging back. Both Fred Drake and Barry Warsaw voiced their approval and on the "Clean up unittest API" issue both yourself (Brett) and Antoine have supported keeping the existing structure. Alexander Belopolsky and Martin Loewis expressed difficulties with the new structure, but that was in response to the original email from Raymond that didn't seem (on my reading) to expressly suggest merging unittest back into a module but instead seemed to be using it as an example. I am aware of the costs of having a package rather than a single (however large it may be) module, but to my mind the benefits to maintenance still outweigh these costs. I'm happy to again discuss these benefits at great length, but having had the same conversation in person with Raymond twice and having repeated most of the points (but by no means all) in this thread on the mailing list it really feels like going round in circles. As the maintainer of unittest I'd like to say that in the absence of clear consensus that the merger should happen, or a fiat from the BDFL, the merger won't happen. I believe that this is standard Python development process. All the best, Michael -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS")
that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From fuzzyman at voidspace.org.uk Wed Nov 3 00:39:13 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 02 Nov 2010 23:39:13 +0000 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> Message-ID: <4CD0A121.1060503@voidspace.org.uk> On 02/11/2010 22:58, Guido van Rossum wrote: > On Tue, Nov 2, 2010 at 3:47 PM, Raymond Hettinger > wrote: >> I'm not sure I follow where we're stuck with the current package. >> AFAICT, the module is still used with "import unittest". >> The file splitting was done badly, so I don't think there any of the >> components are usable directly, i.e. "from unitest.case import SkipTest". >> Also, I don't think the package structure was documented or announced. >> >> This is in contrast to the logging module which does have a >> clean separation of components and where it isn't unusual >> to import just part of the package. >> >> What is it you're seeing as a risk that I'm not seeing? >> Are we permanently locked into the exact ten filenames >> that are currently used: utils, suite, loader, case, result, main, signals, >> etc? >> Is the file structure now frozen? > To spout a somewhat contrarian opinion, I just browsed the new > unittest package, and the structure seems reasonable to me, even if > its submodules are not particularly reusable. I've used this kind of > style for development myself. 
What is so offensive about it? > Amen. Although not that contrarian, others have spoken up in favour. The split is pretty obvious (in general - I'm sure its not perfect) and divided along major functionality. TestCase - case.py TestResult - result.py TestSuite - suite.py TextTestRunner - runner.py TestLoader - loader.py main function - main.py signal handling - signals.py The utils module is somewhat an odd one out as it is only used by case.py, but this is hardly the most egregious error in the world. If you can't guess where a class lives, __init__.py shows you explicitly (a clear advantage of exporting the public API at the top level ;-) Due to the original design of unittest (and I have many thoughts on that) the modules aren't strictly "reusable" (i.e. isolated from each other) - but many test frameworks (for example) just use the TestCase without using other components. I find it difficult to believe that this package structure is only acceptable if we make people import the TestCase from unittest.case and not expose it at the top level. As mentioned in another email, but this thread has many long and tedious emails, there is no clear consensus that there should be a remerger and I am aware that there are already some consumers of the new package structure. As the maintainer of unittest I'd like to say that in the absence of clear consensus that the merger should happen, or a fiat from the BDFL, the merger won't happen. I believe that this is standard Python development process. All the best, Michael -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) 
that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From guido at python.org Wed Nov 3 00:43:29 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 2 Nov 2010 16:43:29 -0700 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <4CD0A121.1060503@voidspace.org.uk> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD0A121.1060503@voidspace.org.uk> Message-ID: On Tue, Nov 2, 2010 at 4:39 PM, Michael Foord wrote: > As the maintainer of unittest I'd like to say that in the absence of clear > consensus that the merger should happen, or a fiat from the BDFL, the merger > won't happen. I believe that this is standard Python development process. I don't think that anybody seriously expected the unittest package would be restructured again. The remaining thrust of the thread seems to be whether PEP 8 should advise against breaking code up into many little modules. Personally I don't think it should -- it should by now be clear that this is not an area where one style will fit all. I also think that the convenience of one style over another might have something to do with the tools being used. 
--
--Guido van Rossum (python.org/~guido)

From raymond.hettinger at gmail.com Wed Nov 3 00:44:12 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Tue, 2 Nov 2010 16:44:12 -0700
Subject: [Python-Dev] Question on imports and packages
Message-ID: <275F8441-269A-495A-83E5-4F5E617B43F0@gmail.com>

Brett, Does the import mechanism for importing packages preserve enough information to be able to figure-out where all the components are defined? I'm wondering if it is possible for the class browser to be built-out to scan/navigate class structure across a module that has been split into a package.

Raymond

From raymond.hettinger at gmail.com Wed Nov 3 01:03:00 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Tue, 2 Nov 2010 17:03:00 -0700
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
In-Reply-To:
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD0A121.1060503@voidspace.org.uk>
Message-ID: <004A5B22-5936-44C9-A51A-CCA2811CD76D@gmail.com>

On Nov 2, 2010, at 4:43 PM, Guido van Rossum wrote:
> The remaining thrust of the thread seems
> to be whether PEP 8 should advise against breaking code up into many
> little modules.

I was thinking of PEP 8 wording that listed the forces for and against.

For example, splitting ply.yacc from ply.lex was very useful (separately testable, natural division of concerns, no nested or cross-imports). The split into xml.sax, xml.dom, and xml.dom.minidom was a nice one because it separated distinct tools. The xml packaging also worked well because it is easy to substitute in alternate parsers implementing the same API. I think we also want to recommend against putting much if any code in __init__.py.
Some forces against packaging are that it breaks the class browser. As you say, different users of different toolsets are affected differently. For me, the unittest split broke my usual ways of finding out how the new methods were implemented. Another force against is what Brett pointed-out, that the package file structure becomes a permanent and unchangeable part of the API. It's a one-way street. In general, I think the advice should be that packaging should be done when there is some clear benefit beyond "turning one big file into lots of smaller files". Raymond From fuzzyman at voidspace.org.uk Wed Nov 3 01:02:37 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 03 Nov 2010 00:02:37 +0000 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <4CD0A00E.2010709@voidspace.org.uk> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD0A00E.2010709@voidspace.org.uk> Message-ID: <4CD0A69D.8040702@voidspace.org.uk> On 02/11/2010 23:34, Michael Foord wrote: > On 02/11/2010 23:00, Brett Cannon wrote: >> On Tue, Nov 2, 2010 at 15:47, Raymond Hettinger >> wrote: >>> On Nov 1, 2010, at 7:35 PM, Brett Cannon wrote: >>> >>> I think the issue here is that the file structure of the code no >>> longer matches the public API documented by unittest. Personally I, >>> like most people it seems, prefer source files to be structured in a >>> way to match the public API. In the case of unittest Michael didn't. >>> He did ask python-dev if it was okay to do what he did, we all kept >>> quiet, and now we have realized that most of us prefer to have files >>> that mirror the API; lesson learned. 
But Python 2.7 shipped with this >>> file layout so we have to stick with it lest we break any imports out >>> there that use the package-like file structure Michael went with >>> (which we could actually document and use if we wanted now that >>> Michael has already broken things up). Reversing the trend by sticking >>> all the code into unittest/__init__.py and then sticking import shims >>> into the existing modules would be a stupid waste of time, especially >>> considering the head maintainer of the package likes it the way it is. >>> >>> I'm not sure I follow where we're stuck with the current package. >>> AFAICT, the module is still used with "import unittest". >> Yes, as far as you can tell, but who the hell knows what someone is >> doing with code you are *not* aware of. As I said, Python 2.7 shipped >> with the code structured like this, so it's possible someone is >> importing unittest.case.TestCase instead of unittest.TestCase. >> > > It is also shipped in unittest (and unittest2py3k I might add) so that > users of earlier versions of Python can use the new features > seamlessly. (unittest2 will be in Django 1.3.) unittest2 dammit. Michael -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. 
From ben+python at benfinney.id.au Wed Nov 3 01:06:56 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Wed, 03 Nov 2010 11:06:56 +1100
Subject: [Python-Dev] On breaking modules into packages
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com>
Message-ID: <87eib3jmn3.fsf_-_@benfinney.id.au>

Raymond Hettinger writes:

> >> Are we permanently locked into the exact ten filenames that are
> >> currently used: utils, suite, loader, case, result, main, signals,
> >> etc?
[…]
> Sounds like a decision to split a module into a package is a big
> commitment. Each of the individual file names becomes a permanent part
> of the API. Even future additional splits are precluded because it
> might break someone's dotted import (i.e. not a single function can be
> moved between those files -- once in unittest.utils, always in
> unittest.utils).

Is this a case where it would be better if the package names had the leading underscore: “_utils”, “_suite”, etc.?

Does the convention on single-leading-underscore identifiers as “don't rely on this name staying the same in future versions” hold for package names?

--
 \     “Alternative explanations are always welcome in science, if |
  `\   they are better and explain more. Alternative explanations that |
_o__)  explain nothing are not welcome.” --Victor J.
Stenger, 2001-11-05 |
Ben Finney

From guido at python.org Wed Nov 3 01:28:48 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 2 Nov 2010 17:28:48 -0700
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <004A5B22-5936-44C9-A51A-CCA2811CD76D@gmail.com>
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD0A121.1060503@voidspace.org.uk> <004A5B22-5936-44C9-A51A-CCA2811CD76D@gmail.com>
Message-ID:

On Tue, Nov 2, 2010 at 5:03 PM, Raymond Hettinger wrote:
> Some forces against packaging are that it breaks the class browser. As you say, different users of different toolsets are affected differently. For me, the unittest split broke my usual ways of finding out how the new methods were implemented.

Maybe the IDLE class browser can be fixed; there is plenty of code with this structure that can't or won't be restructured, no matter how strongly PEP 8 is worded.

FWIW, personally I don't use the IDLE class browser -- I use Emacs, grep, and find. :-)

--
--Guido van Rossum (python.org/~guido)

From guido at python.org Wed Nov 3 01:35:55 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 2 Nov 2010 17:35:55 -0700
Subject: [Python-Dev] Question on imports and packages
In-Reply-To: <275F8441-269A-495A-83E5-4F5E617B43F0@gmail.com>
References: <275F8441-269A-495A-83E5-4F5E617B43F0@gmail.com>
Message-ID:

If you are importing the code, the __module__ attribute on each class should tell you where it is actually defined (as opposed to where you imported it from). Then sys.modules gives you the module object which has a __file__ attribute, etc.
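
A minimal sketch of the lookup described above, assuming the code has already been imported. The helper name defining_file is made up for this illustration and unittest.TestCase is used only as a convenient example; built-in modules have no __file__, so the sketch falls back to None for them.

```python
import sys
import unittest


def defining_file(cls):
    """Return (module name, source path) for the module that defines cls."""
    module_name = cls.__module__        # where the class was actually defined
    module = sys.modules[module_name]   # the already-imported module object
    # Built-in modules carry no __file__ attribute, so fall back to None.
    return module_name, getattr(module, "__file__", None)


name, path = defining_file(unittest.TestCase)
print(name)   # unittest.case
print(path)   # location of case.py on this installation
```

The same walk should work for functions, since they carry a __module__ attribute as well; a class browser would presumably repeat it for every public name found in the package namespace.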
On Tue, Nov 2, 2010 at 4:44 PM, Raymond Hettinger wrote:
> Brett, Does the import mechanism for importing packages preserve enough information to be able to figure-out where all the components are defined? I'm wondering if it is possible for the class browser to be built-out to scan/navigate class structure across a module that has been split into a package.

--
--Guido van Rossum (python.org/~guido)

From ben+python at benfinney.id.au Wed Nov 3 01:47:55 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Wed, 03 Nov 2010 11:47:55 +1100
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <1288740855.3541.24.camel@localhost.localdomain>
Message-ID: <8762wfjkqs.fsf@benfinney.id.au>

Antoine Pitrou writes:

> I don't agree with this. Until it's documented, it's an implementation
> detail and should be able to change without notice.

If it's an implementation detail, shouldn't it be named as one (i.e. with a leading underscore)?

> If someone wants to depend on some undocumented detail of the
> directory layout it's their problem (like people depending on bytecode
> and other stuff).

I would say that names without a single leading underscore are part of the public API, whether documented or not.

--
 \     “Your [government] representative owes you, not his industry |
  `\   only, but his judgment; and he betrays, instead of serving you, |
_o__)  if he sacrifices it to your opinion.”
--Edmund Burke, 1774 |
Ben Finney

From fuzzyman at voidspace.org.uk Wed Nov 3 02:08:21 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Wed, 03 Nov 2010 01:08:21 +0000
Subject: [Python-Dev] Question on imports and packages
In-Reply-To: <275F8441-269A-495A-83E5-4F5E617B43F0@gmail.com>
References: <275F8441-269A-495A-83E5-4F5E617B43F0@gmail.com>
Message-ID: <4CD0B605.4050704@voidspace.org.uk>

On 02/11/2010 23:44, Raymond Hettinger wrote:
> Brett, Does the import mechanism for importing packages preserve enough information to be able to figure-out where all the components are defined? I'm wondering if it is possible for the class browser to be built-out to scan/navigate class structure across a module that has been split into a package.

Can it not do that through static analysis, just looking at the classes / functions defined in the sub-modules? I mean, you could do it from the ast, right? Relying on importing code to analyse it is unpleasant if the code has top level side-effects (which no good code does of course).

There may be *some* cases where magic makes things weird (__package__), but how common are those in practice?

If you build up a data-structure representing definitions in a package, working out where any individual class / function used in a module is defined is a matter of looking at where it is imported (assuming it hasn't been aliased or fetched dynamically) and matching the import to a package you have analysed (or analyse on the fly).

A project that attempts to do something like this is pysmell:

http://github.com/orestis/pysmell/

All the best,

Michael

> Raymond

--
http://www.voidspace.org.uk/

READ CAREFULLY.
By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From exarkun at twistedmatrix.com Wed Nov 3 03:23:47 2010 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Wed, 03 Nov 2010 02:23:47 -0000 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <8762wfjkqs.fsf@benfinney.id.au> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <1288740855.3541.24.camel@localhost.localdomain> <8762wfjkqs.fsf@benfinney.id.au> Message-ID: <20101103022347.2040.1348266477.divmod.xquotient.512@localhost.localdomain> On 12:47 am, ben+python at benfinney.id.au wrote: >Antoine Pitrou writes: >>I don't agree with this. Until it's documented, it's an implementation >>detail and should be able to change without notice. > >If it's an implementation detail, shouldn't it be named as one (i.e. >with a leading underscore)? >>If someone wants to depend on some undocumented detail of the >>directory layout it's their problem (like people depending on bytecode >>and other stuff). > >I would say that names without a single leading underscore are part of >the public API, whether documented or not. 
And if that isn't the rule, then what the heck is?

Jean-Paul

From fdrake at acm.org Wed Nov 3 03:49:29 2010
From: fdrake at acm.org (Fred Drake)
Date: Tue, 2 Nov 2010 22:49:29 -0400
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <8762wfjkqs.fsf@benfinney.id.au>
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <1288740855.3541.24.camel@localhost.localdomain> <8762wfjkqs.fsf@benfinney.id.au>
Message-ID:

On Tue, Nov 2, 2010 at 8:47 PM, Ben Finney wrote:
> I would say that names without a single leading underscore are part of
> the public API, whether documented or not.

I don't recall this ever being the standard library's policy.

There are many modules using leading underscores as hints, and many others which don't. Other people consider the presence of a docstring on a non-underscored name significant, while still others refer to the out-of-line documentation as definitive. For modules, an __all__ attribute is generally agreed on as definitive, if present, but that's a fairly limited case.

At this point, there isn't a single clear way to determine if something is public API. I doubt we are likely to agree on a single definition now without engendering a lengthy discussion on whether names can be changed to reflect such a policy (where backward compatibility is sure to be lost).

I'm sticking to the out-of-line documentation to determine what's public.

-Fred

--
Fred L. Drake, Jr.
"A storm broke loose in my mind."
--Albert Einstein From brett at python.org Wed Nov 3 03:50:11 2010 From: brett at python.org (Brett Cannon) Date: Tue, 2 Nov 2010 19:50:11 -0700 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD0A121.1060503@voidspace.org.uk> Message-ID: On Tue, Nov 2, 2010 at 16:43, Guido van Rossum wrote: > On Tue, Nov 2, 2010 at 4:39 PM, Michael Foord wrote: >> As the maintainer of unittest I'd like to say that in the absence of clear >> consensus that the merger should happen, or a fiat from the BDFL, the merger >> won't happen. I believe that this is standard Python development process. > > I don't think that anybody seriously expected the unittest package > would be restructured again. The remaining thrust of the thread seems > to be whether PEP 8 should advise against breaking code up into many > little modules. Personally I don't think it should -- it should by now > be clear that this is not an area where one style will fit all. I also > think that the convenience of one style over another might have > something to do with the tools being used. This is not what I am suggesting for PEP 8. I want to say that a package's file structure should reflect the public API. I personally have no trouble with modules in packages that do not have a ton of objects in them. I just think if you have pkg/mod.py, pkg.mod should be exposed in the API, else name the file _mod.py. In the case of unittest that would just mean documenting that it's unittest.case.TestCase and that unittest.TestCase is for legacy reasons, much like os.path is just blindly added on to os even though it is a separate module(s). 
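
A short sketch of the two spellings being compared here, plus the os.path precedent. It only inspects what the standard library already exposes; nothing in it is a proposed API.

```python
import os
import os.path
import unittest
import unittest.case

# The top-level name and the submodule name are the same class object;
# unittest/__init__.py simply re-exports what unittest/case.py defines.
print(unittest.TestCase is unittest.case.TestCase)   # True

# os.path is a separate module that os attaches to itself as an attribute.
# Its real name depends on the platform (posixpath or ntpath).
print(os.path.__name__)
```

Because both names are bound to one object, documenting unittest.case.TestCase alongside unittest.TestCase would not change behaviour for existing code, only which spelling counts as supported.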
From fuzzyman at voidspace.org.uk Wed Nov 3 03:50:51 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 03 Nov 2010 02:50:51 +0000 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> Message-ID: <4CD0CE0B.3090201@voidspace.org.uk> On 02/11/2010 02:35, Brett Cannon wrote: > On Wed, Oct 27, 2010 at 03:42, Antoine Pitrou wrote: >> On Tue, 26 Oct 2010 22:06:37 -0400 >> Alexander Belopolsky wrote: >>> While I appreciate your and Michael's eloquence, I don't really see >>> why five 400-line modules are necessarily easier to maintain than one >>> 2000-line module. Splitting code into modules is certainly a good >>> thing when the resulting modules can be used independently. This >>> helps users write leaner programs, reduces mental footprint of >>> individual modules, etc, etc. The split unittest module does not >>> bring any such benefits. It still presents a single "big-ball-of-mud" >>> namespace, only rather than implemented in a single file, it is now >>> swept in from eight different files. >> Are you saying that it has become a pile of medium-sized balls of mud? >> I would like to say thanks for the mud, Michael! It's high quality mud >> for sure. > I realize I am a little late in this reply but issue 10273 linked to > this and so now I am actually bothering to read this thread since it > felt like bikeshedding when the thread began. > > I think the issue here is that the file structure of the code no > longer matches the public API documented by unittest. Personally I, > like most people it seems, prefer source files to be structured in a > way to match the public API. In the case of unittest Michael didn't. 
Well the structure *does* match the API (which is primarily why I disagree with Raymond that this is a 'bad split'). How could we have split the module into a package in a way that matched the API, whilst still retaining backwards compatibility with the old API? We had no choice but to export the public names at the top level. > He did ask python-dev if it was okay to do what he did, we all kept > quiet, and now we have realized that most of us prefer to have files Most of us? Raymond, Alexander and Martin are the only ones I *recall* complaining about the split specifically in this thread and not all of those were on the grounds you mention. Several people supported the split. Guido didn't object to it on these grounds and Antoine noted that finding core classes was generally straightforward. > [snip...] > So basically it seems like we have learned a lesson: we prefer to have > our code structured in files that match the public API. When designing packages from the ground up that is a sensible rule of thumb to follow, but usually follows naturally from good design. This doesn't necessarily mean that all the sub-modules will export public APIs for consumers, so this is where I disagree with this. Python's package system is a very useful way of providing internal structure for projects. That doesn't mean that this structure must always be exposed publicly. It should be as easy to navigate as possible (and there is plenty about the old unittest.py module that wasn't easy to navigate I can assure you), but I *don't* think that the new package fails in a substantially greater way on that score. As Guido points out, this may depend a lot on which tools you use. I could write more about the role and value of packages, I guess I'll save it for a blog post. 
All the best, Michael Foord > I think that > is a legitimate design rule for the stdlib to follow from now on, but > in the case of unittest it's too late to change it back (and it's a > minor price to pay to learn this lesson and to have Michael > maintaining unittest like he has been, plus we could consider using > the new structure so that the public API matches the file structure > when the need arises). > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From brett at python.org Wed Nov 3 03:54:05 2010 From: brett at python.org (Brett Cannon) Date: Tue, 2 Nov 2010 19:54:05 -0700 Subject: [Python-Dev] Question on imports and packages In-Reply-To: References: <275F8441-269A-495A-83E5-4F5E617B43F0@gmail.com> Message-ID: On Tue, Nov 2, 2010 at 17:35, Guido van Rossum wrote: > If you are importing the code, the __module__ attribute on each class > should tell you where it is actually defined (as opposed to where you > imported it from). Then sys.modules gives you the module object which > has a __file__ attribute, etc. What Guido said. It's the equivalent of browsing an object that a function returned to you. 
Working backwards to where something is defined has nothing to do with imports and more to do with __module__, __class__, etc. Import has nothing to do with introspection for things that you access off of a module that happened to have imported the object. > > On Tue, Nov 2, 2010 at 4:44 PM, Raymond Hettinger > wrote: >> Brett, ?Does the import mechanism for importing packages preserve enough information to be able to figure-out where all the components are defined? ?I'm wondering if it is possible for the class browser to be built-out to scan/navigate class structure across a module that has been split into a package. > > -- > --Guido van Rossum (python.org/~guido) > From brett at python.org Wed Nov 3 03:57:48 2010 From: brett at python.org (Brett Cannon) Date: Tue, 2 Nov 2010 19:57:48 -0700 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <4CD0CE0B.3090201@voidspace.org.uk> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <4CD0CE0B.3090201@voidspace.org.uk> Message-ID: On Tue, Nov 2, 2010 at 19:50, Michael Foord wrote: > On 02/11/2010 02:35, Brett Cannon wrote: >> >> On Wed, Oct 27, 2010 at 03:42, Antoine Pitrou ?wrote: >>> >>> On Tue, 26 Oct 2010 22:06:37 -0400 >>> Alexander Belopolsky ?wrote: >>>> >>>> While I appreciate your and Michael's eloquence, I don't really see >>>> why five 400-line modules are necessarily easier to maintain than one >>>> 2000-line module. ?Splitting code into modules is certainly a good >>>> thing when the resulting modules can be used independently. ?This >>>> helps users write leaner programs, reduces mental footprint of >>>> individual modules, etc, etc. ? The split unittest module does not >>>> bring any such benefits. 
It still presents a single "big-ball-of-mud"
>>>> namespace, only rather than implemented in a single file, it is now
>>>> swept in from eight different files.
>>>
>>> Are you saying that it has become a pile of medium-sized balls of mud?
>>> I would like to say thanks for the mud, Michael! It's high quality mud
>>> for sure.
>>
>> I realize I am a little late in this reply but issue 10273 linked to
>> this and so now I am actually bothering to read this thread since it
>> felt like bikeshedding when the thread began.
>>
>> I think the issue here is that the file structure of the code no
>> longer matches the public API documented by unittest. Personally I,
>> like most people it seems, prefer source files to be structured in a
>> way to match the public API. In the case of unittest Michael didn't.
>
> Well the structure *does* match the API (which is primarily why I disagree
> with Raymond that this is a 'bad split').

But not the public API as documented, e.g., it's documented as unittest.TestCase, not unittest.case.TestCase as the file structure suggests.

>
> How could we have split the module into a package in a way that matched the
> API, whilst still retaining backwards compatibility with the old API? We had
> no choice but to export the public names at the top level.

I'm not disagreeing with that. What I am saying is that we can now document that it's unittest.case.TestCase instead of having that just be an implementation detail.

-Brett

>
>> He did ask python-dev if it was okay to do what he did, we all kept
>> quiet, and now we have realized that most of us prefer to have files
>
> Most of us? Raymond, Alexander and Martin are the only ones I *recall*
> complaining about the split specifically in this thread and not all of those
> were on the grounds you mention. Several people supported the split. Guido
> didn't object to it on these grounds and Antoine noted that finding core
> classes was generally straightforward.
>
>> [snip...]
>> So basically it seems like we have learned a lesson: we prefer to have >> our code structured in files that match the public API. > > When designing packages from the ground up that is a sensible rule of thumb > to follow, but usually follows naturally from good design. This doesn't > necessarily mean that all the sub-modules will export public APIs for > consumers, so this is where I disagree with this. Python's package system is > a very useful way of providing internal structure for projects. That doesn't > mean that this structure must always be exposed publicly. > > It should be as easy to navigate as possible (and there is plenty about the > old unittest.py module that wasn't easy to navigate I can assure you), but I > *don't* think that the new package fails in a substantially greater way on > that score. > > As Guido points out, this may depend a lot on which tools you use. I could > write more about the role and value of packages, I guess I'll save it for a > blog post. > > All the best, > > Michael Foord > >> I think that >> is a legitimate design rule for the stdlib to follow from now on, but >> in the case of unittest it's too late to change it back (and it's a >> minor price to pay to learn this lesson and to have Michael >> maintaining unittest like he has been, plus we could consider using >> the new structure so that the public API matches the file structure >> when the need arises). >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > > -- > > http://www.voidspace.org.uk/ > > READ CAREFULLY. 
By accepting and reading this email you agree, > on behalf of your employer, to release me from all obligations > and waivers arising from any and all NON-NEGOTIATED agreements, > licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, > confidentiality, non-disclosure, non-compete and acceptable use > policies ("BOGUS AGREEMENTS") that I have entered into with your > employer, its partners, licensors, agents and assigns, in > perpetuity, without prejudice to my ongoing rights and privileges. > You further represent that you have the authority to release me > from any BOGUS AGREEMENTS on behalf of your employer. > > From guido at python.org Wed Nov 3 04:02:42 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 2 Nov 2010 20:02:42 -0700 Subject: [Python-Dev] Question on imports and packages In-Reply-To: References: <275F8441-269A-495A-83E5-4F5E617B43F0@gmail.com> Message-ID: FWIW, I also agree with Michael that static analysis would be much preferred. You never know what side effects importing a module has. (This could even be construed as an attack vector.) --Guido On Tue, Nov 2, 2010 at 7:54 PM, Brett Cannon wrote: > On Tue, Nov 2, 2010 at 17:35, Guido van Rossum wrote: >> If you are importing the code, the __module__ attribute on each class >> should tell you where it is actually defined (as opposed to where you >> imported it from). Then sys.modules gives you the module object which >> has a __file__ attribute, etc. > > What Guido said. It's the equivalent of browsing an object that a > function returned to you. Working backwards to where something is > defined has nothing to do with imports and more to do with __module__, > __class__, etc. Import has nothing to do with introspection for things > that you access off of a module that happened to have imported the > object.
> >> >> On Tue, Nov 2, 2010 at 4:44 PM, Raymond Hettinger >> wrote: >>> Brett, Does the import mechanism for importing packages preserve enough information to be able to figure out where all the components are defined? I'm wondering if it is possible for the class browser to be built-out to scan/navigate class structure across a module that has been split into a package. >> >> -- >> --Guido van Rossum (python.org/~guido) >> > -- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Wed Nov 3 04:33:48 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 3 Nov 2010 04:33:48 +0100 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <1288740855.3541.24.camel@localhost.localdomain> <8762wfjkqs.fsf@benfinney.id.au> Message-ID: <20101103043348.01a3e8f3@pitrou.net> On Wed, 03 Nov 2010 11:47:55 +1100 Ben Finney wrote: > > > If someone wants to depend on some undocumented detail of the > > directory layout it's their problem (like people depending on bytecode > > and other stuff). > > I would say that names without a single leading underscore are part of > the public API, whether documented or not. That's not what we are talking about; we are talking about their locations. If the official location is the unittest package, then I don't see why we should also support undocumented locations just because they happen to work. Otherwise we should also support e.g. "unittest.unlink" if the unittest package happens to have "from os import unlink" at its top. I don't think it's reasonable. Antoine.
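Guido's suggestion quoted earlier in this digest (follow a class's __module__ to sys.modules and from there to __file__) can be sketched in a few lines. This is an illustration only and does not come from any message in the thread; the helper name where_defined is hypothetical, and, as Guido himself cautions, it only works once the module has been imported, with whatever side effects that import has.

```python
import sys
import unittest


def where_defined(obj):
    """Return (module name, source file) for the module that defines obj.

    Hypothetical helper for illustration only. It relies on obj having a
    __module__ attribute and on that module already being imported.
    """
    module_name = obj.__module__          # where the object was defined
    module = sys.modules[module_name]     # the module object itself
    return module_name, getattr(module, "__file__", None)


# TestCase is reached through the top-level package, but __module__ reports
# the submodule that actually defines it (Python 2.7 and later).
name, path = where_defined(unittest.TestCase)
print(name)
print(path)
```

Note that this reports where a class is defined, not which of the importable locations is the documented one, which is the distinction Antoine draws above.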
From solipsis at pitrou.net Wed Nov 3 04:35:38 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 3 Nov 2010 04:35:38 +0100 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <4CD0CE0B.3090201@voidspace.org.uk> Message-ID: <20101103043538.38154211@pitrou.net> On Tue, 2 Nov 2010 19:57:48 -0700 Brett Cannon wrote: > > > > How could we have split the module into a package in a way that matched the > > API, whilst still retaining backwards compatibility with the old API? We had > > no choice but to export the public names at the top level. > > I'm not disagreeing with that. What I am saying is that we can now document > that it's unittest.case.TestCase instead of having that just be an > implementation detail. -1. unittest.TestCase is far simpler and more obvious than any javaesque qualified name. urllib.request and friends are already annoying enough. Regards Antoine. From guido at python.org Wed Nov 3 04:01:18 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 2 Nov 2010 20:01:18 -0700 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD0A121.1060503@voidspace.org.uk> Message-ID: On Tue, Nov 2, 2010 at 7:50 PM, Brett Cannon wrote: > This is not what I am suggesting for PEP 8. I want to say that a > package's file structure should reflect the public API. But what does that mean?
I could argue that unittest's structure (TestCase in case.py, etc.) reflects its public API just fine. > I personally > have no trouble with modules in packages that do not have a ton of > objects in them. I just think if you have pkg/mod.py, pkg.mod should > be exposed in the API, else name the file _mod.py. In the case of > unittest that would just mean documenting that it's > unittest.case.TestCase and that unittest.TestCase is for legacy > reasons, much like os.path is just blindly added on to os even though > it is a separate module(s). I really don't think we should encourage the use as unittest.case.TestCase -- it's unnecessarily introducing structure. I think it's fine now that the cat is out of the bag to document unittest.case.TestCase as an alternative spelling, but I don't think it should be the preferred one. os.path is so old that should not be taken as an example for anything. (It predates packages!) But it should not be changed either, there'd be too much churn. -- --Guido van Rossum (python.org/~guido) From ben+python at benfinney.id.au Wed Nov 3 05:29:18 2010 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 03 Nov 2010 15:29:18 +1100 Subject: [Python-Dev] On breaking modules into packages References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <1288740855.3541.24.camel@localhost.localdomain> <8762wfjkqs.fsf@benfinney.id.au> <20101103043348.01a3e8f3@pitrou.net> Message-ID: <87sjzjhvxd.fsf_-_@benfinney.id.au> Antoine Pitrou writes: > On Wed, 03 Nov 2010 11:47:55 +1100 > Ben Finney wrote: > > > > > If someone wants to depend on some undocumented detail of the > > > directory layout it's their problem (like people depending on > > > bytecode and other stuff). 
> > > > I would say that names without a single leading underscore are part > > of the public API, whether documented or not. > > That's not what we are talking about; we are talking about their > locations. If the official location is the unittest package, then I > don't see why we should also support undocumented locations just > because they happen to work. So long as the names available for import are such that they indicate whether they're public or implementation-detail (i.e. without a leading single underscore or with one), I agree that this is distinct from the issue of locations on the filesystem. > Otherwise we should also support e.g. "unittest.unlink" if the > unittest package happens to have "from os import unlink" at its top. I > don't think it's reasonable. Hmm. That example does give me pause. I'm trying to think of a simple way that such imports are excluded from being "public interface", but can't immediately think of one. The distinction is clear in my head, though, for what it's worth :-) -- \ "I don't accept the currently fashionable assertion that any | `\ view is automatically as worthy of respect as any equal and | _o__) opposite view."
- Douglas Adams | Ben Finney From kristjan at ccpgames.com Wed Nov 3 06:08:02 2010 From: kristjan at ccpgames.com (Kristján Valur Jónsson) Date: Wed, 3 Nov 2010 13:08:02 +0800 Subject: [Python-Dev] On breaking modules into packages In-Reply-To: <87sjzjhvxd.fsf_-_@benfinney.id.au> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <1288740855.3541.24.camel@localhost.localdomain> <8762wfjkqs.fsf@benfinney.id.au> <20101103043348.01a3e8f3@pitrou.net> <87sjzjhvxd.fsf_-_@benfinney.id.au> Message-ID: <2E034B571A5CE44E949B9FCC3B6D24EE576200F8@exchcn.ccp.ad.local> Just a small input into this discussion: In EVE, for historical reasons, we implemented our own importing mechanism. I think it is because we started out with an ancient Python version that didn't support packages. Regardless, we still have a system where a hierarchy of files is scanned, and then code in each .py file determines where in the "namespace" it lands. This can be done declaratively (by using a __guid__ attribute on a class, for instance) or by defining a special __exports__ dict at the module level. The good thing about this system is that it allows us to separate code in a manner independent of the API. We can choose for example to group all network code in a folder. Or have each class in the "game.entity" namespace be defined in its own file. It unhooks file structure from name structure. Now, this has its own problems of course, the biggest being that it is non-standard. Off the shelf IDEs have problems with it. And we have to implement dynamic reloading on our own. The list goes on, and for that reason, we are moving away from it in favor of standard python import.
However, I am personally not super happy about how this will force one to think in "api" terms when creating source files. As has been mentioned, files cannot be moved and restructured once in general use, and when writing new code, one has to think long and hard about "where" to put the source, not "what" to put in it. What is more, a hierarchy, while a convenient system for storing files, does not, IMHO, always map to the problem domain. But we're having a go at it. Time will tell if "forcing us to think inside the hierarchy" will be beneficial in the long run. Cheers, K From g.brandl at gmx.net Wed Nov 3 08:06:49 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 03 Nov 2010 07:06:49 +0000 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <20101103043538.38154211@pitrou.net> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <4CD0CE0B.3090201@voidspace.org.uk> <20101103043538.38154211@pitrou.net> Message-ID: On 03.11.2010 03:35, Antoine Pitrou wrote: > On Tue, 2 Nov 2010 19:57:48 -0700 > Brett Cannon wrote: >> > >> > How could we have split the module into a package in a way that matched the >> > API, whilst still retaining backwards compatibility with the old API? We had >> > no choice but to export the public names at the top level. >> >> I'm not disagreeing with that. What I am saying is that we can now document >> that it's unittest.case.TestCase instead of having that just be an >> implementation detail. > > -1. unittest.TestCase is far simpler and more obvious than any > javaesque qualified name. urllib.request and friends are already > annoying enough. Agreed. There are about 30 names importable from unittest; that is quite manageable in a single namespace.
Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From victor.stinner at haypocalc.com Wed Nov 3 11:55:41 2010 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 3 Nov 2010 11:55:41 +0100 Subject: [Python-Dev] [Python-checkins] r85902 - in python/branches/py3k/Lib: os.py test/test_os.py In-Reply-To: References: <20101029003858.7D584EEA52@mail.python.org> <201011021355.40153.victor.stinner@haypocalc.com> Message-ID: <201011031155.41814.victor.stinner@haypocalc.com> On Tuesday 02 November 2010 23:38:12, you wrote: > On Tue, Nov 2, 2010 at 10:55 PM, Victor Stinner > > wrote: > > I don't know how to ignore the BytesWarning without importing warnings. > > How can I do that? > > I was suggesting trying to fix the bootstrap issue so you could use a > top-level import, instead of working around it with a function level > import (which we've learned from experience is a recipe for later > reports from users of programs deadlocking on the import lock - we've > made lots of improvement to avoid such deadlocks, but still prefer to > avoid function level imports anyway). I don't know if there is a bootstrap issue. I'm using a local import because os is always loaded at startup, and get_exec_path() is only used to run a subprocess: os.exec*() and subprocess.Popen() (only the POSIX implementation). I suppose that a top level "import warnings" would increase the memory footprint.
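The pattern Victor describes, a function-level import of warnings used only to silence BytesWarning around a lookup, might look roughly like the sketch below. It is an illustration under assumptions and not the code committed in r85902: the function name lookup_path and its exact behaviour are hypothetical.

```python
def lookup_path(env):
    """Return the PATH entry of env, whether it is keyed by str or bytes.

    Hypothetical sketch. Mixing str and bytes keys in one lookup can emit
    BytesWarning when Python runs with the -b option, so the lookups are
    wrapped in a temporary warnings filter.
    """
    # Function-level import: warnings is only loaded when this runs.
    import warnings

    with warnings.catch_warnings():
        warnings.simplefilter("ignore", BytesWarning)
        value = env.get("PATH")
        if value is None:
            value = env.get(b"PATH")
    return value


print(lookup_path({"PATH": "/usr/bin"}))
print(lookup_path({b"PATH": b"/usr/bin"}))
```

The objection in the quoted text is to the function-level import itself, because of past import-lock deadlock reports; moving the import to module level would avoid that at the cost Victor mentions.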
Victor From fuzzyman at voidspace.org.uk Wed Nov 3 12:25:50 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 03 Nov 2010 11:25:50 +0000 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <4CD0CE0B.3090201@voidspace.org.uk> Message-ID: <4CD146BE.6060209@voidspace.org.uk> On 03/11/2010 02:57, Brett Cannon wrote: > On Tue, Nov 2, 2010 at 19:50, Michael Foord wrote: >> On 02/11/2010 02:35, Brett Cannon wrote: >>> On Wed, Oct 27, 2010 at 03:42, Antoine Pitrou wrote: >>>> On Tue, 26 Oct 2010 22:06:37 -0400 >>>> Alexander Belopolsky wrote: >>>>> While I appreciate your and Michael's eloquence, I don't really see >>>>> why five 400-line modules are necessarily easier to maintain than one >>>>> 2000-line module. Splitting code into modules is certainly a good >>>>> thing when the resulting modules can be used independently. This >>>>> helps users write leaner programs, reduces mental footprint of >>>>> individual modules, etc, etc. The split unittest module does not >>>>> bring any such benefits. It still presents a single "big-ball-of-mud" >>>>> namespace, only rather than implemented in a single file, it is now >>>>> swept in from eight different files. >>>> Are you saying that it has become a pile of medium-sized balls of mud? >>>> I would like to say thanks for the mud, Michael! It's high quality mud >>>> for sure. >>> I realize I am a little late in this reply but issue 10273 linked to >>> this and so now I am actually bothering to read this thread since it >>> felt like bikeshedding when the thread began. >>> >>> I think the issue here is that the file structure of the code no >>> longer matches the public API documented by unittest. 
Personally I, >>> like most people it seems, prefer source files to be structured in a >>> way to match the public API. In the case of unittest Michael didn't. >> Well the structure *does* match the API (which is primarily why I disagree >> with Raymond that this is a 'bad split'). > But not the public API as documented, e.g., it's documented as > unittest.TestCase, not unittest.case.TestCase as the file structure > suggests. Right. I don't think that whether the unittest package structure is a good one is determined by where we make users import the names from. Like others I see little value in recommending people use the longer form for imports. All the best, Michael Foord >> How could we have split the module into a package in a way that matched the >> API, whilst still retaining backwards compatibility with the old API? We had >> no choice but to export the public names at the top level. > I'm not disagreeing with that. What I am saying is that we can now document > that it's unittest.case.TestCase instead of having that just be an > implementation detail. > > -Brett > >>> He did ask python-dev if it was okay to do what he did, we all kept >>> quiet, and now we have realized that most of us prefer to have files >> Most of us? Raymond, Alexander and Martin are the only ones I *recall* >> complaining about the split specifically in this thread and not all of those >> were on the grounds you mention. Several people supported the split. Guido >> didn't object to it on these grounds and Antoine noted that finding core >> classes was generally straightforward. >> >>> [snip...] >>> So basically it seems like we have learned a lesson: we prefer to have >>> our code structured in files that match the public API. >> When designing packages from the ground up that is a sensible rule of thumb >> to follow, but usually follows naturally from good design.
This doesn't >> necessarily mean that all the sub-modules will export public APIs for >> consumers, so this is where I disagree with this. Python's package system is >> a very useful way of providing internal structure for projects. That doesn't >> mean that this structure must always be exposed publicly. >> >> It should be as easy to navigate as possible (and there is plenty about the >> old unittest.py module that wasn't easy to navigate I can assure you), but I >> *don't* think that the new package fails in a substantially greater way on >> that score. >> >> As Guido points out, this may depend a lot on which tools you use. I could >> write more about the role and value of packages, I guess I'll save it for a >> blog post. >> >> All the best, >> >> Michael Foord >> >>> I think that >>> is a legitimate design rule for the stdlib to follow from now on, but >>> in the case of unittest it's too late to change it back (and it's a >>> minor price to pay to learn this lesson and to have Michael >>> maintaining unittest like he has been, plus we could consider using >>> the new structure so that the public API matches the file structure >>> when the need arises). >> >> -- >> >> http://www.voidspace.org.uk/ >> -- http://www.voidspace.org.uk/
From benjamin at python.org Wed Nov 3 13:19:44 2010 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 3 Nov 2010 07:19:44 -0500 Subject: [Python-Dev] [Python-checkins] r85902 - in python/branches/py3k/Lib: os.py test/test_os.py In-Reply-To: <201011031155.41814.victor.stinner@haypocalc.com> References: <20101029003858.7D584EEA52@mail.python.org> <201011021355.40153.victor.stinner@haypocalc.com> <201011031155.41814.victor.stinner@haypocalc.com> Message-ID: 2010/11/3 Victor Stinner : > On Tuesday 02 November 2010 23:38:12, you wrote: >> On Tue, Nov 2, 2010 at 10:55 PM, Victor Stinner >> >> wrote: >> > I don't know how to ignore the BytesWarning without importing warnings. >> > How can I do that?
>> >> I was suggesting trying to fix the bootstrap issue so you could use a >> top-level import, instead of working around it with a function level >> import (which we've learned from experience is a recipe for later >> reports from users of programs deadlocking on the import lock - we've >> made lots of improvement to avoid such deadlocks, but still prefer to >> avoid function level imports anyway). > > I don't know if there is a bootstrap issue. I'm using a local import because > os is always loaded at startup, and get_exec_path() is only used to run a > subprocess: os.exec*() and subprocess.Popen() (only the POSIX implementation). > I suppose that a top level "import warnings" would augment the memory > footprint. Warnings is loaded every time anyway. -- Regards, Benjamin From hrvoje.niksic at avl.com Wed Nov 3 13:38:58 2010 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Wed, 03 Nov 2010 13:38:58 +0100 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <8762wfjkqs.fsf@benfinney.id.au> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <1288740855.3541.24.camel@localhost.localdomain> <8762wfjkqs.fsf@benfinney.id.au> Message-ID: <4CD157E2.6000108@avl.com> On 11/03/2010 01:47 AM, Ben Finney wrote: >> If someone wants to depend on some undocumented detail of the >> directory layout it's their problem (like people depending on bytecode >> and other stuff). > > I would say that names without a single leading underscore are part of > the public API, whether documented or not. I understand this reasoning, but I'd like to offer counter-examples. For instance, would you say that glob.glob0 and glob.glob1 are public API? 
They're undocumented, they're not in __all__, but they don't have a leading underscore either, and source comments call them "helper functions." I'm sure there are a lot of other examples like that, both in the standard library and in python packages out there. Other than the existing practice, there is the matter of esthetics. Accepting underscore-less identifiers as automatically public leads to a proliferation of identifiers with leading underscores, which many people (myself included) plainly don't like. From ncoghlan at gmail.com Wed Nov 3 15:00:30 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 4 Nov 2010 00:00:30 +1000 Subject: [Python-Dev] [Python-checkins] r85902 - in python/branches/py3k/Lib: os.py test/test_os.py In-Reply-To: References: <20101029003858.7D584EEA52@mail.python.org> <201011021355.40153.victor.stinner@haypocalc.com> <201011031155.41814.victor.stinner@haypocalc.com> Message-ID: On Wed, Nov 3, 2010 at 10:19 PM, Benjamin Peterson wrote: > > Warnings is loaded every time anyway. I would have agreed with you, but the contents of sys.modules in a just-started interactive interpreter suggest that isn't true any more (which surprised me). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Nov 3 15:05:44 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 4 Nov 2010 00:05:44 +1000 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> Message-ID: On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger wrote: > Sounds like a decision to split a module into a package is a big commitment.
Each of the individual file names becomes a permanent part of the API. Even future additional splits are precluded because it might break someone's dotted import (i.e. not a single function can be moved between those files -- once in unittest.utils, always in unittest.utils). Can Python 2.7 pickles containing unittest classes be unpickled using 2.6 or earlier? Even if nobody uses the new names for imports, I believe they implicitly end up included in any pickles involving affected classes (I seem to recall we've been bitten by that before when moving things around). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From fuzzyman at voidspace.org.uk Wed Nov 3 15:16:18 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 03 Nov 2010 14:16:18 +0000 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> Message-ID: <4CD16EB2.9090806@voidspace.org.uk> On 03/11/2010 14:05, Nick Coghlan wrote: > On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger > wrote: >> Sounds like a decision to split a module into a package is a big commitment. Each of the individual file names becomes a permanent part of the API. Even future additional splits are precluded because it might break someone's dotted import (i.e. not a single function can be moved between those files -- once in unittest.utils, always in unittest.utils). > Can Python 2.7 pickles containing unittest classes be unpickled using > 2.6 or earlier?
Even if nobody uses the new names for imports, I > believe they implicitly end up included in any pickles involving > affected classes (I seem to recall we've been bitten by that before > when moving things around). Yes, since unittest.TestCase is still available (as are all the names). I believe so anyway... Michael > Cheers, > Nick. > -- http://www.voidspace.org.uk/ From solipsis at pitrou.net Wed Nov 3 15:17:49 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 03 Nov 2010 15:17:49 +0100 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <4CD16EB2.9090806@voidspace.org.uk> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD16EB2.9090806@voidspace.org.uk> Message-ID: <1288793869.5969.2.camel@localhost.localdomain> On Wednesday 03 November 2010 at 14:16 +0000, Michael Foord wrote: > On 03/11/2010 14:05, Nick Coghlan wrote: > > On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger > > wrote: > >> Sounds like a decision to split a module into a package is a big commitment.
Each of the individual file names becomes a permanent part of the API. Even future additional splits are precluded because it might break someone's dotted import (i.e. not a single function can be moved between those files -- once in unittest.utils, always in unittest.utils). > > Can Python 2.7 pickles containing unittest classes be unpickled using > > 2.6 or earlier? Even if nobody uses the new names for imports, I > > believe they implicitly end up included in any pickles involving > > affected classes (I seem to recall we've been bitten by that before > > when moving things around). > > Yes, since unittest.TestCase is still available (as are all the names). > I believe so anyway... unittest.TestCase is not really pickleable. There were test_multiprocessing issues because of that (see recent SVN checkins). Regards Antoine. From fuzzyman at voidspace.org.uk Wed Nov 3 15:26:23 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 03 Nov 2010 14:26:23 +0000 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <1288793869.5969.2.camel@localhost.localdomain> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD16EB2.9090806@voidspace.org.uk> <1288793869.5969.2.camel@localhost.localdomain> Message-ID: <4CD1710F.8090908@voidspace.org.uk> On 03/11/2010 14:17, Antoine Pitrou wrote: > On Wednesday 03 November 2010 at 14:16 +0000, Michael Foord wrote: >> On 03/11/2010 14:05, Nick Coghlan wrote: >>> On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger >>> wrote: >>>> Sounds like a decision to split a module into a package is a big commitment. Each of the individual file names becomes a permanent part of the API.
Even future additional splits are precluded because it might break someones dotted import (i.e. not a single function can be moved between those files -- once in unittest.utils, alway in unittest.utils). >>> Can Python 2.7 pickles containing unittest classes be unpickled using >>> 2.6 or earlier? Even if nobody uses the new names for imports, I >>> believe they implicitly end up included in any pickles involving >>> affected classes (I seem to recall we've been bitten by that before >>> when moving things around). >> Yes, since unittest.TestCase is still available (as are all the names). >> I believe so anyway... > unittest.TestCase is not really pickleable. There were > test_multiprocessing issues because of that (see recent SVN checkins). Interesting. We made some fixes before 2.7 to ensure they were copyable, but we fixed this in the copy module. TestCase instances now store some method objects in a dictionary which may make them unpickleable, so that could be a new problem. I'll test with 2.6 and 2.7 to see. An easy fix would be to store the method names rather than the method objects themself (if this is indeed the cause of the problem). This is what unittest2 does so that it works with earlier versions of Python that don't have the fix we put in copy. Michael > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) 
that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

From solipsis at pitrou.net Wed Nov 3 15:33:18 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 03 Nov 2010 15:33:18 +0100
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <4CD1710F.8090908@voidspace.org.uk>
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD16EB2.9090806@voidspace.org.uk> <1288793869.5969.2.camel@localhost.localdomain> <4CD1710F.8090908@voidspace.org.uk>
Message-ID: <1288794798.5969.6.camel@localhost.localdomain>

On Wednesday 3 November 2010 at 14:26 +0000, Michael Foord wrote:
>
> Interesting. We made some fixes before 2.7 to ensure they were copyable,
> but we fixed this in the copy module. TestCase instances now store some
> method objects in a dictionary which may make them unpickleable, so that
> could be a new problem. I'll test with 2.6 and 2.7 to see.

I don't think it is a problem in unittest, unless pickling TestCase
objects is really useful. I have fixed the problem in
test_multiprocessing instead.

Regards

Antoine.
From fuzzyman at voidspace.org.uk Wed Nov 3 15:38:56 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 03 Nov 2010 14:38:56 +0000 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <4CD1710F.8090908@voidspace.org.uk> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD16EB2.9090806@voidspace.org.uk> <1288793869.5969.2.camel@localhost.localdomain> <4CD1710F.8090908@voidspace.org.uk> Message-ID: <4CD17400.3010604@voidspace.org.uk> On 03/11/2010 14:26, Michael Foord wrote: > On 03/11/2010 14:17, Antoine Pitrou wrote: >> Le mercredi 03 novembre 2010 ? 14:16 +0000, Michael Foord a ?crit : >>> On 03/11/2010 14:05, Nick Coghlan wrote: >>>> On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger >>>> wrote: >>>>> Sounds like a decision to split a module into a package is a big >>>>> commitment. Each of the individual file names becomes a permanent >>>>> part of the API. Even future additional splits are precluded >>>>> because it might break someones dotted import (i.e. not a single >>>>> function can be moved between those files -- once in >>>>> unittest.utils, alway in unittest.utils). >>>> Can Python 2.7 pickles containing unittest classes be unpickled using >>>> 2.6 or earlier? Even if nobody uses the new names for imports, I >>>> believe they implicitly end up included in any pickles involving >>>> affected classes (I seem to recall we've been bitten by that before >>>> when moving things around). >>> Yes, since unittest.TestCase is still available (as are all the names). >>> I believe so anyway... >> unittest.TestCase is not really pickleable. There were >> test_multiprocessing issues because of that (see recent SVN checkins). 
> > Interesting. We made some fixes before 2.7 to ensure they were > copyable, but we fixed this in the copy module. TestCase instances now > store some method objects in a dictionary which may make them > unpickleable, so that could be a new problem. I'll test with 2.6 and > 2.7 to see. > > An easy fix would be to store the method names rather than the method > objects themself (if this is indeed the cause of the problem). This is > what unittest2 does so that it works with earlier versions of Python > that don't have the fix we put in copy. > Yep, looks like 2.7 introduced a bug making it impossible to pickle TestCase instances. I think it will be easy to fix, I'll create a specific issue: $python Python 2.6.5 (r265:79359, Mar 24 2010, 01:32:55) [GCC 4.0.1 (Apple Inc. build 5493)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> from unittest import TestCase >>> from pickle import dumps >>> t = TestCase('assert_') >>> dumps(t) "ccopy_reg\n_reconstructor\np0\n(cunittest\nTestCase\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'_testMethodDoc'\np6\nS'Fail the test unless the expression is true.'\np7\nsS'_testMethodName'\np8\nS'assert_'\np9\nsb." >>> bigmac:beta.python.org michael$ python2.7 ActivePython 2.7.0.1 (ActiveState Software Inc.) based on Python 2.7 (r27:82500, Jul 4 2010, 13:58:56) [GCC 4.2.1 (Apple Inc. build 5664)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> from unittest import TestCase >>> from pickle import dumps >>> t = TestCase('assert_') >>> dumps(t) Traceback (most recent call last): File "", line 1, in File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 1374, in dumps Pickler(file, protocol).dump(obj) ... 
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/pickle.py", line 306, in save rv = reduce(self.proto) File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy_reg.py", line 70, in _reduce_ex raise TypeError, "can't pickle %s objects" % base.__name__ TypeError: can't pickle instancemethod objects All the best, Michael > Michael > >> Regards >> >> Antoine. >> >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. 
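A minimal sketch of the fix suggested above (store method names, not method objects), assuming Python 3. It is an illustration only and is not the code used by unittest or unittest2: the `equality_funcs` registry and the `lookup` helper are hypothetical names. The idea is that a dict of plain strings pickles without trouble, and a name is resolved to a bound method with getattr only when it is needed.

```python
import pickle
import unittest

# Hypothetical registry: map a type to the *name* of a comparison method
# instead of to the bound method object itself.
equality_funcs = {list: "assertListEqual", dict: "assertDictEqual"}

# Built-in types and strings round-trip through pickle without trouble.
restored = pickle.loads(pickle.dumps(equality_funcs))


def lookup(case, typ, registry):
    """Resolve a stored method name to a bound method only when needed."""
    name = registry.get(typ)
    return getattr(case, name) if name is not None else None


case = unittest.TestCase()
check = lookup(case, list, restored)
check([1, 2], [1, 2])  # equal lists pass; unequal ones raise AssertionError
```

The cost of this design is one getattr call per lookup; the benefit is that the instance state holds nothing but strings.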
From eric at trueblade.com Wed Nov 3 15:53:11 2010 From: eric at trueblade.com (Eric Smith) Date: Wed, 03 Nov 2010 10:53:11 -0400 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <4CD16EB2.9090806@voidspace.org.uk> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD16EB2.9090806@voidspace.org.uk> Message-ID: <4CD17757.4010004@trueblade.com> On 11/3/10 10:16 AM, Michael Foord wrote: > On 03/11/2010 14:05, Nick Coghlan wrote: >> On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger >> wrote: >>> Sounds like a decision to split a module into a package is a big >>> commitment. Each of the individual file names becomes a permanent >>> part of the API. Even future additional splits are precluded because >>> it might break someones dotted import (i.e. not a single function can >>> be moved between those files -- once in unittest.utils, alway in >>> unittest.utils). >> Can Python 2.7 pickles containing unittest classes be unpickled using >> 2.6 or earlier? Even if nobody uses the new names for imports, I >> believe they implicitly end up included in any pickles involving >> affected classes (I seem to recall we've been bitten by that before >> when moving things around). > > Yes, since unittest.TestCase is still available (as are all the names). > I believe so anyway... Actually I think the answer is "no" (assuming you could pickle a TestCase). Here's an example with TestLoader: $ python27 Python 2.7.0+ (release27-maint:85878, Oct 28 2010, 06:40:25) [GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import unittest >>> x = unittest.TestLoader() >>> import pickle >>> pickle.dumps(x) 'ccopy_reg\n_reconstructor\np0\n(cunittest.loader\nTestLoader\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.' >>> $ python24 Python 2.4.4 (#1, Oct 23 2006, 13:58:00) [GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import pickle >>> pickle.loads('ccopy_reg\n_reconstructor\np0\n(cunittest.loader\nTestLoader\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.') Traceback (most recent call last): File "", line 1, in ? File "/usr/lib/python2.4/pickle.py", line 1394, in loads return Unpickler(file).load() File "/usr/lib/python2.4/pickle.py", line 872, in load dispatch[key](self) File "/usr/lib/python2.4/pickle.py", line 1104, in load_global klass = self.find_class(module, name) File "/usr/lib/python2.4/pickle.py", line 1138, in find_class __import__(module) ImportError: No module named loader The problem is that there is no unittest.loader in 2.4, and unittest.loader.TestLoader is the name that the 2.7 pickle creates. We see this problem every time we try and move anything in the stdlib. -- Eric. From barry at python.org Wed Nov 3 15:54:33 2010 From: barry at python.org (Barry Warsaw) Date: Wed, 3 Nov 2010 10:54:33 -0400 Subject: [Python-Dev] On breaking modules into packages In-Reply-To: <87eib3jmn3.fsf_-_@benfinney.id.au> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <87eib3jmn3.fsf_-_@benfinney.id.au> Message-ID: <20101103105433.1530f1fd@mission> On Nov 03, 2010, at 11:06 AM, Ben Finney wrote: >Is this a case where it would be better if the package names had the >leading underscore: ?_utils?, ?_suite?, etc.? 
> >Does the convention on single-leading-underscore identifiers as ?don't >rely on this name staying the same in future versions? hold for package >names? I would vote "yes". I have seen more and more packages use this convention to signal that the module name is not intended to be imported directly. This should be part of any PEP 8 recommendation, IMO. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From barry at python.org Wed Nov 3 15:55:07 2010 From: barry at python.org (Barry Warsaw) Date: Wed, 3 Nov 2010 10:55:07 -0400 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <1288740855.3541.24.camel@localhost.localdomain> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <1288740855.3541.24.camel@localhost.localdomain> Message-ID: <20101103105507.11e12b41@mission> On Nov 03, 2010, at 12:34 AM, Antoine Pitrou wrote: >I don't agree with this. Until it's documented, it's an implementation >detail and should be able to change without notice. >If someone wants to depend on some undocumented detail of the directory >layout it's their problem (like people depending on bytecode and other >stuff). +1 -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
From fuzzyman at voidspace.org.uk Wed Nov 3 15:56:45 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Wed, 03 Nov 2010 14:56:45 +0000
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <4CD17757.4010004@trueblade.com>
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD16EB2.9090806@voidspace.org.uk> <4CD17757.4010004@trueblade.com>
Message-ID: <4CD1782D.8090301@voidspace.org.uk>

On 03/11/2010 14:53, Eric Smith wrote:
> On 11/3/10 10:16 AM, Michael Foord wrote:
>> On 03/11/2010 14:05, Nick Coghlan wrote:
>>> On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger
>>> wrote:
>>>> Sounds like a decision to split a module into a package is a big
>>>> commitment. Each of the individual file names becomes a permanent
>>>> part of the API. Even future additional splits are precluded because
>>>> it might break someone's dotted import (i.e. not a single function can
>>>> be moved between those files -- once in unittest.utils, always in
>>>> unittest.utils).
>>> Can Python 2.7 pickles containing unittest classes be unpickled using
>>> 2.6 or earlier? Even if nobody uses the new names for imports, I
>>> believe they implicitly end up included in any pickles involving
>>> affected classes (I seem to recall we've been bitten by that before
>>> when moving things around).
>>
>> Yes, since unittest.TestCase is still available (as are all the names).
>> I believe so anyway...
>
> Actually I think the answer is "no" (assuming you could pickle a
> TestCase). Here's an example with TestLoader:
>

Ah dammit, I read the question the other way round.
Michael > $ python27 > Python 2.7.0+ (release27-maint:85878, Oct 28 2010, 06:40:25) > [GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import unittest > >>> x = unittest.TestLoader() > >>> import pickle > >>> pickle.dumps(x) > 'ccopy_reg\n_reconstructor\np0\n(cunittest.loader\nTestLoader\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.' > > >>> > > $ python24 > Python 2.4.4 (#1, Oct 23 2006, 13:58:00) > [GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import pickle > >>> > pickle.loads('ccopy_reg\n_reconstructor\np0\n(cunittest.loader\nTestLoader\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.') > Traceback (most recent call last): > File "", line 1, in ? > File "/usr/lib/python2.4/pickle.py", line 1394, in loads > return Unpickler(file).load() > File "/usr/lib/python2.4/pickle.py", line 872, in load > dispatch[key](self) > File "/usr/lib/python2.4/pickle.py", line 1104, in load_global > klass = self.find_class(module, name) > File "/usr/lib/python2.4/pickle.py", line 1138, in find_class > __import__(module) > ImportError: No module named loader > > The problem is that there is no unittest.loader in 2.4, and > unittest.loader.TestLoader is the name that the 2.7 pickle creates. We > see this problem every time we try and move anything in the stdlib. > -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. 
You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From benjamin at python.org Wed Nov 3 16:01:32 2010 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 3 Nov 2010 10:01:32 -0500 Subject: [Python-Dev] [Python-checkins] r85902 - in python/branches/py3k/Lib: os.py test/test_os.py In-Reply-To: References: <20101029003858.7D584EEA52@mail.python.org> <201011021355.40153.victor.stinner@haypocalc.com> <201011031155.41814.victor.stinner@haypocalc.com> Message-ID: 2010/11/3 Nick Coghlan : > On Wed, Nov 3, 2010 at 10:19 PM, Benjamin Peterson wrote: >> >> Warnings is loaded every time anyway. > > I would have agreed with you, but the contents of sys.modules in a > just-started interactive interpreter suggests that isn't true any more > (which surprised me). Is that perhaps because of _warnings? -- Regards, Benjamin From eric at trueblade.com Wed Nov 3 16:25:36 2010 From: eric at trueblade.com (Eric Smith) Date: Wed, 03 Nov 2010 11:25:36 -0400 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <4CD17757.4010004@trueblade.com> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD16EB2.9090806@voidspace.org.uk> <4CD17757.4010004@trueblade.com> Message-ID: <4CD17EF0.2060701@trueblade.com> On 11/3/10 10:53 AM, Eric Smith wrote: > The problem is that there is no unittest.loader in 2.4, and > unittest.loader.TestLoader is the name that the 2.7 pickle creates. We > see this problem every time we try and move anything in the stdlib. And BTW: for me, this is the strongest reason not to break up modules into packages or otherwise reorganize the stdlib. -- Eric. 
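To make the point about names inside pickles concrete, here is a small sketch, assuming Python 3. It shows that a pickle refers to a class by the module that defines it, not by the public name it was imported from. Protocol 0 is chosen only because it is text-based and therefore readable, like the 2.7 output quoted in this thread; the variable names are illustrative.

```python
import pickle
import unittest

loader = unittest.TestLoader()

# The class is re-exported as unittest.TestLoader, but it is defined
# in the unittest.loader submodule, and that is what __module__ reports.
defining_module = type(loader).__module__

# Protocol 0 is text-based, so the recorded module path is easy to spot.
payload = pickle.dumps(loader, protocol=0)

print(defining_module)
print(b"unittest.loader" in payload)
```

An interpreter that has no importable module of that name cannot resolve the reference when unpickling, which is the failure shown in the 2.4 session above.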
From alexander.belopolsky at gmail.com Wed Nov 3 16:26:18 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 3 Nov 2010 11:26:18 -0400
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
In-Reply-To:
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com>
Message-ID:

On Tue, Nov 2, 2010 at 6:58 PM, Guido van Rossum wrote:
..
> To spout a somewhat contrarian opinion, I just browsed the new
> unittest package, and the structure seems reasonable to me, even if
> its submodules are not particularly reusable. I've used this kind of
> style for development myself. What is so offensive about it?

I would not call it "offensive", but what I find annoying is

>>> import unittest
>>> unittest.TestCase.__module__
'unittest.case'

This may not be a problem for smart tools, but for me and a simple
editor what used to be:

Let's find code for unittest.TestCase.
1. Open Lib/unittest.py.
2. Search for "class TestCase".

is now

1. Open Lib/unittest.py -> No such file or directory.
2. OK, I'm in 2.7. Open Lib/unittest/__init__.py
3. Search for "class TestCase" -> beep
4. OK, search for "TestCase" -> from .case import (TestCase,
FunctionTestCase, SkipTest, skip, skipIf, ..
5. Hmm, what is ".case". Ah, the relative import - open case.py
6. Search for "class TestCase".
7. What exactly was I looking for?

The above is only a slightly exaggerated scenario that I went through
several times when I started using 2.7 before I conditioned myself to
grep in Lib/unittest/*.py.
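For the lookup problem described above, the standard inspect module can report where a re-exported class is really defined. A small sketch, assuming Python 3 and an installation that ships the .py sources (otherwise `inspect.getsourcefile` may return None, which the code allows for):

```python
import inspect
import os
import unittest

cls = unittest.TestCase

# Name of the module that actually defines the class.
module_name = cls.__module__

# Path of the defining source file, or None when no source is available.
source_file = inspect.getsourcefile(cls)

print(module_name)
if source_file is not None:
    # getsourcelines returns (list_of_source_lines, starting_line_number).
    _, first_line = inspect.getsourcelines(cls)
    print("%s:%d" % (os.path.basename(source_file), first_line))
```

This replaces the open, search, follow-the-import sequence with a single query, at the price of needing an interpreter at hand.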
What is unfortunate is that file split was accompanied by an explosion of assert* methods in TestCase API which means that anyone reading 2.7 unittests is likely to encounter an unfamiliar method that has to be looked up. I think the problem that I described is just a slightly reworded problem that Raymond reported at the beginning of this thread. In other words, I am not alone in seeing this as a problem. PS: For a "made from scratch" API I would prefer TestCase only be available from unittest.case. From foom at fuhm.net Wed Nov 3 18:04:53 2010 From: foom at fuhm.net (James Y Knight) Date: Wed, 3 Nov 2010 13:04:53 -0400 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <4CD17EF0.2060701@trueblade.com> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD16EB2.9090806@voidspace.org.uk> <4CD17757.4010004@trueblade.com> <4CD17EF0.2060701@trueblade.com> Message-ID: <75A18EA0-07C2-4D1F-A04C-5BCFB8A54445@fuhm.net> On Nov 3, 2010, at 11:25 AM, Eric Smith wrote: > On 11/3/10 10:53 AM, Eric Smith wrote: > >> The problem is that there is no unittest.loader in 2.4, and >> unittest.loader.TestLoader is the name that the 2.7 pickle creates. We >> see this problem every time we try and move anything in the stdlib. > > And BTW: for me, this is the strongest reason not to break up modules into packages or otherwise reorganize the stdlib. This is the strongest reason why I recommend to everyone I know that they not use pickle for storage they'd like to keep working after upgrades [not just of stdlib, but other 3rd party software or their own software]. 
:)

James

From techtonik at gmail.com Wed Nov 3 19:21:27 2010
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 3 Nov 2010 20:21:27 +0200
Subject: [Python-Dev] Code coverage doesn't show .py stats
Message-ID:

Hi,

Python code coverage doesn't include any .py files. What happened?
http://coverage.livinglogic.de/

Did it work before?
--
anatoly t.

From glyph at twistedmatrix.com Wed Nov 3 20:08:33 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Wed, 3 Nov 2010 15:08:33 -0400
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <75A18EA0-07C2-4D1F-A04C-5BCFB8A54445@fuhm.net>
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD16EB2.9090806@voidspace.org.uk> <4CD17757.4010004@trueblade.com> <4CD17EF0.2060701@trueblade.com> <75A18EA0-07C2-4D1F-A04C-5BCFB8A54445@fuhm.net>
Message-ID:

On Nov 3, 2010, at 1:04 PM, James Y Knight wrote:

> This is the strongest reason why I recommend to everyone I know that they not use pickle for storage they'd like to keep working after upgrades [not just of stdlib, but other 3rd party software or their own software]. :)

+1.

Twisted actually tried to preserve pickle compatibility in the bad old days, but it was impossible. Pickles should never really be saved to disk unless they contain nothing but lists, ints, strings, and dicts.

From fuzzyman at voidspace.org.uk Wed Nov 3 20:26:53 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Wed, 03 Nov 2010 19:26:53 +0000
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
In-Reply-To: <4CD17757.4010004@trueblade.com>
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD16EB2.9090806@voidspace.org.uk> <4CD17757.4010004@trueblade.com>
Message-ID: <4CD1B77D.2070501@voidspace.org.uk>

On 03/11/2010 14:53, Eric Smith wrote:
> On 11/3/10 10:16 AM, Michael Foord wrote:
>> On 03/11/2010 14:05, Nick Coghlan wrote:
>>> On Wed, Nov 3, 2010 at 9:32 AM, Raymond Hettinger
>>> wrote:
>>>> Sounds like a decision to split a module into a package is a big
>>>> commitment. Each of the individual file names becomes a permanent
>>>> part of the API. Even future additional splits are precluded because
>>>> it might break someone's dotted import (i.e. not a single function can
>>>> be moved between those files -- once in unittest.utils, always in
>>>> unittest.utils).
>>> Can Python 2.7 pickles containing unittest classes be unpickled using
>>> 2.6 or earlier? Even if nobody uses the new names for imports, I
>>> believe they implicitly end up included in any pickles involving
>>> affected classes (I seem to recall we've been bitten by that before
>>> when moving things around).
>>
>> Yes, since unittest.TestCase is still available (as are all the names).
>> I believe so anyway...
>
> Actually I think the answer is "no" (assuming you could pickle a
> TestCase). Here's an example with TestLoader:
>

It is actually fixable by temporarily switching the __module__ attribute of the classes inside a __reduce__ or __reduce_ex__ method.
I couldn't see a cleaner way of doing it using the pickling protocol methods. I asked on #python-dev but the *only* person who claimed to understand the pickle protocol methods was Barry, and he is clearly insane. Antoine is firmly of the opinion that making TestCase instances unpickleable is a feature... Although in practise this is less likely to be an issue for TestCase directly as it is extremely rare to use them without subclassing. More likely to be an issue for the test result or runner objects. All the best, Michael Foord > $ python27 > Python 2.7.0+ (release27-maint:85878, Oct 28 2010, 06:40:25) > [GCC 4.1.2 20070626 (Red Hat 4.1.2-13)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import unittest > >>> x = unittest.TestLoader() > >>> import pickle > >>> pickle.dumps(x) > 'ccopy_reg\n_reconstructor\np0\n(cunittest.loader\nTestLoader\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.' > > >>> > > $ python24 > Python 2.4.4 (#1, Oct 23 2006, 13:58:00) > [GCC 4.1.1 20061011 (Red Hat 4.1.1-30)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import pickle > >>> > pickle.loads('ccopy_reg\n_reconstructor\np0\n(cunittest.loader\nTestLoader\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n.') > Traceback (most recent call last): > File "", line 1, in ? > File "/usr/lib/python2.4/pickle.py", line 1394, in loads > return Unpickler(file).load() > File "/usr/lib/python2.4/pickle.py", line 872, in load > dispatch[key](self) > File "/usr/lib/python2.4/pickle.py", line 1104, in load_global > klass = self.find_class(module, name) > File "/usr/lib/python2.4/pickle.py", line 1138, in find_class > __import__(module) > ImportError: No module named loader > > The problem is that there is no unittest.loader in 2.4, and > unittest.loader.TestLoader is the name that the 2.7 pickle creates. We > see this problem every time we try and move anything in the stdlib. > -- http://www.voidspace.org.uk/ READ CAREFULLY. 
By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

From solipsis at pitrou.net Wed Nov 3 20:45:04 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 3 Nov 2010 20:45:04 +0100
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD16EB2.9090806@voidspace.org.uk> <4CD17757.4010004@trueblade.com> <4CD1B77D.2070501@voidspace.org.uk>
Message-ID: <20101103204504.7029354f@pitrou.net>

On Wed, 03 Nov 2010 19:26:53 +0000
Michael Foord wrote:
>
> Antoine is firmly of the opinion that making TestCase instances
> unpickleable is a feature...

Apparently you didn't really understand me. I'm of the opinion that
making TestCase instances pickleable is useless if that pickling
doesn't have well-defined semantics. And I wonder what the semantics of
pickling a TestCase could be, and what the use cases are.

Regards

Antoine.
From jnoller at gmail.com Wed Nov 3 20:48:27 2010 From: jnoller at gmail.com (Jesse Noller) Date: Wed, 3 Nov 2010 15:48:27 -0400 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: <20101103204504.7029354f@pitrou.net> References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD16EB2.9090806@voidspace.org.uk> <4CD17757.4010004@trueblade.com> <4CD1B77D.2070501@voidspace.org.uk> <20101103204504.7029354f@pitrou.net> Message-ID: On Wed, Nov 3, 2010 at 3:45 PM, Antoine Pitrou wrote: > On Wed, 03 Nov 2010 19:26:53 +0000 > Michael Foord wrote: >> >> Antoine is firmly of the opinion that making TestCase instances >> unpickleable is a feature... > > Apparently you didn't really understand me. I'm of the opinion that > making TestCase instances pickleable is useless if that pickling > doesn't have well-defined semantics. And I wonder what the semantics of > pickling a TestCase could be, and what the use cases are. > > Regards > > Antoine. > Splitting groups of tests to run in parallel via multiple processes is a pretty good use case. 
From solipsis at pitrou.net Wed Nov 3 20:56:51 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 03 Nov 2010 20:56:51 +0100
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
In-Reply-To:
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD16EB2.9090806@voidspace.org.uk> <4CD17757.4010004@trueblade.com> <4CD1B77D.2070501@voidspace.org.uk> <20101103204504.7029354f@pitrou.net>
Message-ID: <1288814211.5969.19.camel@localhost.localdomain>

On Wednesday 3 November 2010 at 15:48 -0400, Jesse Noller wrote:
> On Wed, Nov 3, 2010 at 3:45 PM, Antoine Pitrou wrote:
> > On Wed, 03 Nov 2010 19:26:53 +0000
> > Michael Foord wrote:
> >>
> >> Antoine is firmly of the opinion that making TestCase instances
> >> unpickleable is a feature...
> >
> > Apparently you didn't really understand me. I'm of the opinion that
> > making TestCase instances pickleable is useless if that pickling
> > doesn't have well-defined semantics. And I wonder what the semantics of
> > pickling a TestCase could be, and what the use cases are.
> >
> > Regards
> >
> > Antoine.
> >
>
> Splitting groups of tests to run in parallel via multiple processes is
> a pretty good use case.

Indeed, but it implies a lot of things about TestCase instances, which
could have additional non-pickleable attributes (e.g. file objects).
You'd better pickle the TestCase class instead, or simply the module
name as we do with regrtest -jN.

Regards

Antoine.
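A rough sketch of the "send names, not instances" idea, assuming Python 3. Test ids are plain strings, so they cross a process boundary trivially, and the receiving side can rebuild the test with the standard loader. Several things here are assumptions made for illustration: no process is actually spawned, `collect_ids` and `rebuild` are hypothetical helpers, and `TestCase("assertTrue")` is only a convenient importable stand-in for a real test, in the spirit of the `TestCase('assert_')` used earlier in this thread. This is not how regrtest -jN is implemented.

```python
import pickle
import unittest


def collect_ids(suite):
    """Flatten a (possibly nested) suite into a list of dotted test ids."""
    ids = []
    for item in suite:
        if isinstance(item, unittest.TestSuite):
            ids.extend(collect_ids(item))
        else:
            ids.append(item.id())
    return ids


def rebuild(test_id):
    """What a child process could do: load the test again from its name."""
    return unittest.defaultTestLoader.loadTestsFromName(test_id)


suite = unittest.TestSuite([unittest.TestCase("assertTrue")])
ids = collect_ids(suite)

# The ids are ordinary strings, so this payload pickles without trouble.
payload = pickle.dumps(ids)

# Receiving side: turn each name back into a freshly constructed test.
rebuilt = [test for name in pickle.loads(payload) for test in rebuild(name)]
```

Because each test is constructed afresh on the receiving side, attributes that cannot be pickled (open files, for example) never have to travel between processes.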
From fuzzyman at voidspace.org.uk Wed Nov 3 21:15:51 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 03 Nov 2010 20:15:51 +0000 Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/ In-Reply-To: References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com> <4CD16EB2.9090806@voidspace.org.uk> <4CD17757.4010004@trueblade.com> <4CD1B77D.2070501@voidspace.org.uk> <20101103204504.7029354f@pitrou.net> Message-ID: <4CD1C2F7.1060901@voidspace.org.uk> On 03/11/2010 19:48, Jesse Noller wrote: > On Wed, Nov 3, 2010 at 3:45 PM, Antoine Pitrou wrote: >> On Wed, 03 Nov 2010 19:26:53 +0000 >> Michael Foord wrote: >>> Antoine is firmly of the opinion that making TestCase instances >>> unpickleable is a feature... >> Apparently you didn't really understand me. I'm of the opinion that >> making TestCase instances pickleable is useless if that pickling >> doesn't have well-defined semantics. And I wonder what the semantics of >> pickling a TestCase could be, and what the use cases are. >> >> Regards >> >> Antoine. >> > Splitting groups of tests to run in parallel via multiple processes is > a pretty good use case. That's something I've been thinking about a lot (and talking to Holger about) for the unittest plugins. I definitely won't be doing it with pickles but as Antoine says, sending test names to the subprocesses. You really want tests run in a child process to behave differently and it makes sense to set them up inside the child process. 
All the best,

Michael Foord

--
http://www.voidspace.org.uk/

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

From glyph at twistedmatrix.com Wed Nov 3 21:59:35 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Wed, 3 Nov 2010 16:59:35 -0400
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
In-Reply-To:
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com>
Message-ID:

On Nov 3, 2010, at 11:26 AM, Alexander Belopolsky wrote:

> This may not be a problem for smart tools, but for me and a simple
> editor what used to be:

Maybe this is the real problem? It's 2010, we should all be far enough beyond EDLIN that our editors can jump to the definition of a Python class. Even Vim can be convinced to do this ().

Could Python itself make this easier? Maybe ship with a command that says "hey, somewhere on sys.path, there is a class with .
Please run '$EDITOR file +line' (or the current OS's equivalent) so I can look at the source code".

From alexander.belopolsky at gmail.com Wed Nov 3 22:18:41 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 3 Nov 2010 17:18:41 -0400
Subject: [Python-Dev] On breaking modules into packages Was: [issue10199] Move Demo/turtle under Lib/
In-Reply-To:
References: <1288110692.31.0.0560369831336.issue10199@psf.upfronthosting.co.za> <9011AC5F-4121-4BBE-9427-02D2A1C24259@gmail.com> <20101026190538.7E9791FE543@kimball.webabinitio.net> <4CC73E17.1000501@voidspace.org.uk> <20101027124223.119ce5b1@pitrou.net> <2055C288-1FFA-4023-AA4F-BA052AC2A3DB@gmail.com>
Message-ID:

On Wed, Nov 3, 2010 at 4:59 PM, Glyph Lefkowitz wrote:
..
> Maybe ship with a command that says "hey, somewhere on sys.path,
> there is a class with . Please run '$EDITOR file +line' (or the
> current OS's equivalent) so I can look at the source code".
>

Well, we already have inspect.findsource() for that.

From ncoghlan at gmail.com Wed Nov 3 23:12:01 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 4 Nov 2010 08:12:01 +1000
Subject: [Python-Dev] [Python-checkins] r85902 - in python/branches/py3k/Lib: os.py test/test_os.py
In-Reply-To:
References: <20101029003858.7D584EEA52@mail.python.org> <201011021355.40153.victor.stinner@haypocalc.com> <201011031155.41814.victor.stinner@haypocalc.com>
Message-ID:

On Thu, Nov 4, 2010 at 1:01 AM, Benjamin Peterson wrote:
> 2010/11/3 Nick Coghlan:
>> On Wed, Nov 3, 2010 at 10:19 PM, Benjamin Peterson wrote:
>>>
>>> Warnings is loaded every time anyway.
>>
>> I would have agreed with you, but the contents of sys.modules in a
>> just-started interactive interpreter suggests that isn't true any more
>> (which surprised me).
>
> Is that perhaps because of _warnings?
I suspect it's a combination of that and the patch to allow non-blocking module imports (which turns some things that would previously have been deadlocks into runtime exceptions).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From eric at trueblade.com Thu Nov 4 01:14:27 2010
From: eric at trueblade.com (Eric Smith)
Date: Wed, 03 Nov 2010 20:14:27 -0400
Subject: [Python-Dev] str.format_from_mapping
In-Reply-To: <4CCDD410.6020104@trueblade.com>
References: <4CCDD410.6020104@trueblade.com>
Message-ID: <4CD1FAE3.4020707@trueblade.com>

On 10/31/10 4:39 PM, Eric Smith wrote:
> What are your thoughts on adding a str.format_from_mapping (or similar
> name, maybe the suggested "format_map") to 3.2? See
> http://bugs.python.org/issue6081 . This method would be similar to
> "%(foo)s %(bar)s" % d, where d is a dict (or rather any mapping object),
> but of course would use str.format syntax: "{foo}
> {bar}".format_from_mapping(d).
>
> I like the idea. It's particularly handy when converting from %-formatting.
>
> Eric.

I've updated the issue with tests, minimal docs, and a name change to str.format_map. Having heard no objections and some support, I'll commit this shortly.

--
Eric.

From allan at archlinux.org Thu Nov 4 05:44:09 2010
From: allan at archlinux.org (Allan McRae)
Date: Thu, 04 Nov 2010 14:44:09 +1000
Subject: [Python-Dev] Python-3 transition in Arch Linux
Message-ID: <4CD23A19.6080002@archlinux.org>

Hi,

While this is not strictly related to python development, I thought that developers of python might be interested in some of the lessons provided by this. So forgive me if this is really wrong for this list...

Recently Arch Linux did a big transition with respect to python. Now we support two python packages: "python" and "python2". The "python" package will always contain the latest 3.x release and currently has /usr/bin/python3.1 with symlinks to /usr/bin/python3 and /usr/bin/python.
The "python2" package contains the latest from the "legacy" python-2.x branch and has /usr/bin/python2.7 with a symlink to /usr/bin/python2.

I really do not want to debate the sanity of pointing /usr/bin/python at python-3.x here, but it suffices to say that I am of the opinion that if python-3.x is really the future of python, then /usr/bin/python must eventually point to a 3.x version. Also, Arch Linux is very bleeding edge and we expect our users to be competent enough to deal with things like this. According to #python, we are all idiots.... And I have been (figuratively) yelled at by a couple of Debian developers (which is incidentally the only major distro I found without a /usr/bin/python2 symlink).

Anyway, this transition was rather simple from a distribution point of view apart from the sheer number of packages involved. All our supported packages were rebuilt to work with that symlink layout and any "porting" of software to use that layout was relatively simple. Most packages either detected the python2 symlink during the rebuild and just worked while others required some sort of "export PYTHON=python2" or "--with-python=python2" or "python2 setup.py" or the like during the build.

Software packages tend to fall into three categories at roughly equal frequencies:
1) packages that detected or were pointed at python2 and everything worked
2) packages that detected or were pointed at python2 and partially worked
3) packages that needed adjustment to work with the python2 symlink.

The second case was particularly interesting. These software would change some of their #! to point at the python2 symlink and leave the rest pointing at python. Note that python-2.7 itself falls into this category as many files in /usr/lib/python2.7 still have "#!/usr/bin/env python" even when installed with "make altinstall".
I can not remember the exact details, but I recall that some of these files were installed with executable permissions which would be bad, but I need to look into this again now things have calmed down... The packages that did not auto-detect and work with /usr/bin/python2 or /usr/bin/python2.7 mostly required a sed of their shebangs or a patch to any hardcoded /usr/bin/python paths so were easily fixed. So that is something that python software developers could think about for the future. When someone configures a module using a particular version of python, then ALL shebangs need changed to use that version. And it is generally bad practice to hardcode /usr/bin/python into any application as you are never quite sure which version you are getting. Instead allow it to be configured, keeping the current value as default. Cheers, Allan -- Allan McRae Arch Linux Developer From techtonik at gmail.com Thu Nov 4 07:28:11 2010 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 4 Nov 2010 08:28:11 +0200 Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages) Message-ID: On Wed, Nov 3, 2010 at 9:08 PM, Glyph Lefkowitz wrote: > > This is the strongest reason why I recommend to everyone I know that they > not use pickle for storage they'd like to keep working after upgrades [not > just of stdlib, but other 3rd party software or their own software]. :) > > +1. > Twisted actually tried to preserve pickle compatibility in the bad old days, > but it was impossible. ?Pickles should never really be saved to disk unless > they contain nothing but lists, ints, strings, and dicts. But what is alternative in stdlib? Don't you think that Python doesn't provide any? -- anatoly t. 
From victor.stinner at haypocalc.com Thu Nov 4 12:16:17 2010 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 4 Nov 2010 12:16:17 +0100 Subject: [Python-Dev] [Python-checkins] r85902 - in python/branches/py3k/Lib: os.py test/test_os.py In-Reply-To: References: <20101029003858.7D584EEA52@mail.python.org> Message-ID: <201011041216.17451.victor.stinner@haypocalc.com> On Wednesday 03 November 2010 23:12:01 Nick Coghlan wrote: > On Thu, Nov 4, 2010 at 1:01 AM, Benjamin Peterson wrote: > > 2010/11/3 Nick Coghlan : > >> On Wed, Nov 3, 2010 at 10:19 PM, Benjamin Peterson wrote: > >>> Warnings is loaded every time anyway. > >> > >> I would have agreed with you, but the contents of sys.modules in a > >> just-started interactive interpreter suggests that isn't true any more > >> (which surprised me). > > > > Is that perhaps because of _warnings? > > I suspect it's a combination of that and the patch to allow > non-blocking module imports (which turns some things that would > previously have been deadlocks into runtime exceptions). So do you still think that I should patch the os module to use a global import or not? Victor From exarkun at twistedmatrix.com Thu Nov 4 13:12:19 2010 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Thu, 04 Nov 2010 12:12:19 -0000 Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages) In-Reply-To: References: Message-ID: <20101104121219.2040.225904071.divmod.xquotient.555@localhost.localdomain> On 06:28 am, techtonik at gmail.com wrote: >On Wed, Nov 3, 2010 at 9:08 PM, Glyph Lefkowitz > wrote: >> >>This is the strongest reason why I recommend to everyone I know that >>they >>not use pickle for storage they'd like to keep working after upgrades >>[not >>just of stdlib, but other 3rd party software or their own software]. >>:) >> >>+1. >>Twisted actually tried to preserve pickle compatibility in the bad old >>days, >>but it was impossible. 
Pickles should never really be saved to disk
>>unless
>>they contain nothing but lists, ints, strings, and dicts.
>
>But what is alternative in stdlib?
>Don't you think that Python doesn't provide any?

Persistence is a very hard problem. Lots and lots of trade-offs need to be made, and you generally want to tailor those trade-offs to the particular application at hand. This probably means that the stdlib isn't a suitable place to try to solve the problem.

Look outside the stdlib and you'll find an extremely vibrant and diverse collection of software which is aimed at solving this problem, though.

Jean-Paul

From ncoghlan at gmail.com Thu Nov 4 14:33:38 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 4 Nov 2010 23:33:38 +1000
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <4CD23A19.6080002@archlinux.org>
References: <4CD23A19.6080002@archlinux.org>
Message-ID:

On Thu, Nov 4, 2010 at 2:44 PM, Allan McRae wrote:
> The second case was particularly interesting. These software would change
> some of their #! to point at the python2 symlink and leave the rest pointing
> at python. Note that python-2.7 itself falls into this category as many
> files in /usr/lib/python2.7 still have "#!/usr/bin/env python" even when
> installed with "make altinstall". I can not remember the exact details, but
> I recall that some of these files were installed with executable permissions
> which would be bad, but I need to look into this again now things have
> calmed down...
>
> The packages that did not auto-detect and work with /usr/bin/python2 or
> /usr/bin/python2.7 mostly required a sed of their shebangs or a patch to any
> hardcoded /usr/bin/python paths so were easily fixed.

A very interesting exercise, indeed - especially the observation regarding software (including python itself) that supports installation under alternate names, but doesn't subsequently ensure use of that name in its shebang lines.
I just did a quick grep of Lib in my py3k directory, and it looks like cgi.py is incorrectly set to use "/usr/local/bin/python", while the other files with shebang lines are set to "/usr/bin/env python3" as expected. Tools also had a few discrepancies: scripts/2to3.py: /usr/bin/env python (necessary, I think - I believe 2to3 is a 2.x only program) scripts/gprof2html.py: /usr/bin/env python32.3 (Huh? Automated correction gone wrong, perhaps?) scripts/reindent-rst.py: /usr/bin/env python (probably incorrect) pybench: /usr/bin/env python (not sure - has pybench been forward ported on the 3.x branch?) world: /usr/bin/env python (I have no idea what this script is even for) (Note that these examples are a matter of simply respecting the *default* install location for python3, without even getting into questions of altinstall or configured installation locations) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Thu Nov 4 14:38:33 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 4 Nov 2010 23:38:33 +1000 Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages) In-Reply-To: References: Message-ID: On Thu, Nov 4, 2010 at 4:28 PM, anatoly techtonik wrote: > On Wed, Nov 3, 2010 at 9:08 PM, Glyph Lefkowitz wrote: >> >> This is the strongest reason why I recommend to everyone I know that they >> not use pickle for storage they'd like to keep working after upgrades [not >> just of stdlib, but other 3rd party software or their own software]. :) >> >> +1. >> Twisted actually tried to preserve pickle compatibility in the bad old days, >> but it was impossible. ?Pickles should never really be saved to disk unless >> they contain nothing but lists, ints, strings, and dicts. > > But what is alternative in stdlib? > Don't you think that Python doesn't provide any? 
Python 3.2a3+ (py3k:85817, Oct 24 2010, 19:25:28)
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import json
>>> dir(json)
['JSONDecoder', 'JSONEncoder', '__all__', '__author__',
'__builtins__', '__cached__', '__doc__', '__file__', '__name__',
'__package__', '__path__', '__version__', '_default_decoder',
'_default_encoder', 'decoder', 'dump', 'dumps', 'encoder', 'load',
'loads', 'scanner']

pickle gets overspecific in many ways, and hence (despite our best efforts, and those of third parties) may break when changing Python versions. Serialising to something more language neutral (be it JSON, YAML, XML or one of the multitude of other state encoding formats out there) is far more likely to be future proof.

As a tool for communicating between different instances of the *same* version of Python though, pickle is fine.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From walter at livinglogic.de Thu Nov 4 14:19:16 2010
From: walter at livinglogic.de (Walter Dörwald)
Date: Thu, 04 Nov 2010 14:19:16 +0100
Subject: [Python-Dev] Code coverage doesn't show .py stats
In-Reply-To:
References:
Message-ID: <4CD2B2D4.60102@livinglogic.de>

On 03.11.10 19:21, anatoly techtonik wrote:
> Hi,
>
> Python code coverage doesn't include any .py files. What happened?
> http://coverage.livinglogic.de/
>
> Did it work before?
It did, however currently the logfile http://coverage.livinglogic.de/testlog.txt shows the following exception:

Traceback (most recent call last):
  File "Lib/test/regrtest.py", line 1500, in
    main()
  File "Lib/test/regrtest.py", line 696, in main
    r.write_results(show_missing=True, summary=True, coverdir=coverdir)
  File "/home/coverage/python/Lib/trace.py", line 319, in write_results
    lnotab, count)
  File "/home/coverage/python/Lib/trace.py", line 369, in write_results_file
    outfile.write(line.expandtabs(8))
UnicodeEncodeError: 'ascii' codec can't encode character '\xe4' in position 30: ordinal not in range(128)

BTW, this is the py3k branch (i.e. http://svn.python.org/snapshots/python3k.tar.bz2)

It seems the trace module has a problem with unicode.

Servus,
Walter

From solipsis at pitrou.net Thu Nov 4 14:46:05 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 4 Nov 2010 14:46:05 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
References: <4CD23A19.6080002@archlinux.org>
Message-ID: <20101104144605.18f4d166@pitrou.net>

On Thu, 4 Nov 2010 23:33:38 +1000 Nick Coghlan wrote:
> Tools also had a few discrepancies:
> scripts/2to3.py: /usr/bin/env python (necessary, I think - I believe
> 2to3 is a 2.x only program)
> scripts/gprof2html.py: /usr/bin/env python32.3 (Huh? Automated
> correction gone wrong, perhaps?)

Or time machine gone wild? I think it is the version which automatically renames your classes and methods based on good taste, but still has the old assertLessEqual method at the bottom of the now 5-level deep unittest package hierarchy (while Michael enjoys his 3251st PSF community award after it was decided to make it a daily ceremony). pyclbr has been patched to handle it fine, though.

Regards

Antoine.
From benjamin at python.org Thu Nov 4 14:59:45 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 4 Nov 2010 08:59:45 -0500
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To:
References: <4CD23A19.6080002@archlinux.org>
Message-ID:

2010/11/4 Nick Coghlan:
> On Thu, Nov 4, 2010 at 2:44 PM, Allan McRae wrote:
>> The second case was particularly interesting. These software would change
>> some of their #! to point at the python2 symlink and leave the rest pointing
>> at python. Note that python-2.7 itself falls into this category as many
>> files in /usr/lib/python2.7 still have "#!/usr/bin/env python" even when
>> installed with "make altinstall". I can not remember the exact details, but
>> I recall that some of these files were installed with executable permissions
>> which would be bad, but I need to look into this again now things have
>> calmed down...
>>
>> The packages that did not auto-detect and work with /usr/bin/python2 or
>> /usr/bin/python2.7 mostly required a sed of their shebangs or a patch to any
>> hardcoded /usr/bin/python paths so were easily fixed.
>
> A very interesting exercise, indeed - especially the observation
> regarding software (including python itself) that supports
> installation under alternate names, but doesn't subsequently ensure
> use of that name in its shebang lines.
>
> I just did a quick grep of Lib in my py3k directory, and it looks like
> cgi.py is incorrectly set to use "/usr/local/bin/python", while the
> other files with shebang lines are set to "/usr/bin/env python3" as
> expected.
>
> Tools also had a few discrepancies:
> scripts/2to3.py: /usr/bin/env python (necessary, I think - I believe
> 2to3 is a 2.x only program)

No, I believe distutils is supposed to patch that up, though.
-- Regards, Benjamin From ocean-city at m2.ccsnet.ne.jp Thu Nov 4 15:09:39 2010 From: ocean-city at m2.ccsnet.ne.jp (Hirokazu Yamamoto) Date: Thu, 04 Nov 2010 23:09:39 +0900 Subject: [Python-Dev] [Python-checkins] r85987 - python/branches/py3k/Lib/test/test_os.py In-Reply-To: References: <20101030212421.717CCEE986@mail.python.org> <4CCEE678.7040705@m2.ccsnet.ne.jp> Message-ID: <4CD2BEA3.9010705@m2.ccsnet.ne.jp> On 2010/11/02 1:30, Nick Coghlan wrote: > On Tue, Nov 2, 2010 at 2:10 AM, Hirokazu Yamamoto > wrote: >> Does this really cause resource warning? I think os.popen instance >> won't be into traceback because it's not declared as variable. So I >> suppose it will be deleted by reference count == 0 even when exception >> occurs. > > Any time __del__ has to close the resource triggers ResourceWarning, > regardless of whether that is due to the cyclic garbage collector or > the refcount naturally falling to zero. In the past dealing with this > was clumsy, so it made sense to rely on CPython's refcounting to do > the work. However, we have better tools for deterministic resource > management now (in the form of context managers), so these updates > help make the standard library and its test suite more suitable for > use with non-refcounting Python implementations (such as PyPy, Jython > and IronPython). > > Cheers, > Nick. > Thank you for reply. Probably this is difficult problem. I often use with statement, but it's also true sometimes I feel this warning is a bit noisy. Is there a way to turn this off? C:\Documents and Settings\Ocean>py3k Python 3.2a3+ (py3k, Nov 3 2010, 00:27:28) [MSC v.1200 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. 
>>> open("a.py").read()
__main__:1: ResourceWarning: unclosed file <_io.TextIOWrapper name='a.py' encoding='cp932'>
'\nimport timeit\n\nt = timeit.Timer("""\nos.stat("e:/voltest/lnk")\n""", """\nimport os\n""")\n\nprint(t.timeit(1000))\n\n'
[49593 refs]

From solipsis at pitrou.net Thu Nov 4 15:23:58 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 4 Nov 2010 15:23:58 +0100
Subject: [Python-Dev] [Python-checkins] r85987 - python/branches/py3k/Lib/test/test_os.py
References: <20101030212421.717CCEE986@mail.python.org> <4CCEE678.7040705@m2.ccsnet.ne.jp> <4CD2BEA3.9010705@m2.ccsnet.ne.jp>
Message-ID: <20101104152358.679f53b8@pitrou.net>

On Thu, 04 Nov 2010 23:09:39 +0900 Hirokazu Yamamoto wrote:
> On 2010/11/02 1:30, Nick Coghlan wrote:
> > On Tue, Nov 2, 2010 at 2:10 AM, Hirokazu Yamamoto
> > wrote:
> >> Does this really cause resource warning? I think os.popen instance
> >> won't be into traceback because it's not declared as variable. So I
> >> suppose it will be deleted by reference count == 0 even when exception
> >> occurs.
> >
> > Any time __del__ has to close the resource triggers ResourceWarning,
> > regardless of whether that is due to the cyclic garbage collector or
> > the refcount naturally falling to zero. In the past dealing with this
> > was clumsy, so it made sense to rely on CPython's refcounting to do
> > the work. However, we have better tools for deterministic resource
> > management now (in the form of context managers), so these updates
> > help make the standard library and its test suite more suitable for
> > use with non-refcounting Python implementations (such as PyPy, Jython
> > and IronPython).
> >
> > Cheers,
> > Nick.
> >
>
> Thank you for reply. Probably this is difficult problem. I often
> use with statement, but it's also true sometimes I feel this warning is
> a bit noisy. Is there a way to turn this off?
You can use all the usual means of controlling emission of warnings, so for example "python -Wi" would work to silence them all. Also, ResourceWarning is silenced by default in "release" builds. Regards Antoine. From ocean-city at m2.ccsnet.ne.jp Thu Nov 4 15:41:04 2010 From: ocean-city at m2.ccsnet.ne.jp (Hirokazu Yamamoto) Date: Thu, 04 Nov 2010 23:41:04 +0900 Subject: [Python-Dev] [Python-checkins] r85987 - python/branches/py3k/Lib/test/test_os.py In-Reply-To: <20101104152358.679f53b8@pitrou.net> References: <20101030212421.717CCEE986@mail.python.org> <4CCEE678.7040705@m2.ccsnet.ne.jp> <4CD2BEA3.9010705@m2.ccsnet.ne.jp> <20101104152358.679f53b8@pitrou.net> Message-ID: <4CD2C600.6000908@m2.ccsnet.ne.jp> On 2010/11/04 23:23, Antoine Pitrou wrote: > You can use all the usual means of controlling emission of warnings, so > for example "python -Wi" would work to silence them all. > Also, ResourceWarning is silenced by default in "release" builds. > > Regards > > Antoine. Thank you, this works. (I couldn't find the way from "python --help") From guido at python.org Thu Nov 4 15:51:17 2010 From: guido at python.org (Guido van Rossum) Date: Thu, 4 Nov 2010 07:51:17 -0700 Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages) In-Reply-To: References: Message-ID: > On Wed, Nov 3, 2010 at 9:08 PM, Glyph Lefkowitz wrote: >> This is the strongest reason why I recommend to everyone I know that they >> not use pickle for storage they'd like to keep working after upgrades [not >> just of stdlib, but other 3rd party software or their own software]. :) >> >> +1. >> Twisted actually tried to preserve pickle compatibility in the bad old days, >> but it was impossible. ?Pickles should never really be saved to disk unless >> they contain nothing but lists, ints, strings, and dicts. But *that* set of types can safely be marshalled using the marshal module... 
-- --Guido van Rossum (python.org/~guido) From alexander.belopolsky at gmail.com Thu Nov 4 15:57:39 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 4 Nov 2010 10:57:39 -0400 Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages) In-Reply-To: References: Message-ID: On Thu, Nov 4, 2010 at 10:51 AM, Guido van Rossum wrote: .. >>> Twisted actually tried to preserve pickle compatibility in the bad old days, >>> but it was impossible. ?Pickles should never really be saved to disk unless >>> they contain nothing but lists, ints, strings, and dicts. > > But *that* set of types can safely be marshalled using the marshal module... Not if the instances contain reference cycles. From ncoghlan at gmail.com Thu Nov 4 15:59:22 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 5 Nov 2010 00:59:22 +1000 Subject: [Python-Dev] [Python-checkins] r85902 - in python/branches/py3k/Lib: os.py test/test_os.py In-Reply-To: <201011041216.17451.victor.stinner@haypocalc.com> References: <20101029003858.7D584EEA52@mail.python.org> <201011041216.17451.victor.stinner@haypocalc.com> Message-ID: On Thu, Nov 4, 2010 at 9:16 PM, Victor Stinner wrote: > So do you still think that I should patch the os module to use a global import > or not? I'm actually more inclined to suggest we avoid triggering the warning under -bb in the first place by iterating over the environment in that case instead of using the mapping interface. (I was going to suggest a smarter version that used a SafeKey class instead, but it turns out os.environ only works with real string objects). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From barry at python.org Thu Nov 4 16:29:00 2010 From: barry at python.org (Barry Warsaw) Date: Thu, 4 Nov 2010 11:29:00 -0400 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: <4CD23A19.6080002@archlinux.org> References: <4CD23A19.6080002@archlinux.org> Message-ID: <20101104112900.6ab53bd5@mission> On Nov 04, 2010, at 02:44 PM, Allan McRae wrote: >While this is not strictly related to python development, I thought that >developers of python might be interested in some of the lessons provided by >this. So forgive me if this is really wrong for this list... > >Recently Arch Linux did a big transition with respect to python. Now we >support two python packages: "python" and "python2". Very cool to hear about this first hand, thanks for posting it Allan. I was recently at the Ubuntu Developers Summit and Arch Linux's transition was a source of several hallway discussions. It's good to read about your work and successes in blazing that trail. >I really do not want to debate the sanity of pointing /usr/bin/python at >python-3.x here, but it suffices to say that I am of the opinion that if >python-3.x is really the future of python, then /usr/bin/python must >eventually point to a 3.x version. Also, Arch Linux is very bleeding edge >and we expect our users to be competent enough to deal with thing like this. >According to #python, we are all idiots.... And I have been (figuratively) >yelled at by a couple of Debian developers (which is incidentally the only >major distro I found without a /usr/bin/python2 symlink). Ah too bad, no one needs to yell :). It's an interesting discussion topic though and it's something I think other distros should start considering. In Ubuntu 11.04, we'll have Python 3.1 and 3.2, 2.6 and 2.7, with the default (i.e. /usr/bin/python) either at 2.6 (probable) or 2.7 (possible). `python3` currently points to 3.1.2, but we've talked about getting that to 3.2 for this cycle. 
Ubuntu's next Long Term Support release is scheduled for April 2012. It's an ambitious but worthy goal to see if we can transition to Python 3 as the default Python by then.

-Barry

From barry at python.org Thu Nov 4 16:31:36 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 4 Nov 2010 11:31:36 -0400
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To:
References: <4CD23A19.6080002@archlinux.org>
Message-ID: <20101104113136.0c1147b6@mission>

On Nov 04, 2010, at 11:33 PM, Nick Coghlan wrote:

> world: /usr/bin/env python (I have no idea what this script is even for)

It's basically a front-end to ISO 3166 country codes. IOW, it prints the expansion of top-level domain names and can do some reverse lookups too. E.g.

% Tools/world/world us
us originated from United States

I once started to rip it out into a separate package but haven't gotten too far with that.

-Barry

From techtonik at gmail.com Thu Nov 4 17:15:57 2010
From: techtonik at gmail.com (anatoly techtonik)
Date: Thu, 4 Nov 2010 18:15:57 +0200
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages)
In-Reply-To:
References:
Message-ID:

On Thu, Nov 4, 2010 at 3:38 PM, Nick Coghlan wrote:
> On Thu, Nov 4, 2010 at 4:28 PM, anatoly techtonik wrote:
>> On Wed, Nov 3, 2010 at 9:08 PM, Glyph Lefkowitz wrote:
>>>
>>> This is the strongest reason why I recommend to everyone I know that they
>>> not use pickle for storage they'd like to keep working after upgrades [not
>>> just of stdlib, but other 3rd party software or their own software]. :)
>>>
>>> +1.
>>> Twisted actually tried to preserve pickle compatibility in the bad old days,
>>> but it was impossible. Pickles should never really be saved to disk unless
>>> they contain nothing but lists, ints, strings, and dicts.
>>
>> But what is alternative in stdlib?
>> Don't you think that Python doesn't provide any?
>
> Python 3.2a3+ (py3k:85817, Oct 24 2010, 19:25:28)
> [GCC 4.4.3] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> import json
>>>> dir(json)
> ['JSONDecoder', 'JSONEncoder', '__all__', '__author__',
> '__builtins__', '__cached__', '__doc__', '__file__', '__name__',
> '__package__', '__path__', '__version__', '_default_decoder',
> '_default_encoder', 'decoder', 'dump', 'dumps', 'encoder', 'load',
> 'loads', 'scanner']
>
> pickle gets overspecific in many ways, and hence (despite our best
> efforts, and those of third parties) may break when changing Python
> versions. Serialising to something more language natural (be it JSON,
> YAML, XML or one of the multitude of other state encoding formats out
> there) is far more likely to be future proof.
>
> As a tool for communicating between different instances of the *same*
> version of Python though, pickle is fine.

pickle is insecure, marshal too. What about JSON? IIUC you need a
definition of a class to be able to unserialize it in all cases. I
wonder how is this definition validated, i.e. what to watch for when
modifying classes that can be serialized.
--
anatoly t.
From guido at python.org Thu Nov 4 17:49:39 2010
From: guido at python.org (Guido van Rossum)
Date: Thu, 4 Nov 2010 09:49:39 -0700
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages)
In-Reply-To:
References:
Message-ID:

On Thu, Nov 4, 2010 at 9:15 AM, anatoly techtonik wrote:
> pickle is insecure, marshal too.

What's the attack you're thinking of on marshal?
It never executes any code while unmarshalling (although it can unmarshal code objects -- but the receiving program has to do something additionally to execute those). > What about JSON? IIUC you need a > definition of a class to be able to unserialize it in all cases. I > wonder how is this definition validated, i.e. what to watch for when > modifying classes that can be serialized. Security is all in the code used to deserialize. I haven't analyzed the json library that comes in the stdlib these days, but couldn't it in theory be as safe as XML? (Not that there haven't been any attacks on XML -- but they depended on bugs in the unmarshalling code, the format itself is not insecure.) -- --Guido van Rossum (python.org/~guido) From thomas at python.org Thu Nov 4 18:27:55 2010 From: thomas at python.org (Thomas Wouters) Date: Thu, 4 Nov 2010 18:27:55 +0100 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: <4CD23A19.6080002@archlinux.org> References: <4CD23A19.6080002@archlinux.org> Message-ID: On Thu, Nov 4, 2010 at 05:44, Allan McRae wrote: > According to #python, we are all idiots.... > To clarify (but I dont speak for the rest of #python, just myself), I think the move was premature, but I don't use Arch and I don't know what typical Arch users expect. The reason I think it's premature is that 'python2' just doesn't work everywhere, and I would have gone for a transitionary period where '/usr/bin/python' is something that screams loudly that it shouldn't be used before it executes 'python2'. That would've allowed for more time to fix things that use the wrong shebang line, or tools that use 'python' instead of letting distutils set it for them. I hope that's something other distributions will consider before changing the meaning of /usr/bin/python. As for #python, well, we got this storm of people utterly confused about how their stuff doesn't work anymore, and putting the blame in the wrong place. 
I don't think a distribution should ever cause that (even though many do in lesser ways) -- but as I said, I don't use Arch so maybe I don't understand the purpose of it. The complaints seem to have died down now (though possibly because of the 'no arch' topic :) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at v.loewis.de Thu Nov 4 21:12:41 2010 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Thu, 04 Nov 2010 21:12:41 +0100 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: References: <4CD23A19.6080002@archlinux.org> Message-ID: <4CD313B9.8070202@v.loewis.de> > To clarify (but I dont speak for the rest of #python, just myself), I > think the move was premature, but I don't use Arch and I don't know what > typical Arch users expect. The reason I think it's premature is that > 'python2' just doesn't work everywhere, and I would have gone for a > transitionary period where '/usr/bin/python' is something that screams > loudly that it shouldn't be used before it executes 'python2'. I really do think the key point here is "don't know what typical Arch users expect". I don't know either, but my personal feeling is that Arch isn't that widely used, but ISTM that Arch users are expected to be technically advanced, compared to the wider community of Linux users. So if these user find a problem, they might know how to fix it, and they might know how to make bug reports. In essence, you are asking that there should be a smoother path to making /usr/bin/python Python 3 - and I observe that Arch's switching actually is a very useful step on that smoother path. If they figure out what changes to make, many of the changes may have been done when other Linux distributions just start to consider the change. 
> As for #python, well, we got this storm of people utterly confused about > how their stuff doesn't work anymore, and putting the blame in the wrong > place. I don't think a distribution should ever cause that (even though > many do in lesser ways) -- but as I said, I don't use Arch so maybe I > don't understand the purpose of it. The complaints seem to have died > down now (though possibly because of the 'no arch' topic :) So apparently, there is quite a number of Arch users, and they do make bug reports. Good :-) If this gets attributed correctly (i.e. as a deliberate decision by Arch, revealing bugs in many packages that have long existed), and if Google picks the canonical resolution quickly, I don't think any harm is done - and in the long run, it will smooth the migration for everybody else. Regards, Martin From glyph at twistedmatrix.com Thu Nov 4 21:25:47 2010 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Thu, 4 Nov 2010 16:25:47 -0400 Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages) In-Reply-To: References: Message-ID: On Nov 4, 2010, at 12:49 PM, Guido van Rossum wrote: > What's the attack you're thinking of on marshal? It never executes any > code while unmarshalling (although it can unmarshal code objects -- > but the receiving program has to do something additionally to execute > those). These issues may have been fixed now, but a long time ago I recall seeing some nasty segfaults which looked exploitable when feeding marshal malformed data. If they still exist, running a fuzzer on some pyc files should reveal them pretty quickly. When I ran across them I didn't think much of them, and probably did not even report the bug, since marshal is mostly used to load code anyway, which is implicitly trusted. -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From doko at ubuntu.com Thu Nov 4 22:08:07 2010
From: doko at ubuntu.com (Matthias Klose)
Date: Thu, 04 Nov 2010 22:08:07 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <4CD313B9.8070202@v.loewis.de>
References: <4CD23A19.6080002@archlinux.org> <4CD313B9.8070202@v.loewis.de>
Message-ID: <4CD320B7.70100@ubuntu.com>

On 04.11.2010 21:12, "Martin v. Löwis" wrote:
>> To clarify (but I dont speak for the rest of #python, just myself), I
>> think the move was premature, but I don't use Arch and I don't know what
>> typical Arch users expect. The reason I think it's premature is that
>> 'python2' just doesn't work everywhere, and I would have gone for a
>> transitionary period where '/usr/bin/python' is something that screams
>> loudly that it shouldn't be used before it executes 'python2'.

Iirc, it was an explicit decision made at the 2009 language summit not to
introduce a python2 symlink, but using python3 for python3.x instead.
Debian/Ubuntu don't ship a python2 symlink by intent. Did the plans change,
i.e. are there plans to provide a python symlink for python 3.x altinstall
in a future release, e.g in 3.4 or 3.5?

Matthias
From thomas at python.org Thu Nov 4 22:21:18 2010
From: thomas at python.org (Thomas Wouters)
Date: Thu, 4 Nov 2010 22:21:18 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <4CD313B9.8070202@v.loewis.de>
References: <4CD23A19.6080002@archlinux.org> <4CD313B9.8070202@v.loewis.de>
Message-ID:

On Thu, Nov 4, 2010 at 21:12, "Martin v. Löwis" wrote:
> > As for #python, well, we got this storm of people utterly confused about
> > how their stuff doesn't work anymore, and putting the blame in the wrong
> > place. I don't think a distribution should ever cause that (even though
> > many do in lesser ways) -- but as I said, I don't use Arch so maybe I
> > don't understand the purpose of it.
The complaints seem to have died > > down now (though possibly because of the 'no arch' topic :) > > So apparently, there is quite a number of Arch users, and they do make > bug reports. Good :-) > I don't know that they do. I just know that people came to #python and complained, which is unfortunately something completely different. (We did ask every single one to take it up with the right forum, and I know at least one person did file a bug, but that's about it.) -- Thomas Wouters Hi! I'm a .signature virus! copy me into your .signature file to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu Nov 4 22:24:09 2010 From: guido at python.org (Guido van Rossum) Date: Thu, 4 Nov 2010 14:24:09 -0700 Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages) In-Reply-To: References: Message-ID: On Thu, Nov 4, 2010 at 1:25 PM, Glyph Lefkowitz wrote: > On Nov 4, 2010, at 12:49 PM, Guido van Rossum wrote: > > What's the attack you're thinking of on marshal? It never executes any > code while unmarshalling (although it can unmarshal code objects -- > but the receiving program has to do something additionally to execute > those). > > These issues may have been fixed now, but a long time ago I recall seeing > some nasty segfaults which looked exploitable when feeding marshal malformed > data. ?If they still exist, running a fuzzer on some pyc files should reveal > them pretty quickly. > > When I ran across them I didn't think much of them, and probably did not > even report the bug, since marshal is mostly used to load code anyway, which > is implicitly trusted. I'm not sure that all these were fixed but it would be a finite (and probably small) amount of work to get it fixed -- unlike fixing pickling, which is impossible (unless you implemented some kind of sandboxing solution :-). A good use for pickling is when it's optional. Example: putting pickles in memcache. 
The source of the pickles is (presumably) trusted, so the only remaining problem is occasional version skew. If the unpickling fails it can just be treated as a cache miss. (Tricky: when unpickling succeeds but returns a broken object. "Nobody's perfect." :-) -- --Guido van Rossum (python.org/~guido) From lvh at laurensvh.be Thu Nov 4 23:40:46 2010 From: lvh at laurensvh.be (Laurens Van Houtven) Date: Thu, 4 Nov 2010 23:40:46 +0100 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: <4CD23A19.6080002@archlinux.org> References: <4CD23A19.6080002@archlinux.org> Message-ID: On Thu, Nov 4, 2010 at 5:44 AM, Allan McRae wrote: > According to #python, we are all idiots.... I realize this is not really what your message was about and for sake of brevity you used a bit of a hyperbole, but like Thomas I would still like to nip in right there. #python is a pretty big channel. I think everyone understands that reducing it in its entirety to a single opinion as inflammatory as "you're all idiots" is at best oversimplifying and at worst offensive. (FWIW, Thomas has already said a bunch of stuff I completely agree with, so +1 everything he said.) What is true is that there's a new and temporary "NO ARCH" rule in the topic, and it's the for the same reason there's a "NO LOL" in the topic: to keep the signal to noise ratio high. Apparently there is a large number of packages (or perhaps just commonly used ones) either in Arch itself or AUR that didn't work anymore. This caused a lot of people to complain about problems that are actually Arch-specific problems: not really something #python is there for nor something it is good at helping with. That wouldn't be helping people with Python, that would be helping people with Arch. It is not intended as, and should not be interpreted as, some kind of public "declaration of war" against Arch. 
It simply means that #python isn't going to do Arch-specific support for packages that no longer work after an update, since that's not our job nor expertise. I don't think grudges or misunderstandings help anyone, and Python in particular, further along. I think I've demonstrated that I'm eager to get rid of them before. If you (or anyone else for that matter) are worried about behavior or policy in #python in the future (I assure you there's really not as much as people generally seem to think there is) and would like clarification, there's an easy way to access a list of the ops: /msg chanserv access #python list Or just shout "are there any ops on" in #python whenever you like. These people should be able to tell you what you want to know or at least point you to the right person to ask. But basically, to reiterate a point I've made a bunch of times and have already made (not to you in particular, just in general): #python is a bunch of people, please don't extrapolate the opinions of a few to the opinions of many. It's easy and tempting, but it often leads to demonizing a bunch of people and putting words in people's mouths which they didn't say or even agree with. cheers and good luck lvh From allan at archlinux.org Fri Nov 5 01:19:59 2010 From: allan at archlinux.org (Allan McRae) Date: Fri, 05 Nov 2010 10:19:59 +1000 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: References: <4CD23A19.6080002@archlinux.org> Message-ID: <4CD34DAF.2040601@archlinux.org> On 05/11/10 08:40, Laurens Van Houtven wrote: > On Thu, Nov 4, 2010 at 5:44 AM, Allan McRae wrote: >> According to #python, we are all idiots.... > > I realize this is not really what your message was about and for sake > of brevity you used a bit of a hyperbole, but like Thomas I would > still like to nip in right there. #python is a pretty big channel. 
I > think everyone understands that reducing it in its entirety to a > single opinion as inflammatory as "you're all idiots" is at best > oversimplifying and at worst offensive. Of course, and I was not intending to offend here. It was more of a running commentary on the unintended influx of Arch Linux users to the channel and some of the responses posted to them (some of which I found rather amusing when forwarded to me - especially the early response as people were figuring out what was going on). I also agree with the "NO ARCH" topic at the moment. I was fairly surprised so many people went to #python for help given we had made news posts and had a topic in our IRC channel pointing to how to start fixing issues. Allan From jcea at jcea.es Fri Nov 5 01:26:31 2010 From: jcea at jcea.es (Jesus Cea) Date: Fri, 05 Nov 2010 01:26:31 +0100 Subject: [Python-Dev] Help with warnings not being raised Message-ID: <4CD34F37.2020504@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi all. I just committed r86180, but there is something I don't like. If you read the tests I did (by hand)at http://bugs.python.org/issue9675#msg120462 , python should show the unraisable and THEN the "C API unavailable" warning, but it is not showing the warning. I don't know why. I have committed the patch because it solves the original bug, but I am pretty uncomfy not knowing what Python is not doing exactly what I want... Any idea?. Sorry for wasting your time with probably trivial stuff, but I need to know... :-? PS: I am using "PyErr_Warn()", that is deprecated, because this code should work in Python 2.3 too. I tried "PyErr_WarnEx()" too, it didn't work either. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . 
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTNNPN5lgi5GaxT1NAQKt9gP/VbS1Fycg7SBBy9nTFhvw7NOAOSnxM8Dt B0Wq/I9Rnr+YOYGpCIvgVut8CuqT3oVRtRPeBnajjinEo7rcZSi79rQlcMcNq1VS JwQELp9bd3Az5Xmbpf+FeKNBE8K+1bpezAcGHv/QTsPXSIsU+fTH1sKwXoj9S4Vg CMUXTP9InkE= =6LVa -----END PGP SIGNATURE----- From benjamin at python.org Fri Nov 5 01:36:25 2010 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 4 Nov 2010 19:36:25 -0500 Subject: [Python-Dev] Help with warnings not being raised In-Reply-To: <4CD34F37.2020504@jcea.es> References: <4CD34F37.2020504@jcea.es> Message-ID: 2010/11/4 Jesus Cea : > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi all. I just committed r86180, but there is something I don't like. > > If you read the tests I did (by hand)at > http://bugs.python.org/issue9675#msg120462 , python should show the > unraisable and THEN the "C API unavailable" warning, but it is not > showing the warning. > > I don't know why. Are you passing -3 -Wall? -- Regards, Benjamin From marc at gsites.de Fri Nov 5 01:21:41 2010 From: marc at gsites.de (Marcel Hellkamp) Date: Fri, 05 Nov 2010 01:21:41 +0100 Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages) In-Reply-To: References: Message-ID: <4CD34E15.80408@gsites.de> Am 04.11.2010 17:15, schrieb anatoly techtonik: > pickle is insecure, marshal too. 
If the transport or storage layer is not safe, you should
cryptographically sign the data anyway::

    def pickle_encode(data, key):
        msg = base64.b64encode(pickle.dumps(data, -1))
        sig = base64.b64encode(hmac.new(key, msg).digest())
        return sig + ':' + msg

    def pickle_decode(data, key):
        if data and ':' in data:
            sig, msg = data.split(':', 1)
            if sig == base64.b64encode(hmac.new(key, msg).digest()):
                return pickle.loads(base64.b64decode(msg))
        raise pickle.UnpicklingError("Wrong or missing signature.")

Bottle (a web framework) uses a similar approach to store non-string
data in client-side cookies. I don't see a (security) problem here.

--
Mit freundlichen Grüßen
Marcel Hellkamp
From stephen at xemacs.org Fri Nov 5 01:43:30 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 05 Nov 2010 09:43:30 +0900
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To:
References: <4CD23A19.6080002@archlinux.org>
Message-ID: <877hgsior1.fsf@uwakimon.sk.tsukuba.ac.jp>

Thomas Wouters writes:

> To clarify (but I dont speak for the rest of #python, just myself), I think
> the move was premature, but I don't use Arch and I don't know what typical
> Arch users expect.

All of the Arch users I know expect Arch to occasionally do radical
things because they're the right things to do in the long run. But
every avant garde distribution picks up its share of wannabes who
don't understand how the process works.

> The reason I think it's premature is that 'python2' just doesn't
> work everywhere, and I would have gone for a transitionary period
> where '/usr/bin/python' is something that screams loudly that it
> shouldn't be used before it executes 'python2'.

This is unrealistic. It would seriously annoy Arch's intended
audience. (Eg, recently I've become a lot more favorable to using
Word instead of OOo because Word doesn't pop up a useless warning
every time I save a .doc file.)
Practically speaking, it would have to be off by default, like Python pending deprecation warnings. Anyway, I bet that anybody capable of upgrading their *Arch* packages and complaining to *#python* about resulting breakage would be capable of complaining to #python about the weird warning about python2. And you can't have a NO /USR/BIN/PYTHON topic, can you? > As for #python, well, we got this storm of people utterly confused > about how their stuff doesn't work anymore, and putting the blame > in the wrong place. How so? Ultimately, Guido is responsible for this. Sure, the immediate symptom was caused by Arch's action, but Python 3 *is* rather incompatible with Python 2. You're going to get a storm every time a distro changes, and in a year or two, it's no longer going to be something you can dispose of by setting a hotkey to "Google for 'BOGUS Linux python'" -- it's going to be stuff that requires a real understanding of how Python 3 differs from Python 2, and often will be pretty subtle. > I don't think a distribution should ever cause that (even though > many do in lesser ways) Sure, and Guido should have exercised the Time Machine a little harder so that Python 3 never needed to happen. IOW, this is the price of success and wide distribution. BTW, I hope the next distribution make the jump does try your suggestion to make /usr/bin/python scream. It might work, even work well. From devin.c.cook at gmail.com Fri Nov 5 01:52:05 2010 From: devin.c.cook at gmail.com (Devin Cook) Date: Thu, 4 Nov 2010 19:52:05 -0500 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: <4CD34DAF.2040601@archlinux.org> References: <4CD23A19.6080002@archlinux.org> <4CD34DAF.2040601@archlinux.org> Message-ID: On Thu, Nov 4, 2010 at 7:19 PM, Allan McRae wrote: > I also agree with the "NO ARCH" topic at the moment. 
I was fairly surprised > so many people went to #python for help given we had made news posts and had > a topic in our IRC channel pointing to how to start fixing issues. > > Allan I don't remember seeing any warning about it during the upgrade. That may have helped people (ones that read the warnings, at least) figure out what was going on. I think a warning from /usr/bin/python may have helped as well, but I do suppose might be a bit extreme. FWIW, I found those news posts and the Python wiki page pretty quickly after I realized my scripts weren't working anymore. -Devin From steve at pearwood.info Fri Nov 5 01:56:05 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 05 Nov 2010 11:56:05 +1100 Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages) In-Reply-To: References: Message-ID: <4CD35625.8050206@pearwood.info> Nick Coghlan wrote: > As a tool for communicating between different instances of the *same* > version of Python though, pickle is fine. I'm using pickle to pass a list and dict of floats and strings from Python 2.6 to 3.1. I've never had any problems with it. Am I living in a state of sin or is that okay? -- Steven From foom at fuhm.net Fri Nov 5 02:04:53 2010 From: foom at fuhm.net (James Y Knight) Date: Thu, 4 Nov 2010 21:04:53 -0400 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: <877hgsior1.fsf@uwakimon.sk.tsukuba.ac.jp> References: <4CD23A19.6080002@archlinux.org> <877hgsior1.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Nov 4, 2010, at 8:43 PM, Stephen J. Turnbull wrote: > All of the Arch users I know expect Arch to occasionally do radical > things because they're the right things to do in the long run. But the previous consensus (at least, as I, and presumably many other people understood it) was that python2 would remain the owner of the name "/usr/bin/python" for the indefinite future, and python3 would be invoked with /usr/bin/python3. 
Given that, it's not at all clear that Arch's actions are the right thing to do. IMO, moving away from that consensus should've been brought up on python-dev rather than just one distro just doing it all alone, causing incompatibilities and annoyance. If python-dev wants python3 to inherit the name /usr/bin/python, then python2 should've been installing a binary called /usr/bin/python2 for a couple years ahead of time, and recommending that everyone use that in their #! lines, so that the switch could've been done without breaking everything... > Sure, and Guido should have exercised the Time Machine a little harder > so that Python 3 never needed to happen. IOW, this is the price of > success and wide distribution. Well, other programming languages seem to have avoided making sweeping bidirectionally-incompatible changes, despite being successful and widely distributed. But that's a whole other discussion. James From jcea at jcea.es Fri Nov 5 02:12:55 2010 From: jcea at jcea.es (Jesus Cea) Date: Fri, 05 Nov 2010 02:12:55 +0100 Subject: [Python-Dev] Help with warnings not being raised In-Reply-To: References: <4CD34F37.2020504@jcea.es> Message-ID: <4CD35A17.6080704@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/11/10 01:36, Benjamin Peterson wrote: >> I don't know why. > > Are you passing -3 -Wall? I am passing "-3 -Werror", to induce the error control I have committed. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . 
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTNNaF5lgi5GaxT1NAQJt0wQAk29q0qXXDzBINyamo7I3ehD75U185165 UXYvCw987N74ye+CyUfpkTENjFkxY+cUMEmTmB0N/mhGblveHFLbaC0Kz831SWGM OdNDi6tBQB1CpyxeOyhQN4e5NzabljoFc7XuLh32rbYY15dqYnZXYgUaXZ+8W84t DXiK08P0TrU= =g732 -----END PGP SIGNATURE----- From jcea at jcea.es Fri Nov 5 02:20:56 2010 From: jcea at jcea.es (Jesus Cea) Date: Fri, 05 Nov 2010 02:20:56 +0100 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: <4CD23A19.6080002@archlinux.org> References: <4CD23A19.6080002@archlinux.org> Message-ID: <4CD35BF8.7030501@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 04/11/10 05:44, Allan McRae wrote: > The second case was particularly interesting. These software would > change some of their #! to point at the python2 symlink and leave the > rest pointing at python. Note that python-2.7 itself falls into this > category as many files in /usr/lib/python2.7 still have "#!/usr/bin/env > python" even when installed with "make altinstall". I can not remember > the exact details, but I recall that some of these files were installed > with executable permissions which would be bad, but I need to look into > this again now things have calmed down... PLEASE, open a bug with this. It is a serious bug. "make altinstall" *SHOULD* be respected. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . 
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTNNb+Jlgi5GaxT1NAQIScgP+NFE98E98rFOMh85RJBdFT3U3nwWNktz8 2uYBPQq5p28eQ6LY6TNkv4/iSIoF++o40xuveuuL+1ys7I/QRne0P/Wipspr2eLZ oOMkwNVfrYnaX0MV/pu750uqsh62dQZIqxe9oWtD4FS00gHgqtfIvlI/EZYkVy0m WwzT9zBo5Lw= =fQx7 -----END PGP SIGNATURE----- From jcea at jcea.es Fri Nov 5 02:31:05 2010 From: jcea at jcea.es (Jesus Cea) Date: Fri, 05 Nov 2010 02:31:05 +0100 Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages) In-Reply-To: References: Message-ID: <4CD35E59.9050008@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 04/11/10 15:57, Alexander Belopolsky wrote: > On Thu, Nov 4, 2010 at 10:51 AM, Guido van Rossum wrote: > .. >>>> Twisted actually tried to preserve pickle compatibility in the bad old days, >>>> but it was impossible. Pickles should never really be saved to disk unless >>>> they contain nothing but lists, ints, strings, and dicts. >> >> But *that* set of types can safely be marshalled using the marshal module... > > Not if the instances contain reference cycles. Moreover, in the docs the marshall module EXPLICITLY says that the format is undocumented on purpose, and subject to change. Seems a pretty bad option for persistence, if you expect to read your data back in the future. http://docs.python.org/library/marshal.html """ This module contains functions that can read and write Python values in a binary format. The format is specific to Python, but independent of machine architecture issues (e.g., you can write a Python value to a file on a PC, transport the file to a Sun, and read it back there). 
Details of the format are undocumented on purpose; it may change between Python versions (although it rarely does). """ - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . _/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTNNeWZlgi5GaxT1NAQLU/wP/aZOJkgYWHyYMUT0diDsh+pFg6TxDPWMu /cNx5l6wNaW8DQ5cuMSHkelfVYpx6EQwTCZPu9jiAAOJmFfNURt1Q+P4ikf5eobA 7mRlFrr+C3Lmi9CA3thuwBh4IkLHUl3mk6eQ0mqJPzpdbJLWhPmkOEN7L31nk9// YQHdU4e795U= =+cB0 -----END PGP SIGNATURE----- From steve at pearwood.info Fri Nov 5 02:40:10 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 05 Nov 2010 12:40:10 +1100 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: References: <4CD23A19.6080002@archlinux.org> <877hgsior1.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CD3607A.3090701@pearwood.info> James Y Knight wrote: > But the previous consensus (at least, as I, and presumably many other > people understood it) was that python2 would remain the owner of the > name "/usr/bin/python" for the indefinite future, and python3 would > be invoked with /usr/bin/python3. > > Given that, it's not at all clear that Arch's actions are the right > thing to do. The time will come when even Python 2.7 is long obsolete. I think it is silly to insist that people invoke python3 to run their Python 3.7 scripts. Arch might be jumping the gun a little, or even a lot, but sooner or later it should be done. Besides, this is another sign that the Python 3 haters are wrong. We now have a distro that has made Python 3 the standard system python. 
It might be a bleeding-edge distro not recommended for non-experts, but
it's still pretty cool that *somebody* has done it.

> IMO, moving away from that consensus should've been brought up on
> python-dev rather than just one distro just doing it all alone,
> causing incompatibilities and annoyance.

We're all adults here. If Arch wants to live on the bleeding edge, more
power to them. That's why my server runs Centos :)

--
Steven
From exarkun at twistedmatrix.com Fri Nov 5 05:09:35 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Fri, 05 Nov 2010 04:09:35 -0000
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages)
In-Reply-To: <4CD34E15.80408@gsites.de>
References: <4CD34E15.80408@gsites.de>
Message-ID: <20101105040935.2040.490777150.divmod.xquotient.622@localhost.localdomain>

On 12:21 am, marc at gsites.de wrote:
>Am 04.11.2010 17:15, schrieb anatoly techtonik:
> > pickle is insecure, marshal too.
>
>If the transport or storage layer is not safe, you should
>cryptographically sign the data anyway::
>
>    def pickle_encode(data, key):
>        msg = base64.b64encode(pickle.dumps(data, -1))
>        sig = base64.b64encode(hmac.new(key, msg).digest())
>        return sig + ':' + msg
>
>    def pickle_decode(data, key):
>        if data and ':' in data:
>            sig, msg = data.split(':', 1)
>            if sig == base64.b64encode(hmac.new(key, msg).digest()):
>                return pickle.loads(base64.b64decode(msg))
>        raise pickle.UnpicklingError("Wrong or missing signature.")
>
>Bottle (a web framework) uses a similar approach to store non-string
>data in client-side cookies. I don't see a (security) problem here.

Your pickle_decode leaks information about the key. An attacker will
eventually (a few seconds to a few minutes, depending on how they have
access to this system) be able to determine your key and send you
arbitrary pickles (ie, execute arbitrary code on your system).

Oops.

This stuff is hard.
If you're going to mess around with it, make sure you're *serious* (better approach: don't mess around with it).

Jean-Paul

From bob at redivi.com Fri Nov 5 05:21:57 2010
From: bob at redivi.com (Bob Ippolito)
Date: Fri, 5 Nov 2010 12:21:57 +0800
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages)
In-Reply-To: <20101105040935.2040.490777150.divmod.xquotient.622@localhost.localdomain>
References: <4CD34E15.80408@gsites.de> <20101105040935.2040.490777150.divmod.xquotient.622@localhost.localdomain>
Message-ID:

On Friday, November 5, 2010, wrote:
> On 12:21 am, marc at gsites.de wrote:
>
> On 04.11.2010 17:15, anatoly techtonik wrote:
>> pickle is insecure, marshal too.
>
> If the transport or storage layer is not safe, you should cryptographically sign the data anyway::
>
>    def pickle_encode(data, key):
>        msg = base64.b64encode(pickle.dumps(data, -1))
>        sig = base64.b64encode(hmac.new(key, msg).digest())
>        return sig + ':' + msg
>
>    def pickle_decode(data, key):
>        if data and ':' in data:
>            sig, msg = data.split(':', 1)
>            if sig == base64.b64encode(hmac.new(key, msg).digest()):
>                return pickle.loads(base64.b64decode(msg))
>        raise pickle.UnpicklingError("Wrong or missing signature.")
>
> Bottle (a web framework) uses a similar approach to store non-string data in client-side cookies. I don't see a (security) problem here.
>
>
> Your pickle_decode leaks information about the key. An attacker will eventually (a few seconds to a few minutes, depending on how they have access to this system) be able to determine your key and send you arbitrary pickles (ie, execute arbitrary code on your system).
>
> Oops.
>
> This stuff is hard. If you're going to mess around with it, make sure you're *serious* (better approach: don't mess around with it).

Specifically, you need to use constant-time signature verification, or else timing attacks are possible.
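
[A minimal sketch of what constant-time verification could look like, assuming a Python version that provides hmac.compare_digest (it was added to the standard library after this thread). The function names sign and verify, the SHA-256 digest, and the raw-bytes payload are illustrative choices, not part of the code quoted above; the payload is deliberately left as bytes so nothing is unpickled here.]

```python
import base64
import hashlib
import hmac


def sign(payload: bytes, key: bytes) -> bytes:
    """Return b"<signature>:<message>", both parts base64-encoded."""
    msg = base64.b64encode(payload)
    sig = base64.b64encode(hmac.new(key, msg, hashlib.sha256).digest())
    return sig + b":" + msg


def verify(token: bytes, key: bytes) -> bytes:
    """Return the original payload, or raise ValueError on a bad signature."""
    # The base64 alphabet contains no ':', so the first ':' is the separator.
    sig, sep, msg = token.partition(b":")
    if not sep:
        raise ValueError("missing signature")
    expected = base64.b64encode(hmac.new(key, msg, hashlib.sha256).digest())
    # compare_digest is designed so that its running time does not reveal
    # where the two values first differ; a plain == may return early.
    if not hmac.compare_digest(sig, expected):
        raise ValueError("wrong signature")
    return base64.b64decode(msg)
```

[With these definitions, verify(sign(b"hello", key), key) gives back b"hello", and a token signed with a different key is rejected with ValueError.]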
Sounds like something a hmac module should provide in the first place. But yeah, this stuff is hard, better to just not have a code execution hole in the first place.

-bob

From mcrae_allan at hotmail.com Fri Nov 5 04:14:02 2010
From: mcrae_allan at hotmail.com (Allan McRae)
Date: Fri, 05 Nov 2010 13:14:02 +1000
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <4CD35BF8.7030501@jcea.es>
References: <4CD23A19.6080002@archlinux.org> <4CD35BF8.7030501@jcea.es>
Message-ID: <4CD3767A.4000603@hotmail.com>

On 05/11/10 11:20, Jesus Cea wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 04/11/10 05:44, Allan McRae wrote:
>> The second case was particularly interesting. These software would
>> change some of their #! to point at the python2 symlink and leave the
>> rest pointing at python. Note that python-2.7 itself falls into this
>> category as many files in /usr/lib/python2.7 still have "#!/usr/bin/env
>> python" even when installed with "make altinstall". I cannot remember
>> the exact details, but I recall that some of these files were installed
>> with executable permissions which would be bad, but I need to look into
>> this again now things have calmed down...
>
> PLEASE, open a bug with this. It is a serious bug. "make altinstall"
> *SHOULD* be respected.
>

Done: http://bugs.python.org/issue10318

Allan

From thomas at python.org Fri Nov 5 09:47:18 2010
From: thomas at python.org (Thomas Wouters)
Date: Fri, 5 Nov 2010 09:47:18 +0100
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To: <877hgsior1.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <4CD23A19.6080002@archlinux.org> <877hgsior1.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

On Fri, Nov 5, 2010 at 01:43, Stephen J. Turnbull wrote:
> Thomas Wouters writes:
>
> > To clarify (but I don't speak for the rest of #python, just myself), I think
> > the move was premature, but I don't use Arch and I don't know what typical
> > Arch users expect.
> > All of the Arch users I know expect Arch to occasionally do radical > things because they're the right things to do in the long run. But > every avant garde distribution picks up its share of wannabes who > don't understand how the process works. > > > The reason I think it's premature is that 'python2' just doesn't > > work everywhere, and I would have gone for a transitionary period > > where '/usr/bin/python' is something that screams loudly that it > > shouldn't be used before it executes 'python2'. > > This is unrealistic. It would seriously annoy Arch's intended > audience. (Eg, recently I've become a lot more favorable to using > Word instead of OOo because Word doesn't pop up a useless warning > every time I save a .doc file.) Practically speaking, it would have > to be off by default, like Python pending deprecation warnings. > Wait, what? Warning about impending brokenness is *more annoying* than just plain breaking? How on earth would the warning be "useless"? Keep in mind that the warning would only show up *if stuff would otherwise not work*. Anyway, I bet that anybody capable of upgrading their *Arch* packages > and complaining to *#python* about resulting breakage would be capable > of complaining to #python about the weird warning about python2. And > you can't have a NO /USR/BIN/PYTHON topic, can you? > Any change is disruptive. My comment wasn't about the crowd of people visiting #python and complaining, it was about the decision to change /usr/bin/python, and how it was done. However, a warning with a clear description -- for example, a link to a webpage explaining the situation -- would most assuredly have prevented many people from coming to #python in desperation. They might still have *complained*, in #python or elsewhere, but it would have been a lot clearer. > > > As for #python, well, we got this storm of people utterly confused > > about how their stuff doesn't work anymore, and putting the blame > > in the wrong place. > > How so? 
> Ultimately, Guido is responsible for this. Sure, the immediate symptom
> was caused by Arch's action, but Python 3 *is*
> rather incompatible with Python 2. You're going to get a storm every
> time a distro changes, and in a year or two, it's no longer going to
> be something you can dispose of by setting a hotkey to "Google for
> 'BOGUS Linux python'" -- it's going to be stuff that requires a real
> understanding of how Python 3 differs from Python 2, and often will be
> pretty subtle.
>
> > I don't think a distribution should ever cause that (even though
> > many do in lesser ways)
>
> Sure, and Guido should have exercised the Time Machine a little harder
> so that Python 3 never needed to happen. IOW, this is the price of
> success and wide distribution.
>

No, that's not my point at all. The problem isn't that Python 3 is incompatible with Python 2. The problem is that stuff broke without (apparently) fair warning. This isn't a Python thing, this is a distribution thing: for users of a distribution, having a clear, usable migration path for incompatible changes is *important*. For users, not packagers, this means you have to slap them in the face with upcoming incompatible changes, or they won't notice.

It may not be important for Arch, or for the users Arch expects to have, but it sure as hell is important to me and every sysadmin I know :)

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From allan at archlinux.org Fri Nov 5 11:25:35 2010
From: allan at archlinux.org (Allan McRae)
Date: Fri, 05 Nov 2010 20:25:35 +1000
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To:
References: <4CD23A19.6080002@archlinux.org> <877hgsior1.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4CD3DB9F.3070407@archlinux.org>

On 05/11/10 18:47, Thomas Wouters wrote:
> No, that's not my point at all.
> The problem isn't that Python 3 is
> incompatible with Python 2. The problem is that stuff broke without
> (apparently) fair warning.

Just to clarify (and going way off topic for this list...), this was discussed on the Arch Linux mailing lists around six months in advance, then again about two months beforehand when the rebuild started. Then it sat in our testing repository for a month where issues were discussed on our mailing lists and forums. Also a news post was made on our website front page before moving it into our main repos.

With a rolling release distro we do not have the luxury of making release notes every major release so we make it abundantly clear to our users that we expect them to at least always read the front page news before updating. There are even wrapper scripts for our package manager that print the news headlines before updating.

So there was warning. As always, it was just ignored by a portion of our users.

Allan

From ncoghlan at gmail.com Fri Nov 5 13:55:58 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 5 Nov 2010 22:55:58 +1000
Subject: [Python-Dev] Help with warnings not being raised
In-Reply-To: <4CD35A17.6080704@jcea.es>
References: <4CD34F37.2020504@jcea.es> <4CD35A17.6080704@jcea.es>
Message-ID:

On Fri, Nov 5, 2010 at 11:12 AM, Jesus Cea wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 05/11/10 01:36, Benjamin Peterson wrote:
>>> I don't know why.
>>
>> Are you passing -3 -Wall?
>
> I am passing "-3 -Werror", to induce the error control I have committed.

Under -We, PyErr_Warn raises an exception rather than printing to stderr. That exception is clobbered by the immediately following call to PyErr_Clear. Since you *only* hit that branch under -We in the first place, a second call to PyErr_WriteUnraisable should get the error to actually print out.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Fri Nov 5 14:18:45 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 5 Nov 2010 23:18:45 +1000
Subject: [Python-Dev] Pickle alternative in stdlib (Was: On breaking modules into packages)
In-Reply-To: <4CD35625.8050206@pearwood.info>
References: <4CD35625.8050206@pearwood.info>
Message-ID:

On Fri, Nov 5, 2010 at 10:56 AM, Steven D'Aprano wrote:
> Nick Coghlan wrote:
>
>> As a tool for communicating between different instances of the *same*
>> version of Python though, pickle is fine.
>
> I'm using pickle to pass a list and dict of floats and strings from Python
> 2.6 to 3.1. I've never had any problems with it. Am I living in a state of
> sin or is that okay?

Builtins are generally fine, and we do try reasonably hard to keep the pickle formats properly compatible across versions. It's corner cases (like pickling unittest objects) that may sometimes break, since pickle implicitly depends on things that *should* be disregarded as implementation details.

Specifically, without explicit directions to do otherwise, pickle encodes objects based on what they *are*, which may include implementation details, such as optional acceleration modules, platform specific variants of classes returned by a factory function, etc. Technically such things are bugs in an object's pickling support, but they're *really* non-obvious (and genuinely harmless in most cases).

As I see it, there are at least 3 levels of pickling support:

1. Complete, version independent (implementation details are weeded out from the pickle, or deliberately kept the same across versions to preserve pickle compatibility)
2. Partial, potentially version dependent (pickles may be infected with implementation details that affect cross-version compatibility if they happen to change)
3. None (can't even pickle it in the first place)

Builtins are in category 1, but there are plenty of things in the standard library (like unittest classes) that rely on default pickling behaviour and hence fit into category 2 (we just very, very rarely move anything around, so such classes may as well be in category 1 most of the time).

Notably, this mostly causes problems when reading pickles generated with a *new* version of Python in an *old* version. When going the other way, we can adjust the unpickling process to cope with any discrepancies (for the "relying on implementation details" case, usually by the simple expedient of keeping both sets of names around).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Fri Nov 5 15:17:51 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 6 Nov 2010 00:17:51 +1000
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To:
References: <4CD23A19.6080002@archlinux.org>
Message-ID:

On Thu, Nov 4, 2010 at 11:59 PM, Benjamin Peterson wrote:
> 2010/11/4 Nick Coghlan :
>> Tools also had a few discrepancies:
>>  scripts/2to3.py: /usr/bin/env python (necessary, I think - I believe
>> 2to3 is a 2.x only program)
>
> No, I believe distutils is supposed to patch that up, though.

Yeah, I did a more thorough grep and the ready-to-install version of 2to3.py has a correctly updated shebang line.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From stephen at xemacs.org Fri Nov 5 17:09:38 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 06 Nov 2010 01:09:38 +0900
Subject: [Python-Dev] Python-3 transition in Arch Linux
In-Reply-To:
References: <4CD23A19.6080002@archlinux.org> <877hgsior1.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <87sjzfhhvh.fsf@uwakimon.sk.tsukuba.ac.jp>

Thomas Wouters writes:

> > This is unrealistic. It would seriously annoy Arch's intended
> > audience.
(Eg, recently I've become a lot more favorable to using > > Word instead of OOo because Word doesn't pop up a useless warning > > every time I save a .doc file.) Practically speaking, it would have > > to be off by default, like Python pending deprecation warnings. > > Wait, what? Warning about impending brokenness is *more annoying* than just > plain breaking? How on earth would the warning be "useless"? > Keep in mind that the warning would only show up *if stuff would otherwise > not work*. As I understood it, what you proposed was that in a *Python 2-based* distribution thinking about switching to Python 3 as the default /usr/bin/python, they should first substitute a bitch'n'run-python2 script for the python (Python 2) binary, and after that works the bugs out, switch to Python 3. In that scenario, the bitching is useful *exactly* once: the first time anybody reports the bug to someone who can do something about it. But for some time, *every time* you run your app, it bitches uselessly: it would work fine if you just install Python 2 as /usr/bin/python, without bitching. That's not very graceful. And "some time" will often stretch into weeks or months for any given user, since few distros will bless a new package with zero testing. > No, that's not my point at all. The problem isn't that Python 3 is > incompatible with Python 2. The problem is that stuff broke without > (apparently) fair warning. Warning was given; they weren't listening. From thomas at python.org Fri Nov 5 17:58:30 2010 From: thomas at python.org (Thomas Wouters) Date: Fri, 5 Nov 2010 17:58:30 +0100 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: <87sjzfhhvh.fsf@uwakimon.sk.tsukuba.ac.jp> References: <4CD23A19.6080002@archlinux.org> <877hgsior1.fsf@uwakimon.sk.tsukuba.ac.jp> <87sjzfhhvh.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Fri, Nov 5, 2010 at 17:09, Stephen J. Turnbull wrote: > Thomas Wouters writes: > > > > This is unrealistic. 
It would seriously annoy Arch's intended > > > audience. (Eg, recently I've become a lot more favorable to using > > > Word instead of OOo because Word doesn't pop up a useless warning > > > every time I save a .doc file.) Practically speaking, it would have > > > to be off by default, like Python pending deprecation warnings. > > > > Wait, what? Warning about impending brokenness is *more annoying* than > just > > plain breaking? How on earth would the warning be "useless"? > > Keep in mind that the warning would only show up *if stuff would > otherwise > > not work*. > > As I understood it, what you proposed was that in a *Python 2-based* > distribution thinking about switching to Python 3 as the default > /usr/bin/python, they should first substitute a bitch'n'run-python2 > script for the python (Python 2) binary, and after that works the bugs > out, switch to Python 3. > > In that scenario, the bitching is useful *exactly* once: the first > time anybody reports the bug to someone who can do something about it. > But for some time, *every time* you run your app, it bitches > uselessly: it would work fine if you just install Python 2 as > /usr/bin/python, without bitching. That's not very graceful. And > "some time" will often stretch into weeks or months for any given > user, since few distros will bless a new package with zero testing. > No, what I suggested was that *instead of changing /usr/bin/python to Python 3*, it would produce a warning. So, as before, change everything you know about to python2. Keep everything that is python3 using python3. Change /usr/bin/python, which *should* now be unused, to something that complains. Since all the distribution-installed packages were changed, the only warnings will come from invocations that would otherwise have spectacularly and possibly quite confusingly blown up. As I said, the warning can provide clear instructions on updating the software. 
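
[A rough sketch of the kind of complaining wrapper described above, under stated assumptions: the interpreter path /usr/bin/python2, the example.org URL, and the function names are placeholders for illustration, not anything a distribution actually shipped. It only shows the idea of warning on stderr and then handing the unchanged command line to Python 2.]

```python
import os
import sys

# Assumptions for illustration only: both values are placeholders.
REAL_INTERPRETER = "/usr/bin/python2"
INFO_URL = "https://example.org/python-transition"


def build_warning(argv):
    """Return the message shown before handing over to Python 2."""
    script = argv[1] if len(argv) > 1 else "<interactive>"
    return (
        "warning: %s was started via /usr/bin/python, which no longer "
        "names a specific Python version.\n"
        "Running it with %s for now; please change its #! line to "
        "python2 or python3.\nDetails: %s\n"
        % (script, REAL_INTERPRETER, INFO_URL)
    )


def build_command(argv):
    """Return the argv used to re-run the same invocation under Python 2."""
    return [REAL_INTERPRETER] + list(argv[1:])


def main(argv):
    """Warn on stderr, then replace this process with the real interpreter."""
    sys.stderr.write(build_warning(argv))
    os.execv(REAL_INTERPRETER, build_command(argv))
```

[Installed as /usr/bin/python and invoked as main(sys.argv), this would print the warning once per run and then behave like python2.]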
Heck, the /usr/bin/python wrapper could be made to be quiet for a day at a time by having the user press a button.

> > No, that's not my point at all. The problem isn't that Python 3 is
> > incompatible with Python 2. The problem is that stuff broke without
> > (apparently) fair warning.
>
> Warning was given; they weren't listening.
>

Yes, that's what users do. They don't look at the websites or read the mailing lists, they just care that their stuff keeps working and they don't want to pay the maintenance cost :) I'm not saying Arch should have done this, but most Linux distributions do *not* have attentive users. This is not news.

I would rather we stay with an explicit 'python3' for another decade (as, after all, Perl did with perl5 as well) than that more people are confronted with the switch to python3 by having their own code break.

--
Thomas Wouters

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

From status at bugs.python.org Fri Nov 5 18:08:19 2010
From: status at bugs.python.org (Python tracker)
Date: Fri, 5 Nov 2010 18:08:19 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20101105170819.89E4B7820C@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2010-10-29 - 2010-11-05)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.
Issues counts and deltas: open 2514 (+17) closed 19597 (+78) total 22111 (+95) Open issues with patches: 1044 Issues opened (56) ================== #5251: contextlib.nested inconsistent with, well, nested with stateme http://bugs.python.org/issue5251 reopened by ncoghlan #10236: Sporadic failures of test_ssl http://bugs.python.org/issue10236 opened by ixokai #10237: failure in Barrier tests http://bugs.python.org/issue10237 opened by pitrou #10238: ctypes not building under OS X 10.6 with LLVM/Clang 2.8 http://bugs.python.org/issue10238 opened by brett.cannon #10239: multiprocessing signal defect http://bugs.python.org/issue10239 opened by Neal.Becker #10240: dict.update.__doc__ is misleading http://bugs.python.org/issue10240 opened by ivank #10241: gc fixes for module m_copy attribute http://bugs.python.org/issue10241 opened by nascheme #10242: unittest's assertItemsEqual() method makes too many assumption http://bugs.python.org/issue10242 opened by rhettinger #10245: Fix resource warnings in test_telnetlib http://bugs.python.org/issue10245 opened by bbrazil #10248: Fix resource warnings in test_xmlrpclib http://bugs.python.org/issue10248 opened by bbrazil #10252: Fix resource warnings in distutils http://bugs.python.org/issue10252 opened by bbrazil #10254: unicodedata.normalize('NFC', s) regression http://bugs.python.org/issue10254 opened by valhallasw #10255: refleak in initstdio http://bugs.python.org/issue10255 opened by nascheme #10259: Entry text not set if all of 'Font', 'Foreground' and 'Justify http://bugs.python.org/issue10259 opened by iarspider #10260: Add a threading.Condition.wait_for() method http://bugs.python.org/issue10260 opened by krisvale #10261: tarfile iterator without members caching http://bugs.python.org/issue10261 opened by karstenw #10262: Add --disable-abi-flags option to `configure` http://bugs.python.org/issue10262 opened by Arfrever #10267: test_ttk_guionly leaks many references http://bugs.python.org/issue10267 opened by pitrou 
#10270: Fix resource warnings in test_threading http://bugs.python.org/issue10270 opened by bbrazil #10271: warnings.showwarning should allow any callable object http://bugs.python.org/issue10271 opened by lekma #10272: SSL handshake timeouts not caught by transient_internet http://bugs.python.org/issue10272 opened by pitrou #10273: Clean-up Unittest API http://bugs.python.org/issue10273 opened by rhettinger #10274: imaplib should provide a means to validate a remote server ssl http://bugs.python.org/issue10274 opened by db #10276: zlib crc32/adler32 buffer length truncation (64-bit) http://bugs.python.org/issue10276 opened by nvawda #10278: add time.wallclock() method http://bugs.python.org/issue10278 opened by krisvale #10282: IMPLEMENTATION token differently delt with in NNTP capability http://bugs.python.org/issue10282 opened by jelie #10284: NNTP should accept bytestrings for username and password http://bugs.python.org/issue10284 opened by jelie #10287: NNTP authentication should check capabilities http://bugs.python.org/issue10287 opened by jelie #10288: Remove deprecated C "character" handling macros ISUPPER() etc http://bugs.python.org/issue10288 opened by dmalcolm #10289: Document magic methods called by built-in functions http://bugs.python.org/issue10289 opened by eric.araujo #10291: Clean-up turtledemo in-package documentation http://bugs.python.org/issue10291 opened by belopolsky #10292: tarinfo should use relative symlinks http://bugs.python.org/issue10292 opened by magcius #10296: ctypes catches BreakPoint error on windows 32 http://bugs.python.org/issue10296 opened by krisvale #10297: decimal module documentation is misguiding http://bugs.python.org/issue10297 opened by hafiza.jameel #10298: zipfile: incorrect comment size will prevent extracting http://bugs.python.org/issue10298 opened by rep #10299: Add index with links section for built-in functions http://bugs.python.org/issue10299 opened by nestor #10302: Add class-functions to hash many small 
objects with hashlib http://bugs.python.org/issue10302 opened by ebfe #10303: small inconsistency in tutorial http://bugs.python.org/issue10303 opened by maltehelmert #10304: error in tutorial triple-string example http://bugs.python.org/issue10304 opened by maltehelmert #10305: Cleanup up ResourceWarnings in multiprocessing http://bugs.python.org/issue10305 opened by brian.curtin #10308: Modules/getpath.c bugs http://bugs.python.org/issue10308 opened by hfuru #10309: dlmalloc.c needs _GNU_SOURCE for mremap() http://bugs.python.org/issue10309 opened by hfuru #10310: signed:1 bitfields rarely make sense http://bugs.python.org/issue10310 opened by hfuru #10311: Signal handlers must preserve errno http://bugs.python.org/issue10311 opened by hfuru #10312: intcatcher() can deadlock http://bugs.python.org/issue10312 opened by hfuru #10318: "make altinstall" installs many files with incorrect shebangs http://bugs.python.org/issue10318 opened by allan #10319: SocketServer.TCPServer truncates responses on close (in some s http://bugs.python.org/issue10319 opened by jrodman2 #10320: printf %qd is nonstandard http://bugs.python.org/issue10320 opened by hfuru #10321: Add support for Message objects and binary data to smtplib.sen http://bugs.python.org/issue10321 opened by r.david.murray #10323: Final state of underlying sequence in islice http://bugs.python.org/issue10323 opened by shashank #10324: Modules/binascii.c: simplify expressions http://bugs.python.org/issue10324 opened by nikai #10325: PY_LLONG_MAX & co - preprocessor constants or not? 
http://bugs.python.org/issue10325 opened by hfuru #10326: Can't pickle unittest.TestCase instances http://bugs.python.org/issue10326 opened by michael.foord #10327: Abnormal SSL timeouts when using socket timeouts - once again http://bugs.python.org/issue10327 opened by pakal #10328: re.sub[n] doesn't seem to handle /Z replacements correctly in http://bugs.python.org/issue10328 opened by Alexander.Schmolck #10329: trace.py and unicode in Python 3 http://bugs.python.org/issue10329 opened by doerwalter Most recent 15 issues with no replies (15) ========================================== #10329: trace.py and unicode in Python 3 http://bugs.python.org/issue10329 #10328: re.sub[n] doesn't seem to handle /Z replacements correctly in http://bugs.python.org/issue10328 #10326: Can't pickle unittest.TestCase instances http://bugs.python.org/issue10326 #10325: PY_LLONG_MAX & co - preprocessor constants or not? http://bugs.python.org/issue10325 #10324: Modules/binascii.c: simplify expressions http://bugs.python.org/issue10324 #10321: Add support for Message objects and binary data to smtplib.sen http://bugs.python.org/issue10321 #10320: printf %qd is nonstandard http://bugs.python.org/issue10320 #10319: SocketServer.TCPServer truncates responses on close (in some s http://bugs.python.org/issue10319 #10312: intcatcher() can deadlock http://bugs.python.org/issue10312 #10310: signed:1 bitfields rarely make sense http://bugs.python.org/issue10310 #10309: dlmalloc.c needs _GNU_SOURCE for mremap() http://bugs.python.org/issue10309 #10308: Modules/getpath.c bugs http://bugs.python.org/issue10308 #10303: small inconsistency in tutorial http://bugs.python.org/issue10303 #10298: zipfile: incorrect comment size will prevent extracting http://bugs.python.org/issue10298 #10297: decimal module documentation is misguiding http://bugs.python.org/issue10297 Most recent 15 issues waiting for review (15) ============================================= #10329: trace.py and unicode in Python 3 
http://bugs.python.org/issue10329 #10324: Modules/binascii.c: simplify expressions http://bugs.python.org/issue10324 #10321: Add support for Message objects and binary data to smtplib.sen http://bugs.python.org/issue10321 #10312: intcatcher() can deadlock http://bugs.python.org/issue10312 #10311: Signal handlers must preserve errno http://bugs.python.org/issue10311 #10310: signed:1 bitfields rarely make sense http://bugs.python.org/issue10310 #10308: Modules/getpath.c bugs http://bugs.python.org/issue10308 #10299: Add index with links section for built-in functions http://bugs.python.org/issue10299 #10298: zipfile: incorrect comment size will prevent extracting http://bugs.python.org/issue10298 #10292: tarinfo should use relative symlinks http://bugs.python.org/issue10292 #10288: Remove deprecated C "character" handling macros ISUPPER() etc http://bugs.python.org/issue10288 #10278: add time.wallclock() method http://bugs.python.org/issue10278 #10276: zlib crc32/adler32 buffer length truncation (64-bit) http://bugs.python.org/issue10276 #10270: Fix resource warnings in test_threading http://bugs.python.org/issue10270 #10267: test_ttk_guionly leaks many references http://bugs.python.org/issue10267 Top 10 most discussed issues (10) ================================= #10273: Clean-up Unittest API http://bugs.python.org/issue10273 19 msgs #10284: NNTP should accept bytestrings for username and password http://bugs.python.org/issue10284 18 msgs #2636: Regexp 2.7 (modifications to current re 2.2.2) http://bugs.python.org/issue2636 16 msgs #1926: NNTPS support in nntplib http://bugs.python.org/issue1926 12 msgs #7061: Improve 24.5. 
turtle doc http://bugs.python.org/issue7061 12 msgs #9611: FileIO not 64-bit safe under Windows http://bugs.python.org/issue9611 11 msgs #10278: add time.wallclock() method http://bugs.python.org/issue10278 11 msgs #9377: socket, PEP 383: Mishandling of non-ASCII bytes in host/domain http://bugs.python.org/issue9377 10 msgs #10181: Problems with Py_buffer management in memoryobject.c (and else http://bugs.python.org/issue10181 10 msgs #10311: Signal handlers must preserve errno http://bugs.python.org/issue10311 10 msgs Issues closed (78) ================== #3699: test_bigaddrspace broken http://bugs.python.org/issue3699 closed by pitrou #4403: regression from 2.6: smtplib.py requiring ascii for sending me http://bugs.python.org/issue4403 closed by r.david.murray #4510: ValueError for list.remove() not very helpful http://bugs.python.org/issue4510 closed by benjamin.peterson #5573: multiprocessing Pipe poll() and recv() semantics. http://bugs.python.org/issue5573 closed by asksol #5729: Allows tabs for indenting JSON output http://bugs.python.org/issue5729 closed by rhettinger #6081: str.format_map() http://bugs.python.org/issue6081 closed by eric.smith #6706: asyncore's accept() is broken http://bugs.python.org/issue6706 closed by giampaolo.rodola #7059: 'checking getaddrinfo bug' doesn't output the result during ./ http://bugs.python.org/issue7059 closed by benjamin.peterson #7266: test_lib2to3 failure under Windows http://bugs.python.org/issue7266 closed by benjamin.peterson #7402: Improve reduce example in doanddont.rst http://bugs.python.org/issue7402 closed by rhettinger #7447: Sum() doc and behavior mismatch http://bugs.python.org/issue7447 closed by rhettinger #7547: test_timeout should skip, not fail, when the remote host is no http://bugs.python.org/issue7547 closed by pitrou #7826: support caching for 2to3 http://bugs.python.org/issue7826 closed by benjamin.peterson #9340: argparse parse_known_args does not work with subparsers 
http://bugs.python.org/issue9340 closed by bethard #9352: argparse eats characters when parsing multiple merged short op http://bugs.python.org/issue9352 closed by bethard #9353: argparse __all__ is incomplete http://bugs.python.org/issue9353 closed by bethard #9355: argparse add_mutually_exclusive_group more than once has incor http://bugs.python.org/issue9355 closed by bethard #9553: test_argparse.py: 80 failures if COLUMNS env var set to a valu http://bugs.python.org/issue9553 closed by bethard #9675: segfault: PyDict_SetItem: Assertion `value' failed. http://bugs.python.org/issue9675 closed by jcea #9733: Can't iterate over multiprocessing.managers.DictProxy http://bugs.python.org/issue9733 closed by asksol #9779: argparse.ArgumentParser not support unicode in print help http://bugs.python.org/issue9779 closed by bethard #9886: Make operator.itemgetter/attrgetter/methodcaller easier to dis http://bugs.python.org/issue9886 closed by rhettinger #9926: Wrapped TestSuite subclass does not get __call__ executed http://bugs.python.org/issue9926 closed by michael.foord #9981: let make_buildinfo use a temporary directory on windows http://bugs.python.org/issue9981 closed by krisvale #10025: random.seed not initialized as advertised http://bugs.python.org/issue10025 closed by rhettinger #10038: json.loads() on str should return unicode, not str http://bugs.python.org/issue10038 closed by barry #10110: Queue doesn't recognize it is full after shrinking maxsize http://bugs.python.org/issue10110 closed by rhettinger #10157: Refleaks in pythonrun.c http://bugs.python.org/issue10157 closed by ocean-city #10160: operator.attrgetter slower than lambda after adding dotted nam http://bugs.python.org/issue10160 closed by pitrou #10171: Ugly buttons in some Tkinter objects in Windows http://bugs.python.org/issue10171 closed by eric.araujo #10173: Don't pickle TestCase instances in test_multiprocessing http://bugs.python.org/issue10173 closed by pitrou #10177: 
PyUnicode_AsWideCharString and PyMem_Free http://bugs.python.org/issue10177 closed by terry.reedy #10184: tarfile touches directories twice http://bugs.python.org/issue10184 closed by loewis #10199: Move Demo/turtle under Lib/ http://bugs.python.org/issue10199 closed by belopolsky #10221: {}.pop('a') raises non-standard KeyError exception http://bugs.python.org/issue10221 closed by rhettinger #10230: test_tarfile failure (test_extractall) on AMD64 debian paralle http://bugs.python.org/issue10230 closed by georg.brandl #10233: fix test_tarfile ResourceWarnings http://bugs.python.org/issue10233 closed by pitrou #10235: test_argparse depends on the COLUMNS environment variable http://bugs.python.org/issue10235 closed by pitrou #10243: Packaged Pythons http://bugs.python.org/issue10243 closed by loewis #10244: PEP100 has broken links http://bugs.python.org/issue10244 closed by fijall #10246: uu.encode fd leak if arguments are filenames http://bugs.python.org/issue10246 closed by pitrou #10247: mold builder http://bugs.python.org/issue10247 closed by pitrou #10249: Fix resource warnings in test_unicodedata http://bugs.python.org/issue10249 closed by pitrou #10250: Fix resource warnings in test_urllib2_localnet http://bugs.python.org/issue10250 closed by pitrou #10251: Fix resource warnings in test_file http://bugs.python.org/issue10251 closed by pitrou #10253: Fix fd leak in fileio.c and test resource warnings http://bugs.python.org/issue10253 closed by pitrou #10257: Fix resource warnings in test_os http://bugs.python.org/issue10257 closed by brian.curtin #10258: Fix resource warnings in test_tokenize http://bugs.python.org/issue10258 closed by brian.curtin #10263: "python -m site" does not print path details http://bugs.python.org/issue10263 closed by ned.deily #10264: Fix resource warnings in test_smtplib http://bugs.python.org/issue10264 closed by benjamin.peterson #10265: Fix fd leak in sunau http://bugs.python.org/issue10265 closed by pitrou #10266: uu.decode fd 
leak if in_file is a filename http://bugs.python.org/issue10266 closed by pitrou #10268: Add --enable-loadable-sqlite-extensions option to `configure` http://bugs.python.org/issue10268 closed by benjamin.peterson #10269: Fix some resource warnings in test_sax http://bugs.python.org/issue10269 closed by benjamin.peterson #10275: how to know that a module is a module, a function is a functio http://bugs.python.org/issue10275 closed by brian.curtin #10277: sax leaks a fd if source is a filename http://bugs.python.org/issue10277 closed by benjamin.peterson #10279: test_gc failure on Windows x64 http://bugs.python.org/issue10279 closed by pitrou #10280: nntp_version set to the most recent advertised version http://bugs.python.org/issue10280 closed by pitrou #10281: Exception raised when an NNTP overview field is absent http://bugs.python.org/issue10281 closed by pitrou #10283: New parameter for an NNTP newsgroup pattern in LIST ACTIVE http://bugs.python.org/issue10283 closed by pitrou #10285: Other status field flags in documentation for NNTP LIST comman http://bugs.python.org/issue10285 closed by pitrou #10286: URLOpener => URLopener x2 in fix_urllib.py http://bugs.python.org/issue10286 closed by georg.brandl #10290: Fix resource warnings in distutils http://bugs.python.org/issue10290 closed by brian.curtin #10293: PyMemoryView object has obsolete members http://bugs.python.org/issue10293 closed by pitrou #10294: Lib/test/test_unicode_file.py contains dead code http://bugs.python.org/issue10294 closed by brett.cannon #10295: _socket.pyd uses winsock2, select.pyd uses winsock 1 http://bugs.python.org/issue10295 closed by krisvale #10300: Documentation of three PyDict_* functions http://bugs.python.org/issue10300 closed by benjamin.peterson #10301: Zipfile cannot be used in "with" Statement http://bugs.python.org/issue10301 closed by benjamin.peterson #10306: Weakref callback exceptions should be turned into warnings. 
http://bugs.python.org/issue10306 closed by oddthinking #10307: compile error in readline.c http://bugs.python.org/issue10307 closed by orsenthil #10313: Reassure user: test_os BytesWarning is OK http://bugs.python.org/issue10313 closed by r.david.murray #10314: Improve JSON encoding with sort_keys=True http://bugs.python.org/issue10314 closed by pitrou #10315: smtplib.SMTP_SSL new in 2.6 http://bugs.python.org/issue10315 closed by georg.brandl #10316: tkFileDialog.askopenfilenames scrambling multiple file selecti http://bugs.python.org/issue10316 closed by ned.deily #10317: Add TurtleShell to turtle http://bugs.python.org/issue10317 closed by rhettinger #10322: sys.argv and quoted arguments on command line http://bugs.python.org/issue10322 closed by fcr #960325: "--require " option for configure/make (fail if buil http://bugs.python.org/issue960325 closed by terry.reedy #10256: Fix resource warnings in test_pkgimport http://bugs.python.org/issue10256 closed by brian.curtin From debatem1 at gmail.com Fri Nov 5 18:10:34 2010 From: debatem1 at gmail.com (geremy condra) Date: Fri, 5 Nov 2010 10:10:34 -0700 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: References: <4CD23A19.6080002@archlinux.org> Message-ID: On Thu, Nov 4, 2010 at 3:40 PM, Laurens Van Houtven wrote: > On Thu, Nov 4, 2010 at 5:44 AM, Allan McRae wrote: > What is true is that there's a new and temporary "NO ARCH" rule in the > topic It's your channel and you can do with it what you want, but seriously- does this strike you as the best response to a widespread problem? You're basically telling people to get lost, and in all caps no less. 
Geremy Condra From fuzzyman at voidspace.org.uk Fri Nov 5 18:14:02 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Fri, 05 Nov 2010 17:14:02 +0000 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: References: <4CD23A19.6080002@archlinux.org> Message-ID: <4CD43B5A.3070003@voidspace.org.uk> On 05/11/2010 17:10, geremy condra wrote: > On Thu, Nov 4, 2010 at 3:40 PM, Laurens Van Houtven wrote: >> On Thu, Nov 4, 2010 at 5:44 AM, Allan McRae wrote: > > >> What is true is that there's a new and temporary "NO ARCH" rule in the >> topic > It's your channel and you can do with it what you want, Actually it's a PSF run channel. > but seriously- > does this strike you as the best response to a widespread problem? > You're basically telling people to get lost, and in all caps no less. > They're saying that the channel isn't the correct place to get support on that particular issue. All the best, Michael > Geremy Condra > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. 
From j.reid at mail.cryst.bbk.ac.uk Fri Nov 5 18:17:43 2010 From: j.reid at mail.cryst.bbk.ac.uk (John Reid) Date: Fri, 05 Nov 2010 17:17:43 +0000 Subject: [Python-Dev] *** glibc detected *** gdb: malloc(): smallbin double linked list Message-ID: Hi, I've compiled Python 2.7 (r27:82500, Nov 2 2010, 09:00:37) [GCC 4.4.3] on linux2 with the following configure options ./configure --prefix=/home/john/local/python-dbg --with-pydebug I've installed numpy and some other packages but when I try to run my extension code under gdb I get the errors below. Does anyone have any ideas of how to track down what's happening here? I imagine I've misconfigured something somewhere. Is valgrind the answer? Thanks, John. *** glibc detected *** gdb: malloc(): smallbin double linked list corrupted: 0x0000000004de7ad0 *** ======= Backtrace: ========= /lib/libc.so.6(+0x775b6)[0x7f0a252215b6] /lib/libc.so.6(+0x7b8e9)[0x7f0a252258e9] /lib/libc.so.6(__libc_malloc+0x6e)[0x7f0a2522658e] gdb(xmalloc+0x18)[0x45bc38] gdb[0x476df1] gdb[0x474c9b] gdb[0x474ee8] gdb(execute_command+0x2dd)[0x458d1d] gdb(catch_exception+0x50)[0x535510] gdb[0x4b5191] gdb(interp_exec+0x17)[0x535637] gdb(mi_cmd_interpreter_exec+0x6c)[0x4b9adc] gdb[0x4ba71a] gdb(catch_exception+0x50)[0x535510] gdb(mi_execute_command+0x97)[0x4ba137] gdb[0x53a0f8] gdb(gdb_do_one_event+0x29a)[0x53b38a] gdb(catch_errors+0x5b)[0x53531b] gdb(start_event_loop+0x1e)[0x53a90e] gdb[0x44f619] gdb(catch_errors+0x5b)[0x53531b] gdb[0x450166] gdb(catch_errors+0x5b)[0x53531b] gdb(gdb_main+0x24)[0x44f554] gdb(main+0x2e)[0x44f51e] /lib/libc.so.6(__libc_start_main+0xfd)[0x7f0a251c8c4d] gdb[0x44f429] ======= Memory map: ======== 00400000-00818000 r-xp 00000000 08:05 4832730 /usr/bin/gdb 00a17000-00a18000 r--p 00417000 08:05 4832730 /usr/bin/gdb 00a18000-00a25000 rw-p 00418000 08:05 4832730 /usr/bin/gdb 00a25000-00a43000 rw-p 00000000 00:00 0 0287f000-0b920000 rw-p 00000000 00:00 0 [heap] 7f0a1c000000-7f0a1c021000 rw-p 00000000 00:00 0 
7f0a1c021000-7f0a20000000 ---p 00000000 00:00 0 7f0a20fc0000-7f0a20fd6000 r-xp 00000000 08:05 3498245 /lib/libgcc_s.so.1 7f0a20fd6000-7f0a211d5000 ---p 00016000 08:05 3498245 /lib/libgcc_s.so.1 7f0a211d5000-7f0a211d6000 r--p 00015000 08:05 3498245 /lib/libgcc_s.so.1 7f0a211d6000-7f0a211d7000 rw-p 00016000 08:05 3498245 /lib/libgcc_s.so.1 7f0a211fd000-7f0a21211000 r--p 000dc000 08:05 4825848 /usr/lib/libstdc++.so.6.0.13 7f0a21211000-7f0a21218000 r--p 00018000 08:05 4841756 /usr/lib/debug/lib/librt-2.11.1.so 7f0a21218000-7f0a21226000 r--p 00001000 08:05 4841756 /usr/lib/debug/lib/librt-2.11.1.so 7f0a21226000-7f0a2123e000 r--p 000bc000 08:05 4653290 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/libboost_python.so.1.44.0 7f0a2123e000-7f0a21287000 r--p 003dd000 08:05 4653290 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/libboost_python.so.1.44.0 7f0a21287000-7f0a21299000 r--p 00425000 08:05 4653290 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/libboost_python.so.1.44.0 7f0a21299000-7f0a213e7000 r--p 0018c000 08:05 4653290 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/libboost_python.so.1.44.0 7f0a213e7000-7f0a2152f000 r--p 0207c000 08:05 4653324 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/_stempy.so 7f0a2152f000-7f0a22027000 r--p 01585000 08:05 4653324 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/_stempy.so 7f0a22027000-7f0a22400000 rw-p 00000000 00:00 0 7f0a22408000-7f0a224d1000 r--p 00315000 08:05 4653290 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/libboost_python.so.1.44.0 7f0a224d1000-7f0a224ff000 r--p 002e8000 08:05 4653290 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/libboost_python.so.1.44.0 7f0a224ff000-7f0a22526000 r--p 00038000 08:05 4653310 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/myrrh_pylib-d 7f0a22526000-7f0a2259c000 r--p 01510000 08:05 4653324 
/home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/_stempy.so 7f0a2259c000-7f0a2280c000 r--p 012a0000 08:05 4653324 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/_stempy.so 7f0a2280c000-7f0a2343f000 rw-p 00000000 00:00 0 7f0a23443000-7f0a2344c000 r--p 0001a000 08:05 6169643 /home/john/local/python-dbg/lib/python2.7/lib-dynload/datetime.so 7f0a2344c000-7f0a2345c000 r--p 002d9000 08:05 4653290 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/libboost_python.so.1.44.0 7f0a2345c000-7f0a23461000 r--p 0005e000 08:05 4653310 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/myrrh_pylib-d 7f0a23461000-7f0a23477000 r--p 0001f000 08:05 4653310 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/myrrh_pylib-d 7f0a23477000-7f0a2347d000 r--p 00004000 08:05 4653095 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/libboost_system.so.1.44.0 7f0a2347d000-7f0a2350c000 r--p 00757000 08:05 4653324 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/_stempy.so 7f0a2350c000-7f0a23555000 r--p 021c3000 08:05 4653324 /home/john/Dev/MyProjects/Bio/MotifSearch/python/stempy/_debug/_stempy.so 7f0a23555000-7f0a2355b000 r--p 00048000 08:05 6169627 /home/john/local/python-dbg/lib/python2.7/lib-dynload/_ctypes.so 7f0a2355b000-7f0a2356f000 r--p 0002d000 08:05 6169627 /home/john/local/python-dbg/lib/python2.7/lib-dynload/_ctypes.so 7f0a2356f000-7f0a23575000 r--p 000b1000 08:05 3489898 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/random/mtrand.so 7f0a23575000-7f0a2357c000 r--p 000ab000 08:05 3489898 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/random/mtrand.so 7f0a2357c000-7f0a2358d000 r--p 0009b000 08:05 3489898 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/random/mtrand.so 7f0a2358d000-7f0a2359b000 r--p 000dd000 08:05 4827887 
/usr/lib/libgfortran.so.3.0.0 7f0a2359b000-7f0a235ac000 r--p 00416000 08:05 6709644 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/linalg/lapack_lite.so 7f0a235ac000-7f0a23668000 rw-p 00000000 00:00 0 7f0a23668000-7f0a2366d000 r--p 00033000 08:05 3180358 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/scalarmath.so 7f0a2366d000-7f0a23678000 r--p 00052000 08:05 3180358 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/scalarmath.so 7f0a23678000-7f0a2367d000 r--p 0004c000 08:05 3180358 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/scalarmath.so 7f0a2367d000-7f0a23690000 r--p 00039000 08:05 3180358 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/scalarmath.so 7f0a23690000-7f0a23698000 r--p 0001b000 08:05 6169649 /home/john/local/python-dbg/lib/python2.7/lib-dynload/cPickle.so 7f0a23698000-7f0a236a7000 r--p 004fd000 08:05 3180355 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/_dotblas.so 7f0a236a7000-7f0a2374f000 rw-p 00000000 00:00 0 7f0a2374f000-7f0a2375a000 r--p 0001b000 08:05 3180353 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/_sort.so 7f0a2375a000-7f0a23762000 r--p 00065000 08:05 3180320 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/umath.so 7f0a23762000-7f0a23774000 r--p 000ae000 08:05 3180320 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/umath.so 7f0a23774000-7f0a2377a000 r--p 000a9000 08:05 3180320 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/umath.so 7f0a2377a000-7f0a23780000 r--p 000a4000 
08:05 3180320 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/umath.so 7f0a23780000-7f0a237b4000 r--p 00071000 08:05 3180320 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/umath.so 7f0a237b4000-7f0a23881000 rw-p 00000000 00:00 0 7f0a23883000-7f0a23888000 r--p 0000f000 08:05 3146117 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/fft/fftpack_lite.so 7f0a23888000-7f0a23897000 r--p 000b9000 08:05 3180362 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/multiarray.so 7f0a23897000-7f0a238a1000 r--p 00118000 08:05 3180362 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/multiarray.so 7f0a238a1000-7f0a238ae000 r--p 0010c000 08:05 3180362 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/multiarray.so 7f0a238ae000-7f0a238e8000 r--p 000d3000 08:05 3180362 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/multiarray.so 7f0a238e8000-7f0a23aa4000 r--p 004e2000 08:05 4841832 /usr/lib/debug/lib/libc-2.11.1.so 7f0a23aa4000-7f0a23b03000 r--p 0069d000 08:05 4841832 /usr/lib/debug/lib/libc-2.11.1.so 7f0a23b03000-7f0a23b27000 r--p 004bf000 08:05 4841832 /usr/lib/debug/lib/libc-2.11.1.so 7f0a23b27000-7f0a23bc3000 r--p 00424000 08:05 4841832 /usr/lib/debug/lib/libc-2.11.1.so 7f0a23bc3000-7f0a23c3e000 r--p 003aa000 08:05 4841832 /usr/lib/debug/lib/libc-2.11.1.so 7f0a23c3e000-7f0a23fca000 r--p 0001f000 08:05 4841832 /usr/lib/debug/lib/libc-2.11.1.so 7f0a23fca000-7f0a240f6000 rw-p 00000000 00:00 0 7f0a240f8000-7f0a24118000 r--p 00121000 08:05 3180362 /home/john/local/python-dbg/lib/python2.7/site-packages/numpy-1.5.1rc1-py2.7-linux-x86_64.egg/numpy/core/multiarray.so 7f0a24118000-7f0a24129000 r--p 0000e000 
08:05 4950482 /usr/lib/debug/lib/libz.so.1.2.3.3 7f0a24129000-7f0a24133000 r--p 00000000 08:05 4950482 /usr/lib/debug/lib/libz.so.1.2.3.3 7f0a24133000-7f0a24154000 r--p 00155000 08:05 2900170 /lib/libc-2.11.1.so 7f0a24154000-7f0a241a2000 r--p 00061000 08:05 4841716 /usr/lib/debug/lib/libm-2.11.1.so 7f0a241a2000-7f0a241a8000 r--p 0005c000 08:05 4841716 /usr/lib/debug/lib/libm-2.11.1.so 7f0a241a8000-7f0a241bb000 r--p 0004a000 08:05 4841716 /usr/lib/debug/lib/libm-2.11.1.so 7f0a241bb000-7f0a241ed000 r--p 00007000 08:05 4841716 /usr/lib/debug/lib/libm-2.11.1.so 7f0a241ed000-7f0a241f4000 r-xp 00000000 08:05 2900165 /lib/libthread_db-1.0.so 7f0a241f4000-7f0a243f3000 ---p 00007000 08:05 2900165 /lib/libthread_db-1.0.so 7f0a243f3000-7f0a243f4000 r--p 00006000 08:05 2900165 /lib/libthread_db-1.0.so 7f0a243f4000-7f0a243f5000 rw-p 00007000 08:05 2900165 /lib/libthread_db-1.0.so 7f0a243f9000-7f0a2440c000 r--p 00038000 08:05 4841716 /usr/lib/debug/lib/libm-2.11.1.so 7f0a2440c000-7f0a2441b000 r--p 00000000 08:05 4841839 /usr/lib/debug/lib/libdl-2.11.1.so 7f0a2441b000-7f0a24431000 r--p 00078000 08:05 4841828 /usr/lib/debug/lib/libpthread-2.11.1.so 7f0a24431000-7f0a24439000 r--p 00071000 08:05 4841828 /usr/lib/debug/lib/libpthread-2.11.1.so 7f0a24439000-7f0a2444c000 r--p 0005f000 08:05 4841828 /usr/lib/debug/lib/libpthread-2.11.1.so 7f0a2444c000-7f0a2445c000 r--p 00050000 08:05 4841828 /usr/lib/debug/lib/libpthread-2.11.1.so 7f0a2445c000-7f0a244a9000 r--p 00004000 08:05 4841828 /usr/lib/debug/lib/libpthread-2.11.1.so 7f0a244a9000-7f0a244cc000 r--p 00063000 08:05 4841753 /usr/lib/debug/lib/ld-2.11.1.so 7f0a244cc000-7f0a244d6000 r--p 00085000 08:05 4841753 /usr/lib/debug/lib/ld-2.11.1.so 7f0a244d6000-7f0a244f3000 r--p 001be000 08:05 221210 /home/john/local/python-dbg/bin/python2.7 7f0a244f3000-7f0a24537000 r--p 00370000 08:05 221210 /home/john/local/python-dbg/bin/python2.7 7f0a24537000-7f0a2453e000 r--p 003b3000 08:05 221210 /home/john/local/python-dbg/bin/python2.7 
7f0a2453e000-7f0a2455c000 r--p 00353000 08:05 221210 /home/john/local/python-dbg/bin/python2.7 7f0a2455c000-7f0a24583000 r--p 0032d000 08:05 221210 /home/john/local/python-dbg/bin/python2.7 7f0a24583000-7f0a24591000 r--p 00320000 08:05 221210 /home/john/local/python-dbg/bin/python2.7 7f0a24591000-7f0a2468b000 r--p 00227000 08:05 221210 /home/john/local/python-dbg/bin/python2.7 7f0a2468b000-7f0a247a8000 rw-p 00000000 00:00 0 7f0a247a8000-7f0a247aa000 r-xp 00000000 08:05 2900166 /lib/libutil-2.11.1.so 7f0a247aa000-7f0a249a9000 ---p 00002000 08:05 2900166 /lib/libutil-2.11.1.so 7f0a249a9000-7f0a249aa000 r--p 00001000 08:05 2900166 /lib/libutil-2.11.1.so 7f0a249aa000-7f0a249ab000 rw-p 00002000 08:05 2900166 /lib/libutil-2.11.1.so 7f0a249ab000-7f0a249c3000 r-xp 00000000 08:05 2900168 /lib/libpthread-2.11.1.so 7f0a249c3000-7f0a24bc2000 ---p 00018000 08:05 2900168 /lib/libpthread-2.11.1.so 7f0a24bc2000-7f0a24bc3000 r--p 00017000 08:05 2900168 /lib/libpthread-2.11.1.so 7f0a24bc3000-7f0a24bc4000 rw-p 00018000 08:05 2900168 /lib/libpthread-2.11.1.so 7f0a24bc4000-7f0a24bc8000 rw-p 00000000 00:00 0 7f0a24bc8000-7f0a24d30000 r-xp 00000000 08:05 2901949 /lib/libcrypto.so.0.9.8 7f0a24d30000-7f0a24f2f000 ---p 00168000 08:05 2901949 /lib/libcrypto.so.0.9.8 7f0a24f2f000-7f0a24f3c000 r--p 00167000 08:05 2901949 /lib/libcrypto.so.0.9.8 7f0a24f3c000-7f0a24f54000 rw-p 00174000 08:05 2901949 /lib/libcrypto.so.0.9.8 7f0a24f54000-7f0a24f58000 rw-p 00000000 00:00 0 7f0a24f58000-7f0a24fa3000 r-xp 00000000 08:05 2901950 /lib/libssl.so.0.9.8 7f0a24fa3000-7f0a251a2000 ---p 0004b000 08:05 2901950 /lib/libssl.so.0.9.8 7f0a251a2000-7f0a251a4000 r--p 0004a000 08:05 2901950 /lib/libssl.so.0.9.8 7f0a251a4000-7f0a251a9000 rw-p 0004c000 08:05 2901950 /lib/libssl.so.0.9.8 7f0a251a9000-7f0a251aa000 rw-p 00000000 00:00 0 7f0a251aa000-7f0a25324000 r-xp 00000000 08:05 2900170 /lib/libc-2.11.1.so 7f0a25324000-7f0a25523000 ---p 0017a000 08:05 2900170 /lib/libc-2.11.1.so 7f0a25523000-7f0a25527000 r--p 00179000 
08:05 2900170 /lib/libc-2.11.1.so 7f0a25527000-7f0a25528000 rw-p 0017d000 08:05 2900170 /lib/libc-2.11.1.so 7f0a25528000-7f0a2552d000 rw-p 00000000 00:00 0 7f0a2552d000-7f0a2552f000 r-xp 00000000 08:05 2900174 /lib/libdl-2.11.1.so 7f0a2552f000-7f0a2572f000 ---p 00002000 08:05 2900174 /lib/libdl-2.11.1.so 7f0a2572f000-7f0a25730000 r--p 00002000 08:05 2900174 /lib/libdl-2.11.1.so 7f0a25730000-7f0a25731000 rw-p 00003000 08:05 2900174 /lib/libdl-2.11.1.so 7f0a25731000-7f0a25757000 r-xp 00000000 08:05 2900004 /lib/libexpat.so.1.5.2 7f0a25757000-7f0a25957000 ---p 00026000 08:05 2900004 /lib/libexpat.so.1.5.2 7f0a25957000-7f0a25959000 r--p 00026000 08:05 2900004 /lib/libexpat.so.1.5.2 7f0a25959000-7f0a2595a000 rw-p 00028000 08:05 2900004 /lib/libexpat.so.1.5.2 7f0a2595a000-7f0a25b98000 r-xp 00000000 08:05 4827971 /usr/lib/libpython2.6.so.1.0 7f0a25b98000-7f0a25d98000 ---p 0023e000 08:05 4827971 /usr/lib/libpython2.6.so.1.0 7f0a25d98000-7f0a25d9a000 r--p 0023e000 08:05 4827971 /usr/lib/libpython2.6.so.1.0 7f0a25d9a000-7f0a25dfc000 rw-p 00240000 08:05 4827971 /usr/lib/libpython2.6.so.1.0 7f0a25dfc000-7f0a25e0b000 rw-p 00000000 00:00 0 7f0a25e0b000-7f0a25e8d000 r-xp 00000000 08:05 2900011 /lib/libm-2.11.1.so 7f0a25e8d000-7f0a2608c000 ---p 00082000 08:05 2900011 /lib/libm-2.11.1.so 7f0a2608c000-7f0a2608d000 r--p 00081000 08:05 2900011 /lib/libm-2.11.1.so 7f0a2608d000-7f0a2608e000 rw-p 00082000 08:05 2900011 /lib/libm-2.11.1.so 7f0a2608e000-7f0a260a4000 r-xp 00000000 08:05 2900157 /lib/libz.so.1.2.3.3 7f0a260a4000-7f0a262a3000 ---p 00016000 08:05 2900157 /lib/libz.so.1.2.3.3 7f0a262a3000-7f0a262a4000 r--p 00015000 08:05 2900157 /lib/libz.so.1.2.3.3 7f0a262a4000-7f0a262a5000 rw-p 00016000 08:05 2900157 /lib/libz.so.1.2.3.3 7f0a262a5000-7f0a262e3000 r-xp 00000000 08:05 3498266 /lib/libncurses.so.5.7 7f0a262e3000-7f0a264e3000 ---p 0003e000 08:05 3498266 /lib/libncurses.so.5.7 7f0a264e3000-7f0a264e7000 r--p 0003e000 08:05 3498266 /lib/libncurses.so.5.7 7f0a264e7000-7f0a264e8000 
rw-p 00042000 08:05 3498266 /lib/libncurses.so.5.7 7f0a264e8000-7f0a26521000 r-xp 00000000 08:05 3498308 /lib/libreadline.so.6.1 7f0a26521000-7f0a26720000 ---p 00039000 08:05 3498308 /lib/libreadline.so.6.1 7f0a26720000-7f0a26722000 r--p 00038000 08:05 3498308 /lib/libreadline.so.6.1 7f0a26722000-7f0a26728000 rw-p 0003a000 08:05 3498308 /lib/libreadline.so.6.1 7f0a26728000-7f0a26729000 rw-p 00000000 00:00 0 7f0a26729000-7f0a26749000 r-xp 00000000 08:05 2900131 /lib/ld-2.11.1.so 7f0a26749000-7f0a2674f000 r--p 00013000 08:05 6169622 /home/john/local/python-dbg/lib/python2.7/lib-dynload/itertools.so 7f0a2674f000-7f0a26758000 r--p 0004c000 08:05 4841753 /usr/lib/debug/lib/ld-2.11.1.so 7f0a26758000-7f0a267a4000 r--p 00001000 08:05 4841753 /usr/lib/debug/lib/ld-2.11.1.so 7f0a267a4000-7f0a26857000 rw-p 00000000 00:00 0 7f0a26857000-7f0a26858000 r--p 00000000 08:05 5792628 /usr/share/locale-langpack/en_GB/LC_MESSAGES/libc.mo 7f0a26858000-7f0a268da000 rw-p 00000000 00:00 0 7f0a268da000-7f0a26919000 r--p 00000000 08:05 4874536 /usr/lib/locale/en_GB.utf8/LC_CTYPE 7f0a26919000-7f0a26920000 rw-p 00000000 00:00 0 7f0a26922000-7f0a26928000 r--p 0005e000 08:05 4841753 /usr/lib/debug/lib/ld-2.11.1.so 7f0a26928000-7f0a26933000 r--p 00054000 08:05 4841753 /usr/lib/debug/lib/ld-2.11.1.so 7f0a26935000-7f0a26938000 rw-p 00000000 00:00 0 7f0a26938000-7f0a2693e000 r--p 00000000 08:05 5792627 /usr/share/locale-langpack/en_GB/LC_MESSAGES/gdb.mo 7f0a2693e000-7f0a26945000 r--s 00000000 08:05 5417899 /usr/lib/gconv/gconv-modules.cache 7f0a26945000-7f0a26946000 r--p 00000000 08:05 4875999 /usr/lib/locale/en_GB.utf8/LC_MESSAGES/SYS_LC_MESSAGES 7f0a26946000-7f0a26948000 rw-p 00000000 00:00 0 7f0a26948000-7f0a26949000 r--p 0001f000 08:05 2900131 /lib/ld-2.11.1.so 7f0a26949000-7f0a2694a000 rw-p 00020000 08:05 2900131 /lib/ld-2.11.1.so 7f0a2694a000-7f0a2694b000 rw-p 00000000 00:00 0 7ffff92d6000-7ffff92f8000 rw-p 00000000 00:00 0 [stack] 7ffff93ff000-7ffff9400000 r-xp 00000000 00:00 0 [vdso] 
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall] From lvh at laurensvh.be Fri Nov 5 19:00:19 2010 From: lvh at laurensvh.be (Laurens Van Houtven) Date: Fri, 5 Nov 2010 19:00:19 +0100 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: References: <4CD23A19.6080002@archlinux.org> Message-ID: On Fri, Nov 5, 2010 at 6:10 PM, geremy condra wrote: > On Thu, Nov 4, 2010 at 3:40 PM, Laurens Van Houtven wrote: >> On Thu, Nov 4, 2010 at 5:44 AM, Allan McRae wrote: > > > >> What is true is that there's a new and temporary "NO ARCH" rule in the >> topic > > It's your channel and you can do with it what you want, but seriously- > does this strike you as the best response to a widespread problem? > You're basically telling people to get lost, and in all caps no less. > > Geremy Condra It is not by any means "my channel" -- I apologize if I gave anyone the impression that I alone decided that was going up, because that's not true. Unfortunately Freenode does not give us the ability to be more verbose in IRC topics. In fact, to put that up, we had to remove something less important. As a result, NO ARCH is roughly the best we can do. (Similarly NO LOL is really NO CHATSPEAK, but topics are length limited.) cheers lvh From lvh at laurensvh.be Fri Nov 5 19:08:35 2010 From: lvh at laurensvh.be (Laurens Van Houtven) Date: Fri, 5 Nov 2010 19:08:35 +0100 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: References: <4CD23A19.6080002@archlinux.org> Message-ID: Whoops, pressed send too soon. This should've followed my previous email: Unscientifically judging by the rate of people who used to have vague problems that turned out to be Arch-related, I don't really think anyone feels they're being told to "get lost". 
People ask a question about it, which is great: answering that issue in the detail it deserves (as you've mentioned), which is something we can't do in the /topic but *can* do in the channel itself, takes a lot less time for everyone and leads to the correct answer (such as "tell the package maintainer") faster. As soon as this dies down and it stops being an issue, we're obviously taking it down. cheers lvh From fetchinson at googlemail.com Fri Nov 5 19:57:08 2010 From: fetchinson at googlemail.com (Daniel Fetchinson) Date: Fri, 5 Nov 2010 19:57:08 +0100 Subject: [Python-Dev] *** glibc detected *** gdb: malloc(): smallbin double linked list In-Reply-To: References: Message-ID: > Hi, > > I've compiled > Python 2.7 (r27:82500, Nov 2 2010, 09:00:37) > [GCC 4.4.3] on linux2 > > with the following configure options > ./configure --prefix=/home/john/local/python-dbg --with-pydebug > > I've installed numpy and some other packages but when I try to run my > extension code under gdb I get the errors below. Does anyone have any > ideas of how to track down what's happening here? I imagine I've > misconfigured something somewhere. Is valgrind the answer? > > Thanks, > John. Hi John, the right place for asking such questions is the python mailing list python-list at python.org, please see http://mail.python.org/mailman/listinfo/python-list This python-dev list is for the development *of* python and not development *with* python. For the latter python-list is the appropriate forum. Cheers, Daniel -- Psss, psss, put it down! 
- http://www.cafepress.com/putitdown From merwok at netwok.org Sat Nov 6 01:59:24 2010 From: merwok at netwok.org (Éric Araujo) Date: Sat, 06 Nov 2010 01:59:24 +0100 Subject: [Python-Dev] [Python-checkins] r86170 - in python/branches/py3k: Doc/library/stdtypes.rst Lib/test/test_unicode.py Misc/NEWS Objects/stringlib/string_format.h Objects/unicodeobject.c In-Reply-To: <20101104170658.E1303EE9D4@mail.python.org> References: <20101104170658.E1303EE9D4@mail.python.org> Message-ID: <4CD4A86C.4070704@netwok.org> > Author: eric.smith > Date: Thu Nov 4 18:06:58 2010 > New Revision: 86170 > > Log: > Issue #6081: Add str.format_map. str.format_map(mapping) is similar to str.format(**mapping), except mapping does not get converted to a dict. > Modified: python/branches/py3k/Doc/library/stdtypes.rst > ============================================================================== > --- python/branches/py3k/Doc/library/stdtypes.rst (original) > +++ python/branches/py3k/Doc/library/stdtypes.rst Thu Nov 4 18:06:58 2010 > @@ -1038,6 +1038,14 @@ > that can be specified in format strings. > > > +.. method:: str.format_map(mapping) > + > + Similar to ``str.forrmat(**mapping)``, except that ``mapping`` is Yarrr me hearrties, it be forrrmat! 
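For readers following the r86170 thread, here is a minimal sketch of the difference the log message describes, assuming Python 3.2 or later. The class name DefaultingDict and its "<key>" placeholder are made up for illustration and do not come from the commit.

```python
class DefaultingDict(dict):
    """Hypothetical dict subclass: supplies a placeholder for missing keys."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent.
        return "<" + key + ">"


mapping = DefaultingDict(foo="spam")

# format_map() looks keys up on the mapping itself, so __missing__ is used.
via_format_map = "{foo} {bar}".format_map(mapping)

# format(**mapping) unpacks into a plain dict first, so __missing__ is lost
# and the lookup of "bar" raises KeyError.
try:
    "{foo} {bar}".format(**mapping)
    unpacking_failed = False
except KeyError:
    unpacking_failed = True

print(via_format_map)
print(unpacking_failed)
```

The same reasoning should apply to any mapping whose behaviour lives in its lookup method rather than in its stored items, since ** unpacking only carries the stored items across.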
From debatem1 at gmail.com Sat Nov 6 03:38:21 2010 From: debatem1 at gmail.com (geremy condra) Date: Fri, 5 Nov 2010 19:38:21 -0700 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: <4CD43B5A.3070003@voidspace.org.uk> References: <4CD23A19.6080002@archlinux.org> <4CD43B5A.3070003@voidspace.org.uk> Message-ID: On Fri, Nov 5, 2010 at 10:14 AM, Michael Foord wrote: > On 05/11/2010 17:10, geremy condra wrote: >> >> On Thu, Nov 4, 2010 at 3:40 PM, Laurens Van Houtven >> wrote: >>> >>> On Thu, Nov 4, 2010 at 5:44 AM, Allan McRae wrote: >> >> >> >>> What is true is that there's a new and temporary "NO ARCH" rule in the >>> topic >> >> It's your channel and you can do with it what you want, > > Actually it's a PSF run channel. > >> but seriously- >> does this strike you as the best response to a widespread problem? >> You're basically telling people to get lost, and in all caps no less. >> > They're saying that the channel isn't the correct place to get support on > that particular issue. In the same way that telling someone to RTFM n00b is the same thing as telling them to kindly refer to the documents produced by man, yes. As you said during the "python 2 or 3" discussion some months back "given the topic is far more nuanced than an IRC topic can express maybe that just isn't the right place for it". Geremy Condra From ezio.melotti at gmail.com Sat Nov 6 05:44:42 2010 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Sat, 06 Nov 2010 06:44:42 +0200 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: <20101105170819.89E4B7820C@psf.upfronthosting.co.za> References: <20101105170819.89E4B7820C@psf.upfronthosting.co.za> Message-ID: <4CD4DD3A.9040808@gmail.com> Hi, On 05/11/2010 19.08, Python tracker wrote: > ACTIVITY SUMMARY (2010-10-29 - 2010-11-05) > Python tracker at http://bugs.python.org/ > > To view or respond to any of the issues listed below, click on the issue. > Do NOT respond to this message. 
> > Issues counts and deltas: > open 2514 (+17) > closed 19597 (+78) > total 22111 (+95) as suggested in recent mails[0][1] I changed these values to represent the deltas with the previous week. Now let's try to keep the "open" delta negative ;) Best Regards, Ezio Melotti [0]: http://mail.python.org/pipermail/python-dev/2010-October/104840.html [1]: http://mail.python.org/pipermail/python-dev/2010-September/104054.html From merwok at netwok.org Sat Nov 6 06:00:16 2010 From: merwok at netwok.org (Éric Araujo) Date: Sat, 06 Nov 2010 06:00:16 +0100 Subject: [Python-Dev] [Python-checkins] r86170 - in python/branches/py3k: Doc/library/stdtypes.rst Lib/test/test_unicode.py Misc/NEWS Objects/stringlib/string_format.h Objects/unicodeobject.c In-Reply-To: <4CD4A86C.4070704@netwok.org> References: <20101104170658.E1303EE9D4@mail.python.org> <4CD4A86C.4070704@netwok.org> Message-ID: <4CD4E0E0.8000008@netwok.org> According to #python-dev, there's no need to go through python-checkins/-dev for typos, so I fixed this one in r86247. Piratical regards From eric at trueblade.com Sat Nov 6 11:43:45 2010 From: eric at trueblade.com (Eric Smith) Date: Sat, 06 Nov 2010 06:43:45 -0400 Subject: [Python-Dev] [Python-checkins] r86170 - in python/branches/py3k: Doc/library/stdtypes.rst Lib/test/test_unicode.py Misc/NEWS Objects/stringlib/string_format.h Objects/unicodeobject.c In-Reply-To: <4CD4E4C0.2060706@gmail.com> References: <20101104170658.E1303EE9D4@mail.python.org> <4CD4E4C0.2060706@gmail.com> Message-ID: <4CD53161.3030609@trueblade.com> On 11/6/10 1:16 AM, Ezio Melotti wrote: >> +.. method:: str.format_map(mapping) >> + >> + Similar to ``str.forrmat(**mapping)``, except that ``mapping`` is >> + used directly and not copied to a :class:`dict` . This is useful >> + if for example ``mapping`` is a dict subclass. > > > Including the "__missing__" example might be better. 
From the > description, it's not clear why str.format(**dict_subclass) wouldn't > work and that the previous line refers to the fact that ** converts the > mapping in a plain dict, thus making __missing__ and other things unusable. I agree, but I was hesitant to add a long example. But thinking about it some more I think I'll add it. >> + >> + self.assertEqual('{foo._x}'.format_map({'foo': C(20)}), '20') >> + > > > The classes D-H seem unused, did you forget to add some tests or am I > missing something? It was a big copy job from the other tests. I'll review them all. >> +PyDoc_STRVAR(format_map__doc__, >> + "S.format_map(mapping) -> str\n\ >> +\n\ >> +"); >> + > > > Wouldn't a more verbose docstring be better? (str.format seems to lack > one too) Undoubtedly true. Any suggestions? How about (for .format): "Returns S formatted with substitutions from args and kwargs." I also see that __format__'s docstring is similarly terse. Thanks for reviewing! -- Eric. From victor.stinner at haypocalc.com Sat Nov 6 12:19:55 2010 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sat, 6 Nov 2010 12:19:55 +0100 Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2 3.x" buildbot Message-ID: <201011061219.55473.victor.stinner@haypocalc.com> Hi, I noticed "OSError: [Errno 23] Too many open files in system" errors on your FreeBSD buildbot. I would like to know if you configured a limit on the open files or maybe of child processes on this buildbot or not, or if it is a failure in Python? The first error always occurs in the first test of test_concurrent_futures. It's maybe because this test uses a lot of open files or processes? 
Victor From martin at v.loewis.de Sat Nov 6 12:31:39 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 06 Nov 2010 12:31:39 +0100 Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2 3.x" buildbot In-Reply-To: <201011061219.55473.victor.stinner@haypocalc.com> References: <201011061219.55473.victor.stinner@haypocalc.com> Message-ID: <4CD53C9B.7060804@v.loewis.de> Am 06.11.2010 12:19, schrieb Victor Stinner: > Hi, > > I noticed "OSError: [Errno 23] Too many open files in system" errors on your > FreeBSD buildbot. I would like to know if you configured a limit on the open > files or maybe of child processes on this buildbot or not, or if it is a > failure in Python? Before David responds: feel free to put temporarily a "limits -a" command into the build process, or some such. Regards, Martin From lvh at laurensvh.be Sat Nov 6 13:53:12 2010 From: lvh at laurensvh.be (Laurens Van Houtven) Date: Sat, 6 Nov 2010 13:53:12 +0100 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: References: <4CD23A19.6080002@archlinux.org> <4CD43B5A.3070003@voidspace.org.uk> Message-ID: I'm sorry you feel that way. Experience teaches us that people do speak up more than they tend to keep schtum. We do get feedback on most things, including the "NO ARCH" rule. At least so far, responses have not been anywhere near what you'd expect if you'd tell people to "RTFM n00b" (in terms of defensiveness and verbal hostility, at least). From the things I've seen (and I've asked other regulars, they seem to agree), the related interactions have been short, clear, and cordial. The first is important to #python because it keeps the signal to noise ratio high. The second is important to the person with the broken package, so they know what to do to fix it and how to get it fixed for other people as well. The last part is important to everyone. 
As usual, any and all policy is up for debate, but I really see too much result (not just for #python, but for the people with the broken package as well) and too little badness to consider taking it down right now. I believe I speak for all of the ops and regulars in #python when I say that. Even Allan himself has said that he agrees with the rule, and yes: I do honestly believe that right now, it is the best thing we can actually *do*. That doesn't mean it has to be the best thing bar none: like with software projects, "patches welcome", if you have any suggestions for improving this, we're all ears. However, I've already said: this is temporary, it's going down as soon as we stop getting feedback on it. (Checking if that has occurred or not is in my tickler file for next Friday.) It has already been pointed out in this thread that Arch is a distro with a target audience of above average knowledge. Yes, the rule does expect people to understand the difference between an Arch-specific problem and something that's likely to be unrelated to whatever distro it is you're running. Even the people who do feel instantly offended and just leave without asking questions, hey, at least they're likely to go to Arch-specific spots next for support, and that's the right place (FWIW I do not believe this to be a significant amount of people). Also, sometimes pointing people to the FM is just the only reasonable thing left to do. If you've got recent-ish logs (24h) I can give you a recent prime example of that. I do doubt that anyone used terminology like 'RTFM n00b'. If you think 'NO ARCH' is the same kind of language, well, we'll just have to agree to disagree there. I could see how someone would think that, but IRC typically forces people to be more brief, and a lot of people understandably mistake that for being blunt or even downright rude. That's an unfortunate side effect of the medium that pretty much every large channel I know of has had to deal with in some way.
cheers lvh From martin at v.loewis.de Sat Nov 6 14:41:08 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 06 Nov 2010 14:41:08 +0100 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: References: <4CD23A19.6080002@archlinux.org> <877hgsior1.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CD55AF4.40308@v.loewis.de> > But the previous consensus (at least, as I, and presumably many other > people understood it) was that python2 would remain the owner of the > name "/usr/bin/python" for the indefinite future, and python3 would > be invoked with /usr/bin/python3. Can you cite references for that (not that other people agree, but that this was consensus)? I couldn't find any summary report of the 2009 language summit, and, despite having been present there, I don't recall that aspect even being discussed. Instead, I recall that a decision was made (and I'm not sure whether with consensus or not) that "make install" would install /usr/bin/python3, for the time being. Period. So I don't recall a decision that there shouldn't be a python2 binary, nor a decision that anything is done indefinitely (it may be that the decision was actually just about 3.1 - changing it again for 3.2 would require another decision, but certainly can't be ruled out categorically). Regards, Martin From g.brandl at gmx.net Sat Nov 6 15:38:22 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 06 Nov 2010 15:38:22 +0100 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: <4CD4DD3A.9040808@gmail.com> References: <20101105170819.89E4B7820C@psf.upfronthosting.co.za> <4CD4DD3A.9040808@gmail.com> Message-ID: Am 06.11.2010 05:44, schrieb Ezio Melotti: > Hi, > > On 05/11/2010 19.08, Python tracker wrote: >> ACTIVITY SUMMARY (2010-10-29 - 2010-11-05) >> Python tracker at http://bugs.python.org/ >> >> To view or respond to any of the issues listed below, click on the issue. >> Do NOT respond to this message. 
>> >> Issues counts and deltas: >> open 2514 (+17) >> closed 19597 (+78) >> total 22111 (+95) > > as suggested in recent mails[0][1] I changed these values to represent > the deltas with the previous week. > Now let's try to keep the "open" delta negative ;) That is a worthy goal, however the difference between the "open" and "closed" deltas is already quite an achievement and shows that our triage works. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From rdmurray at bitdance.com Sat Nov 6 15:46:57 2010 From: rdmurray at bitdance.com (R. David Murray) Date: Sat, 06 Nov 2010 10:46:57 -0400 Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2 3.x" buildbot In-Reply-To: <4CD53C9B.7060804@v.loewis.de> References: <201011061219.55473.victor.stinner@haypocalc.com> <4CD53C9B.7060804@v.loewis.de> Message-ID: <20101106144657.541191FCC6D@kimball.webabinitio.net> On Sat, 06 Nov 2010 12:31:39 +0100, Martin wrote: > Am 06.11.2010 12:19, schrieb Victor Stinner: > > Hi, > > > > I noticed "OSError: [Errno 23] Too many open files in system" errors on your > > FreeBSD buildbot. I would like to know if you configured a limit on the open > > files or maybe of child processes on this buildbot or not, or if it is a > > failure in Python? > > Before David responds: feel free to put temporarily a "limits -a" > command into the build process, or some such. You might also want to check the value of sysctl kern.maxfiles. 
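A minimal sketch of how the per-process limits discussed here could also be read from inside Python, assuming a Unix platform where the standard `resource` module is available. It only reports the soft and hard limits of the current process (`RLIMIT_NOFILE`, `RLIMIT_NPROC`); it does not read system-wide settings such as kern.maxfiles, and the helper names `describe_limit` and `format_value` are made up for illustration.

```python
import resource


def describe_limit(name):
    """Return (soft, hard) for a named RLIMIT_* constant, or None if absent."""
    const = getattr(resource, name, None)  # some platforms lack some limits
    if const is None:
        return None
    return resource.getrlimit(const)


def format_value(value):
    """Render a limit value, spelling out the 'no limit' sentinel."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


# Collect the two limits most relevant to "too many open files" reports.
limits = {name: describe_limit(name)
          for name in ("RLIMIT_NOFILE", "RLIMIT_NPROC")}

for name, pair in limits.items():
    if pair is None:
        print("{}: not available on this platform".format(name))
    else:
        soft, hard = pair
        print("{}: soft={} hard={}".format(
            name, format_value(soft), format_value(hard)))
```

Temporarily printing this at the start of a test run would show the limits the test process actually inherits, which may differ from those of an interactive shell.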
On the FreeBSD (6.3) systems to which I have access the default value for kern.maxfiles appears to be 12328, but that information is of limited utility since its value is set based on kern.maxusers, which in turn is set at boot time based primarily on the available system memory (see: http://www.freebsd.org/doc/handbook/configtuning-kernel-limits.html) The systems I got the above number from have 1GB of memory. -- R. David Murray www.bitdance.com From rdmurray at bitdance.com Sat Nov 6 16:42:03 2010 From: rdmurray at bitdance.com (R. David Murray) Date: Sat, 06 Nov 2010 11:42:03 -0400 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: References: <20101105170819.89E4B7820C@psf.upfronthosting.co.za> <4CD4DD3A.9040808@gmail.com> Message-ID: <20101106154204.214DA21E583@kimball.webabinitio.net> On Sat, 06 Nov 2010 15:38:22 +0100, Georg Brandl wrote: > Am 06.11.2010 05:44, schrieb Ezio Melotti: > > Hi, > > > > On 05/11/2010 19.08, Python tracker wrote: > >> ACTIVITY SUMMARY (2010-10-29 - 2010-11-05) > >> Python tracker at http://bugs.python.org/ > >> > >> To view or respond to any of the issues listed below, click on the issue. > >> Do NOT respond to this message. > >> > >> Issues counts and deltas: > >> open 2514 (+17) > >> closed 19597 (+78) > >> total 22111 (+95) > > > > as suggested in recent mails[0][1] I changed these values to represent > > the deltas with the previous week. > > Now let's try to keep the "open" delta negative ;) > > That is a worthy goal, however the difference between the "open" and "closed" > deltas is already quite an achievement and shows that our triage works. Agreed. We did have negative open deltas for several weeks running in October. Kudos to everyone involved, and lets do it some more :) I'm looking forward to making a non-trivial dent in the open count during the bug weekend on the 20th/21st. -- R. 
David Murray www.bitdance.com From martin at v.loewis.de Sat Nov 6 17:00:42 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 06 Nov 2010 17:00:42 +0100 Subject: [Python-Dev] [Python-checkins] r86264 - python/branches/release27-maint/Lib/distutils/sysconfig.py In-Reply-To: <20101106141631.59358EEADD@mail.python.org> References: <20101106141631.59358EEADD@mail.python.org> Message-ID: <4CD57BAA.1030301@v.loewis.de> > Remove one trace of Mac OS 9 support (#7908 follow-up) > > This was overlooked in r80804. This change is not really a bug fix, I'm skeptical that this change should be carried out, then. It's easy to argue that this can't possibly hurt (but I can certainly come up with code that will break under that change); however, I fail to see what good it does. Regards, Martin From merwok at netwok.org Sat Nov 6 17:26:50 2010 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Sat, 06 Nov 2010 17:26:50 +0100 Subject: [Python-Dev] [Python-checkins] r86264 - python/branches/release27-maint/Lib/distutils/sysconfig.py In-Reply-To: <4CD57BAA.1030301@v.loewis.de> References: <20101106141631.59358EEADD@mail.python.org> <4CD57BAA.1030301@v.loewis.de> Message-ID: <4CD581CA.1020905@netwok.org> > I'm skeptical that this change should be carried out, then. Yes, I asked myself the same question. I had done the svnmerge from py3k, and when I saw that only one change was left, I wondered whether I should commit or revert. > It's easy to argue that this can't possibly hurt (but I can certainly > come up with code that will break under that change); however, I fail > to see what good it does. This was a private function used on an unsupported platform, this should do no harm. We've been bitten by "should do no harm"
before though, so I am ready to revert this change (and learn from this :) Regards From martin at v.loewis.de Sat Nov 6 17:33:06 2010 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sat, 06 Nov 2010 17:33:06 +0100 Subject: [Python-Dev] [Python-checkins] r86264 - python/branches/release27-maint/Lib/distutils/sysconfig.py In-Reply-To: <4CD581CA.1020905@netwok.org> References: <20101106141631.59358EEADD@mail.python.org> <4CD57BAA.1030301@v.loewis.de> <4CD581CA.1020905@netwok.org> Message-ID: <4CD58342.6040102@v.loewis.de> > This was a private function used on an unsupported platform, this should > do no harm. We've been bitten by "should do no harm" before though, so > I am ready to revert this change (and learn from this :) Do as you like. I won't insist on it being reverted. It's rather a matter of agreeing when moving forward: IMO, mere style changes, code cleanup etc shouldn't be applied to the bug fix branches, as their only purpose is to provide bug fixes for existing users. Regards, Martin From tjreedy at udel.edu Sat Nov 6 17:55:45 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 06 Nov 2010 12:55:45 -0400 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: References: <4CD23A19.6080002@archlinux.org> <4CD43B5A.3070003@voidspace.org.uk> Message-ID: On 11/6/2010 8:53 AM, Laurens Van Houtven wrote: > Experience teaches us that people do speak up more than they tend to > keep schtum. We do get feedback on most things, including the "NO > ARCH" rule. It strikes me as reasonable to warn people that they would be wasting their time typing out a multiline question about problems with the new Arch distro. They can always ask briefly 'Why NO ARCH' and get back 'Beyond our knowledge' (or a longer pasted response).
-- Terry Jan Reedy From tjreedy at udel.edu Sat Nov 6 18:01:56 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 06 Nov 2010 13:01:56 -0400 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: <20101106154204.214DA21E583@kimball.webabinitio.net> References: <20101105170819.89E4B7820C@psf.upfronthosting.co.za> <4CD4DD3A.9040808@gmail.com> <20101106154204.214DA21E583@kimball.webabinitio.net> Message-ID: On 11/6/2010 11:42 AM, R. David Murray wrote: > On Sat, 06 Nov 2010 15:38:22 +0100, Georg Brandl wrote: >> Am 06.11.2010 05:44, schrieb Ezio Melotti: >>> Hi, >>> >>> On 05/11/2010 19.08, Python tracker wrote: >>>> ACTIVITY SUMMARY (2010-10-29 - 2010-11-05) >>>> Python tracker at http://bugs.python.org/ >>>> >>>> To view or respond to any of the issues listed below, click on the issue. >>>> Do NOT respond to this message. >>>> >>>> Issues counts and deltas: >>>> open 2514 (+17) This seems wrong. A default search for open issues returns 2452 and it was about the same yesterday just a few hours after the report. >>>> closed 19597 (+78) >>>> total 22111 (+95) >>> >>> as suggested in recent mails[0][1] I changed these values to represent >>> the deltas with the previous week. >>> Now let's try to keep the "open" delta negative ;) Since there were more issues closed than opened I think it really was. Anyway, we are down 300 from the 2750 peak. >> That is a worthy goal, however the difference between the "open" and "closed" >> deltas is already quite an achievement and shows that our triage works. > > Agreed. > > We did have negative open deltas for several weeks running in October. > Kudos to everyone involved, and lets do it some more :) I'm looking > forward to making a non-trivial dent in the open count during the bug > weekend on the 20th/21st. 
-- Terry Jan Reedy From tjreedy at udel.edu Sat Nov 6 18:22:35 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 06 Nov 2010 13:22:35 -0400 Subject: [Python-Dev] [Python-checkins] r86264 - python/branches/release27-maint/Lib/distutils/sysconfig.py In-Reply-To: <4CD58342.6040102@v.loewis.de> References: <20101106141631.59358EEADD@mail.python.org> <4CD57BAA.1030301@v.loewis.de> <4CD581CA.1020905@netwok.org> <4CD58342.6040102@v.loewis.de> Message-ID: On 11/6/2010 12:33 PM, "Martin v. Löwis" wrote: >> This was a private function used on an unsupported platform, this should >> do no harm. We've been bitten by "should do no harm" before though, so >> I am ready to revert this change (and learn from this :) > > Do as you like. I won't insist on it being reverted. > > It's rather a matter of agreeing when moving forward: IMO, mere style > changes, code cleanup etc shouldn't be applied to the bug fix branches, > as their only purpose is to provide bug fixes for existing users. The omission of the deletion from the 5/5 revision was a bug in that revision. If the removal of OS9 support was documented (announced), which I presume it was, then one could consider any visible trace remaining to be a bug. Perhaps the policy on code cleanup should be a bit more liberal for 2.7 *because* it will be maintained for several years and *because* there is no newer 2.x branch to apply changes to. If I were going to maintain 2.7 for several years, I would want to have the benefit of gradual improvements that make maintenance easier. Applying such a cleanup to 3.1, say, is less necessary because a) the code will soon be end-of-lifed and not maintained much and b) it can be applied to the newer (3.2) branch and benefit that and all future releases thereafter.
-- Terry Jan Reedy From ezio.melotti at gmail.com Sat Nov 6 18:25:37 2010 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Sat, 06 Nov 2010 19:25:37 +0200 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: References: <20101105170819.89E4B7820C@psf.upfronthosting.co.za> <4CD4DD3A.9040808@gmail.com> <20101106154204.214DA21E583@kimball.webabinitio.net> Message-ID: <4CD58F91.8040806@gmail.com> On 06/11/2010 19.01, Terry Reedy wrote: > On 11/6/2010 11:42 AM, R. David Murray wrote: >> On Sat, 06 Nov 2010 15:38:22 +0100, Georg Brandl >> wrote: >>> Am 06.11.2010 05:44, schrieb Ezio Melotti: >>>> Hi, >>>> >>>> On 05/11/2010 19.08, Python tracker wrote: >>>>> ACTIVITY SUMMARY (2010-10-29 - 2010-11-05) >>>>> Python tracker at http://bugs.python.org/ >>>>> >>>>> To view or respond to any of the issues listed below, click on the >>>>> issue. >>>>> Do NOT respond to this message. >>>>> >>>>> Issues counts and deltas: >>>>> open 2514 (+17) > > This seems wrong. A default search for open issues returns 2452 and it > was about the same yesterday just a few hours after the report. > That's because the "open" count includes about 25 languishing and 39 pending issues (technically they are still open). >>>>> closed 19597 (+78) >>>>> total 22111 (+95) >>>> >>>> as suggested in recent mails[0][1] I changed these values to represent >>>> the deltas with the previous week. >>>> Now let's try to keep the "open" delta negative ;) > > Since there were more issues closed than opened I think it really was. > Anyway, we are down 300 from the 2750 peak. > >>> That is a worthy goal, however the difference between the "open" and >>> "closed" >>> deltas is already quite an achievement and shows that our triage works. Yes, even if having a negative delta would be best, having the "closed" delta higher than the "open" one is still very good. So congrats to everyone who worked and works to make this possible. >> >> Agreed.
>> >> We did have negative open deltas for several weeks running in October. >> Kudos to everyone involved, and lets do it some more :) I'm looking >> forward to making a non-trivial dent in the open count during the bug >> weekend on the 20th/21st. > Best Regards, Ezio Melotti From g.rodola at gmail.com Sat Nov 6 18:53:30 2010 From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=) Date: Sat, 6 Nov 2010 18:53:30 +0100 Subject: [Python-Dev] SSH access against buildbot boxes Message-ID: Hi, sorry in advance if this sounds a little indiscreet, but I think it would be great if we'd have SSH access against some of the computers used to host buildbots. Personally, I would find this particularly useful for OSX since it's one of the few OSes I can't manage to virtualize and which often causes me problems. Some examples: http://bugs.python.org/issue10340 http://bugs.python.org/issue8490 (this one also involves Solaris) In such cases I would find more easy to be able to connect to the machine and test myself rather than create a separate branch, commit, schedule a buildbot run, wait for it to complete and see whether everything is "green". On the other side I perfectly understand how opening up blanket ssh access is not something everyone is comfortable with doing. AFAICR there was someone who was setting up an environment to solve exactly this problem but I'm not sure whether this is already usable. Best regards, --- Giampaolo http://code.google.com/p/pyftpdlib/ http://code.google.com/p/psutil/ From rrr at ronadam.com Sat Nov 6 20:18:43 2010 From: rrr at ronadam.com (Ron Adam) Date: Sat, 06 Nov 2010 14:18:43 -0500 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: References: <20101105170819.89E4B7820C@psf.upfronthosting.co.za> <4CD4DD3A.9040808@gmail.com> <20101106154204.214DA21E583@kimball.webabinitio.net> Message-ID: <4CD5AA13.8040704@ronadam.com> On 11/06/2010 12:01 PM, Terry Reedy wrote: > On 11/6/2010 11:42 AM, R.
David Murray wrote: >> On Sat, 06 Nov 2010 15:38:22 +0100, Georg Brandl wrote: >>> Am 06.11.2010 05:44, schrieb Ezio Melotti: >>>> Hi, >>>> >>>> On 05/11/2010 19.08, Python tracker wrote: >>>>> ACTIVITY SUMMARY (2010-10-29 - 2010-11-05) >>>>> Python tracker at http://bugs.python.org/ >>>>> >>>>> To view or respond to any of the issues listed below, click on the >>>>> issue. >>>>> Do NOT respond to this message. >>>>> >>>>> Issues counts and deltas: >>>>> open 2514 (+17) > > This seems wrong. A default search for open issues returns 2452 and it > was about the same yesterday just a few hours after the report. > >>>>> closed 19597 (+78) >>>>> total 22111 (+95) >>>> >>>> as suggested in recent mails[0][1] I changed these values to represent >>>> the deltas with the previous week. >>>> Now let's try to keep the "open" delta negative ;) > > Since there were more issues closed than opened I think it really was. > Anyway, we are down 300 from the 2750 peak. Current status from the tracker... don't care: 22134 not closed: 2491 not selected: 1 open: 2451 languishing: 25 pending: 39 closed: 19604 That gives us... 2451 open 1 not selected 39 pending 25 languishing ---- 2516 Total open 2451 open 39 languishing 1 not selected ---- 2491 total "not closed" 19604 closed 2491 not closed 39 pending ----- 22134 Total issues My guess as to how this got this way, is that different fields were merged at some time where the meanings didn't quite match up. It would be nicer if... closed + not_closed = total issues closed + open + not_selected = total issues Pending and languishing should be keywords or sub categories of open. 
Cheers, Ron
From martin at v.loewis.de Sat Nov 6 20:15:20 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sat, 06 Nov 2010 20:15:20 +0100 Subject: [Python-Dev] SSH access against buildbot boxes In-Reply-To: References: Message-ID: <4CD5A948.6070206@v.loewis.de> > sorry in advance if this sounds a little indiscreet, but I think it > would be great if we'd have SSH access against some of the computers > used to host buildbots. The only way this can work is on a bilateral basis. If you need shell access to one of the build slaves, contact the slave operator. Regards, Martin From rrr at ronadam.com Sat Nov 6 20:30:47 2010 From: rrr at ronadam.com (Ron Adam) Date: Sat, 06 Nov 2010 14:30:47 -0500 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: <4CD5AA13.8040704@ronadam.com> References: <20101105170819.89E4B7820C@psf.upfronthosting.co.za> <4CD4DD3A.9040808@gmail.com> <20101106154204.214DA21E583@kimball.webabinitio.net> <4CD5AA13.8040704@ronadam.com> Message-ID: <4CD5ACE7.1030005@ronadam.com> > Current status from the tracker... > > don't care: 22134 > not closed: 2491 > not selected: 1 > > open: 2451 > languishing: 25 > pending: 39 > closed: 19604 > > > That gives us... > > 2451 open > 1 not selected > 39 pending > 25 languishing > ---- > 2516 Total open > > > 2451 open > 39 languishing Should be 39 pending here, not languishing.
--Ron > 1 not selected > ---- > 2491 total "not closed" > > > 19604 closed > 2491 not closed > 39 pending > ----- > 22134 Total issues From eric at trueblade.com Sat Nov 6 20:38:14 2010 From: eric at trueblade.com (Eric Smith) Date: Sat, 06 Nov 2010 15:38:14 -0400 Subject: [Python-Dev] [Python-checkins] r86170 - in python/branches/py3k: Doc/library/stdtypes.rst Lib/test/test_unicode.py Misc/NEWS Objects/stringlib/string_format.h Objects/unicodeobject.c In-Reply-To: <4CD53161.3030609@trueblade.com> References: <20101104170658.E1303EE9D4@mail.python.org> <4CD4E4C0.2060706@gmail.com> <4CD53161.3030609@trueblade.com> Message-ID: <4CD5AEA6.8060609@trueblade.com> On 11/6/10 6:43 AM, Eric Smith wrote: > On 11/6/10 1:16 AM, Ezio Melotti wrote: I've addressed all of these issues, although if anyone has suggestions for the docstrings or documentation they'd be appreciated. Thanks again. -- Eric.
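The "__missing__" case raised in this review thread can be sketched as follows. This is only an illustration of the behaviour being described, not the example that ended up in the documentation; the class name `DefaultingDict` and the placeholder format are made up for the sketch.

```python
class DefaultingDict(dict):
    """dict subclass that supplies a placeholder for missing keys."""

    def __missing__(self, key):
        # Called by dict.__getitem__ on subclasses when the key is absent.
        return "<{}>".format(key)


mapping = DefaultingDict(foo="spam")

# format_map() uses the mapping object itself, so __missing__ is consulted
# for the absent key 'bar'.
direct = "{foo} {bar}".format_map(mapping)

# format(**mapping) unpacks the mapping into a plain dict of keyword
# arguments first, so the subclass behaviour is lost and 'bar' is missing.
try:
    "{foo} {bar}".format(**mapping)
    unpacked_error = None
except KeyError as exc:
    unpacked_error = exc

print(direct)
print(repr(unpacked_error))
```

The contrast between the two calls is the point: the same mapping works with `format_map` and fails with `**` unpacking, because the callee only ever sees an ordinary dict copy of the keyword arguments.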
From me+python at ixokai.io Sat Nov 6 21:19:32 2010 From: me+python at ixokai.io (Stephen Hansen) Date: Sat, 06 Nov 2010 13:19:32 -0700 Subject: [Python-Dev] SSH access against buildbot boxes In-Reply-To: References: Message-ID: <4CD5B854.80607@ixokai.io> On 11/6/10 10:53 AM, Giampaolo Rodolà wrote: > Personally, I would find this particularly useful for OSX since it's > one of the few OSes I can't manage to virtualize and which often > causes me problems. Although I said this on IRC, I'll repeat the offer to the list for those not present -- I'm operating the Leopard and Snow Leopard buildslaves, and although I try to be proactive watching for failures, if someone wants to test something out before committing they can poke me and I'd be happy to help. I can either run a test or two and report back to you, or if you need it I can open up SSH or even VNC access on a temporary/as-needed basis. Heck, if you're doing some longer-term work that is more than just debugging a certain issue and would need access over a longer period of time, I can probably work something out for you. I'm just not comfortable opening up such access except on a person-by-person/case-by-case basis. I idle on #python-dev as "ixokai" -- you can ping me there and I generally wake up rather promptly. That, or email works too. -- Stephen Hansen ... Also: Ixokai ... Mail: me+python (AT) ixokai (DOT) io ... Blog: http://meh.ixokai.io/ -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 487 bytes Desc: OpenPGP digital signature URL: From db3l.net at gmail.com Sat Nov 6 22:36:44 2010 From: db3l.net at gmail.com (David Bolen) Date: Sat, 6 Nov 2010 17:36:44 -0400 Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2 3.x" buildbot In-Reply-To: References: <201011061219.55473.victor.stinner@haypocalc.com> Message-ID: On Sat, Nov 6, 2010 at 7:19 AM, Victor Stinner > > I noticed "OSError: [Errno 23] Too many open files in system" errors on your > FreeBSD buildbot. I would like to know if you configured a limit on the open > files or maybe of child processes on this buildbot or not, or if it is a > failure in Python? > > The first error always occurs in the first test of test_concurrent_futures. It's > maybe because this test uses a lot of open files or processes? I couldn't find the matching failures that you're talking about, but then I figured out you mean the FreeBSD7 (7.2) buildbot, not the FreeBSD (6.4) buildbot .... I haven't configured any specific limits with respect to open files. On both FreeBSD buildbots, kern.maxfiles is 3600 and kern.maxfilesperproc is 3060. Both have limits of 1530 processes. The latter also agrees with the maximum descriptors as shown by limit. In regards to R. David Murray's response, the buildbots are VMs with limited memory, so the dynamic calculation he references for descriptors is much lower than his system. Looks like the reason FreeBSD is ok, and FreeBSD7 isn't, is because the relevant tests don't run due to lack of POSIX semaphore support. I manually enabled their use on FreeBSD7 a while back (11/2009, issue7272) since they aren't on by default. I'd be surprised if at least test_multiprocessing didn't pass at that point (since that's what the issue was for) but even it seems to be generating the open files error now. The buildbots haven't changed, but I suppose the tests might just have grown in the number of files they need over time.
I noticed that the failures seem to always be on a semaphore call. Some quick googling found a few references that seems to imply that the number of posix semaphores are very limited (like 30), and can't be changed without recompiling the kernel from source. So that's not so big a threshold for the tests to have perhaps started crossing since issue7272 was fixed. Certainly seems more likely than 3000+ files or 1500+ processes. I wonder if it's possible to deduce if this started recently or not? The web buildbot interface doesn't go back that far, and an additional complexity is that the FreeBSD builds tend to have various errors somewhat consistently over time, but perhaps there are server logs we can grep for this particular error? Not sure if the best approach at this point is to see if the tests can use fewer semaphores, skip these tests under FreeBSD 7 like 6, or if it's important enough to compile a new kernel with a higher semaphore limit. -- David From db3l.net at gmail.com Sat Nov 6 23:35:44 2010 From: db3l.net at gmail.com (David Bolen) Date: Sat, 06 Nov 2010 18:35:44 -0400 Subject: [Python-Dev] SSH access against buildbot boxes References: Message-ID: Giampaolo Rodol? writes: > In such cases I would find more easy to be able to connect to the > machine and test myself rather than create a separate branch, commit, > schedule a buildbot run, wait for it to complete and see whether > everything is "green". I agree with both Stephen and Martin's prior responses. For me, I'm happy to arrange for individual access on a case by case basis, but am less comfortable leaving access enabled permanently. I've arranged access to both my Windows and FreeBSD buildbots in the past, and while I suspect my OSX Tiger buildbot may be a little less interesting than the other OSX boxes, the offer remains open for any of my buildbots. 
-- David From victor.stinner at haypocalc.com Sun Nov 7 04:30:54 2010 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sun, 7 Nov 2010 04:30:54 +0100 Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2 3.x" buildbot In-Reply-To: References: <201011061219.55473.victor.stinner@haypocalc.com> Message-ID: <201011070430.54085.victor.stinner@haypocalc.com> On Saturday 06 November 2010 22:36:44 you wrote: > I couldn't find the matching failures that you're talking about, but > then I figured out you mean the FreeBSD7 (7.2) buildbot, not the > FreeBSD (6.4) buildbot .... Search "test_concurrent_futures" in: http://www.python.org/dev/buildbot/builders/x86%20FreeBSD%207.2%203.x/builds/1154/steps/test/logs/stdio I specified "x86 FreeBSD 7.2 3.x" in the email title. > (...) > I noticed that the failures seem to always be on a semaphore call. > Some quick googling found a few references that seems to imply that > the number of posix semaphores are very limited (like 30), and can't > be changed without recompiling the kernel from source. So that's not > so big a threshold for the tests to have perhaps started crossing > since issue7272 was fixed. Certainly seems more likely than 3000+ > files or 1500+ processes. Nice catch. The problem is the total number of semaphores: I reproduced the bug in my FreeBSD 8 VM. The first test fails at the creation of the 31st semaphore. The first failing test is test_all_completed. And it looks like this test doesn't destroy the semaphore at exit: my counter (based on __init__/__del__) is still at 15 when exiting the test! > I wonder if it's possible to deduce if this started recently or not? > The web buildbot interface doesn't go back that far, and an additional > complexity is that the FreeBSD builds tend to have various errors > somewhat consistently over time, but perhaps there are server logs we > can grep for this particular error? No idea.
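[Editorial aside] The __init__/__del__ counter mentioned above is not shown in the thread. A hypothetical equivalent, assuming nothing about the real instrumentation, could be as small as the sketch below; CountedResource is an invented name standing in for whatever class wraps the semaphore.

```python
class CountedResource:
    """Stand-in for a semaphore wrapper that tracks how many instances are alive."""

    live = 0  # class-wide count of instances not yet finalized

    def __init__(self):
        type(self).live += 1

    def __del__(self):
        # Runs when the object is finalized; a leaked object never gets here.
        type(self).live -= 1
```

If the count is still above zero when a test finishes, some instances are still referenced (or were never finalized), which is the kind of evidence described in the message.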
> Not sure if the best approach at this point is to see if the tests can > use fewer semaphores, skip these tests under FreeBSD 7 like 6, or if > it's important enough to compile a new kernel with a higher semaphore > limit. You wrote that the "POSIX" semaphores are very limited. Does it mean that there is another kind of semaphore with a higher limit? I don't think that skipping the test is a good idea: it looks like a real bug in (a limitation of) the ProcessPoolExecutor implementation on FreeBSD. E.g. test_map fails on FreeBSD 7.2 with ProcessPoolExecutorTest which uses self.executor = futures.ProcessPoolExecutor(max_workers=1): only one worker process! It looks like it is possible to tune semaphore limits on FreeBSD, without recompiling the kernel, by using boot loader options (kern.ipc.sem* options). But asking FreeBSD users to tune their boot loader options in order to use the concurrent.futures module is not practical :-) Victor From ncoghlan at gmail.com Sun Nov 7 06:44:07 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 7 Nov 2010 15:44:07 +1000 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: <4CD55AF4.40308@v.loewis.de> References: <4CD23A19.6080002@archlinux.org> <877hgsior1.fsf@uwakimon.sk.tsukuba.ac.jp> <4CD55AF4.40308@v.loewis.de> Message-ID: On Sat, Nov 6, 2010 at 11:41 PM, "Martin v. Löwis" wrote: > Instead, I recall that a decision was made (and I'm not sure whether > with consensus or not) that "make install" would install > /usr/bin/python3, for the time being. Period. Indeed, that's my recollection as well. Whether python3 ever inherits the python symlink at some point in the future is a different question that has never really been discussed (and probably makes more sense at the distro level at this point in time - "python = Python 2.x, python3 = Python 3.x" will likely stand as python-dev's consensus recommendation for quite some time to come). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | 
Brisbane, Australia From ncoghlan at gmail.com Sun Nov 7 06:55:37 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 7 Nov 2010 15:55:37 +1000 Subject: [Python-Dev] SSH access against buildbot boxes In-Reply-To: References: Message-ID: On Sun, Nov 7, 2010 at 3:53 AM, Giampaolo Rodol? wrote: > In such cases I would find more easy to be able to connect to the > machine and test myself rather than create a separate branch, commit, > schedule a buildbot run, wait for it to complete and see whether > everything is "green". > > On the other side I perfectly understand how opening up blanket ssh > access is not something everyone is comfortable with doing. > AFAICR there was someone who was setting up an evironment to solve > exactly this problem but I'm not sure whether this is already usable. Dealing with exactly this problem is one of the goals of the Snakebite project. As far as I know, the folks behind that project are still working on it - I've cc'ed Trent Nelson to see if he can provide any additional info on the topic. Regards, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ezio.melotti at gmail.com Sat Nov 6 21:00:46 2010 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Sat, 06 Nov 2010 22:00:46 +0200 Subject: [Python-Dev] Summary of Python tracker Issues In-Reply-To: <20101106154204.214DA21E583@kimball.webabinitio.net> References: <20101105170819.89E4B7820C@psf.upfronthosting.co.za> <4CD4DD3A.9040808@gmail.com> <20101106154204.214DA21E583@kimball.webabinitio.net> Message-ID: <4CD5B3EE.8020709@gmail.com> On 06/11/2010 17.42, R. David Murray wrote: > On Sat, 06 Nov 2010 15:38:22 +0100, Georg Brandl wrote: >> Am 06.11.2010 05:44, schrieb Ezio Melotti: >>> Hi, >>> >>> On 05/11/2010 19.08, Python tracker wrote: >>>> ACTIVITY SUMMARY (2010-10-29 - 2010-11-05) >>>> Python tracker at http://bugs.python.org/ >>>> >>>> To view or respond to any of the issues listed below, click on the issue. 
>>>> Do NOT respond to this message. >>>> >>>> Issues counts and deltas: >>>> open 2514 (+17) >>>> closed 19597 (+78) >>>> total 22111 (+95) >>> as suggested in recent mails[0][1] I changed these values to represent >>> the deltas with the previous week. >>> Now let's try to keep the "open" delta negative ;) >> That is a worthy goal, however the difference between the "open" and "closed" >> deltas is already quite an achievement and shows that our triage works. > Agreed. > > We did have negative open deltas for several weeks running in October. > Kudos to everyone involved, and lets do it some more :) I'm looking > forward to making a non-trivial dent in the open count during the bug > weekend on the 20th/21st. Just to get a better idea I tried to plot a graph with the values of the last 13 weeks. The resulting image is attached to the mail. Best Regards, Ezio Melotti -------------- next part -------------- A non-text attachment was scrubbed... Name: issues.png Type: image/png Size: 57338 bytes Desc: not available URL: From martin at v.loewis.de Sun Nov 7 09:01:22 2010 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sun, 07 Nov 2010 09:01:22 +0100 Subject: [Python-Dev] [Python-checkins] r86264 - python/branches/release27-maint/Lib/distutils/sysconfig.py In-Reply-To: References: <20101106141631.59358EEADD@mail.python.org> <4CD57BAA.1030301@v.loewis.de> <4CD581CA.1020905@netwok.org> <4CD58342.6040102@v.loewis.de> Message-ID: <4CD65CD2.3020402@v.loewis.de> >> It's rather a matter of agreeing when moving forward: IMO, mere style >> changes, code cleanup etc shouldn't be applied to the bug fix branches, >> as their only purpose is to provide bug fixes for existing users. > > The omission of the deletion from the 5/5 revision was a bug in that > revision. If the removal of OS9 support was documented (announced), > which I presume it was, then one could consider any visible trace > remaining to be a bug. 
Well, the question is: can anything break due to the code removal. In principle, stuff *could* break even by a function that is supposedly unused, and had supposedly been removed. The problem is that a supposedly-unused function actually might be used somewhere, in some context unrelated to its intended usage. > Perhaps the policy on code cleanup should be a bit more liberal for 2.7 > *because* it will be maintained for several years and *because* there is > no newer 2.x branch to apply changes to. You mean, it's ok to break stuff with no gain in 2.7 bug fix releases? > If I were going to maintain 2.7 > for several years, I would want to have the benefit of gradual > improvements that make maintainance easier. I question whether cleanup on a maintenance branch makes maintenance easier. For example, one may (and I often do) compare the code base of the previous bug fix release with the upcoming one, to see whether any suspicious change accidentally was backported. Code cleanup is in the way of such analysis, making maintenance more difficult. Regards, Martin From trent at snakebite.org Sun Nov 7 12:24:59 2010 From: trent at snakebite.org (Trent Nelson) Date: Sun, 7 Nov 2010 06:24:59 -0500 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes) In-Reply-To: References: Message-ID: <4CD68C8B.4090004@snakebite.org> On 07-Nov-10 1:55 AM, Nick Coghlan wrote: > On Sun, Nov 7, 2010 at 3:53 AM, Giampaolo Rodol? wrote: >> In such cases I would find more easy to be able to connect to the >> machine and test myself rather than create a separate branch, commit, >> schedule a buildbot run, wait for it to complete and see whether >> everything is "green". >> >> On the other side I perfectly understand how opening up blanket ssh >> access is not something everyone is comfortable with doing. 
>> AFAICR there was someone who was setting up an evironment to solve >> exactly this problem but I'm not sure whether this is already usable. > > Dealing with exactly this problem is one of the goals of the Snakebite project. > > As far as I know, the folks behind that project are still working on > it - I've cc'ed Trent Nelson to see if he can provide any additional > info on the topic. Thanks for the ping Nick, I might have missed this otherwise. Good timing, too, as Titus and I were just discussing which low hanging fruit/pain points Snakebite should tackle first (now that all the server room stuff has finally been taken care of). Luckily, the problems that we faced 2.5 years ago when I came up with the idea of Snakebite are still just as ever present today ;-) 1. Not having access to buildbots is a pain when something doesn't work right. Doing dummy debug commits against trunk to try and coerce some more information out of a failing platform is painful. Losing a build slave entirely due to a particularly hard crash and requiring the assistance of the owner is also super frustrating. 2. The buildbot web interface for building non-(trunk|2.x|py3k) branches is also crazy unfriendly. Per-activity branches are a great way to isolate development, even with Subversion, but it kinda' blows that you don't *really* get any feedback about how your code behaves on other platforms until you re-integrate your changes back into a mainline branch. (I'm sure none of us have been masochistic enough to manually kick off individual builds for every platform via the buildbot web page after every commit to a non-standard branch.) So, enter Snakebite. We've got three racks filled with way more hardware than I should have ever purchased. Ignoring the overhead of having to set machines up and whatnot, let's just assume that over the next couple of months, if there's a platform we need a stable buildbot for, Snakebite can provide it. 
(And if we feel like bringing IRIX/MIPS and Tru64/Alphas back as primary platforms, we've got the hardware to do that, too ;-).) Now, the fact that they're all in the one place and under my complete control is a big advantage, as I can start addressing some of the pain points that lead me down this twisted path 2.5 years ago. I'd like to get some feedback from the development community on what they'd prefer. In my mind, I could take one of the following two steps: 1. Set up standard build slaves on all the platforms, but put something in place that allowed committers to ssh/mstsc in to said slaves when things go wrong in order to aid with debugging and/or maintaining general buildbot health (OK'ing modal crash dialogues on Windows, for example). 2. Address the second problem of the buildbot web interface sucking for non-standard branches. I'm thinking along the lines of a hack to buildbot, such that upon creation of new per-activity branches off a mainline, something magically runs in the background and sets up a complete buildbot view at python.snakebite.org/dev/buildbot/, just as if you were looking at a trunk buildbot page. I'm not sure how easy the second point will be when we switch to hg; and I'll admit if there have been any python-dev discussions about buildbot once we're on hg, I've missed them. Of course there's a third option, which is using the infrastructure I've mentioned to address a similarly annoying pain point I haven't thought of -- so feel free to mention anything else you'd like to see first instead of the above two things. Titus, for example, alluded to some nifty way for a committer to push his local hg branch/changes somewhere, such that it would kick off builds on multiple platforms in the same sorta' vein as point 2, but able to leverage cloud resources like Amazon's EC2, not just Snakebite hardware. Look forward to hearing some feedback! Regards, Trent. 
From ncoghlan at gmail.com Sun Nov 7 12:50:31 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 7 Nov 2010 21:50:31 +1000 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes) In-Reply-To: <4CD68C8B.4090004@snakebite.org> References: <4CD68C8B.4090004@snakebite.org> Message-ID: On Sun, Nov 7, 2010 at 9:24 PM, Trent Nelson wrote: > 1. ?Set up standard build slaves on all the platforms, but put something in > place that allowed committers to ssh/mstsc in to said slaves when things go > wrong in order to aid with debugging and/or maintaining general buildbot > health (OK'ing modal crash dialogues on Windows, for example). This sounds like a great place to start. Perhaps focus on one or two of the less common platforms first (e.g. FreeBSD 7 has been hitting a few semaphore issues lately). The big 3 (Windows/Mac/Linux) are usually reasonably well covered for debugging purposes by people that use them for development. > 2. ?Address the second problem of the buildbot web interface sucking for > non-standard branches. ?I'm thinking along the lines of a hack to buildbot, > such that upon creation of new per-activity branches off a mainline, > something magically runs in the background and sets up a complete buildbot > view at python.snakebite.org/dev/buildbot/, just as if you > were looking at a trunk buildbot page. > > I'm not sure how easy the second point will be when we switch to hg; and > I'll admit if there have been any python-dev discussions about buildbot once > we're on hg, I've missed them. With the switch to hg.python.org imminent, it may be better to focus on Hg for that part (unless you have other projects in mind that also use SVN). I believe Martin and/or Dirkjan have worked out the equivalent triggers and build commands needed to switch the buildbot fleet from svn to hg, but I'm not entirely certain about that one. 
Good to know things are still progressing though - traffic on the website news feed and the mailing list has been a little sparse this year ;) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From dirkjan at ochtman.nl Sun Nov 7 13:15:15 2010 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Sun, 7 Nov 2010 13:15:15 +0100 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes) In-Reply-To: <4CD68C8B.4090004@snakebite.org> References: <4CD68C8B.4090004@snakebite.org> Message-ID: On Sun, Nov 7, 2010 at 12:24, Trent Nelson wrote: > Titus, for example, alluded to some nifty way for a committer to push his > local hg branch/changes somewhere, such that it would kick off builds on > multiple platforms in the same sorta' vein as point 2, but able to leverage > cloud resources like Amazon's EC2, not just Snakebite hardware. Mozilla has something called the "try server", where people push changes like to any normal repositories, but the result is that it runs all the test suites they have. This lets people painlessly test stuff on all platforms before actually pushing it to one of the main repositories. On Sun, Nov 7, 2010 at 12:50, Nick Coghlan wrote: > With the switch to hg.python.org imminent, it may be better to focus > on Hg for that part (unless you have other projects in mind that also > use SVN). I believe Martin and/or Dirkjan have worked out the > equivalent triggers and build commands needed to switch the buildbot > fleet from svn to hg, but I'm not entirely certain about that one. Yeah, Martin has things for buildbot worked out. Notes about this are in the hg.python.org/pymigr repository. 
Cheers, Dirkjan From solipsis at pitrou.net Sun Nov 7 14:42:20 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 7 Nov 2010 14:42:20 +0100 Subject: [Python-Dev] [Python-checkins] r86264 - python/branches/release27-maint/Lib/distutils/sysconfig.py References: <20101106141631.59358EEADD@mail.python.org> <4CD57BAA.1030301@v.loewis.de> <4CD581CA.1020905@netwok.org> <4CD58342.6040102@v.loewis.de> <4CD65CD2.3020402@v.loewis.de> Message-ID: <20101107144220.78a3f3aa@pitrou.net> On Sun, 07 Nov 2010 09:01:22 +0100 "Martin v. Löwis" wrote: > > > If I were going to maintain 2.7 > > for several years, I would want to have the benefit of gradual > > improvements that make maintenance easier. > > I question whether cleanup on a maintenance branch makes maintenance > easier. It certainly does when using svnmerge. You can have many merge conflicts if cleanups on the dev branch aren't backported to the bugfix branches. Regards Antoine. From khamenya at gmail.com Sun Nov 7 11:19:11 2010 From: khamenya at gmail.com (Valery Khamenya) Date: Sun, 7 Nov 2010 11:19:11 +0100 Subject: [Python-Dev] rlcompleter -- auto-complete dictionary keys (+ tests) Message-ID: Hi, A) I have missed the auto-complete feature for dictionary keys a lot in the Python console. This patch seems to do the job. B) There are no rlcompleter tests in trunk for some reason. So, I've taken the 2.7.x test_rlcompleter.py and extended it. C) The patched rlcompleter as such works OK for unicode dictionary keys as well. All tests pass OK. HOWEVER, readline's completion mechanism seems to be confused by unicode strings -- see the comments in Completer.dict_key_matches(). So, perhaps, some changes should be applied to the readline code too. Attached: 1. rlcompleter.py (as for trunk) 2. test_rlcompleter (as for trunk) 3. rlcompleter_trunk_to_new.diff (created as: diff rlcompleter_trunk.py rlcompleter.py >rlcompleter_trunk_to_new.diff) P.S. thanks to kerio & ssbr on icq for advice.
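[Editorial aside] The attached patch itself was scrubbed from the archive, so its implementation is not visible here. Purely to illustrate the idea of completing dictionary keys, here is a standalone sketch; the function name complete_dict_key, the regular expression and the restriction to string keys are all assumptions of this example and should not be read as the behaviour of the patch's Completer.dict_key_matches().

```python
import re

# Matches partial input such as  d['fo  or  d["fo  (a name, "[", a quote, a key prefix).
_KEY_PREFIX = re.compile(
    r"""^(?P<name>[A-Za-z_]\w*)\[(?P<quote>['"])(?P<prefix>[^'"]*)$"""
)


def complete_dict_key(text, namespace):
    """Return sorted completions for a partially typed dictionary subscript."""
    match = _KEY_PREFIX.match(text)
    if match is None:
        return []
    obj = namespace.get(match.group("name"))
    if not isinstance(obj, dict):
        return []
    name = match.group("name")
    quote = match.group("quote")
    prefix = match.group("prefix")
    return sorted(
        "%s[%s%s%s]" % (name, quote, key, quote)
        for key in obj
        if isinstance(key, str) and key.startswith(prefix)
    )
```

For this to be reachable from readline at all, "[" and the quote characters must not be treated as word delimiters, which is why the completer delimiters have to be adjusted in the startup file.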
best regards -- Valery A.Khamenya -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: rlcompleter.py Type: text/x-python Size: 9001 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test_rlcompleter.py Type: text/x-python Size: 6134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: rlcompleter_trunk_to_new.diff Type: text/x-patch Size: 3674 bytes Desc: not available URL: From g.brandl at gmx.net Sun Nov 7 14:51:09 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 07 Nov 2010 14:51:09 +0100 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes) In-Reply-To: References: <4CD68C8B.4090004@snakebite.org> Message-ID: Am 07.11.2010 12:50, schrieb Nick Coghlan: > On Sun, Nov 7, 2010 at 9:24 PM, Trent Nelson wrote: >> 1. Set up standard build slaves on all the platforms, but put something in >> place that allowed committers to ssh/mstsc in to said slaves when things go >> wrong in order to aid with debugging and/or maintaining general buildbot >> health (OK'ing modal crash dialogues on Windows, for example). > > This sounds like a great place to start. Perhaps focus on one or two > of the less common platforms first (e.g. FreeBSD 7 has been hitting a > few semaphore issues lately). The big 3 (Windows/Mac/Linux) are > usually reasonably well covered for debugging purposes by people that > use them for development. > >> 2. Address the second problem of the buildbot web interface sucking for >> non-standard branches. 
I'm thinking along the lines of a hack to buildbot, >> such that upon creation of new per-activity branches off a mainline, >> something magically runs in the background and sets up a complete buildbot >> view at python.snakebite.org/dev/buildbot/, just as if you >> were looking at a trunk buildbot page. >> >> I'm not sure how easy the second point will be when we switch to hg; and >> I'll admit if there have been any python-dev discussions about buildbot once >> we're on hg, I've missed them. > > With the switch to hg.python.org imminent, it may be better to focus > on Hg for that part (unless you have other projects in mind that also > use SVN). I believe Martin and/or Dirkjan have worked out the > equivalent triggers and build commands needed to switch the buildbot > fleet from svn to hg, but I'm not entirely certain about that one. I've spent a good bit of time on that, and left all the instructions in the buildbot master config. I also adapted buildbot's hg hook to our situation (e.g. to send a change to multiple masters, as required for the community buildbots), so it should be quite easy to actually switch the buildbots over on migration day. Georg From foom at fuhm.net Sun Nov 7 15:57:06 2010 From: foom at fuhm.net (James Y Knight) Date: Sun, 7 Nov 2010 09:57:06 -0500 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: <4CD55AF4.40308@v.loewis.de> References: <4CD23A19.6080002@archlinux.org> <877hgsior1.fsf@uwakimon.sk.tsukuba.ac.jp> <4CD55AF4.40308@v.loewis.de> Message-ID: <027552CB-5AB8-47F4-A8CE-422D8862AE6D@fuhm.net> On Nov 6, 2010, at 9:41 AM, Martin v. L?wis wrote: > So I don't recall a decision that there shouldn't be a python2 > binary, The decision to make one would have to be an active decision, since Python has never installed one before. If there should be one, then the Python Makefile should make one by default. 
> nor a decision that anything is done indefinitely > (it may be that the decision was actually just about 3.1 - changing > it again for 3.2 would require another decision, but certainly can't > be ruled out categorically). When I said "indefinite", I meant "until some point in the future not yet determined", with an implied undertone of "not anytime soon". James From exarkun at twistedmatrix.com Sun Nov 7 17:25:18 2010 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Sun, 07 Nov 2010 16:25:18 -0000 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes) In-Reply-To: <4CD68C8B.4090004@snakebite.org> References: <4CD68C8B.4090004@snakebite.org> Message-ID: <20101107162518.2040.178068202.divmod.xquotient.717@localhost.localdomain> On 11:24 am, trent at snakebite.org wrote: > >2. Address the second problem of the buildbot web interface sucking >for non-standard branches. I'm thinking along the lines of a hack to >buildbot, such that upon creation of new per-activity branches off a >mainline, something magically runs in the background and sets up a >complete buildbot view at python.snakebite.org/dev/buildbot/branch-name>, just as if you were looking at a trunk buildbot page. This is basically trivial. I gave #python-dev a tool for forcing builds, dunno if anyone still has a copy, but it's easy to reconstruct from (which is what the Twisted project uses). Plus, you can add ?branch= to most BuildBot views to limit display of results to just builds for the named branch. >Titus, for example, alluded to some nifty way for a committer to push >his local hg branch/changes somewhere, such that it would kick off >builds on multiple platforms in the same sorta' vein as point 2, but >able to leverage cloud resources like Amazon's EC2, not just Snakebite >hardware. BuildBot supports managing EC2 instance lifetimes to run builds. 
Jean-Paul From brian.curtin at gmail.com Sun Nov 7 17:41:09 2010 From: brian.curtin at gmail.com (Brian Curtin) Date: Sun, 7 Nov 2010 10:41:09 -0600 Subject: [Python-Dev] rlcompleter -- auto-complete dictionary keys (+ tests) In-Reply-To: References: Message-ID: On Sun, Nov 7, 2010 at 04:19, Valery Khamenya wrote: > Hi, > > A) I missed the auto-complete feature for dictionary keys a lot in python > console. This patch seems to do the job. > > B) There is no rlcompleter tests in trunk for some reason. So, I've taken > the 2.7.x test_rlcompleter.py and extended it. > > C) patched rlcompleter as such works OK for unicode dictionary keys as > well. All tests pass OK. HOWEVER, readline's completion mechanism seem to be > confused with unicode strings -- see comments to > Completer.dict_key_matches(). So, perhaps, some changes should be applied to > readline code too. > > Attached: > > 1. rlcompleter.py (as for trunk) > > 2. test_rlcompleter (as for trunk) > > 3. rlcompleter_trunk_to_new.diff (created as: diff rlcompleter_trunk.py > rlcompleter.py >rlcompleter_trunk_to_new.diff) > > P.S. thanks to kerio & ssbr on icq for advices. > > best regards > -- > Valery A.Khamenya > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brian.curtin%40gmail.com > > Can you post your patch on bugs.python.org? -------------- next part -------------- An HTML attachment was scrubbed... URL: From khamenya at gmail.com Sun Nov 7 18:07:31 2010 From: khamenya at gmail.com (Valery Khamenya) Date: Sun, 7 Nov 2010 18:07:31 +0100 Subject: [Python-Dev] rlcompleter -- auto-complete dictionary keys (+ tests) In-Reply-To: References: Message-ID: > > Can you post your patch on bugs.python.org? > the site is not working currently. 
Also, I forgot to mention that the usual lines in .pythonstartup should now look like this:

# the usual lines:
import re  # needed by the set_completer_delims line below
import readline
import rlcompleter
readline.parse_and_bind('tab: complete')
readline.parse_and_bind('Control-Space: complete')
# and now the additional line to allow the '[' char and both quote characters:
readline.set_completer_delims(re.compile(r'[\'"\\[]').sub('', readline.get_completer_delims()))

-------------- next part -------------- An HTML attachment was scrubbed... URL: From bobbyi at gmail.com Sun Nov 7 18:30:17 2010 From: bobbyi at gmail.com (Bobby Impollonia) Date: Sun, 7 Nov 2010 09:30:17 -0800 Subject: [Python-Dev] bugs.python.org not responding (Was: rlcompleter -- auto-complete dictionary keys (+ tests)) Message-ID: On Sun, Nov 7, 2010 at 9:07 AM, Valery Khamenya wrote: >> Can you post your patch on bugs.python.org? > > the site is not working currently. Yes, it is down for me too, trying from multiple hosts. It was up approximately an hour ago, but has now been unresponsive for the past twenty or thirty minutes. I cannot even ping bugs.python.org. The main python.org site seems to be fine. From rdmurray at bitdance.com Sun Nov 7 20:19:44 2010 From: rdmurray at bitdance.com (R. David Murray) Date: Sun, 07 Nov 2010 14:19:44 -0500 Subject: [Python-Dev] bugs.python.org not responding (Was: rlcompleter -- auto-complete dictionary keys (+ tests)) In-Reply-To: References: Message-ID: <20101107191944.5B492164C16@kimball.webabinitio.net> On Sun, 07 Nov 2010 09:30:17 -0800, Bobby Impollonia wrote: > On Sun, Nov 7, 2010 at 9:07 AM, Valery Khamenya wrote: > >> Can you post your patch on bugs.python.org? > > > > the site is not working currently. > > Yes, it is down for me too, trying from multiple hosts. It was up > approximately an hour ago, but has now been unresponsive for the past > twenty or thirty minutes. I cannot even ping bugs.python.org. The main > python.org site seems to be fine.
The hosting company is working on the problem, which seems to be a hardware issue. Hopefully it will be resolved soon. FYI bugs.python.org and www.python.org are different machines, and in fact the two machines are not even hosted at the same location. Valery, I would advise you to submit the patch to bugs.python.org when it comes back up. Patches posted to this mailing list will in general just get forgotten. -- R. David Murray www.bitdance.com From python at mrabarnett.plus.com Sun Nov 7 22:05:32 2010 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 07 Nov 2010 21:05:32 +0000 Subject: [Python-Dev] Bug track down? Message-ID: <4CD7149C.2020003@mrabarnett.plus.com> It looks like the bug tracker is down. From martin at v.loewis.de Sun Nov 7 22:06:49 2010 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 07 Nov 2010 22:06:49 +0100 Subject: [Python-Dev] Python-3 transition in Arch Linux In-Reply-To: <027552CB-5AB8-47F4-A8CE-422D8862AE6D@fuhm.net> References: <4CD23A19.6080002@archlinux.org> <877hgsior1.fsf@uwakimon.sk.tsukuba.ac.jp> <4CD55AF4.40308@v.loewis.de> <027552CB-5AB8-47F4-A8CE-422D8862AE6D@fuhm.net> Message-ID: <4CD714E9.9000507@v.loewis.de> On 07.11.2010 15:57, James Y Knight wrote: > On Nov 6, 2010, at 9:41 AM, Martin v. Löwis wrote: >> So I don't recall a decision that there shouldn't be a python2 >> binary, > > The decision to make one would have to be an active decision, since > Python has never installed one before. If there should be one, then > the Python Makefile should make one by default. No. Creation of additional symlinks is certainly in the realm of what Python packagers can decide on their own. Regards, Martin From martin at v.loewis.de Sun Nov 7 22:26:46 2010 From: martin at v.loewis.de ("Martin v. Löwis") Date: Sun, 07 Nov 2010 22:26:46 +0100 Subject: [Python-Dev] Bug track down? 
In-Reply-To: <4CD7149C.2020003@mrabarnett.plus.com> References: <4CD7149C.2020003@mrabarnett.plus.com> Message-ID: <4CD71996.5000006@v.loewis.de> Am 07.11.2010 22:05, schrieb MRAB: > It looks like the bug tracker is down. Thanks - we have already contacted the hosting company, who have already contacted the datacenter. It appears that the bug tracker actually wasn't down (at least, it believes it was up all time), which suggests that there was some kind of networking problem. It came back, then went away again. Regards, Martin From solipsis at pitrou.net Sun Nov 7 23:11:59 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 7 Nov 2010 23:11:59 +0100 Subject: [Python-Dev] r86276 - python/branches/py3k/Lib/distutils/cygwinccompiler.py References: <20101106180353.33FAEEE98A@mail.python.org> Message-ID: <20101107231159.48775697@pitrou.net> On Sat, 6 Nov 2010 19:03:53 +0100 (CET) eric.araujo wrote: > Author: eric.araujo > Date: Sat Nov 6 19:03:52 2010 > New Revision: 86276 > > Log: > Fix #10252 again (hopefully definitely). Patch by Brian Curtin. It seems this and previous fixes should be backported to 2.7. Regards Antoine. From db3l.net at gmail.com Mon Nov 8 00:34:36 2010 From: db3l.net at gmail.com (David Bolen) Date: Sun, 07 Nov 2010 18:34:36 -0500 Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2 3.x" buildbot References: <201011061219.55473.victor.stinner@haypocalc.com> <201011070430.54085.victor.stinner@haypocalc.com> Message-ID: Victor Stinner writes: > You wrote that the "POSIX" semaphore are very limited. Do it mean that there > is another kind of semaphore will an higher limit? Well, I think the SYSV semaphores are either less limited or at least more adjustable. They've certainly been around longer in FreeBSD. The POSIX semaphore support is not enabled by default in FreeBSD 7, so I added loader.conf stuff to load them (as part of issue7272). I don't think the Python internals are using the SYSV semaphores though. 
SYSV functions have no underscore (e.g., semget) whereas POSIX ones do (e.g., sem_open). Also, I believe only POSIX has named semaphores.

> I don't think that skipping the test is a good idea: it looks like a real bug
> in (a limitation of) the ProcessPoolExecutor implementation on FreeBSD. Eg.
> test_map fails on FreeBSD 7.2 with ProcessPoolExecutorTest which uses
> self.executor = futures.ProcessPoolExecutor(max_workers=1): only one worker
> process!
>
> It looks like it is possible to tune semaphore limits on FreeBSD, without
> recompiling the kernel, by using boot loader option (kern.ipc.sem* options).
> But ask the FreeBSD user to tune its boot loader options to use the
> concurrent.futures module is not practical :-)

Yeah, I guess the key question is whether changing the limit is just needed to get around an artifact of the test process (which I'm willing to do for the buildbot), or whether it would be needed to be able to use the regular modules in practice. If the latter, I doubt too many users are going to jump through such hoops, particularly if it needs a kernel rebuild, so we may need to make other choices in terms of support under FreeBSD.

I'm also not entirely sure just what the limiting factor is. I think the kern.ipc.sem* options are for the SYSV semaphores, not POSIX, though some of them do have a similar limit. Some are adjustable by sysctl, others by loader.conf.

The references I found were talking about a limit set explicitly (#define SEM_MAX) in the kernel source (uipc_sem.c) which exports its value (at least in 7.2) via the sysctl p1003_1b.sem_nsems_max, which is read-only. I got the impression they weren't adjustable even in loader.conf, but haven't actually tried it yet myself.

It may be different in 8.x, but one email thread I found indicated that the changes proposed to make the POSIX limits adjustable didn't make the 8.1 cut (current release), though they might make it into the next 8.x release.
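As a rough illustration of the read-only limit described above, the following Python sketch asks the C library for the POSIX semaphore limit. It rests on an assumption that is not confirmed anywhere in this thread: that os.sysconf("SC_SEM_NSEMS_MAX") reports the same value as the p1003_1b.sem_nsems_max sysctl on FreeBSD. The helper name posix_semaphore_limit is made up for this example.

```python
import os


def posix_semaphore_limit():
    """Return the POSIX semaphore limit reported by sysconf, or None.

    Assumption: on FreeBSD this mirrors the p1003_1b.sem_nsems_max sysctl.
    None means the name is unknown here or the platform reports no fixed limit.
    """
    name = "SC_SEM_NSEMS_MAX"
    # os.sysconf_names only lists the names the platform actually defines.
    if name not in getattr(os, "sysconf_names", {}):
        return None
    try:
        value = os.sysconf(name)
    except (OSError, ValueError):
        return None
    # sysconf returns -1 when the limit is indeterminate.
    return value if value >= 0 else None


if __name__ == "__main__":
    print("POSIX semaphore limit:", posix_semaphore_limit())
```

Running this on the buildbot before and after any tuning would at least show whether the value visible to a process changes.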
-- David From martin at v.loewis.de Mon Nov 8 01:09:27 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 08 Nov 2010 01:09:27 +0100 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes) In-Reply-To: <4CD68C8B.4090004@snakebite.org> References: <4CD68C8B.4090004@snakebite.org> Message-ID: <4CD73FB7.5010402@v.loewis.de> > Luckily, the problems that we faced 2.5 years ago when I came up with > the idea of Snakebite are still just as ever present today ;-) Is this bashing of existing infrastructure really necessary? People (like me) might start bashing about vaporware and how a bird in the hand is worth two in the bush. Cooperate, don't confront. Regards, Martin From martin at v.loewis.de Mon Nov 8 01:13:19 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 08 Nov 2010 01:13:19 +0100 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes) In-Reply-To: References: <4CD68C8B.4090004@snakebite.org> Message-ID: <4CD7409F.60308@v.loewis.de> > I've spent a good bit of time on that, and left all the instructions in > the buildbot master config. I also adapted buildbot's hg hook to our > situation (e.g. to send a change to multiple masters, as required for > the community buildbots), so it should be quite easy to actually > switch the buildbots over on migration day. I'm not sure this is the right way of doing it. AFAICT, hg can have multiple handlers for the same hook, e.g. incoming.buildbot and incoming.community. Furthermore, I believe the community buildbot farm is currently dead, and unlikely to come back. Regards, Martin From solipsis at pitrou.net Mon Nov 8 01:58:05 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 8 Nov 2010 01:58:05 +0100 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! 
(Was Re: SSH access against buildbot boxes) References: <4CD68C8B.4090004@snakebite.org> Message-ID: <20101108015805.2254b586@pitrou.net> On Sun, 7 Nov 2010 06:24:59 -0500 Trent Nelson wrote: > (And if we feel like bringing IRIX/MIPS > and Tru64/Alphas back as primary platforms, we've got the hardware to do > that, too ;-).) Unless you want to rename your project zombiebite, it would probably be better not to resurrect those old corpses. (I'm talking about the OSes, not the chips) > Of course there's a third option, which is using the infrastructure I've > mentioned to address a similarly annoying pain point I haven't thought > of -- so feel free to mention anything else you'd like to see first > instead of the above two things. I'm sure there are various special builds that could be useful. One is a build with heavy resource consumption (lots of RAM, lots of disk) if there's a machine which can handle that. Another is testing memory leaks on all 3 branches (I have a daily script which does that for 3.x on my personal server). Perhaps there could even be some automated fuzzing if Victor is looking for something to do on his free time :) Regards Antoine. From scott+python-dev at scottdial.com Mon Nov 8 03:32:33 2010 From: scott+python-dev at scottdial.com (Scott Dial) Date: Sun, 07 Nov 2010 21:32:33 -0500 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes) In-Reply-To: <4CD73FB7.5010402@v.loewis.de> References: <4CD68C8B.4090004@snakebite.org> <4CD73FB7.5010402@v.loewis.de> Message-ID: <4CD76141.6060304@scottdial.com> On 11/7/2010 7:09 PM, Martin v. L?wis wrote: >> Luckily, the problems that we faced 2.5 years ago when I came up with >> the idea of Snakebite are still just as ever present today ;-) > > Is this bashing of existing infrastructure really necessary? > People (like me) might start bashing about vaporware and how > a bird in the hand is worth two in the bush. 
Cooperate, don't > confront. +1 Respect your (software) elders. The Snaketbite rhetoric has always been less than generous with regard to Buildbot, but Buildbot has been providing an infinitely more useful service to the community for much longer than Snakebite has for those 2.5 years. -- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From victor.stinner at haypocalc.com Mon Nov 8 03:36:34 2010 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Mon, 8 Nov 2010 03:36:34 +0100 Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2 3.x" buildbot In-Reply-To: References: <201011061219.55473.victor.stinner@haypocalc.com> <201011070430.54085.victor.stinner@haypocalc.com> Message-ID: <201011080336.34367.victor.stinner@haypocalc.com> On Monday 08 November 2010 00:34:36 David Bolen wrote: > Victor Stinner writes: > > You wrote that the "POSIX" semaphore are very limited. Do it mean that > > there is another kind of semaphore will an higher limit? > > Well, I think the SYSV semaphores are either less limited or at least > more adjustable. They've certainly been around longer in FreeBSD. > The POSIX semaphore support is not enabled by default in FreeBSD 7, so > I added loader.conf stuff to load them (as part of issue7272). I > don't think the Python internals are using the SYSV semaphores though. > SYSV functions have no underscore (e.g., semget) whereas POSIX do > (sem_get). Also, I believe only POSIX has named semaphores. I created the issue http://bugs.python.org/issue10348 to suggest this. Victor From ctb at msu.edu Mon Nov 8 03:58:56 2010 From: ctb at msu.edu (C. Titus Brown) Date: Sun, 7 Nov 2010 18:58:56 -0800 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! 
(Was Re: SSH access against buildbot boxes) In-Reply-To: <4CD76141.6060304@scottdial.com> References: <4CD68C8B.4090004@snakebite.org> <4CD73FB7.5010402@v.loewis.de> <4CD76141.6060304@scottdial.com> Message-ID: <20101108025856.GA9304@idyll.org> On Sun, Nov 07, 2010 at 09:32:33PM -0500, Scott Dial wrote: > On 11/7/2010 7:09 PM, Martin v. L?wis wrote: > >> Luckily, the problems that we faced 2.5 years ago when I came up with > >> the idea of Snakebite are still just as ever present today ;-) > > > > Is this bashing of existing infrastructure really necessary? > > People (like me) might start bashing about vaporware and how > > a bird in the hand is worth two in the bush. Cooperate, don't > > confront. > > +1 Respect your (software) elders. > > The Snaketbite rhetoric has always been less than generous with regard > to Buildbot, but Buildbot has been providing an infinitely more useful > service to the community for much longer than Snakebite has for those > 2.5 years. Yes, yes, I agree that some graciousness is a good idea. Oh, wait... you're not helping. Anyway, I think buildbot is a good local optimum for python-dev, largely because it's maintained by someone who cares enough to do it well. And, if Trent had been talking about buildbot only, MvL's comment would be more than fair. But Trent, and I, and others, have talked about quite a bit more than buildbot being "the" problem. Things like enabling *and maintaining* easy EC2 spin-up with buildbot, or providing SSH key access, or making a 'try' server available and maintaining it, would be clearly beneficial. And that's some of what Trent has been talking about providing. It turns out it's hard to do without lots and lots of time and money. If you truly think it's not useful, I'd be interested in hearing your opinions, because we've spent an ungodly amount of the above on it. 
In the larger context, I worry very much that we're settling for a rather suboptimal support setup (on svn, and on cont integration, and on some other aspects of Python infrastructure) because the current maintainers are so overloaded and few others are stepping up to bear burdens. This is a big concern of at least some people in the PSF. But it's not an easy problem to solve - quelle surprise. And I'm not in a personal position to help, so I've basically tried to shut up about it :). As for buildbot, I've been pretty hard on buildbot myself, and I'm happy to justify it to others -- I've done so in public fora so I'm sure you can find the records, if you care to look. But it's not really very relevant to this conversation, especially since Trent has always been interested in building off the buildbot setup rather than replacing it. --titus -- C. Titus Brown, ctb at msu.edu From qiyong at sosdg.org Mon Nov 8 02:43:33 2010 From: qiyong at sosdg.org (Qi Yong) Date: Sun, 7 Nov 2010 18:43:33 -0700 Subject: [Python-Dev] KeyboardInterrupt not catch Message-ID: <20101108014333.GC32719@meridian.sosdg.org> Hello, With this script, after ctrl-d, ctrl-c exception not catch. Is it a python bug or a wrong exception usage? Thanks. If with import readline, this problem disappears. 
-- qiyong

def parse():
    try:
        answer = raw_input("Eo: ")
        print answer
    except EOFError:
        print("EOF")
    except KeyboardInterrupt:
        print("")

def main():
    while True:
        parse()

if __name__ == "__main__":
    main()

--
Qi Yong

From tjreedy at udel.edu Mon Nov 8 04:45:43 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 07 Nov 2010 22:45:43 -0500
Subject: [Python-Dev] r86276 - python/branches/py3k/Lib/distutils/cygwinccompiler.py
In-Reply-To: <20101107231159.48775697@pitrou.net>
References: <20101106180353.33FAEEE98A@mail.python.org> <20101107231159.48775697@pitrou.net>
Message-ID:

On 11/7/2010 5:11 PM, Antoine Pitrou wrote:
> On Sat, 6 Nov 2010 19:03:53 +0100 (CET)
> eric.araujo wrote:
>> Author: eric.araujo
>> Date: Sat Nov 6 19:03:52 2010
>> New Revision: 86276
>>
>> Log:
>> Fix #10252 again (hopefully definitely). Patch by Brian Curtin.
>
> It seems this and previous fixes should be backported to 2.7.

Perhaps there should be a 'backport 2.7' keyword to check on issues that might be but have not been.

--
Terry Jan Reedy

From scott+python-dev at scottdial.com Mon Nov 8 04:51:44 2010
From: scott+python-dev at scottdial.com (Scott Dial)
Date: Sun, 07 Nov 2010 22:51:44 -0500
Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes)
In-Reply-To: <20101108025856.GA9304@idyll.org>
References: <4CD68C8B.4090004@snakebite.org> <4CD73FB7.5010402@v.loewis.de> <4CD76141.6060304@scottdial.com> <20101108025856.GA9304@idyll.org>
Message-ID: <4CD773D0.2000306@scottdial.com>

On 11/7/2010 9:58 PM, C. Titus Brown wrote:
> Yes, yes, I agree that some graciousness is a good idea.
>
> Oh, wait... you're not helping.

Classy. I don't remember being invited to help. snakebite.org is a dead end. snakebite-list hasn't had a post for over a year. Where is the list of things that you need done so that I can get started on that? Oh wait..
Seriously, all I asked was for you to tone down your insults to a technology that is already solving problems today. Why you feel the need to attack me personally is beyond my understanding. Furthermore, I don't see why I need to be "helping" -- somebody who doesn't want help -- to be able to deduce that your message is being insulting to the authors of Buildbot. On 11/7/2010 9:58 PM, C. Titus Brown wrote: > And I'm not in a personal position to help, so I've basically tried > to shut up about it :). At least I am in good company. ;) -- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From ben+python at benfinney.id.au Mon Nov 8 05:05:05 2010 From: ben+python at benfinney.id.au (Ben Finney) Date: Mon, 08 Nov 2010 15:05:05 +1100 Subject: [Python-Dev] KeyboardInterrupt not catch References: <20101108014333.GC32719@meridian.sosdg.org> Message-ID: <87d3qge9zi.fsf@benfinney.id.au> Qi Yong writes: > With this script, after ctrl-d, ctrl-c exception not catch. When I run it, the Ctrl-D doesn't affect the behaviour of Ctrl-C. Can you confirm that the behaviour is dependent on whether Ctrl-D is used? > If with import readline, this problem disappears. Again, if I ?import readline? or not, the behaviour is unchanged. Can you show a specific series of steps that changes the behaviour? > Is it a python bug or a wrong exception usage? Thanks. The ?raw_input? function uses the ?readline? library. That library uses signal handlers for many of the terminal signals, as documented at . -- \ ?You've got the brain of a four-year-old boy, and I'll bet he | `\ was glad to get rid of it.? 
?Groucho Marx | _o__) | Ben Finney From ralf at brainbot.com Mon Nov 8 06:35:27 2010 From: ralf at brainbot.com (Ralf Schmitt) Date: Mon, 08 Nov 2010 06:35:27 +0100 Subject: [Python-Dev] KeyboardInterrupt not catch In-Reply-To: <20101108014333.GC32719@meridian.sosdg.org> (Qi Yong's message of "Sun, 7 Nov 2010 18:43:33 -0700") References: <20101108014333.GC32719@meridian.sosdg.org> Message-ID: <87mxpkidi8.fsf@muni.brainbot.com> Qi Yong writes: > Hello, > > With this script, after ctrl-d, ctrl-c exception not catch. > Is it a python bug or a wrong exception usage? Thanks. > If with import readline, this problem disappears. there's already a bug in the issue tracker: http://bugs.python.org/issue1195 Cheers, - Ralf From g.brandl at gmx.net Mon Nov 8 07:39:31 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 08 Nov 2010 07:39:31 +0100 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes) In-Reply-To: <4CD7409F.60308@v.loewis.de> References: <4CD68C8B.4090004@snakebite.org> <4CD7409F.60308@v.loewis.de> Message-ID: Am 08.11.2010 01:13, schrieb "Martin v. L?wis": >> I've spent a good bit of time on that, and left all the instructions in >> the buildbot master config. I also adapted buildbot's hg hook to our >> situation (e.g. to send a change to multiple masters, as required for >> the community buildbots), so it should be quite easy to actually >> switch the buildbots over on migration day. > > I'm not sure this is the right way of doing it. AFAICT, hg can have > multiple handlers for the same hook, e.g. incoming.buildbot and > incoming.community. That is true, however it doesn't help you: the hook takes its configuration from the hgrc file, so you can configure exactly one host:port to send changes to. > Furthermore, I believe the community buildbot farm is currently dead, > and unlikely to come back. 
Then it's easy not to use that feature :) Georg From martin at v.loewis.de Mon Nov 8 09:05:23 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 08 Nov 2010 09:05:23 +0100 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes) In-Reply-To: References: <4CD68C8B.4090004@snakebite.org> <4CD7409F.60308@v.loewis.de> Message-ID: <4CD7AF43.5020703@v.loewis.de> > That is true, however it doesn't help you: the hook takes its configuration > from the hgrc file, so you can configure exactly one host:port to send > changes to. Ah, ok. Regards, Martin From asmodai at in-nomine.org Mon Nov 8 10:05:24 2010 From: asmodai at in-nomine.org (Jeroen Ruigrok van der Werven) Date: Mon, 8 Nov 2010 10:05:24 +0100 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes) In-Reply-To: References: <4CD68C8B.4090004@snakebite.org> Message-ID: <20101108090524.GI27974@nexus.in-nomine.org> -On [20101107 12:52], Nick Coghlan (ncoghlan at gmail.com) wrote: >This sounds like a great place to start. Perhaps focus on one or two >of the less common platforms first (e.g. FreeBSD 7 has been hitting a >few semaphore issues lately). Nick, do you have some pointers for this? I am one of those BSD Python users/coders and would like to resolve any issues. By default FreeBSD 7, at least, has limits on semaphores and the likes, but those can be expanded. -- Jeroen Ruigrok van der Werven / asmodai ????? ?????? ??? ?? ?????? http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B Hypocrisy is the homage which vice pays to virtue... 
From regebro at gmail.com Mon Nov 8 10:42:26 2010
From: regebro at gmail.com (Lennart Regebro)
Date: Mon, 8 Nov 2010 10:42:26 +0100
Subject: [Python-Dev] Continuing 2.x
In-Reply-To: <2E034B571A5CE44E949B9FCC3B6D24EE5761FC57@exchcn.ccp.ad.local>
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FC57@exchcn.ccp.ad.local>
Message-ID:

2010/10/28 Kristján Valur Jónsson :
> Hello all.
>
> So, python 2.7 is in bugfix only mode. "trunk" is off limit. So, where
> does one make improvements to the distinguished, and still very much alive,
> 2.x series of Python?
>
> The answer would seem to be "one doesn't". But must it be that way?

Except for making releases that start backporting Python 3 features and breaking backwards compatibility gradually (which may or may not be a good idea) I don't see the point. There isn't much to do when it comes to improving the language, and there is a moratorium anyway. Improvements in the standard library can be more easily done in external libraries, and then you can release the improved libraries for everything from Python 2.4 and forwards if you like.

So it can be done, but the question is "Why?"

--
Lennart Regebro, Colliberty: http://www.colliberty.com/
Telephone: +48 691 268 328

From khamenya at gmail.com Mon Nov 8 11:43:36 2010
From: khamenya at gmail.com (Valery Khamenya)
Date: Mon, 8 Nov 2010 11:43:36 +0100
Subject: [Python-Dev] bugs.python.org not responding (Was: rlcompleter -- auto-complete dictionary keys (+ tests))
In-Reply-To: <20101107191944.5B492164C16@kimball.webabinitio.net>
References: <20101107191944.5B492164C16@kimball.webabinitio.net>
Message-ID:

Hi David,

> Valery, I would advise you to submit the patch to bugs.python.org when
> it comes back up. Patches posted to this mailing list will in general
> just get forgotten.
>

done:
http://bugs.python.org/issue10351
http://bugs.python.org/issue10352

However, as I can already see, the situation with changes to the 2.x trunk is not simple.
I hope the patch won't be forgotten. (After all, we here still rely very much on 2.x)

regards,
Valery.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From regebro at gmail.com Mon Nov 8 12:08:24 2010
From: regebro at gmail.com (Lennart Regebro)
Date: Mon, 8 Nov 2010 12:08:24 +0100
Subject: [Python-Dev] new buffer in python2.7
In-Reply-To: <20101027123622.0ab194f9@pitrou.net>
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB17@exchcn.ccp.ad.local> <20101027123622.0ab194f9@pitrou.net>
Message-ID:

On Wed, Oct 27, 2010 at 12:36, Antoine Pitrou wrote:
> On Wed, 27 Oct 2010 10:13:12 +0800
> Kristján Valur Jónsson wrote:
>> Although 2.7 has the new buffer interface and memoryview
>> objects, these are widely not accepted in the built in modules.
>
> That's true, and slightly unfortunate. It could be a reason for
> switching to 3.1/3.2 :-)

It's rather a reason against it, as it makes supporting both Python 2 and Python 3 harder. However, fixing this in 2.7 just means that you need to support 2.7.x or later only, so it's not a good solution.

I think using compatibility types is a better solution. I suggested something like that for inclusion in "six", but it was softly rejected. :-)

Something like this, for example. It's a str in Python 2 and a bytes in Python 3, but it extends both classes with a consistent interface. Improvements, comments and ideas are welcome.
bites.py:
--------------------
import sys

# Compare the version tuple rather than the version string.
if sys.version_info[0] < 3:
    class Bites(str):
        def __new__(cls, value):
            # Check for emptiness first so that value[0] cannot raise IndexError.
            if value and isinstance(value[0], int):
                # It's a list of integers
                value = ''.join([chr(x) for x in value])
            return super(Bites, cls).__new__(cls, value)

        def itemint(self, index):
            return ord(self[index])

        def iterint(self):
            for x in self:
                yield ord(x)
else:
    class Bites(bytes):
        def __new__(cls, value):
            if isinstance(value, str):
                # It's a unicode string:
                value = value.encode('ISO-8859-1')
            return super(Bites, cls).__new__(cls, value)

        def itemint(self, x):
            return self[x]

        def iterint(self):
            for x in self:
                yield x
--------------------

--
Lennart Regebro: http://regebro.wordpress.com/
Python 3 Porting: http://python3porting.com/
+33 661 58 14 64

From ncoghlan at gmail.com Mon Nov 8 12:44:56 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 8 Nov 2010 21:44:56 +1000
Subject: [Python-Dev] Backward incompatible API changes in the pydoc module
Message-ID:

All,

I was about to commit the patch for issue 2001 (the improvements to the pydoc web server and the removal of the Tk GUI) when I realised that pydoc.serve() and pydoc.gui() are technically public standard library APIs (albeit undocumented ones).

Currently the patch switches serve() to start the new server implementation and gui() to start the server and open a browser window for it.

It occurred to me that, despite the "it's an application" feel to the pydoc web server APIs, it may be a better idea to leave the two existing functions alone (aside from adding a DeprecationWarning), and to use new private function names to start the new server and the web browser.

Is following the standard deprecation procedure the better course here, or am I being overly paranoid?

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ncoghlan at gmail.com Mon Nov 8 12:51:44 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 8 Nov 2010 21:51:44 +1000 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes) In-Reply-To: <20101108090524.GI27974@nexus.in-nomine.org> References: <4CD68C8B.4090004@snakebite.org> <20101108090524.GI27974@nexus.in-nomine.org> Message-ID: On Mon, Nov 8, 2010 at 7:05 PM, Jeroen Ruigrok van der Werven wrote: > -On [20101107 12:52], Nick Coghlan (ncoghlan at gmail.com) wrote: >>This sounds like a great place to start. Perhaps focus on one or two >>of the less common platforms first (e.g. FreeBSD 7 has been hitting a >>few semaphore issues lately). > > Nick, do you have some pointers for this? I am one of those BSD Python > users/coders and would like to resolve any issues. > > By default FreeBSD 7, at least, has limits on semaphores and the likes, but > those can be expanded. Possibly a bad example on my part, since David and Victor actually seem to be making reasonable progress in tracking down the problem: http://mail.python.org/pipermail/python-dev/2010-November/105334.html Regards, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From dirkjan at ochtman.nl Mon Nov 8 12:55:24 2010 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Mon, 8 Nov 2010 12:55:24 +0100 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes) In-Reply-To: References: <4CD68C8B.4090004@snakebite.org> Message-ID: On Sun, Nov 7, 2010 at 13:15, Dirkjan Ochtman wrote: > Yeah, Martin has things for buildbot worked out. Notes about this are > in the hg.python.org/pymigr repository. I meant Georg here, of course. Sorry, Georg! 
Cheers, Dirkjan From ncoghlan at gmail.com Mon Nov 8 12:57:45 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 8 Nov 2010 21:57:45 +1000 Subject: [Python-Dev] Snakebite, buildbot and low hanging fruit -- feedback wanted! (Was Re: SSH access against buildbot boxes) In-Reply-To: <4CD73FB7.5010402@v.loewis.de> References: <4CD68C8B.4090004@snakebite.org> <4CD73FB7.5010402@v.loewis.de> Message-ID: On Mon, Nov 8, 2010 at 10:09 AM, "Martin v. L?wis" wrote: >> Luckily, the problems that we faced 2.5 years ago when I came up with >> the idea of Snakebite are still just as ever present today ;-) > > Is this bashing of existing infrastructure really necessary? > People (like me) might start bashing about vaporware and how > a bird in the hand is worth two in the bush. Cooperate, don't > confront. I don't believe the comment was meant to be a slight on the efforts of the current infrastructure maintainers. I took Trent's message as referring to the problems Giampaolo mentioned in the original post (i.e. the ability to grant buildbot access in an easy-to-use way to existing core developers without burdening every buildbot operator with decisions as to who they can trust with access to their buildbot). Buildbot (and similar tools) are fine for what they do, but there are some problems like this that they don't even *try* to solve (because they aren't software problems - they're dependent on physical infrastructure). Regards, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From michael at voidspace.org.uk Mon Nov 8 13:09:42 2010 From: michael at voidspace.org.uk (Michael Foord) Date: Mon, 08 Nov 2010 12:09:42 +0000 Subject: [Python-Dev] GUI test runner tool Message-ID: <4CD7E886.4060702@voidspace.org.uk> Hello all, Now that unittest has test discovery, Mark Roddy has been working on resurrecting the old GUI test runner (using Tkinter): https://bitbucket.org/markroddy/unittestgui This was part of the original pyunit project but I believe it was never part of the standard library: http://sourceforge.net/projects/pyunit/ Here's a screenshot of what it looks like: http://skitch.com/fuzzyman/dhu9r/pyunit I'd like to propose adding it to Python in Tools/ and am volunteering to maintain it. If the answer is "not yet" that is fine as it can go into unittest2 first. Mark has updated it to work with test discovery and added support for configuring test discovery in the same way as you can from the command line. It is a nice tool for those new to writing tests who aren't yet familiar with the command line or prefer a GUI. In its basic form you simply pick a directory and unittestgui will discover and run all the tests it finds. It would be nice if it provided more diagnostic information on tests it ran (clicking through test results) but these can be added later. All the best, Michael Foord -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. 
You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From asmodai at in-nomine.org Mon Nov 8 13:23:33 2010 From: asmodai at in-nomine.org (Jeroen Ruigrok van der Werven) Date: Mon, 8 Nov 2010 13:23:33 +0100 Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2 3.x" buildbot In-Reply-To: References: <201011061219.55473.victor.stinner@haypocalc.com> <201011070430.54085.victor.stinner@haypocalc.com> Message-ID: <20101108122333.GJ27974@nexus.in-nomine.org> -On [20101108 00:36], David Bolen (db3l.net at gmail.com) wrote: >Victor Stinner writes: >Well, I think the SYSV semaphores are either less limited or at least >more adjustable. They've certainly been around longer in FreeBSD. >The POSIX semaphore support is not enabled by default in FreeBSD 7, so >I added loader.conf stuff to load them (as part of issue7272). It is enabled by default on FreeBSD 8 at least. Looking through the repository it seems 7-STABLE has it enabled by default as well in the GENERIC kernel (the standard one it boots with after its first install). It seems this was added for 7.3 and onward. So 7.2 and before need an "options P1003_1B_SEMAPHORES" added to their kernel at least. The SYSV options are already present in the entire 7.x line. >> It looks like it is possible to tune semaphore limits on FreeBSD, without >> recompiling the kernel, by using boot loader option (kern.ipc.sem* options). >> But ask the FreeBSD user to tune its boot loader options to use the >> concurrent.futures module is not pratical :-) PostgreSQL installations via ports as well as its documentation instruct the FreeBSD user to tweak kern.ipc settings. >Yeah, I guess the key question is if changing the limit is just needed >to get around an artifact of the test process (which I'm willing to do >for the buildbot), or if it would be needed to be able to use the >regular modules in practice. 
If the latter, I doubt too many users >are going to jump through such hoops, particularly if it needs a >kernel rebuild, so we may need to make other choices in terms of >support under FreeBSD. Almost every FreeBSD user I know of compiles a new kernel. It's just one of those BSD things that every user goes through. >I'm also not entirely sure just what is the limiting factor. I think >the kern.ipc.sem* options are for the SYSV semaphores, not POSIX, though >some of them do have a similar limit. Some are adjustable by sysctl, >others by loader.conf. kern.ipc is about System V IPC. As you indicate later on, p1003_1b is the POSIX-related IPC sysctl tree. The three semaphore settings semmni, semmns, and semmnu are only tweakable via loader.conf. >The references I found were talking about a limit set explicitly >(#define SEM_MAX) in the kernel source (uipc_sem.c) which exports its >value (at least in 7.2) via the sysctl p1003_1b.sem_nsems_max, which >is read-only. I got the impression they weren't adjustable even in >loader.conf, but haven't actually tried it yet myself. > >It may be different in 8.x, but one email thread I found indicated >that the changes proposed to make the POSIX limits adjustable didn't >make the 8.1 cut (current release), though might make it in the next >8.x release. After checking the repository I saw that there were MFCs (Merge From Current, backport) to 8-STABLE prior to the 8.1 release for dynamic tweaking. On my 8.1 machine: nexus% sudo sysctl -w p1003_1b.sem_nsems_max=31 p1003_1b.sem_nsems_max: 30 -> 32 7.x is hardlocked at the moment unless someone manually edits the file to up the SEM_MAX define. The same goes for FreeBSD 8.0. -- Jeroen Ruigrok van der Werven / asmodai http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B Nothing yet from nothing ever came...
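As a rough illustration of the limit discussed in the message above, here is a minimal Python sketch that reads the POSIX semaphore limit it names (p1003_1b.sem_nsems_max) by shelling out to sysctl. The helper name read_sysctl_int and its injectable run parameter are hypothetical, introduced only for illustration, and the sketch assumes a sysctl binary that supports -n (print the value only), as on FreeBSD.

```python
import subprocess


def read_sysctl_int(name, run=subprocess.check_output):
    """Return the integer value of sysctl `name`, or None if it cannot be read.

    `run` is injectable (hypothetical parameter) so that the parsing can be
    exercised on machines that have no such sysctl.
    """
    try:
        raw = run(["sysctl", "-n", name])
    except (OSError, subprocess.CalledProcessError):
        # sysctl is missing, or the OID does not exist on this kernel.
        return None
    try:
        return int(raw.decode("ascii", "replace").strip())
    except ValueError:
        # The value was not a plain integer.
        return None


if __name__ == "__main__":
    limit = read_sysctl_int("p1003_1b.sem_nsems_max")
    if limit is None:
        print("POSIX semaphore limit not available on this system")
    else:
        print("POSIX semaphore limit:", limit)
```

A test suite or buildbot step could use such a check to skip semaphore-heavy tests when the limit is low, instead of failing; whether that is appropriate is exactly the open question in this thread.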
From jcea at jcea.es Mon Nov 8 13:59:09 2010 From: jcea at jcea.es (Jesus Cea) Date: Mon, 08 Nov 2010 13:59:09 +0100 Subject: [Python-Dev] Help with warnings not being raised In-Reply-To: References: <4CD34F37.2020504@jcea.es> <4CD35A17.6080704@jcea.es> Message-ID: <4CD7F41D.1060809@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/11/10 13:55, Nick Coghlan wrote: > Under -We, PyErr_Warn raises an exception rather than printing to > stdout. That exception is clobbered by the immediately following call > to PyErr_Clear. > Since you *only* hit that branch under -We in the first place, a > second call to PyErr_WriteUnraisable should get the error to actually > print out. Excellent explanation, Nick. Thanks. Patched in r86317. Up-ported to upcoming pybsddb 5.1.1. PS: Bugs.python.org is still down. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . _/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTNf0HZlgi5GaxT1NAQKSEwQAov31uECPVCEjxP7bns9VH4bz0HzXaAV7 VUeMxt8snK6/o6d8IowdoAGNR4MiSAkY0ww8IG9QQ/9919FMBD3kZs1+JNRcTWu3 RLN2kLvE6g+reV+M6tRqnMYwuxXiq4MhBgZZrnB6DA//buIbjOaLTVcL6ABDEMVc ULj2g0TN0mA= =h6DP -----END PGP SIGNATURE----- From merwok at netwok.org Mon Nov 8 15:30:18 2010 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Mon, 08 Nov 2010 15:30:18 +0100 Subject: [Python-Dev] r86276 - python/branches/py3k/Lib/distutils/cygwinccompiler.py In-Reply-To: <20101107231159.48775697@pitrou.net> References: <20101106180353.33FAEEE98A@mail.python.org> <20101107231159.48775697@pitrou.net> Message-ID: <4CD8097A.9020301@netwok.org> >> New Revision: 86276 >> Log: >> Fix 
#10252 again (hopefully definitely). Patch by Brian Curtin. > > It seems this and previous fixes should be backported to 2.7. Certainly. I was waiting on buildbot feedback before doing it. Regards From merwok at netwok.org Mon Nov 8 15:34:36 2010 From: merwok at netwok.org (Éric Araujo) Date: Mon, 08 Nov 2010 15:34:36 +0100 Subject: [Python-Dev] r86276 - python/branches/py3k/Lib/distutils/cygwinccompiler.py In-Reply-To: References: <20101106180353.33FAEEE98A@mail.python.org> <20101107231159.48775697@pitrou.net> Message-ID: <4CD80A7C.9080704@netwok.org> >> It seems this and previous fixes should be backported to 2.7. > > Perhaps there should be a 'backport 2.7' keyword to check on issues that > might be but have not been. The "Your issues" list is very helpful and works well for me. This bug is still open and assigned to me (and opened in my web browser, incidentally), so I don't fear I'll forget it. This new keyword would IMO be redundant with existing fields (status:open + version:2.7). (The once-existing 26backport was entirely different if I recall correctly, it was used to tag 3.x features to be added to 2.x.) Regards From merwok at netwok.org Mon Nov 8 15:38:56 2010 From: merwok at netwok.org (Éric Araujo) Date: Mon, 08 Nov 2010 15:38:56 +0100 Subject: [Python-Dev] [Python-checkins] r86300 - in python/branches/py3k: Misc/NEWS PC/winsound.c In-Reply-To: <20101107142927.17812EE9A1@mail.python.org> References: <20101107142927.17812EE9A1@mail.python.org> Message-ID: <4CD80B80.6010607@netwok.org> > Author: hirokazu.yamamoto > New Revision: 86300 > Log: > Issue #6317: Now winsound.PlaySound only accepts unicode with MvL's approval.
> > > Modified: python/branches/py3k/Misc/NEWS > ============================================================================== > --- python/branches/py3k/Misc/NEWS (original) > +++ python/branches/py3k/Misc/NEWS Sun Nov 7 15:29:26 2010 > @@ -251,6 +251,8 @@ > Extension Modules > ----------------- > > +- Issue #6317: Now winsound.PlaySound only accepts unicode. > + > - Issue #6317: Now winsound.PlaySound can accept non ascii filename. Shouldn't that be only one entry? Regards From exarkun at twistedmatrix.com Mon Nov 8 16:12:55 2010 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Mon, 08 Nov 2010 15:12:55 -0000 Subject: [Python-Dev] Backward incompatible API changes in the pydoc module In-Reply-To: References: Message-ID: <20101108151255.2040.168229257.divmod.xquotient.738@localhost.localdomain> On 11:44 am, ncoghlan at gmail.com wrote: >All, > >I was about to commit the patch for issue 2001 (the improvements to >the pydoc web server and the removal of the Tk GUI) when I realised >that pydoc.serve() and pydoc.gui() are technically public standard >library APIs (albeit undocumented ones). > >Currently the patch switches serve() to start the new server >implementation and gui() to start the server and open a browser window >for it. > >It occurred to me that, despite the "it's an application" feel to the >pydoc web server APIs, it may be a better idea to leave the two >existing functions alone (aside from adding DeprecationWarning), and >using new private function names to start the new server and the web >browser. > >Is following the standard deprecation procedure the better course >here, or am I being overly paranoid? Following the deprecation procedure here sounds awesome to me. Thanks for considering it, I hope you'll choose to go that way.
Jean-Paul From foom at fuhm.net Mon Nov 8 16:34:50 2010 From: foom at fuhm.net (James Y Knight) Date: Mon, 8 Nov 2010 10:34:50 -0500 Subject: [Python-Dev] Continuing 2.x In-Reply-To: References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FC57@exchcn.ccp.ad.local> Message-ID: <0A652EB4-2C5D-4070-83EE-CF75603EE721@fuhm.net> On Nov 8, 2010, at 4:42 AM, Lennart Regebro wrote: > Except for making releases that start backporting Python 3 features > and breaking backwards compatibility gradually (which may or may not > be a good idea) I don't see the point. There isn't much to do when it > comes to improving the language, and there is a moratorium anyway. > Improvements in the standard library can be more easily done in > external libraries anyway, and then you can release the improved > libraries for everything from Python 2.4 and forwards if you like. > > So it can be done, but the question is "Why?" To keep the batteries included? James From merwok at netwok.org Mon Nov 8 17:00:27 2010 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Mon, 08 Nov 2010 17:00:27 +0100 Subject: [Python-Dev] [Python-checkins] r86264 - python/branches/release27-maint/Lib/distutils/sysconfig.py In-Reply-To: <4CD65CD2.3020402@v.loewis.de> References: <20101106141631.59358EEADD@mail.python.org> <4CD57BAA.1030301@v.loewis.de> <4CD581CA.1020905@netwok.org> <4CD58342.6040102@v.loewis.de> <4CD65CD2.3020402@v.loewis.de> Message-ID: <4CD81E9B.9020709@netwok.org> [Martin] >>> It's rather a matter of agreeing when moving forward: IMO, mere style >>> changes, code cleanup etc shouldn't be applied to the bug fix branches, >>> as their only purpose is to provide bug fixes for existing users. >> >> [Terry] >> The omission of the deletion from the 5/5 revision was a bug in that >> revision. If the removal of OS9 support was documented (announced), >> which I presume it was, then one could consider any visible trace >> remaining to be a bug. 
FTR, it was documented in PEP 11 as removed in 2.4 (but not in 2.4's whatsnew). > Well, the question is: can anything break due to the code removal. > In principle, stuff *could* break even by a function that is supposedly > unused, and had supposedly been removed. The problem is that a > supposedly-unused function actually might be used somewhere, in some > context unrelated to its intended usage. It's known that people do modify distutils.sysconfig._config_vars, a private dictionary; I can imagine some really contrived example of code using _init_mac, the function I removed, to set sysconfig values for Mac OS 9 in 2.7 code. 1% chance, I guess. >> Perhaps the policy on code cleanup should be a bit more liberal for 2.7 >> *because* it will be maintained for several years and *because* there is >> no newer 2.x branch to apply changes to. > > You mean, it's ok to break stuff with no gain in 2.7 bug fix releases? I don't think Terry was suggesting breakages, just other kinds of cleanup. In this particular case, I think now that I should have followed distutils policy (which is less liberal than the rest of the stdlib). If there are no arguments against it this week, I will revert the commit. Regards From merwok at netwok.org Mon Nov 8 17:02:24 2010 From: merwok at netwok.org (Éric Araujo) Date: Mon, 08 Nov 2010 17:02:24 +0100 Subject: [Python-Dev] Backward incompatible API changes in the pydoc module In-Reply-To: References: Message-ID: <4CD81F10.2050103@netwok.org> Hi Nick, If there is no enormous difficulty in maintaining compatibility, I think the usual deprecation process should be followed. We don't know who is using pydoc as a library, so let's play safe and not risk breaking their code (especially considering that it must not have been easy to write code extending pydoc :). BTW, doesn't the process start with PendingDeprecationWarnings, then DeprecationWarnings?
Regards From tseaver at palladion.com Mon Nov 8 17:15:01 2010 From: tseaver at palladion.com (Tres Seaver) Date: Mon, 08 Nov 2010 11:15:01 -0500 Subject: [Python-Dev] Continuing 2.x In-Reply-To: References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FC57@exchcn.ccp.ad.local> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/08/2010 04:42 AM, Lennart Regebro wrote: > 2010/10/28 Kristján Valur Jónsson : >> Hello all. >> >> >> >> So, python 2.7 is in bugfix only mode. "trunk" is off limits. So, where >> does one make improvements to the distinguished, and still very much alive, >> 2.x series of Python? >> >> The answer would seem to be "one doesn't". But must it be that way? > > Except for making releases that start backporting Python 3 features > and breaking backwards compatibility gradually (which may or may not > be a good idea) I don't see the point. There isn't much to do when it > comes to improving the language, and there is a moratorium anyway. > Improvements in the standard library can be more easily done in > external libraries anyway, and then you can release the improved > libraries for everything from Python 2.4 and forwards if you like. > > So it can be done, but the question is "Why?" The OP has existing patches to contribute which the core python-dev team consider "not-a-bugfix", and hence not acceptable for the 2.7 branch. Tres.
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzYIgUACgkQ+gerLs4ltQ4i1QCfVwDdlkd9tGj6ayKBq3xpiHAW fIYAoNwDx35RfC5VYEyVjhJBbCxrqfXk =bnTg -----END PGP SIGNATURE----- From g.brandl at gmx.net Mon Nov 8 17:13:45 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 08 Nov 2010 17:13:45 +0100 Subject: [Python-Dev] Backward incompatible API changes in the pydoc module In-Reply-To: <4CD81F10.2050103@netwok.org> References: <4CD81F10.2050103@netwok.org> Message-ID: On 08.11.2010 17:02, Éric Araujo wrote: > Hi Nick, > > If there is no enormous difficulty in maintaining compatibility, I think > the usual deprecation process should be followed. We don't know who is > using pydoc as a library, so let's play safe and not risk breaking their > code (especially considering that it must not have been easy to write > code extending pydoc :). > > BTW, doesn't the process start with PendingDeprecationWarnings, then > DeprecationWarnings? PendingDeprecationWarnings only make sense for larger changes, especially now that both Pending and normal DeprecationWarnings are silent by default. See PEP 387 (which is only a draft though). Georg From rdmurray at bitdance.com Mon Nov 8 17:18:59 2010 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 08 Nov 2010 11:18:59 -0500 Subject: [Python-Dev] Backward incompatible API changes in the pydoc module In-Reply-To: <4CD81F10.2050103@netwok.org> References: <4CD81F10.2050103@netwok.org> Message-ID: <20101108161859.87C6A234DE2@kimball.webabinitio.net> On Mon, 08 Nov 2010 17:02:24 +0100, wrote: > If there is no enormous difficulty in maintaining compatibility, I think > the usual deprecation process should be followed.
We don't know who is > using pydoc as a library, so let's play safe and not risk breaking their > code (especially considering that it must not have been easy to write > code extending pydoc :). > > BTW, doesn't the process start with PendingDeprecationWarnings, then > DeprecationWarnings? No, PendingDeprecationWarning was something used when we wanted a default-silent deprecation warning for a while before doing an actual deprecation. Now that deprecation warnings are silent by default we'll probably never need PendingDeprecationWarning ever again :) --David From belopolsky at users.sourceforge.net Mon Nov 8 18:20:23 2010 From: belopolsky at users.sourceforge.net (Alexander Belopolsky) Date: Mon, 8 Nov 2010 12:20:23 -0500 Subject: [Python-Dev] Breaking undocumented API Message-ID: Was: [issue2001] Pydoc interactive browsing enhancement On Sun, Nov 7, 2010 at 9:17 AM, Nick Coghlan wrote: .. > > I'd actually started typing out the command to commit this before it finally clicked that the patch changes public > APIs of the pydoc module in incompatible ways. Sure, they aren't documented, but the fact they aren't protected > by an underscore means I'm not comfortable with the idea of removing them or radically change their functionality > without going through a deprecation period first. > I have a similar issue with the trace module and would appreciate some guidance on this as well. The trace module's documented API includes just the Trace class, but the module defines several helper functions and classes that do not start with a leading underscore and are not excluded from * imports by __all__. (There is no trace.__all__.) I don't think a strict "don't remove without deprecation" policy is workable. For example, is the trace.rx_blank constant part of the trace module API that needs to be preserved indefinitely? I don't even know if it is possible to add a deprecation warning to it, but CoverageResults._blank_re would certainly be a better place for it.
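On the question of whether a deprecation warning can be attached to a module-level constant such as trace.rx_blank: later Python versions (3.7 and up, through the module-level __getattr__ hook of PEP 562) make this possible, though it was not available for the releases discussed in this thread. The sketch below is not the trace module's code. The module name trace_demo, the private name _rx_blank and the regular expression are hypothetical, and a throwaway module object stands in for a real source file.

```python
import re
import types
import warnings

# Stand-in for a real module; in a source file these would be top-level names.
trace_demo = types.ModuleType("trace_demo")
# Private home for the constant (illustrative pattern only).
trace_demo._rx_blank = re.compile(r"^\s*(#.*)?$")


def _module_getattr(name):
    # Called only when normal attribute lookup on the module fails (PEP 562).
    if name == "rx_blank":
        warnings.warn(
            "trace_demo.rx_blank is deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
        return trace_demo._rx_blank
    raise AttributeError("module 'trace_demo' has no attribute %r" % name)


# In a real module this is simply `def __getattr__(name): ...` at top level.
trace_demo.__getattr__ = _module_getattr
```

Because the hook only fires for names that are not found normally, the old public name must be removed from the module namespace for the warning to trigger; internal code keeps using the private name and never sees the warning.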
The functions I have a specific need to modify (see http://bugs.python.org/issue10342) are trace.find_strings() and find_executable_linenos(). The functions take a module's file name, but I need to make them take the module object in order to be able to deal with modules that have custom loaders. The trace.find_strings() function is clearly internal. Its name does not even reflect what it does (finding docstring locations), so it was never intended for use outside of the trace module. However, Google Code Search reveals that people do use it and other functions in their code. This suggests that trace.find_strings() should probably be preserved or properly deprecated. If this is the case, should we fix bugs in it? Note that it currently has a bug because it ignores the coding cookie when opening a Python source file. Should this be fixed? I freely admit that I have more questions than answers, so I would like to hear from a wider audience. From fuzzyman at voidspace.org.uk Mon Nov 8 18:35:56 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 08 Nov 2010 17:35:56 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: Message-ID: <4CD834FC.5020901@voidspace.org.uk> > Was: [issue2001] Pydoc interactive browsing enhancement > > [snip...] > This suggests that trace.find_strings() should probably be preserved > or properly deprecated. If this is the case, should we fix bugs in > it? Note that it currently has a bug because it ignores the coding > cookie when opening a Python source file. Should this be fixed? > > I freely admit that I have more questions than answers, so I would > like to hear from a wider audience. If you deprecate it then you don't *have* to fix bugs in it. If we know it is used then we can't remove it without deprecation.
If the function is no longer needed but we want to exclude it from the public API, you could create a new function in the module, with a leading underscore name, fix the bugs in that and deprecate the old name. Alternatively you could make the old name an alias for the new one with a deprecation warning applied. That way the old name does get the bugfixes but is still deprecated. Michael > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From fuzzyman at voidspace.org.uk Mon Nov 8 18:40:47 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 08 Nov 2010 17:40:47 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CD834FC.5020901@voidspace.org.uk> References: <4CD834FC.5020901@voidspace.org.uk> Message-ID: <4CD8361F.3070308@voidspace.org.uk> >> Was: [issue2001] Pydoc interactive browsing enhancement >> >> [snip...] >> This suggests that trace.find_strings() should probably be preserved >> or properly deprecated. If this is the case, should we fix bugs in >> it? Note that it currently has a bug because it ignores the coding >> cookie when opening a Python source file. Should this be fixed?
>> >> I freely admit that I have more questions than answers, so I would >> like to hear from a wider audience. > > If you deprecate it then you don't *have* to fix bugs in it. If we > know it is used then we can't remove it without deprecation. > > If the function is no longer needed but we want to exclude it from the > public API, you could create a new function in the module, with a > leading underscore name, fix the bugs in that and deprecate the old name. > Sorry, this was meant to say "if the function is *still needed* (internally to the module) but we want to exclude it from the API"... This would be a good approach to clarifying the public API of standard library modules. At least that way we could work towards a consistent policy. All the best, Michael > Alternatively you could make the old name an alias for the new one > with a deprecation warning applied. That way the old name does get the > bugfixes but is still deprecated. > > Michael > >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.
From solipsis at pitrou.net Mon Nov 8 18:50:32 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 08 Nov 2010 18:50:32 +0100 Subject: [Python-Dev] Buildbot for AIX In-Reply-To: <4CD83772.8040701@users.sourceforge.net> References: <1284586127.79.0.767793857364.issue1633863@psf.upfronthosting.co.za> <4C93377C.4040300@users.sourceforge.net> <4C936225.1000500@v.loewis.de> <4C9770FB.1010103@users.sourceforge.net> <4C97B422.4040606@v.loewis.de> <4CB87587.9020100@users.sourceforge.net> <20101028164515.33ac8412@pitrou.net> <4CD83772.8040701@users.sourceforge.net> Message-ID: <1289238632.3572.14.camel@localhost.localdomain> On Monday 08 November 2010 at 18:46 +0100, Sébastien Sablé wrote: > xlc: 1501-216 (W) command option - -qmaxmem=18000 is not recognized - > passed to ld Is -qmaxmem really necessary to build Python? If so, you could try passing it in CFLAGS. > However running 2 different slaves per host in order to distinguish xlc > and gcc would be OK; though I would appreciate if they could run > sequentially rather than in parallel as that would limit the host load. If there are two separate slaves, I can't think of any simple way to run builds sequentially. Perhaps you can assign both of them to a single CPU (assuming AIX allows that). Regards Antoine. From ctb at msu.edu Mon Nov 8 18:57:28 2010 From: ctb at msu.edu (C.
Titus Brown) Date: Mon, 8 Nov 2010 09:57:28 -0800 Subject: [Python-Dev] Buildbot for AIX In-Reply-To: <1289238632.3572.14.camel@localhost.localdomain> References: <1284586127.79.0.767793857364.issue1633863@psf.upfronthosting.co.za> <4C93377C.4040300@users.sourceforge.net> <4C936225.1000500@v.loewis.de> <4C9770FB.1010103@users.sourceforge.net> <4C97B422.4040606@v.loewis.de> <4CB87587.9020100@users.sourceforge.net> <20101028164515.33ac8412@pitrou.net> <4CD83772.8040701@users.sourceforge.net> <1289238632.3572.14.camel@localhost.localdomain> Message-ID: <20101108175728.GA16330@idyll.org> On Mon, Nov 08, 2010 at 06:50:32PM +0100, Antoine Pitrou wrote: > > However running 2 different slaves per host in order to distinguish xlc > > and gcc would be OK; though I would appreciate if they could run > > sequentially rather than in parallel as that would limit the host load. > > If there are two separate slaves, I can't think of any simple way to run > builds sequentially. Perhaps you can assign both of them to a single CPU > (assuming AIX allows that). You can specify a slave lock to do this, in buildbot: http://buildbot.net/buildbot/docs/0.8.1/Interlocks.html One of the neat things that a master/slave system like buildbot provides... cheers, --titus -- C.
Titus Brown, ctb at msu.edu From sable at users.sourceforge.net Mon Nov 8 18:46:26 2010 From: sable at users.sourceforge.net (Sébastien Sablé) Date: Mon, 08 Nov 2010 18:46:26 +0100 Subject: [Python-Dev] Buildbot for AIX In-Reply-To: <20101028164515.33ac8412@pitrou.net> References: <1284586127.79.0.767793857364.issue1633863@psf.upfronthosting.co.za> <4C93377C.4040300@users.sourceforge.net> <4C936225.1000500@v.loewis.de> <4C9770FB.1010103@users.sourceforge.net> <4C97B422.4040606@v.loewis.de> <4CB87587.9020100@users.sourceforge.net> <20101028164515.33ac8412@pitrou.net> Message-ID: <4CD83772.8040701@users.sourceforge.net> Hi Antoine, I tried to provide command-line arguments to configure instead of environment variables with: configureFlags = ["--with-pydebug", "--without-computed-gotos", "CC=xlc", 'OPT="-O2 -qmaxmem=18000"'] But that would fail: on the slave, configure would run like that: ./configure --with-pydebug --without-computed-gotos CC=xlc OPT="-O2 -qmaxmem=18000" And the compilation would give some error like that: xlc -c "-O2 -qmaxmem=18000" -O2 -O2 -I. -IInclude -I./Include -I/home/cis/.buildbot/support-buildbot/include -I/home/cis/.buildbot/support-buildbot/include/ncurses -I/home/cis/.buildbot/support-buildbot/include -I/home/cis/.buildbot/support-buildbot/include/ncurses -DPy_BUILD_CORE -o Modules/python.o ./Modules/python.c xlc: 1501-216 (W) command option - -qmaxmem=18000 is not recognized - passed to ld However running 2 different slaves per host in order to distinguish xlc and gcc would be OK; though I would appreciate if they could run sequentially rather than in parallel as that would limit the host load. regards -- Sébastien Sablé On 28/10/2010 16:45, Antoine Pitrou wrote: > On Fri, 15 Oct 2010 17:38:47 +0200 > Sébastien Sablé wrote: >> >> Could you please take a look at those modifications in master.cfg, >> provide me some password for the bot slaves and apply the corrections in >> those issues?
> > About the master.cfg modifications: there should be no need for > separate environment variables. Instead, you should be able to specify > them as command-line arguments to ./configure, e.g.: > > ["--with-pydebug", "--without-computed-gotos", "CC=xlc", > 'OPT="-O2 -qmaxmem=18000"'] > > Can you check this works for you? > > Also, there's no need to complicate the buildbot naming procedure. > You should be able to run several buildslaves on a single machine, > provided we give you separate credentials: one per compiler type. > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/sable%40users.sourceforge.net From exarkun at twistedmatrix.com Mon Nov 8 19:07:24 2010 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Mon, 08 Nov 2010 18:07:24 -0000 Subject: [Python-Dev] Buildbot for AIX In-Reply-To: <1289238632.3572.14.camel@localhost.localdomain> References: <1284586127.79.0.767793857364.issue1633863@psf.upfronthosting.co.za> <4C93377C.4040300@users.sourceforge.net> <4C936225.1000500@v.loewis.de> <4C9770FB.1010103@users.sourceforge.net> <4C97B422.4040606@v.loewis.de> <4CB87587.9020100@users.sourceforge.net> <20101028164515.33ac8412@pitrou.net> <4CD83772.8040701@users.sourceforge.net> <1289238632.3572.14.camel@localhost.localdomain> Message-ID: <20101108180724.2040.587669116.divmod.xquotient.740@localhost.localdomain> On 05:50 pm, solipsis at pitrou.net wrote: >On Monday 08 November 2010 at 18:46 +0100, Sébastien Sablé wrote: >>xlc: 1501-216 (W) command option - -qmaxmem=18000 is not recognized - >>passed to ld > >Is -qmaxmem really necessary to build Python? >If so, you could try passing it in CFLAGS.
>>However running 2 different slaves per host in order to distinguish >>xlc >>and gcc would be OK; though I would appreciate if they could run >>sequentially rather than in parallel as that would limit the host >>load. > >If there are two separate slaves, I can't think of any simple way to >run >builds sequentially. Perhaps you can assign both of them to a single >CPU >(assuming AIX allows that). A master lock will allow this. Although just having a single slave and using a slave lock would be simpler. Jean-Paul From rrr at ronadam.com Mon Nov 8 19:07:47 2010 From: rrr at ronadam.com (Ron Adam) Date: Mon, 08 Nov 2010 12:07:47 -0600 Subject: [Python-Dev] Backward incompatible API changes in the pydoc module In-Reply-To: <20101108151255.2040.168229257.divmod.xquotient.738@localhost.localdomain> References: <20101108151255.2040.168229257.divmod.xquotient.738@localhost.localdomain> Message-ID: <4CD83C73.5000906@ronadam.com> On 11/08/2010 09:12 AM, exarkun at twistedmatrix.com wrote: > On 11:44 am, ncoghlan at gmail.com wrote: >> All, >> >> I was about to commit the patch for issue 2001 (the improvements to >> the pydoc web server and the removal of the Tk GUI) when I realised >> that pydoc.serve() and pydoc.gui() are technically public standard >> library APIs (albeit undocumented ones). >> >> Currently the patch switches serve() to start the new server >> implementation and gui() to start the server and open a browser window >> for it. >> >> It occurred to me that, despite the "it's an application" feel to the >> pydoc web server APIs, it may be a better idea to leave the two >> existing functions alone (aside from adding DeprecationWarning), and >> using new private function names to start the new server and the web >> browser. >> >> Is following the standard deprecation procedure the better course >> here, or am I being overly paranoid? > > Following the deprecation procedure here sounds awesome to me. 
Thanks > for considering it, I hope you'll choose to go that way. I want to be clear on what isn't changing. All of the help() function features that python depends on and any of the code that is required for that is staying the same. All of the static html document generating features, and the code that depends on them, are staying the same. These static pages do not depend on any parts of pydoc after they are generated. Those are the parts that are most likely to be used in other applications as well. The new changes only affect the interactive browsing mode. The tk search box was removed. Doing that enabled the browser interface to be used on systems that don't have tk installed. The html web server was rewritten and a search feature was added so that you can do the same searches in the web browser that you did in the tk search box. Do you (or anyone) know of any programs that access pydoc's tk search window, or its server parts, directly? The server was so specific and included very specific pydoc html code, so it would have been very difficult for it to be used for anything else. Any thoughts? I think the main issue Nick is concerned with is the functions and options used to start pydoc in the interactive mode.
Cheers, Ron From tjreedy at udel.edu Mon Nov 8 19:12:18 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 08 Nov 2010 13:12:18 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: Message-ID: On 11/8/2010 12:20 PM, Alexander Belopolsky wrote: > Was: [issue2001] Pydoc interactive browsing enhancement > > On Sun, Nov 7, 2010 at 9:17 AM, Nick Coghlan wrote: > .. >> >> I'd actually started typing out the command to commit this before it finally clicked that the patch changes public >> APIs of the pydoc module in incompatible ways. Sure, they aren't documented, but the fact they aren't protected >> by an underscore means I'm not comfortable with the idea of removing them or radically change their functionality >> without going through a deprecation period first. >> > > I have a similar issue with the trace module and would appreciate some > guidance on this as well. The trace module documented API includes > just the Trace class, but the module defines several helper functions > and classes that do not start with a leading underscore and are not > excluded from * imports by __all__. (There is no trace.__all__.)
The trace module *appears* to be an ancient module written at a time (fictional or actual) when there was no '_' and '__all__' convention and only a loose 'public' == documented convention. The undocumented public-looking private stuff is a huge mess that Eli and I intentionally passed over in our July/August patch documenting (and fixing) the public stuff. I hope we included everything that should be public. In order to warn about constants getting renamed or moved, is it possible to issue an off-by-default warning on module import, something like "Trace is an ancient module with public names for many undocumented private constants and functions. Use of these is deprecated. See lib doc for more." -- Terry Jan Reedy From belopolsky at users.sourceforge.net Mon Nov 8 19:35:39 2010 From: belopolsky at users.sourceforge.net (Alexander Belopolsky) Date: Mon, 8 Nov 2010 13:35:39 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CD834FC.5020901@voidspace.org.uk> References: <4CD834FC.5020901@voidspace.org.uk> Message-ID: On Mon, Nov 8, 2010 at 12:35 PM, Michael Foord wrote: .. > If you deprecate it then you don't *have* to fix bugs in it. If we know it > is used then we can't remove it without deprecation. > What about the maintenance branch? From fuzzyman at voidspace.org.uk Mon Nov 8 19:39:26 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 08 Nov 2010 18:39:26 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <4CD834FC.5020901@voidspace.org.uk> Message-ID: <4CD843DE.1040703@voidspace.org.uk> > On Mon, Nov 8, 2010 at 12:35 PM, Michael Foord > wrote: > .. >> If you deprecate it then you don't *have* to fix bugs in it. If we know it >> is used then we can't remove it without deprecation. >> > What about the maintenance branch? So you have a bug in the module that can only be fixed in a function you want to deprecate? It depends what approach you are taking in 3.2. 
If you are creating a new private function, in which you will fix the bug, but keeping an alias around to the old name so that you can deprecate it - then merely fixing the bug in the maintenance branch should be fine.

(If you're deprecating the function because it is unneeded then you don't need to fix bugs in the maintenance branch either - I guess no-one would complain if you did though.)

Michael

--
http://www.voidspace.org.uk/

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

From belopolsky at users.sourceforge.net Mon Nov 8 19:44:22 2010
From: belopolsky at users.sourceforge.net (Alexander Belopolsky)
Date: Mon, 8 Nov 2010 13:44:22 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <4CD843DE.1040703@voidspace.org.uk>
References: <4CD834FC.5020901@voidspace.org.uk> <4CD843DE.1040703@voidspace.org.uk>
Message-ID:

On Mon, Nov 8, 2010 at 1:39 PM, Michael Foord wrote:
..
> So you have a bug in the module that can only be fixed in a function you
> want to deprecate?
>

No, I have a bug in a function that I want to deprecate. You said I don't need to fix it if I add a deprecation warning. However, as far as I know, deprecation warnings are not backported to maintenance branches while bug fixes are. So the specific question is: there is a bug in trace.find_strings() - should it be fixed in 3.1-maint?
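
The find_strings() bug raised in this thread is that the source file is opened without regard to its coding cookie. A minimal sketch of what honoring the cookie could look like is below. It uses tokenize.detect_encoding() from the standard library; the helper name read_source is hypothetical and is not part of the trace module, and this is an illustration of the technique rather than the actual fix.

```python
import tokenize


def read_source(path):
    """Return the text of a Python source file, honoring its coding cookie.

    Hypothetical helper for illustration only; not part of the trace module.
    """
    with open(path, "rb") as raw:
        # detect_encoding() inspects the BOM and the first two lines for a
        # coding cookie and falls back to UTF-8 when neither is present.
        encoding, _ = tokenize.detect_encoding(raw.readline)
        raw.seek(0)
        return raw.read().decode(encoding)
```

A caller that currently opens the file with the default encoding would read it through a helper like this instead, so that non-ASCII docstrings decode correctly.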
From fuzzyman at voidspace.org.uk Mon Nov 8 19:46:39 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 08 Nov 2010 18:46:39 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <4CD834FC.5020901@voidspace.org.uk> <4CD843DE.1040703@voidspace.org.uk>
Message-ID: <4CD8458F.3080204@voidspace.org.uk>

> On Mon, Nov 8, 2010 at 1:39 PM, Michael Foord wrote:
> ..
>> So you have a bug in the module that can only be fixed in a function you
>> want to deprecate?
>>
> No, I have a bug in a function that I want to deprecate. You said I
> don't need to fix it if I add a deprecation warning. However, as far
> as I know, deprecation warnings are not backported to maintenance
> branches while bug fixes are. So the specific question is: there is a
> bug in trace.find_strings() - should it be fixed in 3.1-maint?

My opinion would be:

* No we don't backport the deprecation warning
* No we don't need to fix the bug

Others may disagree. (Logic being that we won't fix the bug in 3.2, if we fixed it in 2.7 then we would have to fix it in 3.2. Therefore we shouldn't fix in 2.7.)

Michael

--
http://www.voidspace.org.uk/
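
The approach Michael describes earlier in the thread (a new private function that carries the fix, with the old public name kept as a deprecated alias) could be sketched roughly as follows. The private name _find_strings, the warning text and the placeholder body are assumptions for illustration; this is not the actual trace module code.

```python
import warnings


def _find_strings(filename):
    """Private replacement; the real logic and any bug fix would live here."""
    # Placeholder result so that the sketch is runnable.
    return {}


def find_strings(filename):
    """Deprecated public alias kept so that existing callers keep working."""
    warnings.warn(
        "find_strings() is deprecated",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this alias
    )
    return _find_strings(filename)
```

With this layout a bug fix made in the private function reaches callers of the old name as well, and only the development branch needs to carry the warning.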
From brett at python.org Mon Nov 8 20:00:45 2010
From: brett at python.org (Brett Cannon)
Date: Mon, 8 Nov 2010 11:00:45 -0800
Subject: [Python-Dev] GUI test runner tool
In-Reply-To: <4CD7E886.4060702@voidspace.org.uk>
References: <4CD7E886.4060702@voidspace.org.uk>
Message-ID:

On Mon, Nov 8, 2010 at 04:09, Michael Foord wrote:
> Hello all,
>
> Now that unittest has test discovery, Mark Roddy has been working on
> resurrecting the old GUI test runner (using Tkinter):
>
> https://bitbucket.org/markroddy/unittestgui
>
> This was part of the original pyunit project but I believe it was never part
> of the standard library:
>
> http://sourceforge.net/projects/pyunit/
>
> Here's a screenshot of what it looks like:
>
> http://skitch.com/fuzzyman/dhu9r/pyunit
>
> I'd like to propose adding it to Python in Tools/ and am volunteering to
> maintain it.

Does that mean upgrading it as well? =) For instance it would be great to get it to use ttk so it looks a bit sharper, supports skipped tests and expected failures, and dream-of-dreams ties into regrtest so you can just check boxes instead of passing a ton of CLI flags.

> If the answer is "not yet" that is fine as it can go into
> unittest2 first. Mark has updated it to work with test discovery and added
> support for configuring test discovery in the same way as you can from the
> command line. It is a nice tool for those new to writing tests who aren't
> yet familiar with the command line or prefer a GUI.

I personally have no problem with it going into tools as long as it can also be used to run the tests in the stdlib. Just don't put it in Demos/ . =)

-Brett

>
> In its basic form you simply pick a directory and unittestgui will discover
> and run all the tests it finds. It would be nice if it provided more
> diagnostic information on tests it ran (clicking through test results) but
> these can be added later.
>
> All the best,
>
> Michael Foord
>
> --
>
> http://www.voidspace.org.uk/
>

From alexander.belopolsky at gmail.com Mon Nov 8 20:28:25 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 8 Nov 2010 14:28:25 -0500
Subject: [Python-Dev] GUI test runner tool
In-Reply-To: <4CD7E886.4060702@voidspace.org.uk>
References: <4CD7E886.4060702@voidspace.org.uk>
Message-ID:

On Mon, Nov 8, 2010 at 7:09 AM, Michael Foord wrote:
..
> I'd like to propose adding [unittestgui] to Python in Tools/ and am volunteering to
> maintain it.

Why not adding it under Lib/unittest/? I think Tools/ is a less attractive location for most users than say PyPI or some other package repository. Tools/ is for stuff that is primarily of interest to python developers, not python users. OS vendors are less likely to install packages in Tools/ in a user-visible place than they are a popular 3rd-party package.
From brett at python.org Mon Nov 8 20:58:25 2010
From: brett at python.org (Brett Cannon)
Date: Mon, 8 Nov 2010 11:58:25 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References:
Message-ID:

On Mon, Nov 8, 2010 at 09:20, Alexander Belopolsky wrote:
> Was: [issue2001] Pydoc interactive browsing enhancement
>
> On Sun, Nov 7, 2010 at 9:17 AM, Nick Coghlan wrote:
> ..
>>
>> I'd actually started typing out the command to commit this before it finally clicked that the patch changes public
>> APIs of the pydoc module in incompatible ways. Sure, they aren't documented, but the fact they aren't protected
>> by an underscore means I'm not comfortable with the idea of removing them or radically change their functionality
>> without going through a deprecation period first.
>>
>
> I have a similar issue with the trace module and would appreciate some
> guidance on this as well. The trace module documented API includes
> just the Trace class, but the module defines several helper functions
> and classes that do not start with a leading underscore and are not
> excluded from * imports by __all__. (There is no trace.__all__.)

I think we need to, as a group, decide how to handle undocumented APIs that don't have a leading underscore: they get treated just the same as the documented APIs, or are they private regardless and thus we can change them at our whim?

>
> I don't think a strict don't remove without deprecation policy is
> workable. For example, is trace.rx_blank constant part of the trace
> module API that needs to be preserved indefinitely? I don't even know
> if it is possible to add a deprecation warning to it, but
> CoverageResults._blank_re would certainly be a better place for it.

The deprecation policy obviously cannot apply to module-level attributes.

>
> The functions I have specific need to modify (See
> http://bugs.python.org/issue10342) are trace.find_strings(), and
> find_executable_linenos(). The functions take module's file name, but
> I need to make them to take the module object in order to be able to
> deal with modules that have custom loaders.
>
> The trace.find_strings() function is clearly internal. It's name does
> not even reflect what it does (finding docstring locations), so it was
> never intended for use outside of the trace module. However, google
> code search reveals that people do use it and other functions in their
> code.
>
> This suggests that trace.find_strings() should probably be preserved
> or properly deprecated. If this is the case, should we fix bugs in
> it? Note that it currently has a bug because it ignores the coding
> cookie when opening python source file. Should this be fixed?
>
> I freely admit that I have more questions than answers, so I would
> like to hear from a wider audience.

The main reason I have said that non-underscore names should be properly deprecated (assuming they are not contained in an underscored-named module) is that dir() and help() do not distinguish. If you are perusing a module from the interpreter prompt you have no way to know whether something is public or private if it lacks an underscore. Is it reasonable to assume that any API found through dir() or help() must be checked with the official docs before you can consider using it, even if you have no explicit need to read the official docs?

I (unfortunately) say no, which is why I have argued that non-underscored names need to be properly deprecated. This obviously places a nasty burden on us, though, so I don't like taking this position. Unless we can make it clearly known through help() or something that the official docs must be checked to know what can and cannot be reliably used I don't think it is reasonable to force users to not be able to rely on help() (we should probably change help() to print a big disclaimer for anything with a leading underscore, though).

But that doesn't mean we can't go through, fix up our names, and deprecate the old public names; that's fair game in my book.

From raymond.hettinger at gmail.com Mon Nov 8 21:56:22 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 8 Nov 2010 12:56:22 -0800
Subject: [Python-Dev] GUI test runner tool
In-Reply-To:
References: <4CD7E886.4060702@voidspace.org.uk>
Message-ID: <84D0A582-85BD-45B4-833B-181A971F44DF@gmail.com>

On Nov 8, 2010, at 11:28 AM, Alexander Belopolsky wrote:
> On Mon, Nov 8, 2010 at 7:09 AM, Michael Foord wrote:
> ..
>> I'd like to propose adding [unittestgui] to Python in Tools/ and am volunteering to
>> maintain it.
>
> Why not adding it under Lib/unittest/?

Michael's instinct to put it in Tools is a good one. GUI preferences and support vary among users and environments. Also, any given GUI runner is just one of many possible solutions and there is no reason to commit to one. Better to add something to Tools, post an ASPN recipe, or list a package on PyPI.

If you need it to be more visible, we can always give it a mention in the docs. Though we might want to mention more full-featured tools like Hudson.

Remember, the standard library is where code goes to die ;-)

Raymond

From exarkun at twistedmatrix.com Mon Nov 8 22:03:18 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Mon, 08 Nov 2010 21:03:18 -0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References:
Message-ID: <20101108210318.2040.535133461.divmod.xquotient.744@localhost.localdomain>

On 07:58 pm, brett at python.org wrote:
>>I don't think a strict don't remove without deprecation policy is
>>workable. For example, is trace.rx_blank constant part of the trace
>>module API that needs to be preserved indefinitely? I don't even know
>>if it is possible to add a deprecation warning to it, but
>>CoverageResults._blank_re would certainly be a better place for it.
> >The deprecation policy obviously cannot apply to module-level >attributes. I'm not sure why this is. Can you elaborate? > >The main reason I have said that non-underscore names should be >properly deprecated (assuming they are not contained in an >underscored-named module) is that dir() and help() do not distinguish. >If you are perusing a module from the interpreter prompt you have no >way to know whether something is public or private if it lacks an >underscore. Is it reasonable to assume that any API found through >dir() or help() must be checked with the official docs before you can >consider using it, even if you have no explicit need to read the >official docs? > >I (unfortunately) say no, which is why I have argued that >non-underscored names need to be properly deprecated. This obviously >places a nasty burden on us, though, so I don't like taking this >position. Unless we can make it clearly known through help() or >something that the official docs must be checked to know what can and >cannot be reliably used I don't think it is reasonable to force users >to not be able to rely on help() (we should probably change help() to >print a big disclaimer for anything with a leading underscore, >though). > >But that doesn't mean we can't go through, fix up our names, and >deprecate the old public names; that's fair game in my book. +1 Jean-Paul From db3l.net at gmail.com Mon Nov 8 22:11:36 2010 From: db3l.net at gmail.com (David Bolen) Date: Mon, 08 Nov 2010 16:11:36 -0500 Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2 3.x" buildbot References: <201011061219.55473.victor.stinner@haypocalc.com> <201011070430.54085.victor.stinner@haypocalc.com> <20101108122333.GJ27974@nexus.in-nomine.org> Message-ID: Jeroen Ruigrok van der Werven writes: > -On [20101108 00:36], David Bolen (db3l.net at gmail.com) wrote: >>Well, I think the SYSV semaphores are either less limited or at least >>more adjustable. 
They've certainly been around longer in FreeBSD. >>The POSIX semaphore support is not enabled by default in FreeBSD 7, so >>I added loader.conf stuff to load them (as part of issue7272). > > It is enabled by default on FreeBSD 8 at least. > Looking through the repository it seems 7-STABLE has it enabled by default > as well in the GENERIC kernel (the standard one it boots with after its > first install). It seems this was added for 7.3 and onward. So 7.2 and > before need an "options P1003_1B_SEMAPHORES" added to their kernel at least. > The SYSV options are already present in the entire 7.x line. My use of "enabled" may not have been the best word choice since I didn't mean to imply a kernel option. I'm still using GENERIC on the 7.2 buildbot, so I didn't need to recompile the kernel in that release either. The issue was that the POSIX semaphore module wasn't loaded by default (something I thought only changed in 8.x), so the buildbot currently has a 'sem_load="YES"' loader.conf entry to ensure that's done. -- David From brett at python.org Mon Nov 8 22:25:58 2010 From: brett at python.org (Brett Cannon) Date: Mon, 8 Nov 2010 13:25:58 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <20101108210318.2040.535133461.divmod.xquotient.744@localhost.localdomain> References: <20101108210318.2040.535133461.divmod.xquotient.744@localhost.localdomain> Message-ID: On Mon, Nov 8, 2010 at 13:03, wrote: > On 07:58 pm, brett at python.org wrote: >>> >>> I don't think a strict don't remove without deprecation policy is >>> workable. ?For example, is trace.rx_blank constant part of the trace >>> module API that needs to be preserved indefinitely? ?I don't even know >>> if it is possible to add a deprecation warning to it, but >>> CoverageResults._blank_re would certainly be a better place for it. >> >> The deprecation policy obviously cannot apply to module-level attributes. > > I'm not sure why this is. ?Can you elaborate? 
There is no way to directly trigger a DeprecationWarning for an attribute. We can still document it, but there is just no way to programmatically enforce it. -Brett >> >> The main reason I have said that non-underscore names should be >> properly deprecated (assuming they are not contained in an >> underscored-named module) is that dir() and help() do not distinguish. >> If you are perusing a module from the interpreter prompt you have no >> way to know whether something is public or private if it lacks an >> underscore. Is it reasonable to assume that any API found through >> dir() or help() must be checked with the official docs before you can >> consider using it, even if you have no explicit need to read the >> official docs? >> >> I (unfortunately) say no, which is why I have argued that >> non-underscored names need to be properly deprecated. This obviously >> places a nasty burden on us, though, so I don't like taking this >> position. Unless we can make it clearly known through help() or >> something that the official docs must be checked to know what can and >> cannot be reliably used I don't think it is reasonable to force users >> to not be able to rely on help() (we should probably change help() to >> print a big disclaimer for anything with a leading underscore, >> though). >> >> But that doesn't mean we can't go through, fix up our names, and >> deprecate the old public names; that's fair game in my book. > > +1 > > Jean-Paul > From rrr at ronadam.com Mon Nov 8 22:36:25 2010 From: rrr at ronadam.com (Ron Adam) Date: Mon, 08 Nov 2010 15:36:25 -0600 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: Message-ID: <4CD86D59.8070006@ronadam.com> On 11/08/2010 01:58 PM, Brett Cannon wrote: > On Mon, Nov 8, 2010 at 09:20, Alexander Belopolsky > wrote: >> Was: [issue2001] Pydoc interactive browsing enhancement >> >> On Sun, Nov 7, 2010 at 9:17 AM, Nick Coghlan wrote: >> .. 
>>> >>> I'd actually started typing out the command to commit this before it finally clicked that the patch changes public >>> APIs of the pydoc module in incompatible ways. Sure, they aren't documented, but the fact they aren't protected >>> by an underscore means I'm not comfortable with the idea of removing them or radically change their functionality >>> without going through a deprecation period first. >>> >> >> I have a similar issue with the trace module and would appreciate some >> guidance on this as well. The trace module documented API includes >> just the Trace class, but the module defines several helper functions >> and classes that do not start with a leading underscore and are not >> excluded from * imports by __all__. (There is no trace.__all__.) > > I think we need to, as a group, decide how to handle undocumented APIs > that don't have a leading underscore: they get treated just the same > as the documented APIs, or are they private regardless and thus we can > change them at our whim? My understanding is that anything with an actual docstring is part of the public API. Any thing with a leading underscore is private. And to a lesser extent, objects with out docstrings, but have comments instead or nothing, may change, so don't depend on them. Thankfully most things do have docstrings. >> I freely admit that I have more questions than answers, so I would >> like to hear from a wider audience. > > The main reason I have said that non-underscore names should be > properly deprecated (assuming they are not contained in an > underscored-named module) is that dir() and help() do not distinguish. > If you are perusing a module from the interpreter prompt you have no > way to know whether something is public or private if it lacks an > underscore. Is it reasonable to assume that any API found through > dir() or help() must be checked with the official docs before you can > consider using it, even if you have no explicit need to read the > official docs? 
> > I (unfortunately) say no, which is why I have argued that > non-underscored names need to be properly deprecated. This obviously > places a nasty burden on us, though, so I don't like taking this > position. Unless we can make it clearly known through help() or > something that the official docs must be checked to know what can and > cannot be reliably used I don't think it is reasonable to force users > to not be able to rely on help() (we should probably change help() to > print a big disclaimer for anything with a leading underscore, > though). +1 on the help disclaimer for objects with leading underscores. Currently help() does not see comments when they are used in place of a docstring. I think it would be easy to have help notate things with no docstrings as "Warning: Undocumented . Use at your own risk." At first, it would probably have a nice side effect of getting any public API's documented with doc strings. (if they aren't already.) > But that doesn't mean we can't go through, fix up our names, and > deprecate the old public names; that's fair game in my book. I agree. It may also be useful to clarify that importing some "utility" modules is not recommended because they may be changed more often and may not follow the standard process. Would something like the following work, but still allow for importing if the exception is caught with a try except? 

    if __name__ == "__main__":
        main()
    else:
        raise ImportWarning("This is utility module and may be changed.")

Cheers,
   Ron

From exarkun at twistedmatrix.com Mon Nov 8 22:45:12 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Mon, 08 Nov 2010 21:45:12 -0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <20101108210318.2040.535133461.divmod.xquotient.744@localhost.localdomain>
Message-ID: <20101108214512.2040.865703760.divmod.xquotient.763@localhost.localdomain>

On 09:25 pm, brett at python.org wrote:
>On Mon, Nov 8, 2010 at 13:03, wrote:
>>On 07:58 pm, brett at python.org wrote:
>>>>
>>>>I don't think a strict don't remove without deprecation policy is
>>>>workable. For example, is trace.rx_blank constant part of the trace
>>>>module API that needs to be preserved indefinitely? I don't even
>>>>know
>>>>if it is possible to add a deprecation warning to it, but
>>>>CoverageResults._blank_re would certainly be a better place for it.
>>>
>>>The deprecation policy obviously cannot apply to module-level
>>>attributes.
>>
>>I'm not sure why this is. Can you elaborate?
>
>There is no way to directly trigger a DeprecationWarning for an
>attribute. We can still document it, but there is just no way to
>programmatically enforce it.

What about `deprecatedModuleAttribute` () or zope.deprecation () which inspired it?

Jean-Paul

From brett at python.org Mon Nov 8 22:57:43 2010
From: brett at python.org (Brett Cannon)
Date: Mon, 8 Nov 2010 13:57:43 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <20101108214512.2040.865703760.divmod.xquotient.763@localhost.localdomain>
References: <20101108210318.2040.535133461.divmod.xquotient.744@localhost.localdomain> <20101108214512.2040.865703760.divmod.xquotient.763@localhost.localdomain>
Message-ID:

On Mon, Nov 8, 2010 at 13:45, wrote:
> On 09:25 pm, brett at python.org wrote:
>>
>> On Mon, Nov 8, 2010 at 13:03, wrote:
>>>
>>> On 07:58 pm, brett at python.org wrote:
>>>>>
>>>>> I don't think a strict don't remove without deprecation policy is
>>>>> workable. For example, is trace.rx_blank constant part of the trace
>>>>> module API that needs to be preserved indefinitely? I don't even know
>>>>> if it is possible to add a deprecation warning to it, but
>>>>> CoverageResults._blank_re would certainly be a better place for it.
>>>>
>>>> The deprecation policy obviously cannot apply to module-level
>>>> attributes.
>>>
>>> I'm not sure why this is. Can you elaborate?
>>
>> There is no way to directly trigger a DeprecationWarning for an
>> attribute. We can still document it, but there is just no way to
>> programmatically enforce it.
>
> What about `deprecatedModuleAttribute`
> ()
> or zope.deprecation
> () which inspired it?

Just checked the code and it looks like it substitutes the module for some proxy object? To begin that break subclass checks. After that I don't know the ramifications without really digging into the ModuleType code.
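
The proxy idea discussed above (replacing the module object with a stand-in that warns on attribute access) can be sketched as follows. This is an assumption-laden illustration and not how Twisted's deprecatedModuleAttribute or zope.deprecation are implemented; the names DeprecatingModule and deprecate_module_attributes are hypothetical.

```python
import sys
import types
import warnings


class DeprecatingModule(types.ModuleType):
    """Hypothetical module stand-in that warns when a deprecated name is read."""

    def __init__(self, wrapped, deprecated):
        super().__init__(wrapped.__name__, wrapped.__doc__)
        self._wrapped = wrapped        # the real module object
        self._deprecated = deprecated  # maps attribute name -> advice text

    def __getattr__(self, name):
        # Reached only when normal lookup on the stand-in fails, which is the
        # case for every name that lives in the wrapped module.
        if name in self._deprecated:
            warnings.warn(
                "%s.%s is deprecated; %s"
                % (self.__name__, name, self._deprecated[name]),
                DeprecationWarning,
                stacklevel=2,
            )
        return getattr(self._wrapped, name)


def deprecate_module_attributes(module_name, deprecated):
    """Swap the sys.modules entry for a warning stand-in and return it."""
    proxy = DeprecatingModule(sys.modules[module_name], deprecated)
    sys.modules[module_name] = proxy
    return proxy
```

Because the stand-in here subclasses types.ModuleType, an isinstance check against ModuleType still passes, but code that already holds a reference to the original module object never sees the warning, and other ramifications of swapping the sys.modules entry are not examined in this sketch.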
From brett at python.org Mon Nov 8 23:01:18 2010 From: brett at python.org (Brett Cannon) Date: Mon, 8 Nov 2010 14:01:18 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CD86D59.8070006@ronadam.com> References: <4CD86D59.8070006@ronadam.com> Message-ID: On Mon, Nov 8, 2010 at 13:36, Ron Adam wrote: > > > On 11/08/2010 01:58 PM, Brett Cannon wrote: >> >> On Mon, Nov 8, 2010 at 09:20, Alexander Belopolsky >> ?wrote: >>> >>> Was: [issue2001] Pydoc interactive browsing enhancement >>> >>> On Sun, Nov 7, 2010 at 9:17 AM, Nick Coghlan >>> ?wrote: >>> .. >>>> >>>> I'd actually started typing out the command to commit this before it >>>> finally clicked that the patch changes public >>>> APIs of the pydoc module in incompatible ways. Sure, they aren't >>>> documented, but the fact they aren't protected >>>> by an underscore means I'm not comfortable with the idea of removing >>>> them or radically change their functionality >>>> without going through a deprecation period first. >>>> >>> >>> I have a similar issue with the trace module and would appreciate some >>> guidance on this as well. ?The trace module documented API includes >>> just the Trace class, but the module defines several helper functions >>> and classes ?that do not start with a leading underscore and are not >>> excluded from * imports by __all__. ?(There is no trace.__all__.) >> >> I think we need to, as a group, decide how to handle undocumented APIs >> that don't have a leading underscore: they get treated just the same >> as the documented APIs, or are they private regardless and thus we can >> change them at our whim? > > My understanding is that anything with an actual docstring is part of the > public API. ?Any thing with a leading underscore is private. That's a bad rule. Why shouldn't I be able to document something that is not meant for the public so that fellow developers know what the heck should be going on in the code? 
> > And to a lesser extent, objects with out docstrings, but have comments > instead or nothing, may change, so don't depend on them. ?Thankfully most > things do have docstrings. > > >>> I freely admit that I have more questions than answers, so I would >>> like to hear from a wider audience. >> >> The main reason I have said that non-underscore names should be >> properly deprecated (assuming they are not contained in an >> underscored-named module) is that dir() and help() do not distinguish. >> If you are perusing a module from the interpreter prompt you have no >> way to know whether something is public or private if it lacks an >> underscore. Is it reasonable to assume that any API found through >> dir() or help() must be checked with the official docs before you can >> consider using it, even if you have no explicit need to read the >> official docs? >> >> I (unfortunately) say no, which is why I have argued that >> non-underscored names need to be properly deprecated. This obviously >> places a nasty burden on us, though, so I don't like taking this >> position. Unless we can make it clearly known through help() or >> something that the official docs must be checked to know what can and >> cannot be reliably used I don't think it is reasonable to force users >> to not be able to rely on help() (we should probably change help() to >> print a big disclaimer for anything with a leading underscore, >> though). > > +1 on the help disclaimer for objects with leading underscores. > > Currently help() does not see comments when they are used in place of a > docstring. ?I think it would be easy to have help notate things with no > docstrings as "Warning: Undocumented . Use at your own risk." > > At first, it would probably have a nice side effect of getting any public > API's documented with doc strings. (if they aren't already.) > > >> But that doesn't mean we can't go through, fix up our names, and >> deprecate the old public names; that's fair game in my book. 
> > I agree. > > > It may also be useful to clarify that importing some "utility" modules is > not recommended because they may be changed more often and may not follow > the standard process. ?Would something like the following work, but still > allow for importing if the exception is caught with a try except? > > if __name__ == "__main__": > ? ?main() > else: > ? ?raise ImportWarning("This is utility module and may be changed.") Sure it would work, but that doesn't make it pleasant to use. It already breaks how warnings are typically handled by raising it instead of calling warnings.warn(). Plus I'm now supposed to try/except certain imports? That's messy. At that point we are coding in visibility rules instead of following convention and that doesn't sit well with me. -Brett > > Cheers, > ?Ron > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > From raymond.hettinger at gmail.com Mon Nov 8 23:07:46 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 8 Nov 2010 14:07:46 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: Message-ID: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> On Nov 8, 2010, at 11:58 AM, Brett Cannon wrote: > I think we need to, as a group, decide how to handle undocumented APIs > that don't have a leading underscore: they get treated just the same > as the documented APIs, or are they private regardless and thus we can > change them at our whim? To start with, it doesn't hurt for a maintainer to add an __all__ entry and to only document the parts of the API we think need to be exposed. That way, we can at least declare the parts that are intended to be public on a go-forward basis. For the most part, the non-underscored parts of the API shouldn't be changed "at our whim". 
Some sense needs to be applied to the decision. Google's code search is great for showing how people actually have used a module in real world code. If that shows that people are accessing and/or changing an attribute, it probably needs to remain exposed. In the absence of a code search, good guesses can be made about what someone might reasonably and usefully be accessing (i.e. glob0 isn't likely). The goal is to improve the standard library while minimizing breakage, and that will involve trade-offs depending on what is being changed. IIRC, we've been trying to get away from deprecations because they're so disruptive. For example, when the pprint rewrite is finally ready, if there is an incompatible API change, I expect that a new clean class will be offered, but that the old will be left in-place so that tons of existing code won't break). Likewise, with the unittest clean-ups, I'm expecting that Michael will introduce aliases when fixing-up mis-named methods, rather than break code that uses the existing names. my-two-cents, Raymond From steve at pearwood.info Mon Nov 8 23:17:40 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 09 Nov 2010 09:17:40 +1100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CD86D59.8070006@ronadam.com> References: <4CD86D59.8070006@ronadam.com> Message-ID: <4CD87704.2040909@pearwood.info> Ron Adam wrote: > My understanding is that anything with an actual docstring is part of > the public API. I frequently add docstrings to _private functions. Just because it is private doesn't mean I don't want documentation for it, and it is very handy for running doctests. Yes, I test my internal functions *wink* The convention I use is: * If __all__ exists, anything in that is public. * Anything not listed in __all__ but without a leading underscore is public, but not part of the module's API; e.g. utility functions, imported modules, globals (but hopefully not too many of the last). 
That means I don't expect you to use it, but you can if you want. * Anything with a _private name is internal use only. That includes modules. Any attribute of a private object is also private. If a class is flagged as private, _MyClass, you wouldn't expect that _MyClass.attribute were public just because the attribute name wasn't also flagged with an underscore. So why treat _module.name as public? > +1 on the help disclaimer for objects with leading underscores. I don't know that it will be that useful, but I don't think it will help that much. +0. > Currently help() does not see comments when they are used in place of a > docstring. I think it would be easy to have help notate things with no > docstrings as "Warning: Undocumented . Use at your own risk." I wouldn't like that. I don't think that "no docstring" = "undocumented" -- the documentation might exist somewhere else. Besides, I don't think that help() should start misidentifying public objects as private if you run it under python -OO. > It may also be useful to clarify that importing some "utility" modules > is not recommended because they may be changed more often and may not > follow the standard process. Would something like the following work, > but still allow for importing if the exception is caught with a try except? > > if __name__ == "__main__": > main() > else: > raise ImportWarning("This is utility module and may be changed.") There's no way for the imported module to know what module is importing it, is there? Because the API I'd much prefer is: safe_modules = [a, b, c, d] # List of modules allowed to import me. 
if calling_module not in safe_modules: warnings.warn("private module, are you sure you want to do this?") -- Steven From merwok at netwok.org Mon Nov 8 23:22:42 2010 From: merwok at netwok.org (Éric Araujo) Date: Mon, 08 Nov 2010 23:22:42 +0100 Subject: [Python-Dev] [Python-checkins] r86327 - in python/branches/py3k: Doc/includes/email-mime.py Doc/includes/email-simple.py Doc/library/smtplib.rst Doc/whatsnew/3.2.rst Lib/smtplib.py Lib/test/test_smtplib.py Misc/NEWS In-Reply-To: <20101108171513.F0642FC70@mail.python.org> References: <20101108171513.F0642FC70@mail.python.org> Message-ID: <4CD87832.6090602@netwok.org> > Author: r.david.murray > New Revision: 86327 > > Log: #10321: Add support for sending binary DATA and Message objects to smtplib > > Modified: python/branches/py3k/Doc/includes/email-mime.py > ============================================================================== > # Send the email via our own SMTP server. > s = smtplib.SMTP() > -s.sendmail(me, family, msg.as_string()) > +s.sendmail(msg) > s.quit() If I'm not mistaken, you're giving a message object to a method that only accepts str or bytes. That line should read s.send_message(msg). Regards From exarkun at twistedmatrix.com Mon Nov 8 23:35:53 2010 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Mon, 08 Nov 2010 22:35:53 -0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <20101108210318.2040.535133461.divmod.xquotient.744@localhost.localdomain> <20101108214512.2040.865703760.divmod.xquotient.763@localhost.localdomain> Message-ID: <20101108223553.2040.1385281841.divmod.xquotient.766@localhost.localdomain> On 09:57 pm, brett at python.org wrote: >On Mon, Nov 8, 2010 at 13:45, wrote: >>On 09:25 pm, brett at python.org wrote: >>> >>>On Mon, Nov 8, 2010 at 13:03, wrote: >>>> >>>>On 07:58 pm, brett at python.org wrote: >>>>>> >>>>>>I don't think a strict don't remove without deprecation policy is >>>>>>workable.
?For example, is trace.rx_blank constant part of the >>>>>>trace >>>>>>module API that needs to be preserved indefinitely? ?I don't even >>>>>>know >>>>>>if it is possible to add a deprecation warning to it, but >>>>>>CoverageResults._blank_re would certainly be a better place for >>>>>>it. >>>>> >>>>>The deprecation policy obviously cannot apply to module-level >>>>>attributes. >>>> >>>>I'm not sure why this is. ?Can you elaborate? >>> >>>There is no way to directly trigger a DeprecationWarning for an >>>attribute. We can still document it, but there is just no way to >>>programmatically enforce it. >> >>What about `deprecatedModuleAttribute` >>() >>or zope.deprecation >>() which >>inspired it? > >Just checked the code and it looks like it substitutes the module for >some proxy object? To begin that break subclass checks. After that I >don't know the ramifications without really digging into the >ModuleType code. That could be fixed if ModuleType allowed subclassing. :) For what it's worth, no one has complained about problems caused by `deprecatedModuleAttribute`, but we've only been using it for about two and a half years. Jean-Paul From tjreedy at udel.edu Mon Nov 8 23:59:40 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 08 Nov 2010 17:59:40 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CD86D59.8070006@ronadam.com> References: <4CD86D59.8070006@ronadam.com> Message-ID: On 11/8/2010 4:36 PM, Ron Adam wrote: > My understanding is that anything with an actual docstring is part of > the public API. Any thing with a leading underscore is private. When the trace module was written, the rule seems to have been more like: docs (but no docstrings) for public API, docstrings (but no doc mention) for private stuff. Eli and I fixed the first part. 
-- Terry Jan Reedy From tjreedy at udel.edu Tue Nov 9 00:05:19 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 08 Nov 2010 18:05:19 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: Message-ID: On 11/8/2010 2:58 PM, Brett Cannon wrote: > I think we need to, as a group, decide how to handle undocumented APIs > that don't have a leading underscore: they get treated just the same > as the documented APIs, or are they private regardless and thus we can > change them at our whim? How about in between: deprecate as if private, but do so much more freely that we would for public stuff. I think this is what you actually propose. We might deprecate faster too. > The main reason I have said that non-underscore names should be > properly deprecated (assuming they are not contained in an > underscored-named module) is that dir() and help() do not distinguish. > If you are perusing a module from the interpreter prompt you have no > way to know whether something is public or private if it lacks an > underscore. Is it reasonable to assume that any API found through > dir() or help() must be checked with the official docs before you can > consider using it, even if you have no explicit need to read the > official docs? > > I (unfortunately) say no, which is why I have argued that > non-underscored names need to be properly deprecated. This obviously > places a nasty burden on us, though, so I don't like taking this Completely naive question: Is there anything that could be automated to reduce the burden? > position. Unless we can make it clearly known through help() or > something that the official docs must be checked to know what can and > cannot be reliably used I don't think it is reasonable to force users > to not be able to rely on help() (we should probably change help() to > print a big disclaimer for anything with a leading underscore, > though). 
> > But that doesn't mean we can't go through, fix up our names, and > deprecate the old public names; that's fair game in my book. -- Terry Jan Reedy From regebro at gmail.com Tue Nov 9 00:08:08 2010 From: regebro at gmail.com (Lennart Regebro) Date: Tue, 9 Nov 2010 00:08:08 +0100 Subject: [Python-Dev] Continuing 2.x In-Reply-To: <0A652EB4-2C5D-4070-83EE-CF75603EE721@fuhm.net> References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FC57@exchcn.ccp.ad.local> <0A652EB4-2C5D-4070-83EE-CF75603EE721@fuhm.net> Message-ID: 2010/11/8 James Y Knight : > On Nov 8, 2010, at 4:42 AM, Lennart Regebro wrote: >> Except for making releases that start backporting Python 3 features >> and breaking backwards compatibility gradually (which may or may not >> be a good idea) I don't see the point. There isn't much to do when it >> comes to improving the language, and there is a moratorium anyway. >> Improvements in the standard library can be more easily done in >> external libraries anyway, and then you can release the improved >> libraries for everything from Python 2.4 and forwards if you like. >> >> So it can be done, but the question is "Why?" > > To keep the batteries included? But they'll only be included in > 2.7, which won't be used much, which defeats the purpose of including those batteries. 
-- Lennart Regebro, Colliberty: http://www.colliberty.com/ Telephone: +48 691 268 328 From bobbyi at gmail.com Tue Nov 9 00:26:58 2010 From: bobbyi at gmail.com (Bobby Impollonia) Date: Mon, 8 Nov 2010 15:26:58 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> Message-ID: On Mon, Nov 8, 2010 at 2:07 PM, Raymond Hettinger wrote: > > On Nov 8, 2010, at 11:58 AM, Brett Cannon wrote: > >> I think we need to, as a group, decide how to handle undocumented APIs >> that don't have a leading underscore: they get treated just the same >> as the documented APIs, or are they private regardless and thus we can >> change them at our whim? > > To start with, it doesn't hurt for a maintainer to add an __all__ entry and to only document the parts of the API we think need to be exposed. ?That way, we can at least declare the parts that are intended to be public on a go-forward basis. This does hurt because anyone who was relying on "import *" to get a name which is now omitted from __all__ is going to upgrade and find their program failing with NameErrors. This is a backwards compatible change and shouldn't happen without a deprecation warning first. From guido at python.org Tue Nov 9 00:47:03 2010 From: guido at python.org (Guido van Rossum) Date: Mon, 8 Nov 2010 15:47:03 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> Message-ID: On Mon, Nov 8, 2010 at 3:26 PM, Bobby Impollonia wrote: > On Mon, Nov 8, 2010 at 2:07 PM, Raymond Hettinger > wrote: >> >> On Nov 8, 2010, at 11:58 AM, Brett Cannon wrote: >> >>> I think we need to, as a group, decide how to handle undocumented APIs >>> that don't have a leading underscore: they get treated just the same >>> as the documented APIs, or are they private regardless and thus we can >>> change them at our whim? 
>> >> To start with, it doesn't hurt for a maintainer to add an __all__ entry and to only document the parts of the API we think need to be exposed. ?That way, we can at least declare the parts that are intended to be public on a go-forward basis. > > This does hurt because anyone who was relying on "import *" to get a > name which is now omitted from __all__ is going to upgrade and find > their program failing with NameErrors. This is a backwards compatible > change and shouldn't happen without a deprecation warning first. Given that import * is generally frowned upon you can't make a blanket statement like this without referring to the specifics of the name being considered for removal. In fact, for any proposed change the risk and reward need to be weighed properly. If the risk is "someone's code could break if they used some undocumented API" it is useful to estimate the probability that this would happen and that somebody would care (rather than just fixing their code and moving on). Many factors go into such an estimate. Just one example would be if we knew of usage of the offending name in code that could reasonably be assumed to be widely copied or distributed -- in such cases we should move very carefully indeed no matter how "officially undocumented" something is. I don't want to go into the specifics of the trace module (even if I wrote it, it's too long ago to remember, nor can I recall using it) but I do want to warn about the dangers of applying simplifying rules mindlessly. 
-- --Guido van Rossum (python.org/~guido) From glyph at twistedmatrix.com Tue Nov 9 00:55:23 2010 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Mon, 8 Nov 2010 15:55:23 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <20101108223553.2040.1385281841.divmod.xquotient.766@localhost.localdomain> References: <20101108210318.2040.535133461.divmod.xquotient.744@localhost.localdomain> <20101108214512.2040.865703760.divmod.xquotient.763@localhost.localdomain> <20101108223553.2040.1385281841.divmod.xquotient.766@localhost.localdomain> Message-ID: <07ABB6B8-6858-4FAD-8FE0-F93BFB6C98BF@twistedmatrix.com> On Nov 8, 2010, at 2:35 PM, exarkun at twistedmatrix.com wrote: > On 09:57 pm, brett at python.org wrote: >> On Mon, Nov 8, 2010 at 13:45, wrote: >>> On 09:25 pm, brett at python.org wrote: >>>> >>>> On Mon, Nov 8, 2010 at 13:03, wrote: >>>>> >>>>> On 07:58 pm, brett at python.org wrote: >>>>>>> >>>>>>> I don't think a strict don't remove without deprecation policy is >>>>>>> workable. For example, is trace.rx_blank constant part of the trace >>>>>>> module API that needs to be preserved indefinitely? I don't even know >>>>>>> if it is possible to add a deprecation warning to it, but >>>>>>> CoverageResults._blank_re would certainly be a better place for it. >>>>>> >>>>>> The deprecation policy obviously cannot apply to module-level >>>>>> attributes. >>>>> >>>>> I'm not sure why this is. Can you elaborate? >>>> >>>> There is no way to directly trigger a DeprecationWarning for an >>>> attribute. We can still document it, but there is just no way to >>>> programmatically enforce it. >>> >>> What about `deprecatedModuleAttribute` >>> () >>> or zope.deprecation >>> () which inspired it? >> >> Just checked the code and it looks like it substitutes the module for >> some proxy object? To begin that break subclass checks. After that I >> don't know the ramifications without really digging into the >> ModuleType code. 
> > That could be fixed if ModuleType allowed subclassing. :) > > For what it's worth, no one has complained about problems caused by `deprecatedModuleAttribute`, but we've only been using it for about two and a half years. This seems like a pretty clear case of "practicality beats purity". Not only has nobody complained about deprecatedModuleAttribute, but there are tons of things which show up in sys.modules that aren't modules in the sense of 'instances of ModuleType'. The Twisted reactor, for example, is an instance, and we've been doing *that* for about 10 years with no complaints. From rrr at ronadam.com Tue Nov 9 01:10:17 2010 From: rrr at ronadam.com (Ron Adam) Date: Mon, 08 Nov 2010 18:10:17 -0600 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <4CD86D59.8070006@ronadam.com> Message-ID: <4CD89169.6000606@ronadam.com> On 11/08/2010 04:01 PM, Brett Cannon wrote: >> My understanding is that anything with an actual docstring is part of the >> public API. Any thing with a leading underscore is private. > > That's a bad rule. Why shouldn't I be able to document something that > is not meant for the public so that fellow developers know what the > heck should be going on in the code? You can use comments instead of a docstring. Here are the possible cases concerned with the subject. I'm using functions here for these examples, but this also applies to other objects. def public_api(): """ Should always have a nice docstring. """ ... def _private_api(): # # Isn't it a good practice to use comments here? # ... def _publicly_documented_private_api(): """ Not sure why you would want to do this instead of using comments. """ ... def undocumented_public_api(): ... def _undocumented_private_api(): ... Out of these, the two that are problematic are the _publicly_documented_private_api() and the undocumented_public_api(). The _publicly_documented_private_api() is a problem because people *will* use it even though it has a leading underscore. 
Especially those who are new to python. The undocumented_public_api() wouldn't be a problem if all private api's used leading underscore, but for older modules, it isn't always clear what the intention was. Was it undocumented because the programmer simply forgot, or was it intended to be a private api? >> It may also be useful to clarify that importing some "utility" modules is >> not recommended because they may be changed more often and may not follow >> the standard process. Would something like the following work, but still >> allow for importing if the exception is caught with a try except? >> >> if __name__ == "__main__": >> main() >> else: >> raise ImportWarning("This is utility module and may be changed.") > > Sure it would work, but that doesn't make it pleasant to use. It > already breaks how warnings are typically handled by raising it > instead of calling warnings.warn(). Plus I'm now supposed to > try/except certain imports? That's messy. At that point we are coding > in visibility rules instead of following convention and that doesn't > sit well with me. No, you're not suppose to try/except imports. That's the point. You can do that, only if you really want to abuse the intended purpose of a module that isn't meant to be imported in the first place. If someone wants to do that, it isn't a problem. They are well aware of the risks if they do it. (This is just one option and probably one that isn't thought out very well.) Brett, I'm sure you can up with a better alternative. ;-) Cheers, Ron
From ben+python at benfinney.id.au Tue Nov 9 01:32:05 2010 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 09 Nov 2010 11:32:05 +1100 Subject: [Python-Dev] Breaking undocumented API References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> Message-ID: <87tyjrcp6i.fsf@benfinney.id.au> Bobby Impollonia writes: > On Mon, Nov 8, 2010 at 2:07 PM, Raymond Hettinger > wrote: > > To start with, it doesn't hurt for a maintainer to add an __all__ > > entry and to only document the parts of the API we think need to be > > exposed. That way, we can at least declare the parts that are > > intended to be public on a go-forward basis. > > This does hurt because anyone who was relying on "import *" to get a > name which is now omitted from __all__ is going to upgrade and find > their program failing with NameErrors. This is a backwards compatible > change and shouldn't happen without a deprecation warning first. It also introduces a (perhaps small, but clearly non-zero) maintenance burden: the name of an object must be added, changed, and removed not only where it is defined, but also in the '__all__' entry. This burden is avoided when using the spelling of the name itself as the indicator for exposure in the API. -- \ "In any great organization it is far, far safer to be wrong | `\ with the majority than to be right alone."
-- John Kenneth | _o__) Galbraith, 1989-07-28 | Ben Finney From ben+python at benfinney.id.au Tue Nov 9 01:46:59 2010 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 09 Nov 2010 11:46:59 +1100 Subject: [Python-Dev] Breaking undocumented API References: <4CD86D59.8070006@ronadam.com> <4CD89169.6000606@ronadam.com> Message-ID: <87pqufcoho.fsf@benfinney.id.au> Ron Adam writes: > def _publicly_documented_private_api(): > """ Not sure why you would want to do this > instead of using comments. > """ > ... Because the docstring is available at the interpreter via 'help()', and because it's automatically available to 'doctest', and most of the other good reasons for docstrings. > The _publicly_documented_private_api() is a problem because people > *will* use it even though it has a leading underscore. Especially > those who are new to python. That isn't an argument against docstrings, since the problem you describe isn't dependent on the presence or absence of docstrings. -- \ "I wish there was a knob on the TV to turn up the intelligence. | `\ There's a knob called 'brightness' but it doesn't work." | _o__) -- Eugene P. Gallagher | Ben Finney From guido at python.org Tue Nov 9 01:50:42 2010 From: guido at python.org (Guido van Rossum) Date: Mon, 8 Nov 2010 16:50:42 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <07ABB6B8-6858-4FAD-8FE0-F93BFB6C98BF@twistedmatrix.com> References: <20101108210318.2040.535133461.divmod.xquotient.744@localhost.localdomain> <20101108214512.2040.865703760.divmod.xquotient.763@localhost.localdomain> <20101108223553.2040.1385281841.divmod.xquotient.766@localhost.localdomain> <07ABB6B8-6858-4FAD-8FE0-F93BFB6C98BF@twistedmatrix.com> Message-ID: On Mon, Nov 8, 2010 at 3:55 PM, Glyph Lefkowitz wrote: > This seems like a pretty clear case of "practicality beats purity".
?Not only has nobody complained about deprecatedModuleAttribute, but there are tons of things which show up in sys.modules that aren't modules in the sense of 'instances of ModuleType'. ?The Twisted reactor, for example, is an instance, and we've been doing *that* for about 10 years with no complaints. But the Twisted universe is only a subset of the Python universe. The Python stdlib needs to move more carefully. -- --Guido van Rossum (python.org/~guido) From rrr at ronadam.com Tue Nov 9 02:18:00 2010 From: rrr at ronadam.com (Ron Adam) Date: Mon, 08 Nov 2010 19:18:00 -0600 Subject: [Python-Dev] Backward incompatible API changes in the pydoc module In-Reply-To: References: Message-ID: On 11/08/2010 05:44 AM, Nick Coghlan wrote: > All, > > I was about to commit the patch for issue 2001 (the improvements to > the pydoc web server and the removal of the Tk GUI) when I realised > that pydoc.serve() and pydoc.gui() are technically public standard > library APIs (albeit undocumented ones). > > Currently the patch switches serve() to start the new server > implementation and gui() to start the server and open a browser window > for it. > > It occurred to me that, despite the "it's an application" feel to the > pydoc web server APIs, it may be a better idea to leave the two > existing functions alone (aside from adding DeprecationWarning), and > using new private function names to start the new server and the web > browser. > > Is following the standard deprecation procedure the better course > here, or am I being overly paranoid? What do you think about adding a new _pydoc3.py module along with a pydoc3.py loader module with a basic user api. The number 3, so that it match's python3.x. We can then keep the old pydoc.py unchanged and be free to make a lot more changes to the _pydoc3.py file without having to be even a little paranoid. 
Cheers, Ron From brett at python.org Tue Nov 9 02:18:33 2010 From: brett at python.org (Brett Cannon) Date: Mon, 8 Nov 2010 17:18:33 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CD89169.6000606@ronadam.com> References: <4CD86D59.8070006@ronadam.com> <4CD89169.6000606@ronadam.com> Message-ID: On Mon, Nov 8, 2010 at 16:10, Ron Adam wrote: > > > On 11/08/2010 04:01 PM, Brett Cannon wrote: > >>> My understanding is that anything with an actual docstring is part of the >>> public API. ?Any thing with a leading underscore is private. >> >> That's a bad rule. Why shouldn't I be able to document something that >> is not meant for the public so that fellow developers know what the >> heck should be going on in the code? > > You can use comments instead of a docstring. > > Here are the possible cases concerned with the subject. ?I'm using functions > here for these examples, but this also applies to other objects. > > > def public_api(): > ? ?""" Should always have a nice docstring. """ > ? ?... > > > def _private_api(): > ? ?# > ? ?# Isn't it a good practice to use comments here? > ? ?# > ? ?... That is ugly. I already hate doing that for unittest, I'm not about to champion that for anything else. It would also lead to essentially requiring a docstrings for everything that is public whether someone wants to bother to writing a docstring or not. I don't think we should be suggesting that a docstring be required either. > > > def _publicly_documented_private_api(): > ? ?""" ?Not sure why you would want to do this > ? ? ? ? instead of using comments. > ? ?""" > ? ?... > > > def undocumented_public_api(): > ? ?... > > > def _undocumented_private_api(): > ? ?... > > > Out of these, the two that are problematic are the > _publicly_documented_private_api() and the undocumented_public_api(). > > The _publicly_documented_private_api() is a problem because people *will* > use it even though it has a leading underscore. 
?Especially those who are > new to python. > > The undocumented_public_api() wouldn't be a problem if all private api's > used leading ?underscore, but for older modules, it isn't always clear what > the intention was. ?Was it undocumented because the programmer simply > forgot, or was it intended to be a private api? > > > >>> It may also be useful to clarify that importing some "utility" modules is >>> not recommended because they may be changed more often and may not follow >>> the standard process. ?Would something like the following work, but still >>> allow for importing if the exception is caught with a try except? >>> >>> if __name__ == "__main__": >>> ? ?main() >>> else: >>> ? ?raise ImportWarning("This is utility module and may be changed.") >> >> Sure it would work, but that doesn't make it pleasant to use. It >> already breaks how warnings are typically handled by raising it >> instead of calling warnings.warn(). Plus I'm now supposed to >> try/except certain imports? That's messy. At that point we are coding >> in visibility rules instead of following convention and that doesn't >> sit well with me. > > No, you're not suppose to try/except imports. ?That's the point. > > You can do that, only if you really want to abuse the intended purpose of a > module that isn't meant to be imported in the first place. ?If someone wants > to do that, it isn't a problem. ?They are well aware of the risks if they do > it. ?(This is just one option and probably one that isn't thought out very > well.) > > Brett, I'm sure you can up with a better alternative. ? ;-) But I don't want to have to do that in the stdlib by remembering what modules I should or should not import. This is just as much about developer burden on core devs as it is making sure we don't yank the rug out from underneath users. 
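
[Editorial sketch, not part of the original message.] The objection above, that the quoted snippet raises ImportWarning instead of calling warnings.warn(), can be illustrated roughly as follows. The two function names are hypothetical stand-ins for "code that runs when a utility module is imported"; nothing here is taken from the stdlib modules under discussion.

```python
import warnings


def import_time_warn():
    # Conventional approach: report through the warnings machinery.
    # The active warning filters decide whether this is shown, ignored,
    # or turned into an error; execution continues afterwards.
    warnings.warn("This is a utility module and may be changed.",
                  ImportWarning, stacklevel=2)
    return "imported"


def import_time_raise():
    # Approach from the quoted snippet: the warning class is raised like
    # any other exception, so execution stops at this point.
    raise ImportWarning("This is a utility module and may be changed.")


# Record warnings instead of printing them, so the effect is observable.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_result = import_time_warn()

try:
    import_time_raise()
    raise_result = "imported"
except ImportWarning:
    raise_result = "aborted"

print(warn_result, len(caught), raise_result)
```

With warnings.warn() the caller keeps control through the normal filter mechanism (for example the -W command line option), whereas the raised form forces every importer to wrap the import in try/except, which is the burden described above.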
From guido at python.org Tue Nov 9 02:21:50 2010 From: guido at python.org (Guido van Rossum) Date: Mon, 8 Nov 2010 17:21:50 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <87pqufcoho.fsf@benfinney.id.au> References: <4CD86D59.8070006@ronadam.com> <4CD89169.6000606@ronadam.com> <87pqufcoho.fsf@benfinney.id.au> Message-ID: On Mon, Nov 8, 2010 at 4:46 PM, Ben Finney wrote: > Ron Adam writes: > >> def _publicly_documented_private_api(): >> ? ? """ ?Not sure why you would want to do this >> ? ? ? ? ?instead of using comments. >> ? ? """ >> ? ? ... > > Because the docstring is available at the interpreter via ?help()?, and > because it's automatically available to ?doctest?, and most of the other > good reasons for docstrings. > >> The _publicly_documented_private_api() is a problem because people >> *will* use it even though it has a leading underscore. Especially >> those who are new to python. > > That isn't an argument against docstrings, since the problem you > describe isn't dependent on the presence or absence of docstrings. +1 -- --Guido van Rossum (python.org/~guido) From a.badger at gmail.com Tue Nov 9 02:39:47 2010 From: a.badger at gmail.com (Toshio Kuratomi) Date: Mon, 8 Nov 2010 17:39:47 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <87pqufcoho.fsf@benfinney.id.au> References: <4CD86D59.8070006@ronadam.com> <4CD89169.6000606@ronadam.com> <87pqufcoho.fsf@benfinney.id.au> Message-ID: <20101109013947.GD22818@unaka.lan> On Tue, Nov 09, 2010 at 11:46:59AM +1100, Ben Finney wrote: > Ron Adam writes: > > > def _publicly_documented_private_api(): > > """ Not sure why you would want to do this > > instead of using comments. > > """ > > ... > > Because the docstring is available at the interpreter via ?help()?, and > because it's automatically available to ?doctest?, and most of the other > good reasons for docstrings. 
> > > The _publicly_documented_private_api() is a problem because people > > *will* use it even though it has a leading underscore. Especially > > those who are new to python. > > That isn't an argument against docstrings, since the problem you > describe isn't dependent on the presence or absence of docstrings. > Just wanted to expand a bit here: as a general practice, you may be involved in a project where the _private_api() is not intended by people outside of the project but is intended to be used in multiple places within the project. If you have different people working on those different areas, it can be very useful for them to be able to use help(_private_api) on the other functions from within the interpreter shell. -Toshio -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From victor.stinner at haypocalc.com Tue Nov 9 02:57:00 2010 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Tue, 9 Nov 2010 02:57:00 +0100 Subject: [Python-Dev] "Too many open files" errors on "x86 FreeBSD 7.2 3.x" buildbot In-Reply-To: <20101108122333.GJ27974@nexus.in-nomine.org> References: <201011061219.55473.victor.stinner@haypocalc.com> <20101108122333.GJ27974@nexus.in-nomine.org> Message-ID: <201011090257.00546.victor.stinner@haypocalc.com> On Monday 08 November 2010 13:23:33 Jeroen Ruigrok van der Werven wrote: > >The POSIX semaphore support is not enabled by default in FreeBSD 7, so > >I added loader.conf stuff to load them (as part of issue7272). > > It is enabled by default on FreeBSD 8 at least. Ok, but I suppose that many users use older versions. > PostgreSQL installations via ports as well as its documentation instruct > the FreeBSD user to tweak kern.ipc settings. I found some informations about SysV semaphores on FreeBSD in a Firebird patch, which means that Firebird uses SysV semaphores on FreeBSD :-) (at least in Debian/kFreeBSD). 
> Almost every FreeBSD user I know of compiles a new kernel. It's just one of > those BSD things that every user goes through. If #10348 is implemented, FreeBSD users will be able to use the multiprocessing module without having to recompile their kernel. The question is more who would like to implement SysV semaphores in Python :-) I don't know anything about these semaphores. http://bugs.python.org/issue10348 Victor From exarkun at twistedmatrix.com Tue Nov 9 03:03:23 2010 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Tue, 09 Nov 2010 02:03:23 -0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <20101108210318.2040.535133461.divmod.xquotient.744@localhost.localdomain> <20101108214512.2040.865703760.divmod.xquotient.763@localhost.localdomain> <20101108223553.2040.1385281841.divmod.xquotient.766@localhost.localdomain> <07ABB6B8-6858-4FAD-8FE0-F93BFB6C98BF@twistedmatrix.com> Message-ID: <20101109020323.2040.1678073058.divmod.xquotient.812@localhost.localdomain> On 12:50 am, guido at python.org wrote: >On Mon, Nov 8, 2010 at 3:55 PM, Glyph Lefkowitz > wrote: >>This seems like a pretty clear case of "practicality beats purity". >>Not only has nobody complained about deprecatedModuleAttribute, but >>there are tons of things which show up in sys.modules that aren't >>modules in the sense of 'instances of ModuleType'. ?The Twisted >>reactor, for example, is an instance, and we've been doing *that* for >>about 10 years with no complaints. > >But the Twisted universe is only a subset of the Python universe. The >Python stdlib needs to move more carefully. I think that Twisted developers are pretty careful to consider the consequences of changes they make to Twisted. We have an explicit, documented backwards compatibility policy, for example. We also have mandatory code review for all changes, with a documented set of guidelines outlining the minimum things a reviewer should be considering. 
I wonder if there are any actual technical arguments to be made against something like `deprecatedModuleAttribute`? Also, it turns out that ModuleType can be subclassed these days. Jean-Paul From rdmurray at bitdance.com Tue Nov 9 03:07:52 2010 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 08 Nov 2010 21:07:52 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CD89169.6000606@ronadam.com> References: <4CD86D59.8070006@ronadam.com> <4CD89169.6000606@ronadam.com> Message-ID: <20101109020752.D2D3D1FE33E@kimball.webabinitio.net> On Mon, 08 Nov 2010 18:10:17 -0600, Ron Adam wrote: > def _private_api(): > # > # Isn't it a good practice to use comments here? > # > ... IMO, no. -- R. David Murray www.bitdance.com From belopolsky at users.sourceforge.net Tue Nov 9 04:28:40 2010 From: belopolsky at users.sourceforge.net (Alexander Belopolsky) Date: Mon, 8 Nov 2010 22:28:40 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: Message-ID: On Mon, Nov 8, 2010 at 2:58 PM, Brett Cannon wrote: .. > But that doesn't mean we can't go through, fix up our names, and > deprecate the old public names; that's fair game in my book. > +1 See http://bugs.python.org/issue10371 From ncoghlan at gmail.com Tue Nov 9 05:26:22 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 9 Nov 2010 14:26:22 +1000 Subject: [Python-Dev] Backward incompatible API changes in the pydoc module In-Reply-To: References: Message-ID: On Tue, Nov 9, 2010 at 11:18 AM, Ron Adam wrote: > What do you think about adding a new _pydoc3.py module along with a > pydoc3.py loader module with a basic user api. ?The number 3, so that it > match's python3.x. > > We can then keep the old pydoc.py unchanged and be free to make a lot more > changes to the _pydoc3.py file without having to be even a little paranoid. I think changing the behaviour of the pydoc command line app is a fine idea - it's only the pydoc.serve and pydoc.gui functions that are worrying me. 
As I noted on the tracker issue, there's a reasonably clean way to do this, even given the coupling between the 3.1 GUI app and server: leave the existing serve() and gui() functions alone (aside from adding DeprecationWarning), and add your new implementation as a parallel private API. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From rrr at ronadam.com Tue Nov 9 05:28:32 2010 From: rrr at ronadam.com (Ron Adam) Date: Mon, 08 Nov 2010 22:28:32 -0600 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <4CD86D59.8070006@ronadam.com> <4CD89169.6000606@ronadam.com> Message-ID: <4CD8CDF0.1010500@ronadam.com> On 11/08/2010 07:18 PM, Brett Cannon wrote: > On Mon, Nov 8, 2010 at 16:10, Ron Adam wrote: >> def _private_api(): >> # >> # Isn't it a good practice to use comments here? >> # >> ... > > That is ugly. I already hate doing that for unittest, I'm not about to > champion that for anything else. Ugly? I suppose it's a matter of what you are used to. > It would also lead to essentially requiring a docstrings for > everything that is public whether someone wants to bother to writing a > docstring or not. I don't think we should be suggesting that a > docstring be required either. I can see where that would be overly strict in an application or script made with python. But it seems odd to me, to have undocumented api's in a programming language. If it's being replaced with something else, the doc string can say that. A null string is also a valid doc string if you just need a place holder until someone gets to it. >> Brett, I'm sure you can up with a better alternative. ;-) > > But I don't want to have to do that in the stdlib by remembering what > modules I should or should not import. This is just as much about > developer burden on core devs as it is making sure we don't yank the > rug out from underneath users. Yes, I agree. But how to best do that? 
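
[Editorial sketch, not part of the original message.] The point made earlier in the thread, that a docstring on an underscore-prefixed function is still available to help() and doctest while a comment is not, can be shown with a small example. The functions _scale and _scale_commented are made up for illustration.

```python
import doctest


def _scale(value, factor=2):
    """Return value multiplied by factor.

    >>> _scale(3)
    6
    >>> _scale(3, factor=10)
    30
    """
    return value * factor


def _scale_commented(value, factor=2):
    # Return value multiplied by factor.
    return value * factor


# The docstring is what help() would display; a comment leaves nothing
# behind on the function object.
doc_text = _scale.__doc__
comment_doc = _scale_commented.__doc__

# doctest collects the examples from the docstring even though the
# function name starts with an underscore.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(_scale, "_scale", globs={"_scale": _scale}):
    runner.run(test)
results = runner.summarize(verbose=False)
```

Here results.attempted counts the two examples found in the docstring, and comment_doc is None, which is the practical difference between the two styles for anyone working at the interpreter prompt.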
From ncoghlan at gmail.com Tue Nov 9 05:37:03 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 9 Nov 2010 14:37:03 +1000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: Message-ID: On Tue, Nov 9, 2010 at 1:28 PM, Alexander Belopolsky wrote: > On Mon, Nov 8, 2010 at 2:58 PM, Brett Cannon wrote: > ..
>> But that doesn't mean we can't go through, fix up our names, and >> deprecate the old public names; that's fair game in my book. Indeed. I've now recommended Ron do exactly that for the pydoc patch. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rrr at ronadam.com Tue Nov 9 05:47:34 2010 From: rrr at ronadam.com (Ron Adam) Date: Mon, 08 Nov 2010 22:47:34 -0600 Subject: [Python-Dev] Backward incompatible API changes in the pydoc module In-Reply-To: References: Message-ID: On 11/08/2010 10:26 PM, Nick Coghlan wrote: > On Tue, Nov 9, 2010 at 11:18 AM, Ron Adam wrote: >> What do you think about adding a new _pydoc3.py module along with a >> pydoc3.py loader module with a basic user api. The number 3, so that it >> matches python3.x. >> >> We can then keep the old pydoc.py unchanged and be free to make a lot more >> changes to the _pydoc3.py file without having to be even a little paranoid. > > I think changing the behaviour of the pydoc command line app is a fine > idea - it's only the pydoc.serve and pydoc.gui functions that are > worrying me. As I noted on the tracker issue, there's a reasonably > clean way to do this, even given the coupling between the 3.1 GUI app > and server: leave the existing serve() and gui() functions alone > (aside from adding DeprecationWarning), and add your new > implementation as a parallel private API. Ok, I guess that's what needs to be done then. I can try to do it over the next few days, and will probably need a bit more advice on how to add in the deprecation warnings. Or if you want to go ahead and do it, I'm more than OK with that. Thanks for the help on this. I do appreciate it.
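
[Editorial sketch, not part of the original message.] One common way to add deprecation warnings while leaving an old public function otherwise alone is a small decorator around it, with the new code living under a separate private name. This is only an illustration of the general pattern: serve and _start_server below are hypothetical stand-ins and do not reproduce the actual pydoc functions or the patch on the tracker.

```python
import functools
import warnings


def deprecated(message):
    """Return a decorator that emits DeprecationWarning on every call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller's line
            # rather than to this wrapper.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorator


def _start_server(port):
    # Hypothetical stand-in for a new private implementation.
    return "new server on port %d" % port


@deprecated("serve() is deprecated (illustrative message)")
def serve(port):
    # Hypothetical stand-in for an old public entry point; its behaviour
    # is unchanged, it only gains the warning.
    return "old server on port %d" % port


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_result = serve(8000)

new_result = _start_server(8000)
```

Existing callers of the old name keep working and see a warning if they enable DeprecationWarning, while the new implementation can change freely behind its underscore-prefixed name.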
Cheers, Ron From eliben at gmail.com Tue Nov 9 06:30:08 2010 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 9 Nov 2010 07:30:08 +0200 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <20101109020752.D2D3D1FE33E@kimball.webabinitio.net> References: <4CD86D59.8070006@ronadam.com> <4CD89169.6000606@ronadam.com> <20101109020752.D2D3D1FE33E@kimball.webabinitio.net> Message-ID: On Tue, Nov 9, 2010 at 04:07, R. David Murray wrote: > On Mon, 08 Nov 2010 18:10:17 -0600, Ron Adam wrote: > > def _private_api(): > > # > > # Isn't it a good practice to use comments here? > > # > > ... > > IMO, no. > > FWIW, I agree completely. Docstrings are a part of Python I don't see a reason to leave out for "non-public" code. They're convenient in the beginning of functions and we all are used to seeing them there. IDE's use them to display helpful "tooltips" on functions, and so on. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From khamenya at gmail.com Tue Nov 9 11:40:16 2010 From: khamenya at gmail.com (Valery Khamenya) Date: Tue, 9 Nov 2010 11:40:16 +0100 Subject: [Python-Dev] rlcompleter -- auto-complete dictionary keys (+ tests) In-Reply-To: References: Message-ID: > > Can you post your patch on bugs.python.org? > done -- now both 2.x and 3.x patches are available on http://bugs.python.org/issue10351 The py3k appeared to be *much* more friendly regarding the unpleasant unicode-issues that I've faced in python 2.x regards, Valery -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fuzzyman at voidspace.org.uk Tue Nov 9 11:59:30 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 09 Nov 2010 10:59:30 +0000 Subject: [Python-Dev] GUI test runner tool In-Reply-To: References: <4CD7E886.4060702@voidspace.org.uk> Message-ID: <4CD92992.80009@voidspace.org.uk> On 08/11/2010 19:00, Brett Cannon wrote: > On Mon, Nov 8, 2010 at 04:09, Michael Foord wrote: >> Hello all, >> >> Now that unittest has test discovery, Mark Roddy has been working on >> resurrecting the old GUI test runner (using Tkinter): >> >> https://bitbucket.org/markroddy/unittestgui >> >> This was part of the original pyunit project but I believe it was never part >> of the standard library: >> >> http://sourceforge.net/projects/pyunit/ >> >> Here's a screenshot of what it looks like: >> >> http://skitch.com/fuzzyman/dhu9r/pyunit >> >> I'd like to propose adding it to Python in Tools/ and am volunteering to >> maintain it. > Does that mean upgrading it as well? =) Yes... > For instance it would be great > to get it to use ttk so it looks a bit sharper, I've never used Ttk. Patches welcomed... > supports skipped tests > and expected failures, It already does, the screenshot is a bit old. :-) > and dream-of-dreams ties into regrtest so you > can just check boxes instead of passing a ton of CLI flags. > That would be great, but regrtest is a bit custom. It's a great idea, but would need a different UI shell. >> If the answer is "not yet" that is fine as it can go into >> unittest2 first. Mark has updated it to work with test discovery and added >> support for configuring test discovery in the same way as you can from the >> command line. It is a nice tool for those new to writing tests who aren't >> yet familiar with the command line or prefer a GUI. > I personally have no problem with it going into tools as long as it > can also be used to run the tests in the stdlib. Unfortunately the stdlib tests largely aren't compatible with test discovery. 
There is an open issue about that. Many of the tests depend on being run with regrtest, and use features that are in many places now obsolete due to improvements in unittest. No-one has yet done the work to switch them over. It is 'on my list' though. All the best, Michael Foord > Just don't put it in > Demos/ . =) > > -Brett > >> In its basic form you simply pick a directory and unittestgui will discover >> and run all the tests it finds. It would be nice if it provided more >> diagnostic information on tests it ran (clicking through test results) but >> these can be added later. >> >> All the best, >> >> Michael Foord >> >> -- >> >> http://www.voidspace.org.uk/ >> >> READ CAREFULLY. By accepting and reading this email you agree, >> on behalf of your employer, to release me from all obligations >> and waivers arising from any and all NON-NEGOTIATED agreements, >> licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, >> confidentiality, non-disclosure, non-compete and acceptable use >> policies (?BOGUS AGREEMENTS?) that I have entered into with your >> employer, its partners, licensors, agents and assigns, in >> perpetuity, without prejudice to my ongoing rights and privileges. >> You further represent that you have the authority to release me >> from any BOGUS AGREEMENTS on behalf of your employer. >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/brett%40python.org >> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. 
By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From solipsis at pitrou.net Tue Nov 9 12:53:44 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 9 Nov 2010 12:53:44 +0100 Subject: [Python-Dev] Breaking undocumented API References: <20101108210318.2040.535133461.divmod.xquotient.744@localhost.localdomain> <20101108214512.2040.865703760.divmod.xquotient.763@localhost.localdomain> <20101108223553.2040.1385281841.divmod.xquotient.766@localhost.localdomain> <07ABB6B8-6858-4FAD-8FE0-F93BFB6C98BF@twistedmatrix.com> <20101109020323.2040.1678073058.divmod.xquotient.812@localhost.localdomain> Message-ID: <20101109125344.1ab05a15@pitrou.net> On Tue, 09 Nov 2010 02:03:23 -0000 exarkun at twistedmatrix.com wrote: > > I wonder if there are any actual technical arguments to be made against > something like `deprecatedModuleAttribute`? For example, does it work well with import hacks such as Mercurial's demandimport? Regards Antoine.
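
[Editorial sketch, not part of the original message.] For readers unfamiliar with the idea behind something like `deprecatedModuleAttribute`, the following shows one way a module-like object can warn when a deprecated attribute is read, using the fact mentioned in the thread that ModuleType can be subclassed. This is an assumption-laden illustration, not Twisted's implementation; the class name DeprecatingModule and the attribute names are invented.

```python
import types
import warnings


class DeprecatingModule(types.ModuleType):
    """Module subclass that warns when selected attributes are read."""

    def __init__(self, name, deprecated):
        super().__init__(name)
        # Mapping of attribute name -> (value, message). The values are
        # kept out of the normal namespace so that reading them always
        # goes through __getattr__ below.
        self._deprecated = dict(deprecated)

    def __getattr__(self, attr):
        # __getattr__ is only called when normal lookup fails, so
        # ordinary module attributes are unaffected.
        deprecated = self.__dict__.get("_deprecated", {})
        if attr in deprecated:
            value, message = deprecated[attr]
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return value
        raise AttributeError(attr)


mod = DeprecatingModule(
    "example_mod",
    {"OldThing": (42, "example_mod.OldThing is deprecated")},
)
mod.NewThing = 7

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    plain = mod.NewThing   # ordinary attribute, no warning
    old = mod.OldThing     # deprecated attribute, warns and returns 42
```

Whether such an object cooperates with lazy-import wrappers depends on how those wrappers forward attribute access, which is the compatibility question raised above.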
From solipsis at pitrou.net Tue Nov 9 13:00:57 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 9 Nov 2010 13:00:57 +0100 Subject: [Python-Dev] r86351 - python/branches/py3k/Lib/tempfile.py References: <20101109034358.C6F75EE989@mail.python.org> Message-ID: <20101109130057.616401f7@pitrou.net> On Tue, 9 Nov 2010 04:43:58 +0100 (CET) raymond.hettinger wrote: > Author: raymond.hettinger > Date: Tue Nov 9 04:43:58 2010 > New Revision: 86351 > > Log: > Simplify code > > Modified: > python/branches/py3k/Lib/tempfile.py > > Modified: python/branches/py3k/Lib/tempfile.py > ============================================================================== > --- python/branches/py3k/Lib/tempfile.py (original) > +++ python/branches/py3k/Lib/tempfile.py Tue Nov 9 04:43:58 2010 > @@ -108,30 +108,19 @@ > > _RandomNameSequence is an iterator.""" > > - characters = ("abcdefghijklmnopqrstuvwxyz" + > - "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + > - "0123456789_") > + characters = "abcdefghijklmnopqrstuvwxyz0123456789_" Aren't you reducing entropy here? From fuzzyman at voidspace.org.uk Tue Nov 9 13:17:13 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 09 Nov 2010 12:17:13 +0000 Subject: [Python-Dev] GUI test runner tool In-Reply-To: References: <4CD7E886.4060702@voidspace.org.uk> Message-ID: <4CD93BC9.2090400@voidspace.org.uk> On 08/11/2010 19:28, Alexander Belopolsky wrote: > On Mon, Nov 8, 2010 at 7:09 AM, Michael Foord wrote: > .. >> I'd like to propose adding [unittestgui] to Python in Tools/ and am volunteering to >> maintain it. > Why not adding it under Lib/unittest/? I really don't want to make Tk a dependency for unittest itself. :-) I also don't want a GUI test runner to in any way be part of the *api* of unittest... > I think Tools/ is a less > attractive location for most users than say PyPI or some other package > repository. Tools/ is for stuff that is primarily of interest to > python developers, not python users. 
OS vendors are less likely to > install packages in Tools/ in a user-visible place than they are a > popular 3rd-party package. Well, there's always Demos/. ;-) I realise that putting it in Tools/ means that distros will probably have to make a conscious decision to package it. unittest2 will install it as a script. I don't think the gui runner belongs in Demos/, so Tools/ is the logical choice for including in core-Python. As Raymond says we can point to it in the docs. (The gui test runner is merely a convenience / beginners tool - so pointing to more "production suitable" tools like Hudson would also be good.) I was looking to see what else was in Tools/ that was distributed with Python, but I don't *think* the Mac distribution includes them at all. (freeze is in the Tools/ directory of the repo and is an 'end user' tool rather than core-developer tool.) The Mac distribution does put a bunch of stuff in the Python 'bin' directory, and ideally it would go there. All the best, Michael Foord > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. 
From ncoghlan at gmail.com Tue Nov 9 13:26:22 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 9 Nov 2010 22:26:22 +1000 Subject: [Python-Dev] r86351 - python/branches/py3k/Lib/tempfile.py In-Reply-To: <20101109130057.616401f7@pitrou.net> References: <20101109034358.C6F75EE989@mail.python.org> <20101109130057.616401f7@pitrou.net> Message-ID: On Tue, Nov 9, 2010 at 10:00 PM, Antoine Pitrou wrote: >> - characters = ("abcdefghijklmnopqrstuvwxyz" + >> - "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + >> - "0123456789_") >> + characters = "abcdefghijklmnopqrstuvwxyz0123456789_" > > Aren't you reducing entropy here? Perhaps in some cases, but it also makes the behaviour consistent across all platforms. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From exarkun at twistedmatrix.com Tue Nov 9 17:10:23 2010 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Tue, 09 Nov 2010 16:10:23 -0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <20101109125344.1ab05a15@pitrou.net> References: <20101108210318.2040.535133461.divmod.xquotient.744@localhost.localdomain> <20101108214512.2040.865703760.divmod.xquotient.763@localhost.localdomain> <20101108223553.2040.1385281841.divmod.xquotient.766@localhost.localdomain> <07ABB6B8-6858-4FAD-8FE0-F93BFB6C98BF@twistedmatrix.com> <20101109020323.2040.1678073058.divmod.xquotient.812@localhost.localdomain> <20101109125344.1ab05a15@pitrou.net> Message-ID: <20101109161023.2040.316260932.divmod.xquotient.834@localhost.localdomain> On 11:53 am, solipsis at pitrou.net wrote: >On Tue, 09 Nov 2010 02:03:23 -0000 >exarkun at twistedmatrix.com wrote: >> >>I wonder if there are any actual technical arguments to be made >>against >>something like `deprecatedModuleAttribute`? > >For example, does it work well with import hacks such as Mercurial's >demandimport?
I haven't tried before, but a quick experiment suggests that the two happily co-exist (aside from demandimport getting the blame instead of the true offending code, but that's really a problem with the warnings module): >>> import mercurial.demandimport as di >>> di.enable() >>> import twisted.python.threadpool as tp >>> tp.ThreadSafeList /usr/lib/pymodules/python2.6/mercurial/demandimport.py:76: DeprecationWarning: twisted.python.threadpool.ThreadSafeList was deprecated in Twisted 10.1.0: This was an internal implementation detail of support for Jython 2.1, which is now obsolete. return getattr(self._module, attr) >>> Jean-Paul From alexander.belopolsky at gmail.com Tue Nov 9 17:23:23 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 9 Nov 2010 11:23:23 -0500 Subject: [Python-Dev] [Python-checkins] r86355 - python/branches/py3k/Modules/_pickle.c In-Reply-To: <20101109093941.AD093EEA49@mail.python.org> References: <20101109093941.AD093EEA49@mail.python.org> Message-ID: On Tue, Nov 9, 2010 at 4:39 AM, victor.stinner wrote: .. > Log: > Issue #10359: Remove useless comma, invalid in ISO C C99 allows it. Which compiler is giving you trouble? From solipsis at pitrou.net Tue Nov 9 17:36:57 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 9 Nov 2010 17:36:57 +0100 Subject: [Python-Dev] r86355 - python/branches/py3k/Modules/_pickle.c References: <20101109093941.AD093EEA49@mail.python.org> Message-ID: <20101109173657.5f7a233a@pitrou.net> On Tue, 9 Nov 2010 11:23:23 -0500 Alexander Belopolsky wrote: > On Tue, Nov 9, 2010 at 4:39 AM, victor.stinner > wrote: > .. > > Log: > > Issue #10359: Remove useless comma, invalid in ISO C > > C99 allows it. Which compiler is giving you trouble? One part of the answer is that we generally try to enforce C89 compatibility. I don't know if any modern compiler would mind, though. Regards Antoine. 
From alexander.belopolsky at gmail.com Tue Nov 9 18:12:35 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 9 Nov 2010 12:12:35 -0500 Subject: [Python-Dev] r86355 - python/branches/py3k/Modules/_pickle.c In-Reply-To: <20101109173657.5f7a233a@pitrou.net> References: <20101109093941.AD093EEA49@mail.python.org> <20101109173657.5f7a233a@pitrou.net> Message-ID: On Tue, Nov 9, 2010 at 11:36 AM, Antoine Pitrou wrote: .. >> C99 allows it. Which compiler is giving you trouble? > > One part of the answer is that we generally try to enforce C89 > compatibility. I don't know if any modern compiler would mind, though. I know, but if we ever start making exceptions, this would be a particularly harmless one. There must be a reason why we don't use -std=c89 flag with the standard config. I don't think too many people remember that c89 allows trailing commas in array and struct initialization lists, but not in enum declarations. Without compiler help, enforcing this is an unnecessary maintenance burden. From stefan-usenet at bytereef.org Tue Nov 9 18:29:53 2010 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Tue, 9 Nov 2010 18:29:53 +0100 Subject: [Python-Dev] r86355 - python/branches/py3k/Modules/_pickle.c In-Reply-To: References: <20101109093941.AD093EEA49@mail.python.org> <20101109173657.5f7a233a@pitrou.net> Message-ID: <20101109172953.GA16957@yoda.bytereef.org> Alexander Belopolsky wrote: > On Tue, Nov 9, 2010 at 11:36 AM, Antoine Pitrou wrote: > .. > >> C99 allows it. Which compiler is giving you trouble? > > > > One part of the answer is that we generally try to enforce C89 > > compatibility. I don't know if any modern compiler would mind, though. > > I know, but if we ever start making exceptions, this would be a > particularly harmless one. There must be a reason why we don't use > -std=c89 flag with the standard config.
I don't think too many people > remember that c89 allows trailing commas in array and struct > initialization lists, but not in enum declarations. Without compiler > help, enforcing this is an unnecessary maintenance burden. xlc on AIX has problems: http://bugs.python.org/issue5889 Stefan Krah From tseaver at palladion.com Tue Nov 9 19:49:01 2010 From: tseaver at palladion.com (Tres Seaver) Date: Tue, 09 Nov 2010 13:49:01 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/08/2010 06:26 PM, Bobby Impollonia wrote: > This does hurt because anyone who was relying on "import *" to get a > name which is now omitted from __all__ is going to upgrade and find > their program failing with NameErrors. This is a backwards compatible > change and shouldn't happen without a deprecation warning first. Outside an interactive prompt, anyone using "from foo import *" has set themselves and their users up to lose anyway. That syntax is the single worst misfeature in all of Python. It impairs readability and discoverability for *no* benefit beyond one-time typing convenience. Module writers who compound the error by expecting to be imported this way, thereby bogarting the global namespace for their own purposes, should be fish-slapped. ;) Tres. 
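The failure mode Bobby describes above can be reproduced in a few lines. What follows is only a minimal sketch: the module name demo_lib and both helper functions are hypothetical, and the module is built in memory so the snippet runs on its own. It shows that a name dropped from __all__ disappears for "from demo_lib import *" users, while qualified access keeps working.

```python
import sys
import types


def make_module(name, source):
    """Create an in-memory module and register it in sys.modules."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


# Hypothetical library: dropped_helper used to be listed in __all__.
SOURCE = '''
__all__ = ["public_helper"]

def public_helper():
    return "still exported"

def dropped_helper():
    return "no longer in __all__"
'''

make_module("demo_lib", SOURCE)

# Simulate a client module that relies on the star import.
namespace = {}
exec("from demo_lib import *", namespace)
exported = sorted(n for n in namespace if not n.startswith("__"))
print(exported)

# The client now fails on the name that left __all__ ...
try:
    eval("dropped_helper()", namespace)
except NameError as exc:
    print("NameError:", exc)

# ... while a qualified reference is unaffected.
import demo_lib
print(demo_lib.dropped_helper())
```

Nothing here says whether a deprecation period is warranted; it only illustrates why star-import users are the ones who notice such a change.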
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzZl50ACgkQ+gerLs4ltQ5PYQCfUF2l8BjYvaZSu7ATT8/PxweH jqMAoIWD/D5KIfLp/JOdPVuWJsH/kdc/ =/349 -----END PGP SIGNATURE----- From merwok at netwok.org Tue Nov 9 20:46:41 2010 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Tue, 09 Nov 2010 20:46:41 +0100 Subject: [Python-Dev] [Python-checkins] r86348 - in python/branches/py3k/Lib: test/test_xml_etree.py xml/etree/ElementTree.py In-Reply-To: <20101109023700.32DC1EEA06@mail.python.org> References: <20101109023700.32DC1EEA06@mail.python.org> Message-ID: <4CD9A521.9030200@netwok.org> Hello Senthil > Author: senthil.kumaran > New Revision: 86348 > Log: Fix Issue10205 - XML QName error when different tags have same QName. > > Modified: > python/branches/py3k/Lib/test/test_xml_etree.py > python/branches/py3k/Lib/xml/etree/ElementTree.py Shouldn?t this include an entry in NEWS and maybe in ACKS? Regards From glyph at twistedmatrix.com Tue Nov 9 21:36:16 2010 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Tue, 9 Nov 2010 12:36:16 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <20101108210318.2040.535133461.divmod.xquotient.744@localhost.localdomain> <20101108214512.2040.865703760.divmod.xquotient.763@localhost.localdomain> <20101108223553.2040.1385281841.divmod.xquotient.766@localhost.localdomain> <07ABB6B8-6858-4FAD-8FE0-F93BFB6C98BF@twistedmatrix.com> Message-ID: On Nov 8, 2010, at 4:50 PM, Guido van Rossum wrote: > On Mon, Nov 8, 2010 at 3:55 PM, Glyph Lefkowitz wrote: >> This seems like a pretty clear case of "practicality beats purity". 
Not only has nobody complained about deprecatedModuleAttribute, but there are tons of things which show up in sys.modules that aren't modules in the sense of 'instances of ModuleType'. The Twisted reactor, for example, is an instance, and we've been doing *that* for about 10 years with no complaints. > > But the Twisted universe is only a subset of the Python universe. The > Python stdlib needs to move more carefully. While this is true, I think the Twisted universe generally represents a particularly conservative, compatibility-conscious area within the Python universe (multiverse?). I know of several Twisted users who regularly upgrade to the most recent version of Twisted without incident, but can't move from Python 2.4->2.5 because of compatibility issues. That's not to say that there are no areas within the larger Python ecosystem that I'm unaware of where putting non-module-objects into sys.modules would cause issues. But if it were a practice that were at all common, I suspect that we would have bumped into it by now. -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.badger at gmail.com Tue Nov 9 21:48:06 2010 From: a.badger at gmail.com (Toshio Kuratomi) Date: Tue, 9 Nov 2010 12:48:06 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> Message-ID: <20101109204806.GB14976@unaka.lan> On Tue, Nov 09, 2010 at 01:49:01PM -0500, Tres Seaver wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 11/08/2010 06:26 PM, Bobby Impollonia wrote: > > > This does hurt because anyone who was relying on "import *" to get a > > name which is now omitted from __all__ is going to upgrade and find > > their program failing with NameErrors. This is a backwards compatible > > change and shouldn't happen without a deprecation warning first. 
>
> Outside an interactive prompt, anyone using "from foo import *" has set
> themselves and their users up to lose anyway.
>
> That syntax is the single worst misfeature in all of Python. It impairs
> readability and discoverability for *no* benefit beyond one-time typing
> convenience. Module writers who compound the error by expecting to be
> imported this way, thereby bogarting the global namespace for their own
> purposes, should be fish-slapped. ;)
>
I think there's a valid case for bogarting the namespace in this instance,
but let me know if there's a better way to do it::

  # Method to use system libraries if available, otherwise use a bundled copy,
  # aka: make both system packagers and developers happy::

Relevant directories and files for this module::

  + foo/
    +- __init__.py
    ++ compat/
       +- __init__.py
       ++ bar/
          +- __init__.py
          +- _bar.py

foo/compat/bar/_bar.py is a bundled module.

foo/compat/bar/__init__.py has:

  try:
      from bar import *
      from bar import __all__
  except ImportError:
      from foo.compat.bar._bar import *
      from foo.compat.bar._bar import __all__

-Toshio

From martin at v.loewis.de Tue Nov 9 22:17:25 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 09 Nov 2010 22:17:25 +0100
Subject: [Python-Dev] bugs.python.org migration
Message-ID: <4CD9BA65.1060005@v.loewis.de>

bugs.python.org is moving to new hardware; this also involves a new IP
address. The migration will happen on Thursday, likely around 8:00 UTC.
If all goes well, outage should be very short.
Regards,
Martin

From ncoghlan at gmail.com Tue Nov 9 23:09:09 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 10 Nov 2010 08:09:09 +1000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com>
Message-ID:

On Wed, Nov 10, 2010 at 4:49 AM, Tres Seaver wrote:
> Outside an interactive prompt, anyone using "from foo import *" has set
> themselves and their users up to lose anyway.
>
> That syntax is the single worst misfeature in all of Python. It impairs
> readability and discoverability for *no* benefit beyond one-time typing
> convenience. Module writers who compound the error by expecting to be
> imported this way, thereby bogarting the global namespace for their own
> purposes, should be fish-slapped. ;)

Be prepared to fish-slap all of python-dev then - we use precisely
this technique to support optional acceleration modules. The pure
Python versions of pairs like profile/_profile and heapq/_heapq
include a try/except block at the end that does the equivalent of:

  try:
    from _accelerated import * # Allow accelerated overrides
  except ImportError:
    pass # Use pure Python versions

This allows each implementation to make its own decisions about
exactly which parts to accelerate without needing to change the pure
Python version. In CPython itself, different *builds* may vary based
on which components are available during the build process.

There are utility functions provided in test.support that allow us to
make sure that these modules are tested both with and without their
accelerated components.

The new unittest package in 2.7 and 3.2 also uses it in the module
__init__ to present the old "flat" namespace despite becoming a package
under the hood.

Star imports are certainly open to abuse, but there are legitimate use
cases when you want to lie about where particular APIs live in the
module hierarchy.
Those use cases generally involve being imported by
one *specific* other module, such that anyone else importing the
module directly *at all* is already doing the wrong thing.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From tseaver at palladion.com Tue Nov 9 23:12:00 2010
From: tseaver at palladion.com (Tres Seaver)
Date: Tue, 09 Nov 2010 17:12:00 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <20101109204806.GB14976@unaka.lan>
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <20101109204806.GB14976@unaka.lan>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11/09/2010 03:48 PM, Toshio Kuratomi wrote:

> I think there's a valid case for bogarting the namespace in this instance,
> but let me know if there's a better way to do it::
>
>   # Method to use system libraries if available, otherwise use a bundled copy,
>   # aka: make both system packagers and developers happy::
>
>
> Relevant directories and files for this module::
>
>   + foo/
>     +- __init__.py
>     ++ compat/
>        +- __init__.py
>        ++ bar/
>           +- __init__.py
>           +- _bar.py
>
> foo/compat/bar/_bar.py is a bundled module.
>
> foo/compat/bar/__init__.py has:
>
>   try:
>       from bar import *
>       from bar import __all__
>   except ImportError:
>       from foo.compat.bar._bar import *
>       from foo.compat.bar._bar import __all__

I guess the usual caveats apply for doppelgangers / proxies. ;)


Tres.
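For comparison with the star-import shim quoted above, here is one possible explicit variant. It is a sketch only, not what Toshio's package does: the names bundled_bar, system_bar_that_is_not_installed, load_preferred and reexport are all made up for illustration, and the "bundled" module is created in memory so the snippet is self-contained. The idea is to pick the system module if it imports, fall back to the bundled one otherwise, and then copy exactly the names listed in its __all__.

```python
import importlib
import sys
import types

# Stand-in for the bundled copy (foo/compat/bar/_bar.py in the thread).
bundled = types.ModuleType("bundled_bar")
exec('__all__ = ["greet"]\n\ndef greet():\n    return "bundled"\n',
     bundled.__dict__)
sys.modules["bundled_bar"] = bundled


def load_preferred(system_name, bundled_name):
    """Return the system module if importable, else the bundled one."""
    try:
        return importlib.import_module(system_name)
    except ImportError:
        return importlib.import_module(bundled_name)


def reexport(module, namespace):
    """Copy exactly the names listed in module.__all__ into namespace."""
    names = list(module.__all__)
    for name in names:
        namespace[name] = getattr(module, name)
    return names


# 'shim' plays the role of the globals of foo/compat/bar/__init__.py.
shim = {}
chosen = load_preferred("system_bar_that_is_not_installed", "bundled_bar")
shim["__all__"] = reexport(chosen, shim)
print(chosen.__name__, shim["__all__"], shim["greet"]())
```

In a real package __init__.py the namespace argument would be globals(). Whether this is preferable to the two-line star import is a judgment call; it mainly makes the dependence on __all__ visible.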
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkzZxzAACgkQ+gerLs4ltQ5UsgCfcaxeFruJCDGnxBA0ma8Pjggg
lW8AoMBx2FYg+PSA/Zbq94UbiPhKGnjO
=/8QU
-----END PGP SIGNATURE-----

From foom at fuhm.net Wed Nov 10 00:33:35 2010
From: foom at fuhm.net (James Y Knight)
Date: Tue, 9 Nov 2010 18:33:35 -0500
Subject: [Python-Dev] Continuing 2.x
In-Reply-To:
References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FC57@exchcn.ccp.ad.local> <0A652EB4-2C5D-4070-83EE-CF75603EE721@fuhm.net>
Message-ID:

On Nov 8, 2010, at 6:08 PM, Lennart Regebro wrote:
> 2010/11/8 James Y Knight :
>> On Nov 8, 2010, at 4:42 AM, Lennart Regebro wrote:
>>> So it can be done, but the question is "Why?"
>>
>> To keep the batteries included?
>
> But they'll only be included in > 2.7, which won't be used much, [...]

If there was going to be an official python.org sanctioned Python 2.8
release, I'm not at all sure that'd be the case. Since there isn't going
to be one, then yes, that's probably true.

James

From orsenthil at gmail.com Wed Nov 10 00:48:42 2010
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Wed, 10 Nov 2010 07:48:42 +0800
Subject: [Python-Dev] [Python-checkins] r86348 - in python/branches/py3k/Lib: test/test_xml_etree.py xml/etree/ElementTree.py
In-Reply-To: <4CD9A521.9030200@netwok.org>
References: <20101109023700.32DC1EEA06@mail.python.org> <4CD9A521.9030200@netwok.org>
Message-ID: <20101109234842.GA1068@rubuntu>

Hello Éric,

On Tue, Nov 09, 2010 at 08:46:41PM +0100, Éric Araujo wrote:
>
> Shouldn't this include an entry in NEWS and maybe in ACKS?

It was a very simple bug fix (caused due to an overlook initially), so
did not add NEWS/ACKS.
For features, larger fixes or complete patches, I the add NEWS and ACKS as appropriate. Thanks, Senthil From stephen at xemacs.org Wed Nov 10 05:12:09 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 10 Nov 2010 13:12:09 +0900 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> Message-ID: <87fwv9g6li.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > > Module writers who compound the error by expecting to be imported > > this way, thereby bogarting the global namespace for their own > > purposes, should be fish-slapped. ;) > > Be prepared to fish-slap all of python-dev then - we use precisely > this technique to support optional acceleration modules. The pure > Python versions of pairs like profile/_profile and heapq/_heapq > include a try/except block at the end that does the equivalent of: > > try: > from _accelerated import * # Allow accelerated overrides > except ImportError: > pass # Use pure Python versions But these identifiers will appear at the module level, not global, no? Otherwise this technique couldn't be used. I don't really understand what Tres is talking about when he writes "modules that expect to be imported this way". The *imported* module shouldn't care, no? This is an issue for the *importing* code to deal with. From stephen at xemacs.org Wed Nov 10 05:20:58 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 10 Nov 2010 13:20:58 +0900 Subject: [Python-Dev] Continuing 2.x In-Reply-To: References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FC57@exchcn.ccp.ad.local> <0A652EB4-2C5D-4070-83EE-CF75603EE721@fuhm.net> Message-ID: <87eiatg66t.fsf@uwakimon.sk.tsukuba.ac.jp> James Y Knight writes: > > On Nov 8, 2010, at 6:08 PM, Lennart Regebro wrote: > > > 2010/11/8 James Y Knight : > >> On Nov 8, 2010, at 4:42 AM, Lennart Regebro wrote: > >>> So it can be done, but the question is "Why?" > >> > >> To keep the batteries included? 
> > > > But they'll only be included in > 2.7, which won't be used much, [...] > > If there was going to be an official python.org sanctioned Python > 2.8 release, I'm not at all sure that'd be the case. Since there > isn't going to be one, then yes, that's probably true. Which pretty much demonstrates that the argument for a sanctioned 2.8 is weak, and ditto for adding features to 2.7. Python 2.7 is a great language; existing projects which need to go beyond that need to port to a different language. The OP is already doing that IIUC: Stackless is a pretty faithful implementation of Python (in several versions of the language, too!), but not quite 100%, right? OTOH, how many derivatives has C spawned? Or Pascal, FORTRAN, LISP? ML? And people continue to find that variety *constraining*, and invent new languages! python-dev's decision to offer that different language as Python 3, where *almost all* of your skills will upgrade transparently (even though unfortunately a lot of code won't, at least not today), is probably a great boon to developers *in* Python. Time will tell. From ncoghlan at gmail.com Wed Nov 10 14:32:52 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 10 Nov 2010 23:32:52 +1000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CDA8EB5.6090306@voidspace.org.uk> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDA8EB5.6090306@voidspace.org.uk> Message-ID: On Wed, Nov 10, 2010 at 10:23 PM, Michael Foord wrote: > On 09/11/2010 22:09, Nick Coghlan wrote: >> The new unittest package in 2.7 and 3.2 also uses it in the module >> __init__ to present the old "flat" namespace despite become a package >> under the hood. > > Look again. :-) > > Benjamin did the refactoring into a package and he obviously dislikes > "import *" as much as me. If he had used "import *" I would have changed it > anyway, but he didn't. > > We also define a __all__ to make the exported names explicit. 
Fair cop :) (and in that particular case, the maintenance burden in being explicit is minimal, since new top-level names in unittest are going to be significantly more rare than new methods on existing unittest classes) Even some of the acceleration modules (such as _hashlib) use approaches that are more explicit than using "import *". The point at least stands for the cases where the pure Python version is largely agnostic as to exactly which names the acceleration module overrides. It's a very, very niche use case though, so the default position of "if you use a star import anywhere other than at the interactive prompt, you're most like wrong to do so" is still a reasonable stance to take :) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From victor.stinner at haypocalc.com Wed Nov 10 13:28:36 2010 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Wed, 10 Nov 2010 13:28:36 +0100 Subject: [Python-Dev] [Python-checkins] r86355 - python/branches/py3k/Modules/_pickle.c In-Reply-To: References: <20101109093941.AD093EEA49@mail.python.org> Message-ID: <201011101328.36095.victor.stinner@haypocalc.com> On Tuesday 09 November 2010 17:23:23 Alexander Belopolsky wrote: > On Tue, Nov 9, 2010 at 4:39 AM, victor.stinner > wrote: > .. > > > Log: > > Issue #10359: Remove useless comma, invalid in ISO C > > C99 allows it. Which compiler is giving you trouble? I don't know, but the commit is trivial and cheap. If it improves the support on uncommon compiler, I agree to commit such change. Victor From flub at devork.be Wed Nov 10 10:20:09 2010 From: flub at devork.be (Floris Bruynooghe) Date: Wed, 10 Nov 2010 09:20:09 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <87fwv9g6li.fsf@uwakimon.sk.tsukuba.ac.jp> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <87fwv9g6li.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 10 November 2010 04:12, Stephen J. 
Turnbull wrote:
> Nick Coghlan writes:
>
>  > > Module writers who compound the error by expecting to be imported
>  > > this way, thereby bogarting the global namespace for their own
>  > > purposes, should be fish-slapped. ;)
>  >
>  > Be prepared to fish-slap all of python-dev then - we use precisely
>  > this technique to support optional acceleration modules. The pure
>  > Python versions of pairs like profile/_profile and heapq/_heapq
>  > include a try/except block at the end that does the equivalent of:
>  >
>  >   try:
>  >     from _accelerated import * # Allow accelerated overrides
>  >   except ImportError:
>  >     pass # Use pure Python versions
>
> But these identifiers will appear at the module level, not global, no?
> Otherwise this technique couldn't be used.  I don't really understand
> what Tres is talking about when he writes "modules that expect to be
> imported this way".  The *imported* module shouldn't care, no?  This
> is an issue for the *importing* code to deal with.

I can't think of stdlib examples, but for 3rd party packages I'd say
storm.locals and fabric.api are examples of packages designed with
"from foo import * " in mind. So this does happen.

Regards
Floris

--
Debian GNU/Linux -- The Power of Freedom
www.debian.org | www.gnu.org | www.kernel.org

From hrvoje.niksic at avl.com Wed Nov 10 13:23:35 2010
From: hrvoje.niksic at avl.com (Hrvoje Niksic)
Date: Wed, 10 Nov 2010 13:23:35 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <87fwv9g6li.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <87fwv9g6li.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4CDA8EC7.5060302@avl.com>

On 11/10/2010 05:12 AM, Stephen J. Turnbull wrote:
> But these identifiers will appear at the module level, not global, no?
> Otherwise this technique couldn't be used. I don't really understand
> what Tres is talking about when he writes "modules that expect to be
> imported this way".
The *imported* module shouldn't care, no? I think he's referring to the choice of identifiers, and the usage examples given in the documentation and tutorials. For example, in the original PyGTK, all identifiers included "Gtk" in the name, so it made sense to write from pygtk import * so you could spell GtkWindow as GtkWindow rather than the redundant pygtk.GtkWindow. In that sense the module writer "expected" to be imported this way, although you are right that it doesn't the least bit matter for the correct operation of the module itself. For GTK 2 PyGTK switch to "gtk.Window", which effectively removes the temptation to import * from the module. There are other examples of that school, most notably ctypes, but also Tkinter and the python2 threading module. Fortunately it has become much less popular in the last ~5 years of Python history. From jcea at jcea.es Wed Nov 10 12:34:47 2010 From: jcea at jcea.es (Jesus Cea) Date: Wed, 10 Nov 2010 12:34:47 +0100 Subject: [Python-Dev] bugs.python.org migration In-Reply-To: <4CD9BA65.1060005@v.loewis.de> References: <4CD9BA65.1060005@v.loewis.de> Message-ID: <4CDA8357.9060304@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 09/11/10 22:17, "Martin v. L?wis" wrote: > bugs.python.org is moving to a new hardware; this also involves a new IP > address. The migration will happen on Thursday, likely around 8:00 UTC. > If all goes well, outage should be very short. Seems to be offline now. I get timeouts. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . 
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTNqDV5lgi5GaxT1NAQLrSQP+KsvV9ZUWWPmLQT7cavH6IiuEDIwq+qDS eoQIa149Qv2G7W8rRkQLK1KcpCyyF50vPdTLuTksZXjm3aHikIuTVkIWSoQUhTUy 4Un1rNF9KC2mEtBuEUDyREoAEgpC4tMxXucYUGl37IM9HUqVCd9MWrG9Paf6EAs8 d1yfY7PNGvE= =jTk0 -----END PGP SIGNATURE----- From raymond.hettinger at gmail.com Wed Nov 10 20:33:55 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Wed, 10 Nov 2010 11:33:55 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CDAA27B.8040703@voidspace.org.uk> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> Message-ID: On Nov 10, 2010, at 5:47 AM, Michael Foord wrote: > > So it is obvious that we don't have a clearly stated policy for what defines the public API of standard library modules. > > How about making this explicit (either pep 8 or our developer docs): I believe the point of Guido's email was that it is a situation dependent judgment call and not readily boiled down to a set of rules for PEP 8. Raymond From fuzzyman at voidspace.org.uk Wed Nov 10 13:23:17 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 10 Nov 2010 12:23:17 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> Message-ID: <4CDA8EB5.6090306@voidspace.org.uk> On 09/11/2010 22:09, Nick Coghlan wrote: > On Wed, Nov 10, 2010 at 4:49 AM, Tres Seaver wrote: >> Outside an interactive prompt, anyone using "from foo import *" has set >> themselves and their users up to lose anyway. >> >> That syntax is the single worst misfeature in all of Python. 
It impairs >> readability and discoverability for *no* benefit beyond one-time typing >> convenience. Module writers who compound the error by expecting to be >> imported this way, thereby bogarting the global namespace for their own >> purposes, should be fish-slapped. ;) > Be prepared to fish-slap all of python-dev then - we use precisely > this technique to support optional acceleration modules. The pure > Python versions of pairs like profile/_profile and heapq/_heapq > include a try/except block at the end that does the equivalent of: > > try: > from _accelerated import * # Allow accelerated overrides > except ImportError: > pass # Use pure Python versions > > This allows each implementation to make its own decisions about > exactly which parts to accelerate without needing to change the pure > Python version. In CPython itself, different *builds* may vary based > on which components are available during the build process. > > There are utility functions provided in test.support that allow us to > make sure that these modules are tested both with and without their > accelerated components. > > The new unittest package in 2.7 and 3.2 also uses it in the module > __init__ to present the old "flat" namespace despite become a package > under the hood. Look again. :-) Benjamin did the refactoring into a package and he obviously dislikes "import *" as much as me. If he had used "import *" I would have changed it anyway, but he didn't. We also define a __all__ to make the exported names explicit. All the best, Michael > Star imports are certainly open to abuse, but there are legitimate use > cases when you want to lie about where particular APIs live in the > module heirarchy. Those use cases generally involve being imported by > one *specific* other module, such that anyone else importing the > module directly *at all* is already doing the wrong thing. > > Cheers, > Nick. > -- http://www.voidspace.org.uk/ READ CAREFULLY. 
By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From fuzzyman at voidspace.org.uk Wed Nov 10 14:47:39 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 10 Nov 2010 13:47:39 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> Message-ID: <4CDAA27B.8040703@voidspace.org.uk> On 08/11/2010 22:07, Raymond Hettinger wrote: > On Nov 8, 2010, at 11:58 AM, Brett Cannon wrote: > >> I think we need to, as a group, decide how to handle undocumented APIs >> that don't have a leading underscore: they get treated just the same >> as the documented APIs, or are they private regardless and thus we can >> change them at our whim? > To start with, it doesn't hurt for a maintainer to add an __all__ entry and to only document the parts of the API we think need to be exposed. That way, we can at least declare the parts that are intended to be public on a go-forward basis. > > For the most part, the non-underscored parts of the API shouldn't be changed "at our whim". Some sense needs to be applied to the decision. Google's code search is great for showing how people actually have used a module in real world code. If that shows that people are accessing and/or changing an attribute, it probably needs to remain exposed. 
In the absence of a code search, good guesses can be made about what someone might reasonably and usefully be accessing (i.e. glob0 isn't likely). The goal is to improve the standard library while minimizing breakage, and that will involve trade-offs depending on what is being changed.
>
> IIRC, we've been trying to get away from deprecations because they're so disruptive. For example, when the pprint rewrite is finally ready, if there is an incompatible API change, I expect that a new clean class will be offered, but that the old will be left in-place so that tons of existing code won't break. Likewise, with the unittest clean-ups, I'm expecting that Michael will introduce aliases when fixing-up mis-named methods, rather than break code that uses the existing names.
>

So it is obvious that we don't have a clearly stated policy for what
defines the public API of standard library modules.

How about making this explicit (either pep 8 or our developer docs):

If a module or package defines __all__ that authoritatively defines the
public interface. Modules with __all__ SHOULD still respect the naming
conventions (leading underscore for private members) to avoid confusing
users. Modules SHOULD NOT export private members in __all__.

Names imported into a module are never considered part of its public API
unless documented to be so or included in __all__.

Methods / functions / classes and module attributes whose names begin
with a leading underscore are private.

If a class name begins with a leading underscore none of its members are
public, whether or not they begin with a leading underscore.

If a module name in a package begins with a leading underscore none of
its members are public, whether or not they begin with a leading
underscore.

If a module or package doesn't define __all__ then all names that don't
start with a leading underscore are public.

All public members MUST be documented. Public functions, methods and
classes SHOULD have docstrings.
Private members may have docstrings. Where in the standard library this means that a module exports stuff that isn't helpful or shouldn't be part of the public API we need to migrate to private names and follow our deprecation process for the public names. All the best, Michael Foord > my-two-cents, > > > Raymond > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From rdmurray at bitdance.com Wed Nov 10 16:48:37 2010 From: rdmurray at bitdance.com (R. David Murray) Date: Wed, 10 Nov 2010 10:48:37 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <87fwv9g6li.fsf@uwakimon.sk.tsukuba.ac.jp> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <87fwv9g6li.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20101110154837.CA4731FB625@kimball.webabinitio.net> On Wed, 10 Nov 2010 13:12:09 +0900, "Stephen J. Turnbull" wrote: > Nick Coghlan writes: > > > > Module writers who compound the error by expecting to be imported > > > this way, thereby bogarting the global namespace for their own > > > purposes, should be fish-slapped. 
;) > > > > Be prepared to fish-slap all of python-dev then - we use precisely > > this technique to support optional acceleration modules. The pure > > Python versions of pairs like profile/_profile and heapq/_heapq > > include a try/except block at the end that does the equivalent of: > > > > try: > > from _accelerated import * # Allow accelerated overrides > > except ImportError: > > pass # Use pure Python versions > > But these identifiers will appear at the module level, not global, no? > Otherwise this technique couldn't be used. I don't really understand > what Tres is talking about when he writes "modules that expect to be > imported this way". The *imported* module shouldn't care, no? This > is an issue for the *importing* code to deal with. I think Tres was referring to certain packages (which shall remain nameless since I don't feel like googling to find one) whose documentation recommends the 'from import *' methodology. At least that's how I read "Module writers who..." (that is, he's not saying the *module* expects to be imported that way). [*] -- R. David Murray www.bitdance.com [*] although reading that sentence literally, the thought of such a module writer themselves being imported that way (a la Tron) has a certain charm.... From tseaver at palladion.com Wed Nov 10 17:58:17 2010 From: tseaver at palladion.com (Tres Seaver) Date: Wed, 10 Nov 2010 11:58:17 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <87fwv9g6li.fsf@uwakimon.sk.tsukuba.ac.jp> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <87fwv9g6li.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/09/2010 11:12 PM, Stephen J. Turnbull wrote: > Nick Coghlan writes: > > > > Module writers who compound the error by expecting to be imported > > > this way, thereby bogarting the global namespace for their own > > > purposes, should be fish-slapped. 
;)
> >
> > Be prepared to fish-slap all of python-dev then - we use precisely
> > this technique to support optional acceleration modules. The pure
> > Python versions of pairs like profile/_profile and heapq/_heapq
> > include a try/except block at the end that does the equivalent of:
> >
> >   try:
> >     from _accelerated import * # Allow accelerated overrides
> >   except ImportError:
> >     pass # Use pure Python versions
>
> But these identifiers will appear at the module level, not global, no?
> Otherwise this technique couldn't be used.  I don't really understand
> what Tres is talking about when he writes "modules that expect to be
> imported this way".  The *imported* module shouldn't care, no?  This
> is an issue for the *importing* code to deal with.

Right -- "private" star imports aren't the issue for me, because the
same user who creates them is responsible for the other end of the
stick.  I was ranting about library authors who document star imports as
the expected usage pattern for their external users.

Note that I still wouldn't use star imports in the "private
acceleration" case myself.  I would prefer a pattern like:

- ----------------------- $< -----------------------------
# spam.py

# Pure python API implementation
def foo(spat, blarg):
    ...

def bar(qux):
    ...

# Replace with accelerated C implementation
try:
    import _spam
except ImportError:
    pass # accelerated version not available
else:
    foo = _spam.foo
    bar = _spam.bar
- ----------------------- $< -----------------------------

This explicit name remapping catches unintentional errors (e.g., _spam
renames a method) better than the star import.


Tres.
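A minimal sketch of the failure mode described above, assuming a
hypothetical accelerator module in which `bar` was accidentally renamed
(the module contents and the name `bar_renamed` are made up for
illustration).  With the star import the pure Python `bar` silently stays
in place; with the explicit remapping the mismatch surfaces as an
AttributeError at import time.

```python
import sys
import types

# Hypothetical accelerator in which `bar` was accidentally renamed.
accel = types.ModuleType("_spam")
accel.foo = lambda spat, blarg: "fast foo"
accel.bar_renamed = lambda qux: "fast bar"
sys.modules["_spam"] = accel

star_style = '''
def foo(spat, blarg): return "slow foo"
def bar(qux): return "slow bar"
try:
    from _spam import *
except ImportError:
    pass
'''

explicit_style = '''
def foo(spat, blarg): return "slow foo"
def bar(qux): return "slow bar"
try:
    import _spam
except ImportError:
    pass
else:
    foo = _spam.foo
    bar = _spam.bar
'''

# Star import: foo is accelerated, bar silently remains pure Python.
star_ns = {}
exec(star_style, star_ns)
star_foo = star_ns["foo"](None, None)
star_bar = star_ns["bar"](None)

# Explicit remapping: the missing name is reported immediately.
try:
    exec(explicit_style, {})
    explicit_error = None
except AttributeError as exc:
    explicit_error = exc
```

The trade-off is that the explicit form has to be kept in sync by hand
whenever the accelerated module grows a new function.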
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzazykACgkQ+gerLs4ltQ5BHACfaAh2lVLZ8C+mdV/88UJ0JXTo sqQAn2b2J9cZSQuz2xrwZX/JrvY3AaMh =EIDa -----END PGP SIGNATURE----- From greg.ewing at canterbury.ac.nz Wed Nov 10 21:44:26 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 11 Nov 2010 09:44:26 +1300 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <87fwv9g6li.fsf@uwakimon.sk.tsukuba.ac.jp> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <87fwv9g6li.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CDB042A.9000100@canterbury.ac.nz> Stephen J. Turnbull wrote: > I don't really understand > what Tres is talking about when he writes "modules that expect to be > imported this way". The *imported* module shouldn't care, no? This > is an issue for the *importing* code to deal with. I think he's talking about modules that add a prefix to all of their exported names, such as Tkinter starting everything with "Tk", on the expectation that import * will be the normal way of using the module. For very well-known modules with very well-known prefixes, this probably doesn't do too much harm, since it's usually fairly obvious where a given name is coming from. However, it's probably best not encouraged, as it could lead people who don't know better into bad habits. There's also the downside that people who choose *not* to use import *, and instead import the module itself and use qualified references, end up with everything being prefixed twice, e.g. 'import Tkinter as tk' leads to 'tk.TkWhatever' everywhere. 
On the other hand, when wrapping a C library there's a desire to keep the Python names as close as possible to the C ones, which usually come with prefixes to manage C's totally-global namespace. So there's a bit of a double bind there. -- Greg From brett at python.org Wed Nov 10 21:45:53 2010 From: brett at python.org (Brett Cannon) Date: Wed, 10 Nov 2010 12:45:53 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CDAA27B.8040703@voidspace.org.uk> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> Message-ID: On Wed, Nov 10, 2010 at 05:47, Michael Foord wrote: > On 08/11/2010 22:07, Raymond Hettinger wrote: >> >> On Nov 8, 2010, at 11:58 AM, Brett Cannon wrote: >> >>> I think we need to, as a group, decide how to handle undocumented APIs >>> that don't have a leading underscore: they get treated just the same >>> as the documented APIs, or are they private regardless and thus we can >>> change them at our whim? >> >> To start with, it doesn't hurt for a maintainer to add an __all__ entry >> and to only document the parts of the API we think need to be exposed. ?That >> way, we can at least declare the parts that are intended to be public on a >> go-forward basis. >> >> For the most part, the non-underscored parts of the API shouldn't be >> changed "at our whim". ?Some sense needs to be applied to the decision. >> ?Google's code search is great for showing how people actually have used a >> module in real world code. ?If that shows that people are accessing and/or >> changing an attribute, it probably needs to remain exposed. ? In the absence >> of a code search, good guesses can be made about what someone might >> reasonably and usefully be accessing (i.e. glob0 isn't likely). ? The goal >> is to improve the standard library while minimizing breakage, and that will >> involve trade-offs depending on what is being changed. 
>> >> IIRC, we've been trying to get away from deprecations because they're so >> disruptive. ?For example, when the pprint rewrite is finally ready, if there >> is an incompatible API change, I expect that a new clean class will be >> offered, but that the old will be left in-place so that tons of existing >> code won't break). ?Likewise, with the unittest clean-ups, I'm expecting >> that Michael will introduce aliases when fixing-up mis-named methods, rather >> than break code that uses the existing names. >> > > So it is obvious that we don't have a clearly stated policy for what defines > the public API of standard library modules. > > How about making this explicit (either pep 8 or our developer docs): > > If a module or package defines __all__ that authoritatively defines the > public interface. Modules with __all__ SHOULD still respect the naming > conventions (leading underscore for private members) to avoid confusing > users. Modules SHOULD NOT export private members in __all__. > > Names imported into a module a never considered part of its public API > unless documented to be so or included in __all__. > > Methods / functions / classes and module attributes whose names begin with a > leading underscore are private. > > If a class name begins with a leading underscore none of its members are > public, whether or not they begin with a leading underscore. > > If a module name in a package begins with a leading underscore none of its > members are public, whether or not they begin with a leading underscore. > > If a module or package doesn't define __all__ then all names that don't > start with a leading underscore are public. > > All public members MUST be documented. Public functions, methods and classes > SHOULD have docstrings. Private members may have docstrings. 
> > > Where in the standard library this means that a module exports stuff that > isn't helpful or shouldn't be part of the public API we need to migrate to > private names and follow our deprecation process for the public names. All sounds reasonable to me and what common practice out in the community is. -Brett > > All the best, > > > Michael Foord >> >> my-two-cents, >> >> >> Raymond >> >> >> >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > > -- > > http://www.voidspace.org.uk/ > > READ CAREFULLY. By accepting and reading this email you agree, > on behalf of your employer, to release me from all obligations > and waivers arising from any and all NON-NEGOTIATED agreements, > licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, > confidentiality, non-disclosure, non-compete and acceptable use > policies (?BOGUS AGREEMENTS?) that I have entered into with your > employer, its partners, licensors, agents and assigns, in > perpetuity, without prejudice to my ongoing rights and privileges. > You further represent that you have the authority to release me > from any BOGUS AGREEMENTS on behalf of your employer. > > From alexander.belopolsky at gmail.com Wed Nov 10 21:48:38 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 10 Nov 2010 15:48:38 -0500 Subject: [Python-Dev] [Python-checkins] r86355 - python/branches/py3k/Modules/_pickle.c In-Reply-To: <201011101328.36095.victor.stinner@haypocalc.com> References: <20101109093941.AD093EEA49@mail.python.org> <201011101328.36095.victor.stinner@haypocalc.com> Message-ID: On Wed, Nov 10, 2010 at 7:28 AM, Victor Stinner wrote: .. > I don't know, but the commit is trivial and cheap. If it improves the support > on uncommon compiler, I agree to commit such change. 
>

But it does it at the cost of invalidating the "svn blame" for the
last enum entry now and for future additions.  The problem is that
when you change from

enum {
  ..
  X
}

to

enum {
  ..
  X,
  Y
}

you modify the X line while you are not responsible for adding the X
entry.  Someone who will then add Z, will be blamed for Y as well.

From barry at python.org  Wed Nov 10 22:19:40 2010
From: barry at python.org (Barry Warsaw)
Date: Wed, 10 Nov 2010 16:19:40 -0500
Subject: [Python-Dev] bugs.python.org migration
In-Reply-To: <4CDA8357.9060304@jcea.es>
References: <4CD9BA65.1060005@v.loewis.de> <4CDA8357.9060304@jcea.es>
Message-ID: <20101110161940.21d978f5@mission>

On Nov 10, 2010, at 12:34 PM, Jesus Cea wrote:

>On 09/11/10 22:17, "Martin v. Löwis" wrote:
>> bugs.python.org is moving to a new hardware; this also involves a new IP
>> address. The migration will happen on Thursday, likely around 8:00 UTC.
>> If all goes well, outage should be very short.
>
>Seems to be offline now. I get timeouts.

I just had no problems updating issue 9807.

-Barry

From barry at python.org  Wed Nov 10 22:27:19 2010
From: barry at python.org (Barry Warsaw)
Date: Wed, 10 Nov 2010 16:27:19 -0500
Subject: [Python-Dev] issue 9807 - abiflags in paths and symlinks (updated patch)
Message-ID: <20101110162719.11ae7fe6@mission>

I finally found a chance to address all the outstanding technical issues
mentioned in bug 9807:

http://bugs.python.org/issue9807

I've uploaded a new patch which contains the rest of the changes I'm
proposing.  I think we still need consensus about whether these changes are
good to commit.  With 3.2b1 coming soon, now's the time to do that.

If there are any remaining concerns about the details of the patch, please add
them to the tracker issue.
If you have any remaining objections to the change, please let me know or follow up here. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From foom at fuhm.net Wed Nov 10 23:21:54 2010 From: foom at fuhm.net (James Y Knight) Date: Wed, 10 Nov 2010 17:21:54 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CDAA27B.8040703@voidspace.org.uk> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> Message-ID: On Nov 10, 2010, at 8:47 AM, Michael Foord wrote: > How about making this explicit (either pep 8 or our developer docs): > > If a module or package defines __all__ that authoritatively defines the public interface. Modules with __all__ SHOULD still respect the naming conventions (leading underscore for private members) to avoid confusing users. Modules SHOULD NOT export private members in __all__. I don't like the idea of the authoritative definition of a public interface being defined based on __all__, because that provides users almost no warning that they're using a private API: the __all__ attribute doesn't do anything if you aren't using import *. If there was some proposal to make it so that accessing an attribute not in __all__ did prevent or somehow warn users that they're doing something dangerous, that'd be different, but there isn't such a proposal, and I don't even know what such a proposal would look like... On the other hand, if you make the primary mechanism to indicate privateness be a leading underscore, that's obvious to everyone. 
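A minimal sketch of the point above, assuming a hypothetical module
`demo_mod` built in memory (all names here are made up for
illustration).  `__all__` only limits what `from demo_mod import *`
binds; plain attribute access to a name left out of `__all__` works with
no warning at all.

```python
import sys
import types

# Hypothetical module, built in memory so the sketch is self-contained.
source = '''
__all__ = ["public_func"]

def public_func():
    return "public"

def helper():  # no leading underscore, but not listed in __all__
    return "helper"
'''
demo = types.ModuleType("demo_mod")
exec(source, demo.__dict__)
sys.modules["demo_mod"] = demo

# A star import honours __all__ ...
star_ns = {}
exec("from demo_mod import *", star_ns)
star_names = sorted(n for n in star_ns if not n.startswith("__"))

# ... but ordinary attribute access never consults it.
import demo_mod
helper_result = demo_mod.helper()
```

A leading underscore on `helper` would not stop the call either, but it
would at least be visible at the call site.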
James From rrr at ronadam.com Thu Nov 11 00:10:21 2010 From: rrr at ronadam.com (Ron Adam) Date: Wed, 10 Nov 2010 17:10:21 -0600 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> Message-ID: On 11/10/2010 01:33 PM, Raymond Hettinger wrote: > > On Nov 10, 2010, at 5:47 AM, Michael Foord wrote: > >> >> So it is obvious that we don't have a clearly stated policy for what defines the public API of standard library modules. >> >> How about making this explicit (either pep 8 or our developer docs): > > I believe the point of Guido's email was that it is a situation dependent judgment call and not readily boiled down to a set of rules for PEP 8. The way I read Guido's email is that it is a situation dependent judgment call for those cases that aren't clear. I think what Micheal is trying to say is for us to agree on some things so we can go forward with a little more clarity. Cheers, Ron From jcea at jcea.es Thu Nov 11 01:20:16 2010 From: jcea at jcea.es (Jesus Cea) Date: Thu, 11 Nov 2010 01:20:16 +0100 Subject: [Python-Dev] bugs.python.org migration In-Reply-To: <20101110161940.21d978f5@mission> References: <4CD9BA65.1060005@v.loewis.de> <4CDA8357.9060304@jcea.es> <20101110161940.21d978f5@mission> Message-ID: <4CDB36C0.6070007@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 10/11/10 22:19, Barry Warsaw wrote: > On Nov 10, 2010, at 12:34 PM, Jesus Cea wrote: >> Seems to be offline now. I get timeouts. > > I just had no problems updating issue 9807. That was 10 hours after my message :-). - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . 
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTNs2wJlgi5GaxT1NAQKTlAP/dI29cbxfsHCj4pjATuR0yIryAEsZyKml M4+EXxohASAaOG3prdEKwE8bbyZDaX4+nvcm+2X7S9aoTgVlLJWavGraH8ApE/AU SShTsvzLHtNgB6MNNzT+58kv9z2pdCHJcrEY6d98Qh0buJp0Qz7AKcBw6mEb/bG4 v2bF7MyolOE= =oG9l -----END PGP SIGNATURE----- From glyph at twistedmatrix.com Thu Nov 11 03:41:11 2010 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Wed, 10 Nov 2010 18:41:11 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> Message-ID: <69111B4E-45D0-429F-8027-9D379EADE5F1@twistedmatrix.com> On Nov 10, 2010, at 2:21 PM, James Y Knight wrote: > On the other hand, if you make the primary mechanism to indicate privateness be a leading underscore, that's obvious to everyone. +1. One of the best features of Python is the ability to make a conscious decision to break the interface of a library and just get on with your work, even if your use-case is not really supported, because nothing can stop you calling its private functionality. But, IMHO the worst problem with Python is the fact that you can do this _without realizing it_ and pay a steep maintenance price later when an upgrade of something springs the trap that you had unwittingly set for yourself. The leading-underscore convention is the only thing I've found that even mitigates this problem. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Thu Nov 11 04:04:39 2010 From: stephen at xemacs.org (Stephen J. 
Turnbull)
Date: Thu, 11 Nov 2010 12:04:39 +0900
Subject: [Python-Dev] [Python-checkins] r86355 - python/branches/py3k/Modules/_pickle.c
In-Reply-To:
References: <20101109093941.AD093EEA49@mail.python.org> <201011101328.36095.victor.stinner@haypocalc.com>
Message-ID: <87tyjoef20.fsf@uwakimon.sk.tsukuba.ac.jp>

Alexander Belopolsky writes:
 > On Wed, Nov 10, 2010 at 7:28 AM, Victor Stinner
 > wrote:
 > ..
 > > I don't know, but the commit is trivial and cheap. If it improves the support
 > > on uncommon compiler, I agree to commit such change.
 > >
 >
 > But it does it at the cost of invalidating the "svn blame" for the
 > last enum entry now and for future additions.  The problem is that
 > when you change from
 >
 > enum {
 >   ..
 >   X
 > }
 >
 > to
 >
 > enum {
 >   ..
 >   X,
 >   Y
 > }

If that bothers you, you can write

enum {
  A
  , B
  /* etc */
  , X
}

or

enum {
  A,
  B,
  /* etc */
  X,
  enum_bound_otherwise_unused
}

I prefer the last; it's a compiler (and debugger) space burden, but
shouldn't affect the running python.  On the original question, I
think it's preferable to keep compilers happy unless you're willing to
*require* C99.

From alexander.belopolsky at gmail.com  Thu Nov 11 05:31:22 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 10 Nov 2010 23:31:22 -0500
Subject: [Python-Dev] [Python-checkins] r86355 - python/branches/py3k/Modules/_pickle.c
In-Reply-To: <87tyjoef20.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20101109093941.AD093EEA49@mail.python.org> <201011101328.36095.victor.stinner@haypocalc.com> <87tyjoef20.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

On Wed, Nov 10, 2010 at 10:04 PM, Stephen J. Turnbull wrote:
> ...  On the original question, I
> think it's preferable to keep compilers happy unless you're willing to
> *require* C99.

Hmm, maybe I should take another look at http://bugs.python.org/issue4805 .

Note that issue #10359 was not about any real compiler - it was about
compiling with gcc -pedantic.
If we *require* pedantic c89 compliance - we
should add -pedantic -std=c89 to the standard build flags.  Otherwise
non-compliant code will accumulate between "ISO C cleanups" and such
cleanups will continue to pollute VC logs.

From alexander.belopolsky at gmail.com  Thu Nov 11 06:41:16 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 11 Nov 2010 00:41:16 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk>
Message-ID:

On Wed, Nov 10, 2010 at 6:10 PM, Ron Adam wrote:
..
>> On Nov 10, 2010, at 5:47 AM, Michael Foord wrote:
>>>
>>> So it is obvious that we don't have a clearly stated policy for what
>>> defines the public API of standard library modules.
>>>
>>> How about making this explicit (either pep 8 or our developer docs):
>>
..
> The way I read Guido's email is that it is a situation dependent judgment
> call for those cases that aren't clear.
>
> I think what Michael is trying to say is for us to agree on some things so
> we can go forward with a little more clarity.

I don't understand why everyone seems to have accepted Michael's
premise that "we don't have a clearly stated policy for what defines
the public API of standard library modules."  We do have such a policy
and it is well known (while the location in the reference manual may
not be):

"""
The public names defined by a module are determined by checking the
module's namespace for a variable named __all__; if defined, it must
be a sequence of strings which are names defined or imported by that
module. The names given in __all__ are all considered public and are
required to exist. If __all__ is not defined, the set of public names
includes all names found in the module's namespace which do not begin
with an underscore character ('_'). __all__ should contain the entire
public API.
It is intended to avoid accidentally exporting items that are not part of the API (such as library modules which were imported and used within the module). """ -- The question that I had when I started this thread was not about a definition of "public API." It was about a policy with respect to modules that precede the introduction of __all__ and the modern definition of public names. (See r18692 "Two changes to from...import", and r23920 ' adding a definition of "public names"'.) Is it OK to add __all__ to such modules that does not include all names not starting with an underscore? Is it OK to then remove names that clearly were not intended to be public? Case in point: trace.rx_blank. See also . From stephen at xemacs.org Thu Nov 11 07:09:58 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 11 Nov 2010 15:09:58 +0900 Subject: [Python-Dev] [Python-checkins] r86355 - python/branches/py3k/Modules/_pickle.c In-Reply-To: References: <20101109093941.AD093EEA49@mail.python.org> <201011101328.36095.victor.stinner@haypocalc.com> <87tyjoef20.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87oc9we6h5.fsf@uwakimon.sk.tsukuba.ac.jp> Alexander Belopolsky writes: > On Wed, Nov 10, 2010 at 10:04 PM, Stephen J. Turnbull > wrote: > > ... ?On the original question, I > > think it's preferable to keep compilers happy unless you're willing to > > *require* C99. > > Hmm, maybe I should take another look at http://bugs.python.org/issue4805 . > > Note that issue #10359 was not about any real compiler True, but a real compiler has been mentioned in the thread, and I know that every time XEmacs lets a non-C89 feature slip through (most commonly, "//" comments and declarations following non-declarations, the latter being a killer feature in C-like languages IMO, but our current coding standard says "C89") we get build breakage reports. 
From fuzzyman at voidspace.org.uk Thu Nov 11 12:51:26 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 11 Nov 2010 11:51:26 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <20101110154837.CA4731FB625@kimball.webabinitio.net> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <87fwv9g6li.fsf@uwakimon.sk.tsukuba.ac.jp> <20101110154837.CA4731FB625@kimball.webabinitio.net> Message-ID: <4CDBD8BE.9090605@voidspace.org.uk> On 10/11/2010 15:48, R. David Murray wrote: > On Wed, 10 Nov 2010 13:12:09 +0900, "Stephen J. Turnbull" wrote: >> Nick Coghlan writes: >> >> > > Module writers who compound the error by expecting to be imported >> > > this way, thereby bogarting the global namespace for their own >> > > purposes, should be fish-slapped. ;) >> > >> > Be prepared to fish-slap all of python-dev then - we use precisely >> > this technique to support optional acceleration modules. The pure >> > Python versions of pairs like profile/_profile and heapq/_heapq >> > include a try/except block at the end that does the equivalent of: >> > >> > try: >> > from _accelerated import * # Allow accelerated overrides >> > except ImportError: >> > pass # Use pure Python versions >> >> But these identifiers will appear at the module level, not global, no? >> Otherwise this technique couldn't be used. I don't really understand >> what Tres is talking about when he writes "modules that expect to be >> imported this way". The *imported* module shouldn't care, no? This >> is an issue for the *importing* code to deal with. > I think Tres was referring to certain packages (which shall remain > nameless since I don't feel like googling to find one) whose > documentation recommends the 'from import *' methodology. Contenders include popular libraries like fabric and django: http://docs.fabfile.org/0.9.2/usage/fabfiles.html http://docs.djangoproject.com/en/1.2/intro/tutorial03/ All the best, Michael > At least that's how I read "Module writers who..." 
(that is, he's not > saying the *module* expects to be imported that way). [*] > > -- > R. David Murray www.bitdance.com > > [*] although reading that sentence literally, the thought of such a > module writer themselves being imported that way (a la Tron) has a > certain charm.... > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From fuzzyman at voidspace.org.uk Thu Nov 11 13:01:16 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 11 Nov 2010 12:01:16 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> Message-ID: <4CDBDB0C.6080703@voidspace.org.uk> On 11/11/2010 05:41, Alexander Belopolsky wrote: > On Wed, Nov 10, 2010 at 6:10 PM, Ron Adam wrote: > .. >>> On Nov 10, 2010, at 5:47 AM, Michael Foord wrote: >>>> So it is obvious that we don't have a clearly stated policy for what >>>> defines the public API of standard library modules. >>>> >>>> How about making this explicit (either pep 8 or our developer docs): >>> .. 
>> The way I read Guido's email is that it is a situation dependent judgment >> call for those cases that aren't clear. >> >> I think what Micheal is trying to say is for us to agree on some things so >> we can go forward with a little more clarity. > I don't understand why everyone seem to have accepted Michael's > premise that "we don't have a clearly stated policy for what defines > the public API of standard library modules." We do have such a policy > and it is well known (while the location in the reference manual may > not be): Ha. 14 paragraphs into the grammar reference on the import statement is perhaps not where developers would go to look for Python standard library development policy (and it *isn't* where they should go - standard library policy should be in pep 8 or our developer docs). What you're saying is that the behaviour of "import *" *already* defines the public API at module level (but says nothing about class members or modules whose names begin with a leading underscore - those rules follow as a natural extension though). By "clearly stated", I meant part of the python development documentation and / or standard library documentation. This is so that both users and developers are clear about the rules, and we have somewhere obvious to point people to. From this discussion it is clear that developers *don't* have a common understanding about what defines the public API of a standard library module. Suggestions as to what the rule is have included "only documented APIs are public" and "every member with a docstring is public"... This largely comes from the heritage of the standard library which, as you point out, pre-dates the addition of __all__ / import * behaviour to the language. However many newer modules don't define __all__ either and several core developers have said they don't consider it a requirement that they do (as __all__ is a maintenance burden). 
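The rule from the language reference quoted in this thread can be
sketched as a small function.  This is only an illustration of that
rule, not a proposed tool; the module `legacy_mod` and its contents are
hypothetical.  It shows why a module without __all__ ends up "exporting"
imported modules and stray constants.

```python
import types


def public_names(module):
    """Names a star import of `module` would bind, per the quoted rule."""
    if hasattr(module, "__all__"):
        return sorted(module.__all__)
    return sorted(name for name in vars(module) if not name.startswith("_"))


# Hypothetical legacy module with no __all__ (names are illustrative only).
legacy = types.ModuleType("legacy_mod")
exec(
    "import re\n"
    "rx_blank = re.compile(r'^\\s*$')\n"
    "def main():\n"
    "    return 0\n"
    "def _helper():\n"
    "    return 1\n",
    legacy.__dict__,
)

# Without __all__: the imported `re` and the constant both count as public.
before = public_names(legacy)

# Adding __all__ narrows what counts as public under the same rule.
legacy.__all__ = ["main"]
after = public_names(legacy)
```

Whether going from `before` to `after` needs a deprecation cycle is
exactly the question being argued here.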
> """ > The public names defined by a module are determined by checking the > module?s namespace for a variable named __all__; > if defined, it must > be a sequence of strings which are names defined or imported by that > module. The names given in __all__ are all considered public and are > required to exist. If __all__ is not defined, the set of public names > includes all names found in the module?s namespace which do not begin > with an underscore character ('_'). __all__ should contain the entire > public API. It is intended to avoid accidentally exporting items that > are not part of the API (such as library modules which were imported > and used within the module). > """ -- > > The question that I had when I started this thread was not about a > definition of "public API." It was about a policy with respect to > modules that precede the introduction of __all__ and the modern > definition of public names. (See r18692 "Two changes to > from...import", and r23920 ' adding a definition of "public names"'.) > Well - restated your question is asking if adding a __all__ *changes* the public API of a standard library module. If it does then it is has stronger backwards compatibility implications than if it doesn't. So given a standard library module that doesn't define __all__, what is considered the public API? > Is it OK to add __all__ to such modules that does not include all > names not starting with an underscore? Is it OK to then remove names > that clearly were not intended to be public? Given the rules I suggested, which are basically the same as the one *you* are saying are already in place, if "import *" exports these names then you shouldn't change that behaviour without going through the deprecation process. It would be clearer if these rules were stated either in pep 8 or our developer documentation of course. All the best, Michael > Case in point: trace.rx_blank. See also. 
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From kmtracey at gmail.com Thu Nov 11 13:22:00 2010 From: kmtracey at gmail.com (Karen Tracey) Date: Thu, 11 Nov 2010 07:22:00 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CDBD8BE.9090605@voidspace.org.uk> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <87fwv9g6li.fsf@uwakimon.sk.tsukuba.ac.jp> <20101110154837.CA4731FB625@kimball.webabinitio.net> <4CDBD8BE.9090605@voidspace.org.uk> Message-ID: On Thu, Nov 11, 2010 at 6:51 AM, Michael Foord wrote: > On 10/11/2010 15:48, R. David Murray wrote: > >> I think Tres was referring to certain packages (which shall remain >> nameless since I don't feel like googling to find one) whose >> documentation recommends the 'from import *' methodology. >> > > Contenders include popular libraries like fabric and django: > > http://docs.fabfile.org/0.9.2/usage/fabfiles.html > http://docs.djangoproject.com/en/1.2/intro/tutorial03/ > > That is one very specific module in Django that gets imported that way, it is not a general pattern recommended by Django. 
For every other Django module besides that one you will see specific imports being used in the doc. Karen -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Thu Nov 11 14:23:46 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 11 Nov 2010 08:23:46 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CDBDB0C.6080703@voidspace.org.uk> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> Message-ID: On Thu, Nov 11, 2010 at 7:01 AM, Michael Foord wrote: .. >> Is it OK to add __all__ to such modules that does not include all >> names not starting with an underscore? ?Is it OK to then remove names >> that clearly were not intended to be public? > > Given the rules I suggested, which are basically the same as the one *you* > are saying are already in place, if "import *" exports these names then you > shouldn't change that behaviour without going through the deprecation > process. I don't dispute that these are *the* rules, but my question was whether it is ok to break them in specific cases such as trace.rx_blank. If not, how can we deprecate trace.rx_blank which is a regex constant? Another specific case is token.main. See . From tseaver at palladion.com Thu Nov 11 14:39:54 2010 From: tseaver at palladion.com (Tres Seaver) Date: Thu, 11 Nov 2010 08:39:54 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/11/2010 08:23 AM, Alexander Belopolsky wrote: > On Thu, Nov 11, 2010 at 7:01 AM, Michael Foord > wrote: > .. >>> Is it OK to add __all__ to such modules that does not include all >>> names not starting with an underscore? 
Is it OK to then remove names >>> that clearly were not intended to be public? >> >> Given the rules I suggested, which are basically the same as the one *you* >> are saying are already in place, if "import *" exports these names then you >> shouldn't change that behaviour without going through the deprecation >> process. > > I don't dispute that these are *the* rules, but my question was > whether it is ok to break them in specific cases such as > trace.rx_blank. If not, how can we deprecate trace.rx_blank which is > a regex constant? > > Another specific case is token.main. See . I would argue that the narrative documentation for the module is normative for defining "public API", trumping even a pre-existing '__all__'. Given that all non-private stdlib modules have such docs, nobody should be relying on '__all__' as anything other than a convenience. Therefore, in the absence of an '__all__', adding one which conforms to the docs should not require deprecations, as the set of applications / modules which both use the undocumented names *and* do so via 'import *' can be safely deemed "too small to worry about". Tres. 
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzb8ioACgkQ+gerLs4ltQ4WBwCgux91ooO8lega+HRlYClSDj/B SdwAoIq3ZjMwEL1V7vX8sq9k/xSRhIjA =v9Zc -----END PGP SIGNATURE----- From fdrake at acm.org Thu Nov 11 14:43:44 2010 From: fdrake at acm.org (Fred Drake) Date: Thu, 11 Nov 2010 08:43:44 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> Message-ID: On Thu, Nov 11, 2010 at 8:23 AM, Alexander Belopolsky wrote: > I don't dispute that these are *the* rules, but my question was > whether it is ok to break them in specific cases such as > trace.rx_blank. If not, how can we deprecate trace.rx_blank which is > a regex constant? Since trace is documented and rx_blank isn't covered, I think it's pretty clear it was never intended as API. I'd be fine with changing the visibility of rx_blank, and see no need to change its name. > Another specific case is token.main. See . Yep. Again, it's clear that it's not API, and that's a documented module. -Fred -- Fred L. Drake, Jr. "A storm broke loose in my mind." --Albert Einstein From alexander.belopolsky at gmail.com Thu Nov 11 14:51:32 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 11 Nov 2010 08:51:32 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> Message-ID: On Thu, Nov 11, 2010 at 8:43 AM, Fred Drake wrote: ..
> Since trace is documented and rx_blank isn't covered, I think it's > pretty clear it was never intended as API. ?I'd be fine with changing > the visibility of rx_blank, and see no need to change its name. While I obviously agree with your conclusion, your logic is not perfect because trace documentation is *much* younger than the module. How would you apply your reasoning to trace.find_strings()? It is undocumented, its name is misleading, but it is used in the wild according to google code search. I draw the line somewhere between trace.rx_blank and trace.find_strings. From fuzzyman at voidspace.org.uk Thu Nov 11 14:57:53 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 11 Nov 2010 13:57:53 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> Message-ID: <4CDBF661.70204@voidspace.org.uk> On 11/11/2010 13:39, Tres Seaver wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 11/11/2010 08:23 AM, Alexander Belopolsky wrote: >> On Thu, Nov 11, 2010 at 7:01 AM, Michael Foord >> wrote: >> .. >>>> Is it OK to add __all__ to such modules that does not include all >>>> names not starting with an underscore? Is it OK to then remove names >>>> that clearly were not intended to be public? >>> Given the rules I suggested, which are basically the same as the one *you* >>> are saying are already in place, if "import *" exports these names then you >>> shouldn't change that behaviour without going through the deprecation >>> process. >> I don't dispute that these are *the* rules, but my question was >> whether it is ok to break them in specific cases such as >> trace.rx_blank. If not, how can we deprecate trace.rx_blank which is >> a regex constant? >> >> Another specific case is token.main. See. 
> I would argue that the narrative documentation for the module is > normative for defining "public API", trumping even a pre-existing > '__all__'. Given that all non-private stdlib modules have such docs, > nobody should be relying on '__all__' as anything other than a convenience. > > Therefore, in the absence of an '__all__', adding one which conforms to > the docs should not require deprecations, as the set of applications / > modules which both use the undocumented names *and* do so via 'import *' > can be safely deemed "too small to worry about". I don't think this is generally sufficient given the not-infrequent occurrence of undocumented-but-used APIs in the standard library. Another example is re.Scanner. http://bugs.python.org/issue5337 Making the rules explicit and following a deprecation process seems like a sensible way forward to me. That still leaves Alexander's question open; how to handle module level constants that can't easily be formally deprecated. One possibility is using something similar to the twisted technique for deprecating module constants. That would mean adding code to the standard library to do this. I would say that if it seems unlikely that the constants are used in the wild, and google code search confirms this, then it is fine to skip the deprecation process. If there are known uses we should at least document the deprecation (and alias) for a release before removing. All the best, Michael > > Tres. 
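One way to picture the "deprecate a module-level constant" idea mentioned above is a module object that warns when the constant is read. The following is only a minimal sketch under stated assumptions: it is not Twisted's actual implementation and not anything the standard library provides, and the names _DeprecatingModule and demo_trace, as well as the constant's value, are hypothetical and chosen purely for illustration.

```python
import types
import warnings


class _DeprecatingModule(types.ModuleType):
    """Hypothetical module subclass that warns when a deprecated constant is read."""

    def __init__(self, name, deprecated):
        super().__init__(name)
        # Maps constant name -> (value, message). Kept under a private name so
        # that ordinary attribute lookup does not find the constants directly.
        self._deprecated = dict(deprecated)

    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. for the hidden constants.
        deprecated = self.__dict__.get("_deprecated", {})
        if name in deprecated:
            value, message = deprecated[name]
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return value
        raise AttributeError(
            "module %r has no attribute %r" % (self.__name__, name))


# Stand-in for a module with a stray public constant (illustrative value only).
demo = _DeprecatingModule(
    "demo_trace",
    {"rx_blank": (r"^\s*(#.*)?$", "demo_trace.rx_blank is deprecated")},
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    pattern = demo.rx_blank  # still works, but emits a DeprecationWarning

print(pattern, [str(w.message) for w in caught])
```

The constant stays readable for a release while every access is flagged, which is the property a documented deprecation period would need; whether the extra machinery is worth it for rarely used names is exactly the judgment call discussed in this thread.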
From fuzzyman at voidspace.org.uk Thu Nov 11 15:00:06 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 11 Nov 2010 14:00:06 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> Message-ID: <4CDBF6E6.3050201@voidspace.org.uk> On 11/11/2010 13:51, Alexander Belopolsky wrote: > On Thu, Nov 11, 2010 at 8:43 AM, Fred Drake wrote: > .. >> Since trace is documented and rx_blank isn't covered, I think it's >> pretty clear it was never intended as API.
I'd be fine with changing >> the visibility of rx_blank, and see no need to change its name. > While I obviously agree with your conclusion, your logic is not > perfect because trace documentation is *much* younger than the module. > How would you apply your reasoning to trace.find_strings()? It is > undocumented, its name is misleading, but it is used in the wild > according to google code search. I draw the line somewhere between > trace.rx_blank and trace.find_strings. I agree. Known / likely usage has to be the determining factor for poorly named but undocumented members. For functions though formal deprecation is easy - so it should only be an *issue* for constants. (And there the issue is not whether we can remove but how we do it.) Michael From ncoghlan at gmail.com Thu Nov 11 15:02:58 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 12 Nov 2010 00:02:58 +1000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> Message-ID: On Thu, Nov 11, 2010 at 11:39 PM, Tres Seaver wrote: > I would argue that the narrative documentation for the module is > normative for defining "public API", trumping even a pre-existing > '__all__'.
Given that all non-private stdlib modules have such docs, > nobody should be relying on '__all__' as anything other than a convenience. > > Therefore, in the absence of an '__all__', adding one which conforms to > the docs should not require deprecations, as the set of applications / > modules which both use the undocumented names *and* do so via 'import *' > can be safely deemed "too small to worry about". Except, as noted earlier in the thread, many Python programmers (and I count myself amongst this group) often use dir() and help() to find out what a module has available, and only resort to the written documentation if we get stuck. My personal opinion is that we should be trying to get the standard library to the point where __all__ definitions are unnecessary - if a name isn't in __all__, it should start with an underscore (and if that is true, then the __all__ definition becomes effectively redundant). That way, all sources of information (docs, dir(), help(), import *) give the same answer as to what constitutes the public API. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From lukasz at langa.pl Thu Nov 11 15:45:47 2010 From: lukasz at langa.pl (Łukasz Langa) Date: Thu, 11 Nov 2010 15:45:47 +0100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> Message-ID: <4CDC019B.4020406@langa.pl> On 08.11.2010 23:07, Raymond Hettinger wrote: > Some sense needs to be applied to the decision. Google's code search > is great for showing how people actually have used a module in real > world code. If that shows that people are accessing and/or changing an > attribute, it probably needs to remain exposed. In the absence of a > code search, good guesses can be made about what someone might > reasonably and usefully be accessing (i.e. glob0 isn't likely). Danger, Will Robinson!
I just tried to use that to determine if I could consider moving a module-wide constant in configparser to the parser instance (to enable customization). Search on code.google.com returned me four incompatible result sets within 30 minutes. One had only two entries whereas another had 7 pages of results. Search using www.google.com/codesearch found 3 pages of results different than the search on Google Code. The best part is that codesearch found some occurences on Google Code which Google Code's own search didn't. None of them returned sourceforge.net results whereas search on Koders.com found occurences only on SourceForge. The idea to use a code search engine is ingenious but the current tools are not yet reliable enough for the task. > For example, when the pprint rewrite is finally ready, if there is an incompatible API change, I expect that a new clean class will be offered, but that the old will be left in-place so that tons of existing code won't break). Unrelated but that's the way I'm doing it. From barry at python.org Thu Nov 11 16:01:25 2010 From: barry at python.org (Barry Warsaw) Date: Thu, 11 Nov 2010 10:01:25 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> Message-ID: <20101111100125.2ca443c1@mission> On Nov 11, 2010, at 12:41 AM, Alexander Belopolsky wrote: >Is it OK to add __all__ to such modules that does not include all >names not starting with an underscore? Is it OK to then remove names >that clearly were not intended to be public? I would say in general, yes. It's a good small modernization and stdlib improvement. However, this shouldn't be done as a bug fix to a stable release, and care must be taken to consider backward compatibility. IOW, if you really think it's a name that is not used publicly, or is usually only imported explicitly, then I think it's fine leaving it out of __all__. 
It's not a difficult change to work around. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From barry at python.org Thu Nov 11 16:05:16 2010 From: barry at python.org (Barry Warsaw) Date: Thu, 11 Nov 2010 10:05:16 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> Message-ID: <20101111100516.6e90aa41@mission> On Nov 12, 2010, at 12:02 AM, Nick Coghlan wrote: >My personal opinion is that we should be trying to get the standard >library to the point where __all__ definitions are unnecessary - if a >name isn't in __all__, it should start with an underscore (and if that >is true, then the __all__ definition becomes effectively redundant). Agreed, though I wouldn't *remove* __all__'s, I would establish a convention where they can be generated programmatically. Keeping __all__ in sync with the code is a PITA. It screams for automation. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From lukasz at langa.pl Thu Nov 11 16:17:07 2010 From: lukasz at langa.pl (=?UTF-8?B?xYF1a2FzeiBMYW5nYQ==?=) Date: Thu, 11 Nov 2010 16:17:07 +0100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <20101111100516.6e90aa41@mission> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> Message-ID: <4CDC08F3.6010501@langa.pl> Am 11.11.2010 16:05, schrieb Barry Warsaw: > Agreed, though I wouldn't *remove* __all__'s, I would establish a > convention > where they can be generated programmatically. Keeping __all__ in sync with > the code is a PITA. It screams for automation. 
You mean runtime automation, e.g. creating __all__ on the fly omitting underscored names? -- Best regards, Łukasz Langa From fuzzyman at voidspace.org.uk Thu Nov 11 16:18:40 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 11 Nov 2010 15:18:40 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CDC08F3.6010501@langa.pl> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> Message-ID: <4CDC0950.5040309@voidspace.org.uk> On 11/11/2010 15:17, Łukasz Langa wrote: > On 11.11.2010 16:05, Barry Warsaw wrote: >> Agreed, though I wouldn't *remove* __all__'s, I would establish a >> convention >> where they can be generated programmatically. Keeping __all__ in >> sync with >> the code is a PITA. It screams for automation. > > You mean runtime automation, e.g. creating __all__ on the fly omitting > underscored names? > Writing code to generate a __all__ that duplicates the default behaviour seems redundant to me. Michael
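The redundancy described above can be checked directly. This is a minimal sketch, assuming a throwaway module named demo_star built in memory (the module name, its contents, and the star_import helper are all hypothetical); it compares what "from module import *" binds with and without a generated __all__.

```python
import sys
import types

# Source for a throwaway module; every name in it is illustrative only.
SOURCE = '''
import os

CONSTANT = 1

def public_func():
    return "public"

def _private_func():
    return "private"
'''


def star_import(module_name):
    """Run `from <module_name> import *` and return the sorted bound names."""
    namespace = {}
    exec("from %s import *" % module_name, namespace)
    # exec() adds __builtins__ to the namespace; it is not part of the import.
    return sorted(name for name in namespace if name != "__builtins__")


demo = types.ModuleType("demo_star")
exec(SOURCE, demo.__dict__)
sys.modules["demo_star"] = demo

# Default behaviour: no __all__, so names without a leading underscore are bound.
default_names = star_import("demo_star")

# A generated __all__ that merely filters out underscored names...
demo.__all__ = [name for name in vars(demo) if not name.startswith("_")]
generated_names = star_import("demo_star")

# ...binds exactly the same names as having no __all__ at all.
print(default_names, default_names == generated_names)
```

Note that both variants also export the imported os module, which is one reason some authors in this thread prefer a hand-reviewed __all__ over either the default rule or an automatic filter.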
From solipsis at pitrou.net Thu Nov 11 16:32:31 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 11 Nov 2010 16:32:31 +0100 Subject: [Python-Dev] Breaking undocumented API References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> Message-ID: <20101111163231.304645e4@pitrou.net> On Thu, 11 Nov 2010 15:18:40 +0000 Michael Foord wrote: > On 11/11/2010 15:17, Łukasz Langa wrote: > > On 11.11.2010 16:05, Barry Warsaw wrote: > >> Agreed, though I wouldn't *remove* __all__'s, I would establish a > >> convention > >> where they can be generated programmatically. Keeping __all__ in > >> sync with > >> the code is a PITA. It screams for automation. > > > > You mean runtime automation, e.g. creating __all__ on the fly omitting > > underscored names? > > > Writing code to generate a __all__ that duplicates the default behaviour > seems redundant to me. Agreed with Michael. __all__ is useful mostly when you don't adhere to the convention that private APIs should have a leading underscore. Regards Antoine. From ocean-city at m2.ccsnet.ne.jp Thu Nov 11 17:07:28 2010 From: ocean-city at m2.ccsnet.ne.jp (Hirokazu Yamamoto) Date: Fri, 12 Nov 2010 01:07:28 +0900 Subject: [Python-Dev] Removal of Win32 ANSI API Message-ID: <4CDC14C0.6070300@m2.ccsnet.ne.jp> Hello. Is it possible to remove the Win32 ANSI API (e.g. GetFileAttributesA) and only use the Win32 WIDE API (e.g. GetFileAttributesW)? Mainly in posixmodule.c. I think we can simplify the code hugely. (This means dropping bytes support for os.stat etc. on Windows) # I recently did it for winsound.PlaySound with MvL's approval Thank you.
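For readers wondering what a single text-only code path could look like from the Python side, here is a minimal sketch. It assumes os.fsdecode and os.fsencode are available (Python 3.2 and later); the helpers to_text_path and file_size are hypothetical names for illustration and say nothing about how posixmodule.c is or should be written.

```python
import os
import tempfile


def to_text_path(path):
    """Return path as str, decoding bytes with the filesystem encoding."""
    if isinstance(path, bytes):
        return os.fsdecode(path)
    return path


def file_size(path):
    """Accept str or bytes, but only ever pass str to the OS-facing call."""
    return os.stat(to_text_path(path)).st_size


# Demo: the same file is reachable through either spelling of its name.
with tempfile.TemporaryDirectory() as tmp:
    name = os.path.join(tmp, "example.txt")
    with open(name, "w") as handle:
        handle.write("hello")
    sizes = (file_size(name), file_size(os.fsencode(name)))

print(sizes)
```

The design point is that bytes names are converted once at the boundary, so everything behind that boundary only needs the wide (text) variant of each call.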
From mail at timgolden.me.uk Thu Nov 11 17:10:35 2010 From: mail at timgolden.me.uk (Tim Golden) Date: Thu, 11 Nov 2010 16:10:35 +0000 Subject: [Python-Dev] Removal of Win32 ANSI API In-Reply-To: <4CDC14C0.6070300@m2.ccsnet.ne.jp> References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> Message-ID: <4CDC157B.6090406@timgolden.me.uk> On 11/11/2010 16:07, Hirokazu Yamamoto wrote: > Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA) > and only use Win32 WIDE API (ie: GetFileAttributesW)? > Mainly in posixmodule.c. > I think we can simplify the code hugely. (This means dropping bytes > support for os.stat etc on windows) > > # I recently did it for winsound.PlaySound with MvL's approval +1 from me TJG From eckhardt at satorlaser.com Thu Nov 11 17:18:08 2010 From: eckhardt at satorlaser.com (Ulrich Eckhardt) Date: Thu, 11 Nov 2010 17:18:08 +0100 Subject: [Python-Dev] Removal of Win32 ANSI API In-Reply-To: <4CDC14C0.6070300@m2.ccsnet.ne.jp> References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> Message-ID: <201011111718.08207.eckhardt@satorlaser.com> On Thursday 11 November 2010, Hirokazu Yamamoto wrote: > Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA) > and only use Win32 WIDE API (ie: GetFileAttributesW)? > Mainly in posixmodule.c. > I think we can simplify the code hugely. +1 MS Windows variants that only support the ANSI API (win9x) have been officially unsupported since 2.5 or 2.6. Further, this also eases porting to MS Windows CE, which I'd still like to see one day. > (This means dropping bytes support for os.stat etc on windows) I disagree that not using the ANSI win32 API means dropping bytes support for os.stat. I'd rather say that it means converting bytes at the earliest possible time and only using unicode internally. But I'm guessing a bit here, I haven't looked at the code for a while. > # I recently did it for winsound.PlaySound with MvL's approval Interesting, is there a ticket associated with this? Also, was that on Python 3 or 2?
Which commits? Uli From solipsis at pitrou.net Thu Nov 11 17:43:35 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 11 Nov 2010 17:43:35 +0100 Subject: [Python-Dev] Removal of Win32 ANSI API References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> <4CDC157B.6090406@timgolden.me.uk> Message-ID: <20101111174335.3173f67e@pitrou.net> On Thu, 11 Nov 2010 16:10:35 +0000 Tim Golden wrote: > On 11/11/2010 16:07, Hirokazu Yamamoto wrote: > > Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA) > > and only use Win32 WIDE API (ie: GetFileAttributesW)? > > Mainly in posixmodule.c. > > I think we can simplify the code hugely.
(This means dropping bytes > > support for os.stat etc on windows) > > > > # I recently did it for winsound.PlaySound with MvL's approval > > +1 from me How do you support cross-platform code using bytes filenames? IIRC, it has already been argued that it was an important feature. Many filesystem-related utilities might prefer to handle filenames in bytes form. ("winsound" is a Windows-specific module so that wasn't a concern obviously) Regards Antoine. From merwok at netwok.org Thu Nov 11 18:38:11 2010 From: merwok at netwok.org (Éric Araujo) Date: Thu, 11 Nov 2010 18:38:11 +0100 Subject: [Python-Dev] [Python-checkins] r86348 - in python/branches/py3k/Lib: test/test_xml_etree.py xml/etree/ElementTree.py In-Reply-To: <20101109234842.GA1068@rubuntu> References: <20101109023700.32DC1EEA06@mail.python.org> <4CD9A521.9030200@netwok.org> <20101109234842.GA1068@rubuntu> Message-ID: <4CDC2A03.1080404@netwok.org> >> Shouldn’t this include an entry in NEWS and maybe in ACKS? > It was a very simple bug fix (caused due to an overlook initially), so > did not add NEWS/ACKS. For features, larger fixes or complete patches, > I then add NEWS and ACKS as appropriate. Thanks for the reply. Now I’m unsure about the rules for adding NEWS entries: some bugs are important but have a very simple fix (see #1718574 for an example). I guess I’ll just always add an entry :) Brett, maybe this is something to cover in the dev docs.
make-patchcheck-ly yours From brett at python.org Thu Nov 11 18:56:11 2010 From: brett at python.org (Brett Cannon) Date: Thu, 11 Nov 2010 09:56:11 -0800 Subject: [Python-Dev] [Python-checkins] r86348 - in python/branches/py3k/Lib: test/test_xml_etree.py xml/etree/ElementTree.py In-Reply-To: <4CDC2A03.1080404@netwok.org> References: <20101109023700.32DC1EEA06@mail.python.org> <4CD9A521.9030200@netwok.org> <20101109234842.GA1068@rubuntu> <4CDC2A03.1080404@netwok.org> Message-ID: On Thu, Nov 11, 2010 at 09:38, Éric Araujo wrote: >>> Shouldn’t this include an entry in NEWS and maybe in ACKS? >> It was a very simple bug fix (caused due to an overlook initially), so >> did not add NEWS/ACKS. For features, larger fixes or complete patches, >> I the add NEWS and ACKS as appropriate. > > Thanks for the reply. Now I’m unsure about the rules for adding NEWS > entries: some bugs are important but have a very simple fix (see > #1718574 for an example). I guess I’ll just always add an entry :) > > Brett, maybe this is something to cover in the dev docs. I just follow Guido's own personal rule: if the fix required thought they should go into Misc/ACKS. From alexander.belopolsky at gmail.com Thu Nov 11 19:01:10 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 11 Nov 2010 13:01:10 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CDC0950.5040309@voidspace.org.uk> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> Message-ID: 2010/11/11 Michael Foord : .. >> You mean runtime automation, e.g. creating __all__ on the fly omitting >> underscored names? >> > Writing code to generate a __all__ that duplicates the default behaviour > seems redundant to me. > FWIW, I like having __all__ at the top of the module.
It feels like a table of contents at the start of a chapter. In some cases it may also serve as an optimization when len(__all__) is much smaller than len(__dict__). I also don't like the _ prefix becoming the exclusive means of expressing privateness. I think the current definition of "public names" is a good one and just needs to be made more visible in the docs. If the module defines __all__, that should be the ultimate answer to what is public in that module. (Users should learn to use help(module) instead of dir(module) for API discovery.) If __all__ is not defined in the module, I think it is good to introduce it after a careful review of what it should contain. And __all__ should never contain names that start with _. From merwok at netwok.org Thu Nov 11 19:10:43 2010 From: merwok at netwok.org (Éric Araujo) Date: Thu, 11 Nov 2010 19:10:43 +0100 Subject: [Python-Dev] [Python-checkins] r86348 - in python/branches/py3k/Lib: test/test_xml_etree.py xml/etree/ElementTree.py In-Reply-To: References: <20101109023700.32DC1EEA06@mail.python.org> <4CD9A521.9030200@netwok.org> <20101109234842.GA1068@rubuntu> <4CDC2A03.1080404@netwok.org> Message-ID: <4CDC31A3.1020306@netwok.org> > I just follow Guido's own personal rule: if the fix required thought > they should go into Misc/ACKS. Okay. Same rule for NEWS? From brett at python.org Thu Nov 11 19:16:04 2010 From: brett at python.org (Brett Cannon) Date: Thu, 11 Nov 2010 10:16:04 -0800 Subject: [Python-Dev] [Python-checkins] r86348 - in python/branches/py3k/Lib: test/test_xml_etree.py xml/etree/ElementTree.py In-Reply-To: <4CDC31A3.1020306@netwok.org> References: <20101109023700.32DC1EEA06@mail.python.org> <4CD9A521.9030200@netwok.org> <20101109234842.GA1068@rubuntu> <4CDC2A03.1080404@netwok.org> <4CDC31A3.1020306@netwok.org> Message-ID: On Thu, Nov 11, 2010 at 10:10, Éric Araujo wrote: >> I just follow Guido's own personal rule: if the fix required thought >> they should go into Misc/ACKS. > > Okay.
Same rule for NEWS? > > I do a NEWS entry if a bug was fixed or semantics changed/added for anything public (e.g., I don't do an entry for every little clarification in the docs or new tests fixed or written). From steve at pearwood.info Thu Nov 11 19:16:16 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 12 Nov 2010 05:16:16 +1100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> Message-ID: <4CDC32F0.3010500@pearwood.info> Nick Coghlan wrote: > My personal opinion is that we should be trying to get the standard > library to the point where __all__ definitions are unnecessary - if a > name isn't in __all__, it should start with an underscore (and if that > is true, then the __all__ definition becomes effectively redundant). You don't *need* to define __all__ -- if you don't, import * will import everything that doesn't start with a leading underscore. __all__ is only useful when you want more control over what is or isn't imported. If you don't need that control, just don't define __all__, and the problem is solved. > That way, all sources of information (docs, dir(), help(), import *) > give the same answer as to what constitutes the public API. I disagree with the underlying assumption that import * need necessarily import the entire public API. That's not how I use it in my modules, and the option should be available to std library modules as well. When I create a module, I distinguish between three categories of functions: * private, which start with an underscore; * the core public API, which is listed in __all__; and * support/helper functions, which are not part of the core functionality of the module but are public. If you import * you will get just the core functions. If you want the support functions, you need to use the fully qualified module.name, or otherwise import them yourself.
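The three-tier convention described above can be sketched with a small in-memory module. This is only an illustration under stated assumptions: the module name stats_demo and the functions mean, count and _checked_sum are hypothetical and do not come from any real library.

```python
import sys
import types

# Source for a throwaway module laid out along the three categories.
SOURCE = '''
__all__ = ["mean"]  # core public API: what `import *` exposes

def mean(data):
    """Core public function, listed in __all__."""
    return _checked_sum(data) / count(data)

def count(data):
    """Public helper: usable as stats_demo.count, but not star-imported."""
    return len(data)

def _checked_sum(data):
    """Private: leading underscore, no stability promise."""
    if not data:
        raise ValueError("empty data")
    return sum(data)
'''

stats_demo = types.ModuleType("stats_demo")
exec(SOURCE, stats_demo.__dict__)
sys.modules["stats_demo"] = stats_demo

# `import *` binds only the core API...
namespace = {}
exec("from stats_demo import *", namespace)
star_names = sorted(name for name in namespace if name != "__builtins__")

# ...while the helper stays reachable through the qualified name.
helper_result = stats_demo.count([1, 2, 3])
core_result = namespace["mean"]([1, 2, 3])

print(star_names, helper_result, core_result)
```

Under this layout dir() and help() still show count, so the convention relies on documentation to signal which public names are considered core and which are merely available.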
This division of public functions into first and second class API functions is a deliberate design choice on my part. I expect the core functionality to be fully documented. Helper and support functions may not be -- there should be some docs, but doing so is a lower priority. The support functions are public, and available for use, if you go looking for them, but I neither encourage nor discourage users from doing so. I don't see any reason that the standard library should not be permitted to use the same convention. Another couple of objections to getting rid of __all__: If you're proxying modules or built-ins, you may not be able to use a _private name, but you may not want import * to pick up your proxies. I find it annoying to see this: import module as _module _module.func() (instead of import module and merely leaving module out of __all__) I accept that some standard library authors may choose this convention, but I don't want to see it become mandatory. -- Steven From alexander.belopolsky at gmail.com Thu Nov 11 19:40:36 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 11 Nov 2010 13:40:36 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CDBDB0C.6080703@voidspace.org.uk> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> Message-ID: On Thu, Nov 11, 2010 at 7:01 AM, Michael Foord wrote: .. >> I don't understand why everyone seem to have accepted Michael's >> premise that "we don't have a clearly stated policy for what defines >> the public API of standard library modules." ?We do have such a policy >> and it is well known (while the location in the reference manual may >> not be): > > Ha. 14 paragraphs into the grammar reference on the import statement is > perhaps not where developers would go to look for Python standard library > development policy.. Very true. 
To make it slightly more visible, any objections to the following patch? (It adds "public names (in module globals)" linking to that 14-th paragraph in the index.) Index: Doc/reference/simple_stmts.rst =================================================================== --- Doc/reference/simple_stmts.rst (revision 86409) +++ Doc/reference/simple_stmts.rst (working copy) @@ -794,6 +794,7 @@ namespace of the :keyword:`import` statement.. .. index:: single: __all__ (optional module attribute) +.. index:: public names (in module globals) The *public names* defined by a module are determined by checking the module's namespace for a variable named ``__all__``; if defined, it must be a sequence of From solipsis at pitrou.net Thu Nov 11 19:47:34 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 11 Nov 2010 19:47:34 +0100 Subject: [Python-Dev] Breaking undocumented API References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> Message-ID: <20101111194734.78fb3846@pitrou.net> On Thu, 11 Nov 2010 13:40:36 -0500 Alexander Belopolsky wrote: > On Thu, Nov 11, 2010 at 7:01 AM, Michael Foord > wrote: > .. > >> I don't understand why everyone seem to have accepted Michael's > >> premise that "we don't have a clearly stated policy for what defines > >> the public API of standard library modules." ?We do have such a policy > >> and it is well known (while the location in the reference manual may > >> not be): > > > > Ha. 14 paragraphs into the grammar reference on the import statement is > > perhaps not where developers would go to look for Python standard library > > development policy.. > > Very true. To make it slightly more visible, any objections to the > following patch? (It adds "public names (in module globals)" linking > to that 14-th paragraph in the index.) 
I think what Michael meant is that the language grammar reference is not (and shouldn't be) the authority on stdlib development policy. To which I would agree. Regards Antoine. From victor.stinner at haypocalc.com Thu Nov 11 20:26:24 2010 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Thu, 11 Nov 2010 20:26:24 +0100 Subject: [Python-Dev] Removal of Win32 ANSI API In-Reply-To: <4CDC14C0.6070300@m2.ccsnet.ne.jp> References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> Message-ID: <201011112026.24445.victor.stinner@haypocalc.com> On Thursday 11 November 2010 17:07:28 Hirokazu Yamamoto wrote: > Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA) > and only use Win32 WIDE API (ie: GetFileAttributesW)? > Mainly in posixmodule.c. Even if I hate the MBCS encoding, because it replaces undecodable characters by similar glyphs by default, I'm not certain that it is a good idea to drop the bytes API. Can it be a problem to port programs from Python2 to Python3? Do major Python2 programs/libraries rely on the bytes API? > I think we can simplify the code hugely. (This means droping bytes > support for os.stat etc on windows) Sure, it will divide the number of lines, of the code specific to Windows, by two. Victor From martin at v.loewis.de Thu Nov 11 20:44:52 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 11 Nov 2010 20:44:52 +0100 Subject: [Python-Dev] Removal of Win32 ANSI API In-Reply-To: <20101111174335.3173f67e@pitrou.net> References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> <4CDC157B.6090406@timgolden.me.uk> <20101111174335.3173f67e@pitrou.net> Message-ID: <4CDC47B4.5080200@v.loewis.de> > How do you support cross-platform code using bytes filenames? > IIRC, it has already been argued that it was an important feature. Many > filesystem-related utilities might prefer to handle filenames in bytes > form. It would be a policy decision. 
However, I think it is hearsay that
filesystem-related utilities might prefer byte file names. On Windows,
some files are inaccessible if you constrain yourself to byte filenames,
so once people learn about this limitation, I expect them to switch to
Unicode filenames on Windows - for the same reason they use byte
filenames on Unix (i.e. to be able to access all files correctly).

Regards,
Martin

From martin at v.loewis.de Thu Nov 11 20:50:35 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 11 Nov 2010 20:50:35 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <201011112026.24445.victor.stinner@haypocalc.com>
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp>
 <201011112026.24445.victor.stinner@haypocalc.com>
Message-ID: <4CDC490B.9060809@v.loewis.de>

> Even if I hate the MBCS encoding, because it replaces undecodable characters
> by similar glyphs by default, I'm not certain that it is a good idea to drop
> the bytes API. Can it be a problem to port programs from Python2 to Python3?
> Do major Python2 programs/libraries rely on the bytes API?

I don't actually know for a fact, but I expect that the answer is "no".

The question is: where do file names typically come from? My guess
is that they come from
a) hard-coded strings in the source code
b) command line arguments/environment variables
c) directory listings
[of course, there are other ways, like GUI input, getcwd(), etc]

In case a), you have filenames such as ".", e.g. as a parameter to
listdir or walk. These will typically be regular strings in Python 2,
which become Unicode strings in 3. You would actively need to put b""
prefixes into the code.

In case b), they will be Unicode strings in Python 3.

In case c), they will be Unicode strings if the argument is a Unicode
string. So by induction, file names will be typically unicode. The
exception will be libraries/applications which make deliberate attempts
to get byte-oriented file names.
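Case c) can be checked with a few lines on a current Python 3. This is only a sketch: the temporary directory and the file name `example.txt` are invented for the example, and nothing here is specific to Windows.

```python
import os
import tempfile


def list_both_ways():
    """List the same directory with a str path and with a bytes path."""
    with tempfile.TemporaryDirectory() as tmp:
        # Create one file so that the listing is not empty.
        with open(os.path.join(tmp, "example.txt"), "w"):
            pass
        text_names = os.listdir(tmp)                # str in, str out
        byte_names = os.listdir(os.fsencode(tmp))   # bytes in, bytes out
    return text_names, byte_names


text_names, byte_names = list_both_ways()
print(text_names)   # ['example.txt']
print(byte_names)   # [b'example.txt']
```

The result type follows the argument type, so code that never passes bytes explicitly never receives bytes file names from a listing.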
Regards,
Martin

From solipsis at pitrou.net Thu Nov 11 21:02:43 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 11 Nov 2010 21:02:43 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <4CDC47B4.5080200@v.loewis.de>
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp>
 <4CDC157B.6090406@timgolden.me.uk>
 <20101111174335.3173f67e@pitrou.net>
 <4CDC47B4.5080200@v.loewis.de>
Message-ID: <20101111210243.264ccfb7@pitrou.net>

On Thu, 11 Nov 2010 20:44:52 +0100
"Martin v. Löwis" wrote:
> > How do you support cross-platform code using bytes filenames?
> > IIRC, it has already been argued that it was an important feature. Many
> > filesystem-related utilities might prefer to handle filenames in bytes
> > form.
>
> It would be a policy decision. However, I think it is hearsay that
> filesystem-related utilities might prefer byte file names.

One possible situation is when you receive filenames in bytes form from
an external API or tool (or even the contents of a file). If you don't
know the encoding, keeping the bytes form is obviously recommended.

I don't know how often this happens.

Regards

Antoine.

From eric at trueblade.com Thu Nov 11 21:44:23 2010
From: eric at trueblade.com (Eric Smith)
Date: Thu, 11 Nov 2010 15:44:23 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com>
 <87fwv9g6li.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4CDC55A7.5000001@trueblade.com>

On 11/10/2010 11:58 AM, Tres Seaver wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 11/09/2010 11:12 PM, Stephen J. Turnbull wrote:
>> Nick Coghlan writes:
>>
>> > > Module writers who compound the error by expecting to be imported
>> > > this way, thereby bogarting the global namespace for their own
>> > > purposes, should be fish-slapped. ;)
>> >
>> > Be prepared to fish-slap all of python-dev then - we use precisely
>> > this technique to support optional acceleration modules. The pure
>> > Python versions of pairs like profile/_profile and heapq/_heapq
>> > include a try/except block at the end that does the equivalent of:
>> >
>> >   try:
>> >       from _accelerated import * # Allow accelerated overrides
>> >   except ImportError:
>> >       pass # Use pure Python versions
>>
>> But these identifiers will appear at the module level, not global, no?
>> Otherwise this technique couldn't be used. I don't really understand
>> what Tres is talking about when he writes "modules that expect to be
>> imported this way". The *imported* module shouldn't care, no? This
>> is an issue for the *importing* code to deal with.
>
> Right -- "private" star imports aren't the issue for me, because the
> same user who creates them is responsible for the other end of the
> stick. I was ranting about library authors who document star imports as
> the expected usage pattern for their external users.
>
> Note that I still wouldn't use star imports in the "private
> acceleration" case myself. I would prefer a pattern like:
>
> - ----------------------- $< -----------------------------
> # spam.py
>
> # Pure python API implementation
> def foo(spat, blarg):
>     ...
>
> def bar(qux):
>     ...
>
> # Replace with accelerated C implementation
> try:
>     import _spam
> except ImportError:
>     pass # accelerated version not available
> else:
>     foo = _spam.foo
>     bar = _spam.bar
> - ----------------------- $< -----------------------------
>
> This explicit name remapping catches unintentional errors (e.g., _spam
> renames a method) better than the star import.

But then you're saying that all implementations of _spam have to support
the same API. What if CPython's _spam has foo, bar, and baz, but
Jython's only has foo and bar, and IronPython's only has baz? Without
getting into special casing or lots of try/except blocks on individual
names, I think import * is the best way to go.

Eric.
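The point about differing accelerator APIs can be illustrated with a small, self-contained sketch. The module names `spam_demo` and `_spam_demo` are made up, and both modules are created in memory rather than being real C extensions; the accelerator here deliberately provides only `foo`.

```python
import sys
import types


def make_module(name, source):
    """Build a module from source text and register it for import."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


# Hypothetical accelerator that offers only a subset of the API.
make_module("_spam_demo", '''
def foo():
    return "accelerated foo"
''')

# Hypothetical pure Python module using the star-import override.
spam = make_module("spam_demo", '''
def foo():
    return "pure foo"

def bar():
    return "pure bar"

try:
    from _spam_demo import *   # take whatever the accelerator offers
except ImportError:
    pass                       # no accelerator: keep the pure versions
''')

print(spam.foo())   # accelerated foo
print(spam.bar())   # pure bar
```

With the star import, a name missing from the accelerator simply keeps its pure Python definition. The explicit `foo = _spam.foo` style would instead raise AttributeError for a missing name, which is the trade-off the two messages are arguing about.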
From ncoghlan at gmail.com Thu Nov 11 23:01:32 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 12 Nov 2010 08:01:32 +1000 Subject: [Python-Dev] Removal of Win32 ANSI API In-Reply-To: <201011112026.24445.victor.stinner@haypocalc.com> References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> <201011112026.24445.victor.stinner@haypocalc.com> Message-ID: On Fri, Nov 12, 2010 at 5:26 AM, Victor Stinner wrote: > On Thursday 11 November 2010 17:07:28 Hirokazu Yamamoto wrote: >> Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA) >> and only use Win32 WIDE API (ie: GetFileAttributesW)? >> Mainly in posixmodule.c. > > Even if I hate the MBCS encoding, because it replaces undecodable characters > by similar glyphs by default, I'm not certain that it is a good idea to drop > the bytes API. Can it be a problem to port programs from Python2 to Python3? > Do major Python2 programs/libraries rely on the bytes API? > >> I think we can simplify the code hugely. (This means droping bytes >> support for os.stat etc on windows) > > Sure, it will divide the number of lines, of the code specific to Windows, by > two. Can we get most of the code cleanup benefit without the backwards compatibility risk by doing the decode from 'mbcs' on our side of the fence? That is, have code that was the C equivalent of: arg_is_bytes = not isinstance(arg, str) if arg_is_bytes: val = _decode_mbcs(arg) # Decoding error checking here else: val = arg # Common processing using WIDE API if arg_is_bytes: result = _encode_mbcs(wide_result) # Encoding error checking here else: result = wide_result Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From ncoghlan at gmail.com Thu Nov 11 23:15:36 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 12 Nov 2010 08:15:36 +1000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CDC32F0.3010500@pearwood.info> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <4CDC32F0.3010500@pearwood.info> Message-ID: On Fri, Nov 12, 2010 at 4:16 AM, Steven D'Aprano wrote: > Another couple of objections to getting rid of __all__: > > If you're proxying modules or built-ins, you may not be able to use a > _private name, but you may not want import * to pick up your proxies. > > I find it annoying to see this: > > import module as _module > _module.func() > > (instead of import module and merely leaving module out of __all__) That gets us back to dir() and help() giving the wrong impression of the module's public API though. The issue I have is that the current policy (public APIs may or may not be in all, private APIs may or may not be prefixed by a leading underscore) makes it impossible to reliably extract a module's public API programmatically. If we instead adopt the explicit policy that private APIs are: - imported modules (with the exception of os.path) - any names starting with a leading underscore Then we get the 3 API tiers you describe: core public API in __all__, other public functions and globals without leading underscores, private API with leading underscores (or imported modules). We could even add two additional functions to the inspect module (e.g. getpublicnames() and getimportstarnames()) which applied the relevant filtering rules. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From greg.ewing at canterbury.ac.nz Thu Nov 11 23:24:49 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 12 Nov 2010 11:24:49 +1300 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> Message-ID: <4CDC6D31.2040809@canterbury.ac.nz> Nick Coghlan wrote: > My personal opinion is that we should be trying to get the standard > library to the point where __all__ definitions are unnecessary - if a > name isn't in __all__, it should start with an underscore (and if that > is true, then the __all__ definition becomes effectively redundant). What about names imported from other modules that are used by the module, but not intended for re-export? How would you prevent them from turning up in help() etc. without using __all__? -- Greg From nd at perlig.de Fri Nov 12 08:51:58 2010 From: nd at perlig.de (=?iso-8859-1?q?Andr=E9_Malo?=) Date: Fri, 12 Nov 2010 08:51:58 +0100 Subject: [Python-Dev] Removal of Win32 ANSI API In-Reply-To: <4CDC490B.9060809@v.loewis.de> References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> <201011112026.24445.victor.stinner@haypocalc.com> <4CDC490B.9060809@v.loewis.de> Message-ID: <201011120851.58615.nd@perlig.de> On Thursday 11 November 2010 20:50:35 Martin v. L?wis wrote: > > Even if I hate the MBCS encoding, because it replaces undecodable > > characters by similar glyphs by default, I'm not certain that it is a > > good idea to drop the bytes API. Can it be a problem to port programs > > from Python2 to Python3? Do major Python2 programs/libraries rely on the > > bytes API? > > I don't actually know for a fact, but I expect that the answer is "no". > > The questions is: where do file names typically come from? My guess > is that they come from > a) hard-coded strings in the source code > b) command line arguments/environment variables [...] 
> In case b), they will be Unicode strings in Python 3.

But not necessarily with unicode semantics if I get the discussions about the
environment topic right.

Additionally:

d) Over a socket (like the HTTP protocol) -> Bytes.

nd

From p.f.moore at gmail.com Fri Nov 12 09:44:03 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 12 Nov 2010 08:44:03 +0000
Subject: [Python-Dev] Issues 9931 and 9055 - test_ttk_guionly and buildbot
 run as a service
Message-ID:

Hi,
My buildbot has been failing for some time because of these 2 issues,
both related to the fact that tests are hanging when run as a service
(and hence have no display to open GUI elements on).

Both issues have patches, and as far as I am aware, the patches fix
the issues reasonably well. What can I do to move these 2 issues
forwards? As things stand, my buildbot is not providing a lot of value
on the 3.x branch :-(

Thanks,
Paul.

From martin at v.loewis.de Fri Nov 12 09:51:19 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 12 Nov 2010 09:51:19 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <201011120851.58615.nd@perlig.de>
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp>
 <201011112026.24445.victor.stinner@haypocalc.com>
 <4CDC490B.9060809@v.loewis.de>
 <201011120851.58615.nd@perlig.de>
Message-ID: <4CDD0007.7060201@v.loewis.de>

> Additionally:
>
> d) Over a socket (like the HTTP protocol) -> Bytes.

Sure. However, you can't really expect that the bytes you receive
over the socket are a meaningful filename on your local Windows
installation. So it would be a bug in the application to not decode
the bytes that you receive before using them as a file name.

In a well-specified network protocol, you would know the encoding
of the bytes; the IETF recommends using UTF-8 for all new protocols.
Using a UTF-8 string as a filename on Windows will create mojibake.
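The mojibake effect is easy to reproduce without touching the file system. The sketch below assumes cp1252 as the ANSI code page, which is only one possibility (the real one depends on the Windows locale), and the file name is invented.

```python
# Bytes as they might arrive over the wire in a UTF-8 based protocol.
name = "r\u00e9sum\u00e9.txt"          # "résumé.txt"
wire_bytes = name.encode("utf-8")

# What an ANSI (bytes) API would effectively do: interpret the bytes
# in the local code page. cp1252 is an assumption for this example.
wrong = wire_bytes.decode("cp1252")

# What the application should do: decode with the protocol's encoding
# and pass the resulting str to the wide (Unicode) API.
right = wire_bytes.decode("utf-8")

print(wrong)   # rÃ©sumÃ©.txt
print(right)   # résumé.txt
```

Each two-byte UTF-8 sequence for "é" turns into two unrelated cp1252 characters, so the file would be created under a garbled name.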
Regards,
Martin

From martin at v.loewis.de Fri Nov 12 10:29:31 2010
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 12 Nov 2010 10:29:31 +0100
Subject: [Python-Dev] buildbot master update
Message-ID: <4CDD08FB.3070701@v.loewis.de>

As you may have noticed: I updated the buildbot master to release 0.8.2.
If you notice any problems, please post them here.

Slave operators can upgrade their installations at their own pace;
buildbot is highly backwards compatible. As a recommendation, I suggest
that slaves run at least at the version that is available in Debian
stable (currently 0.7.8).

Regards,
Martin

From martin at v.loewis.de Fri Nov 12 10:32:46 2010
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 12 Nov 2010 10:32:46 +0100
Subject: [Python-Dev] bugs.python.org migration complete
Message-ID: <4CDD09BE.8090106@v.loewis.de>

bugs.python.org is now on the new hardware. There have been some
problems in the migration: the old hardware started failing before
the scheduled migration date, so the migration was done early, causing
an outage for some people who then still had the old address in their
DNS caches. In addition, there was initially a misconfiguration
preventing outgoing IP traffic, particularly preventing outgoing emails
from being delivered.

This is all fixed now; report any remaining issues to the metatracker.
Regards, Martin From hrvoje.niksic at avl.com Fri Nov 12 10:49:48 2010 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Fri, 12 Nov 2010 10:49:48 +0100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CDC6D31.2040809@canterbury.ac.nz> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <4CDC6D31.2040809@canterbury.ac.nz> Message-ID: <4CDD0DBC.4050405@avl.com> On 11/11/2010 11:24 PM, Greg Ewing wrote: > Nick Coghlan wrote: > >> My personal opinion is that we should be trying to get the standard >> library to the point where __all__ definitions are unnecessary - if a >> name isn't in __all__, it should start with an underscore (and if that >> is true, then the __all__ definition becomes effectively redundant). > > What about names imported from other modules that are used by > the module, but not intended for re-export? How would you > prevent them from turning up in help() etc. without using > __all__? import foo as _foo I believe I am not the only one who finds that practice ugly, but I find it just as ugly to underscore-ize every non-public helper function. __all__ is there for a reason, let's use it. Maybe help() could automatically ignore stuff not in __all__, or display it but warn the user of non-public identifiers? 
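As a rough sketch of the kind of filtering being suggested, the function below sorts a module's names into the three tiers discussed earlier in the thread (in `__all__`, other public, private). `classify_names` and the demo module are hypothetical and exist only for this example; they are not part of `inspect`, `help()` or any stdlib API, and the os.path exception mentioned in the thread is ignored.

```python
import types


def classify_names(module):
    """Split a module's globals into core / public / private tiers."""
    exported = set(getattr(module, "__all__", ()))
    tiers = {"core": [], "public": [], "private": []}
    for name, value in sorted(vars(module).items()):
        if name.startswith("__") and name.endswith("__"):
            continue                      # skip __name__, __doc__, ...
        if name in exported:
            tiers["core"].append(name)    # listed in __all__
        elif name.startswith("_") or isinstance(value, types.ModuleType):
            tiers["private"].append(name) # underscore or imported module
        else:
            tiers["public"].append(name)  # public, but not in __all__
    return tiers


# Hypothetical module, built in memory for the demonstration.
demo = types.ModuleType("demo")
exec('''
import os

__all__ = ["parse"]

def parse(text):
    return text.split()

def tokenize(text):
    return list(text)

def _helper():
    pass
''', demo.__dict__)

tiers = classify_names(demo)
print(tiers["core"])      # ['parse']
print(tiers["public"])    # ['tokenize']
print(tiers["private"])   # ['_helper', 'os']
```

A help()-like tool could show the first tier by default and flag or hide the others, without requiring `import os as _os` in the module itself.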
From lukasz at langa.pl Fri Nov 12 11:34:01 2010 From: lukasz at langa.pl (=?UTF-8?B?xYF1a2FzeiBMYW5nYQ==?=) Date: Fri, 12 Nov 2010 11:34:01 +0100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <4CDC32F0.3010500@pearwood.info> Message-ID: <4CDD1819.4020306@langa.pl> Am 11.11.2010 23:15, schrieb Nick Coghlan: > If we instead adopt the explicit policy that private APIs are: > - imported modules (with the exception of os.path) > - any names starting with a leading underscore > > Then we get the 3 API tiers you describe: core public API in __all__, > other public functions and globals without leading underscores, > private API with leading underscores (or imported modules). +1 I like this approach *very much*. Let me elaborate: 1. The community knows, understands and accepts _names as private. We need to have _names for private functions and constants because we can change or remove those in later versions. It's very explicit: when the user complains "What, you removed _foo?" we can say "Yes, it was considered an implementation detail *from the start*." And it's hard to beat that argument. It was private from the start. You knew that because the name you called specifies that. If we would be now to proclaim __all__ as a decisive point on what's private and what's not, it makes lives of all Python programmers (I mean the users as well) more complicated. 2. That being said, having help() mark non-underscored names which aren't included in __all__ as private is a good idea, too [1]. I'm a heavy user of interactive API discovery using dir() and help() and this would be definitely welcome. And even for those who don't use those tools, this feature can expose inconsistencies between documentation and code. 3. "import name as _name" or "from x.y import z as _z" is just bad form. 
There may be valid exceptions but imagine if that were the default
way to do it. Uglier than nights of November.

4. This is why I think considering all imports as private (unless
they're in __all__) is a fine example of "practicality beats purity".
We could try to conceive a way to expose this information
programmatically but that's not so important at the moment.

[1] As Hrvoje Niksic wrote here:
http://mail.python.org/pipermail/python-dev/2010-November/105533.html

--
Best regards,
Łukasz Langa

From fdrake at acm.org Fri Nov 12 12:23:31 2010
From: fdrake at acm.org (Fred Drake)
Date: Fri, 12 Nov 2010 06:23:31 -0500
Subject: [Python-Dev] [Python-checkins] r86429 -
 python/branches/py3k/Doc/tools/sphinxext/pyspecific.py
In-Reply-To: <20101112085712.F3D23EEA2D@mail.python.org>
References: <20101112085712.F3D23EEA2D@mail.python.org>
Message-ID:

On Fri, Nov 12, 2010 at 3:57 AM, georg.brandl wrote in a commit:
> Add a deprecated-removed directive that allows to give the version of removal for deprecations.

This sounds pretty general-purpose rather than Python-specific. Any
chance this will move into Sphinx? I know a few projects that like to
deprecate things and would use this. :-)

  -Fred

--
Fred L. Drake, Jr.
"A storm broke loose in my mind." --Albert Einstein

From victor.stinner at haypocalc.com Fri Nov 12 13:08:30 2010
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Fri, 12 Nov 2010 13:08:30 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To:
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp>
 <201011112026.24445.victor.stinner@haypocalc.com>
Message-ID: <201011121308.30368.victor.stinner@haypocalc.com>

On Thursday 11 November 2010 23:01:32 you wrote:
> > Sure, it will divide the number of lines, of the code specific to
> > Windows, by two.
>
> Can we get most of the code cleanup benefit without the backwards
> compatibility risk by doing the decode from 'mbcs' on our side of the
> fence?
I created PyUnicode_FSDecoder, a ParseTuple converter used to work on unicode
paths, instead of bytes paths. On Windows, this converter uses mbcs encoding
in strict mode, whereas the Windows converter uses the replace error handler
to decode, and ignore to encode. So I don't think that we should use this
converter on Windows.

> That is, have code that was the C equivalent of:
>
>   arg_is_bytes = not isinstance(arg, str)
>   if arg_is_bytes:
>       val = _decode_mbcs(arg)
>       # Decoding error checking here
>   else:
>       val = arg
>   # Common processing using WIDE API
>   if arg_is_bytes:
>       result = _encode_mbcs(wide_result)
>       # Encoding error checking here
>   else:
>       result = wide_result

This doesn't make the code shorter, it may be longer than the current code,
and it is less compliant with the Windows native API...

Victor

From victor.stinner at haypocalc.com Fri Nov 12 13:13:08 2010
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Fri, 12 Nov 2010 13:13:08 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <20101111210243.264ccfb7@pitrou.net>
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp>
 <4CDC47B4.5080200@v.loewis.de>
 <20101111210243.264ccfb7@pitrou.net>
Message-ID: <201011121313.08741.victor.stinner@haypocalc.com>

On Thursday 11 November 2010 21:02:43 Antoine Pitrou wrote:
> On Thu, 11 Nov 2010 20:44:52 +0100
>
> "Martin v. Löwis" wrote:
> > > How do you support cross-platform code using bytes filenames?
> > > IIRC, it has already been argued that it was an important feature. Many
> > > filesystem-related utilities might prefer to handle filenames in bytes
> > > form.
> >
> > It would be a policy decision. However, I think it is hearsay that
> > filesystem-related utilities might prefer byte file names.
>
> One possible situation is when you receive filenames in bytes form from
> an external API or tool (or even the contents of a file). If you don't
> know the encoding, keeping the bytes form is obviously recommended.
I disagree with you: the filename stored in the binary content/network stream may be encoded with a different code page than the current Windows code page. The application have to decode the filename itself, the application has more information about the right encoding than Windows. Examples: - MKV video stores filenames in utf-8 - ZIP stores filenames in cp437 or utf-8 - tar stores filenames... in the locale encoding (except for PAX format which uses utf-8) - etc. Victor From victor.stinner at haypocalc.com Fri Nov 12 13:15:35 2010 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 12 Nov 2010 13:15:35 +0100 Subject: [Python-Dev] Removal of Win32 ANSI API In-Reply-To: <4CDC490B.9060809@v.loewis.de> References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> <201011112026.24445.victor.stinner@haypocalc.com> <4CDC490B.9060809@v.loewis.de> Message-ID: <201011121315.35541.victor.stinner@haypocalc.com> On Thursday 11 November 2010 20:50:35 you wrote: > > Even if I hate the MBCS encoding, because it replaces undecodable > > characters by similar glyphs by default, I'm not certain that it is a > > good idea to drop the bytes API. Can it be a problem to port programs > > from Python2 to Python3? Do major Python2 programs/libraries rely on the > > bytes API? > > I don't actually know for a fact, but I expect that the answer is "no". > > The questions is: where do file names typically come from? My guess > is that they come from > a) hard-coded strings in the source code > b) command line arguments/environment variables > c) directory listings > [of course, there are other ways, like GUI input, getcwd(), etc] > > In case a), you have filenames such as ".", e.g. as a parameter to > listdir or walk. These will typically be regular strings in Python 2, > which become Unicode strings in 3. You would actively need to put b"" > prefixes into the code. > > In case b), they will be Unicode strings in Python 3. 
>
> In case c), they will be Unicode strings if the argument is a Unicode
> string. So by induction, file names will be typically unicode. The
> exception will be libraries/applications which make deliberate attempts
> to get byte-oriented file names.

Ok, good answer. In this case, I vote +1 to completely remove the ANSI version
from all Python modules. I consider the ANSI version as a compatibility layer
for old applications written for MS-DOS or early versions of Windows. Even if
these APIs are still widely used in C/C++ applications, the wide versions
should always be preferred.

Victor

From solipsis at pitrou.net Fri Nov 12 14:40:29 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 12 Nov 2010 14:40:29 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp>
 <4CDC47B4.5080200@v.loewis.de>
 <20101111210243.264ccfb7@pitrou.net>
 <201011121313.08741.victor.stinner@haypocalc.com>
Message-ID: <20101112144029.00f6fdfc@pitrou.net>

On Fri, 12 Nov 2010 13:13:08 +0100
Victor Stinner wrote:
> On Thursday 11 November 2010 21:02:43 Antoine Pitrou wrote:
> > On Thu, 11 Nov 2010 20:44:52 +0100
> >
> > "Martin v. Löwis" wrote:
> > > > How do you support cross-platform code using bytes filenames?
> > > > IIRC, it has already been argued that it was an important feature. Many
> > > > filesystem-related utilities might prefer to handle filenames in bytes
> > > > form.
> > >
> > > It would be a policy decision. However, I think it is hearsay that
> > > filesystem-related utilities might prefer byte file names.
> >
> > One possible situation is when you receive filenames in bytes form from
> > an external API or tool (or even the contents of a file). If you don't
> > know the encoding, keeping the bytes form is obviously recommended.
>
> I disagree with you: the filename stored in the binary content/network stream
> may be encoded with a different code page than the current Windows code page.
> The application have to decode the filename itself, the application has more > information about the right encoding than Windows. I'm not talking about Windows obviously. POSIX filenames are natively bytes, so if you get a bytes filename from an external source, it makes sense to reuse the bytes form. I think it would be a mistake to allow bytes filenames under POSIX but not under Windows. It makes porting harder. > - tar stores filenames... in the locale encoding (except for PAX format which > uses utf-8) So bytes filenames are useful at least for tar. I'm sure there are many other cases (actually, most kinds of configuration files containing paths would apply). Regards Antoine. From barry at python.org Fri Nov 12 17:15:53 2010 From: barry at python.org (Barry Warsaw) Date: Fri, 12 Nov 2010 11:15:53 -0500 Subject: [Python-Dev] buildbot master update In-Reply-To: <4CDD08FB.3070701@v.loewis.de> References: <4CDD08FB.3070701@v.loewis.de> Message-ID: <20101112111553.44da8c08@mission> On Nov 12, 2010, at 10:29 AM, Martin v. L?wis wrote: >As you may have noticed: I updated the buildbot master to release 0.8.2. >If you notice any problems, please post them here. Pretty! My buildbot seems fine. >Slave operators can upgrade their installations at their own pace; >buildbot is highly backwards compatible. As a recommendation, I suggest >that slaves run at least at the version that is available in Debian >stable (currently 0.7.8). Thanks Martin, for all you do to keep our infrastructure humming along smoothly, including the recent Roundup migration. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From status at bugs.python.org Fri Nov 12 18:07:02 2010 From: status at bugs.python.org (Python tracker) Date: Fri, 12 Nov 2010 18:07:02 +0100 (CET) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20101112170702.8111B1DBD7@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2010-11-05 - 2010-11-12) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 2526 (+12) closed 19651 (+54) total 22177 (+66) Open issues with patches: 1050 Issues opened (47) ================== #9313: distutils error on MSVC older than 8 http://bugs.python.org/issue9313 reopened by eric.araujo #10252: Fix resource warnings in distutils http://bugs.python.org/issue10252 reopened by eric.araujo #10329: trace.py and unicode in Python 3 http://bugs.python.org/issue10329 reopened by belopolsky #10332: Multiprocessing maxtasksperchild results in hang http://bugs.python.org/issue10332 opened by Jimbofbx #10333: Remove ancient backwards compatibility GC API http://bugs.python.org/issue10333 opened by nascheme #10336: test_xmlrpc fails if gzip is not supported by client http://bugs.python.org/issue10336 opened by ocean-city #10338: test_lib2to3 failure on buildbot x86 debian parallel 3.x: node http://bugs.python.org/issue10338 opened by haypo #10339: test_lib2to3 leaks http://bugs.python.org/issue10339 opened by pitrou #10340: asyncore doesn't properly handle EINVAL on OSX http://bugs.python.org/issue10340 opened by giampaolo.rodola #10342: trace module cannot produce coverage reports for zipped module http://bugs.python.org/issue10342 opened by belopolsky #10344: codecs.readline doesn't care buffering=0 http://bugs.python.org/issue10344 opened by Santiago.Piccinini #10348: multiprocessing: use SysV semaphores on FreeBSD http://bugs.python.org/issue10348 opened by haypo 
#10349: Error in Module/python.c when building on OpenBSD 4.8 http://bugs.python.org/issue10349 opened by pgurumur #10350: errno is read too late http://bugs.python.org/issue10350 opened by hfuru #10351: Add autocompletion for keys in dictionaries http://bugs.python.org/issue10351 opened by Valery.Khamenya #10354: tempfile.template is broken http://bugs.python.org/issue10354 opened by giampaolo.rodola #10355: SpooledTemporaryFile's name property is broken http://bugs.python.org/issue10355 opened by giampaolo.rodola #10356: decimal.py: hash of -1 http://bugs.python.org/issue10356 opened by skrah #10357: ** and "mapping" are poorly defined in python docs http://bugs.python.org/issue10357 opened by Fergal.Daly #10358: Doc styles for print should only use dark colors http://bugs.python.org/issue10358 opened by fdrake #10359: ISO C cleanup http://bugs.python.org/issue10359 opened by hfuru #10360: _weakrefset.WeakSet.__contains__ should not propagate TypeErro http://bugs.python.org/issue10360 opened by tseaver #10362: AttributeError: addinfourl instance has no attribute 'tell' http://bugs.python.org/issue10362 opened by Valentin.Lorentz #10363: Embedded python, handle (memory) leak http://bugs.python.org/issue10363 opened by martind #10364: Color coding fails after running program. 
http://bugs.python.org/issue10364 opened by Typo #10365: IDLE Crashes on File Open Dialog when code window closed befor http://bugs.python.org/issue10365 opened by william.barr #10366: Remove unneeded '(object)' from 3.x class examples http://bugs.python.org/issue10366 opened by terry.reedy #10367: "python setup.py sdist upload --show-response" can fail with " http://bugs.python.org/issue10367 opened by jcea #10369: tarfile requires an actual file on disc; a file-like object is http://bugs.python.org/issue10369 opened by strombrg #10371: Deprecate trace module undocumented API http://bugs.python.org/issue10371 opened by belopolsky #10373: Setup Script example incorrect http://bugs.python.org/issue10373 opened by lensart #10374: setup.py caches outdated scripts in the build tree http://bugs.python.org/issue10374 opened by gjb1002 #10375: 2to3 print(single argument) http://bugs.python.org/issue10375 opened by hfuru #10376: ZipFile unzip is unbuffered http://bugs.python.org/issue10376 opened by Jimbofbx #10377: cProfile incorrectly labels its output http://bugs.python.org/issue10377 opened by exarkun #10379: locale.format() input regression http://bugs.python.org/issue10379 opened by barry #10381: Add timezone support to datetime C API http://bugs.python.org/issue10381 opened by belopolsky #10382: Command line error marker misplaced on unicode entry http://bugs.python.org/issue10382 opened by belopolsky #10383: test_os leaks under Windows http://bugs.python.org/issue10383 opened by pitrou #10384: SyntaxError should contain exact location of the invalid chara http://bugs.python.org/issue10384 opened by belopolsky #10385: Mark up "subprocess" as module in its doc http://bugs.python.org/issue10385 opened by belopolsky #10388: spwd returning different value depending on privileges http://bugs.python.org/issue10388 opened by giampaolo.rodola #10391: obj2ast's error handling can lead to python crashing with a C- http://bugs.python.org/issue10391 opened by dmalcolm #10392: 
GZipFile crash when fileobj.mode is None http://bugs.python.org/issue10392 opened by bgreenlee #10394: subprocess Popen deadlock http://bugs.python.org/issue10394 opened by Christoph.Mathys #10395: os.path.commonprefix broken by design http://bugs.python.org/issue10395 opened by ronaldoussoren #10345: fcntl.ioctl always fails claiming an invalid fd http://bugs.python.org/issue10345 opened by bgamari Most recent 15 issues with no replies (15) ========================================== #10394: subprocess Popen deadlock http://bugs.python.org/issue10394 #10392: GZipFile crash when fileobj.mode is None http://bugs.python.org/issue10392 #10388: spwd returning different value depending on privileges http://bugs.python.org/issue10388 #10384: SyntaxError should contain exact location of the invalid chara http://bugs.python.org/issue10384 #10381: Add timezone support to datetime C API http://bugs.python.org/issue10381 #10377: cProfile incorrectly labels its output http://bugs.python.org/issue10377 #10375: 2to3 print(single argument) http://bugs.python.org/issue10375 #10373: Setup Script example incorrect http://bugs.python.org/issue10373 #10350: errno is read too late http://bugs.python.org/issue10350 #10339: test_lib2to3 leaks http://bugs.python.org/issue10339 #10338: test_lib2to3 failure on buildbot x86 debian parallel 3.x: node http://bugs.python.org/issue10338 #10332: Multiprocessing maxtasksperchild results in hang http://bugs.python.org/issue10332 #10320: printf %qd is nonstandard http://bugs.python.org/issue10320 #10310: signed:1 bitfields rarely make sense http://bugs.python.org/issue10310 #10309: dlmalloc.c needs _GNU_SOURCE for mremap() http://bugs.python.org/issue10309 Most recent 15 issues waiting for review (15) ============================================= #10392: GZipFile crash when fileobj.mode is None http://bugs.python.org/issue10392 #10391: obj2ast's error handling can lead to python crashing with a C- http://bugs.python.org/issue10391 #10385: Mark up 
"subprocess" as module in its doc http://bugs.python.org/issue10385 #10382: Command line error marker misplaced on unicode entry http://bugs.python.org/issue10382 #10371: Deprecate trace module undocumented API http://bugs.python.org/issue10371 #10369: tarfile requires an actual file on disc; a file-like object is http://bugs.python.org/issue10369 #10360: _weakrefset.WeakSet.__contains__ should not propagate TypeErro http://bugs.python.org/issue10360 #10359: ISO C cleanup http://bugs.python.org/issue10359 #10356: decimal.py: hash of -1 http://bugs.python.org/issue10356 #10354: tempfile.template is broken http://bugs.python.org/issue10354 #10351: Add autocompletion for keys in dictionaries http://bugs.python.org/issue10351 #10350: errno is read too late http://bugs.python.org/issue10350 #10342: trace module cannot produce coverage reports for zipped module http://bugs.python.org/issue10342 #10340: asyncore doesn't properly handle EINVAL on OSX http://bugs.python.org/issue10340 #10329: trace.py and unicode in Python 3 http://bugs.python.org/issue10329 Top 10 most discussed issues (10) ================================= #10329: trace.py and unicode in Python 3 http://bugs.python.org/issue10329 11 msgs #7061: Improve turtle module documentation http://bugs.python.org/issue7061 9 msgs #10354: tempfile.template is broken http://bugs.python.org/issue10354 9 msgs #10359: ISO C cleanup http://bugs.python.org/issue10359 9 msgs #10379: locale.format() input regression http://bugs.python.org/issue10379 9 msgs #10325: PY_LLONG_MAX & co - preprocessor constants or not? http://bugs.python.org/issue10325 8 msgs #5412: extend configparser to support mapping access(__*item__) http://bugs.python.org/issue5412 7 msgs #10252: Fix resource warnings in distutils http://bugs.python.org/issue10252 7 msgs #10349: Error in Module/python.c when building on OpenBSD 4.8 http://bugs.python.org/issue10349 7 msgs #10364: Color coding fails after running program. 
http://bugs.python.org/issue10364 7 msgs Issues closed (51) ================== #1602: windows console doesn't print utf8 (Py30a2) http://bugs.python.org/issue1602 closed by haypo #1926: NNTPS support in nntplib http://bugs.python.org/issue1926 closed by pitrou #6058: Add cp65001 to encodings/aliases.py http://bugs.python.org/issue6058 closed by haypo #6226: Inconsistent 'file' vs 'stream' kwarg in pprint, other stdlibs http://bugs.python.org/issue6226 closed by eric.araujo #6317: winsound.PlaySound doesn't accept non-unicode string http://bugs.python.org/issue6317 closed by ocean-city #8634: get method for dbm interface http://bugs.python.org/issue8634 closed by eric.araujo #8679: write a distutils to distutils2 converter http://bugs.python.org/issue8679 closed by eric.araujo #8804: http.client should support SSL contexts http://bugs.python.org/issue8804 closed by pitrou #9421: configparser.ConfigParser's getint, getboolean and getfloat do http://bugs.python.org/issue9421 closed by lukasz.langa #9508: python3.2 reversal of distutils reintrocud macos9 support http://bugs.python.org/issue9508 closed by eric.araujo #10008: Two links point to same place http://bugs.python.org/issue10008 closed by georg.brandl #10022: Emit more information in decoded SSL certificates http://bugs.python.org/issue10022 closed by pitrou #10145: float.is_integer is undocumented http://bugs.python.org/issue10145 closed by mark.dickinson #10180: File objects should not pickleable http://bugs.python.org/issue10180 closed by pitrou #10226: urlparse example is wrong http://bugs.python.org/issue10226 closed by orsenthil #10229: Refleak run of test_gettext fails http://bugs.python.org/issue10229 closed by eric.araujo #10232: Tkinter issues with Scrollbar and custom widget list http://bugs.python.org/issue10232 closed by terry.reedy #10245: Fix resource warnings in test_telnetlib http://bugs.python.org/issue10245 closed by orsenthil #10282: IMPLEMENTATION token differently delt with in NNTP 
capability http://bugs.python.org/issue10282 closed by pitrou #10297: decimal module documentation is misguiding http://bugs.python.org/issue10297 closed by mark.dickinson #10302: Add class-functions to hash many small objects with hashlib http://bugs.python.org/issue10302 closed by gregory.p.smith #10303: small inconsistency in tutorial http://bugs.python.org/issue10303 closed by orsenthil #10304: error in tutorial triple-string example http://bugs.python.org/issue10304 closed by terry.reedy #10311: Signal handlers must preserve errno http://bugs.python.org/issue10311 closed by pitrou #10321: Add support for Message objects and binary data to smtplib.sen http://bugs.python.org/issue10321 closed by r.david.murray #10324: Modules/binascii.c: simplify expressions http://bugs.python.org/issue10324 closed by orsenthil #10327: Abnormal SSL timeouts when using socket timeouts - once again http://bugs.python.org/issue10327 closed by pakal #10330: trace module doesn't work without threads http://bugs.python.org/issue10330 closed by belopolsky #10331: test_gdb failure when warnings printed out http://bugs.python.org/issue10331 closed by dmalcolm #10334: Add new reST directive for links to source code http://bugs.python.org/issue10334 closed by georg.brandl #10335: tokenize.open(): open a file with encoding detected from a cod http://bugs.python.org/issue10335 closed by haypo #10337: testTanh() of test_math fails on "NetBSD 5 i386 3.x" http://bugs.python.org/issue10337 closed by haypo #10341: Remove traces of setuptools http://bugs.python.org/issue10341 closed by eric.araujo #10343: urllib.parse problems with bytes vs str http://bugs.python.org/issue10343 closed by r.david.murray #10346: strange arithmetic behaviour http://bugs.python.org/issue10346 closed by mark.dickinson #10347: regrtest progress counter makes -f option less useful http://bugs.python.org/issue10347 closed by pitrou #10352: rlcompleter.py has no tests in trunk http://bugs.python.org/issue10352 closed by 
georg.brandl #10353: 2to3 and places argument in unitests assertAlmostEqual http://bugs.python.org/issue10353 closed by r.david.murray #10361: Fix issue 9995 - distutils forces developers to store password http://bugs.python.org/issue10361 closed by eric.araujo #10368: "python setup.py sdist upload --show-response" fails http://bugs.python.org/issue10368 closed by eric.araujo #10370: py3 readlines() reports wrong offset for UnicodeDecodeError http://bugs.python.org/issue10370 closed by haypo #10372: [REGRESSION] test_gc fails in non-debug mode. http://bugs.python.org/issue10372 closed by pitrou #10378: Typo in results of help(divmod) http://bugs.python.org/issue10378 closed by benjamin.peterson #10380: AttributeError: 'module' object has no attribute 'exc_tracebac http://bugs.python.org/issue10380 closed by georg.brandl #10386: token module should define __all__ http://bugs.python.org/issue10386 closed by belopolsky #10387: ConfigParser's getboolean method is broken http://bugs.python.org/issue10387 closed by lukasz.langa #10389: Document rules for use of case in section titles http://bugs.python.org/issue10389 closed by belopolsky #10390: json.load should handle bytes input http://bugs.python.org/issue10390 closed by r.david.murray #10393: "with" statement isn't thread-safe http://bugs.python.org/issue10393 closed by amaury.forgeotdarc #1466065: base64 module ignores non-alphabet characters http://bugs.python.org/issue1466065 closed by r.david.murray #962772: when both maintainer and author provided, author discarded http://bugs.python.org/issue962772 closed by tarek From tjreedy at udel.edu Fri Nov 12 18:07:44 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 12 Nov 2010 12:07:44 -0500 Subject: [Python-Dev] Issues 9931 and 9055 - test_ttk_guionly and buildbot run as a service In-Reply-To: References: Message-ID: On 11/12/2010 3:44 AM, Paul Moore wrote: > Hi, > My buildbot has been failing for some time because of these 2 issues, > both related to the fact 
that tests are hanging when run as a service
> (and hence have no display to open GUI elements on). Both issues have
> patches, and as far as I am aware, the patches fix the issues
> reasonably well. What can I do to move these 2 issues forwards? As
> things stand, my buildbot is not providing a lot of value on the 3.x
> branch :-(

http://bugs.python.org/issue9055
is marked as a 2.7 issue only, perhaps fixed by Tim Golden's committed
patches. Should it be re-versioned for 3.1/2? There is no patch file
attached, though perhaps the code in Yamamoto's message is meant as such
(but for which version?). So the first thing you could do is clarify the
current status and remaining issues on the tracker.

http://bugs.python.org/issue9931
by Yamamoto is marked for all 3 versions. It seems to be a similar
issue, though marked 'test' rather than 'ctypes'. It does have a patch
by him, apparently based on his previous comments. The issue has no
responses and needs a patch review. So the first thing you could do is
to provide one ;-). If it looks great (no changes that you can think of)
and works great, say so. Then it can move on to the commit review stage.

PS. Providing links like the above makes it easier for multiple people
to take a look and respond.

-- 
Terry Jan Reedy

From tjreedy at udel.edu Fri Nov 12 18:11:40 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 12 Nov 2010 12:11:40 -0500
Subject: [Python-Dev] bugs.python.org migration complete
In-Reply-To: <4CDD09BE.8090106@v.loewis.de>
References: <4CDD09BE.8090106@v.loewis.de>
Message-ID:

On 11/12/2010 4:32 AM, "Martin v. Löwis" wrote:
> bugs.python.org is now on the new hardware. There have been some
> problems in the migration: the old hardware would start failing before
> the scheduled migration date, so the migration was done early, causing
> outage for some people who then had the old address in their DNS caches.
> In addition, there was initially a misconfiguration preventing outgoing
> IP traffic, particularly preventing outgoing emails from being
> delivered. This is all fixed now; report any remaining issues to the
> metatracker.

I got stymied by some of the late failures, but it has been working
great, with quick response, since last night. Thanks for the upgrade.

-- 
Terry Jan Reedy

From p.f.moore at gmail.com Fri Nov 12 18:25:05 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 12 Nov 2010 17:25:05 +0000
Subject: [Python-Dev] buildbot master update
In-Reply-To: <20101112111553.44da8c08@mission>
References: <4CDD08FB.3070701@v.loewis.de>
	<20101112111553.44da8c08@mission>
Message-ID:

On 12 November 2010 16:15, Barry Warsaw wrote:
> On Nov 12, 2010, at 10:29 AM, Martin v. Löwis wrote:
>
>>As you may have noticed: I updated the buildbot master to release 0.8.2.
>>If you notice any problems, please post them here.
>
> Pretty! My buildbot seems fine.

Yes, I like the new look.

>>Slave operators can upgrade their installations at their own pace;
>>buildbot is highly backwards compatible. As a recommendation, I suggest
>>that slaves run at least at the version that is available in Debian
>>stable (currently 0.7.8).
>
> Thanks Martin, for all you do to keep our infrastructure humming along
> smoothly, including the recent Roundup migration.

Thanks from me, too!
Paul

From solipsis at pitrou.net Fri Nov 12 20:42:00 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 12 Nov 2010 20:42:00 +0100
Subject: [Python-Dev] r86418 - in python/branches/release27-maint: Doc/library/difflib.rst Lib/difflib.py Lib/test/test_difflib.py Misc/NEWS
References: <20101111232219.6AC11EEA01@mail.python.org>
Message-ID: <20101112204200.32856238@pitrou.net>

Hello,

On Fri, 12 Nov 2010 00:22:19 +0100 (CET)
terry.reedy wrote:
> +
> +   .. versionadded:: 2.7
> +      The *autojunk* parameter.

Maybe I've missed something, but is there any reason to add a new
parameter in a bugfix release?
(apart from security issues)

Regards

Antoine.

From martin at v.loewis.de Fri Nov 12 20:44:34 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 12 Nov 2010 20:44:34 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <20101112144029.00f6fdfc@pitrou.net>
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp>
	<4CDC47B4.5080200@v.loewis.de>
	<20101111210243.264ccfb7@pitrou.net>
	<201011121313.08741.victor.stinner@haypocalc.com>
	<20101112144029.00f6fdfc@pitrou.net>
Message-ID: <4CDD9922.4090309@v.loewis.de>

> I'm not talking about Windows obviously. POSIX filenames are natively
> bytes, so if you get a bytes filename from an external source, it makes
> sense to reuse the bytes form.
>
> I think it would be a mistake to allow bytes filenames under POSIX but
> not under Windows. It makes porting harder.

Not really. People who want to write portable code should use Unicode
filenames everywhere, not byte filenames.

>
>> - tar stores filenames... in the locale encoding (except for PAX format which
>> uses utf-8)
>
> So bytes filenames are useful at least for tar.

No, they are not. The tarfile module decodes all file names on its own,
IIUC.

Regards,
Martin

From martin at v.loewis.de Fri Nov 12 20:46:27 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 12 Nov 2010 20:46:27 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <201011121315.35541.victor.stinner@haypocalc.com>
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp>
	<201011112026.24445.victor.stinner@haypocalc.com>
	<4CDC490B.9060809@v.loewis.de>
	<201011121315.35541.victor.stinner@haypocalc.com>
Message-ID: <4CDD9993.5080709@v.loewis.de>

> Ok, good answer. In this case, I vote +1 to remove completely the ANSI version
> from all Python modules.

I think caution is still necessary. So I propose to deprecate byte
filenames on Windows in 3.2, with removal in 3.3.
People who think this is a terrible mistake and breaks their
applications with no hope of a sensible solution can then still
intervene.

Regards,
Martin

From martin at v.loewis.de Fri Nov 12 20:53:00 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 12 Nov 2010 20:53:00 +0100
Subject: [Python-Dev] buildbot master update
In-Reply-To: <20101112111553.44da8c08@mission>
References: <4CDD08FB.3070701@v.loewis.de>
	<20101112111553.44da8c08@mission>
Message-ID: <4CDD9B1C.3070703@v.loewis.de>

> Thanks Martin, for all you do to keep our infrastructure humming along
> smoothly, including the recent Roundup migration.

I just write the announcements :-) In this case, thanks should also
extend to Izak Burger of Upfront Hosting who did most of the setup
(I just did the DNS changes), and to bitdancer who investigated
(together with Izak) the configuration problems of the new
installation.

Regards,
Martin

From solipsis at pitrou.net Fri Nov 12 21:07:52 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 12 Nov 2010 21:07:52 +0100
Subject: [Python-Dev] buildbot master update
References: <4CDD08FB.3070701@v.loewis.de>
	<20101112111553.44da8c08@mission>
	<4CDD9B1C.3070703@v.loewis.de>
Message-ID: <20101112210752.00528fcd@pitrou.net>

On Fri, 12 Nov 2010 20:53:00 +0100
"Martin v. Löwis" wrote:
> > Thanks Martin, for all you do to keep our infrastructure humming along
> > smoothly, including the recent Roundup migration.
>
> I just write the announcements :-) In this case, thanks should also
> extend to Izak Burger of Upfront Hosting who did most of the setup
> (I just did the DNS changes), and to bitdancer who
> investigated (together with Izak) the configuration problems of the new
> installation.

And for the record, bitdancer is R. David Murray :-)

cheers

Antoine.
From hnassrat at gmail.com Fri Nov 12 21:08:42 2010
From: hnassrat at gmail.com (Hatem Nassrat)
Date: Fri, 12 Nov 2010 13:08:42 -0700
Subject: [Python-Dev] Closures / Python Scopes
Message-ID:

A colleague of mine came across something anecdotal when working with
lambdas, it is expressed by the following code snippet.

In [1]: def a():
   ...:     for i in range(10):
   ...:         def b():
   ...:             return i
   ...:         yield b
   ...:
   ...:

In [2]: funcs = list(a())

In [3]: print [f() for f in funcs]
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9]

I understand that for loops in python do not have a scope, neither do
if statements, and python is awesome for that. Is this something
accidental? i.e. will python ever evolve into having scopes for if and
for loops (and other blocks that are not functions)? the reason I ask
is with the introduction of
http://docs.python.org/py3k/reference/simple_stmts.html#nonlocal it
seems like something that can happen.

-- 
Hatem Nassrat

From tjreedy at udel.edu Fri Nov 12 21:32:21 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 12 Nov 2010 15:32:21 -0500
Subject: [Python-Dev] r86418 - in python/branches/release27-maint: Doc/library/difflib.rst Lib/difflib.py Lib/test/test_difflib.py Misc/NEWS
In-Reply-To: <20101112204200.32856238@pitrou.net>
References: <20101111232219.6AC11EEA01@mail.python.org>
	<20101112204200.32856238@pitrou.net>
Message-ID:

On 11/12/2010 2:42 PM, Antoine Pitrou wrote:
>
> Hello,
>
> On Fri, 12 Nov 2010 00:22:19 +0100 (CET)
> terry.reedy wrote:
>> +
>> +   .. versionadded:: 2.7
>> +      The *autojunk* parameter.
>
> Maybe I've missed something, but is there any reason to add a new
> parameter in a bugfix release?
> (apart from security issues)

This is a bugfix. We discussed this (with Tim's participation) here last
July/August and pretty well agreed that this was the least obnoxious
solution to a bad situation.
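
For readers unfamiliar with the parameter being debated, here is a minimal sketch of what *autojunk* changes. It is written for Python 3 (where the keyword is also accepted), and the input strings are made up for illustration; they are not taken from the thread or from the bug report. It relies on the documented heuristic, under which items that are heavily repeated in a second sequence of at least 200 items are treated as junk.

```python
from difflib import SequenceMatcher

# Made-up inputs: b has 300 items and only two distinct ones, so each of
# them is far above the 1% threshold and counts as "popular".
b = "ab" * 150
a = "Z" + b  # a is b with one extra leading character

# Default behaviour: the automatic junk heuristic is switched on.
with_heuristic = SequenceMatcher(None, a, b)
# The parameter discussed above lets callers switch it off.
without_heuristic = SequenceMatcher(None, a, b, autojunk=False)

print(with_heuristic.ratio())
print(without_heuristic.ratio())
```

With the heuristic on, the matcher should find essentially nothing in common between two strings that differ by a single character; with `autojunk=False` it should report them as nearly identical. That is the kind of situation the thread describes as broken, and also why the old default was kept for compatibility.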
-- 
Terry Jan Reedy

From tjreedy at udel.edu Fri Nov 12 21:38:19 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 12 Nov 2010 15:38:19 -0500
Subject: [Python-Dev] Closures / Python Scopes
In-Reply-To:
References:
Message-ID:

On 11/12/2010 3:08 PM, Hatem Nassrat wrote:
> A colleague of mine came across something anecdotal when working with
> lambdas, it is expressed by the following code snippet.
>
> In [1]: def a():
>    ...:     for i in range(10):
>    ...:         def b():
>    ...:             return i
>    ...:         yield b
>    ...:
>    ...:
>
> In [2]: funcs = list(a())
>
> In [3]: print [f() for f in funcs]
> [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
>
>
> I understand that for loops in python do not have a scope, neither do
> if statements, and python is awesome for that. Is this something
> accidental? i.e. will python ever evolve into having scopes for if and
> for loops (and other blocks that are not functions)? the reason I ask
> is with the introduction of
> http://docs.python.org/py3k/reference/simple_stmts.html#nonlocal it
> seems like something that can happen.

Question/discussion issues like this belong on python-list or
python-ideas list.

-- 
Terry Jan Reedy

From tjreedy at udel.edu Fri Nov 12 21:53:17 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 12 Nov 2010 15:53:17 -0500
Subject: [Python-Dev] r86418 - in python/branches/release27-maint: Doc/library/difflib.rst Lib/difflib.py Lib/test/test_difflib.py Misc/NEWS
In-Reply-To:
References: <20101111232219.6AC11EEA01@mail.python.org>
	<20101112204200.32856238@pitrou.net>
Message-ID:

On 11/12/2010 3:32 PM, Terry Reedy wrote:
> On 11/12/2010 2:42 PM, Antoine Pitrou wrote:
>>
>> Hello,
>>
>> On Fri, 12 Nov 2010 00:22:19 +0100 (CET)
>> terry.reedy wrote:
>>> +
>>> +   .. versionadded:: 2.7
>>> +      The *autojunk* parameter.

I just realized that this should say 2.7.1 so people know not to use it
with the original 2.7. I will repeat it again in the SequenceMatcher
section.
>> Maybe I've missed something, but is there any reason to add a new
>> parameter in a bugfix release?
>> (apart from security issues)
>
> This is a bugfix. We discussed this (with Tim's participation) here last
> July/August and pretty well agreed that this was the least obnoxious
> solution to a bad situation.

-- 
Terry Jan Reedy

From ncoghlan at gmail.com Sat Nov 13 01:45:22 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 13 Nov 2010 10:45:22 +1000
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <4CDD9993.5080709@v.loewis.de>
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp>
	<201011112026.24445.victor.stinner@haypocalc.com>
	<4CDC490B.9060809@v.loewis.de>
	<201011121315.35541.victor.stinner@haypocalc.com>
	<4CDD9993.5080709@v.loewis.de>
Message-ID:

On Sat, Nov 13, 2010 at 5:46 AM, "Martin v. Löwis" wrote:
>> Ok, good answer. In this case, I vote +1 to remove completely the ANSI version
>> from all Python modules.
>
> I think caution is still necessary. So I propose to deprecate byte
> filenames on Windows in 3.2, with removal in 3.3. People who think this
> is a terrible mistake and breaks their applications with no hope of a
> sensible solution can then still intervene.

I was going to suggest much the same thing.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Sat Nov 13 01:51:03 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 13 Nov 2010 10:51:03 +1000
Subject: [Python-Dev] r86418 - in python/branches/release27-maint: Doc/library/difflib.rst Lib/difflib.py Lib/test/test_difflib.py Misc/NEWS
In-Reply-To:
References: <20101111232219.6AC11EEA01@mail.python.org>
	<20101112204200.32856238@pitrou.net>
Message-ID:

On Sat, Nov 13, 2010 at 6:32 AM, Terry Reedy wrote:
> On 11/12/2010 2:42 PM, Antoine Pitrou wrote:
>> Maybe I've missed something, but is there any reason to add a new
>> parameter in a bugfix release?
>> (apart from security issues)
>
> This is a bugfix.
We discussed this (with Tim's participation) here last
> July/August and pretty well agreed that this was the least obnoxious
> solution to a bad situation.

Yep, as Terry said, the current behaviour is irredeemably broken in
some situations, but switching it off completely would break other
cases. Adding a new optional parameter that defaulted to the 2.7
behaviour was considered the least-bad option out of those available
(do nothing, add parameter, change default behaviour, add new API).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From tjreedy at udel.edu Sat Nov 13 02:31:49 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 12 Nov 2010 20:31:49 -0500
Subject: [Python-Dev] [Python-checkins] r86441 - python/branches/py3k/Lib/test/test_nntplib.py
In-Reply-To: <20101113002853.526F8EEA40@mail.python.org>
References: <20101113002853.526F8EEA40@mail.python.org>
Message-ID: <4CDDEA85.6050907@udel.edu>

On 11/12/2010 7:28 PM, antoine.pitrou wrote:
> Author: antoine.pitrou
> Date: Sat Nov 13 01:28:53 2010
> New Revision: 86441
>
> Log:
> Switch from gmane to another provider for NNTP tests (as gmane isn't reliable
> enough). Also, use setUpClass in order to connect only once per test run.

> class NetworkedNNTP_SSLTests(NetworkedNNTPTestsMixin, unittest.TestCase):
> -    NNTP_HOST = 'snews.gmane.org'
> -    GROUP_NAME = 'gmane.comp.python.devel'
> -    GROUP_PAT = 'gmane.comp.python.d*'

gmane is most problematical on weekends.

> +    NNTP_HOST = 'nntp.aioe.org'
> +    GROUP_NAME = 'comp.lang.python'
> +    GROUP_PAT = 'comp.lang.*'

aioe went away for several months a couple of years ago or so.
Let us hope it stays up for a while now.
The SSL connection currently does not work (expired certificate).
Terry

From greg.ewing at canterbury.ac.nz Sat Nov 13 04:05:35 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 13 Nov 2010 16:05:35 +1300
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <4CDC83B3.307@pearwood.info>
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com>
	<4CDAA27B.8040703@voidspace.org.uk>
	<4CDBDB0C.6080703@voidspace.org.uk>
	<4CDC32F0.3010500@pearwood.info>
	<4CDC6CBF.7060500@canterbury.ac.nz>
	<4CDC83B3.307@pearwood.info>
Message-ID: <4CDE007F.9010903@canterbury.ac.nz>

Steven D'Aprano wrote:

> By the way, did you intend to send this off-list?

No, I didn't realise I hadn't sent it to the list.

If you don't document them, I won't use them, because I won't
know if it's one of these don't-ask-don't-tell pseudo-public
functions or something private that's accidentally been given
a non-underscore name.

> Greg Ewing wrote:
>> Also it means that help() wouldn't show me documentation for
>> the support functions, which is a bad thing if they really are
>> intended for public use.
>
> I don't see why... if you import the module, and call help(module), they
> will show up as normal.

If the module has an __all__ list, then help(module) will only
show functions included in that list. So your pseudo-public
functions would not show up in it. Without some other reason
to suspect their existence, I would probably never find them.
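
The behaviour described here can be checked with a small, self-contained sketch. Everything in it is hypothetical: `demo_mod`, `public_api` and `support_helper` are made-up names, the module is built in memory instead of being imported from a file, and `pydoc.render_doc` is used as a stand-in for what `help()` prints. It is written for Python 3.

```python
import pydoc
import sys
import types

# Hypothetical module source: one exported function and one helper that is
# left out of __all__ even though its name has no leading underscore.
SOURCE = '''
__all__ = ["public_api"]

def public_api():
    """Exported entry point."""
    return support_helper() + 1

def support_helper():
    """Used internally; not listed in __all__."""
    return 41
'''

demo = types.ModuleType("demo_mod", "Made-up module for illustration.")
exec(SOURCE, demo.__dict__)
sys.modules["demo_mod"] = demo  # lets "from demo_mod import *" find it

# 1. "import *" copies only the names listed in __all__.
namespace = {}
exec("from demo_mod import *", namespace)
star_names = sorted(n for n in namespace if not n.startswith("__"))

# 2. pydoc (the machinery behind help()) also restricts itself to __all__.
doc_text = pydoc.plain(pydoc.render_doc(demo))

print(star_names)
print("support_helper" in doc_text)
```

The helper stays reachable as `demo.support_helper`, but neither the star-import nor the generated documentation should mention it, which is the discoverability problem raised above.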
-- 
Greg

From guido at python.org Sat Nov 13 05:38:16 2010
From: guido at python.org (Guido van Rossum)
Date: Fri, 12 Nov 2010 20:38:16 -0800
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <4CDD0DBC.4050405@avl.com>
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com>
	<4CDAA27B.8040703@voidspace.org.uk>
	<4CDBDB0C.6080703@voidspace.org.uk>
	<4CDC6D31.2040809@canterbury.ac.nz>
	<4CDD0DBC.4050405@avl.com>
Message-ID:

On Fri, Nov 12, 2010 at 1:49 AM, Hrvoje Niksic wrote:
> On 11/11/2010 11:24 PM, Greg Ewing wrote:
>>
>> Nick Coghlan wrote:
>>
>>> My personal opinion is that we should be trying to get the standard
>>> library to the point where __all__ definitions are unnecessary - if a
>>> name isn't in __all__, it should start with an underscore (and if that
>>> is true, then the __all__ definition becomes effectively redundant).
>>
>> What about names imported from other modules that are used by
>> the module, but not intended for re-export? How would you
>> prevent them from turning up in help() etc. without using
>> __all__?
>
> import foo as _foo
>
> I believe I am not the only one who finds that practice ugly,

Agreed.

> but I find it
> just as ugly to underscore-ize every non-public helper function. __all__ is
> there for a reason, let's use it. Maybe help() could automatically ignore
> stuff not in __all__, or display it but warn the user of non-public
> identifiers?

No, I like all non-public functions, constants, classes and variables
(but excluding imported modules) to start with _. You'd still need
__all__ to make "import *" do the right thing, but the reader of the
source code should not have to look up every name in __all__ to find
whether it is supposed to be public or private. Plus, the same
convention should carry over to methods and other class attributes,
where you don't have __all__.

If help() is broken we should fix it. (I'm not very happy with it
myself anyway, I rarely use it.)

Note that __all__ was originally invented to give "from package import
*" a well-defined meaning when the package included submodules that
might not have been loaded yet. Using it for other export control
(while a good idea) could be considered "newfangled". :-)

-- 
--Guido van Rossum (python.org/~guido)

From solipsis at pitrou.net Sat Nov 13 13:06:46 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 13 Nov 2010 13:06:46 +0100
Subject: [Python-Dev] Breaking undocumented API
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com>
	<4CDAA27B.8040703@voidspace.org.uk>
	<4CDBDB0C.6080703@voidspace.org.uk>
	<4CDC6D31.2040809@canterbury.ac.nz>
	<4CDD0DBC.4050405@avl.com>
Message-ID: <20101113130646.1237977b@pitrou.net>

On Fri, 12 Nov 2010 20:38:16 -0800
Guido van Rossum wrote:
>
> Note that __all__ was originally invented to give "from package import
> *" a well-defined meaning when the package included submodules that
> might not have been loaded yet. Using it for other export control
> (while a good idea) could be considered "newfangled". :-)

Newfangled in a rather old way already, then, perhaps :p

regards

Antoine.

From solipsis at pitrou.net Sat Nov 13 13:08:39 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 13 Nov 2010 13:08:39 +0100
Subject: [Python-Dev] [Python-checkins] r86441 - python/branches/py3k/Lib/test/test_nntplib.py
References: <20101113002853.526F8EEA40@mail.python.org>
	<4CDDEA85.6050907@udel.edu>
Message-ID: <20101113130839.1c315e45@pitrou.net>

On Fri, 12 Nov 2010 20:31:49 -0500
Terry Reedy wrote:
>
> > class NetworkedNNTP_SSLTests(NetworkedNNTPTestsMixin, unittest.TestCase):
> > -    NNTP_HOST = 'snews.gmane.org'
> > -    GROUP_NAME = 'gmane.comp.python.devel'
> > -    GROUP_PAT = 'gmane.comp.python.d*'
>
> gmane is most problematical on weekends.

Well we've had buildbot failures in the middle of the week.
> > + NNTP_HOST = 'nntp.aioe.org' > > + GROUP_NAME = 'comp.lang.python' > > + GROUP_PAT = 'comp.lang.*' > > aioe went away for several months a couple of years ago or so. > Let us hope it stays up for awhile now. > The ssl connection currently does not work (expired certificate). Funny, it shows that the NNTP SSL tests don't check the certificate, then. Regards Antoine. From g.rodola at gmail.com Sat Nov 13 13:12:31 2010 From: g.rodola at gmail.com (Giampaolo Rodolà) Date: Sat, 13 Nov 2010 13:12:31 +0100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> Message-ID: +1 on everything. 2010/11/11 Alexander Belopolsky : > 2010/11/11 Michael Foord : > .. >>> You mean runtime automation, e.g. creating __all__ on the fly omitting >>> underscored names? >>> >> Writing code to generate a __all__ that duplicates the default behaviour >> seems redundant to me. >> > > FWIW, I like having __all__ at the top of the module. It feels like a > table of contents at the start of a chapter. In some cases it may > also serve as an optimization when len(__all__) is much smaller than > len(__dict__). I also don't like _ prefix to become an exclusive > means to express privateness. > > I think the current definition of "public names" is a good one and > just needs to be made more visible in the docs. If the module defines > __all__, that should be the ultimate answer to what is public in that > module. (Users should learn to use help(module) instead of > dir(module) for API discovery.) If __all__ is not defined in the > module, I think it is good to introduce it after a careful review of > what it should contain. And __all__ should never contain names that > start with _.
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/g.rodola%40gmail.com > From foom at fuhm.net Sat Nov 13 13:30:05 2010 From: foom at fuhm.net (James Y Knight) Date: Sat, 13 Nov 2010 07:30:05 -0500 Subject: [Python-Dev] [Python-checkins] r86441 - python/branches/py3k/Lib/test/test_nntplib.py In-Reply-To: <20101113130839.1c315e45@pitrou.net> References: <20101113002853.526F8EEA40@mail.python.org> <4CDDEA85.6050907@udel.edu> <20101113130839.1c315e45@pitrou.net> Message-ID: <92814936-A0FC-403A-B3BA-46AE3085594B@fuhm.net> On Nov 13, 2010, at 7:08 AM, Antoine Pitrou wrote: > Funny, it shows that the NNTP SSL tests don't check the certificate, > then. Unsurprising, given that you need 140 lines of pretty non-obvious python code to do so... James From solipsis at pitrou.net Sat Nov 13 13:37:12 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 13 Nov 2010 13:37:12 +0100 Subject: [Python-Dev] Stable buildbots Message-ID: <20101113133712.60e9be27@pitrou.net> Hi, Just to let you know that we now have 8 stable buildbots, including Barry's own PPC Ubuntu machine (even though the Windows buildbots give a rather unconventional meaning to the word "stability"). Right now they are mostly green: http://www.python.org/dev/buildbot/all/waterfall?category=3.x.stable cheers Antoine. 
From solipsis at pitrou.net Sat Nov 13 13:40:25 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 13 Nov 2010 13:40:25 +0100 Subject: [Python-Dev] [Python-checkins] r86441 - python/branches/py3k/Lib/test/test_nntplib.py In-Reply-To: <92814936-A0FC-403A-B3BA-46AE3085594B@fuhm.net> References: <20101113002853.526F8EEA40@mail.python.org> <4CDDEA85.6050907@udel.edu> <20101113130839.1c315e45@pitrou.net> <92814936-A0FC-403A-B3BA-46AE3085594B@fuhm.net> Message-ID: <20101113134025.5604fc9c@pitrou.net> On Sat, 13 Nov 2010 07:30:05 -0500 James Y Knight wrote: > On Nov 13, 2010, at 7:08 AM, Antoine Pitrou wrote: > > Funny, it shows that the NNTP SSL tests don't check the certificate, > > then. > > Unsurprising, given that you need 140 lines of pretty non-obvious python code to do so... You must have missed the new match_hostname() function: http://docs.python.org/dev/library/ssl.html#ssl.match_hostname (its implementation is 50 lines rather than 140 lines, though) Regards Antoine. From dickinsm at gmail.com Sat Nov 13 14:00:29 2010 From: dickinsm at gmail.com (Mark Dickinson) Date: Sat, 13 Nov 2010 13:00:29 +0000 Subject: [Python-Dev] buildbot master update In-Reply-To: <4CDD08FB.3070701@v.loewis.de> References: <4CDD08FB.3070701@v.loewis.de> Message-ID: On Fri, Nov 12, 2010 at 9:29 AM, "Martin v. L?wis" wrote: > As you may have noticed: I updated the buildbot master to release 0.8.2. > If you notice any problems, please post them here. One effect of this change seems to be that bbreport[1] no longer works, since it appears that buildbot 0.8.2 has done away with the XMLRPC interface that bbreport uses. But that's really a bbreport issue rather than a buildbot one... 
Mark [1] http://code.google.com/p/bbreport/ From g.brandl at gmx.net Sat Nov 13 15:15:43 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 13 Nov 2010 15:15:43 +0100 Subject: [Python-Dev] buildbot master update In-Reply-To: References: <4CDD08FB.3070701@v.loewis.de> Message-ID: Am 13.11.2010 14:00, schrieb Mark Dickinson: > On Fri, Nov 12, 2010 at 9:29 AM, "Martin v. L?wis" wrote: >> As you may have noticed: I updated the buildbot master to release 0.8.2. >> If you notice any problems, please post them here. > > One effect of this change seems to be that bbreport[1] no longer > works, since it appears that buildbot 0.8.2 has done away with the > XMLRPC interface that bbreport uses. > > But that's really a bbreport issue rather than a buildbot one... > > Mark I've added a quickfix by copying the removed xmlrpc interface to the local buildbot installation now. I had to patch it up a bit though, because of an apparent API change somewhere in buildbot, and I'm not sure whether this was correct. Georg From ocean-city at m2.ccsnet.ne.jp Sat Nov 13 15:47:53 2010 From: ocean-city at m2.ccsnet.ne.jp (Hirokazu Yamamoto) Date: Sat, 13 Nov 2010 23:47:53 +0900 Subject: [Python-Dev] Issues 9931 and 9055 - test_ttk_guionly and buildbot run as a service In-Reply-To: References: Message-ID: <4CDEA519.1020801@m2.ccsnet.ne.jp> On 2010/11/13 2:07, Terry Reedy wrote: > On 11/12/2010 3:44 AM, Paul Moore wrote: >> Hi, >> My buildbot has been failing for some time because of these 2 issues, >> both related to the fact that tests are hanging when run as a service >> (and hence have no display to open GUI elements on). Both issues have >> patches, and as far as I am aware, the patches fix the issues >> reasonably well. What can I do to move these 2 issues forwards? As >> things stand, my buildbot is not providing a lot of value on the 3.x >> branch :-( > > http://bugs.python.org/issue9055 > is marked as a 2.7 issue only, perhaps fixed by Tim Golden's committed > patches. 
Should it be re-versioned for 3.1/2? There is no patch file > attached, though perhaps the code in Yamamoto's message is meant as such > (but for which version?). So the first thing you could do is clarify the > current status and remaining issue on the tracker. > > http://bugs.python.org/issue9931 > by Yamamoto is marked for all 3 versions. It seems to be a similar > issue, though marked 'test' rather than 'ctypes'. It does have a patch > by him apparently based on his previous comments. The issue has no > responses and needs a patch review. So the first thing you could do is > to provide one;-). If it looks great (no changes that you can think of) > and works great, say so. Then it can move on to commit review stage. > > PS. Providing links like the above makes it easier for multiple people > to take a look and respond. My patch won't fix issue9055 directly, but solves issue9931. Probably it's easy to create a patch to fix issue9055 based on my patch. One problem is, how to skip test. With single decorator like skip_unless_symlink? Or let requires() raise SkipTest? From ocean-city at m2.ccsnet.ne.jp Sat Nov 13 17:21:37 2010 From: ocean-city at m2.ccsnet.ne.jp (Hirokazu Yamamoto) Date: Sun, 14 Nov 2010 01:21:37 +0900 Subject: [Python-Dev] Removal of Win32 ANSI API In-Reply-To: <201011121308.30368.victor.stinner@haypocalc.com> References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> <201011112026.24445.victor.stinner@haypocalc.com> <201011121308.30368.victor.stinner@haypocalc.com> Message-ID: <4CDEBB11.5050209@m2.ccsnet.ne.jp> On 2010/11/12 4:26, Victor Stinner wrote: > On Thursday 11 November 2010 17:07:28 Hirokazu Yamamoto wrote: >> Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA) >> and only use Win32 WIDE API (ie: GetFileAttributesW)? >> Mainly in posixmodule.c. > > Even if I hate the MBCS encoding, because it replaces undecodable characters > by similar glyphs by default, I'm not certain that it is a good idea to drop > the bytes API. 
On 2010/11/12 21:08, Victor Stinner wrote: > On Thursday 11 November 2010 23:01:32 you wrote: >>> Sure, it will divide the number of lines, of the code specific to >>> Windows, by two. >> >> Can we get most of the code cleanup benefit without the backwards >> compatibility risk by doing the decode from 'mbcs' on our side of the >> fence? > > I created PyUnicode_FSDecoder, a ParseTuple converter used to work on unicode > paths, instead of bytes paths. On Windows, this converter uses mbcs encoding > in strict mode, whereas Windows converter uses replace error handler to > decode, and ignore to encode. So I don't think that we should this converter > on Windows. > >> That is, have code that was the C equivalent of: >> >> arg_is_bytes = not isinstance(arg, str) >> if arg_is_bytes: >> val = _decode_mbcs(arg) >> # Decoding error checking here >> else: >> val = arg >> # Common processing using WIDE API >> if arg_is_bytes: >> result = _encode_mbcs(wide_result) >> # Encoding error checking here >> else: >> result = wide_result > > This doesn't make the code shorter, it may be longer than the actual code, and > it is less compliant with the Windows native API... Is it possible to implement new PyArg_ParseTuple converter to use PyUnicode_Decode(const char *s, Py_ssize_t size, const char *encoding, /* mbcs */ const char *errors) /* replace */ and use it? 
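The "decode on our side of the fence" idea quoted above could be sketched in Python roughly as follows. This is only an illustration of the control flow, not the C code in posixmodule.c: call_wide and fake_wide_api are hypothetical names, and because the 'mbcs' codec exists only on Windows, the default encoding falls back to the filesystem encoding elsewhere.

```python
import sys

# Assumption: 'mbcs' is only available on Windows, so use the filesystem
# encoding as a stand-in on other platforms.
_FS_ENCODING = "mbcs" if sys.platform == "win32" else sys.getfilesystemencoding()


def call_wide(wide_func, arg, encoding=_FS_ENCODING, errors="strict"):
    """Call a str-only ("wide") function with either a bytes or a str argument.

    bytes input is decoded first and the result is encoded back to bytes;
    str input is passed through unchanged.
    """
    arg_is_bytes = isinstance(arg, bytes)
    # Decoding errors surface here as UnicodeDecodeError when errors="strict".
    val = arg.decode(encoding, errors) if arg_is_bytes else arg

    # Common processing using the wide API only.
    wide_result = wide_func(val)

    if arg_is_bytes:
        # Encoding errors surface here as UnicodeEncodeError.
        return wide_result.encode(encoding, errors)
    return wide_result


# Stand-in for a real wide API call: any function that maps str to str.
fake_wide_api = str.upper

print(call_wide(fake_wide_api, "abc"))
print(call_wide(fake_wide_api, b"caf\xe9", encoding="latin-1"))
```

The errors parameter is where the strict versus replace/ignore question raised in this thread would be decided; the sketch defaults to "strict" only so that failures are visible.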
From tjreedy at udel.edu Sat Nov 13 19:40:54 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 13 Nov 2010 13:40:54 -0500 Subject: [Python-Dev] [Python-checkins] r86441 - python/branches/py3k/Lib/test/test_nntplib.py In-Reply-To: <20101113130839.1c315e45@pitrou.net> References: <20101113002853.526F8EEA40@mail.python.org> <4CDDEA85.6050907@udel.edu> <20101113130839.1c315e45@pitrou.net> Message-ID: On 11/13/2010 7:08 AM, Antoine Pitrou wrote: > On Fri, 12 Nov 2010 20:31:49 -0500 > Terry Reedy wrote: >> >>> class NetworkedNNTP_SSLTests(NetworkedNNTPTestsMixin, unittest.TestCase): >>> - NNTP_HOST = 'snews.gmane.org' >>> - GROUP_NAME = 'gmane.comp.python.devel' >>> - GROUP_PAT = 'gmane.comp.python.d*' >> >> gmane is most problematical on weekends. > > Well we've had buildbot failures in the middle of the week. Why I did not say 'only' ;-). >>> + NNTP_HOST = 'nntp.aioe.org' >>> + GROUP_NAME = 'comp.lang.python' >>> + GROUP_PAT = 'comp.lang.*' >> >> aioe went away for several months a couple of years ago or so. >> Let us hope it stays up for awhile now. >> The ssl connection currently does not work (expired certificate). More specifically, if, with Thunderbird, I turn on SSL/TLS, (which switches from port 119 to 563), I get *invalid* certificate message - good for aioe.org, news.aioe,org, but not nntp.aioe.org. I believe SSL worked before the hiatus so it might be an oversight in restarting. > Funny, it shows that the NNTP SSL tests don't check the certificate, > then. Or not thoroughly. 
-- Terry Jan Reedy From tjreedy at udel.edu Sat Nov 13 20:29:23 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 13 Nov 2010 14:29:23 -0500 Subject: [Python-Dev] [Python-checkins] r86441 - python/branches/py3k/Lib/test/test_nntplib.py In-Reply-To: References: <20101113002853.526F8EEA40@mail.python.org> <4CDDEA85.6050907@udel.edu> <20101113130839.1c315e45@pitrou.net> Message-ID: O > More specifically, if, with Thunderbird, I turn on SSL/TLS, (which > switches from port 119 to 563), I get *invalid* certificate message - > good for aioe.org, news.aioe.org, but not nntp.aioe.org. I believe SSL > worked before the hiatus so it might be an oversight in restarting. > >> Funny, it shows that the NNTP SSL tests don't check the certificate, >> then. > > Or not thoroughly. I can access gmane with SSL, so you could add a conditional (on being up and running) certificate check using that. -- Terry Jan Reedy From tjreedy at udel.edu Sat Nov 13 19:17:25 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 13 Nov 2010 13:17:25 -0500 Subject: [Python-Dev] [Python-checkins] r86451 - python/branches/py3k/Misc/NEWS In-Reply-To: <20101113132541.861BBEEA82@mail.python.org> References: <20101113132541.861BBEEA82@mail.python.org> Message-ID: <4CDED635.3010409@udel.edu> On 11/13/2010 8:25 AM, georg.brandl wrote: > Author: georg.brandl > Date: Sat Nov 13 14:25:40 2010 > New Revision: 86451 > - unused undocumented value PyBUF_SHADOW, and strangely-looking code in > + undocumented value PyBUF_SHADOW, and strangely-looking code in For future reference, 'strangely-looking' should be either 'strange-looking' or 'strangely appearing'. First, '-ly' adverbs are never hyphenated even when modifying adjectives. Second, 'strangely looking code' would mean that the code is actively looking around strangely (as opposed to passively sitting there appearing strange).
tjr From tjreedy at udel.edu Sat Nov 13 19:21:09 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 13 Nov 2010 13:21:09 -0500 Subject: [Python-Dev] [Python-checkins] r86453 - in python/branches/release31-maint: Include/patchlevel.h Lib/distutils/__init__.py Lib/idlelib/idlever.py Misc/NEWS Misc/RPM/python-3.1.spec README In-Reply-To: <20101113172857.00DBFEEAC5@mail.python.org> References: <20101113172857.00DBFEEAC5@mail.python.org> Message-ID: <4CDED715.1070100@udel.edu> On 11/13/2010 12:28 PM, benjamin.peterson wrote: > Author: benjamin.peterson > Date: Sat Nov 13 18:28:56 2010 > New Revision: 86453 > Modified: python/branches/release31-maint/README > ============================================================================== > --- python/branches/release31-maint/README (original) > +++ python/branches/release31-maint/README Sat Nov 13 18:28:56 2010 > @@ -1,5 +1,5 @@ > -This is Python version 3.1.2 > -============================ > +This is Python version 3.1.2 release candidate 1 > +================================================ That should be 3.1.3 ;-) From janssen at parc.com Sat Nov 13 21:56:11 2010 From: janssen at parc.com (Bill Janssen) Date: Sat, 13 Nov 2010 12:56:11 PST Subject: [Python-Dev] [Python-checkins] r86441 - python/branches/py3k/Lib/test/test_nntplib.py In-Reply-To: <20101113134025.5604fc9c@pitrou.net> References: <20101113002853.526F8EEA40@mail.python.org> <4CDDEA85.6050907@udel.edu> <20101113130839.1c315e45@pitrou.net> <92814936-A0FC-403A-B3BA-46AE3085594B@fuhm.net> <20101113134025.5604fc9c@pitrou.net> Message-ID: <47826.1289681771@parc.com> Antoine Pitrou wrote: > On Sat, 13 Nov 2010 07:30:05 -0500 > James Y Knight wrote: > > On Nov 13, 2010, at 7:08 AM, Antoine Pitrou wrote: > > > Funny, it shows that the NNTP SSL tests don't check the certificate, > > > then. > > > > Unsurprising, given that you need 140 lines of pretty non-obvious python code to do so... 
> > You must have missed the new match_hostname() function: > http://docs.python.org/dev/library/ssl.html#ssl.match_hostname > > (its implementation is 50 lines rather than 140 lines, though) On the client side, it's pretty easy to see an invalid (say, expired) certificate. Just call get_server_certificate(), which will fail if the server certificate is invalid. That's a separate issue from matching the request hostname to the various host identifiers in the certificate, which various application protocols may or may not require. Bill From benjamin at python.org Sun Nov 14 00:08:10 2010 From: benjamin at python.org (Benjamin Peterson) Date: Sat, 13 Nov 2010 17:08:10 -0600 Subject: [Python-Dev] [RELEASED] Python 3.1.3 release candidate 1 Message-ID: On behalf of the Python development team, I'm gladsome to announce a release candidate of the third bugfix release for the Python 3.1 series, Python 3.1.3. This bug fix release fixes numerous issues found in 3.1.2. Please try it with your packages and report any bugs you find. The final of 3.1.3 is scheduled to be released in two weeks. The Python 3.1 version series focuses on the stabilization and optimization of the features and changes that Python 3.0 introduced. For example, the new I/O system has been rewritten in C for speed. File system APIs that use unicode strings now handle paths with undecodable bytes in them. Other features include an ordered dictionary implementation, a condensed syntax for nested with statements, and support for ttk Tile in Tkinter. For a more extensive list of changes in 3.1, see http://doc.python.org/3.1/whatsnew/3.1.html or Misc/NEWS in the Python distribution. To download Python 3.1.3 visit: http://www.python.org/download/releases/3.1.3/ A list of changes in 3.1.3 can be found here: http://svn.python.org/projects/python/tags/r313rc1/Misc/NEWS The 3.1 documentation can be found at: http://docs.python.org/3.1 Bugs can always be reported to: http://bugs.python.org Enjoy! 
-- Benjamin Peterson Release Manager benjamin at python.org (on behalf of the entire python-dev team and 3.1.3's contributors) From benjamin at python.org Sun Nov 14 00:12:22 2010 From: benjamin at python.org (Benjamin Peterson) Date: Sat, 13 Nov 2010 17:12:22 -0600 Subject: [Python-Dev] [RELEASED] Python 2.7.1 release candidate 1 Message-ID: On behalf of the Python development team, I'm chuffed to announce a release candidate of Python 2.7.1. Please test the release candidate with your packages and report any bugs you find. 2.7.1 final is scheduled for release in two weeks. 2.7 includes many features that were first released in Python 3.1. The faster io module, the new nested with statement syntax, improved float repr, set literals, dictionary views, and the memoryview object have been backported from 3.1. Other features include an ordered dictionary implementation, unittest improvements, a new sysconfig module, auto-numbering of fields in the str/unicode format method, and support for ttk Tile in Tkinter. For a more extensive list of changes in 2.7, see http://doc.python.org/dev/whatsnew/2.7.html or Misc/NEWS in the Python distribution. To download Python 2.7.1 visit: http://www.python.org/download/releases/2.7.1/ The 2.7.1 changelog is at: http://svn.python.org/projects/python/tags/r271rc1/Misc/NEWS 2.7 documentation can be found at: http://docs.python.org/2.7/ This is a testing release, so we encourage developers to test it with their applications and libraries. Please report any bugs you find, so they can be fixed in the final release. The bug tracker is at: http://bugs.python.org/ Enjoy!
-- Benjamin Peterson Release Manager benjamin at python.org (on behalf of the entire python-dev team and 2.7.1's contributors) From victor.stinner at haypocalc.com Sun Nov 14 01:06:55 2010 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Sun, 14 Nov 2010 01:06:55 +0100 Subject: [Python-Dev] Removal of Win32 ANSI API In-Reply-To: <4CDEBB11.5050209@m2.ccsnet.ne.jp> References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> <201011121308.30368.victor.stinner@haypocalc.com> <4CDEBB11.5050209@m2.ccsnet.ne.jp> Message-ID: <201011140106.55153.victor.stinner@haypocalc.com> On Saturday 13 November 2010 17:21:37 you wrote: > On 2010/11/12 4:26, Victor Stinner wrote: > > On Thursday 11 November 2010 17:07:28 Hirokazu Yamamoto wrote: > >> Hello. Is it possible to remove Win32 ANSI API (ie: GetFileAttributesA) > >> and only use Win32 WIDE API (ie: GetFileAttributesW)? > >> Mainly in posixmodule.c. > > > > Even if I hate the MBCS encoding, because it replaces undecodable > > characters > > > by similar glyphs by default, I'm not certain that it is a good idea > > to drop > > > the bytes API. > > On 2010/11/12 21:08, Victor Stinner wrote: > > On Thursday 11 November 2010 23:01:32 you wrote: > >>> Sure, it will divide the number of lines, of the code specific to > >>> Windows, by two. > >> > >> Can we get most of the code cleanup benefit without the backwards > >> compatibility risk by doing the decode from 'mbcs' on our side of the > >> fence? > > > > I created PyUnicode_FSDecoder, a ParseTuple converter used to work on > > unicode paths, instead of bytes paths. On Windows, this converter uses > > mbcs encoding in strict mode, whereas Windows converter uses replace > > error handler to decode, and ignore to encode. So I don't think that we > > should this converter on Windows. 
> > > >> That is, have code that was the C equivalent of: > >> > >> arg_is_bytes = not isinstance(arg, str) > >> > >> if arg_is_bytes: > >> val = _decode_mbcs(arg) > >> # Decoding error checking here > >> > >> else: > >> val = arg > >> > >> # Common processing using WIDE API > >> > >> if arg_is_bytes: > >> result = _encode_mbcs(wide_result) > >> # Encoding error checking here > >> > >> else: > >> result = wide_result > > > > This doesn't make the code shorter, it may be longer than the actual > > code, and it is less compliant with the Windows native API... > > Is it possible to implement new PyArg_ParseTuple converter to use > PyUnicode_Decode(const char *s, > Py_ssize_t size, > const char *encoding, /* mbcs */ > const char *errors) /* replace */ > and use it? Yes, but how do you check if the input argument is a bytes or a str object with your PyArg_Parse converter? You should use "O" format and manually convert it to unicode, and then convert the result back to bytes (if the input was bytes). I don't think that it makes the code shorter. The code is currently working. The question is if we have to drop the ANSI API now, later or never. It looks like the decision moves to "later" (deprecate in 3.2, remove in 3.3). I still think that drop now doesn't really hurt. Victor From solipsis at pitrou.net Sun Nov 14 01:19:28 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 14 Nov 2010 01:19:28 +0100 Subject: [Python-Dev] Removal of Win32 ANSI API References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> <201011121308.30368.victor.stinner@haypocalc.com> <4CDEBB11.5050209@m2.ccsnet.ne.jp> <201011140106.55153.victor.stinner@haypocalc.com> Message-ID: <20101114011928.0f1e3d60@pitrou.net> On Sun, 14 Nov 2010 01:06:55 +0100 Victor Stinner wrote: > > The code is currently working. The question is if we have to drop the ANSI API > now, later or never. If the code is currently working and isn't a security hole, then we obviously don't "have to".
Apparently several developers "want to", which is different. > It looks like the decision moves to "later" (deprecate in > 3.2, remove in 3.3). I still think that drop now doesn't really hurt. If you drop code without first deprecating it, chances are it will hurt someone. That's why having a deprecation period is the rule we usually follow (most of the time :-)). Regards Antoine. From ncoghlan at gmail.com Sun Nov 14 02:06:57 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 14 Nov 2010 11:06:57 +1000 Subject: [Python-Dev] Removal of Win32 ANSI API In-Reply-To: <20101114011928.0f1e3d60@pitrou.net> References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> <201011121308.30368.victor.stinner@haypocalc.com> <4CDEBB11.5050209@m2.ccsnet.ne.jp> <201011140106.55153.victor.stinner@haypocalc.com> <20101114011928.0f1e3d60@pitrou.net> Message-ID: On Sun, Nov 14, 2010 at 10:19 AM, Antoine Pitrou wrote: > On Sun, 14 Nov 2010 01:06:55 +0100 > Victor Stinner wrote: >> >> The code is currently working. The question is if we have to drop the ANSI API >> now, later or never. > > If the code is currently working and isn't a security hole, then we > obviously don't "have to". > Apparently several developers "want to", which is different. We should also keep in mind that *Microsoft* have chosen to keep the bytes Win32 APIs around, despite their flaws, all in the name of backwards compatibility. While the goal of nudging third party developers towards the superior Unicode APIs is an admirable one, it is still the case that there is a *lot* of ASCII-only code out there. E.g. applications could easily be storing filenames in an ASCII only datastore that provides them back to the application as bytes in 3.x. >> It looks like the decision moves to "later" (deprecate in >> 3.2, remove in 3.3). I still think that drop now doesn't really hurt. > > If you drop code without first deprecating it, chances are it will > hurt someone. 
That's why having a deprecation period is the rule we > usually follow (most of the time :-)). Indeed. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sun Nov 14 02:28:31 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 14 Nov 2010 11:28:31 +1000 Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications Message-ID: Following the python-checkins list, I get to see both the current SVN notifications and the Hg notifications from Tarek's pushes into the distutils repository. I realised today that there is one key reason as to why the latter strikes me as a big wall of unintelligible text, while I find the SVN notification quite easy to read: vertical whitespace. The SVN notification uses vertical whitespace to separate out the log message and the list of files affected clearly from the rest of the header fields. It makes it *really* easy to see at a glance what the checkin was about and which files were affected. For the Hg notification, both of these fields are embedded in a big header block along with all the other fields, so it is quite difficult to make out the same information. It would be really nice if the formatting could be improved for the email notifications on the Hg side when we adopt it for the main CPython repository.
The changes would be to: - add a blank line before and after the summary field - add a carriage return between the header and content for the summary field and the files field - indent the list of files by two spaces and use a carriage return rather than a comma to separate named files I've included an example below based on one of Tarek's recent pushes: Current Hg notification header and start of first diff: ================================================ tarek.ziade pushed 7ebf14ab2840 to distutils2: http://hg.python.org/distutils2/rev/7ebf14ab2840 changeset: 816:7ebf14ab2840 tag: tip user: Tarek Ziade date: Sat Nov 13 12:40:33 2010 +0100 summary: compiler_type -> name files: distutils2/compiler/__init__.py, distutils2/compiler/bcppcompiler.py, distutils2/compiler/ccompiler.py, distutils2/compiler/cygwinccompiler.py, distutils2/compiler/msvc9compiler.py, distutils2/compiler/msvccompiler.py, distutils2/compiler/unixccompiler.py, distutils2/tests/test_config.py diff --git a/distutils2/compiler/__init__.py b/distutils2/compiler/__init__.py --- a/distutils2/compiler/__init__.py +++ b/distutils2/compiler/__init__.py @@ -13,7 +13,7 @@ ==================================================== Proposed change to separate out summary and files fields: ================================================ tarek.ziade pushed 7ebf14ab2840 to distutils2: http://hg.python.org/distutils2/rev/7ebf14ab2840 changeset: 816:7ebf14ab2840 tag: tip user: Tarek Ziade date: Sat Nov 13 12:40:33 2010 +0100 summary: compiler_type -> name files: distutils2/compiler/__init__.py distutils2/compiler/bcppcompiler.py distutils2/compiler/ccompiler.py distutils2/compiler/cygwinccompiler.py distutils2/compiler/msvc9compiler.py distutils2/compiler/msvccompiler.py distutils2/compiler/unixccompiler.py distutils2/tests/test_config.py diff --git a/distutils2/compiler/__init__.py b/distutils2/compiler/__init__.py --- a/distutils2/compiler/__init__.py +++ b/distutils2/compiler/__init__.py @@ -13,7 +13,7 @@ 
==================================================== Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From db3l.net at gmail.com Sun Nov 14 03:40:22 2010 From: db3l.net at gmail.com (David Bolen) Date: Sat, 13 Nov 2010 21:40:22 -0500 Subject: [Python-Dev] Stable buildbots References: <20101113133712.60e9be27@pitrou.net> Message-ID: Antoine Pitrou writes: > (even though the Windows buildbots give > a rather unconventional meaning to the word "stability"). Nag, nag, nag .... :-) There's been a bit of an uptick in the past few weeks with hung python_d processes (not a new issue, but it ebbs and flows), so I'm going to try to pull together a monitor script this weekend to start killing them off automatically. Should at least get rid of some of the low hanging fruit that interferes with subsequent builds. -- David From tjreedy at udel.edu Sun Nov 14 04:10:11 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 13 Nov 2010 22:10:11 -0500 Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications In-Reply-To: References: Message-ID: On 11/13/2010 8:28 PM, Nick Coghlan wrote: > Following the python-checkins list, I get to see both the current SVN > notifications and the Hg notifications from Tarek's pushes into the > distutils repository. I realised today that there is one key reason as > to why the latter strikes me as a big wall of unintelligible text, > while I find the SVN notification quite easy to read: vertical > whitespace. > > The SVN notification uses vertical whitespace to separate out the log > message and the list of files affected clearly from the rest of the > header fields. It makes it *really* easy to see at a glance what the > checkin was about and which files were affected. For the Hg > notification, both of these fields are embedded in a big header block > along with all the other fields, so it is quite difficult to make out > the same information. 
> > It would be really nice if the formatting could be improved for the > email notifications on the Hg side when we adopt it for the main > CPython repository. The changes would be to: > - add a blank line before and after the summary field > - add a carriage return between the header and content for the summary > field and the files field > - indent the list of files by two spaces and use a carriage return > rather than a comma to separate named files > > I've included an example below based on one of Tarek's recent pushes: > > Current Hg notification header and start of first diff: > ================================================ > tarek.ziade pushed 7ebf14ab2840 to distutils2: > > http://hg.python.org/distutils2/rev/7ebf14ab2840 > changeset: 816:7ebf14ab2840 > tag: tip > user: Tarek Ziade > date: Sat Nov 13 12:40:33 2010 +0100 > summary: compiler_type -> name > files: distutils2/compiler/__init__.py, > distutils2/compiler/bcppcompiler.py, distutils2/compiler/ccompiler.py, > distutils2/compiler/cygwinccompiler.py, > distutils2/compiler/msvc9compiler.py, > distutils2/compiler/msvccompiler.py, > distutils2/compiler/unixccompiler.py, distutils2/tests/test_config.py > > diff --git a/distutils2/compiler/__init__.py b/distutils2/compiler/__init__.py > --- a/distutils2/compiler/__init__.py > +++ b/distutils2/compiler/__init__.py > @@ -13,7 +13,7 @@ > ==================================================== > > Proposed change to separate out summary and files fields: > ================================================ > tarek.ziade pushed 7ebf14ab2840 to distutils2: > > http://hg.python.org/distutils2/rev/7ebf14ab2840 > changeset: 816:7ebf14ab2840 > tag: tip > user: Tarek Ziade > date: Sat Nov 13 12:40:33 2010 +0100 > > summary: > compiler_type -> name > > files: > distutils2/compiler/__init__.py > distutils2/compiler/bcppcompiler.py > distutils2/compiler/ccompiler.py > distutils2/compiler/cygwinccompiler.py > distutils2/compiler/msvc9compiler.py > 
>   distutils2/compiler/msvccompiler.py
>   distutils2/compiler/unixccompiler.py
>   distutils2/tests/test_config.py
>
> diff --git a/distutils2/compiler/__init__.py b/distutils2/compiler/__init__.py
> --- a/distutils2/compiler/__init__.py
> +++ b/distutils2/compiler/__init__.py
> @@ -13,7 +13,7 @@
> ====================================================

Much better except possibly for \n after 'summary:'

-- 
Terry Jan Reedy

From rdmurray at bitdance.com  Sun Nov 14 04:40:52 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Sat, 13 Nov 2010 22:40:52 -0500
Subject: [Python-Dev] unexpected traceback/stack behavior with chained exceptions (issue 1553375)
Message-ID: <20101114034052.39AE81FC192@kimball.webabinitio.net>

Issue 1553375 [1] proposes a patch to add an 'allframes' option to the
traceback printing and formatting routines so that the full traceback
from the top of the execution stack down to the exception is printed,
instead of just from the point where the exception is caught down to
the exception.  This is useful when the reason you are capturing the
traceback is to log it, and you have several different points in your
application where you do such traceback logging.  You often really want
to know the entire stack in that case; logging only from the capture
point down can lose important debugging information depending on how
the application is structured.

The patch seems to work well, except for one problem that I don't have
enough CPython internals knowledge to understand.  If the traceback we
are printing has a chained traceback, the resulting full traceback shows
the line that is printing the traceback instead of the line from the 'try'
block.  (It prints the expected line if there is no chained traceback).

So, is this a failure in my understanding of how tracebacks are supposed
to work, or a bug in how chained tracebacks are constructed?

[1] http://bugs.python.org/issue1553375

--
R.
David Murray                      www.bitdance.com

From ncoghlan at gmail.com  Sun Nov 14 09:22:31 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 14 Nov 2010 18:22:31 +1000
Subject: [Python-Dev] Stable buildbots
In-Reply-To:
References: <20101113133712.60e9be27@pitrou.net>
Message-ID:

On Sun, Nov 14, 2010 at 12:40 PM, David Bolen wrote:
> Antoine Pitrou writes:
>
>> (even though the Windows buildbots give
>> a rather unconventional meaning to the word "stability").
>
> Nag, nag, nag .... :-)
>
> There's been a bit of an uptick in the past few weeks with hung
> python_d processes (not a new issue, but it ebbs and flows), so I'm
> going to try to pull together a monitor script this weekend to start
> killing them off automatically.  Should at least get rid of some of
> the low hanging fruit that interferes with subsequent builds.

Do we have any idea why the workaround to avoid the popup windows
stopped working? (assuming it ever worked reliably - I thought it did,
but that impression may have been incorrect)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun Nov 14 09:25:27 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 14 Nov 2010 18:25:27 +1000
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To:
References:
Message-ID:

On Sun, Nov 14, 2010 at 1:10 PM, Terry Reedy wrote:
> Much better except possibly for \n after 'summary:'

That extra line break helps more for multi-line checkin messages
(which happen reasonably often). Doesn't really bother me either way -
I'm mainly looking for info on who has the ability to change the
format in the first place :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |
Brisbane, Australia

From db3l.net at gmail.com  Sun Nov 14 09:48:53 2010
From: db3l.net at gmail.com (David Bolen)
Date: Sun, 14 Nov 2010 03:48:53 -0500
Subject: [Python-Dev] Stable buildbots
References: <20101113133712.60e9be27@pitrou.net>
Message-ID:

Nick Coghlan writes:

> Do we have any idea why the workaround to avoid the popup windows
> stopped working? (assuming it ever worked reliably - I thought it did,
> but that impression may have been incorrect)

Oh, the pop-up handling for the RTL dialogs still seems to be working
fine (at least I haven't seen any since I put it in place).  That, plus
the original buildbot tweaks to block any OS popups, still looks solid
for avoiding any dialogs that block a test process.

This is a completely separate issue, though probably around just as
long, and like the popup problem its frequency changes over time.  By
"hung" here I'm referring to cases where something must go wrong with
a test and/or its cleanup such that a python_d process remains
running, usually several of them at the same time.  So I end up with a
bunch of python_d processes in the background (but not with any
dialogs pending), which eventually cause errors during attempts the
next time the same builder is used since the file remains in use.

I expect some of this may be the lack of a good process group cleanup
under Windows, though the root cause may not be unique to Windows.  I
see something very similar with reasonable frequency on my OSX Tiger
buildbot as well.  But since the filesystem there can let the build
tree get cleaned and rebuilt even with a stranded executable, the
impact on subsequent tests is smaller than on Windows, though the OSX
processes do burn a ton of CPU.

I run a script on OSX to kill them off, but that was quick to whip up
since in those cases the stranded processes all end up getting owned
by init so it's a simple ps grep and kill.
In the Windows case I'll probably just set a time limit so if the
processes have been around more than a few hours I figure they're safe
to kill.

-- David

From martin at v.loewis.de  Sun Nov 14 11:09:08 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 14 Nov 2010 11:09:08 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <20101114011928.0f1e3d60@pitrou.net>
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> <201011121308.30368.victor.stinner@haypocalc.com> <4CDEBB11.5050209@m2.ccsnet.ne.jp> <201011140106.55153.victor.stinner@haypocalc.com> <20101114011928.0f1e3d60@pitrou.net>
Message-ID: <4CDFB544.7000809@v.loewis.de>

> If the code is currently working and isn't a security hole, then we
> obviously don't "have to".
> Apparently several developers "want to", which is different.

In case the motivation for that isn't clear: it would produce a
significant code reduction, and therefore ease maintenance.

Regards,
Martin

From martin at v.loewis.de  Sun Nov 14 11:14:27 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 14 Nov 2010 11:14:27 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To:
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> <201011121308.30368.victor.stinner@haypocalc.com> <4CDEBB11.5050209@m2.ccsnet.ne.jp> <201011140106.55153.victor.stinner@haypocalc.com> <20101114011928.0f1e3d60@pitrou.net>
Message-ID: <4CDFB683.5000709@v.loewis.de>

> We should also keep in mind that *Microsoft* have chosen to keep the
> bytes Win32 APIs around, despite their flaws, all in the name of
> backwards compatibility.

Of course, Microsoft is in a different position. If they remove a
functionality in some release, their users typically can't go back
and continue to use the old version - at least not on the same
computer.

For Python, it's different: our users can go back to use an old
version if the new one breaks their applications.
And we do break applications from time to time, most notably with
the introduction of Python 3.

> While the goal of nudging third party
> developers towards the superior Unicode APIs is an admirable one, it
> is still the case that there is a *lot* of ASCII-only code out there.

The question is: is there also a lot of ASCII-only Python 3 software
out there? And would developers of such software have difficulty
porting it to a Unicode file name API?

> E.g. applications could easily be storing filenames in an ASCII only
> datastore that provides them back to the application as bytes in 3.x.

That's speculation. My speculation would be that authors of such a
datastore find that they can't even print the data anymore in a
reasonable way, so they changed their API to return strings (i.e.
decoding from ASCII) when they ported it to Python 3. They wouldn't
even consider it a change, because it returned strings all the time,
and now Python 3 has a different string type.

>> If you drop code without first deprecating it, chances are it will
>> hurt someone. That's why having a deprecation period is the rule we
>> usually follow (most of the time :-)).

I'm in favor of deprecating it first.

Regards,
Martin

From martin at v.loewis.de  Sun Nov 14 11:18:07 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 14 Nov 2010 11:18:07 +0100
Subject: [Python-Dev] Stable buildbots
In-Reply-To:
References: <20101113133712.60e9be27@pitrou.net>
Message-ID: <4CDFB75F.7020802@v.loewis.de>

> This is a completely separate issue, though probably around just as
> long, and like the popup problem its frequency changes over time. By
> "hung" here I'm referring to cases where something must go wrong with
> a test and/or its cleanup such that a python_d process remains
> running, usually several of them at the same time.
> So I end up with a
> bunch of python_d processes in the background (but not with any
> dialogs pending), which eventually cause errors during attempts the
> next time the same builder is used since the file remains in use.

This is what kill_python.exe is supposed to solve. So I recommend
investigating why it fails to kill the hanging Pythons.

Regards,
Martin

From martin at v.loewis.de  Sun Nov 14 11:20:47 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 14 Nov 2010 11:20:47 +0100
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To:
References:
Message-ID: <4CDFB7FF.1000300@v.loewis.de>

On 14.11.2010 09:25, Nick Coghlan wrote:
> On Sun, Nov 14, 2010 at 1:10 PM, Terry Reedy wrote:
>> Much better except possibly for \n after 'summary:'
>
> That extra line break helps more for multi-line checkin messages
> (which happen reasonably often). Doesn't really bother me either way -
> I'm mainly looking for info on who has the ability to change the
> format in the first place :)

See

http://hg.python.org/hooks/

You should have push permissions to that repository.

Regards,
Martin

From db3l.net at gmail.com  Sun Nov 14 11:32:25 2010
From: db3l.net at gmail.com (David Bolen)
Date: Sun, 14 Nov 2010 05:32:25 -0500
Subject: [Python-Dev] Stable buildbots
References: <20101113133712.60e9be27@pitrou.net> <4CDFB75F.7020802@v.loewis.de>
Message-ID:

"Martin v. Löwis" writes:

> This is what kill_python.exe is supposed to solve. So I recommend
> investigating why it fails to kill the hanging Pythons.

Yeah, I know, and I can't say I disagree in principle - not sure why
Windows doesn't let the kill in that module work (or if there's an
issue actually running it under all conditions).
At the moment though, I do know that using the sysinternals pskill
utility externally (which is what I currently do interactively)
definitely works so to be honest, automating that is a guaranteed bang
for buck at this point with no analysis involved.  Looking into
kill_python or its use can be a follow-on.

-- David

From ncoghlan at gmail.com  Sun Nov 14 12:41:59 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 14 Nov 2010 21:41:59 +1000
Subject: [Python-Dev] unexpected traceback/stack behavior with chained exceptions (issue 1553375)
In-Reply-To: <20101114034052.39AE81FC192@kimball.webabinitio.net>
References: <20101114034052.39AE81FC192@kimball.webabinitio.net>
Message-ID:

On Sun, Nov 14, 2010 at 1:40 PM, R. David Murray wrote:
> Issue 1553375 [1] proposes a patch to add an 'allframes' option to the
> traceback printing and formatting routines so that the full traceback
> from the top of the execution stack down to the exception is printed,
> instead of just from the point where the exception is caught down to
> the exception.  This is useful when the reason you are capturing the
> traceback is to log it, and you have several different points in your
> application where you do such traceback logging.  You often really want
> to know the entire stack in that case; logging only from the capture
> point down can lose important debugging information depending on how
> the application is structured.
>
> The patch seems to work well, except for one problem that I don't have
> enough CPython internals knowledge to understand.  If the traceback we
> are printing has a chained traceback, the resulting full traceback shows
> the line that is printing the traceback instead of the line from the 'try'
> block.  (It prints the expected line if there is no chained traceback).
>
> So, is this a failure in my understanding of how tracebacks are supposed
> to work, or a bug in how chained tracebacks are constructed?

It looks to me like you're grabbing a reference to a frame that is
currently executing and that frame has moved on since the exception
was thrown (to your exception handler). The print_stack() call in the
patch then accurately reflects this.

The other thing to keep in mind is that the exception currently being
handled is the *last* one produced by _iter_chain - all of the rest
have already been caught and handled, it was the handlers for those
that raised the subsequent exceptions in the chain.

Basically, the whole patch strikes me as fundamentally misguided. If
someone wants this information in their exception handler, they should
put a print_stack() with the appropriate header information after the
print_exception() call rather than trying to embed it in the display
of the exception information. logging could also gain an independent
"stack_trace=True" option to request inclusion of a stack trace
independently of whether or not exception information is included.

(Side note: there's a typo in the format_tb docstring claiming it is a
wrapper around extract_stack - that's incorrect, it is a wrapper
around extract_tb)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun Nov 14 12:44:19 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 14 Nov 2010 21:44:19 +1000
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <4CDFB683.5000709@v.loewis.de>
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> <201011121308.30368.victor.stinner@haypocalc.com> <4CDEBB11.5050209@m2.ccsnet.ne.jp> <201011140106.55153.victor.stinner@haypocalc.com> <20101114011928.0f1e3d60@pitrou.net> <4CDFB683.5000709@v.loewis.de>
Message-ID:

On Sun, Nov 14, 2010 at 8:14 PM, "Martin v. Löwis" wrote:
> I'm in favor of deprecating it first.

Aye. I've made the best case I could for keeping it, and even I don't
find it terribly convincing. So deprecation for 3.2 sounds like a
reasonable option.

Regards,
Nick.

-- 
Nick Coghlan   |
ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun Nov 14 12:46:41 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 14 Nov 2010 21:46:41 +1000
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <4CDFB7FF.1000300@v.loewis.de>
References: <4CDFB7FF.1000300@v.loewis.de>
Message-ID:

On Sun, Nov 14, 2010 at 8:20 PM, "Martin v. Löwis" wrote:
> On 14.11.2010 09:25, Nick Coghlan wrote:
>> On Sun, Nov 14, 2010 at 1:10 PM, Terry Reedy wrote:
>>> Much better except possibly for \n after 'summary:'
>>
>> That extra line break helps more for multi-line checkin messages
>> (which happen reasonably often). Doesn't really bother me either way -
>> I'm mainly looking for info on who has the ability to change the
>> format in the first place :)
>
> See
>
> http://hg.python.org/hooks/
>
> You should have push permissions to that repository.

Thanks - it will give me a chance to use Hg for something meaningful as well.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun Nov 14 13:39:40 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 14 Nov 2010 22:39:40 +1000
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <4CDFB7FF.1000300@v.loewis.de>
References: <4CDFB7FF.1000300@v.loewis.de>
Message-ID:

On Sun, Nov 14, 2010 at 8:20 PM, "Martin v. Löwis" wrote:
> See
>
> http://hg.python.org/hooks/
>
> You should have push permissions to that repository.

I suspect my hg-fu is inadequate at this point - I get an 'access
to repository "hg.python.org/hooks" not permitted' error when I try to
push the modified file to "ssh://hg at hg.python.org/hooks". (I actually
got the same error when cloning, but if I understand hg correctly, it
shouldn't matter that my clone came from the http URL rather than the
ssh one).

My username and email address in my hgrc file match those in Dirkjan's
author map, so I'm not sure what's going on there.

The change I tried to make was to replace the last couple of lines of
the header creation in mail.py's incoming() function with the following
3 lines:

    body += log.splitlines()[:-2]
    body += ['summary:\n ' + ctx.description(), '']
    body += ['files:\n ' + '\n '.join(ctx.files()), '']

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From g.brandl at gmx.net  Sun Nov 14 14:05:12 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 14 Nov 2010 14:05:12 +0100
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To:
References: <4CDFB7FF.1000300@v.loewis.de>
Message-ID:

On 14.11.2010 13:39, Nick Coghlan wrote:
> On Sun, Nov 14, 2010 at 8:20 PM, "Martin v. Löwis" wrote:
>> See
>>
>> http://hg.python.org/hooks/
>>
>> You should have push permissions to that repository.
>
> I suspect my hg-fu is inadequate at this point - I get an 'access
> to repository "hg.python.org/hooks" not permitted' error when I try to
> push the modified file to "ssh://hg at hg.python.org/hooks".

Martin told you only half the truth: the SSH URL is (currently) .
I think we will change that to remove the /repos/ part before going
live with the cpython repo, but the hg username remains, corresponding
to the pythondev username for SVN.

> (I actually
> got the same error when cloning, but if I understand hg correctly, it
> shouldn't matter that my clone came from the http URL rather than the
> ssh one).

That's correct.

> My username and email address in my hgrc file match those in Dirkjan's
> author map, so I'm not sure what's going on there.

The usernames and email addresses you use for commits don't matter; as
long as you can connect via SSH you can push commits with any author.
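
For anyone who wants to try the proposed layout without a Mercurial
checkout, the three-line change to mail.py quoted earlier in this thread
can be approximated stand-alone. This is only a sketch under stated
assumptions: format_notification and its plain arguments are
hypothetical stand-ins for the hook's log text, ctx.description() and
ctx.files(), and the two-space indent follows the wording of the
proposal rather than the actual hook code.

```python
def format_notification(header_lines, description, files):
    """Build notification text with 'summary' and 'files' set apart.

    Hypothetical helper: the real hook works on a Mercurial changectx,
    which is replaced here by plain strings and lists.
    """
    body = list(header_lines)
    # Blank line, then the summary on its own indented line(s).
    body += ['', 'summary:']
    body += ['  ' + line for line in description.splitlines()]
    body += ['']
    # One indented file name per line instead of a comma-separated run.
    body += ['files:']
    body += ['  ' + name for name in files]
    body += ['']
    return '\n'.join(body)


text = format_notification(
    ['changeset:   816:7ebf14ab2840', 'user:        Tarek Ziade'],
    'compiler_type -> name',
    ['distutils2/compiler/__init__.py', 'distutils2/tests/test_config.py'],
)
print(text)
```

Splitting the description with splitlines() is a design choice so that
multi-line checkin messages keep the same indent on every line.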

cheers,
Georg

From p.f.moore at gmail.com  Sun Nov 14 18:31:19 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 14 Nov 2010 17:31:19 +0000
Subject: [Python-Dev] Issues 9931 and 9055 - test_ttk_guionly and buildbot run as a service
In-Reply-To:
References:
Message-ID:

On 12 November 2010 17:07, Terry Reedy wrote:
> On 11/12/2010 3:44 AM, Paul Moore wrote:
>>
>> Hi,
>> My buildbot has been failing for some time because of these 2 issues,
>> both related to the fact that tests are hanging when run as a service
>> (and hence have no display to open GUI elements on). Both issues have
>> patches, and as far as I am aware, the patches fix the issues
>> reasonably well. What can I do to move these 2 issues forwards? As
>> things stand, my buildbot is not providing a lot of value on the 3.x
>> branch :-(
>
> http://bugs.python.org/issue9055
> is marked as a 2.7 issue only, perhaps fixed by Tim Golden's committed
> patches. Should it be re-versioned for 3.1/2? There is no patch file
> attached, though perhaps the code in Yamamoto's message is meant as such
> (but for which version?). So the first thing you could do is clarify the
> current status and remaining issue on the tracker.

Ah, sorry. I misremembered the history - you are right, I suspect this
is fixed (at least to the extent that my buildbot isn't permanently
red :-)) On rereading, I get the impression that a cleaner fix may be
possible by using the ideas in the patch for 9931, but that's probably
for another time.

> http://bugs.python.org/issue9931
> by Yamamoto is marked for all 3 versions. It seems to be a similar issue,
> though marked 'test' rather than 'ctypes'. It does have a patch by him
> apparently based on his previous comments. The issue has no responses and
> needs a patch review. So the first thing you could do is to provide one;-).
> If it looks great (no changes that you can think of) and works great, say
> so. Then it can move on to commit review stage.

OK, thanks.
I'll see if I can provide a review, and see how it goes
from there.

Really, it's not that urgent that this gets fixed in the wider scheme
of things - but as my buildbot is a bit useless while the problem
remains, I'm motivated to do what I can to work on it. I'm just a
little limited in what I can do, hence the request for suggestions.

> PS. Providing links like the above makes it easier for multiple people to
> take a look and respond.

You're right, and I apologise for that. I sent the email in a hurry
and didn't consider others before sending.

Paul

From p.f.moore at gmail.com  Sun Nov 14 18:49:36 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Sun, 14 Nov 2010 17:49:36 +0000
Subject: [Python-Dev] Stable buildbots
In-Reply-To:
References: <20101113133712.60e9be27@pitrou.net>
Message-ID:

On 14 November 2010 02:40, David Bolen wrote:
> There's been a bit of an uptick in the past few weeks with hung
> python_d processes (not a new issue, but it ebbs and flows), so I'm
> going to try to pull together a monitor script this weekend to start
> killing them off automatically.  Should at least get rid of some of
> the low hanging fruit that interferes with subsequent builds.

My buildslave (x86 XP-5, see
http://www.python.org/dev/buildbot/buildslaves/moore-windows) runs
buildbot as a service. I set it up that way as I assumed that would be
the most sensible approach to avoid manual intervention on reboots,
keeping a user session permanently running, etc. But it seems that
there are a few areas where things don't work quite right when run
from a service (see, for example, http://bugs.python.org/issue9931)
and I assumed that some of my hung python_d processes were related to
that.

Do you run your slave as a service? (And for that matter, what do
other Windows slave owners do?) Are there any "best practices" for
ongoing admin of a Windows buildslave that might be worth collecting
together?
(I'll try to put some notes on what I've found together -
maybe a page on the Python wiki would be the best place to collect
them).

Paul.

From martin at v.loewis.de  Sun Nov 14 19:27:22 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 14 Nov 2010 19:27:22 +0100
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To:
References: <4CDFB7FF.1000300@v.loewis.de>
Message-ID: <4CE02A0A.1070207@v.loewis.de>

> I suspect my hg-fu is inadequate at this point - I get an 'access
> to repository "hg.python.org/hooks" not permitted' error when I try to
> push the modified file to "ssh://hg at hg.python.org/hooks".

Try

ssh://hg at hg.python.org/repos/hooks

I think this is something that needs to be fixed: I fail to see the
point of having this extra repos/ directory in the path (even though
it's certainly useful to have them all in a separate directory on disk).

It's also unfortunate that hg complains it can't give access to /hooks,
when the problem really is that the repository doesn't exist. I guess
this is because it tries to create it, and then finds that it can't.

Regards,
Martin

From solipsis at pitrou.net  Sun Nov 14 19:35:07 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 14 Nov 2010 19:35:07 +0100
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
References: <4CDFB7FF.1000300@v.loewis.de> <4CE02A0A.1070207@v.loewis.de>
Message-ID: <20101114193507.7959c860@pitrou.net>

On Sun, 14 Nov 2010 19:27:22 +0100
"Martin v. Löwis" wrote:
> > I suspect my hg-fu is inadequate at this point - I get an 'access
> > to repository "hg.python.org/hooks" not permitted' error when I try to
> > push the modified file to "ssh://hg at hg.python.org/hooks".
>
> Try
>
> ssh://hg at hg.python.org/repos/hooks
>
> I think this is something that needs to be fixed: I fail to see the
> point of having this extra repos/ directory in the path (even though
> it's certainly useful to have them all in a separate directory on disk).

IIUC, "repos/hooks" is interpreted as a relative path to the "hg"
user's HOME. The "ssh://" scheme executes remote hg over an ssh
session, I don't think there's any additional magic.

Regards

Antoine.

From martin at v.loewis.de  Sun Nov 14 19:49:44 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 14 Nov 2010 19:49:44 +0100
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <20101114193507.7959c860@pitrou.net>
References: <4CDFB7FF.1000300@v.loewis.de> <4CE02A0A.1070207@v.loewis.de> <20101114193507.7959c860@pitrou.net>
Message-ID: <4CE02F48.3040207@v.loewis.de>

>> I think this is something that needs to be fixed: I fail to see the
>> point of having this extra repos/ directory in the path (even though
>> it's certainly useful to have them all in a separate directory on disk).
>
> IIUC, "repos/hooks" is interpreted as a relative path to the "hg"
> user's HOME. The "ssh://" scheme executes remote hg over an ssh
> session, I don't think there's any additional magic.

Correct. However, this just means that additional magic is required.

Regards,
Martin

From vinay_sajip at yahoo.co.uk  Sun Nov 14 21:05:16 2010
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Sun, 14 Nov 2010 20:05:16 +0000 (UTC)
Subject: [Python-Dev] unexpected traceback/stack behavior with chained exceptions (issue 1553375)
References: <20101114034052.39AE81FC192@kimball.webabinitio.net>
Message-ID:

Nick Coghlan <ncoghlan at gmail.com> writes:

> of the exception information. logging could also gain an independent
> "stack_trace=True" option to request inclusion of a stack trace
> independently of whether or not exception information is included.

Good point, Nick.
There are times when you'd want to know how you got to a
certain point in code, irrespective of whether any exception occurred.
So your suggestion makes sense, and I'll try and see if I can get it
into 3.2. Another benefit of this is that a user only gets this if
they want it; if I were to use the allframes flag in logging, then
everyone would get the print_stack() even if they didn't want it.

Regards,

Vinay Sajip

From g.brandl at gmx.net  Sun Nov 14 21:36:37 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 14 Nov 2010 21:36:37 +0100
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <20101114193507.7959c860@pitrou.net>
References: <4CDFB7FF.1000300@v.loewis.de> <4CE02A0A.1070207@v.loewis.de> <20101114193507.7959c860@pitrou.net>
Message-ID:

On 14.11.2010 19:35, Antoine Pitrou wrote:
> On Sun, 14 Nov 2010 19:27:22 +0100
> "Martin v. Löwis" wrote:
>> > I suspect my hg-fu is inadequate at this point - I get an 'access
>> > to repository "hg.python.org/hooks" not permitted' error when I try to
>> > push the modified file to "ssh://hg at hg.python.org/hooks".
>>
>> Try
>>
>> ssh://hg at hg.python.org/repos/hooks
>>
>> I think this is something that needs to be fixed: I fail to see the
>> point of having this extra repos/ directory in the path (even though
>> it's certainly useful to have them all in a separate directory on disk).
>
> IIUC, "repos/hooks" is interpreted as a relative path to the "hg"
> user's HOME. The "ssh://" scheme executes remote hg over an ssh
> session, I don't think there's any additional magic.

There is; we already have a custom authorized_keys command in place to
call the hg-ssh wrapper, and all that's needed is to customize that
command a bit more.

Georg

From db3l.net at gmail.com  Sun Nov 14 22:24:55 2010
From: db3l.net at gmail.com (David Bolen)
Date: Sun, 14 Nov 2010 16:24:55 -0500
Subject: [Python-Dev] Stable buildbots
References: <20101113133712.60e9be27@pitrou.net>
Message-ID:

Paul Moore writes:

> Do you run your slave as a service? (And for that matter, what do
> other Windows slave owners do?) Are there any "best practices" for
> ongoing admin of a Windows buildslave that might be worth collecting
> together? (I'll try to put some notes on what I've found together -
> maybe a page on the Python wiki would be the best place to collect
> them).

I've always run my slave interactively under Windows (well, started it
interactively).  Not sure if I tried a service in the beginning or
not, it was a while ago.  So your slave is probably the guinea pig for
service operation.

There is http://wiki.python.org/moin/BuildbotOnWindows (for which I
can't take any credit).  It could probably use a little love and
updating, and it's largely aimed at setting things up, but not as much
operating it.

I think the only stuff I'm doing on my slave above and beyond the
basic setup is a small patch to buildbot (circa 2007, couldn't get it
back upstream at the time) to use SetErrorMode to disable OS pop-ups,
and the AutoIt script (from earlier this year) to auto-acknowledge C
RTL pop-ups.  The kill script in this thread as a safety net above
kill_python would be a third tweak.  There was a buildbot fix for
uploading that was only needed for the short-lived MSI generation, and
which I think later buildbot versions have their own changes for.

I'd be happy to work with you if you're willing to combine/edit our
bits of information.  Probably something we can take off-list, so
just let me know.
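
As a starting point for such notes, the "safety net" rule discussed in
this thread (only kill python_d processes that have been around for
more than a few hours) might be sketched as below. Everything here is
an assumption for illustration: the stale_pids name, the three-hour
default and the (pid, image name, start time) tuples are hypothetical,
and how the process list is gathered and how the kill is issued (pskill
was mentioned above) are deliberately left out.

```python
from datetime import datetime, timedelta


def stale_pids(processes, now, max_age=timedelta(hours=3), name="python_d.exe"):
    """Return pids of processes called `name` that are older than `max_age`.

    `processes` is an iterable of (pid, image_name, start_time) tuples;
    collecting that snapshot is outside the scope of this sketch.
    """
    return [
        pid
        for pid, image_name, started in processes
        if image_name.lower() == name and now - started > max_age
    ]


now = datetime(2010, 11, 14, 22, 0)
snapshot = [
    (101, "python_d.exe", datetime(2010, 11, 14, 9, 30)),   # stranded since morning
    (102, "python_d.exe", datetime(2010, 11, 14, 21, 45)),  # likely a live test run
    (103, "buildbot.exe", datetime(2010, 11, 13, 8, 0)),    # old, but not a test process
]
print(stale_pids(snapshot, now))
```

Keeping the selection separate from the actual kill makes the age rule
easy to check on a fake snapshot before pointing it at a real slave.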

-- David

From ncoghlan at gmail.com  Mon Nov 15 12:45:46 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 15 Nov 2010 21:45:46 +1000
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To: <4CE02A0A.1070207@v.loewis.de>
References: <4CDFB7FF.1000300@v.loewis.de> <4CE02A0A.1070207@v.loewis.de>
Message-ID:

On Mon, Nov 15, 2010 at 4:27 AM, "Martin v. Löwis" wrote:
>> I suspect my hg-fu is inadequate at this point - I get an 'access
>> to repository "hg.python.org/hooks" not permitted' error when I try to
>> push the modified file to "ssh://hg at hg.python.org/hooks".
>
> Try
>
> ssh://hg at hg.python.org/repos/hooks

And done :)

Hopefully I didn't break anything in the process...

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon Nov 15 14:24:01 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 15 Nov 2010 23:24:01 +1000
Subject: [Python-Dev] [Python-checkins] r86467 - in python/branches/py3k: Doc/library/logging.rst Lib/logging/__init__.py Misc/NEWS
In-Reply-To: <20101114213304.ED32AEE997@mail.python.org>
References: <20101114213304.ED32AEE997@mail.python.org>
Message-ID:

On Mon, Nov 15, 2010 at 7:33 AM, vinay.sajip wrote:
>
> +   .. attribute:: stack_info
> +
> +      Stack frame information (where available) from the bottom of the stack
> +      in the current thread, up to and including the stack frame of the
> +      logging call which resulted in the creation of this record.
> +

Interesting - my mental model of the call stack is that the outermost
frame is the top of the stack and the stack grows downwards as calls
are executed (there are a few idioms like "recursive descent", the
intuitive parallel with "inner functions" being lower in the stack
than "outer functions" as well as the order in which Python prints
stack traces that reinforce this view).
According to the sys._getframe documentation, my mental
model is wrong though :)

(I'll note that the documentation of frame objects in the language
reference itself appears a little confused on the matter - either that
or I'm completely misunderstanding when writing to f_lineno will work)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From reid.kleckner at gmail.com  Mon Nov 15 18:01:36 2010
From: reid.kleckner at gmail.com (Reid Kleckner)
Date: Mon, 15 Nov 2010 12:01:36 -0500
Subject: [Python-Dev] [Python-checkins] r86467 - in python/branches/py3k: Doc/library/logging.rst Lib/logging/__init__.py Misc/NEWS
In-Reply-To:
References: <20101114213304.ED32AEE997@mail.python.org>
Message-ID:

On Mon, Nov 15, 2010 at 8:24 AM, Nick Coghlan wrote:
> On Mon, Nov 15, 2010 at 7:33 AM, vinay.sajip wrote:
>>
>> +   .. attribute:: stack_info
>> +
>> +      Stack frame information (where available) from the bottom of the stack
>> +      in the current thread, up to and including the stack frame of the
>> +      logging call which resulted in the creation of this record.
>> +
>
> Interesting - my mental model of the call stack is that the outermost
> frame is the top of the stack and the stack grows downwards as calls
> are executed (there are a few idioms like "recursive descent", the
> intuitive parallel with "inner functions" being lower in the stack
> than "outer functions" as well as the order in which Python prints
> stack traces that reinforce this view).

Probably because the C stack tends to grow down for most
architectures, but most stack data structures are implemented over
arrays and hence, grow upwards from 0.  Depending on the author's
background, they probably use one mental model or the other.
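
To make the two models concrete, here is a minimal sketch (an
illustration only, not part of r86467) of the order Python itself
reports. traceback.extract_stack() lists the outermost frame first and
the current frame last, which matches the "most recent call last"
layout of printed tracebacks. The outer and inner function names are
hypothetical.

```python
import traceback


def inner():
    # extract_stack() returns one summary per frame, oldest call first,
    # ending with the frame that called extract_stack() itself.
    return [frame.name for frame in traceback.extract_stack()]


def outer():
    return inner()


frame_names = outer()
# The last two entries are the caller followed by the callee.
print(frame_names[-2:])
```

Whichever end one prefers to call the "bottom", the outer call always
comes before the inner one in this list.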
Reid

From techtonik at gmail.com Mon Nov 15 21:43:08 2010
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 15 Nov 2010 22:43:08 +0200
Subject: [Python-Dev] PEP 385: Formatting of Hg checkin notifications
In-Reply-To:
References:
Message-ID:

On Sun, Nov 14, 2010 at 5:10 AM, Terry Reedy wrote:
> On 11/13/2010 8:28 PM, Nick Coghlan wrote:
>>
>> Following the python-checkins list, I get to see both the current SVN
>> notifications and the Hg notifications from Tarek's pushes into the
>> distutils repository. I realised today that there is one key reason as
>> to why the latter strikes me as a big wall of unintelligible text,
>> while I find the SVN notification quite easy to read: vertical
>> whitespace.
>>
>> The SVN notification uses vertical whitespace to separate out the log
>> message and the list of files affected clearly from the rest of the
>> header fields. It makes it *really* easy to see at a glance what the
>> checkin was about and which files were affected. For the Hg
>> notification, both of these fields are embedded in a big header block
>> along with all the other fields, so it is quite difficult to make out
>> the same information.
>>
>> It would be really nice if the formatting could be improved for the
>> email notifications on the Hg side when we adopt it for the main
>> CPython repository. The changes would be to:
>> - add a blank line before and after the summary field
>> - add a carriage return between the header and content for the summary
>> field and the files field
>> - indent the list of files by two spaces and use a carriage return
>> rather than a comma to separate named files
>>
>> I've included an example below based on one of Tarek's recent pushes:
>>
>> Current Hg notification header and start of first diff:
>> ================================================
>> tarek.ziade pushed 7ebf14ab2840 to distutils2:
>>
>> http://hg.python.org/distutils2/rev/7ebf14ab2840
>> changeset:   816:7ebf14ab2840
>> tag:         tip
>> user:        Tarek Ziade
>> date:        Sat Nov 13 12:40:33 2010 +0100
>> summary:     compiler_type ->  name
>> files:       distutils2/compiler/__init__.py,
>> distutils2/compiler/bcppcompiler.py, distutils2/compiler/ccompiler.py,
>> distutils2/compiler/cygwinccompiler.py,
>> distutils2/compiler/msvc9compiler.py,
>> distutils2/compiler/msvccompiler.py,
>> distutils2/compiler/unixccompiler.py, distutils2/tests/test_config.py
>>
>> diff --git a/distutils2/compiler/__init__.py
>> b/distutils2/compiler/__init__.py
>> --- a/distutils2/compiler/__init__.py
>> +++ b/distutils2/compiler/__init__.py
>> @@ -13,7 +13,7 @@
>> ====================================================
>>
>> Proposed change to separate out summary and files fields:
>> ================================================
>> tarek.ziade pushed 7ebf14ab2840 to distutils2:
>>
>> http://hg.python.org/distutils2/rev/7ebf14ab2840
>> changeset:   816:7ebf14ab2840
>> tag:         tip
>> user:        Tarek Ziade
>> date:        Sat Nov 13 12:40:33 2010 +0100
>>
>> summary:
>> compiler_type ->  name
>>
>> files:
>>   distutils2/compiler/__init__.py
>>   distutils2/compiler/bcppcompiler.py
>>   distutils2/compiler/ccompiler.py
>>   distutils2/compiler/cygwinccompiler.py
>>   distutils2/compiler/msvc9compiler.py
>>   distutils2/compiler/msvccompiler.py
>>   distutils2/compiler/unixccompiler.py
>>   distutils2/tests/test_config.py
>>
>> diff --git a/distutils2/compiler/__init__.py
>> b/distutils2/compiler/__init__.py
>> --- a/distutils2/compiler/__init__.py
>> +++ b/distutils2/compiler/__init__.py
>> @@ -13,7 +13,7 @@
>> ====================================================
>
> Much better except possibly for \n after 'summary:'

Why not drop the "summary" label altogether? The purpose of the text
delimited with newlines is quite obvious.

--
anatoly t.
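The layout proposed in the quoted message can be sketched in Python. This is only a rough illustration, not the hook that runs on hg.python.org: the function name format_notification and its parameters are assumptions made up for this sketch. Only the field layout (labels padded to 13 columns, blank lines around the summary, one indented file per line) is taken from the example above.

```python
def format_notification(fields, summary, files):
    """Render header fields, then summary and files as separate blocks.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Labels are padded to 13 columns, matching "changeset:   " in the example.
    lines = ["%-13s%s" % (name + ":", value) for name, value in fields]
    # Blank line before the summary; label and content go on separate lines.
    lines += ["", "summary:", summary, ""]
    # One file per line, indented by two spaces, instead of a comma list.
    lines.append("files:")
    lines += ["  " + path for path in files]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(format_notification(
        [("changeset", "816:7ebf14ab2840"), ("tag", "tip"),
         ("user", "Tarek Ziade"),
         ("date", "Sat Nov 13 12:40:33 2010 +0100")],
        "compiler_type -> name",
        ["distutils2/compiler/__init__.py",
         "distutils2/tests/test_config.py"],
    ))
```

Dropping the "summary:" label, as suggested in the reply, would amount to removing that one entry from the list of lines.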
From brian.curtin at gmail.com Tue Nov 16 01:23:51 2010 From: brian.curtin at gmail.com (Brian Curtin) Date: Mon, 15 Nov 2010 18:23:51 -0600 Subject: [Python-Dev] Stable buildbots In-Reply-To: References: <20101113133712.60e9be27@pitrou.net> Message-ID: On Sun, Nov 14, 2010 at 02:48, David Bolen wrote: > Nick Coghlan writes: > > > Do we have any idea why the workaround to avoid the popup windows > > stopped working? (assuming it ever worked reliably - I thought it did, > > but that impression may have been incorrect) > > Oh, the pop-up handling for the RTL dialogs still seems to be working > fine (at least I haven't seen any since I put it in place). That, plus > the original buildbot tweaks to block any OS popups still looks solid > for avoiding any dialogs that block a test process. > > This is a completely separate issue, though probably around just as > long, and like the popup problem its frequency changes over time. By > "hung" here I'm referring to cases where something must go wrong with > a test and/or its cleanup such that a python_d process remains > running, usually several of them at the same time. So I end up with a > bunch of python_d processes in the background (but not with any > dialogs pending), which eventually cause errors during attempts the > next time the same builder is used since the file remains in use. > > I expect some of this may be the lack of a good process group cleanup > under Windows, though the root cause may not be unique to Windows. I > see something very similar reasonable frequency on my OSX Tiger > buildbot as well. But since the filesystem there can let the build > tree get cleaned and rebuilt even with a stranded executable, the > impact is minimal on subsequent tests than on Windows, though the OSX > processes do burn a ton of CPU. I run a script on OSX to kill them > off, but that was quick to whip up since in those cases the stranded > processes all end up getting owned by init so it's a simple ps grep > and kill. 
In the Windows case I'll probably just set a time limit so
> if the processes have been around more than a few hours I figure
> they're safe to kill.
>
> -- David

Is the dialog closer script available somewhere? I'm guessing this is the
same script that closes the window which pops up during test_capi's crash?

I just set up a Windows Server 2008 R2 x64 build slave and noticed it
hanging due to the popup.

From db3l.net at gmail.com Tue Nov 16 03:35:05 2010
From: db3l.net at gmail.com (David Bolen)
Date: Mon, 15 Nov 2010 21:35:05 -0500
Subject: [Python-Dev] Stable buildbots
References: <20101113133712.60e9be27@pitrou.net>
Message-ID:

Brian Curtin writes:

> Is the dialog closer script available somewhere? I'm guessing this is the
> same script that closes the window which pops up during test_capi's crash?

Not sure about that specific test, as I won't normally see the windows.
If the failure is causing a C RTL pop-up, then yes, the script will be
closing it. If the test is generating an OS level pop-up (process error
dialog from the OS, not RTL) then that is instead suppressed for any of
the child processes run on my slave, so it never shows up at all.

The RTL script is trivial enough that I'll just include it inline:

- - - - - - - - - - - - - - - - - - - - - - - - -
; buildbot.au3
; Forcibly acknowledge any RTL pop-ups that may occur during testing

$MSVCRT = "Microsoft Visual C++ Runtime Library"

while 1
    ; Wait for any RTL pop-up and then acknowledge
    WinWait($MSVCRT)
    ControlClick($MSVCRT, "", "[CLASS:Button; TEXT:OK]")

    ; Safety check to avoid spinning if it doesn't go away
    Sleep(1000)
WEnd
- - - - - - - - - - - - - - - - - - - - - - - - -

Execute with AutoIt3 (http://www.autoitscript.com/autoit3/). I just use
the plain autoit3.exe against this script from the Startup folder.
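The OS-level suppression described in this message (as opposed to the RTL pop-ups handled by the AutoIt script) is done with the Win32 SetErrorMode call, whose setting is inherited by child processes. A minimal sketch with ctypes follows. The names quiet_error_mode and QUIET_ERROR_MODE are assumptions introduced for illustration, and on non-Windows platforms the context manager deliberately does nothing.

```python
import contextlib
import sys

# Win32 error-mode flags; their sum is the literal 7 used in the patch.
SEM_FAILCRITICALERRORS = 0x0001
SEM_NOGPFAULTERRORBOX = 0x0002
SEM_NOALIGNMENTFAULTEXCEPT = 0x0004
QUIET_ERROR_MODE = (SEM_FAILCRITICALERRORS
                    | SEM_NOGPFAULTERRORBOX
                    | SEM_NOALIGNMENTFAULTEXCEPT)


@contextlib.contextmanager
def quiet_error_mode(mode=QUIET_ERROR_MODE):
    """Temporarily suppress OS error dialogs (hypothetical helper).

    Yields the previous error mode on Windows and None elsewhere.
    """
    if sys.platform != "win32":
        # Nothing to do: SetErrorMode only exists on Windows.
        yield None
        return
    import ctypes
    set_error_mode = ctypes.windll.kernel32.SetErrorMode
    old_mode = set_error_mode(mode)  # returns the previous mode
    try:
        yield old_mode
    finally:
        # Restore the caller's mode once the children have been spawned.
        set_error_mode(old_mode)


if __name__ == "__main__":
    import subprocess
    # Children started inside the block inherit the quiet error mode.
    with quiet_error_mode():
        subprocess.call([sys.executable, "-c", "pass"])
```

Restoring the old mode right after spawning keeps the change scoped to the child processes rather than to the whole slave process.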
The error mode buildbot patch was discussed in the past on this list
(or it might have been the python-3000-devel list at the time).
Originally it just used pywin32, but I added a fallback to ctypes if
available. When first done, we were still building pre-2.5 builds - I
suppose at this point it could just assume the presence of ctypes.

The patch below is from 0.7.11p3:

- - - - - - - - - - - - - - - - - - - - - - - - -
--- commands.py 2009-08-13 11:53:17.000000000 -0400
+++ /cygdrive/d/python/2.6/lib/site-packages/buildbot/slave/commands.py 2009-11-08 02:09:38.000000000 -0500
@@ -489,6 +489,23 @@
         if not self.keepStdinOpen:
             self.pp.closeStdin()

+        # [db3l] Under Win32, try to control error mode
+        win32_SetErrorMode = None
+        if runtime.platformType == 'win32':
+            try:
+                import win32api
+                win32_SetErrorMode = win32api.SetErrorMode
+            except:
+                try:
+                    import ctypes
+                    win32_SetErrorMode = ctypes.windll.kernel32.SetErrorMode
+                except:
+                    pass
+
+        if win32_SetErrorMode:
+            log.msg(" Setting Windows error mode")
+            old_err_mode = win32_SetErrorMode(7)
+
         # win32eventreactor's spawnProcess (under twisted <= 2.0.1) returns
         # None, as opposed to all the posixbase-derived reactors (which
         # return the new Process object). This is a nuisance. We can make up
@@ -509,6 +526,10 @@
         if not self.process:
             self.process = p

+        # [db3l]
+        if win32_SetErrorMode:
+            win32_SetErrorMode(old_err_mode)
+
         # connectionMade also closes stdin as long as we're not using a PTY.
         # This is intended to kill off inappropriately interactive commands
         # better than the (long) hung-command timeout. ProcessPTY should be
- - - - - - - - - - - - - - - - - - - - - - - - -

-- David

From janssen at parc.com Tue Nov 16 04:57:10 2010
From: janssen at parc.com (Bill Janssen)
Date: Mon, 15 Nov 2010 19:57:10 PST
Subject: [Python-Dev] Stable buildbots
In-Reply-To:
References: <20101113133712.60e9be27@pitrou.net>
Message-ID: <30929.1289879830@parc.com>

Both the Tiger buildbots are suddenly failing 3.x on test_cmd_line.
Looking at the changes since the last success, I can't see anything
which would obviously affect that... Any suspects?

Here's what's failing:

======================================================================
ERROR: test_run_code (test.test_cmd_line.CmdLineTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/test_cmd_line.py", line 95, in test_run_code
    assert_python_failure('-c')
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py", line 55, in assert_python_failure
    return _assert_python(False, *args, **env_vars)
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py", line 29, in _assert_python
    env=env)
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/subprocess.py", line 683, in __init__
    self.stdin = io.open(p2cwrite, 'wb', bufsize)
OSError: [Errno 9] Bad file descriptor

======================================================================
ERROR: test_run_module (test.test_cmd_line.CmdLineTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/test_cmd_line.py", line 72, in test_run_module
    assert_python_failure('-m')
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py", line 55, in assert_python_failure
    return _assert_python(False, *args, **env_vars)
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py", line 29, in _assert_python
    env=env)
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/subprocess.py", line 683, in __init__
    self.stdin = io.open(p2cwrite, 'wb', bufsize)
OSError: [Errno 9] Bad file descriptor

======================================================================
ERROR: test_version (test.test_cmd_line.CmdLineTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/test_cmd_line.py", line 48, in test_version
    rc, out, err = assert_python_ok('-V')
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py", line 48, in assert_python_ok
    return _assert_python(True, *args, **env_vars)
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py", line 29, in _assert_python
    env=env)
  File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/subprocess.py", line 683, in __init__
    self.stdin = io.open(p2cwrite, 'wb', bufsize)
OSError: [Errno 9] Bad file descriptor

Bill

From nad at acm.org Tue Nov 16 10:21:29 2010
From: nad at acm.org (Ned Deily)
Date: Tue, 16 Nov 2010 01:21:29 -0800
Subject: [Python-Dev] Stable buildbots
References: <20101113133712.60e9be27@pitrou.net> <30929.1289879830@parc.com>
Message-ID:

In article <30929.1289879830 at parc.com>, Bill Janssen wrote:

> Both the Tiger buildbots are suddenly failing 3.x on test_cmd_line.
> Looking at the changes since the last success, I can't see anything
> which would obviously affect that... Any suspects?

It appears to be a duplicate of Issue8458. Playing with it again, it
seems to be a race condition: sometimes I see all three failures you
reported, sometimes just one, sometimes none. Again, only on 10.4
(Tiger), not 10.5 or 10.6. But the 10.4 machine I'm using is by far the
slowest of the three so it is possible that could be a factor. Perhaps
a race condition with cleaning up the p2c pipe from a previous run?
> Here's what's failing: > > ====================================================================== > ERROR: test_run_code (test.test_cmd_line.CmdLineTest) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/test_cmd_line.py" > , line 95, in test_run_code > assert_python_failure('-c') > File > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py" > , line 55, in assert_python_failure > return _assert_python(False, *args, **env_vars) > File > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py" > , line 29, in _assert_python > env=env) > File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/subprocess.py", > line 683, in __init__ > self.stdin = io.open(p2cwrite, 'wb', bufsize) > OSError: [Errno 9] Bad file descriptor > > ====================================================================== > ERROR: test_run_module (test.test_cmd_line.CmdLineTest) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/test_cmd_line.py" > , line 72, in test_run_module > assert_python_failure('-m') > File > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py" > , line 55, in assert_python_failure > return _assert_python(False, *args, **env_vars) > File > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py" > , line 29, in _assert_python > env=env) > File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/subprocess.py", > line 683, in __init__ > self.stdin = io.open(p2cwrite, 'wb', bufsize) > OSError: [Errno 9] Bad file descriptor > > ====================================================================== > ERROR: test_version (test.test_cmd_line.CmdLineTest) > ---------------------------------------------------------------------- > 
Traceback (most recent call last): > File > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/test_cmd_line.py" > , line 48, in test_version > rc, out, err = assert_python_ok('-V') > File > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py" > , line 48, in assert_python_ok > return _assert_python(True, *args, **env_vars) > File > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py" > , line 29, in _assert_python > env=env) > File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/subprocess.py", > line 683, in __init__ > self.stdin = io.open(p2cwrite, 'wb', bufsize) > OSError: [Errno 9] Bad file descriptor -- Ned Deily, nad at acm.org From georg at python.org Tue Nov 16 15:05:51 2010 From: georg at python.org (Georg Brandl) Date: Tue, 16 Nov 2010 15:05:51 +0100 Subject: [Python-Dev] [RELEASED] Python 3.2 alpha 4 Message-ID: <4CE28FBF.9020200@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On behalf of the Python development team, I'm happy to announce the fourth and (this time really) final alpha preview release of Python 3.2. Python 3.2 is a continuation of the efforts to improve and stabilize the Python 3.x line. Since the final release of Python 2.7, the 2.x line will only receive bugfixes, and new features are developed for 3.x only. Since PEP 3003, the Moratorium on Language Changes, is in effect, there are no changes in Python's syntax and built-in types in Python 3.2. Development efforts concentrated on the standard library and support for porting code to Python 3. 
Highlights are:

* numerous improvements to the unittest module
* PEP 3147, support for .pyc repository directories
* PEP 3149, support for version tagged dynamic libraries
* an overhauled GIL implementation that reduces contention
* many consistency and behavior fixes for numeric operations
* countless fixes regarding string/unicode issues; among them full
  support for a bytes environment (filenames, environment variables)
* a sysconfig module to access configuration information
* a pure-Python implementation of the datetime module
* additions to the shutil module, among them archive file support
* improvements to pdb, the Python debugger

For an extensive list of changes in 3.2, see Misc/NEWS in the Python
distribution.

To download Python 3.2 visit:

    http://www.python.org/download/releases/3.2/

3.2 documentation can be found at:

    http://docs.python.org/3.2/

Please consider trying Python 3.2 with your code and reporting any bugs
you may notice to:

    http://bugs.python.org/

Enjoy!

- --
Georg Brandl, Release Manager
georg at python.org
(on behalf of the entire python-dev team and 3.2's contributors)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.16 (GNU/Linux)

iEYEARECAAYFAkzij74ACgkQN9GcIYhpnLCbtwCgi4whRruM0Oi6yfgjVclYErFa
OJcAn0U8UBBsQBFyGcnKJRbls6B+guQ2
=Vuqf
-----END PGP SIGNATURE-----

From p.f.moore at gmail.com Tue Nov 16 16:05:49 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 16 Nov 2010 15:05:49 +0000
Subject: [Python-Dev] [RELEASED] Python 3.2 alpha 4
In-Reply-To:
References: <4CE28FBF.9020200@python.org>
Message-ID:

(Copying to the list, sorry Georg for the duplicate)

On 16 November 2010 14:05, Georg Brandl wrote:
> On behalf of the Python development team, I'm happy to announce the
> fourth and (this time really) final alpha preview release of Python 3.2.

PEP 3148 (Futures) is noted in the PEP as going into 3.2. It also
seems to be in the release. Should it not be added to the "What's new
in 3.2" document and the release announcements?
It's a fairly significant feature.

Paul.

From alexander.belopolsky at gmail.com Tue Nov 16 16:16:15 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 16 Nov 2010 10:16:15 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk>
Message-ID:

What this thread has shown is that there is no consensus on what
public names are and what rules should be followed when changing names
that can be imported from a module. I have opened an issue at
http://bugs.python.org/issue10434 to address this. My vote is to
adopt the definition spelled out in the language reference, copy it to
the library manual and add some discussion of the deprecation
policies.

I also have a similar question about C API. Here, in absence of
__all__, the answer should be clear: all symbols in public header
files should start with either _Py_ or Py_ and those that start with
Py_ are public. The question is what should be done with names that
start with Py_, but are not documented? Can we add an underscore to
those names? If so, should a (deprecated) alias be made available?
Should they be documented as deprecated?

I think these questions can only be answered on a case-by-case basis,
with the choices being:

1. Document.
2. Document as deprecated.
3. Document as deprecated, add underscore prefix and retain a deprecated alias.
4. Add an underscore prefix.

The specific set of names that I would like to consider is the
following from unicode.h.
I am marking with (*) the names that I
think should be documented and with (D) those that should be
deprecated:

PyUnicode_GetMax
PyUnicode_Resize (*)
PyUnicode_InternImmortal
PyUnicode_FromOrdinal (*)
PyUnicode_GetDefaultEncoding (D)
PyUnicode_AsDecodedObject
PyUnicode_AsDecodedUnicode
PyUnicode_AsEncodedObject
PyUnicode_AsEncodedUnicode
PyUnicode_BuildEncodingMap
PyUnicode_EncodeDecimal (*)
PyUnicode_Append (*)
PyUnicode_AppendAndDel (*)
PyUnicode_Partition (*)
PyUnicode_RPartition (*)
PyUnicode_RSplit (*)
PyUnicode_IsIdentifier (*)
Py_UNICODE_strlen
Py_UNICODE_strcpy
Py_UNICODE_strcat
Py_UNICODE_strncpy
Py_UNICODE_strcmp
Py_UNICODE_strncmp
Py_UNICODE_strchr
Py_UNICODE_strrchr


On Sat, Nov 13, 2010 at 7:12 AM, Giampaolo Rodolà wrote:
> +1 on everything.
>
> 2010/11/11 Alexander Belopolsky :
>> 2010/11/11 Michael Foord :
>> ..
>>>> You mean runtime automation, e.g. creating __all__ on the fly omitting
>>>> underscored names?
>>>>
>>> Writing code to generate a __all__ that duplicates the default behaviour
>>> seems redundant to me.
>>>
>>
>> FWIW, I like having __all__ at the top of the module. It feels like a
>> table of contents at the start of a chapter. In some cases it may
>> also serve as an optimization when len(__all__) is much smaller than
>> len(__dict__). I also don't like _ prefix to become an exclusive
>> means to express privateness.
>>
>> I think the current definition of "public names" is a good one and
>> just needs to be made more visible in the docs. If the module defines
>> __all__, that should be the ultimate answer to what is public in that
>> module. (Users should learn to use help(module) instead of
>> dir(module) for API discovery.) If __all__ is not defined in the
>> module, I think it is good to introduce it after a careful review of
>> what it should contain. And __all__ should never contain names that
>> start with _.
>> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: http://mail.python.org/mailman/options/python-dev/g.rodola%40gmail.com >> > From fuzzyman at voidspace.org.uk Tue Nov 16 16:31:10 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 16 Nov 2010 15:31:10 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> Message-ID: <4CE2A3BE.6060308@voidspace.org.uk> On 16/11/2010 15:16, Alexander Belopolsky wrote: > What this thread has shown is that there is no consensus on what > public names are and what rules should be followed when changing names > that can be imported from a module. I have opened an issue at > http://bugs.python.org/issue10434 to address this. My vote is to > adopt the definition spelled out in the language reference, copy it to > the library manual and add some discussion of the deprecation > policies. > Whilst the definition in the reference manual is fine it only covers module level public APIs (which I realise is your particular concern) it doesn't cover whether a module in a package is public and doesn't cover class members. The rules for these follow as a natural extension, but if we are going to bother codifying the rules (which I think is good given the confusion) then it is worth covering these cases. 
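The module-level rule from the language reference that this thread keeps referring to can be sketched as follows. This is an illustration only: public_names is a hypothetical helper, not a standard library function, and it covers just the module-level rule, not packages or class members.

```python
import os
import types


def public_names(module):
    """Return the module's public names per the language reference rule.

    Hypothetical helper for illustration; not part of the standard library.
    """
    explicit = getattr(module, "__all__", None)
    if explicit is not None:
        # If __all__ is defined, it is the final word on what is public.
        return list(explicit)
    # Otherwise: every name in the namespace without a leading underscore.
    return sorted(name for name in vars(module) if not name.startswith("_"))


# Build a throwaway module so the example is self-contained.
demo = types.ModuleType("demo")
demo.os = os                      # an imported module sits in the namespace
demo.parse = lambda text: text    # the one name meant to be public
demo._cache = {}                  # private by the underscore convention

without_all = public_names(demo)  # 'os' counts as public here
demo.__all__ = ["parse"]
with_all = public_names(demo)     # only 'parse' is public now
```

The first call shows why relying on the underscore convention alone can expose imported names by accident, which is one argument made in the thread for defining __all__ explicitly.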
I posted a suggested wording in an earlier message: http://mail.python.org/pipermail/python-dev/2010-November/105476.html We could also note that existing modules that don't follow these rules will generally follow the deprecation rules for "accidentally public" names, but that this will be decided on a case-by-case basis and that names *obviously* never intended to be public may be changed if it is believed that they aren't (or really shouldn't be) in use. All the best, Michael Foord > I also have a similar question about C API. Here, in absence of > __all__, the answer should be clear: all symbols in public header > files should start with either _Py_ or Py_ and those that start with > Py_ are public. The question is what should be done with names that > start with Py_, but are not documented? Can we add an underscore to > those names? If so, should a (deprecated) alias be made available? > Should they be documented as deprecated? > > I think these questions can only be answered on a case by case bases > which choices being: > > 1. Document. > 2. Document as deprecated. > 3. Document as deprecated, add underscore prefix and retain a deprecated alias. > 4. Add an underscore prefix. > > The specific set of names that I would like to consider is the > following from unicode.h. 
I am marking with (*) the names that I > think should be documented and with (D) those that should be > deprecated: > > PyUnicode_GetMax > PyUnicode_Resize (*) > PyUnicode_InternImmortal > PyUnicode_FromOrdinal (*) > PyUnicode_GetDefaultEncoding (D) > PyUnicode_AsDecodedObject > PyUnicode_AsDecodedUnicode > PyUnicode_AsEncodedObject > PyUnicode_AsEncodedUnicode > PyUnicode_BuildEncodingMap > PyUnicode_EncodeDecimal (*) > PyUnicode_Append (*) > PyUnicode_AppendAndDel (*) > PyUnicode_Partition (*) > PyUnicode_RPartition (*) > PyUnicode_RSplit (*) > PyUnicode_IsIdentifier (*) > Py_UNICODE_strlen > Py_UNICODE_strcpy > Py_UNICODE_strcat > Py_UNICODE_strncpy > Py_UNICODE_strcmp > Py_UNICODE_strncmp > Py_UNICODE_strchr > Py_UNICODE_strrchr > > > On Sat, Nov 13, 2010 at 7:12 AM, Giampaolo Rodol? wrote: >> +1 on everything. >> >> 2010/11/11 Alexander Belopolsky: >>> 2010/11/11 Michael Foord: >>> .. >>>>> You mean runtime automation, e.g. creating __all__ on the fly omitting >>>>> underscored names? >>>>> >>>> Writing code to generate a __all__ that duplicates the default behaviour >>>> seems redundant to me. >>>> >>> FWIW, I like having __all__ at the top of the module. It feels like a >>> table of contents at the start of a chapter. In some cases it may >>> also serve as an optimization when len(__all__) is much smaller than >>> len(__dict__). I also don't like _ prefix to become an exclusive >>> means to express privateness. >>> >>> I think the current definition of "public names" is a good one and >>> just needs to be made more visible in the docs. If the module defines >>> __all__, that should be the ultimate answer to what is public in that >>> module. (Users should learn to use help(module) instead of >>> dir(module) for API discovery.) If __all__ is not defined in the >>> module, I think it is good to introduce it after a careful review of >>> what it should contain. And __all__ should never contain names that >>> start with _. 
>>> _______________________________________________ >>> Python-Dev mailing list >>> Python-Dev at python.org >>> http://mail.python.org/mailman/listinfo/python-dev >>> Unsubscribe: http://mail.python.org/mailman/options/python-dev/g.rodola%40gmail.com >>> -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From mal at egenix.com Tue Nov 16 16:38:04 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 16 Nov 2010 16:38:04 +0100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> Message-ID: <4CE2A55C.8030807@egenix.com> Alexander Belopolsky wrote: > What this thread has shown is that there is no consensus on what > public names are and what rules should be followed when changing names > that can be imported from a module. I have opened an issue at > http://bugs.python.org/issue10434 to address this. My vote is to > adopt the definition spelled out in the language reference, copy it to > the library manual and add some discussion of the deprecation > policies. > > I also have a similar question about C API. 
Here, in absence of > __all__, the answer should be clear: all symbols in public header > files should start with either _Py_ or Py_ and those that start with > Py_ are public. The question is what should be done with names that > start with Py_, but are not documented? Can we add an underscore to > those names? If so, should a (deprecated) alias be made available? > Should they be documented as deprecated? > > I think these questions can only be answered on a case by case bases > which choices being: > > 1. Document. > 2. Document as deprecated. > 3. Document as deprecated, add underscore prefix and retain a deprecated alias. > 4. Add an underscore prefix. > > The specific set of names that I would like to consider is the > following from unicode.h. I am marking with (*) the names that I > think should be documented and with (D) those that should be > deprecated: > > PyUnicode_GetMax > PyUnicode_Resize (*) > PyUnicode_InternImmortal > PyUnicode_FromOrdinal (*) > PyUnicode_GetDefaultEncoding (D) > PyUnicode_AsDecodedObject > PyUnicode_AsDecodedUnicode > PyUnicode_AsEncodedObject > PyUnicode_AsEncodedUnicode > PyUnicode_BuildEncodingMap > PyUnicode_EncodeDecimal (*) > PyUnicode_Append (*) > PyUnicode_AppendAndDel (*) > PyUnicode_Partition (*) > PyUnicode_RPartition (*) > PyUnicode_RSplit (*) > PyUnicode_IsIdentifier (*) > Py_UNICODE_strlen > Py_UNICODE_strcpy > Py_UNICODE_strcat > Py_UNICODE_strncpy > Py_UNICODE_strcmp > Py_UNICODE_strncmp > Py_UNICODE_strchr > Py_UNICODE_strrchr For Unicode, unicodeobject.h defines which APIs are private or not. APIs which don't appear in the header file are either private or need to be added to the header file (but I don't think there are any in this category). All APIs in the header that do not appear in the documentation, should be added there as well. unicodeobject.h already provides documentation for most of the APIs you've listed above (except some new ones that were added later on). 
One API I'm not sure about is PyUnicode_AppendAndDel(). It's somewhat obscure and given that we already have PyUnicode_Concat(), I think it should be made private and eventually dropped. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 16 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From guido at python.org Tue Nov 16 16:48:20 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 16 Nov 2010 07:48:20 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> Message-ID: On Tue, Nov 16, 2010 at 7:16 AM, Alexander Belopolsky wrote: > What this thread has shown is that there is no consensus on what > public names are and what rules should be followed when changing names > that can be imported from a module. ?I have opened an issue at > http://bugs.python.org/issue10434 to address this. ?My vote is to > adopt the definition spelled out in the language reference, copy it to > the library manual and add some discussion of the deprecation > policies. Hm. Apart from the specific semantics assigned by the language to single and double leading (and trailing) underscores, I still think this belongs in a style guide, not in the library manual. 
When reading the library manual, one should always assume that
undocumented features are subject to change at any time. When writing
library code, one should of course be much more conservative, and
guidelines for contributors are needed to ensure that in the future we
won't repeat the mistakes of the past (mostly my own mistakes :-).

> I also have a similar question about C API. Here, in absence of
> __all__, the answer should be clear: all symbols in public header
> files should start with either _Py_ or Py_ and those that start with
> Py_ are public. The question is what should be done with names that
> start with Py_, but are not documented? Can we add an underscore to
> those names? If so, should a (deprecated) alias be made available?
> Should they be documented as deprecated?

Even more care should be taken here, since breakage is harder to fix,
especially in 3rd party code that needs to be compatible with a wide
range of Python versions.

The good news here is that the intended rule is very clear:

- *no* symbols that don't start with Py_ or _Py_ (unless there's a
  technical reason why it can't be named that way)
- public == Py_
- private == _Py_

> I think these questions can only be answered on a case by case basis

Right!

> with choices being:
>
> 1. Document.
> 2. Document as deprecated.
> 3. Document as deprecated, add underscore prefix and retain a deprecated alias.
> 4. Add an underscore prefix.
>
> The specific set of names that I would like to consider is the
> following from unicode.h.
I am marking with (*) the names that I
> think should be documented and with (D) those that should be
> deprecated:
>
> PyUnicode_GetMax
> PyUnicode_Resize (*)
> PyUnicode_InternImmortal
> PyUnicode_FromOrdinal (*)
> PyUnicode_GetDefaultEncoding (D)
> PyUnicode_AsDecodedObject
> PyUnicode_AsDecodedUnicode
> PyUnicode_AsEncodedObject
> PyUnicode_AsEncodedUnicode
> PyUnicode_BuildEncodingMap
> PyUnicode_EncodeDecimal (*)
> PyUnicode_Append (*)
> PyUnicode_AppendAndDel (*)
> PyUnicode_Partition (*)
> PyUnicode_RPartition (*)
> PyUnicode_RSplit (*)
> PyUnicode_IsIdentifier (*)
> Py_UNICODE_strlen
> Py_UNICODE_strcpy
> Py_UNICODE_strcat
> Py_UNICODE_strncpy
> Py_UNICODE_strcmp
> Py_UNICODE_strncmp
> Py_UNICODE_strchr
> Py_UNICODE_strrchr

I'll leave this to others more familiar with the Unicode code; I would
recommend being fairly conservative though since these have been
around for a long time.

--
--Guido van Rossum (python.org/~guido)

From janssen at parc.com Tue Nov 16 17:30:44 2010
From: janssen at parc.com (Bill Janssen)
Date: Tue, 16 Nov 2010 08:30:44 PST
Subject: [Python-Dev] Stable buildbots
In-Reply-To:
References: <20101113133712.60e9be27@pitrou.net> <30929.1289879830@parc.com>
Message-ID: <45342.1289925044@parc.com>

Ned Deily wrote:

> In article <30929.1289879830 at parc.com>, Bill Janssen
> wrote:
>
> > Both the Tiger buildbots are suddenly failing 3.x on test_cmd_line.
> > Looking at the changes since the last success, I can't see anything
> > which would obviously affect that... Any suspects?
>
> It appears to be a duplicate of Issue8458. Playing with it again, it
> seems to be a race condition: sometimes I see all three failures you
> reported, sometimes just one, sometimes none. Again, only on 10.4
> (Tiger), not 10.5 or 10.6. But the 10.4 machine I'm using is by far the
> slowest of the three so it is possible that could be a factor.

Good thought. It's also the slowest of my buildbots -- dual 1GHz PPC.
> Perhaps a race condition with cleaning up the p2c pipe from a previous run? > > > Here's what's failing: > > > > ====================================================================== > > ERROR: test_run_code (test.test_cmd_line.CmdLineTest) > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > File > > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/test_cmd_line.py" > > , line 95, in test_run_code > > assert_python_failure('-c') > > File > > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py" > > , line 55, in assert_python_failure > > return _assert_python(False, *args, **env_vars) > > File > > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py" > > , line 29, in _assert_python > > env=env) > > File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/subprocess.py", > > line 683, in __init__ > > self.stdin = io.open(p2cwrite, 'wb', bufsize) > > OSError: [Errno 9] Bad file descriptor > > > > ====================================================================== > > ERROR: test_run_module (test.test_cmd_line.CmdLineTest) > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > File > > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/test_cmd_line.py" > > , line 72, in test_run_module > > assert_python_failure('-m') > > File > > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py" > > , line 55, in assert_python_failure > > return _assert_python(False, *args, **env_vars) > > File > > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py" > > , line 29, in _assert_python > > env=env) > > File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/subprocess.py", > > line 683, in __init__ > > self.stdin = io.open(p2cwrite, 'wb', bufsize) > > OSError: [Errno 9] Bad file descriptor > > > > 
====================================================================== > > ERROR: test_version (test.test_cmd_line.CmdLineTest) > > ---------------------------------------------------------------------- > > Traceback (most recent call last): > > File > > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/test_cmd_line.py" > > , line 48, in test_version > > rc, out, err = assert_python_ok('-V') > > File > > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py" > > , line 48, in assert_python_ok > > return _assert_python(True, *args, **env_vars) > > File > > "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/test/script_helper.py" > > , line 29, in _assert_python > > env=env) > > File "/Users/buildbot/buildarea/3.x.parc-tiger-1/build/Lib/subprocess.py", > > line 683, in __init__ > > self.stdin = io.open(p2cwrite, 'wb', bufsize) > > OSError: [Errno 9] Bad file descriptor > > -- > Ned Deily, > nad at acm.org > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/bill%40janssen.org From exarkun at twistedmatrix.com Tue Nov 16 17:34:54 2010 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Tue, 16 Nov 2010 16:34:54 -0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> Message-ID: <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> On 03:48 pm, guido at python.org wrote: >On Tue, Nov 16, 2010 at 7:16 AM, Alexander Belopolsky > wrote: >>What this thread has shown is that there is no consensus on what >>public names are and what rules should be followed when changing names >>that can be 
imported from a module. I have opened an issue at
>>http://bugs.python.org/issue10434 to address this. My vote is to
>>adopt the definition spelled out in the language reference, copy it to
>>the library manual and add some discussion of the deprecation
>>policies.
>
>Hm. Apart from the specific semantics assigned by the language to
>single and double leading (and trailing) underscores, I still think
>this belongs in a style guide, not in the library manual. When reading
>the library manual, one should always assume that undocumented
>features are subject to change at any time.

I don't think it belongs only in PEP 8 (that's "a style guide" you're
referring to, correct?). It needs to be front and center. This is
information that every single user of the stdlib needs in order to use
the stdlib correctly. Imagine trying to use a dictionary without
knowing about alphabetical ordering. Or driving a car without knowing
what lane markers indicate.

No matter how many times we discuss this policy on this list (I know
it's come up here before), the majority of Python users still won't
learn about it. PEP 8 isn't nearly visible enough, either.

Whatever the rule is, it needs to be presented with the information
itself. If the rule is that things not documented in the library
manual have no compatibility guarantees, then all of the means of
getting documentation *other* than looking at the library manual need
to indicate this somehow (alternatively, the information shouldn't be
duplicated, but I doubt I'll convince anyone of that).

Here's a stupid proposal. What if the top of pydoc output said (for
stdlib modules only) "The library manual is the canonical reference.
Refer to it before using APIs you find in this documentation." Still
inconvenient, but inconvenient is better than secret/impossible.
Jean-Paul

From raymond.hettinger at gmail.com Tue Nov 16 18:03:03 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Tue, 16 Nov 2010 09:03:03 -0800
Subject: [Python-Dev] [RELEASED] Python 3.2 alpha 4
In-Reply-To:
References: <4CE28FBF.9020200@python.org>
Message-ID: <662EDCAC-B0D2-4FF4-B666-CDB3363123C7@gmail.com>

On Nov 16, 2010, at 7:05 AM, Paul Moore wrote:

>
> PEP 3148 (Futures) is noted in the PEP as going into 3.2, It also
> seems to be in the release.
>
> Should it not be added to the "What's new in 3.2" document and the
> release announcements? It's a fairly significant feature.

I'll update the whatsnew document before the beta goes out.
Raymond From solipsis at pitrou.net Tue Nov 16 18:06:40 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 16 Nov 2010 18:06:40 +0100 Subject: [Python-Dev] Breaking undocumented API References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> Message-ID: <20101116180640.26a112f2@pitrou.net> On Tue, 16 Nov 2010 16:34:54 -0000 exarkun at twistedmatrix.com wrote: > > Imagine trying to use a dictionary without knowing about alphabetical > ordering. You mean an ordered dictionary, right? From alexander.belopolsky at gmail.com Tue Nov 16 18:13:57 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 16 Nov 2010 12:13:57 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE2A55C.8030807@egenix.com> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <4CE2A55C.8030807@egenix.com> Message-ID: On Tue, Nov 16, 2010 at 10:38 AM, M.-A. Lemburg wrote: .. > One API I'm not sure about is PyUnicode_AppendAndDel(). It's somewhat > obscure and given that we already have PyUnicode_Concat(), I think > it should be made private and eventually dropped. > What about PyUnicode_GetMax()? Isn't that supposed to be Py_UNICODE_GETMAX()? Or better still Py_UNICODE_MAXORDINAL? 
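
For readers following the PyUnicode_GetMax() question, here is a minimal sketch (not part of the thread) of how the same value can be inspected from Python. It assumes that sys.maxunicode mirrors what PyUnicode_GetMax() returns, which as far as I know was the case for CPython builds of this era: 0xFFFF on narrow (UCS2) builds and 0x10FFFF on wide (UCS4) builds. The helper name describe_build is hypothetical and exists only for illustration.

```python
import sys


def describe_build(max_ordinal=None):
    """Classify an interpreter by its largest supported code point.

    describe_build is a hypothetical helper, not a CPython API.
    When max_ordinal is omitted, the running interpreter's
    sys.maxunicode is used.
    """
    if max_ordinal is None:
        max_ordinal = sys.maxunicode
    if max_ordinal == 0xFFFF:
        return "narrow (UCS2)"
    if max_ordinal == 0x10FFFF:
        return "wide (UCS4)"
    raise ValueError("unexpected maximum ordinal: %#x" % max_ordinal)


# Report the build kind of the interpreter running this script.
print(describe_build())
```

Passing the value in explicitly, as in describe_build(0xFFFF), lets the classification be checked without access to a narrow build.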
From lukasz at langa.pl Tue Nov 16 18:16:21 2010
From: lukasz at langa.pl (Łukasz Langa)
Date: Tue, 16 Nov 2010 18:16:21 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <20101116180640.26a112f2@pitrou.net>
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <20101116180640.26a112f2@pitrou.net>
Message-ID: <4CE2BC65.1080001@langa.pl>

On 16.11.2010 18:06, Antoine Pitrou wrote:
> On Tue, 16 Nov 2010 16:34:54 -0000
> exarkun at twistedmatrix.com wrote:
>> Imagine trying to use a dictionary without knowing about alphabetical
>> ordering.
> You mean an ordered dictionary, right?

He meant the ones with actual paper pages.

From fuzzyman at voidspace.org.uk Tue Nov 16 18:21:38 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Tue, 16 Nov 2010 17:21:38 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <4CE2BC65.1080001@langa.pl>
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <20101116180640.26a112f2@pitrou.net> <4CE2BC65.1080001@langa.pl>
Message-ID: <4CE2BDA2.1000302@voidspace.org.uk>

On 16/11/2010 17:16, Łukasz Langa wrote:
> On 16.11.2010 18:06, Antoine Pitrou wrote:
>> On Tue, 16 Nov 2010 16:34:54 -0000
>> exarkun at twistedmatrix.com wrote:
>>> Imagine trying to use a dictionary without knowing about alphabetical
>>> ordering.
>> You mean an ordered dictionary, right?
>
> He meant the ones with actual paper pages.
But given that we are particularly talking about how to handle
undocumented APIs, a more apropos comparison would be to ask how
dictionary readers are supposed to look up words that aren't in the
dictionary...

This is why I think it *is* a style issue for developers - the more
important decision is codifying how we decide what words need to go in
the dictionary (to continue to torture the analogy).

Michael

--
http://www.voidspace.org.uk/

READ CAREFULLY. By accepting and reading this email you agree, on
behalf of your employer, to release me from all obligations and
waivers arising from any and all NON-NEGOTIATED agreements, licenses,
terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality,
non-disclosure, non-compete and acceptable use policies ("BOGUS
AGREEMENTS") that I have entered into with your employer, its
partners, licensors, agents and assigns, in perpetuity, without
prejudice to my ongoing rights and privileges. You further represent
that you have the authority to release me from any BOGUS AGREEMENTS on
behalf of your employer.
From exarkun at twistedmatrix.com Tue Nov 16 18:30:49 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Tue, 16 Nov 2010 17:30:49 -0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <4CE2BDA2.1000302@voidspace.org.uk>
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <20101116180640.26a112f2@pitrou.net> <4CE2BC65.1080001@langa.pl> <4CE2BDA2.1000302@voidspace.org.uk>
Message-ID: <20101116173049.2040.989476246.divmod.xquotient.936@localhost.localdomain>

On 05:21 pm, fuzzyman at voidspace.org.uk wrote:
>On 16/11/2010 17:16, Łukasz Langa wrote:
>>On 16.11.2010 18:06, Antoine Pitrou wrote:
>>>On Tue, 16 Nov 2010 16:34:54 -0000
>>>exarkun at twistedmatrix.com wrote:
>>>>Imagine trying to use a dictionary without knowing about
>>>>alphabetical
>>>>ordering.
>>>You mean an ordered dictionary, right?
>>
>>He meant the ones with actual paper pages.
>
>But given that we are particularly talking about how to handle
>undocumented APIs, a more apropos comparison would be to ask how
>dictionary readers are supposed to look up words that aren't in the
>dictionary...

No, this isn't an appropriate comparison. The dictionary was an
example of something that presents information but is very hard to use
without knowing the rules.

We're not talking about undocumented APIs. We're talking about APIs
that are documented somewhere other than in the library manual.

Jean-Paul

From mal at egenix.com Tue Nov 16 19:06:22 2010
From: mal at egenix.com (M.-A.
Lemburg) Date: Tue, 16 Nov 2010 19:06:22 +0100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <4CE2A55C.8030807@egenix.com> Message-ID: <4CE2C81E.20103@egenix.com> Alexander Belopolsky wrote: > On Tue, Nov 16, 2010 at 10:38 AM, M.-A. Lemburg wrote: > .. >> One API I'm not sure about is PyUnicode_AppendAndDel(). It's somewhat >> obscure and given that we already have PyUnicode_Concat(), I think >> it should be made private and eventually dropped. >> > > What about PyUnicode_GetMax()? Isn't that supposed to be > Py_UNICODE_GETMAX()? Or better still Py_UNICODE_MAXORDINAL? Traditionally, all uppercase symbols refer to macros, whereas the mixed case ones refer to functions. Now, we can't use a macro for this, since the information has to be available as callable in order to applications or extensions to use it (without recompile). Regarding the name: PyUnicode_MaxOrdinal() would certainly have been better. BTW: I'm not really happy about the Py_UNICODE_ prefix for functions in unicodeobject.h, but I guess it's too late to change those. It would be better to stick to one prefix for Unicode related APIs, i.e. "PyUnicode_". -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 16 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From g.brandl at gmx.net Tue Nov 16 19:05:44 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 16 Nov 2010 19:05:44 +0100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <20101116180640.26a112f2@pitrou.net> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <20101116180640.26a112f2@pitrou.net> Message-ID: Am 16.11.2010 18:06, schrieb Antoine Pitrou: > On Tue, 16 Nov 2010 16:34:54 -0000 > exarkun at twistedmatrix.com wrote: >> >> Imagine trying to use a dictionary without knowing about alphabetical >> ordering. > > You mean an ordered dictionary, right? That one's a sorted dictionary, though. Georg From alexander.belopolsky at gmail.com Tue Nov 16 19:31:32 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 16 Nov 2010 13:31:32 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE2C81E.20103@egenix.com> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <4CE2A55C.8030807@egenix.com> <4CE2C81E.20103@egenix.com> Message-ID: On Tue, Nov 16, 2010 at 1:06 PM, M.-A. Lemburg wrote: .. > Now, we can't use a macro for [PyUnicode_GetMax()], since the information has > to be available as callable in order to applications or extensions > to use it (without recompile). > .. but it *is* a macro resolving to either PyUnicodeUCS2_GetMax or PyUnicodeUCS4_GetMax. 
What is the scenario when we may want to change what
PyUnicodeUCS?_GetMax returns and have extensions pick up the change
without a recompile? UCS2 case will certainly never change since it is
already 0xFFFF. Is it possible that UCS4 will be expanded beyond
0x10FFFF? Note that we can have both a macro and a function version.
This is fairly standard practice in Python C-API.

From jcea at jcea.es Tue Nov 16 19:38:07 2010
From: jcea at jcea.es (Jesus Cea)
Date: Tue, 16 Nov 2010 19:38:07 +0100
Subject: [Python-Dev] Mercurial Schedule
Message-ID: <4CE2CF8F.4040500@jcea.es>

Is there any updated Mercurial schedule? Any impact from the new 3.2
schedule (three weeks offset)?

--
Jesus Cea Avion
jcea at jcea.es - http://www.jcea.es/
jabber / xmpp:jcea at jabber.org

From alexander.belopolsky at gmail.com Tue Nov 16 19:40:36 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 16 Nov 2010 13:40:36 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <4CE2C81E.20103@egenix.com>
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <4CE2A55C.8030807@egenix.com> <4CE2C81E.20103@egenix.com>
Message-ID: On Tue, Nov 16, 2010 at 1:06 PM, M.-A. Lemburg wrote: .. > BTW: I'm not really happy about the Py_UNICODE_ prefix for functions > in unicodeobject.h, but I guess it's too late to change those. > It would be better to stick to one prefix for Unicode related > APIs, i.e. "PyUnicode_". I don't have a problem with this. It makes sense that functions that operate on PyUnicode objects start with PyUnicode_ and those that operate on Py_UNICODE ordinals start with Py_UNICODE_. Of course, PyUnicode should have been named PyUnicodeObject and Py_UNICODE should have been named Py_wchar_t, but that's a different story. From mal at egenix.com Tue Nov 16 19:57:04 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 16 Nov 2010 19:57:04 +0100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <4CE2A55C.8030807@egenix.com> <4CE2C81E.20103@egenix.com> Message-ID: <4CE2D400.5060803@egenix.com> Alexander Belopolsky wrote: > On Tue, Nov 16, 2010 at 1:06 PM, M.-A. Lemburg wrote: > .. >> Now, we can't use a macro for [PyUnicode_GetMax()], since the information has >> to be available as callable in order to applications or extensions >> to use it (without recompile). >> > > .. but it *is* a macro resolving to either PyUnicodeUCS2_GetMax or > PyUnicodeUCS4_GetMax. That doesn't count :-) It's only a trick to prevent external code from using the wrong Unicode APIs. There still is a real function behind the renaming. > What is the scenario when may want to change > what PyUnicodeUCS?_GetMax return and have extensions pick up the > change without a recompile? If an extensions uses the stable ABI, it will want to know whether the interpreter was built for UCS2 or UCS4 (even if it doesn't use the Unicode APIs directly). 
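
As an aside for readers (a sketch, not part of the original message): one reason an extension may care about the build width is that a narrow build stores a code point above 0xFFFF as two 16-bit code units. The snippet below shows the arithmetic, assuming the standard UTF-16 surrogate encoding; the helper name to_surrogate_pair is hypothetical and is not a CPython API.

```python
def to_surrogate_pair(ordinal):
    """Split a non-BMP code point into its UTF-16 surrogate pair.

    to_surrogate_pair is a hypothetical helper for illustration.
    It returns (high, low) as integers.
    """
    if not 0x10000 <= ordinal <= 0x10FFFF:
        raise ValueError("ordinal must be in range(0x10000, 0x110000)")
    offset = ordinal - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low


# The largest code point needs two code units on a narrow build.
print([hex(unit) for unit in to_surrogate_pair(0x10FFFF)])
```

Under these assumptions the same character has length 2 on a narrow build and length 1 on a wide build, which is the difference an extension would need to know about.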
> UCS2 case will certainly never change > since it is already 0xFFFF. Is it possible that USC4 will be expanded > beyond 0x10FFFF? Well, the Unicode Consortium decided to not go beyond 0x10FFFF, but then you never know... when they started out on the quest, 16 bits appeared more than enough, but they found out relatively quickly that the Asian scripts had enough code points to easily fill that space. Once space is available, it tends to get used sooner or later :-) > Note that we can have both a macro and a function > version. This is fairly standard practice in Python C-API. Sure, but what for ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 16 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From alexander.belopolsky at gmail.com Tue Nov 16 20:06:37 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 16 Nov 2010 14:06:37 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE2D400.5060803@egenix.com> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <4CE2A55C.8030807@egenix.com> <4CE2C81E.20103@egenix.com> <4CE2D400.5060803@egenix.com> Message-ID: On Tue, Nov 16, 2010 at 1:57 PM, M.-A. Lemburg wrote: .. >> Note that we can have both a macro and a function >> version. 
This is fairly standard practice in Python C-API.
>
> Sure, but what for ?
>

Mostly just for consistency with the other macros:

http://docs.python.org/dev/py3k/c-api/unicode.html#unicode-character-properties

Wait, these actually map to C functions as well. So this is just a
naming issue.

From tjreedy at udel.edu Tue Nov 16 20:08:18 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 16 Nov 2010 14:08:18 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk>
Message-ID:

On 11/16/2010 10:16 AM, Alexander Belopolsky wrote:

> What this thread has shown is that there is no consensus on what
> public names are and what rules should be followed when changing names
> that can be imported from a module.

Nor is there any consensus on the use of __all__ in the stdlib, with
opinion ranging from never to sometimes to always.

I do not have any opinions on the particular solution adopted, but
appreciate your persistence in pushing to *some* solution. It would be
nice to add 'Cleanly separated public and private APIs' to the list of
3.x features.

--
Terry Jan Reedy

From mal at egenix.com Tue Nov 16 20:16:50 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 16 Nov 2010 20:16:50 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <4CE2A55C.8030807@egenix.com> <4CE2C81E.20103@egenix.com> <4CE2D400.5060803@egenix.com>
Message-ID: <4CE2D8A2.9040705@egenix.com>

Alexander Belopolsky wrote:
> On Tue, Nov 16, 2010 at 1:57 PM, M.-A. Lemburg wrote:
> ..
>>> Note that we can have both a macro and a function
>>> version. This is fairly standard practice in Python C-API.
>> >> Sure, but what for ? >> > > Mostly just for consistency with the other macros: > > http://docs.python.org/dev/py3k/c-api/unicode.html#unicode-character-properties > > Wait, these actually map to C functions as well. So this is just a > naming issue. As said: the UCS2/4 name mangling doesn't count fall under the macro naming scheme, since it's done transparently and with a different reasoning in mind, than when you decide to use a macro to access some object detail, or want to avoid repetition. This trick was also added after the original APIs had already been documented for a while, so there was no way to change their names anymore. The various ctype functions use macro names for historic reasons: they were directed to different functions and/or inline code depending on a configuration switch. This is now gone, since the lib C ctype functions were locale aware and often implemented things a little differently than the Python ctype tables. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 16 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From alexander.belopolsky at gmail.com Tue Nov 16 20:52:07 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 16 Nov 2010 14:52:07 -0500 Subject: [Python-Dev] PyUnicode_GetMax() and PyUnicode_FromOrdinal() Was: Breaking undocumented API Message-ID: On Tue, Nov 16, 2010 at 1:57 PM, M.-A. 
Lemburg wrote:
> Alexander Belopolsky wrote:
>> On Tue, Nov 16, 2010 at 1:06 PM, M.-A. Lemburg wrote:
>> ..
>>> Now, we can't use a macro for [PyUnicode_GetMax()], since the information has
>>> to be available as callable in order to applications or extensions
>>> to use it (without recompile).
>>>
>>
>> .. but it *is* a macro resolving to either PyUnicodeUCS2_GetMax or
>> PyUnicodeUCS4_GetMax.
>
> That doesn't count :-) It's only a trick to prevent external code
> from using the wrong Unicode APIs.
>
> There still is a real function behind the renaming.
>
>> What is the scenario when we may want to change
>> what PyUnicodeUCS?_GetMax returns and have extensions pick up the
>> change without a recompile?
>
> If an extension uses the stable ABI, it will want to know
> whether the interpreter was built for UCS2 or UCS4 (even if
> it doesn't use the Unicode APIs directly).
>
>> UCS2 case will certainly never change
>> since it is already 0xFFFF. Is it possible that UCS4 will be expanded
>> beyond 0x10FFFF?
>
> Well, the Unicode Consortium decided to not go beyond 0x10FFFF,
> but then you never know... when they started out on the quest,
> 16 bits appeared more than enough, but they found out relatively
> quickly that the Asian scripts had enough code points to easily
> fill that space.
>
> Once space is available, it tends to get used sooner or later :-)
>
>> Note that we can have both a macro and a function
>> version. This is fairly standard practice in Python C-API.
>
> Sure, but what for ?

Note that PyUnicode_FromOrdinal() is documented (in unicodeobject.h)
as follows without a reference to PyUnicode_GetMax():

"""
Create a Unicode Object from the given Unicode code point ordinal.

The ordinal must be in range(0x10000) on narrow Python builds
(UCS2), and range(0x110000) on wide builds (UCS4). A ValueError is
raised in case it is not.
"""

The implementation, however, checks the UCS4 range only:
if (ordinal < 0 || ordinal > 0x10ffff) { PyErr_SetString(PyExc_ValueError, "chr() arg not in range(0x110000)"); return NULL; } This actually looks like a bug: >>> len(chr(0x10FFFF)) 2 (on a USC2 build.) Also, I think PyUnicode_FromOrdinal() should take Py_UNICODE argument rather than int. From mal at egenix.com Tue Nov 16 21:06:15 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 16 Nov 2010 21:06:15 +0100 Subject: [Python-Dev] PyUnicode_GetMax() and PyUnicode_FromOrdinal() Was: Breaking undocumented API In-Reply-To: References: Message-ID: <4CE2E437.5010103@egenix.com> Alexander Belopolsky wrote: > On Tue, Nov 16, 2010 at 1:57 PM, M.-A. Lemburg wrote: >> Alexander Belopolsky wrote: >>> On Tue, Nov 16, 2010 at 1:06 PM, M.-A. Lemburg wrote: >>> .. >>>> Now, we can't use a macro for [PyUnicode_GetMax()], since the information has >>>> to be available as callable in order to applications or extensions >>>> to use it (without recompile). >>>> >>> >>> .. but it *is* a macro resolving to either PyUnicodeUCS2_GetMax or >>> PyUnicodeUCS4_GetMax. >> >> That doesn't count :-) It's only a trick to prevent external code >> from using the wrong Unicode APIs. >> >> There still is a real function behind the renaming. >> >>> What is the scenario when may want to change >>> what PyUnicodeUCS?_GetMax return and have extensions pick up the >>> change without a recompile? >> >> If an extensions uses the stable ABI, it will want to know >> whether the interpreter was built for UCS2 or UCS4 (even if >> it doesn't use the Unicode APIs directly). >> >>> UCS2 case will certainly never change >>> since it is already 0xFFFF. Is it possible that USC4 will be expanded >>> beyond 0x10FFFF? >> >> Well, the Unicode Consortium decided to not go beyond 0x10FFFF, >> but then you never know... when they started out on the quest, >> 16 bits appeared more than enough, but they found out relatively >> quickly that the Asian scripts had enough code points to easily >> fill that space. 
>> >> Once space is available, it tends to get used sooner or later :-) >> >>> Note that we can have both a macro and a function >>> version. This is fairly standard practice in Python C-API. >> >> Sure, but what for ? > > Note that PyUnicode_FromOrdinal() is documented (in unicodeobject.h) > as follows without a reference to PyUnicode_GetMax(): > > """ > Create a Unicode Object from the given Unicode code point ordinal. > > The ordinal must be in range(0x10000) on narrow Python builds > (UCS2), and range(0x110000) on wide builds (UCS4). A ValueError is > raised in case it is not. > """ > > The actual implementation actually checks UCS4 range only. > > if (ordinal < 0 || ordinal > 0x10ffff) { > PyErr_SetString(PyExc_ValueError, > "chr() arg not in range(0x110000)"); > return NULL; > } > > This actually looks like a bug: > >>>> len(chr(0x10FFFF)) > 2 > > (on a USC2 build.) Yes, it's a documentation bug. I guess someone forgot to update the comment in unicodeobject.h after the change to have chr()/unichr() return a 2-char string instead of a 1-char string for non-BMP code points. > Also, I think PyUnicode_FromOrdinal() should take Py_UNICODE argument > rather than int. No, an ordinal is a number, not a typed value. We have PyUnicode_FromUnicode() to create strings from Py_UNICODE* arrays. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 16 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From rbp at isnomore.net  Tue Nov 16 21:15:56 2010
From: rbp at isnomore.net (Rodrigo Bernardo Pimentel)
Date: Tue, 16 Nov 2010 18:15:56 -0200
Subject: [Python-Dev] Python bug week-end : 20-21 November
In-Reply-To:
References: <20101025230337.41aeef12@pitrou.net>
Message-ID:

On 26 October 2010 18:04, Georg Brandl wrote:
> On 26.10.2010 19:53, Brett Cannon wrote:
>> Can whomever has edit access to the Python Google Calendar add this?
>
> Done.

The Bug Weekend is still up, right? I don't see mention of it at
http://wiki.python.org/moin/PythonBugDay (and when I tried to log in
to edit, I got "A problem occurred in a Python script." - now, I
thought no problems ever occurred on Python scripts! ;)).

    rbp
--
http://isnomore.net

From alexander.belopolsky at gmail.com  Tue Nov 16 21:31:13 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 16 Nov 2010 15:31:13 -0500
Subject: [Python-Dev] PyUnicode_GetMax() and PyUnicode_FromOrdinal() Was: Breaking undocumented API
In-Reply-To: <4CE2E437.5010103@egenix.com>
References: <4CE2E437.5010103@egenix.com>
Message-ID:

On Tue, Nov 16, 2010 at 3:06 PM, M.-A. Lemburg wrote:
..
>>>>> len(chr(0x10FFFF))
>> 2
>>
>> (on a USC2 build.)
>
> Yes, it's a documentation bug. I guess someone forgot to update
> the comment in unicodeobject.h after the change to have chr()/unichr()
> return a 2-char string instead of a 1-char string for non-BMP
> code points.

Same problem in reST doc for chr(i):

"""
chr(i)
Return the string of one character whose Unicode codepoint is the
integer i. For example, chr(97) returns the string 'a'. This is the
inverse of ord(). The valid range for the argument depends how Python
was configured - it may be either UCS2 [0..0xFFFF] or UCS4
[0..0x10FFFF]. ValueError will be raised if i is outside that range.
"""
http://docs.python.org/dev/py3k/library/functions.html?chr

And in ord(c):

"""
ord(c)
Given a string of length one, return an integer representing the
Unicode code point of the character. For example, ord('a') returns the
integer 97 and ord('\u2020') returns 8224. This is the inverse of
chr().

If the argument length is not one, a TypeError will be raised. (If
Python was built with UCS2 Unicode, then the character's code point
must be in the range [0..65535] inclusive; otherwise the string length
is two!)
"""
http://docs.python.org/dev/py3k/library/functions.html#ord

From g.brandl at gmx.net  Tue Nov 16 21:49:01 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Tue, 16 Nov 2010 21:49:01 +0100
Subject: [Python-Dev] Python bug week-end : 20-21 November
In-Reply-To:
References: <20101025230337.41aeef12@pitrou.net>
Message-ID:

On 16.11.2010 21:15, Rodrigo Bernardo Pimentel wrote:
> On 26 October 2010 18:04, Georg Brandl wrote:
>> On 26.10.2010 19:53, Brett Cannon wrote:
>>> Can whomever has edit access to the Python Google Calendar add this?
>>
>> Done.
>
> The Bug Weekend is still up, right? I don't see mention of it at
> http://wiki.python.org/moin/PythonBugDay (and when I tried to log in
> to edit, I got "A problem occurred in a Python script." - now, I
> thought no problems ever occurred on Python scripts! ;)).

Yeah, somebody (Antoine?) should update that wiki page...
Georg

From ben+python at benfinney.id.au  Tue Nov 16 22:31:41 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Wed, 17 Nov 2010 08:31:41 +1100
Subject: [Python-Dev] Breaking undocumented API
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>
Message-ID: <87lj4t9cqq.fsf@benfinney.id.au>

exarkun at twistedmatrix.com writes:

> On 03:48 pm, guido at python.org wrote:
> >Hm. Apart from the specific semantics assigned by the language to
> >single and double leading (and trailing) underscores, I still think
> >this belongs in a style guide, not in the library manual.
>
> I don't think it belongs only in PEP 8 (that's "a style guide" you're
> referring to, correct?).

I don't know about Guido, but I'd be -1 on suggestions to add more
normative information to PEP 7, PEP 8, PEP 257, or any other established
style guide PEP. I certainly don't want to have to keep going back to
the same documents frequently just to see if the set of recommendations
I already know has changed recently.

Rather, I took Guido's mention of "this belongs in a style guide" as
suggesting a *new* style guide. Perhaps one that explicitly obsoletes an
existing one or perhaps not; either way, the updated normative
recommendations are in a new document with a new name, so that one knows
whether one has already read it.

> It needs to be front and center. This is information that every single
> user of the stdlib needs in order to use the stdlib correctly.

True enough. This is information that goes beyond a style guide for
writers, and into conventions that API users need to know also.

--
 \     "I went to the museum where they had all the heads and arms |
  `\      from the statues that are in all the other museums."
?Steven | _o__) Wright | Ben Finney From fdrake at acm.org Tue Nov 16 22:41:39 2010 From: fdrake at acm.org (Fred Drake) Date: Tue, 16 Nov 2010 16:41:39 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <87lj4t9cqq.fsf@benfinney.id.au> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <87lj4t9cqq.fsf@benfinney.id.au> Message-ID: On Tue, Nov 16, 2010 at 4:31 PM, Ben Finney wrote: > I don't know about Guido, but I'd be -1 on suggestions to add more > normative information to PEP 7, PEP 8, PEP 257, or any other established > style guide PEP. I certainly don't want to have to keep going back to > the same documents frequently just to see if the set of recommendations > I already know has changed recently. Agreed. Many style guides are written as extensions of PEP 8 in particular. This has already bitten the Zope community, which was developing style beyond what was even written in it's own extension, only to have PEP 8 change out from under it in a contrary manner. Lessons we learned: - If you refer to someone else's documents, refer to specific versions. References can be updated explicitly if desired. - If you have even an advisory point of style, write it down in the style guide, so people who read the foundational documents you referred to without version information will be aware of the expectations. Otherwise, you may as well not have one. -Fred -- Fred L. Drake, Jr. "A storm broke loose in my mind." 
--Albert Einstein From guido at python.org Tue Nov 16 22:49:16 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 16 Nov 2010 13:49:16 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> Message-ID: On Tue, Nov 16, 2010 at 8:34 AM, wrote: > On 03:48 pm, guido at python.org wrote: >> >> On Tue, Nov 16, 2010 at 7:16 AM, Alexander Belopolsky >> wrote: >>> >>> What this thread has shown is that there is no consensus on what >>> public names are and what rules should be followed when changing names >>> that can be imported from a module. ?I have opened an issue at >>> http://bugs.python.org/issue10434 to address this. ?My vote is to >>> adopt the definition spelled out in the language reference, copy it to >>> the library manual and add some discussion of the deprecation >>> policies. >> >> Hm. Apart from the specific semantics assigned by the language to >> single and double leading (and trailing) underscores, I still think >> this belongs in a style guide, not in the library manual. When reading >> the library manual, one should always assume that undocumented >> features are subject to change at any time. > > I don't think it belongs only in PEP 8 (that's "a style guide" you're > referring to, correct?). ?It needs to be front and center. ?This is > information that every single user of the stdlib needs in order to use the > stdlib correctly. That depends on what methods you're imagining "every single user" is using to find out what the API *is*. 
In my experience there are many ways people do this: - by reading the source - by reading the official docs - by trial and error - inspection of objects (e.g. dir()) - using help() - by reading pydoc output collected on some website (or local disk) - by following tutorials - by reading books containing reference documentation generated by 3rd party authors Most people do several of those things. (Personally, I learned about many APIs by creating them. But I'm probably an exception. :-) > No matter how many times we discuss this policy on this list (I know it's > come up here before), the majority of python users still won't learn about > it. Agreed. And adding a disclaimer to help() or pydoc output won't make much of a difference, I expect. > PEP 8 isn't nearly visible enough, either. ?Whatever the rule is, it needs > to be presented with the information itself. ?If the rule is that things not > documented in the library manual have no compatibility guarantees, then all > of the means of getting documentation *other* than looking at the library > manual need to indicate this somehow (alternatively, the information > shouldn't be duplicated, but I doubt I'll convince anyone of that). Assuming people actually read the disclaimers. > Here's a stupid proposal. ?What if the top of pydoc output said (for stdlib > modules only) "The library manual is the canonical reference. Refer to it > before using APIs you find in this documentation." ?Still inconvenient, but > inconvenient is better than secret/impossible. Personally I think it would be sufficient if the disclaimer was at the top of the library reference itself. That's certainly enough from a legalistic "I told you so" POV and I doubt that we'll be able to move the POV of what people actually use... 
-- --Guido van Rossum (python.org/~guido) From alexander.belopolsky at gmail.com Tue Nov 16 22:54:24 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 16 Nov 2010 16:54:24 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <87lj4t9cqq.fsf@benfinney.id.au> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <87lj4t9cqq.fsf@benfinney.id.au> Message-ID: On Tue, Nov 16, 2010 at 4:31 PM, Ben Finney wrote: .. > I don't know about Guido, but I'd be -1 on suggestions to add more > normative information to PEP 7, PEP 8, PEP 257, or any other established > style guide PEP. I certainly don't want to have to keep going back to > the same documents frequently just to see if the set of recommendations > I already know has changed recently. > > Rather, I took Guido's mention of "this belongs in a style guide" as > suggesting a *new* style guide. Perhaps one that explicitly obsoletes an > existing one or perhaps not; either way, the updated normative > recommendations are in a new document with a new name, so that one knows > whether one has already read it. > +1 Numbered PEPs, while well-known to old-timers, are really odd place for newcomers to find a style guide. This really should be a separate part at the top level of docs.python.org. Note that we already have a documentation style guide under "Documenting Python." Maybe we should reuse this slot and have say "Python Development" part which will put together PEP 7, PEP 8 and documentation "Style Guide" in one convenient package. This, however, is a much bigger project than what I had in mind when I started this thread. 
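For readers following the thread, the language-reference definition that issue 10434 proposes to adopt can be illustrated with a small sketch. It assumes the definition meant is the rule governing `from module import *`: `__all__` decides when it is present, otherwise every name without a leading underscore counts. `public_names` and the `demo` module are hypothetical names used only for illustration; they are not part of any stdlib API.

```python
import types


def public_names(module):
    """Return the names that ``from module import *`` would bind.

    Sketch of the language-reference rule: use __all__ if the module
    defines it, otherwise every name not starting with an underscore.
    """
    declared = getattr(module, "__all__", None)
    if declared is not None:
        return sorted(declared)
    return sorted(name for name in vars(module) if not name.startswith("_"))


# Hypothetical module built in memory, purely for demonstration.
demo = types.ModuleType("demo")
demo.helper = lambda: None
demo._internal = 42

without_all = public_names(demo)   # underscore rule applies

demo.__all__ = ["_internal"]
with_all = public_names(demo)      # __all__ overrides the underscore rule

print(without_all)
print(with_all)
```

Note that dir() and help() do not apply this rule on their own, which is part of why the different ways of discovering an API listed earlier in the thread can give different answers.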
From alexander.belopolsky at gmail.com Tue Nov 16 23:19:36 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 16 Nov 2010 17:19:36 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE2A55C.8030807@egenix.com> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <4CE2A55C.8030807@egenix.com> Message-ID: I created http://bugs.python.org/issue10435 to follow up on unicode C API issues. On Tue, Nov 16, 2010 at 10:38 AM, M.-A. Lemburg wrote: > Alexander Belopolsky wrote: >> What this thread has shown is that there is no consensus on what >> public names are and what rules should be followed when changing names >> that can be imported from a module. ?I have opened an issue at >> http://bugs.python.org/issue10434 to address this. ?My vote is to >> adopt the definition spelled out in the language reference, copy it to >> the library manual and add some discussion of the deprecation >> policies. >> >> I also have a similar question about C API. ?Here, in absence of >> __all__, the answer should be clear: all symbols in public header >> files should start with either _Py_ or Py_ and those that start with >> Py_ are public. ? The question is what should be done with names that >> start with Py_, but are not documented? ?Can we add an underscore to >> those names? ?If so, should a (deprecated) alias be made available? >> Should they be documented as deprecated? >> >> I think these questions can only be answered on a case by case bases >> which choices being: >> >> 1. Document. >> 2. Document as deprecated. >> 3. Document as deprecated, add underscore prefix and retain a deprecated alias. >> 4. Add an underscore prefix. >> >> The specific set of names that I would like to consider is the >> following from unicode.h. 
>> I am marking with (*) the names that I
>> think should be documented and with (D) those that should be
>> deprecated:
>>
>> PyUnicode_GetMax
>> PyUnicode_Resize (*)
>> PyUnicode_InternImmortal
>> PyUnicode_FromOrdinal (*)
>> PyUnicode_GetDefaultEncoding (D)
>> PyUnicode_AsDecodedObject
>> PyUnicode_AsDecodedUnicode
>> PyUnicode_AsEncodedObject
>> PyUnicode_AsEncodedUnicode
>> PyUnicode_BuildEncodingMap
>> PyUnicode_EncodeDecimal (*)
>> PyUnicode_Append (*)
>> PyUnicode_AppendAndDel (*)
>> PyUnicode_Partition (*)
>> PyUnicode_RPartition (*)
>> PyUnicode_RSplit (*)
>> PyUnicode_IsIdentifier (*)
>> Py_UNICODE_strlen
>> Py_UNICODE_strcpy
>> Py_UNICODE_strcat
>> Py_UNICODE_strncpy
>> Py_UNICODE_strcmp
>> Py_UNICODE_strncmp
>> Py_UNICODE_strchr
>> Py_UNICODE_strrchr
>
> For Unicode, unicodeobject.h defines which APIs are private or not.
> APIs which don't appear in the header file are either private or
> need to be added to the header file (but I don't think there are
> any in this category).
>
> All APIs in the header that do not appear in the documentation,
> should be added there as well. unicodeobject.h already provides
> documentation for most of the APIs you've listed above (except some
> new ones that were added later on).
>
> One API I'm not sure about is PyUnicode_AppendAndDel(). It's somewhat
> obscure and given that we already have PyUnicode_Concat(), I think
> it should be made private and eventually dropped.
>
> --
> Marc-Andre Lemburg
> eGenix.com

From glyph at twistedmatrix.com  Wed Nov 17 00:41:42 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Tue, 16 Nov 2010 18:41:42 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain>
Message-ID: <9F20A8B5-7628-448B-AE21-416D6FE76E80@twistedmatrix.com>

On Nov 16, 2010, at 4:49 PM, Guido van Rossum wrote:

>> PEP 8 isn't nearly visible enough, either. Whatever the rule is, it needs
>> to be presented with the information itself. If the rule is that things not
>> documented in the library manual have no compatibility guarantees, then all
>> of the means of getting documentation *other* than looking at the library
>> manual need to indicate this somehow (alternatively, the information
>> shouldn't be duplicated, but I doubt I'll convince anyone of that).
>
> Assuming people actually read the disclaimers.

I don't think it necessarily needs to be presented as a disclaimer.
There will always be people who just ignore part of the information
presented, but the message could be something along the lines of
"Here's some basic documentation, but it might be out-of-date or
incomplete. You can find a better reference at ." If it's easy to click
on the link, I think a lot of people will click on it. Especially since
the library reference really _is_ more helpful than the docstrings, for
the standard library.

(IMHO, dir()'s semantics are so weird that it should emit a warning too, like "looking for docs?
please use help()".) -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Wed Nov 17 08:18:59 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 17 Nov 2010 08:18:59 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <4CE2CF8F.4040500@jcea.es> References: <4CE2CF8F.4040500@jcea.es> Message-ID: Am 16.11.2010 19:38, schrieb Jesus Cea: > Is there any updated mercurial schedule?. > > Any impact related with the new 3.2 schedule (three weeks offset)? I've been trying to contact Dirkjan and ask; generally, I don't see much connection to the 3.2 schedule (with the exception that the final migration day should not be a release day.) Georg From ncoghlan at gmail.com Wed Nov 17 12:45:39 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 17 Nov 2010 21:45:39 +1000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> Message-ID: On Wed, Nov 17, 2010 at 2:34 AM, wrote: > I don't think it belongs only in PEP 8 (that's "a style guide" you're > referring to, correct?). ?It needs to be front and center. ?This is > information that every single user of the stdlib needs in order to use the > stdlib correctly. > > Imagine trying to use a dictionary without knowing about alphabetical > ordering. ?Or driving a car without knowing what lane markers indicate. The definition of the public/private policy in all its gory detail should be in PEP 8 as Guido suggests. 
The library documentation may then contain a note about the difference in compatibility guarantees for public and private APIs, say that any interface and behaviour documented in the manual qualifies as public, then point readers to PEP 8 for the precise details. A similar note could be placed in the C API documentation (with a reference to the detailed policy in PEP 7, perhaps REsTify'ing that PEP in the process in order to link directly to the naming convention section). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From fuzzyman at voidspace.org.uk Wed Nov 17 12:57:17 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 17 Nov 2010 11:57:17 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> Message-ID: <4CE3C31D.50701@voidspace.org.uk> On 17/11/2010 11:45, Nick Coghlan wrote: > On Wed, Nov 17, 2010 at 2:34 AM, wrote: >> I don't think it belongs only in PEP 8 (that's "a style guide" you're >> referring to, correct?). It needs to be front and center. This is >> information that every single user of the stdlib needs in order to use the >> stdlib correctly. >> >> Imagine trying to use a dictionary without knowing about alphabetical >> ordering. Or driving a car without knowing what lane markers indicate. > The definition of the public/private policy in all its gory detail > should be in PEP 8 as Guido suggests. +1 Have we agreed the policy though? 
> The library documentation may then contain a note about the difference > in compatibility guarantees for public and private APIs, say that any > interface and behaviour documented in the manual qualifies as public, > then point readers to PEP 8 for the precise details. > +1 This sounds like the right approach to me. All the best, Michael > A similar note could be placed in the C API documentation (with a > reference to the detailed policy in PEP 7, perhaps REsTify'ing that > PEP in the process in order to link directly to the naming convention > section). > > Cheers, > Nick. > -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. 
From lukasz at langa.pl  Wed Nov 17 13:37:27 2010
From: lukasz at langa.pl (Łukasz Langa)
Date: Wed, 17 Nov 2010 13:37:27 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <4CE3C31D.50701@voidspace.org.uk>
References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk>
Message-ID: <4CE3CC87.1000105@langa.pl>

On 17.11.2010 12:57, Michael Foord wrote:
> On 17/11/2010 11:45, Nick Coghlan wrote:
>> The definition of the public/private policy in all its gory detail
>> should be in PEP 8 as Guido suggests.
>
> +1
>

Guido did not say that, though. I'm with Fred and other people who
agree that PEPs should be more or less immutable. Let's make a new
document (PEP 88?). The reasoning was well laid out here:

http://mail.python.org/pipermail/python-dev/2010-November/105641.html
http://mail.python.org/pipermail/python-dev/2010-November/105642.html

> Have we agreed the policy though?
>

Everybody has their own opinion on the matter. This discussion thread
is getting too fractured to actually get us far enough with the
conclusions. Let's make a PEP and discuss concrete wording on a
concrete proposal.

>> The library documentation may then contain a note about the difference
>> in compatibility guarantees for public and private APIs, say that any
>> interface and behaviour documented in the manual qualifies as public,
>> then point readers to PEP 8 for the precise details.
>>
>
> +1

Yes, point to PEP 88.
Best regards,
Łukasz Langa

From jcea at jcea.es  Wed Nov 17 13:51:49 2010
From: jcea at jcea.es (Jesus Cea)
Date: Wed, 17 Nov 2010 13:51:49 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To:
References: <4CE2CF8F.4040500@jcea.es>
Message-ID: <4CE3CFE5.7070803@jcea.es>

On 17/11/10 08:18, Georg Brandl wrote:
> On 16.11.2010 19:38, Jesus Cea wrote:
>> Is there any updated mercurial schedule?.
>>
>> Any impact related with the new 3.2 schedule (three weeks offset)?
>
> I've been trying to contact Dirkjan and ask; generally, I don't
> see much connection to the 3.2 schedule (with the exception that
> the final migration day should not be a release day.)

I can't find the mail now, but I remember that months ago the Mercurial
migration schedule was mid-december. I wonder if there is any update.

--
Jesus Cea Avion
jcea at jcea.es - http://www.jcea.es/
jabber / xmpp:jcea at jabber.org

From emile.anclin at logilab.fr  Wed Nov 17 13:48:06 2010
From: emile.anclin at logilab.fr (Emile Anclin)
Date: Wed, 17 Nov 2010 13:48:06 +0100
Subject: [Python-Dev] python3k vs _ast
Message-ID: <201011171348.07169.emile.anclin@logilab>

hello everybody,

migrating Pylint to python3.x, we encounter a little problem :
in the tree generated by _ast, if we consider a "args" node
(representing an argument of a function), the "lineno" (and the
"col_offset") information disappeared from those nodes.
Is there a particular reason for that ?

In python2.x, the "args" nodes were just "Name" nodes, and as for now
we keep them as "AssName" nodes in astng/pylint and would like to know
where it was defined.
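One way to see what a given interpreter records for argument nodes is to parse a small function and inspect them. This is only a sketch: it assumes a Python 3 interpreter, where function arguments are `ast.arg` nodes, and it reads the position attributes through getattr with a default because, as described above, they may be absent. `describe_args` is a hypothetical helper name, not something provided by `ast` or Pylint.

```python
import ast

source = "def f(a, b):\n    return a + b\n"
tree = ast.parse(source)
func = tree.body[0]  # the FunctionDef node for f


def describe_args(func_node):
    """Return (name, lineno, col_offset) for each positional argument.

    Position fields are read defensively: None is reported when the
    running interpreter does not attach them to argument nodes.
    """
    rows = []
    for node in func_node.args.args:
        rows.append((
            node.arg,
            getattr(node, "lineno", None),
            getattr(node, "col_offset", None),
        ))
    return rows


for name, lineno, col_offset in describe_args(func):
    print(name, lineno, col_offset)
```

Running this under each target interpreter shows directly whether position information is available there, so a tool can fall back to the enclosing function's position when it is not.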
thx for any information -- Emile Anclin http://www.logilab.fr/ http://www.logilab.org/ Informatique scientifique & et gestion de connaissances From fuzzyman at voidspace.org.uk Wed Nov 17 14:11:51 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 17 Nov 2010 13:11:51 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE3CC87.1000105@langa.pl> References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@lan ga.pl> Message-ID: <4CE3D497.50102@voidspace.org.uk> On 17/11/2010 12:37, ?ukasz Langa wrote: > Am 17.11.2010 12:57, schrieb Michael Foord: >> On 17/11/2010 11:45, Nick Coghlan wrote: >>> The definition of the public/private policy in all its gory detail >>> should be in PEP 8 as Guido suggests. >> >> +1 >> > > Guido did not said that, though. I think that is a reasonable interpretation, and the suggestion that by "in a style guide" means "create a new style guide" is more of a stretch. > I'm with Fred and other people that agree that PEPs should be > more-less immutable. Let's make a new document (PEP 88?). The > reasoning was well laid out here: > > http://mail.python.org/pipermail/python-dev/2010-November/105641.html > http://mail.python.org/pipermail/python-dev/2010-November/105642.html In those emails Fred provides a solution to his most substantial difficulty, that other people base their own documents off pep8, by recommending that extension documents should refer to a specific revision. I don't think those reasons are compelling and the cost of splitting the Python development style guide into multiple documents are higher. 
(They run the risk of contradicting each other, if you want to find a particular rule you have multiple places to check, there is no single authoritative place to send people, people *wanting* to base documents off the Python style rules now have to refer to multiple places, etc.) So -1 on splitting Python development style guide into multiple documents. Michael >> Have we agreed the policy though? >> > > Everybody has their own opinion on the matter. This discussion thread > is getting too fractured to actually get us far enough with the > conclusions. Let's make a PEP and discuss concrete wording on a > concrete proposal. > >>> The library documentation may then contain a note about the difference >>> in compatibility guarantees for public and private APIs, say that any >>> interface and behaviour documented in the manual qualifies as public, >>> then point readers to PEP 8 for the precise details. >>> >> >> +1 > > Yes, point to PEP 88. > > > Best regards, > Łukasz Langa > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.
From fdrake at acm.org Wed Nov 17 14:21:57 2010 From: fdrake at acm.org (Fred Drake) Date: Wed, 17 Nov 2010 08:21:57 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE3D497.50102@voidspace.org.uk> References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> Message-ID: 2010/11/17 Michael Foord: > So -1 on splitting Python development style guide into multiple documents. I don't think that the publicness or API stability promises of the standard library are part of a style guide. They're an essential part of the library documentation. They aren't a guide for 3rd-party code, and are specific to the standard library. If we can't come up with something reasonable for the standard library, we *certainly* shouldn't be making recommendations on the matter for 3rd party code. If we do come up with something reasonable, we can recommend it to others later (once field-proven), and without duplication. (Possibly by referring to the standard library documentation, and possibly by refactoring. That's not important until we have something, though.) -Fred -- Fred L. Drake, Jr. "A storm broke loose in my mind."
--Albert Einstein From ncoghlan at gmail.com Wed Nov 17 14:24:39 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 17 Nov 2010 23:24:39 +1000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE3D497.50102@voidspace.org.uk> References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> Message-ID: 2010/11/17 Michael Foord: > I don't think those reasons are compelling and the cost of splitting the > Python development style guide into multiple documents are higher. (They run > the risk of contradicting each other, if you want to find a particular rule > you have multiple places to check, there is no single authoritative place to > send people, people *wanting* to base documents off the Python style rules > now have to refer to multiple places, etc.) > > So -1 on splitting Python development style guide into multiple documents. Indeed. We don't need to clarify things very often, but the idea of creating a new PEP every time we want to make something explicit that was historically implicit (or otherwise underspecified) is a silly idea. Allowing traceable revisions is what version control is for, and hence why the PEP archive is part of the SVN repository. As far as notifying current developers of any changes, they will generally be following python-dev anyway, or else will get pulled up on python-checkins if the policy change is significant (and this one really *isn't* all that significant - the only people it will affect are those deciding whether to document or deprecate implicitly public APIs and that almost never happens, since the vast majority of our APIs are explicitly public or private). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From fuzzyman at voidspace.org.uk Wed Nov 17 14:25:34 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 17 Nov 2010 13:25:34 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> Message-ID: <4CE3D7CE.8030108@voidspace.org.uk> On 17/11/2010 13:21, Fred Drake wrote: > 2010/11/17 Michael Foord: >> So -1 on splitting Python development style guide into multiple documents. > I don't think that the publicness or API stability promises of the > standard library are part of a style guide. They're an essential part > of the library documentation. They aren't a guide for 3rd-party code, > and are specific to the standard library. PEP 8 *isn't* targeted at third party code - it is the development style guide for the Python standard library. "This document gives coding conventions for the Python code comprising the standard library in the main Python distribution." The ideal place for informing the Python core developers of the naming conventions we should use for our public APIs... (Which is why Guido said that a style guide *is* the right place for this information.) It doesn't mean it shouldn't be information provided to library users as well. (As discussed.) All the best, Michael Foord > If we can't come up with something reasonable for the standard > library, we *certainly* shouldn't be making recommendations on the > matter for 3rd party code. If we do come up with something > reasonable, we can recommend it to others later (once field-proven), > and without duplication. (Possibly by referring to the standard > library documentation, and possibly by refactoring. That's not > important until we have something, though.)
> > > -Fred > > -- > Fred L. Drake, Jr. > "A storm broke loose in my mind." --Albert Einstein -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From dirkjan at ochtman.nl Wed Nov 17 14:23:59 2010 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Wed, 17 Nov 2010 14:23:59 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <4CE3CFE5.7070803@jcea.es> References: <4CE2CF8F.4040500@jcea.es> <4CE3CFE5.7070803@jcea.es> Message-ID: On Wed, Nov 17, 2010 at 13:51, Jesus Cea wrote: > I can't find the mail now, but I remember that months ago the Mercurial > migration schedule was mid-december. I wonder if there is any update. I'm still aiming for that date. I've had some problems getting the test repository together. It's almost done, but I'm on holiday in Boston and NYC this week, so I don't have much time to spend on it. The delay shouldn't be much more than a week, and we'll just compress the testing period such that the migration date should still be about the same, release schedules willing. Georg, if you have any further questions, mail is better than IRC while I'm here. 
Cheers, Dirkjan From phd at phd.pp.ru Wed Nov 17 14:29:59 2010 From: phd at phd.pp.ru (Oleg Broytman) Date: Wed, 17 Nov 2010 16:29:59 +0300 Subject: [Python-Dev] python3k vs _ast In-Reply-To: <201011171348.07169.emile.anclin@logilab> References: <201011171348.07169.emile.anclin@logilab> Message-ID: <20101117132959.GA29283@phd.pp.ru> Seems to be rather a usage question, not a development question (python-dev is about *developing* python, not *using* it). On Wed, Nov 17, 2010 at 01:48:06PM +0100, Emile Anclin wrote: > hello everybody, > > migrating Pylint to python3.x, we encounter a little problem : > in the tree generated by _ast, if we consider a "args" node (representing > an argument of a function), the "lineno" (and the "col_offset") > information disappeared from those nodes. Is there a particular > reason for that ? In python2.x, the "args" nodes were just "Name" nodes, > and as for now we keep them as "AssName" nodes in astng/pylint and would > like to know where it was defined. > > thx for any information > > -- > > Emile Anclin > http://www.logilab.fr/ http://www.logilab.org/ > Informatique scientifique & et gestion de connaissances Oleg. -- Oleg Broytman http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From ncoghlan at gmail.com Wed Nov 17 14:30:25 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 17 Nov 2010 23:30:25 +1000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> Message-ID: On Wed, Nov 17, 2010 at 11:21 PM, Fred Drake wrote: > 2010/11/17 Michael Foord : >> So -1 on splitting Python development style guide into multiple documents. 
> > I don't think that the publicness or API stability promises of the > standard library are part of a style guide. ?They're an essential part > of the library documentation. ?They aren't a guide for 3rd-party code, > and are specific to the standard library. > > If we can't come up with something reasonable for the standard > library, we *certainly* shouldn't be making recommendations on the > matter for 3rd party code. ?If we do come up with something > reasonable, we can recommend it to others later (once field-proven), > and without duplication. ?(Possibly by referring to the standard > library documentation, and possibly by refactoring. ?That's not > important until we have something, though.) Would it make people happier if we left PEP 7 and PEP 8 alone, and put the clarification of what constitutes a "public API" into PEP 5 instead? PEP 5 currently the deprecation policy for language constructs, it would be easy enough to extend it to all public APIs. The library documentation is *not* the right place for quibbling about what constitutes a public API when using other means than the library documentation to find APIs to call. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From lukasz at langa.pl Wed Nov 17 14:31:41 2010 From: lukasz at langa.pl (Łukasz Langa) Date: Wed, 17 Nov 2010 14:31:41 +0100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE3D497.50102@voidspace.org.uk> References: <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> Message-ID: <4CE3D93D.3010601@langa.pl> On 17.11.2010 14:11, Michael Foord wrote: > I don't think those reasons are compelling and the cost of splitting > the Python development style guide into multiple documents are higher. > (They run the risk of contradicting each other, if you want to find a > particular rule you have multiple places to check, there is no single > authoritative place to send people, people *wanting* to base documents > off the Python style rules now have to refer to multiple places, etc.) > > So -1 on splitting Python development style guide into multiple > documents. > Bah, again my English skills failed me in a critical moment ;) I was proposing creation of PEP 88 to supersede PEP 8. This would be better IMO for the following reasons: 1. Existing projects wouldn't have to explain afterwards why they differ from PEP 8, e.g. in terms of public/private API declaration. "Your project claims PEP8 conformance! Why don't you use __all__?" "Ah, that was before they've added this part to PEP8." 2. All other projects (new and old) would have a much more explicit (better than implicit) sign that *something significant has changed* in the recommended style. 3. As someone already said, PEP8 is not visible enough. Transition from PEP 8 to PEP 88 could help to make some hype that would help raise the awareness within the community. Mutating PEP8 is bad form.
We fight mercilessly over source code backwards compatibility so I think PEPs should be taken just as seriously in that regard. Łukasz From benjamin at python.org Wed Nov 17 14:36:37 2010 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 17 Nov 2010 07:36:37 -0600 Subject: [Python-Dev] python3k vs _ast In-Reply-To: <20101117132959.GA29283@phd.pp.ru> References: <201011171348.07169.emile.anclin@logilab> <20101117132959.GA29283@phd.pp.ru> Message-ID: 2010/11/17 Oleg Broytman: > Seems to be rather a usage question, not a development question (python-dev > is about *developing* python, not *using* it). Well, technically I think it's a feature request. > > On Wed, Nov 17, 2010 at 01:48:06PM +0100, Emile Anclin wrote: >> hello everybody, >> >> migrating Pylint to python3.x, we encounter a little problem : >> in the tree generated by _ast, if we consider a "args" node (representing >> an argument of a function), the "lineno" (and the "col_offset") >> information disappeared from those nodes. Is there a particular >> reason for that ? In python2.x, the "args" nodes were just "Name" nodes, >> and as for now we keep them as "AssName" nodes in astng/pylint and would >> like to know where it was defined. I wouldn't object to adding them back if you want to file a bug report.
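[Archive note] For readers following the _ast question quoted above, here is a minimal sketch of how argument positions can be inspected through the public `ast` module. It assumes a current Python 3 interpreter, where `ast.arg` nodes carry `lineno` and `col_offset`; at the time of this thread they did not, which is what the question is about. The sample source and the function name `f` are made up for illustration.

```python
import ast

# Illustrative source only; the function name and arguments are invented.
source = "def f(a, b):\n    pass\n"

tree = ast.parse(source)
func = tree.body[0]  # the FunctionDef node for f

# In Python 3 each positional argument is an ast.arg node
# (in Python 2 these were plain Name nodes).
positions = [(arg.arg, arg.lineno, arg.col_offset) for arg in func.args.args]

for name, lineno, col in positions:
    print(name, lineno, col)
```

A checker such as the one described in the question could use these positions to report where an argument was defined.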
-- Regards, Benjamin From fdrake at acm.org Wed Nov 17 14:45:03 2010 From: fdrake at acm.org (Fred Drake) Date: Wed, 17 Nov 2010 08:45:03 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> Message-ID: On Wed, Nov 17, 2010 at 8:30 AM, Nick Coghlan wrote: > The library documentation is *not* the right place for quibbling about > what constitutes a public API when using other means than the library > documentation to find APIs to call. Quibbling can happen on the mailing list, where it can be ignored by those who aren't interested. But the documentation is the right place to document what we come up with for the standard library. I expect what the tools do will inform any decisions, and the tools (those in the stdlib) will henceforth be maintained with that in mind. I *am* suggesting that the scope of this be restricted to what's appropriate for the standard library, rather than a general recommendation for others. Third-party projects are free to use what we come up with, or provide their own policies. That's theirs to decide, and I see no value in interfering with that. -Fred -- Fred L. Drake, Jr. "A storm broke loose in my mind."
--Albert Einstein From fuzzyman at voidspace.org.uk Wed Nov 17 14:53:24 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 17 Nov 2010 13:53:24 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE3D93D.3010601@langa.pl> References: <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE3D93D.3010601@langa.pl> Message-ID: <4CE3DE54.2070008@voidspace.org.uk> On 17/11/2010 13:31, Łukasz Langa wrote: > On 17.11.2010 14:11, Michael Foord wrote: >> I don't think those reasons are compelling and the cost of splitting >> the Python development style guide into multiple documents are >> higher. (They run the risk of contradicting each other, if you want >> to find a particular rule you have multiple places to check, there is >> no single authoritative place to send people, people *wanting* to >> base documents off the Python style rules now have to refer to >> multiple places, etc.) >> >> So -1 on splitting Python development style guide into multiple >> documents. >> > > Bah, again my English skills failed me in a critical moment ;) I was > proposing creation of PEP 88 to supersede PEP 8. This would be better > IMO for the following reasons: > > 1. Existing projects wouldn't have to explain afterwards why they > differ from PEP 8, e.g. in terms of public/private API declaration. > "Your project claims PEP8 conformance! Why don't you use __all__?" > "Ah, that was before they've added this part to PEP8." > > 2. All other projects (new and old) would have a much more explicit > (better than implicit) sign that *something significant has changed* > in the recommended style. > > 3. As someone already said, PEP8 is not visible enough.
Transition > from PEP 8 to PEP 88 could help to make some hype that would help > raise the awareness within the community. > > Mutating PEP8 is bad form. We fight mercilessly over source code > backwards compatibility so I think PEPs should be taken just as > seriously in that regard. Given the following: http://code.python.org/hg/peps/log/6b223d6b8b24/pep-0008.txt Anyone who thinks that PEP 8 is immutable (and should remain so) is already wrong... As discussed, the goal is to codify what is already considered "best practise" within the wider community and the standard library *anyway*. So in practise this won't be a great surprise or change. As to the publicity, PEP 8 is both the most widely known PEP and the most widely known Python style guide. This isn't an argument for letting it rot, nor for deprecating it and invalidating all those tutorials / developers / links / books that consider it authoritative. Better to carefully and slowly evolve it as practise and the language change. For those wanting immutable versions we provide that in the form of specific revisions. All the best, Michael > > > Łukasz -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.
From steve at pearwood.info Wed Nov 17 15:16:53 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 18 Nov 2010 01:16:53 +1100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <87lj4t9cqq.fsf@benfinney.id.au> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <87lj4t9cqq.fsf@benfinney.id.au> Message-ID: <4CE3E3D5.3040607@pearwood.info> Ben Finney wrote: > I don't know about Guido, but I'd be -1 on suggestions to add more > normative information to PEP 7, PEP 8, PEP 257, or any other established > style guide PEP. I certainly don't want to have to keep going back to > the same documents frequently just to see if the set of recommendations > I already know has changed recently. This is not a problem unique to any specific PEP. How do we learn about any changes that might interest us? What are the alternatives? - our knowledge is fixed to what we knew at some particular date, and gets further and further obsolete as time goes by; - we actively search out new knowledge; - we wait for somebody to tell us that something we knew has changed. (E.g. I was rather surprised to learn that, sometime over the last few years, the number of extra-solar planets known to astronomers has increased from the one or two I was aware of to multiple dozens.) All three strategies have advantages and disadvantages. Regardless of whether future versions of the style-guide are called "PEP 8" or whether they are given new names ("PEP 8" -> "PEP 88" -> ...), we have the identical problem -- how do we know whether or not there is a new version of the style guide to look for? In twelve months time, how sure will we be that PEP 88 is the most recent version to look for? Perhaps we missed the release of PEP 95.
The one advantage of giving each revision of the document an updated name is that, under some circumstances, we *might* be able to detect a new revision easily. If I think that PEP 88 is the most recent version, and somebody says that the recommended style guide is PEP 89, I might: - think that he merely made a mistake, and meant to say 88; or - think that there is a new document for me to look at. > Rather, I took Guido's mention of "this belongs in a style guide" as > suggesting a *new* style guide. Perhaps one that explicitly obsoletes an > existing one or perhaps not; either way, the updated normative > recommendations are in a new document with a new name, so that one knows > whether one has already read it. How do you know which is the most recent version of the style guide to look at? Instead of doing a O(1) lookup of PEP 8, you have to follow a potentially O(N) search: PEP 8 is obsoleted by PEP 88... go and look at PEP 88. PEP 88 is obsoleted by PEP 93... go and look at PEP 93. PEP 93 is obsoleted by PEP 123... go and look at PEP 123. PEP 123 doesn't contain an "obsoleted by" notice, so: (1) either it is the current document, or (2) it has been obsoleted, but the link to the new version was missed, and it is now very hard to discover what the current document is called. Personally, I don't think the current PEP arrangement is broken enough to change it. Each PEP is already tracked in VCS and history is available for it. There's insufficient advantage, and some disadvantage, to splitting each revision of the PEPs into new documents with new names. -1 on the idea.
-- Steven From emile.anclin at logilab.fr Wed Nov 17 15:18:14 2010 From: emile.anclin at logilab.fr (Emile Anclin) Date: Wed, 17 Nov 2010 15:18:14 +0100 Subject: [Python-Dev] python3k vs _ast In-Reply-To: References: <201011171348.07169.emile.anclin@logilab> <20101117132959.GA29283@phd.pp.ru> Message-ID: <201011171518.14387.emile.anclin@logilab> On Wednesday 17 November 2010 14:36:37 Benjamin Peterson wrote: > I wouldn't object to adding them back if you want to file a bug report. Ok, thank you for the quick reply. Here is the issue: http://bugs.python.org/issue10445 -- Emile Anclin http://www.logilab.fr/ http://www.logilab.org/ Informatique scientifique & et gestion de connaissances From steve at pearwood.info Wed Nov 17 15:19:22 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 18 Nov 2010 01:19:22 +1100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE3D93D.3010601@langa.pl> References: <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE3D93D.3010601@langa.pl> Message-ID: <4CE3E46A.9030905@pearwood.info> Łukasz Langa wrote: > Mutating PEP8 is bad form. We fight mercilessly over source code > backwards compatibility so I think PEPs should be taken just as > seriously in that regard. There's no comparison between the two. If you change your library's API -- not "source code", it doesn't matter if the source code changes so long as the interface remains backwards compatible -- then you will break other people's code. If we change PEP 8, then all that will happen is that some people's coding style will no longer be exactly compatible with PEP 8. Their code will continue to work.
-- Steven From ncoghlan at gmail.com Wed Nov 17 15:19:39 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 18 Nov 2010 00:19:39 +1000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> Message-ID: On Wed, Nov 17, 2010 at 11:45 PM, Fred Drake wrote: > On Wed, Nov 17, 2010 at 8:30 AM, Nick Coghlan wrote: >> The library documentation is *not* the right place for quibbling about >> what constitutes a public API when using other means than the library >> documentation to find APIs to call. > > Quibbling can happen on the mailing list, where it can be ignored by > those who aren't interested. > > But the documentation is the right place to document what we come up > with for the standard library. I expect what the tools do will inform > any decisions, and the tools (those in the stdlib) will henceforth be > maintained with that in mind. > > I *am* suggesting that the scope of this be restricted to what's > appropriate for the standard library, rather than a general > recommendation for others. Third-party projects are free to use what > we come up with, or provide their own policies. That's theirs to > decide, and I see no value in interfering with that. The standard library documentation should say that the public API is what the documentation says it is. Officially, anyone going outside those documented APIs should not be surprised if things get removed or changed arbitrarily without warning. That has long been the python-dev policy and I, for one, don't think it should change.
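[Archive note] The naming conventions that keep coming up in this thread (a leading underscore, and `__all__`) can be illustrated with a small sketch. It is not taken from any message here; the module name `demo` and its functions are hypothetical, and `star_import_names` is a helper written for this note that mimics the documented rule for `from module import *`.

```python
import types

# Hypothetical module source, for illustration only.
MODULE_SOURCE = '''
__all__ = ["parse"]

def parse(text):
    return _tokenize(text)

def _tokenize(text):      # leading underscore: private by convention
    return text.split()

def helper(text):         # no underscore, not in __all__: "looks public"
    return text.upper()
'''

demo = types.ModuleType("demo")
exec(MODULE_SOURCE, demo.__dict__)


def star_import_names(module):
    """Return the names that `from module import *` would bind."""
    if hasattr(module, "__all__"):
        return sorted(module.__all__)
    # Without __all__, every name not starting with an underscore is exported.
    return sorted(name for name in vars(module) if not name.startswith("_"))


print(star_import_names(demo))
```

In this sketch `helper` is the kind of name under discussion: it has no underscore, so it reads as public in the source or at the interactive prompt, yet `__all__` does not list it.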
What we're talking about in this thread is what to do in the grey area of APIs which are not included in the official documentation, but also don't have names starting with an underscore so they "look public" when reading the source code or exploring the API in the interactive interpreter. It *may* be appropriate for the standard library documentation to acknowledge that this grey area exists (I'm not yet convinced on that point), but it definitely should *not* be encouraging anyone to rely on it or on our policies for dealing with it. The policy we're aiming to clarify here is what we should do when we come across standard library APIs that land in the grey area, with there being two appropriate ways to deal with them: 1. Document them and make them officially public 2. Deprecate the public names and make them officially private (with the public names later removed in accordance with normal deprecation procedures) The actual approach taken will vary on a case-by-case basis (and is a little trickier in the case of module level globals, since those can't be deprecated properly), but is always aimed at bringing the standard library more into line with the official position (i.e. APIs are either public-and-documented or private). So the official policy from a language *user* point of view would remain unchanged (i.e. if it isn't documented, you're on your own). As a *pragmatic* policy, however, we would explicitly acknowledge that developers may inadvertently use an undocumented API without realising that it isn't technically public, and hence apply the normal deprecation process even though the official policy says we don't have to. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rdmurray at bitdance.com Wed Nov 17 15:19:35 2010 From: rdmurray at bitdance.com (R. 
David Murray) Date: Wed, 17 Nov 2010 09:19:35 -0500 Subject: [Python-Dev] python3k vs _ast In-Reply-To: References: <201011171348.07169.emile.anclin@logilab> <20101117132959.GA29283@phd.pp.ru> Message-ID: <20101117141935.599592188CD@kimball.webabinitio.net> On Wed, 17 Nov 2010 07:36:37 -0600, Benjamin Peterson wrote: > 2010/11/17 Oleg Broytman : > > Seems to be rather a usage question, not a development question (python-dev > > is about *developing* python, not *using* it). > > Well, technically I think it's a feature request. > > > > > On Wed, Nov 17, 2010 at 01:48:06PM +0100, Emile Anclin wrote: > >> hello everybody, > >> > >> migrating Pylint to python3.x, we encounter a little problem : > >> in the tree generated by _ast, if we consider a "args" node (representing > >> an argument of a function), the "lineno" (and the "col_offset") > >> information disappeared from those nodes. Is there a particular > >> reason for that ? In python2.x, the "args" nodes were just "Name" nodes, > >> and as for now we keep them as "AssName" nodes in astng/pylint and would > >> like to know where it was defined. > > I wouldn't object to adding them back if you want to file a bug report. It also seems to me that it was a perfectly appropriate question for this list. The question was "why did you developers drop this (obscure) feature that we depend on in Python3?" I don't think that question would make sense on python-list. Granted, there's a fuzzy line there, but pylint is really development infrastructure :) The python-porting list would have been a good alternate choice. -- R. 
David Murray www.bitdance.com From fuzzyman at voidspace.org.uk Wed Nov 17 15:25:01 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 17 Nov 2010 14:25:01 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> Message-ID: <4CE3E5BD.6050700@voidspace.org.uk> On 17/11/2010 14:19, Nick Coghlan wrote: > On Wed, Nov 17, 2010 at 11:45 PM, Fred Drake wrote: >> On Wed, Nov 17, 2010 at 8:30 AM, Nick Coghlan wrote: >>> The library documentation is *not* the right place for quibbling about >>> what constitutes a public API when using other means than the library >>> documentation to find APIs to call. >> Quibbling can happen on the mailing list, where it can be ignored by >> those who aren't interested. >> >> But the documentation is the right place to document what we come up >> with for the standard library. I expect what the tools do will inform >> any decisions, and the tools (those in the stdlib) will henceforth be >> maintained with that in mind. >> >> I *am* suggesting that the scope of this be restricted to what's >> appropriate for the standard library, rather than a general >> recommendation for others. Third-party projects are free to use what >> we come up with, or provide their own policies. That's theirs to >> decide, and I see no value in interfering with that. > The standard library documentation should say that the public API is > what the documentation says it is. Officially, anyone going outside > those documented APIs should not be surprised if things get removed or > changed arbitrarily without warning. That has long been the python-dev > policy and I, for one, don't think it should change. 
> > What we're talking about in this thread is what to do in the grey area > of APIs which are not included in the official documentation, but also > don't have names starting with an underscore so they "look public" We're *also* discussing codifying the naming conventions (or using __all__) within the standard library, so it isn't just about deprecations (which is why I think PEP 8 rather than PEP 5). This is so that in the future if a name looks public users can have more confidence that it actually is... Obviously what to do about modules that don't follow these rules currently is a big part of it (and how the discussion started). All the best, Michael > when reading the source code or exploring the API in the interactive > interpreter. It *may* be appropriate for the standard library > documentation to acknowledge that this grey area exists (I'm not yet > convinced on that point), but it definitely should *not* be > encouraging anyone to rely on it or on our policies for dealing with > it. > > The policy we're aiming to clarify here is what we should do when we > come across standard library APIs that land in the grey area, with > there being two appropriate ways to deal with them: > 1. Document them and make them officially public > 2. Deprecate the public names and make them officially private (with > the public names later removed in accordance with normal deprecation > procedures) > > The actual approach taken will vary on a case-by-case basis (and is a > little trickier in the case of module level globals, since those can't > be deprecated properly), but is always aimed at bringing the standard > library more into line with the official position (i.e. APIs are > either public-and-documented or private). > > So the official policy from a language *user* point of view would > remain unchanged (i.e. if it isn't documented, you're on your own). 
As > a *pragmatic* policy, however, we would explicitly acknowledge that > developers may inadvertently use an undocumented API without realising > that it isn't technically public, and hence apply the normal > deprecation process even though the official policy says we don't have > to. > > Regards, > Nick. > -- http://www.voidspace.org.uk/ From jcea at jcea.es Wed Nov 17 15:31:02 2010 From: jcea at jcea.es (Jesus Cea) Date: Wed, 17 Nov 2010 15:31:02 +0100 Subject: [Python-Dev] I need help with IO testuite Message-ID: <4CE3E726.2030008@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi all. I am modifying IO module for Python 3.2, and I am unable to understand the mechanism used in IO testsuite to test both the C and the Python implementation. In particular I need to test that the implementation passes some parameters to the OS. The module uses "Mock" classes, but I think "Mock" is something else, and I don't see how it interpose between the C/Python code and the OS. If somebody could explain the mechanism a bit... Thanks for your time and attention. Some background: http://bugs.python.org/issue10142 - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . 
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOPnJplgi5GaxT1NAQLVqQP/cf9+hdLdoSMzY+cSquq7YZMiQOQ0aMEH ZRn+su4F3qg5e8MgEQOXFj9uGEjVDLwonE4nBZ+T3ovBcPCyGaLB/K/YttZGVM5/ O3gpzZss9bkMvuWQCblyEJp8uzJC831AwPDMg1Q0nbMiTnJlW5dY1CX9BD0gYPBW oIVBt2oBfCI= =hq7M -----END PGP SIGNATURE----- From ncoghlan at gmail.com Wed Nov 17 15:34:22 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 18 Nov 2010 00:34:22 +1000 Subject: [Python-Dev] Proposed adjustments to PEP 0 generation Message-ID: The lists of Meta-PEPs and Other Informational PEPs at the beginning of PEP 0 are starting to get a little long, and contain some outdated information that doesn't really deserve pride of place at the top of the PEP index. If I don't hear any objections in this thread, I plan to make the following tweaks to the PEP 0 generator "soonish": - make these two lists respect the "Withdrawn" and "Rejected" flags (i.e. taking the relevant PEPs out of this list and dropping them into later categories) - adding a new "Historical" category for PEPs that have served their purpose and are no longer of immediate interest (primarily old release PEPs, but also the old SVN migration PEP, the DVCS study and PEP 42) Regards, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From solipsis at pitrou.net Wed Nov 17 15:44:15 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 17 Nov 2010 15:44:15 +0100 Subject: [Python-Dev] I need help with IO testuite References: <4CE3E726.2030008@jcea.es> Message-ID: <20101117154415.41100ec5@pitrou.net> On Wed, 17 Nov 2010 15:31:02 +0100 Jesus Cea wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi all. 
I am modifying IO module for Python 3.2, and I am unable to > understand the mechanism used in IO testsuite to test both the C and the > Python implementation. > > In particular I need to test that the implementation passes some > parameters to the OS. > > The module uses "Mock" classes, but I think "Mock" is something else, > and I don't see how it interpose between the C/Python code and the OS. It doesn't interpose between Python and the OS: it mocks the OS. It is, therefore, a mock (!). Consequently, if you want to test that parameters are passed to the OS, you shouldn't use a mock, but an actual file. There are several tests which already do that, it shouldn't be too hard to write your own. Regards Antoine. From ncoghlan at gmail.com Wed Nov 17 15:46:01 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 18 Nov 2010 00:46:01 +1000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE3E5BD.6050700@voidspace.org.uk> References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE3E5BD.6050700@voidspace.org.uk> Message-ID: On Thu, Nov 18, 2010 at 12:25 AM, Michael Foord wrote: > We're *also* discussing codifying the naming conventions (or using __all__) > within the standard library, so it isn't just about deprecations (which is > why I think PEP 8 rather than PEP 5). This is so that in the future if a > name looks public users can have more confidence that it actually is... I deliberately glossed over that, since my stance on the naming conventions is "don't change them" (i.e. PEP 8 already says that a leading underscore is an internal use indicator, and I think that's how we should guide the clarification of our deprecation policy - just carving out an exception for imported modules). 
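To make that convention concrete, here is a minimal sketch of how one might list the names a module appears to expose under this reading of PEP 8 (no leading underscore, imported modules excluded). The helper name apparent_public_names is hypothetical, and letting __all__ take precedence when it is defined is an assumption for illustration, not something settled in this thread.

```python
import types


def apparent_public_names(module):
    """Return the names that "look public" on a module.

    Assumption: an explicit __all__ wins; otherwise a name looks public
    if it has no leading underscore and is not itself an imported module.
    """
    declared = getattr(module, "__all__", None)
    if declared is not None:
        return sorted(declared)
    return sorted(
        name
        for name, value in vars(module).items()
        if not name.startswith("_")
        and not isinstance(value, types.ModuleType)
    )
```

Running this over a stdlib module and comparing the result with the documented names would be one rough way to find the grey-area APIs under discussion.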
My original question related to dealing with the grey area in the deprecation policy (i.e. wanting to remove an API that was undocumented, but had a public name) and I'm happy that the existing style guide does answer my question (even though the implications aren't necessarily obvious). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From phd at phd.pp.ru Wed Nov 17 15:50:56 2010 From: phd at phd.pp.ru (Oleg Broytman) Date: Wed, 17 Nov 2010 17:50:56 +0300 Subject: [Python-Dev] python3k vs _ast In-Reply-To: <20101117141935.599592188CD@kimball.webabinitio.net> References: <201011171348.07169.emile.anclin@logilab> <20101117132959.GA29283@phd.pp.ru> <20101117141935.599592188CD@kimball.webabinitio.net> Message-ID: <20101117145056.GA1034@phd.pp.ru> On Wed, Nov 17, 2010 at 09:19:35AM -0500, R. David Murray wrote: > On Wed, 17 Nov 2010 07:36:37 -0600, Benjamin Peterson wrote: > > 2010/11/17 Oleg Broytman : > > > Seems to be rather a usage question, not a development question (python-dev > > > is about *developing* python, not *using* it). > > > > Well, technically I think it's a feature request. > > > > > > > > On Wed, Nov 17, 2010 at 01:48:06PM +0100, Emile Anclin wrote: > > >> hello everybody, > > >> > > >> migrating Pylint to python3.x, we encounter a little problem : > > >> in the tree generated by _ast, if we consider a "args" node (representing > > >> an argument of a function), the "lineno" (and the "col_offset") > > >> information disappeared from those nodes. Is there a particular > > >> reason for that ? In python2.x, the "args" nodes were just "Name" nodes, > > >> and as for now we keep them as "AssName" nodes in astng/pylint and would > > >> like to know where it was defined. > > > > I wouldn't object to adding them back if you want to file a bug report. > > It also seems to me that it was a perfectly appropriate question > for this list.
The question was "why did you developers drop this > (obscure) feature that we depend on in Python3?" The problem for me is the wording. A question like "why did you developers drop a feature?" is certainly a development question, while "like to know where it was defined" seems more like a usage question. I apologize for misunderstanding. > I don't think that > question would make sense on python-list. Granted, there's a fuzzy > line there, but pylint is really development infrastructure :) > > The python-porting list would have been a good alternate choice. > > -- > R. David Murray www.bitdance.com Oleg. -- Oleg Broytman http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From ncoghlan at gmail.com Wed Nov 17 15:58:30 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 18 Nov 2010 00:58:30 +1000 Subject: [Python-Dev] I need help with IO testuite In-Reply-To: <4CE3E726.2030008@jcea.es> References: <4CE3E726.2030008@jcea.es> Message-ID: On Thu, Nov 18, 2010 at 12:31 AM, Jesus Cea wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi all. I am modifying IO module for Python 3.2, and I am unable to > understand the mechanism used in IO testsuite to test both the C and the > Python implementation. > > In particular I need to test that the implementation passes some > parameters to the OS. > > The module uses "Mock" classes, but I think "Mock" is something else, > and I don't see how it interpose between the C/Python code and the OS. The "Mock" refers to stubbing out or substituting various layers of the IO stack with the Python implementations in the test file. It isn't related specifically to the C/Python switching. > If somebody could explain the mechanism a bit... The actual C/Python switching happens later in the file. It is best to start from the bottom of the file (with the list of test cases that are actually executed) and work your way up from there. 
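As a rough illustration of that layout, the sketch below shows the general pattern as I understand it: the tests are written once against class attributes, and thin "C" and "Py" subclasses bind those attributes to the io and _pyio modules respectively. The class names are made up for this example, and the real test file is considerably more involved (it also swaps in its mock classes the same way).

```python
import io              # C-accelerated implementation
import _pyio as pyio   # pure-Python implementation
import unittest


class BytesIOTestBase:
    """Tests written once; the implementation comes from a class attribute."""

    BytesIO = None  # bound by the concrete subclasses below

    def test_roundtrip(self):
        buf = self.BytesIO()
        buf.write(b"spam")
        buf.seek(0)
        self.assertEqual(buf.read(), b"spam")


class CBytesIOTest(BytesIOTestBase, unittest.TestCase):
    BytesIO = io.BytesIO


class PyBytesIOTest(BytesIOTestBase, unittest.TestCase):
    BytesIO = pyio.BytesIO


def run_both():
    """Run the same test body against both implementations."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite([
        loader.loadTestsFromTestCase(CBytesIOTest),
        loader.loadTestsFromTestCase(PyBytesIOTest),
    ])
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The base class deliberately does not inherit from unittest.TestCase, so only the two bound subclasses are collected and run.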
For what Amaury is talking about, what you can test is that the higher layers of the IO stack (e.g. BufferedReader) correctly pass the new flags down to the RawIO layer. You're correct that you can't really test that RawIO is actually passing the flags down to the OS. However, if you have a way to check whether the filesystem in use is ZFS, you may be able to create a conditionally executed test, such that correct behaviour can be verified just by running on a machine that uses ZFS for its temp directory. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tseaver at palladion.com Wed Nov 17 15:58:37 2010 From: tseaver at palladion.com (Tres Seaver) Date: Wed, 17 Nov 2010 09:58:37 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE3E3D5.3040607@pearwood.info> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <87lj4t9cqq.fsf@benfinney.id.au> <4CE3E3D5.3040607@pearwood.info> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 11/17/2010 09:16 AM, Steven D'Aprano wrote: > Ben Finney wrote: > >> I don't know about Guido, but I'd be -1 on suggestions to add more >> normative information to PEP 7, PEP 8, PEP 257, or any other established >> style guide PEP. I certainly don't want to have to keep going back to >> the same documents frequently just to see if the set of recommendations >> I already know has changed recently. > > This is not a problem unique to any specific PEP. How do we learn about > any changes that might interest us? What are the alternatives?
> > - our knowledge is fixed to what we knew at some particular date, and > gets further and further obsolete as time goes by; > > - we actively search out new knowledge; > > - we wait for somebody to tell us that something we knew has changed. > > (E.g. I was rather surprised to learn that, sometime over the last few > years, the number of extra-solar planets known to astronomers has > increased from the one or two I was aware of to multiple dozens.) > > All three strategies have advantages and disadvantages. > > Regardless of whether future versions of the style-guide are called "PEP > 8" or whether they are given new names ("PEP 8" -> "PEP 88" -> ...), we > have the identical problem -- how do we know whether or not there is a > new version of the style guide to look for? In twelve months' time, how > sure will we be that PEP 88 is the most recent version to look for? > Perhaps we missed the release of PEP 95. > > The one advantage of giving each revision of the document an updated > name is that, under some circumstances, we *might* be able to detect a > new revision easily. If I think that PEP 88 is the most recent version, > and somebody says that the recommended style guide is PEP 89, I might: > > - think that he merely made a mistake, and meant to say 88; or > - think that there is a new document for me to look at. > > >> Rather, I took Guido's mention of "this belongs in a style guide" as >> suggesting a *new* style guide. Perhaps one that explicitly obsoletes an >> existing one or perhaps not; either way, the updated normative >> recommendations are in a new document with a new name, so that one knows >> whether one has already read it. > > How do you know which is the most recent version of the style guide to > look at? Instead of doing an O(1) lookup of PEP 8, you have to follow a > potentially O(N) search: > > PEP 8 is obsoleted by PEP 88... go and look at PEP 88. > PEP 88 is obsoleted by PEP 93... go and look at PEP 93.
> PEP 93 is obsoleted by PEP 123... go and look at PEP 123. > PEP 123 doesn't contain an "obsoleted by" notice, so: > (1) either it is the current document, or > (2) it has been obsoleted, but the link to the new version was missed, > and it is now very hard to discover what the current document is called. > > Personally, I don't think the current PEP arrangement is broken enough > to change it. Each PEP is already tracked in VCS and history is > available for it. There's insufficient advantage, and some disadvantage, > to splitting each revision of the PEPs into new documents with new > names. -1 on the idea. FWIW, Guido recently ruled that updating PEP 333 to indicate how WSGI would work in Python3 was not appropriate, and suggested instead a new PEP (3333), stating[1]: Of those, IMO only textual clarifications ought to be made to an existing, accepted, widely implemented standards-track PEP. Note that the BDFL ruled this way even though the changes to PEP 333 were essentially clarifications which applied only to Python 3: the existing Python 2 semantics would have remained the same.[2] [1] http://permalink.gmane.org/gmane.comp.python.devel/117269 [2] http://permalink.gmane.org/gmane.comp.python.devel/117249 Tres.
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkzj7ZwACgkQ+gerLs4ltQ7mPgCg1TpA+rF0WigLGB1xeuUTyRF7 MLQAnjGUgWZUqQBLfbwl6RanA+ME4Hth =zuiQ -----END PGP SIGNATURE----- From ncoghlan at gmail.com Wed Nov 17 16:00:20 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 18 Nov 2010 01:00:20 +1000 Subject: [Python-Dev] I need help with IO testuite In-Reply-To: References: <4CE3E726.2030008@jcea.es> Message-ID: On Thu, Nov 18, 2010 at 12:58 AM, Nick Coghlan wrote: > For what Amaury is talking about, what you can test is that the higher > layers of the IO stack (e.g. BufferedReader) correctly pass the new > flags down to the RawIO layer. You're correct that you can't really > test that RawIO is actually passing the flags down to the OS. However, > if you have a way to check whether the filesystem in use is ZFS, you > may be able to create a conditionally executed test, such that correct > behaviour can be verified just by running on a machine that uses ZFS > for its temp directory. On further thought, the test should probably be unconditional - just allow a ValueError as an acceptable result that indicates the underlying filesystem isn't ZFS. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From alexander.belopolsky at gmail.com Wed Nov 17 16:17:45 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 17 Nov 2010 10:17:45 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> Message-ID: On Wed, Nov 17, 2010 at 9:19 AM, Nick Coghlan wrote: .. > The standard library documentation should say that the public API is > what the documentation says it is. Officially, anyone going outside > those documented APIs should not be surprised if things get removed or > changed arbitrarily without warning. That has long been the python-dev > policy and I, for one, don't think it should change. > +1 That's another reason why it is appropriate to document this in both the Library Reference and the Developers Guide (whatever it is). In the Library Reference we can say point-blank: "This is the authoritative documentation of what the Python Library provides. Anything not mentioned here is subject to change between releases without notice." In the Developers Guide, however, we can take a more nuanced approach that would start with a general policy that changing existing APIs, public or not, is costly and should not be done without significant offsetting benefit. More on this below. > What we're talking about in this thread is what to do in the grey area > of APIs which are not included in the official documentation, but also > don't have names starting with an underscore so they "look public" > when reading the source code or exploring the API in the interactive > interpreter.
It *may* be appropriate for the standard library > documentation to acknowledge that this grey area exists (I'm not yet > convinced on that point), but it definitely should *not* be > encouraging anyone to rely on it or on our policies for dealing with > it. > Users will venture into the grey area regardless of whether its existence is acknowledged or not. The Developers Guide should take this into consideration, but there is no need to encourage this practice in the Library Reference. In the Developers Guide, we can list a set of factors that need to be considered when changing or removing an undocumented API. For example: 1. Does it start with an underscore? 2. Is __all__ defined for the module? If so, is the name in __all__? 3. Is the API name well chosen for what it does? 4. How old is the module? Was it written before modern policies were adopted? 5. Is the API used in the standard library outside of the module? 6. Is the API broken? Can it be fixed? (If it was broken in several releases and nobody complained - it is ok to remove.) 7. Is the API used? A general google search or google code search can give an insight. The decision to remove an API should always be made on a case by case basis. Purely style compliance changes, such as "let's add __all__ and rename all names not in __all__ by prepending an underscore", should always add old names back as deprecated aliases. (Breaking from xyz import * by adding __all__ to xyz is probably ok because code using from xyz import * may be broken by any addition to xyz and users have been warned.) .. > So the official policy from a language *user* point of view would > remain unchanged (i.e. if it isn't documented, you're on your own). As > a *pragmatic* policy, however, we would explicitly acknowledge that > developers may inadvertently use an undocumented API without realising > that it isn't technically public, and hence apply the normal > deprecation process even though the official policy says we don't have > to.
+1 From foom at fuhm.net Wed Nov 17 16:24:12 2010 From: foom at fuhm.net (James Y Knight) Date: Wed, 17 Nov 2010 10:24:12 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> Message-ID: <3B40F127-F82B-4A91-9485-3D089DAF4A4F@fuhm.net> On Nov 17, 2010, at 9:19 AM, Nick Coghlan wrote: > (and is a little trickier in the case of module level globals, since those can't be deprecated properly) People keep saying this, but there have already been examples shown of how to do it. I actually think that python should include a way to do so standard -- it's a reasonable enough desire, as shown by how many times in this thread the inability to do so has been mentioned. If the existing working 3rd-party mechanisms aren't good enough for python-dev standards, come up with a new way... 
James From guido at python.org Wed Nov 17 16:30:03 2010 From: guido at python.org (Guido van Rossum) Date: Wed, 17 Nov 2010 07:30:03 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <3B40F127-F82B-4A91-9485-3D089DAF4A4F@fuhm.net> References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <3B40F127-F82B-4A91-9485-3D089DAF4A4F@fuhm.net> Message-ID: On Wed, Nov 17, 2010 at 7:24 AM, James Y Knight wrote: > On Nov 17, 2010, at 9:19 AM, Nick Coghlan wrote: >> (and is a little trickier in the case of module level globals, since those can't be deprecated properly) > > People keep saying this, but there have already been examples shown of how to do it. I actually think that python should include a way to do so standard -- it's a reasonable enough desire, as shown by how many times in this thread the inability to do so has been mentioned. If the existing working 3rd-party mechanisms aren't good enough for python-dev standards, come up with a new way... That's quite the distraction from the current thread though. Start discussing it on python-ideas, or submit a code fix, or something in between. But the hackish way that some 3rd party frameworks use (replacing the module object with a class instance in sys.modules) is clearly not right for the standard library (I'll explain on python-ideas if you insist). 
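For readers who have not seen the technique referred to above, a minimal sketch follows. It is an illustration only, not code taken from any particular framework or from this thread: the class name DeprecatingModule and the module name legacy_demo are invented for the example, and a real framework would likely handle more cases (attribute assignment, reload, pickling and so on).

```python
import sys
import types
import warnings


class DeprecatingModule(types.ModuleType):
    """Module stand-in that warns when a deprecated global is read."""

    def __init__(self, wrapped, deprecated):
        super().__init__(wrapped.__name__, wrapped.__doc__)
        # Copy every name except the deprecated ones, so that reading a
        # deprecated name falls through to __getattr__ below.
        for name, value in vars(wrapped).items():
            if name not in deprecated:
                setattr(self, name, value)
        self._wrapped = wrapped
        self._deprecated = frozenset(deprecated)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup has failed.
        deprecated = self.__dict__.get("_deprecated", frozenset())
        if name in deprecated:
            warnings.warn(
                "%s.%s is deprecated" % (self.__name__, name),
                DeprecationWarning,
                stacklevel=2,
            )
            return getattr(self.__dict__["_wrapped"], name)
        raise AttributeError(name)


# Hypothetical module with one global we would like to retire.
legacy = types.ModuleType("legacy_demo")
legacy.new_name = 42
legacy.old_name = 42

# The "hack": the entry in sys.modules is swapped for the wrapper, so a
# later "import legacy_demo" hands out the wrapper instead of the module.
sys.modules["legacy_demo"] = DeprecatingModule(legacy, {"old_name"})
```

Note that functions defined inside the real module would still see the original globals, which is one of the ways the wrapper and the module can drift apart.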
-- --Guido van Rossum (python.org/~guido) From guido at python.org Wed Nov 17 16:52:37 2010 From: guido at python.org (Guido van Rossum) Date: Wed, 17 Nov 2010 07:52:37 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <87lj4t9cqq.fsf@benfinney.id.au> References: <64DF4272-FF17-4E82-96F5-1DA6CA3A06EC@gmail.com> <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <87lj4t9cqq.fsf@benfinney.id.au> Message-ID: On Tue, Nov 16, 2010 at 1:31 PM, Ben Finney wrote: > I don't know about Guido, but I'd be -1 on suggestions to add more > normative information to PEP 7, PEP 8, PEP 257, or any other established > style guide PEP. I certainly don't want to have to keep going back to > the same documents frequently just to see if the set of recommendations > I already know has changed recently. > > Rather, I took Guido's mention of "this belongs in a style guide" as > suggesting a *new* style guide. Perhaps one that explicitly obsoletes an > existing one or perhaps not; either way, the updated normative > recommendations are in a new document with a new name, so that one knows > whether one has already read it. That's not what I meant. In the case of style guides I think it is totally appropriate to update the PEP as new rules are developed or existing ones are clarified (or even changed). I certainly don't want to get into the situation where the style guide is spread over multiple documents that need to be taken together to make sense. It's not like PEP 8 specifies an API that is going to break code in the future -- it is a set of conventions. You could create a new PEP or move the style guide out of the PEP system (a not unreasonable option) but the effect of changes to the style guide is the same: some fraction of old code will become non-compliant. So what? 
A style guide is just that -- a guide for coding style. Every good style guide contains an escape clause: in PEP 8 it is the section named "A Foolish Consistency is the Hobgoblin of Little Minds". I've seen many unreasonable uses of style guides. This is a recurring theme with Google's internal style guides too. For example, some people get in an argument with a code reviewer about what's the best way to do something, and they can't agree -- so now they want a resolution in the style guide, no matter how specific their argument is to one particular context. Other people claim you cannot change a style guide because it would make existing code unnecessarily non-compliant. There are the people who insist that the style guide be followed mindlessly, even in situations where using a different style would be clearly better. Then there are the people who want to update the entire code base to become compliant after each style change. Etc., etc. All I want to say is, people lighten up. The style guide can't solve all your problems. You are never going to have all code compliant. Use the style guide when it helps, ignore it when it's in the way. Finally, there's the issue of the scope of PEP 8. Its heading says that it applies to the stdlib. The reason I put this in was so that 3rd party developers who disagreed with (part of) PEP 8 would not feel obligated to follow it. At the same time I would hope that most people see its value and follow (most of) it for their own code, accepting that a more universal set of conventions helps readability of all code. I would not be against changes to the style guide that emphasize that some rules apply specifically to the stdlib (the rules about mostly not using non-ASCII characters come to mind) and even to include some normative rules for stdlib developers (e.g. exactly how to use __all__ and private names). But we cannot hope that all stdlib modules will all look exactly alike. 
It is the work of many contributors, over many years, with different backgrounds and intentions. That's fine. Let's try to make new stdlib modules use the best style we can think of, but limit the time spent fretting over code that's already there. -- --Guido van Rossum (python.org/~guido) From jcea at jcea.es Wed Nov 17 17:07:02 2010 From: jcea at jcea.es (Jesus Cea) Date: Wed, 17 Nov 2010 17:07:02 +0100 Subject: [Python-Dev] Help deploying a new buildbot running OpenIndiana/x86 Message-ID: <4CE3FDA6.5040703@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi, everybody. I am glad to say I am installing an OpenIndiana zone (Openindiana is a fork of Indiana, a distribution of OpenSolaris) with the aim to be a buildbot for python development. This machine has plenty of disk (even SSD!), CPU and memory for the task. I am reading http://wiki.python.org/moin/BuildBot . I have installed buildbotslave already, but I need passwords, etc., to link to python buildbot infraestructure. The machine is behind a NAT system, so any incoming connection will need to be documented and a port mapping request to be done. So, after installing buildbotslave, what is the next step?. Thanks to OpenIndiana staff, specially Alasdair Lumsden, for providing the physical resources for this attempt. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . 
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOP9pplgi5GaxT1NAQLmWQP6AqEGqEX3b50qKTP2MrkJwYQ8pXCOJm+6 fGB4jpH+i47mzgSOtANvrp1N5qOmHXzjbdWlVrL2/7ZOeLiGWSnq/ZvpTrYaysU3 o2zG4rhk48jsSYE7u0EoSKk272LmAiTU6WBSt6ZMzOGWIQxdjMhs/OVanpFybBc0 rCbATfdJ3hQ= =rIqM -----END PGP SIGNATURE----- From solipsis at pitrou.net Wed Nov 17 17:23:01 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 17 Nov 2010 17:23:01 +0100 Subject: [Python-Dev] Help deploying a new buildbot running OpenIndiana/x86 References: <4CE3FDA6.5040703@jcea.es> Message-ID: <20101117172301.36ac88f9@pitrou.net> On Wed, 17 Nov 2010 17:07:02 +0100 Jesus Cea wrote: > > I am reading http://wiki.python.org/moin/BuildBot . I have installed > buildbotslave already, but I need passwords, etc., to link to python > buildbot infraestructure. > > The machine is behind a NAT system, so any incoming connection will need > to be documented and a port mapping request to be done. There is no incoming connection; however, a bunch of outgoing connections are made to various hosts by various tests, so it's better if there's no overzealous firewall in-between. Regards Antoine. 
From foom at fuhm.net Wed Nov 17 17:23:43 2010 From: foom at fuhm.net (James Y Knight) Date: Wed, 17 Nov 2010 11:23:43 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <3B40F127-F82B-4A91-9485-3D089DAF4A4F@fuhm.net> Message-ID: On Nov 17, 2010, at 10:30 AM, Guido van Rossum wrote: > On Wed, Nov 17, 2010 at 7:24 AM, James Y Knight wrote: >> On Nov 17, 2010, at 9:19 AM, Nick Coghlan wrote: >>> (and is a little trickier in the case of module level globals, since those can't be deprecated properly) >> >> People keep saying this, but there have already been examples shown of how to do it. I actually think that python should include a way to do so standard -- it's a reasonable enough desire, as shown by how many times in this thread the inability to do so has been mentioned. If the existing working 3rd-party mechanisms aren't good enough for python-dev standards, come up with a new way... > > That's quite the distraction from the current thread though. Start > discussing it on python-ideas, or submit a code fix, or something in > between. But the hackish way that some 3rd party frameworks use > (replacing the module object with a class instance in sys.modules) is > clearly not right for the standard library (I'll explain on > python-ideas if you insist). I just don't want people to use the current lack as an excuse to simply remove module attributes without prior deprecation (or make a compatibility policy which recommends doing such a thing). I'll leave it up to the experts on this list (or python-ideas...) to determine how to implement a module-level deprecation in a way that isn't considered "hackish". 
(Or, if there is no such way, there's also the alternative of simply never removing module-level names.) James From guido at python.org Wed Nov 17 17:38:09 2010 From: guido at python.org (Guido van Rossum) Date: Wed, 17 Nov 2010 08:38:09 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3D497.50102@voidspace.org.uk> <3B40F127-F82B-4A91-9485-3D089DAF4A4F@fuhm.net> Message-ID: On Wed, Nov 17, 2010 at 8:23 AM, James Y Knight wrote: > On Nov 17, 2010, at 10:30 AM, Guido van Rossum wrote: >> On Wed, Nov 17, 2010 at 7:24 AM, James Y Knight wrote: >>> On Nov 17, 2010, at 9:19 AM, Nick Coghlan wrote: >>>> (and is a little trickier in the case of module level globals, since those can't be deprecated properly) >>> >>> People keep saying this, but there have already been examples shown of how to do it. I actually think that python should include a way to do so standard -- it's a reasonable enough desire, as shown by how many times in this thread the inability to do so has been mentioned. If the existing working 3rd-party mechanisms aren't good enough for python-dev standards, come up with a new way... >> >> That's quite the distraction from the current thread though. Start >> discussing it on python-ideas, or submit a code fix, or something in >> between. But the hackish way that some 3rd party frameworks use >> (replacing the module object with a class instance in sys.modules) is >> clearly not right for the standard library (I'll explain on >> python-ideas if you insist). > > I just don't want people to use the current lack as an excuse to simply remove module attributes without prior deprecation (or make a compatibility policy which recommends doing such a thing). 
I'll leave it up to the experts on this list (or python-ideas...) to determine how to implement a module-level deprecation in a way that isn't considered "hackish". (Or, if there is no such way, there's also the alternative of simply never removing module-level names.) Deprecation doesn't *require* logging a warning or raising an exception. You can also add a note to the docs, or if it is undocumented, just add a comment to the code. (Though if it is in widespread use despite being undocumented, a better way would be to document it first -- as immediately deprecated if necessary.) Deprecation is in the end a way to give people advance warning about future changes. The mechanism of the warning doesn't always have to be implemented by the interpreter/compiler/parser or whatever other tool. -- --Guido van Rossum (python.org/~guido) From jcea at jcea.es Wed Nov 17 17:52:14 2010 From: jcea at jcea.es (Jesus Cea) Date: Wed, 17 Nov 2010 17:52:14 +0100 Subject: [Python-Dev] Help deploying a new buildbot running OpenIndiana/x86 In-Reply-To: <20101117172301.36ac88f9@pitrou.net> References: <4CE3FDA6.5040703@jcea.es> <20101117172301.36ac88f9@pitrou.net> Message-ID: <4CE4083E.1080603@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 17/11/10 17:23, Antoine Pitrou wrote: > There is no incoming connection; however, a bunch of outgoing > connections are made to various hosts by various tests, so it's better > if there's no overzealous firewall in-between. I know that, just confirming. """ You'll need to get someone to create the slavename/slavepasswd on dinsdale.python.org before doing this. Talk to someone like Antoine Pitrou, Martin von L?wis, Anthony or Neal Norwitz to do this. #python-dev on freenode is a good place to ask. """ ?Could you provide the connection credential?. I rather prefer to skip the IRC (I am a XMPP guy), but I can connect to freenode if you need it. 
- -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . _/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOQIPplgi5GaxT1NAQJJggP7B+kMnhEpZQlxCy8E95Qs3Q70zJmQJXjj aodjURYlIW9PJLXUMH0dhiK3Oggsl0k/iq44pL1fu+LRpgD7bo9Snxi4IBgYlArj IMGThrpdEHKVh0r2TkVsmkCA6pAwV3crM3170ItzSDqXZPmGQgqdqFuD5fk8xQl2 caqC+sTcJjw= =zbQs -----END PGP SIGNATURE----- From foom at fuhm.net Wed Nov 17 18:05:02 2010 From: foom at fuhm.net (James Y Knight) Date: Wed, 17 Nov 2010 12:05:02 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3D497.50102@voidspace.org.uk> <3B40F127-F82B-4A91-9485-3D089DAF4A4F@fuhm.net> Message-ID: On Nov 17, 2010, at 11:38 AM, Guido van Rossum wrote: > Deprecation doesn't *require* logging a warning or raising an > exception. You can also add a note to the docs, or if it is > undocumented, just add a comment to the code. (Though if it is in > widespread use despite being undocumented, a better way would be to > document it first -- as immediately deprecated if necessary.) > > Deprecation is in the end a way to give people advance warning about > future changes. The mechanism of the warning doesn't always have to be > implemented by the interpreter/compiler/parser or whatever other tool. Well, that's certainly a possible policy. 
I'd suggest that adding notes to the docs after-the-fact is a singularly ineffective way of giving people advance warning of feature removal compared to having the interpreter/compiler/parser or whatever other tool warn you. And if that's to be python's policy, when it's possible to do better, I'm disappointed. (But won't respond further, my point is made.)

James

From solipsis at pitrou.net Wed Nov 17 18:10:02 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 17 Nov 2010 18:10:02 +0100
Subject: [Python-Dev] Help deploying a new buildbot running OpenIndiana/x86
References: <4CE3FDA6.5040703@jcea.es> <20101117172301.36ac88f9@pitrou.net> <4CE4083E.1080603@jcea.es>
Message-ID: <20101117181002.73e61cd1@pitrou.net>

>
> Could you provide the connection credentials? I rather prefer to skip
> the IRC (I am a XMPP guy), but I can connect to freenode if you need it.

I've already sent you a private e-mail.

From jcea at jcea.es Wed Nov 17 18:13:24 2010
From: jcea at jcea.es (Jesus Cea)
Date: Wed, 17 Nov 2010 18:13:24 +0100
Subject: [Python-Dev] Help deploying a new buildbot running OpenIndiana/x86
In-Reply-To: <20101117181002.73e61cd1@pitrou.net>
References: <4CE3FDA6.5040703@jcea.es> <20101117172301.36ac88f9@pitrou.net> <4CE4083E.1080603@jcea.es> <20101117181002.73e61cd1@pitrou.net>
Message-ID: <4CE40D34.4060804@jcea.es>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 17/11/10 18:10, Antoine Pitrou wrote:
>>
>> Could you provide the connection credentials? I rather prefer to skip
>> the IRC (I am a XMPP guy), but I can connect to freenode if you need it.
>
> I've already sent you a private e-mail.

OK. Sorry. My mail greylist is probably involved. Let's wait for another hour...

Thanks for your time, Antoine.

- -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ .
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOQNNJlgi5GaxT1NAQJVLAP9ElT0GGLWBZsGBMAHbzZCn1b0SC18Ki8o jp5eQgxDGRFo8ZPWVz3Q+/TGoIIs8UHLKjpYskfEae9Vm789lMlY/OZFerTn1Eus D9ldaVMKwpsLgSIgQr3AdAm3d5fXKvT6SXhGVwCOnuVi/iDiIGJl54UXoSqtLqo8 7PVP3LDaK8c= =8poZ -----END PGP SIGNATURE----- From janssen at parc.com Wed Nov 17 18:12:53 2010 From: janssen at parc.com (Bill Janssen) Date: Wed, 17 Nov 2010 09:12:53 PST Subject: [Python-Dev] Help deploying a new buildbot running OpenIndiana/x86 In-Reply-To: <4CE4083E.1080603@jcea.es> References: <4CE3FDA6.5040703@jcea.es> <20101117172301.36ac88f9@pitrou.net> <4CE4083E.1080603@jcea.es> Message-ID: <72914.1290013973@parc.com> Jesus Cea wrote: > On 17/11/10 17:23, Antoine Pitrou wrote: > > There is no incoming connection; however, a bunch of outgoing > > connections are made to various hosts by various tests, so it's better > > if there's no overzealous firewall in-between. For those of us who can't do that, there's a list of what machines the testing framework needs to be able to reach at . If you modify the tests, please keep that list up-to-date. 
Bill

From alexander.belopolsky at gmail.com Wed Nov 17 19:35:19 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 17 Nov 2010 13:35:19 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk>
Message-ID:

On Wed, Nov 17, 2010 at 8:30 AM, Nick Coghlan wrote:
..
> The library documentation is *not* the right place for quibbling about
> what constitutes a public API when using other means than the library
> documentation to find APIs to call.
>

+1

People who bother to read the Library Reference most likely already know that it is the authoritative source. People who read the sources or use deep introspection most likely know that they are walking on thin ice. The only grey area is help() and dir(). Unfortunately many novice guides recommend using these tools for learning as follows:

    >>> L = []
    >>> dir(L)
    ['append', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
    >>> help(L.append)
    Help on built-in function append:
    ...

See http://docs.python.org/faq/general.html#is-python-a-good-language-for-beginning-programmers

Given the quirkiness of dir(), this is probably not the best practice. For the standard library however,

    >>> help('module')

or

    $ pydoc module

already refer users to the official manual. Unfortunately this feature is slightly broken in 3.x (the link takes you to 2.x documentation instead of 3.x). I have opened a bug report about this, http://bugs.python.org/issue10446, and would like to add a sentence or two to the "MODULE DOC" section explaining the differences between the auto-generated docs and the official manual.
We may also revisit the rules used by help() to decide what to include on the auto-generated module implementation. Note that currently help() output excludes names not in __all__ if the module has __all__ defined. While I advocated this rule earlier in this thread, I now realize that it may not be quite practical. Consider the recent addition of open() to the tokenize module. It was documented in the manual, but (wisely) excluded from tokenize.__all__.

It appears that this discussion is converging to the conclusion that public API = documented in the reST manual. An unfortunate consequence is that it is not easy to discover public API programmatically. However, "not easy" does not mean "impossible." ReST documentation is highly structured and Sphinx already generates various indices that can be easily queried. Maybe some of these indices should be distilled into something compact and made available to pydoc by the build process. This would allow help(anyobject) to display a deep link to the official documentation or a warning that anyobject is undocumented.

From tjreedy at udel.edu Wed Nov 17 20:52:58 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 17 Nov 2010 14:52:58 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <4CDAA27B.8040703@voidspace.org.uk> <4CDBDB0C.6080703@voidspace.org.uk> <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <87lj4t9cqq.fsf@benfinney.id.au>
Message-ID:

On 11/17/2010 10:52 AM, Guido van Rossum wrote:
> That's not what I meant. In the case of style guides I think it is
> totally appropriate to update the PEP as new rules are developed or
> existing ones are clarified (or even changed).

Revising style guides is standard practice. The Chicago Manual of Style, which is practically the 'Bible' of American publishing, is now in its 16th edition after 104 years.
http://www.amazon.com/Chicago-Manual-Style-16th/dp/0226104206/ref=sr_1_2?s=books&ie=UTF8&qid=1290022712&sr=1-2

Idea: include the 'current' version of PEP8 in the doc set for each Python version as the frozen Python Stdlib Style Guide for that version. Then people could specifically refer to the 3.2 version of the style guide. PEP8 would then be the trunk version subject to further revision. Include with the frozen version the repository id info needed to do a diff between it and future revisions so people can discover what has changed since whenever.

--
Terry Jan Reedy

From merwok at netwok.org Wed Nov 17 22:08:32 2010
From: merwok at netwok.org (Éric Araujo)
Date: Wed, 17 Nov 2010 22:08:32 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk>
Message-ID: <4CE44450.3020303@netwok.org>

> We may also revisit the rules used by help() to decide what to include
> on the auto-generated module implementation. Note that currently
> help() output excludes names not in __all__ if the module has __all__
> defined. While I advocated this rule earlier in this thread, I now
> Consider the recent addition of open() to the tokenize module. It
> was documented in the manual, but (wisely) excluded from tokenize.__all__.

I'm not sure this was on purpose. Victor?
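
A minimal sketch of what is at stake in the tokenize.open question quoted above, assuming two throwaway modules built on the fly (demo_hidden and demo_exported are made-up names for illustration, not stdlib modules, and the helper functions are hypothetical). It only shows how "from module import *" consults __all__; it says nothing about what help() or any documentation tool does.

```python
import builtins
import sys
import types


def make_module(name, source):
    """Build a throwaway module from source text and register it for import."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


# Hypothetical stand-in for the tokenize case: open() exists in the module
# but is left out of __all__.
make_module("demo_hidden",
            "__all__ = ['scan']\n"
            "def scan(): return 'scan'\n"
            "def open(path): return 'demo open'\n")

# Same module body, but with open() listed in __all__ as well.
make_module("demo_exported",
            "__all__ = ['scan', 'open']\n"
            "def scan(): return 'scan'\n"
            "def open(path): return 'demo open'\n")


def star_import(name):
    """Run 'from <name> import *' in a fresh namespace; return what it bound."""
    namespace = {}
    exec("from %s import *" % name, namespace)
    namespace.pop("__builtins__", None)  # added by exec(), not by the import
    return namespace


hidden = star_import("demo_hidden")
exported = star_import("demo_exported")

print(sorted(hidden))    # open is not bound, so the builtin stays visible
print(sorted(exported))  # open is bound and hides builtins.open in that namespace
print(exported["open"] is builtins.open)
```

Under these assumptions, leaving open out of __all__ keeps a star-import from rebinding the builtin, while a tool that reads only __all__ would not see the module's open at all. That is the trade-off the replies below argue about.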
From ncoghlan at gmail.com Wed Nov 17 22:10:01 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 18 Nov 2010 07:10:01 +1000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <4CE44450.3020303@netwok.org>
References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44450.3020303@netwok.org>
Message-ID:

On Thu, Nov 18, 2010 at 7:08 AM, Éric Araujo wrote:
>> We may also revisit the rules used by help() to decide what to include
>> on the auto-generated module implementation. Note that currently
>> help() output excludes names not in __all__ if the module has __all__
>> defined. While I advocated this rule earlier in this thread, I now
>> Consider the recent addition of open() to the tokenize module. It
>> was documented in the manual, but (wisely) excluded from tokenize.__all__.
>
> I'm not sure this was on purpose. Victor?

Excluding a builtin name from __all__ sounds like a perfectly sensible idea, so even if it wasn't deliberate, I'd say it qualifies as fortuitous :)

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From jaraco at jaraco.com Wed Nov 17 21:58:10 2010
From: jaraco at jaraco.com (Jason R. Coombs)
Date: Wed, 17 Nov 2010 12:58:10 -0800
Subject: [Python-Dev] new LRU cache API in Py3.2
In-Reply-To:
References:
Message-ID: <12C7AB425F0DD546B6049311F827C74E0986D4151B@VA3DIAXVS141.RED001.local>

I see now that my previous reply went only to Stefan, so I'm re-submitting, this time to the list.

> -----Original Message-----
> From: Stefan Behnel
> Sent: Saturday, 04 September, 2010 04:29
>
> What about adding an intermediate namespace called "cache", so that
> the new operations are available like this:
>
> print get_phone_number.cache.hits
> get_phone_number.cache.clear()

I agree.
While the function-based implementation is highly efficient, the pure use of functions has the counter-Pythonic effect of obfuscating the internal state (the same way the 'private' keyword does in Java). A class-based implementation would be capable of having its state introspected and could easily be extended. While the functional implementation is a powerful construct, it fails to generalize well. IMHO, a stdlib implementation should err on the side of transparency and extensibility over performance.

That said, I've adapted Hettinger's Python 2.5 implementation to a class-based implementation. I've tried to keep the performance optimizations in place, but instead of instrumenting the wrapped method with lots of cache_* functions, I simply attach the cache object itself, which then provides the interface suggested by Stefan. This technique allows access to the cache object and all of its internal state, so it's also possible to do things like:

    get_phone_number.cache.maxsize += 100

or

    if get_phone_number.cache.store:
        do_something_interesting()

These techniques are nearly impossible in the functional implementation, as the state is buried in the locals() of the nested functions.

I'm most grateful to Raymond for contributing this to Python. On many occasions, I've used the ActiveState recipes for simple caches, but in almost every case, I've had to adapt the implementation to provide more transparency. I'd prefer to not have to do the same with the stdlib.

Regards,
Jason R. Coombs

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cache.py
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 6448 bytes
Desc: not available
URL:

From merwok at netwok.org Wed Nov 17 22:16:10 2010
From: merwok at netwok.org (Éric Araujo)
Date: Wed, 17 Nov 2010 22:16:10 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44450.3020303@netwok.org>
Message-ID: <4CE4461A.1020007@netwok.org>

> Excluding a builtin name from __all__ sounds like a perfectly sensible
> idea, so even if it wasn't deliberate, I'd say it qualifies as
> fortuitous :)

But then, a tool that looks into __all__ to find for example what objects to document will miss open. I'd put open in __all__.

Regards

From g.brandl at gmx.net Wed Nov 17 22:22:50 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 17 Nov 2010 22:22:50 +0100
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <4CE4461A.1020007@netwok.org>
References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44450.3020303@netwok.org> <4CE4461A.1020007@netwok.org>
Message-ID:

On 17.11.2010 22:16, Éric Araujo wrote:
>> Excluding a builtin name from __all__ sounds like a perfectly sensible
>> idea, so even if it wasn't deliberate, I'd say it qualifies as
>> fortuitous :)
>
> But then, a tool that looks into __all__ to find for example what
> objects to document will miss open. I'd put open in __all__.

So it comes down again to what we'd like __all__ to mean foremost: public API, or just a list for "import *"?
Georg

From fdrake at acm.org Wed Nov 17 22:39:25 2010
From: fdrake at acm.org (Fred Drake)
Date: Wed, 17 Nov 2010 16:39:25 -0500
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44450.3020303@netwok.org> <4CE4461A.1020007@netwok.org>
Message-ID:

On Wed, Nov 17, 2010 at 4:22 PM, Georg Brandl wrote:
> So it comes down again to what we'd like __all__ to mean foremost:
> public API, or just a list for "import *"?

It is and has been since its inception *the* list for "import *".

Any additional meaning will have to accommodate that usage as well.

  -Fred

--
Fred L. Drake, Jr.
"A storm broke loose in my mind."  --Albert Einstein

From solipsis at pitrou.net Wed Nov 17 22:48:01 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 17 Nov 2010 22:48:01 +0100
Subject: [Python-Dev] PEP 3151 dictator
Message-ID: <20101117224801.44d97bad@pitrou.net>

Hello,

I would like to announce that, following Guido's (private) suggestion that I find a temporary dictator for PEP 3151, Barry has agreed to fill this role.

Regards

Antoine.
From g.brandl at gmx.net Wed Nov 17 22:50:10 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 17 Nov 2010 22:50:10 +0100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44450.3020303@netwok.org> <4CE4461A.1020007@netwok.org> Message-ID: Am 17.11.2010 22:39, schrieb Fred Drake: > On Wed, Nov 17, 2010 at 4:22 PM, Georg Brandl wrote: >> So it comes down again to what we'd like __all__ to mean foremost: >> public API, or just a list for "import *"? > > It is and has been since its inception *the* list for "import *". > > Any additional meaning will have to accommodate that usage as well. Seeing that "import *" is discouraged anywhere I look, it might just not be as important anymore. BTW, "open" is listed in __all__ for lots of modules: io, gzip, dbm... and even "ancient" ones like aifc. cheers, Georg From steve at pearwood.info Wed Nov 17 22:57:00 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 18 Nov 2010 08:57:00 +1100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> Message-ID: <4CE44FAC.5010408@pearwood.info> Nick Coghlan wrote: > The policy we're aiming to clarify here is what we should do when we > come across standard library APIs that land in the grey area, with > there being two appropriate ways to deal with them: > 1. Document them and make them officially public > 2. 
Deprecate the public names and make them officially private (with > the public names later removed in accordance with normal deprecation > procedures) You missed at least two other options: 3. Treat "documented" and "public" as orthogonal, not synonymous: undocumented public API is not an oxymoron, and neither is documented private API. 4. Do nothing. Inertia wins. Is this problem we're trying to solve so serious that we need to solve it now except on a case-by-case basis? The approach that gives us the most flexibility is #3. Clearly one would not need to document private APIs for the use of the general public, but adding docstrings to private functions and classes for in-house use is a sensible thing to do. This applies equally to the standard library as to any other major project. Likewise, one might introduce a public function into some module, but for whatever reason, choose not to document it. (Perhaps it's a lack of hours in the day, perhaps it is a deliberate decision.) In this case, the mere lack of documentation shouldn't relieve us of the responsibility of treating the function as public. For emphasis: I strongly believe that public/private and documented/undocumented are orthogonal qualities, and should not be treated as, or forced to be, identical. The use of imported modules is possibly an exception. If a user is writing something like (say) getopt.os.getcwd() instead of importing os directly, then they're on shaky ground. We shouldn't expect module authors to write "import os as _os" just to avoid making os a part of their public API. I'd be prepared to make an exception to the rule "no leading underscore means public": imported modules are implementation details unless explicitly documented otherwise. E.g. the os module explicitly makes path part of its public API, but os.sys is an implementation detail. 
-- Steven From ben+python at benfinney.id.au Thu Nov 18 02:08:08 2010 From: ben+python at benfinney.id.au (Ben Finney) Date: Thu, 18 Nov 2010 12:08:08 +1100 Subject: [Python-Dev] Breaking undocumented API References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44FAC.5010408@pearwood.info> Message-ID: <8762vva16v.fsf@benfinney.id.au> Steven D'Aprano writes: > 3. Treat "documented" and "public" as orthogonal, not synonymous: > undocumented public API is not an oxymoron, and neither is documented > private API. +1 > The use of imported modules is possibly an exception. If a user is > writing something like (say) getopt.os.getcwd() instead of importing > os directly, then they're on shaky ground. We shouldn't expect module > authors to write "import os as _os" just to avoid making os a part of > their public API. > > I'd be prepared to make an exception to the rule "no leading > underscore means public": imported modules are implementation details > unless explicitly documented otherwise. E.g. the os module explicitly > makes path part of its public API, but os.sys is an implementation > detail. After reading the discussion for many days, I'm leaning to this position also. -- \ ?I may disagree with what you say, but I will defend to the | `\ death your right to mis-attribute this quote to Voltaire.? 
| _o__) ?Avram Grumer, rec.arts.sf.written, 2000-05-30 | Ben Finney From guido at python.org Thu Nov 18 03:44:35 2010 From: guido at python.org (Guido van Rossum) Date: Wed, 17 Nov 2010 18:44:35 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <8762vva16v.fsf@benfinney.id.au> References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44FAC.5010408@pearwood.info> <8762vva16v.fsf@benfinney.id.au> Message-ID: On Wed, Nov 17, 2010 at 5:08 PM, Ben Finney wrote: > Steven D'Aprano writes: > >> 3. Treat "documented" and "public" as orthogonal, not synonymous: >> undocumented public API is not an oxymoron, and neither is documented >> private API. > > +1 > >> The use of imported modules is possibly an exception. If a user is >> writing something like (say) getopt.os.getcwd() instead of importing >> os directly, then they're on shaky ground. We shouldn't expect module >> authors to write "import os as _os" just to avoid making os a part of >> their public API. >> >> I'd be prepared to make an exception to the rule "no leading >> underscore means public": imported modules are implementation details >> unless explicitly documented otherwise. E.g. the os module explicitly >> makes path part of its public API, but os.sys is an implementation >> detail. > > After reading the discussion for many days, I'm leaning to this position > also. Agreed on both counts. 
--
--Guido van Rossum (python.org/~guido)

From fuzzyman at voidspace.org.uk Thu Nov 18 11:47:18 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Thu, 18 Nov 2010 10:47:18 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To: <4CE4461A.1020007@netwok.org>
References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44450.3020303@netwok.org> <4CE4461A.1020007@netwok.org>
Message-ID: <4CE50436.4010706@voidspace.org.uk>

On 17/11/2010 21:16, Éric Araujo wrote:
>> Excluding a builtin name from __all__ sounds like a perfectly sensible
>> idea, so even if it wasn't deliberate, I'd say it qualifies as
>> fortuitous :)
> But then, a tool that looks into __all__ to find for example what
> objects to document will miss open. I'd put open in __all__.
>

"import *" would then override the builtin open. A good reason not to use "import *" I guess, but also a good reason not to create names that shadow builtins.

All the best,

Michael

> Regards

--
http://www.voidspace.org.uk/

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges.
You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

From fuzzyman at voidspace.org.uk Thu Nov 18 11:54:23 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Thu, 18 Nov 2010 10:54:23 +0000
Subject: [Python-Dev] Breaking undocumented API
In-Reply-To:
References: <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44450.3020303@netwok.org> <4CE4461A.1020007@netwok.org>
Message-ID: <4CE505DF.8030401@voidspace.org.uk>

On 17/11/2010 21:22, Georg Brandl wrote:
> On 17.11.2010 22:16, Éric Araujo wrote:
>>> Excluding a builtin name from __all__ sounds like a perfectly sensible
>>> idea, so even if it wasn't deliberate, I'd say it qualifies as
>>> fortuitous :)
>> But then, a tool that looks into __all__ to find for example what
>> objects to document will miss open. I'd put open in __all__.
> So it comes down again to what we'd like __all__ to mean foremost:
> public API, or just a list for "import *"?

Well, as noted earlier in this discussion - the language reference *states* that __all__ defines the module level public API.

From: http://docs.python.org/reference/simple_stmts.html#grammar-token-import_stmt

"If the list of identifiers is replaced by a star ('*'), all public names defined in the module are bound in the local namespace of the import statement."

...

"The public names defined by a module are determined by checking the module's namespace for a variable named __all__"

If we decide that __all__ is purely for "import *" we should refine the use of the word public on this page.
All the best, Michael Foord > Georg From fuzzyman at voidspace.org.uk Thu Nov 18 12:41:07 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 18 Nov 2010 11:41:07 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE44FAC.5010408@pearwood.info> References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44FAC.5010408@pearwood.info> Message-ID: <4CE510D3.5090501@voidspace.org.uk> On 17/11/2010 21:57, Steven D'Aprano wrote: > Nick Coghlan wrote: > >> The policy we're aiming to clarify here is what we should do when we >> come across standard library APIs that land in the grey area, with >> there being two appropriate ways to deal with them: >> 1. Document them and make them officially public >> 2.
Deprecate the public names and make them officially private (with >> the public names later removed in accordance with normal deprecation >> procedures) > > You missed at least two other options: > > 3. Treat "documented" and "public" as orthogonal, not synonymous: > undocumented public API is not an oxymoron, and neither is documented > private API. > Along with the others +1 I think how we handle the deprecations (legacy modules with unclear or clearly wrong naming policies) is the least interesting part of this discussion. For deprecating existing names we have *no choice* but to proceed on a case-by-case basis evaluating how likely the deprecation is to break other code, whether or not the name was originally intended to be public or not. (At least that is how we *should* proceed and part of our standard deprecation policy - it is why we aren't removing unittest.TestCase.assertEquals and assert_ even though they are deprecated. They are just too widely used.) What is more important is that we have a clearly stated policy for new modules and adding names to existing modules so that we don't have to repeat this debate in five years time. My suggestion, which fits in with the use of __all__ by the language and also the convention widely in use by the community already boils down to: * If __all__ exists it is definitive * Imported names are never part of the public API of a module unless in __all__ or documented to be part of the API * Names with leading underscores are private unless in __all__ (and if you want to export leading underscore names as part of a public API you should define __all__ or "import *" won't export them) * Leading underscore convention extends to packages and class members; no members of a package or class whose name begins with a leading underscore are public It is still good practise that public APIs *should* be documented (and *should* have docstrings). 
There is however no corollary that private APIs should not be documented (and they may have docstrings). All the best, Michael Foord > 4. Do nothing. Inertia wins. Is this problem we're trying to solve so > serious that we need to solve it now except on a case-by-case basis? > > The approach that gives us the most flexibility is #3. Clearly one > would not need to document private APIs for the use of the general > public, but adding docstrings to private functions and classes for > in-house use is a sensible thing to do. This applies equally to the > standard library as to any other major project. > > Likewise, one might introduce a public function into some module, but > for whatever reason, choose not to document it. (Perhaps it's a lack > of hours in the day, perhaps it is a deliberate decision.) In this > case, the mere lack of documentation shouldn't relieve us of the > responsibility of treating the function as public. > > For emphasis: I strongly believe that public/private and > documented/undocumented are orthogonal qualities, and should not be > treated as, or forced to be, identical. > > The use of imported modules is possibly an exception. If a user is > writing something like (say) getopt.os.getcwd() instead of importing > os directly, then they're on shaky ground. We shouldn't expect module > authors to write "import os as _os" just to avoid making os a part of > their public API. > > I'd be prepared to make an exception to the rule "no leading > underscore means public": imported modules are implementation details > unless explicitly documented otherwise. E.g. the os module explicitly > makes path part of its public API, but os.sys is an implementation > detail.
From ncoghlan at gmail.com Thu Nov 18 13:16:35 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 18 Nov 2010 22:16:35 +1000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44450.3020303@netwok.org> <4CE4461A.1020007@netwok.org> Message-ID: On Thu, Nov 18, 2010 at 7:22 AM, Georg Brandl wrote: > Am 17.11.2010 22:16, schrieb Éric Araujo: >>> Excluding a builtin name from __all__ sounds like a perfectly sensible >>> idea, so even if it wasn't deliberate, I'd say it qualifies as >>> fortuitous :) >> >> But then, a tool that looks into __all__ to find for example what >> objects to document will miss open. I'd put open in __all__. > > So it comes down again to what we'd like __all__ to mean foremost: > public API, or just a list for "import *"? It's the list for star imports.
This intended use case is borne out by the description of the feature when it was first added to the language back in 2.1: http://docs.python.org/dev/whatsnew/2.1.html?highlight=__all__#other-changes-and-fixes The public API (for documentation and introspection purposes) is any name that doesn't start with an underscore and isn't an imported module. If a tool is attempting to use __all__ as more than just the list of names for star imports, I would call the tool buggy. The use of the term "public names" in the language reference when describing the semantics of __all__ is an unfortunate choice, but it is used specifically in the context of talking about star imports and clarifying which names they bring in without making any reference to standards for documentation or deprecation policies. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From g.brandl at gmx.net Thu Nov 18 13:37:38 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 18 Nov 2010 13:37:38 +0100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE50436.4010706@voidspace.org.uk> References: <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44450.3020303@netwok.org> <4CE4461A.1020007@netwok.org> <4CE50436.4010706@voidspace.org.uk> Message-ID: Am 18.11.2010 11:47, schrieb Michael Foord: > On 17/11/2010 21:16, Éric Araujo wrote: >>> Excluding a builtin name from __all__ sounds like a perfectly sensible >>> idea, so even if it wasn't deliberate, I'd say it qualifies as >>> fortuitous :) >> But then, a tool that looks into __all__ to find for example what >> objects to document will miss open. I'd put open in __all__. >> > > "import *" would then override the builtin open.
A good reason not to > use "import *" I guess, but also a good reason not to create names that > shadow builtins. Heh. Instead have fun with io.ioopen(), gzip.gzipopen(), webbrowser.webbrowseropen(), etc.? We do have namespace support for a reason. Georg From fuzzyman at voidspace.org.uk Thu Nov 18 13:48:57 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 18 Nov 2010 12:48:57 +0000 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44450.3020303@netwok.org> <4CE4461A.1020007@netwok.org> <4CE50436.4010706@voidspace.org.uk> Message-ID: <4CE520B9.3020500@voidspace.org.uk> On 18/11/2010 12:37, Georg Brandl wrote: > Am 18.11.2010 11:47, schrieb Michael Foord: >> On 17/11/2010 21:16, Éric Araujo wrote: >>>> Excluding a builtin name from __all__ sounds like a perfectly sensible >>>> idea, so even if it wasn't deliberate, I'd say it qualifies as >>>> fortuitous :) >>> But then, a tool that looks into __all__ to find for example what >>> objects to document will miss open. I'd put open in __all__. >>> >> "import *" would then override the builtin open. A good reason not to >> use "import *" I guess, but also a good reason not to create names that >> shadow builtins. > Heh. Instead have fun with io.ioopen(), gzip.gzipopen(), > webbrowser.webbrowseropen(), etc.? We do have namespace support for a reason. Or urllib2.urlopen, oh wait - that's real... If I was importing from those namespaces I probably *would* import and rename to have unambiguous names (and you would *have* to if there was any possibility of you using the builtin open). io.open is arguably an exception to this as it does the same as the builtin open... Using meaningful names is *good*.
This is a reason I dislike modules that just call their base exception class "Error". You *have* to use it from the namespace (or import with import as and give it a good name) for it to have any meaning. Michael > Georg
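The shadowing concern raised in this exchange can be sketched as follows. This is an illustrative example under stated assumptions, not code from the thread: demo_archive is a hypothetical in-memory module that, like gzip or io, exposes its own open and lists it in __all__.

```python
import builtins
import sys
import types

# Hypothetical module that exports a function named ``open``.
demo = types.ModuleType("demo_archive")
exec(
    "__all__ = ['open']\n"
    "def open(path):\n"
    "    return 'demo_archive.open(%r)' % (path,)\n",
    demo.__dict__,
)
sys.modules["demo_archive"] = demo

# Star import: the name ``open`` in this namespace now hides the builtin.
star_ns = {}
exec("from demo_archive import *", star_ns)
shadowed = star_ns["open"] is not builtins.open

# Renamed import: nothing called ``open`` is bound, so the builtin
# remains reachable under its usual name.
renamed_ns = {}
exec("from demo_archive import open as archive_open", renamed_ns)
builtin_untouched = "open" not in renamed_ns
result = renamed_ns["archive_open"]("data.bin")
print(shadowed, builtin_untouched, result)
```

Using the module through its namespace (demo_archive.open) avoids the clash in the same way as the renamed import, which is the point Georg makes about namespace support.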
From lukasz at langa.pl Thu Nov 18 14:13:39 2010 From: lukasz at langa.pl (=?UTF-8?B?xYF1a2FzeiBMYW5nYQ==?=) Date: Thu, 18 Nov 2010 14:13:39 +0100 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE520B9.3020500@voidspace.org.uk> References: <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44450.3020303@netwok.org> <4CE4461A.1020007@netwok.org> <4CE50436.4010706@voidspace.org.uk> <4CE520B9.3020500@voidspace.org.uk> Message-ID: <4CE52683.2040809@langa.pl> Am 18.11.2010 13:48, schrieb Michael Foord: > On 18/11/2010 12:37, Georg Brandl wrote: >> Am 18.11.2010 11:47, schrieb Michael Foord: >>> On 17/11/2010 21:16, Éric Araujo wrote: >>>>> Excluding a builtin name from __all__ sounds like a perfectly >>>>> sensible >>>>> idea, so even if it wasn't deliberate, I'd say it qualifies as >>>>> fortuitous :) >>>> But then, a tool that looks into __all__ to find for example what >>>> objects to document will miss open. I'd put open in __all__. >>>> >>> "import *" would then override the builtin open. A good reason not to >>> use "import *" I guess, but also a good reason not to create names that >>> shadow builtins. >> Heh. Instead have fun with io.ioopen(), gzip.gzipopen(), >> webbrowser.webbrowseropen(), etc.? We do have namespace support for >> a reason. > > Or urllib2.urlopen, oh wait - that's real... > > If I was importing from those namespaces I probably *would* import and > rename to have unambiguous names (and you would *have* to if there was > any possibility of you using the builtin open). io.open is arguably an > exception to this as it does the same as the builtin open... > > Using meaningful names is *good*. This is a reason I dislike modules > that just call their base exception class "Error". You *have* to use > it from the namespace (or import with import as and give it a good > name) for it to have any meaning.
> Guys, I may agree or disagree with these statements but we are drifting towards "opinion" versus "solid, well understood practice". Let's focus on the subject. For the matter, "import *" is a discouraged mechanism anyway, let alone the rare exceptions where its usage is valid. If you use star-imports and you don't know what you're doing, you might just as well hurt yourself in other ways than just by "open". Maybe we should just sum up the discussion somewhere already. Keeping up with a thread reaching a megabyte in size is starting to be painful. Best regards, Łukasz From fdrake at acm.org Thu Nov 18 14:47:05 2010 From: fdrake at acm.org (Fred Drake) Date: Thu, 18 Nov 2010 08:47:05 -0500 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: <4CE510D3.5090501@voidspace.org.uk> References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44FAC.5010408@pearwood.info> <4CE510D3.5090501@voidspace.org.uk> Message-ID: On Thu, Nov 18, 2010 at 6:41 AM, Michael Foord wrote: > Along with the others +1 I agree with keeping these distinct and orthogonal as well. > What is more important is that we have a clearly stated policy for new > modules and adding names to existing modules so that we don't have to repeat > this debate in five years time. Agreed again. > My suggestion, which fits in with the use of __all__ by the language and > also the convention widely in use by the community already boils down to: > > * If __all__ exists it is definitive I think this is overly vague. :-) Specifically, if something is mentioned in __all__, it's public. Non-inclusion in __all__ doesn't imply privateness.
> * Names with leading underscores are private unless in __all__ (and if you > want to export leading underscore names as part of a public API you should > define __all__ or "import *" won't export them) We shouldn't confuse non-export via "import *" with non-public, however. -Fred -- Fred L. Drake, Jr. "A storm broke loose in my mind." --Albert Einstein From solipsis at pitrou.net Thu Nov 18 16:18:57 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 18 Nov 2010 16:18:57 +0100 Subject: [Python-Dev] r86514 - in python/branches/py3k/Lib: test/test_xmlrpc.py xmlrpc/client.py References: <20101118150053.F2247EEA6E@mail.python.org> Message-ID: <20101118161857.42a6750e@pitrou.net> On Thu, 18 Nov 2010 16:00:53 +0100 (CET) senthil.kumaran wrote: > Author: senthil.kumaran > Date: Thu Nov 18 16:00:53 2010 > New Revision: 86514 > > Log: > Fix Issue 9991: xmlrpc client ssl check faulty > [...] > > + def test_ssl_presence(self): > + #Check for ssl support > + have_ssl = False > + if hasattr(socket, 'ssl'): > + have_ssl = True This is not the right way to check for ssl. socket.ssl is deprecated in 2.x and doesn't exist in 3.x. "import ssl" is enough. > + try: > + xmlrpc.client.ServerProxy('https://localhost:9999').bad_function() > + except: > + exc = sys.exc_info() > + if exc[0] == socket.error: This is a rather clumsy way to check for exception types. Why don't you just write "except socket.error"? > - if not hasattr(socket, "ssl"): > + if not hasattr(http.client, "ssl"): That isn't better. "http.client.ssl" is not a public API. You should check for http.client.HTTPSConnection instead. cheers Antoine.
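The three review comments above can be put together in a short sketch. It is an illustration only, not the code that was eventually committed: the helper fails_with_socket_error and the stand-in refuse_connection are hypothetical names, and no network connection is attempted.

```python
import http.client
import socket

# Availability check via a plain import, instead of hasattr(socket, "ssl").
try:
    import ssl
except ImportError:
    ssl = None

have_ssl = ssl is not None

# Public-API check: http.client only defines HTTPSConnection when the
# ssl module could be imported.
have_https = hasattr(http.client, "HTTPSConnection")


def fails_with_socket_error(func):
    """Return True if calling func() raises socket.error, else False."""
    try:
        func()
    except socket.error:  # catch the exception type directly
        return True
    return False


def refuse_connection():
    """Hypothetical stand-in for a call that hits a closed port."""
    raise socket.error("connection refused")


print(have_ssl, have_https, fails_with_socket_error(refuse_connection))
```

Catching socket.error by name replaces the bare except plus sys.exc_info() comparison from the patch; any other exception type propagates instead of being silently swallowed.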
From orsenthil at gmail.com Thu Nov 18 17:23:25 2010 From: orsenthil at gmail.com (Senthil Kumaran) Date: Fri, 19 Nov 2010 00:23:25 +0800 Subject: [Python-Dev] r86514 - in python/branches/py3k/Lib: test/test_xmlrpc.py xmlrpc/client.py In-Reply-To: <20101118161857.42a6750e@pitrou.net> References: <20101118150053.F2247EEA6E@mail.python.org> <20101118161857.42a6750e@pitrou.net> Message-ID: On Thu, Nov 18, 2010 at 11:18 PM, Antoine Pitrou wrote: >> Log: >> Fix Issue 9991: xmlrpc client ssl check faulty >> > [...] >> >> +    def test_ssl_presence(self): >> +        #Check for ssl support >> +        have_ssl = False >> +        if hasattr(socket, 'ssl'): >> +            have_ssl = True > > This is not the right way to check for ssl. socket.ssl is deprecated in > 2.x and doesn't exist in 3.x. "import ssl" is enough. The history of the bug report showed that it was closed earlier with comments such as "Python should be compiled with SSL" which had resulted in some confusion, so after some thought, I let those earlier verifications remain (Just for readability/understanding the context of the tests). Thinking again, I see that it is not required. Agree to your comments on code changes. Shall change it. -- Senthil From martin at v.loewis.de Thu Nov 18 17:25:41 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 18 Nov 2010 17:25:41 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: References: <4CE2CF8F.4040500@jcea.es> Message-ID: <4CE55385.6080002@v.loewis.de> Am 17.11.2010 08:18, schrieb Georg Brandl: > Am 16.11.2010 19:38, schrieb Jesus Cea: >> Is there any updated mercurial schedule? >> >> Any impact related with the new 3.2 schedule (three weeks offset)? > > I've been trying to contact Dirkjan and ask; generally, I don't > see much connection to the 3.2 schedule (with the exception that > the final migration day should not be a release day.) Please reconsider.
When Python migrates to Mercurial, new features will be added to Python, most notably a new way of identifying versions, perhaps new variables in the sys module. So far, the policy has been that no new features can be added after beta 1. So consequentially, migrating 3.2 to Mercurial would violate that policy if done after b1. Consequentially, we would need to release 3.2 from Subversion, which in turn means that the Mercurial migration should be delayed until after 3.2 is released. Alternatively, b1 should be postponed until after the Mercurial migration is done. Regards, Martin From guido at python.org Thu Nov 18 17:50:22 2010 From: guido at python.org (Guido van Rossum) Date: Thu, 18 Nov 2010 08:50:22 -0800 Subject: [Python-Dev] Breaking undocumented API In-Reply-To: References: <20101111100516.6e90aa41@mission> <4CDC08F3.6010501@langa.pl> <4CDC0950.5040309@voidspace.org.uk> <20101116163454.2040.394815387.divmod.xquotient.928@localhost.localdomain> <4CE3C31D.50701@voidspace.org.uk> <4CE3CC87.1000105@langa.pl> <4CE3D497.50102@voidspace.org.uk> <4CE44450.3020303@netwok.org> <4CE4461A.1020007@netwok.org> Message-ID: On Thu, Nov 18, 2010 at 4:16 AM, Nick Coghlan wrote: > On Thu, Nov 18, 2010 at 7:22 AM, Georg Brandl wrote: >> So it comes down again to what we'd like __all__ to mean foremost: >> public API, or just a list for "import *"? > > It's the list for star imports. This intended use case is borne out by > the description of the feature when it was first added to the language > back in 2.1: > http://docs.python.org/dev/whatsnew/2.1.html?highlight=__all__#other-changes-and-fixes > > The public API (for documentation and introspection purposes) is any > name that doesn't start with an underscore and isn't an imported > module. If a tool is attempting to use __all__ as more than just the > list of names for star imports, I would call the tool buggy. Not so fast. The feature's meaning has clearly evolved. 
> The use of the term "public names" in the language reference when > describing the semantics of __all__ is an unfortunate choice, but it > is used specifically in the context of talking about star imports and > clarifying which names they bring in without making any reference to > standards for documentation or deprecation policies. Let's live with a little ambiguity. There are more shades of gray here than you can imagine. I like gray. -- --Guido van Rossum (python.org/~guido) From guido at python.org Thu Nov 18 17:57:58 2010 From: guido at python.org (Guido van Rossum) Date: Thu, 18 Nov 2010 08:57:58 -0800 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <4CE55385.6080002@v.loewis.de> References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> Message-ID: On Thu, Nov 18, 2010 at 8:25 AM, "Martin v. Löwis" wrote: > Am 17.11.2010 08:18, schrieb Georg Brandl: >> Am 16.11.2010 19:38, schrieb Jesus Cea: >>> Is there any updated mercurial schedule? >>> >>> Any impact related with the new 3.2 schedule (three weeks offset)? >> >> I've been trying to contact Dirkjan and ask; generally, I don't >> see much connection to the 3.2 schedule (with the exception that >> the final migration day should not be a release day.) > > Please reconsider. When Python migrates to Mercurial, new features > will be added to Python, most notably a new way of identifying versions, > perhaps new variables in the sys module. So far, the policy has been > that no new features can be added after beta 1. So consequentially, > migrating 3.2 to Mercurial would violate that policy if done after b1. > Consequentially, we would need to release 3.2 from Subversion, which > in turn means that the Mercurial migration should be delayed until > after 3.2 is released. > > Alternatively, b1 should be postponed until after the Mercurial > migration is done. I think this "new feature" is not so shocking that it can be used as an argument to hold up the migration.
If you have another reason to stop the migration please say so; personally I can't wait for it to happen. -- --Guido van Rossum (python.org/~guido) From g.brandl at gmx.net Thu Nov 18 18:08:10 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 18 Nov 2010 18:08:10 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <4CE55385.6080002@v.loewis.de> References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> Message-ID: Am 18.11.2010 17:25, schrieb "Martin v. Löwis": > Am 17.11.2010 08:18, schrieb Georg Brandl: >> Am 16.11.2010 19:38, schrieb Jesus Cea: >>> Is there any updated mercurial schedule? >>> >>> Any impact related with the new 3.2 schedule (three weeks offset)? >> >> I've been trying to contact Dirkjan and ask; generally, I don't >> see much connection to the 3.2 schedule (with the exception that >> the final migration day should not be a release day.) > > Please reconsider. When Python migrates to Mercurial, new features > will be added to Python, most notably a new way of identifying versions, > perhaps new variables in the sys module. So far, the policy has been > that no new features can be added after beta 1. So consequentially, > migrating 3.2 to Mercurial would violate that policy if done after b1. > Consequentially, we would need to release 3.2 from Subversion, which > in turn means that the Mercurial migration should be delayed until > after 3.2 is released. I'm with Guido here. Plus, if you like it can be seen as a bug fix: the SVN build identification stops working, and we need to fix it. Georg From martin at v.loewis.de Thu Nov 18 18:32:33 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 18 Nov 2010 18:32:33 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> Message-ID: <4CE56331.3050508@v.loewis.de> >> Alternatively, b1 should be postponed until after the Mercurial >> migration is done.
> > I think this "new feature" is not so shocking that it can be used as > an argument to hold up the migration. If you have another reason to > stop the migration please say so; personally I can't wait for it to > happen. I can't point out any other specific concern, just a general feeling that *when* the migration happens, it will be rushed, and we will have to deal for a long time with the aftermath. For example, I expect that it will take me several days until I get the Windows build process to work correctly, and, if the migration gets as rushed as it appears to, that the migration will happen without everything being worked out beforehand. Therefore, I'm concerned that I will have to work out all the details on my own, just so that I can produce the b2 binaries (says); this is not something I look forward to. I'm not asking that the migration be stopped - I'm asking that it be accelerated, so that there is plenty of time to identify all the problems. But I'm also not willing to put time into it. Failing the acceleration, I ask that appropriate consequences for the 3.2 release are drawn: either it is postponed, or done using Subversion until the final release (I think something can be worked out then to get the 3.2.1 release from Mercurial - with only slight incompatibilities). In general, I'm *also* concerned about the lack of volunteers that are interested in working on the infrastructure. I wish some of the people who stated that they can't wait for the migration to happen would work on solving some of the remaining problems. Regards, Martin From g.brandl at gmx.net Thu Nov 18 19:56:51 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 18 Nov 2010 19:56:51 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <4CE56331.3050508@v.loewis.de> References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> Message-ID: Am 18.11.2010 18:32, schrieb "Martin v. 
Löwis": >>> Alternatively, b1 should be postponed until after the Mercurial >>> migration is done. >> >> I think this "new feature" is not so shocking that it can be used as >> an argument to hold up the migration. If you have another reason to >> stop the migration please say so; personally I can't wait for it to >> happen. > > I can't point out any other specific concern, just a general feeling > that *when* the migration happens, it will be rushed, and we will have > to deal for a long time with the aftermath. For example, I expect that > it will take me several days until I get the Windows build process to > work correctly, and, if the migration gets as rushed as it appears to, > that the migration will happen without everything being worked out > beforehand. > > Therefore, I'm concerned that I will have to work out all the details > on my own, just so that I can produce the b2 binaries (says); this is > not something I look forward to. How much does the binary build process really depend on version control? I.e., what would be stopping you from making a binary from an archive made with e.g. "svn export"? (I'm really asking because I don't know.) Concerning the SVN external/ subdir, that is quite orthogonal to the main development repo, and doesn't need to be migrated in lockstep (if it is migrated to Mercurial at all in its current shape). > I'm not asking that the migration be stopped - I'm asking that it be > accelerated, so that there is plenty of time to identify all the > problems. But I'm also not willing to put time into it. I think we have anticipated what we could. Of course there will still be problems, but I think not of the sort that causes big disruptions everywhere, preventing our developers from committing or breaking the issue tracker, etc.
> Failing the acceleration, I ask that appropriate consequences for > the 3.2 release are drawn: either it is postponed, or done using > Subversion until the final release (I think something can be worked > out then to get the 3.2.1 release from Mercurial - with only slight > incompatibilities). > > In general, I'm *also* concerned about the lack of volunteers that > are interested in working on the infrastructure. I wish some of the > people who stated that they can't wait for the migration to happen > would work on solving some of the remaining problems. Well, put some butter to the fish: how many volunteers would you deem sufficient, and which specific tasks are uncared for in the infrastructure? I can only speak for myself, but I am prepared to put in my time. Georg From martin at v.loewis.de Thu Nov 18 20:33:40 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 18 Nov 2010 20:33:40 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> Message-ID: <4CE57F94.4080703@v.loewis.de> >> Therefore, I'm concerned that I will have to work out all the details >> on my own, just so that I can produce the b2 binaries (says); this is >> not something I look forward to. > > How much does the binary build process really depend on version control? > I.e., what would be stopping you from making a binary from an archive made > with e.g. "svn export"? (I'm really asking because I don't know.) The build process currently compiles a program (make_buildinfo), which in turn finds the subversion installation, and runs subwcrev if found. If no .svn folder is found, it falls back to the version information in the export. 
I would have to try out what exactly will happen when I try to build the current hg conversion result on Windows, but chances are that the resulting interpreter will crash because the string manipulation fails to find the right substrings. > Well, put some butter to the fish: how many volunteers would you deem > sufficient, and which specific tasks are uncared for in the infrastructure? > I can only speak for myself, but I am prepared to put in my time. As a starting point, I'd like to see a complete, current conversion result, using as many repositories as planned, and including as many branches into each repository as planned (rather than the giant cpython repository which we have now - unless the plan now is that there will be a single giant repository). Then the existing patches to the build identification should be applied, and the repositories should be opened for (test) commits. Then people could start identifying problems. As a parallel activity, I'd also ask that the PEP is finished, or at least put into a form where the authors consider it complete (again so that people could start identifying issues, and determine where the PEP differs from reality - currently most obviously in the branching approach). Regards, Martin From jcea at jcea.es Fri Nov 19 03:13:38 2010 From: jcea at jcea.es (Jesus Cea) Date: Fri, 19 Nov 2010 03:13:38 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <4CE56331.3050508@v.loewis.de> References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> Message-ID: <4CE5DD52.7050907@jcea.es> On 18/11/10 18:32, "Martin v. Löwis" wrote: > In general, I'm *also* concerned about the lack of volunteers that > are interested in working on the infrastructure. I wish some of the > people who stated that they can't wait for the migration to happen > would work on solving some of the remaining problems.
Do we have a exhaustive list of mercurial "to do" things?. I thought the plan was to keep a read only SVN mirror fedded from mercurial. The 3.2 build could come from the mirror, I guess. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . _/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOXdUplgi5GaxT1NAQIL3AP/WRq9IwRZXEuFkKRAqBm0cOi4CkTbcV5X Ix+JZvimKEiq1DkUsJJb6q5/ViQ3z15ai9idY+AOmv4EdMK9hbgYZIQXGig9TLvA LFvqTqnl9ZuZCVFEYh2QdnXU576edgn2AaBpBDpoC88IXcu6Y3kcmzFIHWRTh2MF SEkUAzETSrc= =cOVM -----END PGP SIGNATURE----- From benjamin at python.org Fri Nov 19 03:23:25 2010 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 18 Nov 2010 20:23:25 -0600 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <4CE5DD52.7050907@jcea.es> References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> Message-ID: 2010/11/18 Jesus Cea : > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 18/11/10 18:32, "Martin v. L?wis" wrote: >> In general, I'm *also* concerned about the lack of volunteers that >> are interested in working on the infrastructure. I wish some of the >> people who stated that they can't wait for the migration to happen >> would work on solving some of the remaining problems. > > Do we have a exhaustive list of mercurial "to do" things?. 

http://hg.python.org/pymigr/file/1576eb34ec9f/tasks.txt

--
Regards,
Benjamin

From g.brandl at gmx.net Fri Nov 19 08:43:15 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 19 Nov 2010 08:43:15 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To:
References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es>
Message-ID:

On 19.11.2010 03:23, Benjamin Peterson wrote:
> 2010/11/18 Jesus Cea:
>> On 18/11/10 18:32, "Martin v. Löwis" wrote:
>>> In general, I'm *also* concerned about the lack of volunteers that
>>> are interested in working on the infrastructure. I wish some of the
>>> people who stated that they can't wait for the migration to happen
>>> would work on solving some of the remaining problems.
>>
>> Do we have an exhaustive list of Mercurial "to do" things?
>
> http://hg.python.org/pymigr/file/1576eb34ec9f/tasks.txt

Uh, that's the list of things to do *at* the migration. The todo list is

http://hg.python.org/pymigr/file/1576eb34ec9f/todo.txt

Georg

From martin at v.loewis.de Fri Nov 19 08:58:27 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 19 Nov 2010 08:58:27 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To:
References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es>
Message-ID: <4CE62E23.9010701@v.loewis.de>

On 19.11.2010 03:23, Benjamin Peterson wrote:
> 2010/11/18 Jesus Cea:
>> On 18/11/10 18:32, "Martin v. Löwis" wrote:
>>> In general, I'm *also* concerned about the lack of volunteers that
>>> are interested in working on the infrastructure. I wish some of the
>>> people who stated that they can't wait for the migration to happen
>>> would work on solving some of the remaining problems.
>>
>> Do we have an exhaustive list of Mercurial "to do" things?
>
> http://hg.python.org/pymigr/file/1576eb34ec9f/tasks.txt

This doesn't, but IMO should, list

- resolve open issues in PEP
- finalize and implement branch structure
- set and implement policy for external code bases for Windows builds
- set up account management infrastructure, determine account managers

Regards,
Martin

From kristjan at ccpgames.com Fri Nov 19 08:31:59 2010
From: kristjan at ccpgames.com (Kristján Valur Jónsson)
Date: Fri, 19 Nov 2010 15:31:59 +0800
Subject: [Python-Dev] sha digest endianness
Message-ID: <2E034B571A5CE44E949B9FCC3B6D24EE57872AF5@exchcn.ccp.ad.local>

Please see this defect:
http://bugs.python.org/issue10430

It would appear that the digest and hexdigest for sha are wrong on
little-endian machines. There certainly is a discrepancy between
little- and big-endian ones, irrespective of which one is "right".

Any thoughts?

K

From ncoghlan at gmail.com Fri Nov 19 14:50:36 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 19 Nov 2010 23:50:36 +1000
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To:
References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es>
Message-ID:

On Fri, Nov 19, 2010 at 5:43 PM, Georg Brandl wrote:
> On 19.11.2010 03:23, Benjamin Peterson wrote:
>> 2010/11/18 Jesus Cea:
>>> On 18/11/10 18:32, "Martin v. Löwis" wrote:
>>>> In general, I'm *also* concerned about the lack of volunteers that
>>>> are interested in working on the infrastructure. I wish some of the
>>>> people who stated that they can't wait for the migration to happen
>>>> would work on solving some of the remaining problems.
>>>
>>> Do we have an exhaustive list of Mercurial "to do" things?
>>
>> http://hg.python.org/pymigr/file/1576eb34ec9f/tasks.txt
>
> Uh, that's the list of things to do *at* the migration.
> The todo list is
>
> http://hg.python.org/pymigr/file/1576eb34ec9f/todo.txt

That kind of link is the sort of thing that should really be in the
PEP... (along with the info about where to find the hooks repository,
specific URLs for at least 3.x, 3.1 and 2.7, pointers to a draft FAQ
to replace the current SVN-focused FAQ, etc)

Target dates for the following specific activities would also be useful:
- date a "final draft" of converted repository will be made available
to Martin and Ronald to dry run creation of Windows and Mac OS X
installers
- date SVN will go read only
- date Hg will be available for write access (it should be frozen for
a while, to give the folks doing the conversion a chance to make sure
buildbot is back up and running, commit emails are working properly, etc)

So as long as we acknowledge that any migration problems may mean
additional beta releases of 3.2 to iron things out, I don't see a
problem with releasing beta 1 as planned to close the door on any
*other* new features, and giving the Hg migration a clear run at the
source repository before we start working seriously on dealing with
bug reports (either existing ones, or those from the first beta).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From martin at v.loewis.de Fri Nov 19 15:36:35 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 19 Nov 2010 15:36:35 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To:
References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es>
Message-ID: <4CE68B73.30005@v.loewis.de>

> - date Hg will be available for write access (it should be frozen for
> a while, to give the folks doing the conversion a chance to make sure
> buildbot is back up and running, commit emails are working properly, etc)

I would target the build slaves to the Mercurial repository already in
the testing phase, e.g. by creating builders for building from commits
to the 3k branch. I hope Buildbot supports multiple change sources now.

Likewise, I'd also see commit emails being delivered in the test phase
already, and let committers make test commits to trigger this all (and
also to get acquainted with the Mercurial tools they are going to use,
without fear of breaking something).

Regards,
Martin

From barry at python.org Fri Nov 19 15:46:57 2010
From: barry at python.org (Barry Warsaw)
Date: Fri, 19 Nov 2010 09:46:57 -0500
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To:
References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es>
Message-ID: <20101119094657.1a7cc24a@mission>

On Nov 19, 2010, at 11:50 PM, Nick Coghlan wrote:

>- date SVN will go read only

Please note that svn cannot be made completely read-only. We've already
decided that versions already in maintenance or security-only mode (2.5, 2.6,
2.7, 3.1) will get updates and releases only via svn. But only the release
managers should have write access to the svn repositories.

-Barry

From ncoghlan at gmail.com Fri Nov 19 15:56:40 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 20 Nov 2010 00:56:40 +1000
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <20101119094657.1a7cc24a@mission>
References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <20101119094657.1a7cc24a@mission>
Message-ID:

On Sat, Nov 20, 2010 at 12:46 AM, Barry Warsaw wrote:
> On Nov 19, 2010, at 11:50 PM, Nick Coghlan wrote:
>
>>- date SVN will go read only
>
> Please note that svn cannot be made completely read-only. We've already
> decided that versions already in maintenance or security-only mode (2.5, 2.6,
> 2.7, 3.1) will get updates and releases only via svn. But only the release
> managers should have write access to the svn repositories.

Again, something that should be in PEP 385 (but isn't).

It seems that the work *is* going on, and the people actually doing it
have a reasonable idea as to what has been decided and where things
are going, but those of us "out here" have a fair stake in this as
well, and without an up-to-date PEP 385 there's no one place to go
to see the current state of the migration.

That's enough to make folks like me somewhat nervous as to whether or
not we're actually going to have a usable source control system come
December 12.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From dirkjan at ochtman.nl Fri Nov 19 16:00:40 2010
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Fri, 19 Nov 2010 16:00:40 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To:
References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <20101119094657.1a7cc24a@mission>
Message-ID:

On Fri, Nov 19, 2010 at 15:56, Nick Coghlan wrote:
> That's enough to make folks like me somewhat nervous as to whether or
> not we're actually going to have a usable source control system come
> December 12.

Yes, I've been negligent about updating the PEP. I'll try to do so next
week. Georg, if you have time to update it a bit, that would be great
as well.

Cheers,

Dirkjan

From g.brandl at gmx.net Fri Nov 19 17:23:44 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 19 Nov 2010 17:23:44 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <4CE62E23.9010701@v.loewis.de>
References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <4CE62E23.9010701@v.loewis.de>
Message-ID:

On 19.11.2010 08:58, "Martin v. Löwis" wrote:
> On 19.11.2010 03:23, Benjamin Peterson wrote:
>> 2010/11/18 Jesus Cea:
>>> On 18/11/10 18:32, "Martin v. Löwis" wrote:
>>>> In general, I'm *also* concerned about the lack of volunteers that
>>>> are interested in working on the infrastructure. I wish some of the
>>>> people who stated that they can't wait for the migration to happen
>>>> would work on solving some of the remaining problems.
>>>
>>> Do we have an exhaustive list of Mercurial "to do" things?
>>
>> http://hg.python.org/pymigr/file/1576eb34ec9f/tasks.txt
>
> This doesn't, but IMO should, list
>
> - resolve open issues in PEP
> - finalize and implement branch structure
> - set and implement policy for external code bases for Windows builds
> - set up account management infrastructure, determine account managers

Good points, I've added the missing ones to the todo list.

Georg

From john at arbash-meinel.com Fri Nov 19 17:38:16 2010
From: john at arbash-meinel.com (John Arbash Meinel)
Date: Fri, 19 Nov 2010 10:38:16 -0600
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To:
References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es>
Message-ID: <4CE6A7F8.3030008@arbash-meinel.com>

On 11/19/2010 7:50 AM, Nick Coghlan wrote:
> On Fri, Nov 19, 2010 at 5:43 PM, Georg Brandl wrote:
>> On 19.11.2010 03:23, Benjamin Peterson wrote:
>>> 2010/11/18 Jesus Cea:
>>>> On 18/11/10 18:32, "Martin v. Löwis" wrote:
>>>>> In general, I'm *also* concerned about the lack of volunteers that
>>>>> are interested in working on the infrastructure. I wish some of the
>>>>> people who stated that they can't wait for the migration to happen
>>>>> would work on solving some of the remaining problems.
>>>>
>>>> Do we have an exhaustive list of Mercurial "to do" things?
>>>
>>> http://hg.python.org/pymigr/file/1576eb34ec9f/tasks.txt
>>
>> Uh, that's the list of things to do *at* the migration. The todo list is
>>
>> http://hg.python.org/pymigr/file/1576eb34ec9f/todo.txt
>
> That kind of link is the sort of thing that should really be in the
> PEP... (along with the info about where to find the hooks repository,
> specific URLs for at least 3.x, 3.1 and 2.7, pointers to a draft FAQ
> to replace the current SVN-focused FAQ, etc)
>

Well, if it goes in the PEP, you should at least use the 'always the
most recent' version :)

http://hg.python.org/pymigr/file/tip/todo.txt

John
=:->

From g.brandl at gmx.net Fri Nov 19 17:51:23 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 19 Nov 2010 17:51:23 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To:
References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <20101119094657.1a7cc24a@mission>
Message-ID:

On 19.11.2010 16:00, Dirkjan Ochtman wrote:
> On Fri, Nov 19, 2010 at 15:56, Nick Coghlan wrote:
>> That's enough to make folks like me somewhat nervous as to whether or
>> not we're actually going to have a usable source control system come
>> December 12.
>
> Yes, I've been negligent about updating the PEP. I'll try to do so next
> week. Georg, if you have time to update it a bit, that would be great
> as well.

I'm at it. In fact, I think I will merge both todo.txt and tasks.txt into
the PEP. It's not more of a burden to update it there, and it's more
visible to the developer community.

Georg

From alexander.belopolsky at gmail.com Fri Nov 19 17:53:58 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Fri, 19 Nov 2010 11:53:58 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
Message-ID:

I was recently surprised to learn that chr(i) can produce a string of
length 2 in Python 3.x.
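
For illustration, a small sketch of where the length 2 comes from. On a narrow build, a code point above U+FFFF is stored as a UTF-16 surrogate pair. The helper names utf16_surrogates and utf16_length are hypothetical, not standard library functions. The sketch counts UTF-16 code units explicitly rather than calling len(chr(i)), because interpreters from Python 3.3 on no longer have narrow builds and always report a length of 1.

```python
def utf16_surrogates(code_point):
    """Return the (high, low) surrogate pair for a code point above U+FFFF."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not above U+FFFF")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


def utf16_length(text):
    """Count UTF-16 code units, i.e. what len() reports on a narrow build."""
    return len(text.encode("utf-16-le")) // 2


print([hex(unit) for unit in utf16_surrogates(0x10000)])  # ['0xd800', '0xdc00']
print(utf16_length(chr(0x10000)), utf16_length("A"))      # 2 1
```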
I suspect that I am not alone in finding this behavior non-obvious,
given that a mistake in the Python manual stating the contrary survived
several releases. [1] Note that I am not arguing that the change was
bad. In Python 2.x, \U escapes have been producing surrogate pairs on
narrow builds for a long time, if not since the introduction of unicode.
I do believe, however, that a change like this [2] and its consequences
should be better publicized. I have not found any discussion of this
change in PEPs or "What's new" documents. The closest find was a
mention of a related issue #3280 in the 3.0 NEWS file. [3] Since this
feature will be first documented in the Library Reference in 3.2, I
wonder if it will be appropriate to mention it in "What's new in 3.2"?

[1] http://bugs.python.org/issue7828
[2] http://svn.python.org/view?view=rev&revision=56395
[3] http://www.python.org/download/releases/3.0.1/NEWS.txt

From g.brandl at gmx.net Fri Nov 19 17:53:28 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 19 Nov 2010 17:53:28 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <4CE68B73.30005@v.loewis.de>
References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <4CE68B73.30005@v.loewis.de>
Message-ID:

On 19.11.2010 15:36, "Martin v. Löwis" wrote:
>> - date Hg will be available for write access (it should be frozen for
>> a while, to give the folks doing the conversion a chance to make sure
>> buildbot is back up and running, commit emails are working properly, etc)
>
> I would target the build slaves to the Mercurial repository already in
> the testing phase, e.g. by creating builders for building from commits
> to the 3k branch. I hope Buildbot supports multiple change sources now.
> Likewise, I'd also see commit emails being delivered in the test phase
> already, and let committers make test commits to trigger this all (and
> also to get acquainted with the Mercurial tools they are going to use,
> without fear of breaking something).

I've already let my Mercurial buildbot configuration run for a few checkins
while testing it; a separate changesource was not needed.

The commit email hook has also been tested extensively through its use for
the distutils2 repo, whose commit emails are also sent to python-checkins.

That said, it will of course be nice to activate both for the test repo
as well, once it's available.

Georg

From status at bugs.python.org Fri Nov 19 18:07:02 2010
From: status at bugs.python.org (Python tracker)
Date: Fri, 19 Nov 2010 18:07:02 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20101119170702.BB0FA1DBAD@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2010-11-12 - 2010-11-19)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    2549 (+23)
  closed 19694 (+43)
  total  22243 (+66)

Open issues with patches: 1058


Issues opened (43)
==================

#2571: cmd.py always uses raw_input, even when another stdin is speci
http://bugs.python.org/issue2571  reopened by eric.araujo

#4153: Unicode HOWTO up to date?
http://bugs.python.org/issue4153  reopened by belopolsky

#6941: Socket error when launching IDLE
http://bugs.python.org/issue6941  reopened by 08jpurcell

#10356: decimal.py: hash of -1
http://bugs.python.org/issue10356  reopened by rhettinger

#10399: AST Optimization: inlining of function calls
http://bugs.python.org/issue10399  opened by dmalcolm

#10401: Globals / builtins cache
http://bugs.python.org/issue10401  opened by pitrou

#10402: sporadic test_bsddb3 failures
http://bugs.python.org/issue10402  opened by pitrou

#10403: Use "member" consistently
http://bugs.python.org/issue10403  opened by fdrake

#10404: IDLE on OS X popup menus do not work: cannot set/clear breakpo
http://bugs.python.org/issue10404  opened by ned.deily

#10405: IDLE breakpoint facility undocumented
http://bugs.python.org/issue10405  opened by ned.deily

#10406: IDLE 2.7 on OS X does not enable Rstrip extension by default
http://bugs.python.org/issue10406  opened by ned.deily

#10407: missing errno import in distutils/dir_util.py
http://bugs.python.org/issue10407  opened by zbysz

#10408: Denser dicts and linear probing
http://bugs.python.org/issue10408  opened by pitrou

#10415: readline.insert_text documentation incomplete
http://bugs.python.org/issue10415  opened by Justin.Lebar

#10417: unittest triggers UnicodeEncodeError with non-ASCII character
http://bugs.python.org/issue10417  opened by jammon

#10419: distutils command build_scripts fails with UnicodeDecodeError
http://bugs.python.org/issue10419  opened by hagen

#10420: Document of Bdb.effective is wrong.
http://bugs.python.org/issue10420  opened by naoki

#10423: s/args/options in arpgarse "Upgrading optparse code"
http://bugs.python.org/issue10423  opened by bethard

#10424: better error message from argparse when positionals missing
http://bugs.python.org/issue10424  opened by bethard

#10427: 24:00 Hour in DateTime
http://bugs.python.org/issue10427  opened by ingo.janssen

#10430: _sha.sha().digest() method is endian-sensitive. and hexdigest(
http://bugs.python.org/issue10430  opened by krisvale

#10433: Document unique behavior of 'getgroups' on OSX
http://bugs.python.org/issue10433  opened by r.david.murray

#10434: Document the rules for "public names"
http://bugs.python.org/issue10434  opened by belopolsky

#10435: Document unicode C-API in reST
http://bugs.python.org/issue10435  opened by belopolsky

#10436: tarfile.extractfile in "r|" stream mode fails with filenames o
http://bugs.python.org/issue10436  opened by David.Nesting

#10437: ThreadPoolExecutor should accept max_workers=None
http://bugs.python.org/issue10437  opened by stutzbach

#10438: list an example for calling static methods from WITHIN classes
http://bugs.python.org/issue10438  opened by ifreecarve

#10439: PyCodec C API is not documented in reST
http://bugs.python.org/issue10439  opened by belopolsky

#10441: some stdlib modules need to be updated to handle SSL certifica
http://bugs.python.org/issue10441  opened by db

#10444: A mechanism is needed to override waiting for Python threads t
http://bugs.python.org/issue10444  opened by michaelahughes

#10446: pydoc3 links to 2.x library reference
http://bugs.python.org/issue10446  opened by belopolsky

#10448: Add Mako template benchmark to Python Benchmark Suite
http://bugs.python.org/issue10448  opened by bobbyi

#10449: "os.environ was modified by test_httpservers"
http://bugs.python.org/issue10449  opened by eric.araujo

#10450: Fix markup in Misc/NEWS
http://bugs.python.org/issue10450  opened by eric.araujo

#10451: memoryview can be used to write into readonly buffer
http://bugs.python.org/issue10451  opened by abacabadabacaba

#10453: Add -h/--help option to compileall
http://bugs.python.org/issue10453  opened by eric.araujo

#10454: Clarify compileall command-line options
http://bugs.python.org/issue10454  opened by eric.araujo

#10457: "Related help topics" shown outside pager
http://bugs.python.org/issue10457  opened by cben

#10458: 2.7 += re.ASCII
http://bugs.python.org/issue10458  opened by hfuru

#10459: missing character names in unicodedata (CJK...)
http://bugs.python.org/issue10459  opened by vbr

#10460: Misc/indent.pro does not reflect PEP 7
http://bugs.python.org/issue10460  opened by Mick.Beaver

#10461: Use with statement throughout the docs
http://bugs.python.org/issue10461  opened by eric.araujo

#10445: _ast py3k : add lineno back to "args" node
http://bugs.python.org/issue10445  opened by emile.anclin



Most recent 15 issues with no replies (15)
==========================================

#10461: Use with statement throughout the docs
http://bugs.python.org/issue10461

#10460: Misc/indent.pro does not reflect PEP 7
http://bugs.python.org/issue10460

#10457: "Related help topics" shown outside pager
http://bugs.python.org/issue10457

#10451: memoryview can be used to write into readonly buffer
http://bugs.python.org/issue10451

#10449: "os.environ was modified by test_httpservers"
http://bugs.python.org/issue10449

#10445: _ast py3k : add lineno back to "args" node
http://bugs.python.org/issue10445

#10439: PyCodec C API is not documented in reST
http://bugs.python.org/issue10439

#10437: ThreadPoolExecutor should accept max_workers=None
http://bugs.python.org/issue10437

#10433: Document unique behavior of 'getgroups' on OSX
http://bugs.python.org/issue10433

#10424: better error message from argparse when positionals missing
http://bugs.python.org/issue10424

#10423: s/args/options in arpgarse "Upgrading optparse code"
http://bugs.python.org/issue10423

#10420: Document of Bdb.effective is wrong.
http://bugs.python.org/issue10420

#10419: distutils command build_scripts fails with UnicodeDecodeError
http://bugs.python.org/issue10419

#10406: IDLE 2.7 on OS X does not enable Rstrip extension by default
http://bugs.python.org/issue10406

#10405: IDLE breakpoint facility undocumented
http://bugs.python.org/issue10405



Most recent 15 issues waiting for review (15)
=============================================

#10448: Add Mako template benchmark to Python Benchmark Suite
http://bugs.python.org/issue10448

#10446: pydoc3 links to 2.x library reference
http://bugs.python.org/issue10446

#10444: A mechanism is needed to override waiting for Python threads t
http://bugs.python.org/issue10444

#10435: Document unicode C-API in reST
http://bugs.python.org/issue10435

#10419: distutils command build_scripts fails with UnicodeDecodeError
http://bugs.python.org/issue10419

#10408: Denser dicts and linear probing
http://bugs.python.org/issue10408

#10406: IDLE 2.7 on OS X does not enable Rstrip extension by default
http://bugs.python.org/issue10406

#10404: IDLE on OS X popup menus do not work: cannot set/clear breakpo
http://bugs.python.org/issue10404

#10401: Globals / builtins cache
http://bugs.python.org/issue10401

#10399: AST Optimization: inlining of function calls
http://bugs.python.org/issue10399

#10391: obj2ast's error handling can lead to python crashing with a C-
http://bugs.python.org/issue10391

#10385: Mark up "subprocess" as module in its doc
http://bugs.python.org/issue10385

#10383: test_os leaks under Windows
http://bugs.python.org/issue10383

#10382: Command line error marker misplaced on unicode entry
http://bugs.python.org/issue10382

#10371: Deprecate trace module undocumented API
http://bugs.python.org/issue10371



Top 10 most discussed issues (10)
=================================

#3871: cross and native build of python for mingw32 with distutils
http://bugs.python.org/issue3871  17 msgs

#10441: some stdlib modules need to be updated to handle SSL certifica
http://bugs.python.org/issue10441  16 msgs

#2001: Pydoc interactive browsing enhancement
http://bugs.python.org/issue2001  14 msgs

#10356: decimal.py: hash of -1
http://bugs.python.org/issue10356  14 msgs

#10446: pydoc3 links to 2.x library reference
http://bugs.python.org/issue10446  12 msgs

#7900: posix.getgroups() failure on Mac OS X
http://bugs.python.org/issue7900  11 msgs

#10435: Document unicode C-API in reST
http://bugs.python.org/issue10435  11 msgs

#4153: Unicode HOWTO up to date?
http://bugs.python.org/issue4153  10 msgs

#10417: unittest triggers UnicodeEncodeError with non-ASCII character
http://bugs.python.org/issue10417  8 msgs

#1553375: Add traceback.print_full_exception()
http://bugs.python.org/issue1553375  8 msgs



Issues closed (44)
==================

#4471: IMAP4 missing support for starttls
http://bugs.python.org/issue4471  closed by pitrou

#4476: compileall fails if current dir has a "types" package
http://bugs.python.org/issue4476  closed by ncoghlan

#5111: httplib: wrong Host header when connecting to IPv6 litteral UR
http://bugs.python.org/issue5111  closed by orsenthil

#7828: chr() and ord() documentation for wide characters
http://bugs.python.org/issue7828  closed by belopolsky

#8649: Py_UNICODE_* functions are undocumented
http://bugs.python.org/issue8649  closed by belopolsky

#9076: Add C-API documentation for PyUnicode_AsDecodedObject/Unicode
http://bugs.python.org/issue9076  closed by georg.brandl

#9520: Add Patricia Trie high performance container
http://bugs.python.org/issue9520  closed by rhettinger

#9991: xmlrpc client ssl check faulty
http://bugs.python.org/issue9991  closed by orsenthil

#10070: 2to3 wishes for already-2to3'ed files
http://bugs.python.org/issue10070  closed by loewis

#10205: Can't have two tags with the same QName
http://bugs.python.org/issue10205  closed by orsenthil

#10260: Add a threading.Condition.wait_for() method
http://bugs.python.org/issue10260  closed by krisvale

#10373: Setup Script example incorrect
http://bugs.python.org/issue10373  closed by eric.araujo

#10392: GZipFile crash when fileobj.mode is None
http://bugs.python.org/issue10392  closed by r.david.murray

#10396: stdin argument to pdb.Pdb doesn't work unless you also set Pdb
http://bugs.python.org/issue10396  closed by georg.brandl

#10397: Unified Benchmark Suite fails on py3k with --track-memory
http://bugs.python.org/issue10397  closed by pitrou

#10398: errors in docs re module initialization vs self arg to functio
http://bugs.python.org/issue10398  closed by georg.brandl

#10400: updating unicodedata to Unicode 6
http://bugs.python.org/issue10400  closed by loewis

#10409: mkcfg crashes with ValueError
http://bugs.python.org/issue10409  closed by tarek

#10410: Is iterable a container type?
http://bugs.python.org/issue10410  closed by rhettinger

#10411: Pickle benchmark fails after converting Benchmark Suite to py3
http://bugs.python.org/issue10411  closed by pitrou

#10412: Add py3k support for "slow" pickle benchmark in Benchmark Suit
http://bugs.python.org/issue10412  closed by pitrou

#10413: Comments in unicode.h are out of date
http://bugs.python.org/issue10413  closed by belopolsky

#10414: socket.gethostbyname doesn't return an ipv6 address
http://bugs.python.org/issue10414  closed by loewis

#10416: UnicodeDecodeError when 2to3 is run on a dir with numpy .npy f
http://bugs.python.org/issue10416  closed by benjamin.peterson

#10418: test_io hangs on 3.1.3rc1
http://bugs.python.org/issue10418  closed by vdupras

#10421: Failed issue tracker submission
http://bugs.python.org/issue10421  closed by eric.araujo

#10422: pstats.py : error when loading multiple stats files
http://bugs.python.org/issue10422  closed by ezio.melotti

#10425: xmlrpclib support for None isn't compliant with XMLRPC
http://bugs.python.org/issue10425  closed by orsenthil

#10426: The whole thing is NOT good
http://bugs.python.org/issue10426  closed by georg.brandl

#10428: IDLE Trouble shooting
http://bugs.python.org/issue10428  closed by r.david.murray

#10429: bug in test_imaplib
http://bugs.python.org/issue10429  closed by pitrou

#10431: Failed issue tracker submission
http://bugs.python.org/issue10431  closed by ezio.melotti

#10432: concurrent.futures.as_completed() spins waiting for futures to
http://bugs.python.org/issue10432  closed by bquinlan

#10440: support RUSAGE_THREAD as a constant in the resource module
http://bugs.python.org/issue10440  closed by pitrou

#10442: Please by default enforce ssl certificate checking in modules
http://bugs.python.org/issue10442  closed by ned.deily

#10443: add wrapper for SSL_CTX_set_default_verify_paths
http://bugs.python.org/issue10443  closed by pitrou

#10447: zipfile: IOError for long directory paths on Windows
http://bugs.python.org/issue10447  closed by amaury.forgeotdarc

#10452: Unhelpful diagnostic 'cannot find the path specified'
http://bugs.python.org/issue10452  closed by eric.smith

#10455: typo in urllib.request documentation
http://bugs.python.org/issue10455  closed by ezio.melotti

#10456: unittest.main(verbosity=2) broke in python31, worked when I ha
http://bugs.python.org/issue10456  closed by r.david.murray

#1599329: urllib(2) should allow automatic decoding by charset
http://bugs.python.org/issue1599329  closed by eric.araujo

#1376292: Write user's version of the reference guide
http://bugs.python.org/issue1376292  closed by akuchling

#1509798: replace dist/src/Tools/scripts/which.py with tmick's which
http://bugs.python.org/issue1509798  closed by eric.araujo

#1520831: urrlib2 max_redirections=0 disables redirects
http://bugs.python.org/issue1520831  closed by orsenthil

From g.brandl at gmx.net Fri Nov 19 18:12:22 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 19 Nov 2010 18:12:22 +0100
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To: <20101119094657.1a7cc24a@mission>
References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <20101119094657.1a7cc24a@mission>
Message-ID:

On 19.11.2010 15:46, Barry Warsaw wrote:
> On Nov 19, 2010, at 11:50 PM, Nick Coghlan wrote:
>
>>- date SVN will go read only
>
> Please note that svn cannot be made completely read-only. We've already
> decided that versions already in maintenance or security-only mode (2.5, 2.6,
> 2.7, 3.1) will get updates and releases only via svn. But only the release
> managers should have write access to the svn repositories.

Really?
I can understand this for security-only branches (commits there will be rare, and equivalent commits to the Mercurial branches can be made by others than the release managers, in order to keep history consistent). But having the maintenance branches (by then, that will mostly be 2.7 because 3.1 will go to security-only mode soon) in SVN will be a burden for every developer, since they have to backport bugfixes from Hg to SVN... Georg From solipsis at pitrou.net Fri Nov 19 18:17:20 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 19 Nov 2010 18:17:20 +0100 Subject: [Python-Dev] len(chr(i)) = 2? References: Message-ID: <20101119181720.10ec11d3@pitrou.net> On Fri, 19 Nov 2010 11:53:58 -0500 Alexander Belopolsky wrote: > Since this feature will be first documented in the > Library Reference in 3.2, I wonder if it will be appropriate to > mention it in "What's new in 3.2"? No, since it's not new in 3.2. No need to further confuse users. If there's a porting guide to 3.x it should be mentioned there. Regards Antoine. From barry at python.org Fri Nov 19 18:41:58 2010 From: barry at python.org (Barry Warsaw) Date: Fri, 19 Nov 2010 12:41:58 -0500 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <20101119094657.1a7cc24a@mission> Message-ID: <20101119124158.3d8debc9@mission> On Nov 19, 2010, at 06:12 PM, Georg Brandl wrote: >Am 19.11.2010 15:46, schrieb Barry Warsaw: >> On Nov 19, 2010, at 11:50 PM, Nick Coghlan wrote: >> >>>- date SVN will go read only >> >> Please note that svn cannot be made completely read-only. We've already >> decided that versions already in maintenance or security-only mode (2.5, 2.6, >> 2.7, 3.1) will get updates and releases only via svn. But only the release >> managers should have write access to the svn repositories. > >Really? 
I can understand this for security-only branches (commits there will >be rare, and equivalent commits to the Mercurial branches can be made by >others than the release managers, in order to keep history consistent). > >But having the maintenance branches (by then, that will mostly be 2.7 because >3.1 will go to security-only mode soon) in SVN will be a burden for every >developer, since they have to backport bugfixes from Hg to SVN... Maybe I misremembered Martin's suggestion, and he was only talking about security releases. I think the key thing is whether you're going to backport the vcs related bits to stable releases. I plan to only do releases for 2.6 from svn, because it's not worth breaking things like sys.subversion, and as you say the number of commits will be small. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From solipsis at pitrou.net Fri Nov 19 19:06:09 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 19 Nov 2010 19:06:09 +0100 Subject: [Python-Dev] Mercurial Schedule References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <20101119094657.1a7cc24a@mission> <20101119124158.3d8debc9@mission> Message-ID: <20101119190609.637c7a72@pitrou.net> On Fri, 19 Nov 2010 12:41:58 -0500 Barry Warsaw wrote: > >Really? I can understand this for security-only branches (commits there will > >be rare, and equivalent commits to the Mercurial branches can be made by > >others than the release managers, in order to keep history consistent). > > > >But having the maintenance branches (by then, that will mostly be 2.7 because > >3.1 will go to security-only mode soon) in SVN will be a burden for every > >developer, since they have to backport bugfixes from Hg to SVN... 
> > Maybe I misremembered Martin's suggestion, and he was only talking about > security releases. I think the key thing is whether you're going to backport > the vcs related bits to stable releases. It would be horribly burdensome to use two different VCSes depending on whether you're working on a bugfix branch or a feature branch. > I plan to only do releases for 2.6 from svn, because it's not worth breaking > things like sys.subversion, and as you say the number of commits will be > small. But 2.6 is security-fixes only, right? It would really be annoying if the same rules applied for 2.7 and 3.1. I don't understand all the worry about sys.subversion. It's not like it's useful to anybody else than us, and I think it should have been named sys._subversion instead. There's no point in making API-like promises about which DVCS, bug tracker or documentation toolset we use for our workflow. Regards Antoine. From merwok at netwok.org Fri Nov 19 19:41:54 2010 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Fri, 19 Nov 2010 19:41:54 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <20101119190609.637c7a72@pitrou.net> References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <20101119094657.1a7cc24a@mission> <20101119124158.3d8debc9@mission> <20101119190609.637c7a72@pitrou.net> Message-ID: <4CE6C4F2.2040806@netwok.org> > I don't understand all the worry about sys.subversion. It's not like > it's useful to anybody else than us, and I think it should have been > named sys._subversion instead. There's no point in making API-like > promises about which DVCS, bug tracker or documentation toolset we use > for our workflow. I read ?subversion? as ?sub-piece of information about version?, not the name of a VCS, so I have no problem with its continuing existence under Mercurial (it?s in PEP 385). 
Regards From brett at python.org Fri Nov 19 19:52:03 2010 From: brett at python.org (Brett Cannon) Date: Fri, 19 Nov 2010 10:52:03 -0800 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> Message-ID: On Fri, Nov 19, 2010 at 05:50, Nick Coghlan wrote: > On Fri, Nov 19, 2010 at 5:43 PM, Georg Brandl wrote: >> Am 19.11.2010 03:23, schrieb Benjamin Peterson: >>> 2010/11/18 Jesus Cea : >>>> -----BEGIN PGP SIGNED MESSAGE----- >>>> Hash: SHA1 >>>> >>>> On 18/11/10 18:32, "Martin v. L?wis" wrote: >>>>> In general, I'm *also* concerned about the lack of volunteers that >>>>> are interested in working on the infrastructure. I wish some of the >>>>> people who stated that they can't wait for the migration to happen >>>>> would work on solving some of the remaining problems. >>>> >>>> Do we have a exhaustive list of mercurial "to do" things?. >>> >>> http://hg.python.org/pymigr/file/1576eb34ec9f/tasks.txt >> >> Uh, that's the list of things to do *at* the migration. ?The todo list is >> >> http://hg.python.org/pymigr/file/1576eb34ec9f/todo.txt > > That kind of link is the sort of thing that should really be in the > PEP... (along with the info about where to find the hooks repository, > specific URLs for at least 3.x, 3.1 and 2.7, pointers to a draft FAQ > to replace the current SVN focused FAQ, etc) I am spending my PSF grant time in January rewriting python.org/dev practically from scratch. Any needed updates to take Mercurial in account will happen no later than then. 
-Brett > > Target dates for the following specific activities would also be useful: > - date a "final draft" of converted repository will be made available > to Martin and Ronald to dry run creation of Windows and Mac OS X > installers > - date SVN will go read only > - date Hg will be available for write access (it should be frozen for > a while, to give the folks doing the conversion a chance to make sure > buildbot is back up and run, commit emails are working properly, etc) > > So as long as we acknowledge that any migration problems may mean > additional beta releases of 3.2 to iron things out, I don't see a > problem with releasing beta 1 as planned to close the door on any > *other* new features, and giving the Hg migration a clear run at the > source repository before we start working seriously on dealing with > bug reports (either existing ones, or those from the first beta). > > Cheers, > Nick. > > -- > Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > From victor.stinner at haypocalc.com Fri Nov 19 21:23:14 2010 From: victor.stinner at haypocalc.com (Victor Stinner) Date: Fri, 19 Nov 2010 21:23:14 +0100 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: References: Message-ID: <201011192123.14169.victor.stinner@haypocalc.com> Hi, On Friday 19 November 2010 17:53:58 Alexander Belopolsky wrote: > I was recently surprised to learn that chr(i) can produce a string of > length 2 in python 3.x. Yes, but only on narrow build. Eg. Debian and Ubuntu compile Python 3.1 in wide mode (sys.maxunicode == 1114111). > I suspect that I am not alone finding this behavior non-obvious > given that a mistake in Python manual stating the contrary survived > several releases. 
[1] It was a documentation bug and you fixed it. Non-BMP characters are rare, so few (maybe only you?) noticed the documentation bug. I consider the behaviour as an improvement of non-BMP support of Python3. Python is unclear about non-BMP characters: narrow build was called "ucs2" for long time, even if it is UTF-16 (each character is encoded to one or two UTF-16 words). Python2 accepts non-BMP characters with \U syntax, but not with unichr(). This is inconsistent and I see this as a bug. But I don't want to touch Python2 about non-BMP characters, and the "bug" is already fixed in Python3! > I do believe, however that a change like > this [2] and its consequences should be better publicized. Change made before the release of Python 3.0. Do you want to patch the "What's new in Python 3.0?" document? > I have not > found any discussion of this change in PEPs or "What's new" documents. > The closest find was a mentioning of a related issue #3280 in the 3.0 > NEWS file. [3] Since this feature will be first documented in the > Library Reference in 3.2, I wonder if it will be appropriate to > mention it in "What's new in 3.2"? In my opinion, the question is more why it was not fixed in Python2. I suppose that the answer is something ugly like "backward compatibility" or "historical reasons" :-) Victor From martin at v.loewis.de Fri Nov 19 22:25:08 2010 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 19 Nov 2010 22:25:08 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <20101119124158.3d8debc9@mission> References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <20101119094657.1a7cc24a@mission> <20101119124158.3d8debc9@mission> Message-ID: <4CE6EB34.5010805@v.loewis.de> > Maybe I misremembered Martin's suggestion, and he was only talking about > security releases. Technically, I was only talking about 2.5. 
For each branch, the respective release manager should make a decision. For 2.5 and 2.6, it's been decided; Benjamin has not yet announced plans how 2.7 and 3.1 will be maintained after the switchover. Regards, Martin From martin at v.loewis.de Fri Nov 19 22:35:54 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 19 Nov 2010 22:35:54 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <20101119190609.637c7a72@pitrou.net> References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <20101119094657.1a7cc24a@mission> <20101119124158.3d8debc9@mission> <20101119190609.637c7a72@pitrou.net> Message-ID: <4CE6EDBA.9040706@v.loewis.de> > I don't understand all the worry about sys.subversion. Really? For a security release, there should be *zero* chance that it breaks existing applications, unless the application relies on the security bug that has been fixed. By "zero chance", I mean absolutely no chance, never. I'm pretty sure that applications *will* break because of the change to sys.subversion, or sys.version. People made bug reports complaining that sys.version has a newline on some systems and not on others. > It's not like > it's useful to anybody else than us I think you underestimate what API people actually use in applications http://tinyurl.com/292vhxx http://tinyurl.com/23ah8ps http://tinyurl.com/27fhyvk http://tinyurl.com/28cuyv9 etc. 
Regards, Martin From g.brandl at gmx.net Fri Nov 19 22:39:04 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 19 Nov 2010 22:39:04 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <4CE6EDBA.9040706@v.loewis.de> References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <20101119094657.1a7cc24a@mission> <20101119124158.3d8debc9@mission> <20101119190609.637c7a72@pitrou.net> <4CE6EDBA.9040706@v.loewis.de> Message-ID: Am 19.11.2010 22:35, schrieb "Martin v. Löwis": >> I don't understand all the worry about sys.subversion. > > Really? For a security release, there should be *zero* chance that it > breaks existing applications, unless the application relies on the > security bug that has been fixed. By "zero chance", I mean absolutely > no chance, never. I'm pretty sure that applications *will* break because > of the change to sys.subversion, or sys.version. People made bug reports > complaining that sys.version has a newline on some systems and not on > others. > >> It's not like >> it's useful to anybody else than us > > I think you underestimate what API people actually use in applications > > http://tinyurl.com/292vhxx > http://tinyurl.com/23ah8ps > http://tinyurl.com/27fhyvk > http://tinyurl.com/28cuyv9 > etc. Well, it should not be a problem to continue to provide a sys.subversion that at least will not break applications reading it. And yes, I am in favor of giving the new attribute a leading underscore. 
Georg From solipsis at pitrou.net Fri Nov 19 22:43:12 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 19 Nov 2010 22:43:12 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <4CE6EDBA.9040706@v.loewis.de> References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <20101119094657.1a7cc24a@mission> <20101119124158.3d8debc9@mission> <20101119190609.637c7a72@pitrou.net> <4CE6EDBA.9040706@v.loewis.de> Message-ID: <1290202992.3621.4.camel@localhost.localdomain> Le vendredi 19 novembre 2010 à 22:35 +0100, "Martin v. Löwis" a écrit : > > I don't understand all the worry about sys.subversion. > > Really? For a security release, there should be *zero* chance that it > breaks existing applications, It should have been clear that my message explicitly excluded security releases. Regards Antoine. From martin at v.loewis.de Fri Nov 19 22:43:45 2010 From: martin at v.loewis.de ("Martin v. Löwis") Date: Fri, 19 Nov 2010 22:43:45 +0100 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <201011192123.14169.victor.stinner@haypocalc.com> References: <201011192123.14169.victor.stinner@haypocalc.com> Message-ID: <4CE6EF91.1040803@v.loewis.de> > In my opinion, the question is more what was it not fixed in Python2. I suppose > that the answer is something ugly like "backward compatibility" or "historical > reasons" :-) No, there was a deliberate decision to not support that, see http://www.python.org/dev/peps/pep-0261/ There had been a long discussion on this specific detail when PEP 261 was written, and in the end, an explicit, deliberate, considered decision was made to raise a ValueError. 
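For reference, the length-2 result on a narrow build comes from splitting a non-BMP code point into a UTF-16 surrogate pair. The sketch below shows only that arithmetic. The function names surrogate_pair and combine_surrogates are made up for illustration and are not part of Python or of PEP 261.

```python
def surrogate_pair(code_point):
    """Split a supplementary-plane code point into (high, low) UTF-16 code units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in the supplementary planes")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def combine_surrogates(high, low):
    """Inverse of surrogate_pair: rebuild the code point from the two code units."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
```

A narrow build stores the two code units separately, which is why len() reports 2 for such a character there.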
Regards, Martin From ezio.melotti at gmail.com Fri Nov 19 23:05:51 2010 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Sat, 20 Nov 2010 00:05:51 +0200 Subject: [Python-Dev] [Python-checkins] r86530 - python/branches/py3k/Doc/howto/unicode.rst In-Reply-To: <20101119161003.AAEEAEE9A6@mail.python.org> References: <20101119161003.AAEEAEE9A6@mail.python.org> Message-ID: <4CE6F4BF.9050409@gmail.com> Hi, On 19/11/2010 18.10, alexander.belopolsky wrote: > Author: alexander.belopolsky > Date: Fri Nov 19 17:09:58 2010 > New Revision: 86530 > > Log: > Issue #4153: Updated Unicode HOWTO. > > Modified: > python/branches/py3k/Doc/howto/unicode.rst > > Modified: python/branches/py3k/Doc/howto/unicode.rst > ============================================================================== > --- python/branches/py3k/Doc/howto/unicode.rst (original) > +++ python/branches/py3k/Doc/howto/unicode.rst Fri Nov 19 17:09:58 2010 > > > [...] > > > -Python 2.x's Unicode Support > -============================ > +Python's Unicode Support > +======================== > > Now that you've learned the rudiments of Unicode, we can look at Python's > Unicode features. > @@ -265,7 +263,7 @@ > UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: > unexpected code byte > >>> b'\x80abc'.decode("utf-8", "replace") > - '\ufffdabc' > + '?abc' Apparently 'make latex' and 'make all-pdf' don't like this char. > >>> b'\x80abc'.decode("utf-8", "ignore") > 'abc' > > [...] Best Regards, Ezio Melotti From benjamin at python.org Fri Nov 19 23:20:25 2010 From: benjamin at python.org (Benjamin Peterson) Date: Fri, 19 Nov 2010 16:20:25 -0600 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <4CE6EB34.5010805@v.loewis.de> References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <20101119094657.1a7cc24a@mission> <20101119124158.3d8debc9@mission> <4CE6EB34.5010805@v.loewis.de> Message-ID: 2010/11/19 "Martin v. 
Löwis" : >> Maybe I misremembered Martin's suggestion, and he was only talking about >> security releases. > > Technically, I was only talking about 2.5. For each branch, the > respective release manager should make a decision. For 2.5 and 2.6, > it's been decided; Benjamin has not yet announced plans how 2.7 and 3.1 > will be maintained after the switchover. I propose that they follow the development branches over to hg. Having to backport bug fixes with any frequency from hg to svn would probably be more unpleasant than the current svnmerge situation. -- Regards, Benjamin From mal at egenix.com Fri Nov 19 23:25:03 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 19 Nov 2010 23:25:03 +0100 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <201011192123.14169.victor.stinner@haypocalc.com> References: <201011192123.14169.victor.stinner@haypocalc.com> Message-ID: <4CE6F93F.9010109@egenix.com> Victor Stinner wrote: > Hi, > > On Friday 19 November 2010 17:53:58 Alexander Belopolsky wrote: >> I was recently surprised to learn that chr(i) can produce a string of >> length 2 in python 3.x. > > Yes, but only on narrow build. Eg. Debian and Ubuntu compile Python 3.1 in > wide mode (sys.maxunicode == 1114111). > >> I suspect that I am not alone finding this behavior non-obvious >> given that a mistake in Python manual stating the contrary survived >> several releases. [1] > > It was a documentation bug and you fixed it. Non-BMP characters are rare, so > few (maybe only you?) noticed the documentation bug. I consider the behaviour > as an improvement of non-BMP support of Python3. > > Python is unclear about non-BMP characters: narrow build was called "ucs2" for > long time, even if it is UTF-16 (each character is encoded to one or two > UTF-16 words). No, no, no :-) UCS2 and UCS4 are more appropriate than "narrow" and "wide" or even "UTF-16" and "UTF-32". It's rather common to confuse a transfer encoding with a storage format. 
UCS2 and UCS4 refer to code units (the storage format). You can use UCS2 and UCS4 code units to represent UTF-16 and UTF-32 resp., but those are not the same things. In UTF-16 0xD800 has a special meaning, in UCS2 it doesn't. Python uses UCS2 internally. It does not assign a special meaning to those surrogate code point ranges. However, when it comes to codecs, we do try to make use of the fact that UCS2 can easily be used to represent an UTF-16 encoding and that's why you often see surrogates being created for code points that wouldn't otherwise fit into UCS2 and you see those surrogates being converted back to single code units in UCS4 builds. I don't know who invented the terms "narrow" and "wide" builds for Python3. Not me that's for sure :-) They don't have any meaning in Unicode terminology and thus cause even more confusion than UCS2 and UCS4. E.g. the import errors you get when importing extensions built for a different Unicode version, (correctly) refer to UCS2 vs. UCS4 and now give even less of a clue that they relate to difference in Unicode builds (since these are now labeled "narrow" and "wide"). IMO, we should go back to the Python2 terms UCS2 and UCS4 which are correct and provide a clear description of what Python uses internally for code units. > Python2 accepts non-BMP characters with \U syntax, but not with > chr(). This is inconsistent and I see this as a bug. But I don't want to touch > Python2 about non-BMP characters, and the "bug" is already fixed in Python3! > >> I do believe, however that a change like >> this [2] and its consequences should be better publicized. > > Change made before the release of Python 3.0. Do you want to patch the "What's > new in Python 3.0?" document? Perhaps add a section "What we forgot to mention in 3.0" or "What's not so new in 3.2" to "What's new in 3.2" :-) >> I have not >> found any discussion of this change in PEPs or "What's new" documents. 
>> The closest find was a mentioning of a related issue #3280 in the 3.0 >> NEWS file. [3] Since this feature will be first documented in the >> Library Reference in 3.2, I wonder if it will be appropriate to >> mention it in "What's new in 3.2"? > > In my opinion, the question is more what was it not fixed in Python2. I suppose > that the answer is something ugly like "backward compatibility" or "historical > reasons" :-) Backwards compatibility. Python2 applications don't expect unichr(i) to return anything other than a single character. If you need this in Python2, it's easy enough to get around, though, with a little helper function. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 19 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From martin at v.loewis.de Fri Nov 19 23:46:08 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Fri, 19 Nov 2010 23:46:08 +0100 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <4CE6F93F.9010109@egenix.com> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> Message-ID: <4CE6FE30.1050903@v.loewis.de> > It'S rather common to confuse a transfer encoding with a storage format. > UCS2 and UCS4 refer to code units (the storage format). Actually, they don't. Instead, they refer to "coded character sets", in W3C terminology: mapping of characters to natural numbers. 
See http://unicode.org/faq/basic_q.html#14 The term "UCS-2" is a character set that can encode only 65536 characters; it thus refers to Unicode 1.1. According to the Unicode Consortium's FAQ, the term UCS-2 should be avoided these days. > IMO, we should go back to the Python2 terms UCS2 and UCS4 which > are correct and provide a clear description of what Python uses > internally for code units. No, we shouldn't. The term UCS-2 is deprecated, see above. Regards, Martin From v+python at g.nevcal.com Sat Nov 20 04:48:58 2010 From: v+python at g.nevcal.com (Glenn Linderman) Date: Fri, 19 Nov 2010 19:48:58 -0800 Subject: [Python-Dev] Web servers, bytes, str, documentation, Python 3.2a4 Message-ID: <4CE7452A.7050109@g.nevcal.com> So maybe this is the wrong forum, if so please tell me what the right forum is for each of the various pieces. I'm assuming that I should file some bugs in the tracker, but I'm not exactly sure whether to file them on cgitb, http.server, or subprocess, or all of the above. Pretty sure there are at least some in http.server, but maybe some of those will be considered "enhancement requests" since they are long outstanding in the predecessor code. So I've been writing CGI scripts in Python behind Apache. No framework, just raw CGI. Got everything working on Python 2.6 (it's the newest that the hosting company has). Whacked at 2.6's CGIHTTPServer.py until I got an environment that would actually run CGI programs in the same sort of way that Apache does, so I can test faster, locally. Got the site working. Am happy. Now I decided to tackle porting the code to Python 3, in hopes that someday the hosting company might have it, and to see what I could learn about the "Subject:" matters, and to altruistically see if 3.2a4 has a consistent story. Um. Well. Some of me, Python 3.2a4, or its documentation is missing something. Maybe several somethings. Here's some code to ponder. 
    import sys
    import traceback
    sys.stdout = open("sob", "wb") # WSGI sez data should be binary, so stdout should be binary???
    import cgitb
    sys.stdout.write(b"out")
    fhb = open("fhb", "wb")
    cgitb.enable(0,"d:\temp")
    fhb.write("abcdef") # try writing non-binary to binary file. Expect an error, of course.

Feed it to python32...

    d:\temp>c:\python32\python.exe test11.py
    Error in sys.excepthook:
    TypeError: 'str' does not support the buffer interface

    Original exception was:
    Traceback (most recent call last):
      File "d:\my\py\test11.py", line 8, in <module>
        fhb.write("abcdef") # try writing non-binary to binary file. Expect an error, of course.
    TypeError: 'str' does not support the buffer interface

So it seems that cgitb can't write to binary files, to report the error? Or how else should I interpret the Error in sys.excepthook ?

So then I tweaked the code for cgitb's enjoyment:

    import sys
    import traceback
    sys.stdout = open("sob", "w", encoding="UTF-8") # WSGI sez data should be binary, so stdout should be binary???
    import cgitb
    sys.stdout.write("out")
    fhb = open("fhb", "wb")
    cgitb.enable(0,"d:\temp")
    fhb.write("abcdef") # try writing non-binary to binary file. Expect an error, of course.

Now I get the following report in the stdout file:

    out --> -->

    A problem occurred in a Python script.

and the following error on the console:

    d:\temp>c:\python32\python.exe test12.py
    Error in sys.excepthook:
    Traceback (most recent call last):
      File "c:\python32\lib\tempfile.py", line 209, in _mkstemp_inner
        fd = _os.open(file, flags, 0o600)
    OSError: [Errno 22] Invalid argument

    Original exception was:
    Traceback (most recent call last):
      File "d:\my\py\test12.py", line 8, in <module>
        fhb.write("abcdef") # try writing non-binary to binary file. Expect an error, of course.
    TypeError: 'str' does not support the buffer interface

I was expecting to see a whole cgitb report in sob, but no such luck. Not sure why it is trying to create a temporary file, but it seems to fail to do that. Of course, the next test would have been to write binary data into fhb, and try to copy it to stdout, which would fail, because stdout has to not be binary to make cgitb work???

That brings me to http.server, the 3.2a4 replacement for CGIHTTPServer. There are definitely some improvements here, and some reported-but-yet-unfixed bugs. And some pitiful missing features, especially on Windows. I applied some of the whacks I had applied to CGIHTTPServer, and got some things working, but, per what I was trying to demonstrate above, there seems to be an incompatibility with the idea of using cgitb (which wants stdout open with some encoding provided) and serving binary files (which wants stdout open in binary) [this latter is supported by the WSGI spec too]. So it seems to me that there are some problems.

Yet, it seems that http.server can somehow accept the data sent by cgitb, which comes from subprocess running my CGI script, but my CGI script fails to be able to copy a binary file to its stdout (a subprocess created PIPE). The subprocess documentation doesn't say what encoding is supplied to the PIPE-created handles, if any, but since cgitb data is accepted but binary file data is not, I infer it must be a non-binary handle, encoding unknown. 
The subprocess documentation doesn't document any way to specify what encoding should be used on the PIPE-created handles, either. So this isn't very enlightening. In the absence of a specification or parameter, I would have expected the PIPEs to be binary, but this seems to be experimentally false.

Yet http.server, when serving plain files, seems to open them in binary mode, and transfer them successfully to the browser. And it can also accept the non-binary?? data from cgitb from my CGI script, and display it in the browser. The former comes from a file it opens in binary mode, and the latter from the subprocess PIPE in unknown mode.

It seems that socketserver opens the socket in "wb" mode, and encodes most data. That, in turn, seems to imply that the binary data from SimpleHTTPServer files are reasonably returned, and I note the headers and such are explicitly encoded before being written to wfile... again, consistent with the socket, wfile, being in binary mode. But the data coming back from the subprocess PIPE from my CGI script seems to be acceptable to be written to wfile also, implying that the PIPEs are binary, as the absence of specifications and parameters, and knowledge of pipes as being bytestreams, would lead one to expect. But then, it would seem that the cgitb output should be in binary to get into the PIPE, but it seems that using a binary stdout makes cgitb fail, in the above experiment... and I can't find any code in cgitb that does explicit encoding.

So I'm confused, and it seems a little extra documentation might help decide which are the modules that have bugs or missing features, and which do not. One of the cgitb outputs from my attempt to serve the binary file claims that my CGI script's output file (which comes from a subprocess PIPE) is a TextIOWrapper with encoding cp1252. Maybe that is the default that comes when a new Python is launched, even though it gets a subprocess PIPE as stdout? 
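One way to check what a subprocess pipe carries is to have a child write raw bytes through sys.stdout.buffer and read them back in the parent. This is a sketch, not a diagnosis of the cgitb behaviour above: it assumes the child's sys.stdout is the usual text wrapper exposing a .buffer attribute, and CHILD and run_child are names invented for the example.

```python
import subprocess
import sys

# Source of the child process. The raw string keeps the backslash escapes
# intact so that the child, not the parent, interprets them.
CHILD = r'''
import sys
# sys.stdout is a text layer; .buffer is the binary stream underneath it.
sys.stdout.buffer.write(b"\x89PNG\r\n")
sys.stdout.buffer.flush()
'''


def run_child():
    """Run the child with its stdout connected to a pipe and return what it wrote."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE)
    data, _ = proc.communicate()   # bytes, since no text mode was requested
    return data


if __name__ == "__main__":
    print(repr(run_child()))
```

Under these assumptions the parent reads bytes from the pipe, and any encoding is applied by the text layer wrapped around sys.stdout inside the child. Separately, "d:\temp" in a normal string literal contains a tab (\t); a raw string such as r"d:\temp" may have been intended, though whether that explains the Errno 22 from tempfile is only a guess.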
-------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Sat Nov 20 05:11:48 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 20 Nov 2010 13:11:48 +0900 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <4CE6FE30.1050903@v.loewis.de> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> Message-ID: <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> "Martin v. Löwis" writes: > The term "UCS-2" is a character set that can encode only encode 65536 > characters; it thus refers to Unicode 1.1. According to the Unicode > Consortium's FAQ, the term UCS-2 should be avoided these days. So what do you propose we call the Python implementation? You can call it "code-unit-oriented" if you like, but in fact it is identical to UCS-2 for all non-hairsplitting purposes. AFAICS the Unicode Consortium deprecates the *term* UCS-2 because they would like us to avoid *implementations* that don't encode the full Unicode character set, not because the term is technically incorrect. Strictly speaking, internally Python only encodes 65536 characters in 2-octet builds. Its (Unicode) string-handling code does not know about surrogates at all, AFAIK, and therefore is not UTF-16 conforming. (The anomalies discussed here are type transformations, not string-handling, for my purpose.) I really don't see why we shouldn't call a UCS-2 implementation by its name. AFAIK this was not supposed to change in Python 3; indexing and slicing go by code unit (isomorphic to UCS-n), not character, and due to PEP 383 4-octet builds do not conform (internally) to UTF-32, and can produce output that conforms to Unicode not at all (as a user option, of course, but it's still non-conformant). > > IMO, we should go back to the Python2 terms UCS2 and UCS4 which > > are correct and provide a clear description of what Python uses > > internally for code units. > > No, we shouldn't. 
The term UCS-2 is deprecated, see above. Too bad for the Unicode Consortium, I say. UCS-2 is the closest term that folks who are not Unicode geeks will have a chance of understanding. I agree with Marc-Andre that "narrow" and "wide" are too ambiguous to be useful. Many people will interpret that as "UTF-16" (or even "UTF-8") and "UTF-32", respectively, which is dead wrong. Others won't have a clue. Using "UCS-2" and "UCS-4" has the correct connotations to Unicode geeks, and they are easy to look up for non-geeks who care about precise definitions. Cf. the second half of the FAQ you quote: Instead, "UCS-2" has sometimes been used in the past to indicate that an implementation does not support supplementary characters and doesn't interpret pairs of surrogate code points as characters. Such an implementation would not handle processing like character properties, codepoint boundaries, collation, etc. for supplementary characters. "Hey, Python, I'm looking at you!" (Strictly speaking, Python libraries do some of that for us, but the Python *language* does not.) 
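[Editorial sketch, not part of the original message.] For readers following the code-unit argument, the sketch below shows how UTF-16 represents a code point outside the Basic Multilingual Plane as two 16-bit code units (a surrogate pair). The helper name utf16_code_units is made up for this illustration. On the narrow builds discussed in this thread, a str stored exactly these units, which is why len(chr(0x10000)) could be 2; CPython 3.3 and later (PEP 393) no longer have narrow builds and report 1.

```python
def utf16_code_units(text):
    """Return the 16-bit code units that UTF-16 uses to encode text."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


bmp_char = chr(0x20AC)       # EURO SIGN, inside the Basic Multilingual Plane
astral_char = chr(0x10000)   # first code point outside the BMP

# One code unit for the BMP character, two (a surrogate pair) for the other.
print([hex(u) for u in utf16_code_units(bmp_char)])
print([hex(u) for u in utf16_code_units(astral_char)])
```

The example goes through an explicit encode step, so it behaves the same regardless of how the running interpreter stores strings internally.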
From brian.curtin at gmail.com Sat Nov 20 05:24:38 2010
From: brian.curtin at gmail.com (Brian Curtin)
Date: Fri, 19 Nov 2010 22:24:38 -0600
Subject: [Python-Dev] [Python-checkins] r86540 - in python/branches/py3k: Parser/asdl_c.py Python/Python-ast.c
In-Reply-To: <20101120020146.25797EE989@mail.python.org>
References: <20101120020146.25797EE989@mail.python.org>
Message-ID:

On Fri, Nov 19, 2010 at 20:01, benjamin.peterson wrote:
> Author: benjamin.peterson
> Date: Sat Nov 20 03:01:45 2010
> New Revision: 86540
>
> Log:
> c89 declarations
>
> Modified:
>    python/branches/py3k/Parser/asdl_c.py
>    python/branches/py3k/Python/Python-ast.c
>
> Modified: python/branches/py3k/Parser/asdl_c.py
> ==============================================================================
> --- python/branches/py3k/Parser/asdl_c.py (original)
> +++ python/branches/py3k/Parser/asdl_c.py Sat Nov 20 03:01:45 2010
> @@ -366,9 +366,9 @@
>          self.emit("obj2ast_%s(PyObject* obj, %s* out, PyArena* arena)" %
> (name, ctype), 0)
>          self.emit("{", 0)
>          self.emit("PyObject* tmp = NULL;", 1)
> +        self.emit("int isinstance;", 1)
>          # Prevent compiler warnings about unused variable.
>          self.emit("tmp = tmp;", 1)
> -        self.emit("int isinstance;", 1)
>          self.emit("", 0)
>
>      def sumTrailer(self, name, add_label=False):
>
> Modified: python/branches/py3k/Python/Python-ast.c
> ==============================================================================
> --- python/branches/py3k/Python/Python-ast.c (original)
> +++ python/branches/py3k/Python/Python-ast.c Sat Nov 20 03:01:45 2010
> @@ -3375,8 +3375,8 @@
>  obj2ast_mod(PyObject* obj, mod_ty* out, PyArena* arena)
>  {
>      PyObject* tmp = NULL;
> -    tmp = tmp;
>      int isinstance;
> +    tmp = tmp;

Windows builds fail due to this change.

From v+python at g.nevcal.com Sat Nov 20 07:56:18 2010
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Fri, 19 Nov 2010 22:56:18 -0800
Subject: [Python-Dev] Web servers, bytes, str, documentation, Python 3.2a4
In-Reply-To: <4CE7452A.7050109@g.nevcal.com>
References: <4CE7452A.7050109@g.nevcal.com>
Message-ID: <4CE77112.3080604@g.nevcal.com>

On 11/19/2010 7:48 PM, Glenn Linderman wrote:
> One of the cgitb outputs from my attempt to serve the binary file
> claims that my CGI script's output file (which comes from a subprocess
> PIPE) is a TextIOWrapper with encoding cp1252. Maybe that is the
> default that comes when a new Python is launched, even though it gets
> a subprocess PIPE as stdout?

So the rather gross code below solves the cp1252 stdout problem, and also permits both strings and bytes to be written to the same file, although those two features are separable. But now that I've worked around it, it seems that subprocess should somehow ensure that launched Python programs know they are working on a binary stream? Of course, not all programs launched are Python programs... so maybe it should be a documentation issue, but it seems to be missing from the documentation.
#####################################
import sys

if sys.version_info[ 0 ] == 2:
    class IOMix():
        def __init__( self, fh, encoding="UTF-8"):
            self.fh = fh
            # Remember the encoding; write() needs it for unicode input.
            self.encoding = encoding
        def write( self, param ):
            if isinstance( param, unicode ):
                self.fh.write( param.encode( self.encoding ))
            else:
                self.fh.write( param )
#####################################
if sys.version_info[ 0 ] == 3:
    class IOMix():
        def __init__( self, fh, encoding="UTF-8"):
            if hasattr( fh, 'buffer'):
                self.bio = fh.buffer
                fh.flush()
                self.last = 'b'
                import io
                self.txt = io.TextIOWrapper( self.bio, encoding, None, '\r\n')
            else:
                raise ValueError("not a buffered stream")
        def write( self, param ):
            if isinstance( param, str ):
                self.last = 't'
                self.txt.write( param )
            else:
                # Flush pending text before writing raw bytes, to keep order.
                if self.last == 't':
                    self.txt.flush()
                self.last = 'b'
                self.bio.write( param )
#####################################

From martin at v.loewis.de Sat Nov 20 10:05:38 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sat, 20 Nov 2010 10:05:38 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4CE78F62.7060707@v.loewis.de>

On 20.11.2010 at 05:11, Stephen J. Turnbull wrote:
> "Martin v. Löwis" writes:
>
>  > The term "UCS-2" is a character set that can encode only 65536
>  > characters; it thus refers to Unicode 1.1. According to the Unicode
>  > Consortium's FAQ, the term UCS-2 should be avoided these days.
>
> So what do you propose we call the Python implementation?

A technically correct description would be to say that Python uses either 16-bit code units or 32-bit code units; for brevity, these can be called narrow and wide code units.

> Strictly speaking, internally Python only encodes 65536 characters in
> 2-octet builds.
> Its (Unicode) string-handling code does not know
> about surrogates at all, AFAIK

Here you are mistaken: it does indeed know about UTF-16 and surrogates in several places, e.g. in the UTF-8 codec, or in the repr() implementation; likewise in the parser.

> and therefore is not UTF-16 conforming.

I disagree. Python does "conform" to "UTF-16" (certainly in the sense that no UTF-16 specification ever mandates a certain Python API, and that Python follows all general requirements of the UTF-16 specification).

> AFAIK this was not supposed to change in Python 3; indexing and
> slicing go by code unit (isomorphic to UCS-n), not character, and due
> to PEP 383 4-octet builds do not conform (internally) to UTF-32, and
> can produce output that conforms to Unicode not at all (as a user
> option, of course, but it's still non-conformant).

What behavior specifically do you consider non-conforming, and what specific specification do you think it is not conforming to? For example, it *is* fully conforming with UTF-8.

Regards,
Martin

From merwok at netwok.org Sat Nov 20 12:38:53 2010
From: merwok at netwok.org (Éric Araujo)
Date: Sat, 20 Nov 2010 12:38:53 +0100
Subject: [Python-Dev] Web servers, bytes, str, documentation, Python 3.2a4
In-Reply-To: <4CE7452A.7050109@g.nevcal.com>
References: <4CE7452A.7050109@g.nevcal.com>
Message-ID: <4CE7B34D.4020309@netwok.org>

Hello

> cgitb.enable(0,"d:\temp")

Isn't that expanded to "d:[tab]emp"?

From ncoghlan at gmail.com Sat Nov 20 14:16:27 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 20 Nov 2010 23:16:27 +1000
Subject: [Python-Dev] [Python-checkins] pymigr: Build identification patch is updated, but only for Unix.
In-Reply-To:
References:
Message-ID:

On Sat, Nov 20, 2010 at 6:02 PM, georg.brandl wrote:
> georg.brandl pushed abd0dc1328ce to pymigr:
>
> http://hg.python.org/pymigr/rev/abd0dc1328ce
> changeset:   70:abd0dc1328ce
> tag:         tip
> user:        Georg Brandl
> date:        Sat Nov 20 09:01:03 2010 +0100
> summary:     Build identification patch is updated, but only for Unix.
> files:       todo.txt

Does this repository use the same set of hooks as distutils2? (I'm hoping not, since if it does, my change to the email hook didn't work...)

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Sat Nov 20 14:55:57 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 20 Nov 2010 23:55:57 +1000
Subject: [Python-Dev] Mercurial Schedule
In-Reply-To:
References: <4CE2CF8F.4040500@jcea.es> <4CE55385.6080002@v.loewis.de> <4CE56331.3050508@v.loewis.de> <4CE5DD52.7050907@jcea.es> <20101119094657.1a7cc24a@mission>
Message-ID:

On Sat, Nov 20, 2010 at 2:51 AM, Georg Brandl wrote:
> I'm at it. In fact, I think I will merge both todo.txt and tasks.txt
> into the PEP. It's not more of a burden to update it there, and it's
> more visible to the developer community.

The latest checkin was definitely an improvement (especially the updated timeline).

According to the PEP, the .hgeol rules aren't currently enforced server side - having such a hook in place before Hg went live was definitely one of the things we agreed on before the hgeol extension even existed in a usable form.

For fixing whitespace issues (another open question mentioned in the PEP), "make patchcheck" can continue to handle that - no need to create a Hg specific extension for it.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |
Brisbane, Australia

From ncoghlan at gmail.com Sat Nov 20 16:21:32 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 21 Nov 2010 01:21:32 +1000
Subject: [Python-Dev] [Python-checkins] r86566 - in python/branches/py3k: Doc/glossary.rst Doc/library/inspect.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS Misc/python-wing4.wpr
In-Reply-To: <20101120150731.2D346E78E@mail.python.org>
References: <20101120150731.2D346E78E@mail.python.org>
Message-ID:

On Sun, Nov 21, 2010 at 1:07 AM, michael.foord wrote:
> +Fetching attributes statically
> +------------------------------
> +
> +Both :func:`getattr` and :func:`hasattr` can trigger code execution when
> +fetching or checking for the existence of attributes. Descriptors, like
> +properties, will be invoked and :meth:`__getattr__` and :meth:`__getattribute__`
> +may be called.
> +
> +For cases where you want passive introspection, like documentation tools, this
> +can be inconvenient. `getattr_static` has the same signature as :func:`getattr`
> +but avoids executing code when it fetches attributes.

This description feels a little strong to me - getattr_static still executes all those things on the metaclass as it retrieves the information it needs to do the "static" lookup. Leaving this original description (which assumes metaclass=type) alone and adding a note near the end of the section to say that metaclass code is still executed might be an improvement.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |
Brisbane, Australia

From fuzzyman at voidspace.org.uk Sat Nov 20 16:29:13 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sat, 20 Nov 2010 15:29:13 +0000
Subject: [Python-Dev] [Python-checkins] r86566 - in python/branches/py3k: Doc/glossary.rst Doc/library/inspect.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS Misc/python-wing4.wpr
In-Reply-To:
References: <20101120150731.2D346E78E@mail.python.org>
Message-ID: <4CE7E949.5030300@voidspace.org.uk>

On 20/11/2010 15:21, Nick Coghlan wrote:
> On Sun, Nov 21, 2010 at 1:07 AM, michael.foord wrote:
>> +Fetching attributes statically
>> +------------------------------
>> +
>> +Both :func:`getattr` and :func:`hasattr` can trigger code execution when
>> +fetching or checking for the existence of attributes. Descriptors, like
>> +properties, will be invoked and :meth:`__getattr__` and :meth:`__getattribute__`
>> +may be called.
>> +
>> +For cases where you want passive introspection, like documentation tools, this
>> +can be inconvenient. `getattr_static` has the same signature as :func:`getattr`
>> +but avoids executing code when it fetches attributes.
> This description feels a little strong to me - getattr_static still
> executes all those things on the metaclass as it retrieves the
> information it needs to do the "static" lookup. Leaving this original
> description (which assumes metaclass=type) alone and adding a note
> near the end of the section to say that metaclass code is still
> executed might be an improvement.

Can you give an example of code in a metaclass that may be executed by getattr_static? It's not that I don't believe you, I just can't think of an example. Looking up the class and the mro are the only two examples I can think of (klass.__mro__ and instance.__class__ - and they are noted in the docs?) but aren't metaclass specific.

Michael

> Cheers,
> Nick.
>

--
http://www.voidspace.org.uk/

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

From solipsis at pitrou.net Sat Nov 20 16:42:30 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 20 Nov 2010 16:42:30 +0100
Subject: [Python-Dev] r86570 - in python/branches/py3k: Lib/unittest/case.py Lib/unittest/test/test_case.py Misc/NEWS
References: <20101120153426.47AC0ED9A@mail.python.org>
Message-ID: <20101120164230.5dc326bc@pitrou.net>

On Sat, 20 Nov 2010 16:34:26 +0100 (CET)
michael.foord wrote:
> +
> +    def testPickle(self):
> +        # Issue 10326
> +
> +        # Can't use TestCase classes defined in Test class as
> +        # pickle does not work with inner classes
> +        test = unittest.TestCase('run')
> +        for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
> +
> +            # blew up prior to fix
> +            pickled_test = pickle.dumps(test, protocol=protocol)

You must also check that the object can be unpickled, otherwise making TestCase picklable is not only pointless, but misleading to the user. Other classes which claim to be picklable (such as e.g. io.BytesIO) are careful to check that unpickling works fine and produces a usable object.

Regards

Antoine.
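[Editorial sketch, not part of the original message.] A round-trip check along the lines requested above might look like the sketch below. It is an illustration under the assumption of a current Python 3 interpreter, not the code that was actually committed; the variable names are chosen for this example only.

```python
import pickle
import unittest

# Same construction as in the quoted test: a bare TestCase bound to 'run'.
test = unittest.TestCase("run")

for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    pickled = pickle.dumps(test, protocol=protocol)
    restored = pickle.loads(pickled)
    # TestCase equality compares the type and the test method name.
    assert restored == test
    # Exercise the restored object to check that it is usable, not just
    # loadable: this goes through the type-specific equality lookup.
    restored.assertEqual(set(), set())

print("round trip ok for protocols 0..%d" % pickle.HIGHEST_PROTOCOL)
```

Calling an assertion method on the restored instance is the part that distinguishes "can be loaded" from "produces a usable object".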
From fuzzyman at voidspace.org.uk Sat Nov 20 16:48:59 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sat, 20 Nov 2010 15:48:59 +0000
Subject: [Python-Dev] r86570 - in python/branches/py3k: Lib/unittest/case.py Lib/unittest/test/test_case.py Misc/NEWS
In-Reply-To: <20101120164230.5dc326bc@pitrou.net>
References: <20101120153426.47AC0ED9A@mail.python.org> <20101120164230.5dc326bc@pitrou.net>
Message-ID: <4CE7EDEB.9080706@voidspace.org.uk>

On 20/11/2010 15:42, Antoine Pitrou wrote:
> On Sat, 20 Nov 2010 16:34:26 +0100 (CET)
> michael.foord wrote:
>> +
>> +    def testPickle(self):
>> +        # Issue 10326
>> +
>> +        # Can't use TestCase classes defined in Test class as
>> +        # pickle does not work with inner classes
>> +        test = unittest.TestCase('run')
>> +        for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
>> +
>> +            # blew up prior to fix
>> +            pickled_test = pickle.dumps(test, protocol=protocol)
> You must also check that the object can be unpickled, otherwise
> making TestCase picklable is not only pointless, but misleading to the
> user. Other classes which claim to be picklable (such as e.g.
> io.BytesIO) are careful to check that unpickling works fine and
> produces a usable object.

Well, given the *particular* bug it is fixing, ensuring that the TestCase instances can be pickled is enough. If they fail to unpickle that is a bug in pickle and not in unittest. *However*, the test is very easy to extend to what you suggest so I have done it.

All the best,

Michael

> Regards
>
> Antoine.
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk

--
http://www.voidspace.org.uk/

From solipsis at pitrou.net Sat Nov 20 16:59:49 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 20 Nov 2010 16:59:49 +0100
Subject: [Python-Dev] r86570 - in python/branches/py3k: Lib/unittest/case.py Lib/unittest/test/test_case.py Misc/NEWS
In-Reply-To: <4CE7EDEB.9080706@voidspace.org.uk>
References: <20101120153426.47AC0ED9A@mail.python.org> <20101120164230.5dc326bc@pitrou.net> <4CE7EDEB.9080706@voidspace.org.uk>
Message-ID: <1290268789.3560.12.camel@localhost.localdomain>

On Saturday 20 November 2010 at 15:48 +0000, Michael Foord wrote:
> On 20/11/2010 15:42, Antoine Pitrou wrote:
> > On Sat, 20 Nov 2010 16:34:26 +0100 (CET)
> > michael.foord wrote:
> >> +
> >> +    def testPickle(self):
> >> +        # Issue 10326
> >> +
> >> +        # Can't use TestCase classes defined in Test class as
> >> +        # pickle does not work with inner classes
> >> +        test = unittest.TestCase('run')
> >> +        for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
> >> +
> >> +            # blew up prior to fix
> >> +            pickled_test = pickle.dumps(test, protocol=protocol)
> > You must also check that the object can be unpickled, otherwise
> > making TestCase picklable is not only pointless, but misleading to the
> > user. Other classes which claim to be picklable (such as e.g.
> > io.BytesIO) are careful to check that unpickling works fine and
> > produces a usable object.
>
> Well, given the *particular* bug it is fixing, ensuring that the
> TestCase instances can be pickled is enough. If they fail to unpickle
> that is a bug in pickle and not in unittest.

It wouldn't be, no. pickle provides several different APIs to ensure that state gets correctly stored *and* restored, but it's up to application classes such as TestCase to ensure that they implement those APIs correctly for the intended behaviour. Therefore, checking that pickling "works" fine (or, rather, seems to work) is only half of the job.

(for example, if you define a __getstate__, chances are you must define a __setstate__ too, and it is your job to make it work properly)

Antoine.

From ncoghlan at gmail.com Sat Nov 20 17:01:06 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 21 Nov 2010 02:01:06 +1000
Subject: [Python-Dev] [Python-checkins] r86566 - in python/branches/py3k: Doc/glossary.rst Doc/library/inspect.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS Misc/python-wing4.wpr
In-Reply-To: <4CE7E949.5030300@voidspace.org.uk>
References: <20101120150731.2D346E78E@mail.python.org> <4CE7E949.5030300@voidspace.org.uk>
Message-ID:

On Sun, Nov 21, 2010 at 1:29 AM, Michael Foord wrote:
> Can you give an example of code in a metaclass that may be executed by
> getattr_static? It's not that I don't believe you, I just can't think of an
> example. Looking up the class and the mro are the only two examples I can
> think of (klass.__mro__ and instance.__class__ - and they are noted in the
> docs?) but aren't metaclass specific.

The description heavily implies that arbitrary Python code won't be executed by calling getattr_static, and that isn't necessarily true. It's almost certain to be true in the case when the metaclass is type, but can't be guaranteed otherwise. The retrieval of __class__ is a normal lookup on the object, so it can trigger all of the things getattr_static is trying to avoid (unavoidable if you want to support proxy classes at all), and the lookup of __mro__ invokes all of those things on the metaclass.

I'll see if I'm still of the same opinion after I sleep on it, but my first impression of the docs was that they slightly oversold the strength of the "doesn't execute arbitrary code" aspect of the new function. The existing caveats were all relating to when getattr() and getattr_static() might give different answers, while the additional caveats I was suggesting related to cases where arbitrary code may still be executed.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From fuzzyman at voidspace.org.uk Sat Nov 20 17:06:59 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sat, 20 Nov 2010 16:06:59 +0000
Subject: [Python-Dev] [Python-checkins] r86566 - in python/branches/py3k: Doc/glossary.rst Doc/library/inspect.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS Misc/python-wing4.wpr
In-Reply-To:
References: <20101120150731.2D346E78E@mail.python.org> <4CE7E949.5030300@voidspace.org.uk>
Message-ID: <4CE7F223.5040009@voidspace.org.uk>

On 20/11/2010 16:01, Nick Coghlan wrote:
> On Sun, Nov 21, 2010 at 1:29 AM, Michael Foord wrote:
>> Can you give an example of code in a metaclass that may be executed by
>> getattr_static? It's not that I don't believe you, I just can't think of an
>> example. Looking up the class and the mro are the only two examples I can
>> think of (klass.__mro__ and instance.__class__ - and they are noted in the
>> docs?) but aren't metaclass specific.
> The description heavily implies that arbitrary Python code won't be
> executed by calling getattr_static, and that isn't necessarily true.
> It's almost certain to be true in the case when the metaclass is type,
> but can't be guaranteed otherwise.
Given the way that member lookups are done by getattr_static I don't think any assumptions about the metaclass are made. I'm happy to be proven wrong (but would rather fix it than document it as an exception).

(Actually we assume the metaclass doesn't use __slots__, but only because it isn't *possible* for a metaclass to use __slots__.)

> The retrieval of __class__ is a
> normal lookup on the object, so it can trigger all of the things
> getattr_static is trying to avoid (unavoidable if you want to support
> proxy classes at all), and the lookup of __mro__ invokes all of those
> things on the metaclass.

__class__ and mro lookup are noted in the docs as being exceptions. We could actually remove the __class__ lookup from the list of exceptions by using type(...) instead of obj.__class__.

> I'll see if I'm still of the same opinion after I sleep on it, but my
> first impression of the docs was that they slightly oversold the
> strength of the "doesn't execute arbitrary code" aspect of the new
> function. The existing caveats were all relating to when getattr() and
> getattr_static() might give different answers, while the additional
> caveats I was suggesting related to cases where arbitrary code may
> still be executed.

I'm happy to change the wording to make the promise less strong.

All the best,

Michael

> Cheers,
> Nick.
>

--
http://www.voidspace.org.uk/

From fuzzyman at voidspace.org.uk Sat Nov 20 17:10:42 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sat, 20 Nov 2010 16:10:42 +0000
Subject: [Python-Dev] r86570 - in python/branches/py3k: Lib/unittest/case.py Lib/unittest/test/test_case.py Misc/NEWS
In-Reply-To: <1290268789.3560.12.camel@localhost.localdomain>
References: <20101120153426.47AC0ED9A@mail.python.org> <20101120164230.5dc326bc@pitrou.net> <4CE7EDEB.9080706@voidspace.org.uk> <1290268789.3560.12.camel@localhost.localdomain>
Message-ID: <4CE7F302.8090909@voidspace.org.uk>

On 20/11/2010 15:59, Antoine Pitrou wrote:
> On Saturday 20 November 2010 at 15:48 +0000, Michael Foord wrote:
>> On 20/11/2010 15:42, Antoine Pitrou wrote:
>>> On Sat, 20 Nov 2010 16:34:26 +0100 (CET)
>>> michael.foord wrote:
>>>> +
>>>> +    def testPickle(self):
>>>> +        # Issue 10326
>>>> +
>>>> +        # Can't use TestCase classes defined in Test class as
>>>> +        # pickle does not work with inner classes
>>>> +        test = unittest.TestCase('run')
>>>> +        for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
>>>> +
>>>> +            # blew up prior to fix
>>>> +            pickled_test = pickle.dumps(test, protocol=protocol)
>>> You must also check that the object can be unpickled, otherwise
>>> making TestCase picklable is not only pointless, but misleading to the
>>> user. Other classes which claim to be picklable (such as e.g.
>>> io.BytesIO) are careful to check that unpickling works fine and
>>> produces a usable object.
>> Well, given the *particular* bug it is fixing, ensuring that the
>> TestCase instances can be pickled is enough. If they fail to unpickle
>> that is a bug in pickle and not in unittest.
> It wouldn't be, no. pickle provides several different APIs to ensure
> that state gets correctly stored *and* restored, but it's up to
> application classes such as TestCase to ensure that they implement those
> APIs correctly for the intended behaviour. Therefore, checking that
> pickling "works" fine (or, rather, seems to work) is only half of the
> job.
>
> (for example, if you define a __getstate__, chances are you must define
> a __setstate__ too, and it is your job to make it work properly)

Yes, but unittest.TestCase doesn't implement any of those APIs (and if we did we would *definitely* need to test unpickling).

That aside I have extended the test in the way you suggest.

Actually it would be nice to implement custom pickling / unpickling methods to allow Python 2.7 / 3.2 pickled TestCases to be unpickled on earlier versions of Python. I couldn't see how to change the class name in the pickle using the pickle protocol methods. Suggestions welcomed.

Michael

> Antoine.

--
http://www.voidspace.org.uk/
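[Editorial sketch, not part of the original message.] For readers unfamiliar with the pickle hooks mentioned in this exchange, the sketch below shows why __getstate__ and __setstate__ usually come as a pair. The Counter class is a made-up toy, not anything from unittest: it holds a lock, which pickle cannot store, so __getstate__ drops it and __setstate__ has to recreate it for the restored object to be usable.

```python
import pickle
import threading


class Counter:
    """Toy class (hypothetical) holding state that pickle cannot store."""

    def __init__(self, start=0):
        self.value = start
        self._lock = threading.Lock()   # lock objects are not picklable

    def increment(self):
        with self._lock:
            self.value += 1

    def __getstate__(self):
        # Store everything except the lock.
        state = self.__dict__.copy()
        del state["_lock"]
        return state

    def __setstate__(self, state):
        # Restore the stored attributes and recreate the lock, so the
        # unpickled object is usable and not merely loadable.
        self.__dict__.update(state)
        self._lock = threading.Lock()


original = Counter(41)
restored = pickle.loads(pickle.dumps(original))
restored.increment()   # relies on the lock recreated in __setstate__
print(restored.value)
```

If __setstate__ were omitted here, loading would still succeed, but the first call to increment() would fail because the restored instance has no _lock attribute. That is the "only half of the job" situation described above.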
From fuzzyman at voidspace.org.uk Sat Nov 20 17:28:40 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sat, 20 Nov 2010 16:28:40 +0000
Subject: [Python-Dev] [Python-checkins] r86566 - in python/branches/py3k: Doc/glossary.rst Doc/library/inspect.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS Misc/python-wing4.wpr
In-Reply-To: <4CE7F223.5040009@voidspace.org.uk>
References: <20101120150731.2D346E78E@mail.python.org> <4CE7E949.5030300@voidspace.org.uk> <4CE7F223.5040009@voidspace.org.uk>
Message-ID: <4CE7F738.90706@voidspace.org.uk>

On 20/11/2010 16:06, Michael Foord wrote:
> On 20/11/2010 16:01, Nick Coghlan wrote:
> [snip...]
>> The retrieval of __class__ is a
>> normal lookup on the object, so it can trigger all of the things
>> getattr_static is trying to avoid (unavoidable if you want to support
>> proxy classes at all), and the lookup of __mro__ invokes all of those
>> things on the metaclass.
>
> __class__ and mro lookup are noted in the docs as being exceptions. We
> could actually remove the __class__ lookup from the list of exceptions
> by using type(...) instead of obj.__class__.
>

Done.

>> I'll see if I'm still of the same opinion after I sleep on it, but my
>> first impression of the docs was that they slightly oversold the
>> strength of the "doesn't execute arbitrary code" aspect of the new
>> function. The existing caveats were all relating to when getattr() and
>> getattr_static() might give different answers, while the additional
>> caveats I was suggesting related to cases where arbitrary code may
>> still be executed.
> I'm happy to change the wording to make the promise less strong.

I've also removed the __mro__ exception. This is done with:

    type.__dict__['__mro__'].__get__(klass)

If you can think of any other exceptions then please let me know.

Michael

> All the best,
>
> Michael
>
>> Cheers,
>> Nick.
>>
>
>

--
http://www.voidspace.org.uk/

From v+python at g.nevcal.com Sat Nov 20 19:19:11 2010
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Sat, 20 Nov 2010 10:19:11 -0800
Subject: [Python-Dev] Web servers, bytes, str, documentation, Python 3.2a4
In-Reply-To: <4CE7B34D.4020309@netwok.org>
References: <4CE7452A.7050109@g.nevcal.com> <4CE7B34D.4020309@netwok.org>
Message-ID: <4CE8111F.9060502@g.nevcal.com>

On 11/20/2010 3:38 AM, Éric Araujo wrote:
> Hello
>
>> cgitb.enable(0,"d:\temp")
> Isn't that expanded to "d:[tab]emp"?
>

Oops. Yes, that fixes the problem with creation of the temp file, thanks for catching that. I now get a complete report of the original error in the temp file (below).

I am a bit less confused now... but it seems that there are still a number of issues. Here is an enumeration of problems I was hard pressed to make before you removed my confusion on this issue.

1. cgitb should expect to report to a binary stdout, using whatever encoding (possibly ASCII) that seems appropriate for the output that it generates.

2. Some appropriate documentation or API or both should be provided to enable a script to set "binary" mode for stdout for CGI scripts. This link demonstrates the confusion (wish I had found it earlier) that is encountered by such lack.
One must tell msvcrt the stream is binary (I had figured that out early on), one must also sidestep the use of the cp1252 default when printing binary, one must also choose a proper text encoding corresponding to the HTTP headers sent. My second email in this thread, sent a few hours after the first, shows a convenient set of cures for all but msvcrt (as long as only "write" is used for writing. "print" support could be added, similarly). Likely something along this line is needed for stdin as well, I haven't yet experimented with uploading binary content to a CGI. One could speculate about having the Python runtime auto-detect CGI mode, but I don't know of any foolproof technique for that, and the selection of the "proper" text encoding depends on the details of the CGI, so having instead an API or two that assists with doing this sort of thing would be better; the need for documentation, at least, seems imperative. 3. subprocess documentation could be improved to point out that when using subprocess.PIPE to talk to a Python subprocess, that the communications will be in binary. Again, I don't know of any way to autodetect the subprocess environment, but if it were possible to select an appropriate encoding and use it consistently on both sides of the PIPE, that would be a convenience to its use; if not possible, documenting the issue, and providing an API to use to easily select such encodings both in client and server, would be helpful. While the layers are all there, and ".buffer" is documented for TextIOWrapper, the use of sys.stdout.buffer and the fact that it has a full set of operations isn't immediately obvious from the reference material; perhaps it is in a tutorial I haven't found, but... I was looking, and didn't find it. Of course, subprocess may launch non-Python programs; they will have their own ideas of binary vs text encoding, so it is important that it is convenient to match them on the Python side. 
It would be nice if subprocess had a mechanism for providing no-deadlock stdout data to the parent prior to the child terminating. A CGI implementation via subprocess shouldn't accumulate all of stdout (or all of stderr, for that matter, although less important). I don't (yet) know enough about Python threading to know if this is possible, but it certainly would be useful. 4. http.server has a number of bugs and limitations. 4a. _url_collapse_path_split seems inefficient (although I have to benchmark it against what I think would be more efficient), and for its only use within http.server it produces the wrong information, so the information has to be recombined and resplit to make it function properly, adding to the perception of inefficiency. 4b. Detection of "executable" on Windows is simply wrong. Unix execution bits do not exist. 4c. is_cgi doesn't properly handle PATHINFO parts of the path, this is the other half of 4a. The Python2.x CGIHTTPServer.py had this right, but the introduction and use of _url_collapse_path_split broke it. 4d. Searching for a ? to find an explicit query string should use .find('?') rather than .rfind('?') as there is no prohibition on using '?' within a query string, AFAIK. 4e. doesn't set the REQUEST_URI, HTTP_HOST, or HTTP_PORT environment variables for the CGI. 4f. Should not send the 200 response until it sees if the CGI sends a Status: header. 4g. Should not buffer all of stdout: subprocess.communicate is inappropriate for a web server CGI interface. The data should stream through to avoid consuming inordinate amounts of memory. The only solution within the current limitations of subprocess is to abandon stderr, force the CGI to do its own error logging, and use shutil.copyfileobj to hook up p.stdout to self.wfile once the Status: message processing has happened. 4h. 
Doesn't seem to close p.stdin (I'm not sure if that is necessary, it may happen when p is garbage collected, but effort was made to close p.stdout and p.stderr, which seem similar.)

*TypeError*
Python 3.2a4: c:\python32\python.exe
Sat Nov 20 09:28:41 2010

A problem occurred in a Python script. Here is the sequence of function calls leading up to the error, in the order they occurred.

d:\my\py\test12.py in **()
    4 import cgitb
    5 sys.stdout.write("out")
    6 fhb = open("fhb", "wb")
    7 cgitb.enable(0,"d:\\temp")
 => 8 fhb.write("abcdef") # try writing non-binary to binary file. Expect an error, of course.
*fhb* = <_io.BufferedWriter name='fhb'>, fhb.*write* =

*TypeError*: 'str' does not support the buffer interface
    args = ("'str' does not support the buffer interface",)
    with_traceback =

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alexander.belopolsky at gmail.com Sat Nov 20 23:32:28 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 20 Nov 2010 17:32:28 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <4CE78F62.7060707@v.loewis.de>
References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de>
Message-ID:

On Sat, Nov 20, 2010 at 4:05 AM, "Martin v. Löwis" wrote:
..
> A technical correct description would be to say that Python uses either
> 16-bit code units or 32-bit code units; for brevity, these can be called
> narrow and wide code units.

+1

PEP 261 introduced terms "wide Py_UNICODE" and "narrow Py_UNICODE," but when discussion is at Python level, I don't think we should use names of C typedefs. I think "wide/narrow Unicode" builds describe the two options clearly and unambiguously. I prefer Python-specific terminology to Unicode terms because in Python reference documentation we often discuss details that are outside of the scope of Unicode Standard.
For example, interpretation of lone surrogates on narrow builds is one such detail.

From ziade.tarek at gmail.com Sun Nov 21 00:05:12 2010
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Sun, 21 Nov 2010 00:05:12 +0100
Subject: [Python-Dev] Reminder: Distutils vs Distutils2
Message-ID:

Hello,

I have seen some efforts recently to improve Distutils in the standard library. Just a quick reminder of the status of Distutils: it's frozen and is just being bug fixed at this time. The work I did last year was reverted and pushed to Distutils2. A lot of work has been done since then, and we had 4 GSOC students working this summer on Distutils2. It's backward-incompatible, so we can remove the things we don't like and add new things w/o suffering from backward compatibility pains.

So if you want to improve the tool, or if you have some pending changes to Distutils, I would encourage you to join the Distutils2 effort and not to waste time on Distutils anymore. The patches that did not make it to Distutils can still be added in Distutils2, for most of them.

The workflow we currently use to change the code is as follows and makes it easy for everyone to contribute:

1. clone http://bitbucket.org/tarek/distutils2
2. discuss / propose a patch on IRC (#distutils - Freenode) or on the dedicated mailing list (http://groups.google.com/group/the-fellowship-of-the-packaging)
3. I review and merge all changes at bitbucket, then push them on http://hg.python.org/distutils2

Crazy ideas are welcome. "setup.py" is gone in d2 for instance ;)

Thanks !

Regards.
Tarek

--
Tarek Ziadé | http://ziade.org

From ziade.tarek at gmail.com Sun Nov 21 00:15:41 2010
From: ziade.tarek at gmail.com (Tarek Ziadé)
Date: Sun, 21 Nov 2010 00:15:41 +0100
Subject: [Python-Dev] Reminder: Distutils vs Distutils2
In-Reply-To:
References:
Message-ID:

On Sun, Nov 21, 2010 at 12:05 AM, Tarek Ziadé wrote:
..
> Crazy ideas are welcome.
"setup.py" is gone in d2 for instance ;) But you can still use a similar form if you want - just to mention From ncoghlan at gmail.com Sun Nov 21 04:52:19 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 21 Nov 2010 13:52:19 +1000 Subject: [Python-Dev] [Python-checkins] r86566 - in python/branches/py3k: Doc/glossary.rst Doc/library/inspect.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS Misc/python-wing4.wpr In-Reply-To: <4CE7F223.5040009@voidspace.org.uk> References: <20101120150731.2D346E78E@mail.python.org> <4CE7E949.5030300@voidspace.org.uk> <4CE7F223.5040009@voidspace.org.uk> Message-ID: On Sun, Nov 21, 2010 at 2:06 AM, Michael Foord wrote: >> I'll see if I'm still of the same opinion after I sleep on it, but my >> first impression of the docs was that they slightly oversold the >> strength of the "doesn't execute arbitrary code" aspect of the new >> function. The existing caveats were all relating to when getattr() and >> getattr_static() might give different answers, while the additional >> caveats I was suggesting related to cases where arbitrary code may >> still be executed. > > I'm happy to change the wording to make the promise less strong. Your latest changes may have actually made the stronger wording accurate (I certainly can't think of any loopholes off the top of my head). If you did still want to soften the wording, I'd be inclined to replace the word "avoids" with "minimises" in the appropriate places. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia

From ncoghlan at gmail.com Sun Nov 21 04:54:11 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 21 Nov 2010 13:54:11 +1000
Subject: [Python-Dev] [Python-checkins] r86566 - in python/branches/py3k: Doc/glossary.rst Doc/library/inspect.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS Misc/python-wing4.wpr
In-Reply-To: <20101120150731.2D346E78E@mail.python.org>
References: <20101120150731.2D346E78E@mail.python.org>
Message-ID:

On Sun, Nov 21, 2010 at 1:07 AM, michael.foord wrote:
> Author: michael.foord
> Date: Sat Nov 20 16:07:30 2010
> New Revision: 86566
>
> Log:
> Issue 9732: addition of getattr_static to the inspect module
>
> Modified:
>   python/branches/py3k/Doc/glossary.rst
>   python/branches/py3k/Doc/library/inspect.rst
>   python/branches/py3k/Lib/inspect.py
>   python/branches/py3k/Lib/test/test_inspect.py
>   python/branches/py3k/Misc/NEWS
>   python/branches/py3k/Misc/python-wing4.wpr

Unrelated to my previous comment - when adding inspect.getgeneratorstate, I noticed that inspect.getattr_static isn't mentioned in the 3.2 What's New yet (I put a XXX placeholder in for you/Raymond).

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From v+python at g.nevcal.com Sun Nov 21 08:52:45 2010
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Sat, 20 Nov 2010 23:52:45 -0800
Subject: [Python-Dev] Web servers, bytes, str, documentation, Python 3.2a4
In-Reply-To: <4CE8111F.9060502@g.nevcal.com>
References: <4CE7452A.7050109@g.nevcal.com> <4CE7B34D.4020309@netwok.org> <4CE8111F.9060502@g.nevcal.com>
Message-ID: <4CE8CFCD.4040906@g.nevcal.com>

On 11/20/2010 10:19 AM, Glenn Linderman wrote:
> Oops. Yes, that fixes the problem with creation of the temp file,
> thanks for catching that. I now get a complete report of the
> original error in the temp file (below). I am a bit less confused
> now... but it seems that there are still a number of issues.
Here is > an enumeration of problems I was hard pressed to make before you > removed my confusion on this issue. Related issues, regarding binary stream requirements for cgi interface. Perhaps the cgi module should have the API to set binary mode. http://bugs.python.org/issue1610654 http://bugs.python.org/issue8077 http://bugs.python.org/issue4953 Sadly, cgi.py input handling seems to depend on the email module, thought to be fixed for 3.2, but it is not clear if that has been achieved, or if the surrogate encode workaround is sufficient for this. More testing needed, but I don't have such a test case developed yet. > 1. cgitb should expect to report to a binary stdout, using whatever > encoding (possibly ASCII) that seems appropriate for the output that > in generates. Maybe cgi.py should have an API to set the stdin and stdout to binary streams. Although cgi.py deals more with stdin than stdout, cgitb deals more with stdout. Created http://bugs.python.org/issue10479 > > 2. Some appropriate documentation or API or both should be provided to > enable a script to set "binary" mode for stdout for CGI scripts. This > link > > demonstrates the confusion (wish I had found it earlier) that is > encountered by such lack. One must tell msvcrt the stream is binary > (I had figured that out early on), one must also sidestep the use of > the cp1252 default when printing binary, one must also choose a proper > text encoding corresponding to the HTTP headers sent. My second email > in this thread, sent a few hours after the first, shows a convenient > set of cures for all but msvcrt (as long as only "write" is used for > writing. "print" support could be added, similarly). Likely > something along this line is needed for stdin as well, I haven't yet > experimented with uploading binary content to a CGI. 
> > One could speculate about having the Python runtime auto-detect CGI > mode, but I don't know of any foolproof technique for that, and the > selection of the "proper" text encoding depends on the details of the > CGI, so having instead an API or two that assists with doing this sort > of thing would be better; the need for documentation, at least, seems > imperative. Created http://bugs.python.org/issue10480 > > 3. subprocess documentation could be improved to point out that when > using subprocess.PIPE to talk to a Python subprocess, that the > communications will be in binary. Again, I don't know of any way to > autodetect the subprocess environment, but if it were possible to > select an appropriate encoding and use it consistently on both sides > of the PIPE, that would be a convenience to its use; if not possible, > documenting the issue, and providing an API to use to easily select > such encodings both in client and server, would be helpful. > > While the layers are all there, and ".buffer" is documented for > TextIOWrapper, the use of sys.stdout.buffer and the fact that it has a > full set of operations isn't immediately obvious from the reference > material; perhaps it is in a tutorial I haven't found, but... I was > looking, and didn't find it. > > Of course, subprocess may launch non-Python programs; they will have > their own ideas of binary vs text encoding, so it is important that it > is convenient to match them on the Python side. > > It would be nice if subprocess had a mechanism for providing > no-deadlock stdout data to the parent prior to the child terminating. > A CGI implementation via subprocess shouldn't accumulate all of stdout > (or all of stderr, for that matter, although less important). I don't > (yet) know enough about Python threading to know if this is possible, > but it certainly would be useful. http://bugs.python.org/issue1048 for subprocess to document that communicate produces byte stream output. 
http://bugs.python.org/issue10482 for subprocess enhancements to handle more cases without deadlock.

Found http://bugs.python.org/issue4571 which documents how to switch stdin/stdout/stderr to binary mode, and even back! I couldn't track the documented change to the actual documentation, though, but I did find it in section 26.1, under the documentation for the three stdio streams:

    def make_streams_binary():
        sys.stdin = sys.stdin.detach()
        sys.stdout = sys.stdout.detach()

> 4. http.server has a number of bugs and limitations.
> 4a. _url_collapse_path_split seems inefficient (although I have to
> benchmark it against what I think would be more efficient), and for
> its only use within http.server it produces the wrong information, so
> the information has to be recombined and resplit to make it function
> properly, adding to the perception of inefficiency.
> 4b. Detection of "executable" on Windows is simply wrong. Unix
> execution bits do not exist.

http://bugs.python.org/issue10483 for 4b.

> 4c. is_cgi doesn't properly handle PATHINFO parts of the path, this is
> the other half of 4a. The Python2.x CGIHTTPServer.py had this right,
> but the introduction and use of _url_collapse_path_split broke it.

http://bugs.python.org/issue10484 for 4a and 4c.

> 4d. Searching for a ? to find an explicit query string should use
> .find('?') rather than .rfind('?') as there is no prohibition on using
> '?' within a query string, AFAIK.

http://bugs.python.org/issue10485 for 4d.

> 4e. doesn't set the REQUEST_URI, HTTP_HOST, or HTTP_PORT environment
> variables for the CGI.

http://bugs.python.org/issue10486 for 4e.

> 4f. Should not send the 200 response until it sees if the CGI sends a
> Status: header.

http://bugs.python.org/issue10487 for 4f and 4g.

> 4g. Should not buffer all of stdout: subprocess.communicate is
> inappropriate for a web server CGI interface. The data should stream
> through to avoid consuming inordinate amounts of memory.
> The only solution within the current limitations of subprocess is to abandon
> stderr, force the CGI to do its own error logging, and use
> shutil.copyfileobj to hook up p.stdout to self.wfile once the Status:
> message processing has happened.
> 4h. Doesn't seem to close p.stdin (I'm not sure if that is necessary,
> it may happen when p is garbage collected, but effort was made to
> close p.stdout and p.stderr, which seem similar.)

Discovered that subprocess.communicate closes p.stdin, so it wasn't needed until I quit using .communicate in my version of the code.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stephen at xemacs.org Sun Nov 21 13:55:12 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 21 Nov 2010 21:55:12 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <4CE78F62.7060707@v.loewis.de>
References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de>
Message-ID: <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp>

"Martin v. Löwis" writes:
 > Am 20.11.2010 05:11, schrieb Stephen J. Turnbull:
 > > "Martin v. Löwis" writes:
 > >
 > > > The term "UCS-2" is a character set that can encode only encode 65536
 > > > characters; it thus refers to Unicode 1.1. According to the Unicode
 > > > Consortium's FAQ, the term UCS-2 should be avoided these days.
 > >
 > > So what do you propose we call the Python implementation?
 >
 > A technical correct description would be to say that Python uses either
 > 16-bit code units or 32-bit code units; for brevity, these can be called
 > narrow and wide code units.

I agree that's technically correct. Unfortunately, it's also useless to anybody who doesn't already know more about Unicode than anybody should have to know.

 > > and therefore is not UTF-16 conforming.
 >
 > I disagree. Python does "conform" to "UTF-16"

I'm sure the codecs do.
But the Unicode standard doesn't care about the parts of the process, it cares about what it does as a whole. Python's internal coding does not conform to UTF-16, and that internal coding can, under certain conditions, escape to the outside world as invalid "Unicode" output.

 > > AFAIK this was not supposed to change in Python 3; indexing and
 > > slicing go by code unit (isomorphic to UCS-n), not character, and due
 > > to PEP 383 4-octet builds do not conform (internally) to UTF-32, and
 > > can produce output that conforms to Unicode not at all (as a user
 > > option, of course, but it's still non-conformant).
 >
 > What behavior specifically do you consider non-conforming, and what
 > specific specification do you think it is not conforming to? For
 > example, it *is* fully conforming with UTF-8.

Oh,

    f = open('/tmp/broken','wt',encoding='utf8',errors='surrogateescape')
    f.write(chr(int('dc80',16)))
    f.close()

for one. That produces a non-UTF-8 file in a 32-bit-code-unit build. You can say, "oh, but that's not really a UTF-8 codec", and I'd agree. Nevertheless, the program is able to produce output from internal "Unicode" strings that does not conform to Unicode at all. A Unicode-conforming Python implementation would error at the chr() call, or perhaps would not provide surrogateescape error handlers.

It is, of course, possible to write Python programs that conform (and easier than in any other language I know), but Python itself does not conform to post-1.1 Unicode standards. Too bad for the standards: "Although practicality beats purity."

The point is that internal code is *not* UTF-16 (or -32), but it *is* isomorphic to UCS-2 (or -4). *That is very useful information to users*, it's not a technical detail of interest only to Unicode geeks. It means that if you stick to defined characters in the BMP when giving Python input, then slicing and indexing unicode (Python 2) or str (Python 3) objects gives only valid output even in builds with 16-bit code units.
OTOH, invalid processing (involving functions like 'chr' or input using surrogateescape codecs) can lead to invalid output even in builds with 32-bit code units. IMO, saying "UCS-2" or "UCS-4" tells ordinary developers most of what they need to know about the limitations of their Python vis-a-vis full conformance, at least with respect to the string manipulation functions. From rdmurray at bitdance.com Sun Nov 21 18:18:20 2010 From: rdmurray at bitdance.com (R. David Murray) Date: Sun, 21 Nov 2010 12:18:20 -0500 Subject: [Python-Dev] Web servers, bytes, str, documentation, Python 3.2a4 In-Reply-To: <4CE8CFCD.4040906@g.nevcal.com> References: <4CE7452A.7050109@g.nevcal.com> <4CE7B34D.4020309@netwok.org> <4CE8111F.9060502@g.nevcal.com> <4CE8CFCD.4040906@g.nevcal.com> Message-ID: <20101121171821.195552194AC@kimball.webabinitio.net> On Sat, 20 Nov 2010 23:52:45 -0800, Glenn Linderman wrote: > Sadly, cgi.py input handling seems to depend on the email module, > thought to be fixed for 3.2, but it is not clear if that has been > achieved, or if the surrogate encode workaround is sufficient for this. > More testing needed, but I don't have such a test case developed yet. Indeed, this should theoretically be fixable now. The email module is now perfectly capable of both consuming and producing binary data. The user of the module doesn't need to care how this was achieved unless they want to do processing of non-RFC conformant data. I want to look at the CGI issue, but I'm not sure when I'll get to it. -- R. David Murray www.bitdance.com From jcea at jcea.es Sun Nov 21 18:27:42 2010 From: jcea at jcea.es (Jesus Cea) Date: Sun, 21 Nov 2010 18:27:42 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <4CE2CF8F.4040500@jcea.es> References: <4CE2CF8F.4040500@jcea.es> Message-ID: <4CE9568E.4010102@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 What is the impact in the buildbot architecture?. Slaves must do anything?. 
At least they need to have mercurial installed, I guess. What, as a buildslave manager, must I do to ready my server for the migration?. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . _/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOlWjplgi5GaxT1NAQKwJAP/W1w/mn3Jv9XECxGCLKFj1Xvjz4fKq8im e1oKpvrl5hzXfKfYtIC4K2fy5G4O3iP1gS/Iwy0iGSSqcpnxFIfpwcTpjigRGaBi rpZp956TosaSLTGZxS2Wb11KFxsGlhAcgVF2ooFF7Z+wL73wCyVjfUqMXCB/50Nr dztlJuv3Wvg= =ntFy -----END PGP SIGNATURE----- From rdmurray at bitdance.com Sun Nov 21 18:38:25 2010 From: rdmurray at bitdance.com (R. David Murray) Date: Sun, 21 Nov 2010 12:38:25 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20101121173825.B1BFB235977@kimball.webabinitio.net> On Sun, 21 Nov 2010 21:55:12 +0900, "Stephen J. Turnbull" wrote: > "Martin v. L??wis" writes: > > Am 20.11.2010 05:11, schrieb Stephen J. Turnbull: > > > "Martin v. L??wis" writes: > > > > > > > The term "UCS-2" is a character set that can encode only encode 65536 > > > > characters; it thus refers to Unicode 1.1. According to the Unicode > > > > Consortium's FAQ, the term UCS-2 should be avoided these days. > > > > > > So what do you propose we call the Python implementation? 
> > > > A technical correct description would be to say that Python uses either > > 16-bit code units or 32-bit code units; for brevity, these can be called > > narrow and wide code units. > > I agree that's technically correct. Unfortunately, it's also useless > to anybody who doesn't already know more about Unicode than anybody > should have to know. [...] > The point is that internal code is *not* UTF-16 (or -32), but it *is* > isomorphic to UCS-2 (or -4). *That is very useful information to > users*, it's not a technical detail of interest only to Unicode geeks. > It means that if you stick to defined characters in the BMP when > giving Python input, then slicing and indexing unicode (Python 2) or > str (Python 3) objects gives only valid output even in builds with > 16-bit code units. OTOH, invalid processing (involving functions like > 'chr' or input using surrogateescape codecs) can lead to invalid > output even in builds with 32-bit code units. > > IMO, saying "UCS-2" or "UCS-4" tells ordinary developers most of what > they need to know about the limitations of their Python vis-a-vis full > conformance, at least with respect to the string manipulation functions. I'm sorry, but I have to disagree. As a relative unicode ignoramus, "UCS-2" and "UCS-4" convey almost no information to me, and the bits I have heard about them on this list have only confused me. On the other hand, I understand that 'narrow' means that fewer bytes are used for each internal character, meaning that some unicode characters need to be represented by more than one string element, and thus that slicing strings containing such characters on a narrow build causes problems. Now, you could tell me the same information using the terms 'UCS-2' and 'UCS-4' instead of 'narrow' and 'wide', but to my ear 'narrow' and 'wide' convey a better gut level feeling for what is going on than 'UCS-2' and 'UCS-4' do. 
And it avoids any question of whether or not Python's internal representation actually conforms to whatever standard it is that UCS refers to, a point on which there seems to be some dissension. Having written the above, I googled for UCS-2 and got the Wikipedia article on UTF16/UCS-2 [1]. Scanning that article, I do not see anything that would clue me in to the problems of slicing strings in a Python narrow build. Indeed, reading that article with my limited unicode knowledge, if I were told Python used UCS-2, I would assume that non-BMP characters could not be processed by a Python narrow build. -- R. David Murray www.bitdance.com [1] http://en.wikipedia.org/wiki/UTF-16/UCS-2 From g.brandl at gmx.net Sun Nov 21 18:58:53 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 21 Nov 2010 18:58:53 +0100 Subject: [Python-Dev] Mercurial Schedule In-Reply-To: <4CE9568E.4010102@jcea.es> References: <4CE2CF8F.4040500@jcea.es> <4CE9568E.4010102@jcea.es> Message-ID: Am 21.11.2010 18:27, schrieb Jesus Cea: > What is the impact in the buildbot architecture?. Slaves must do > anything?. At least they need to have mercurial installed, I guess. > > What, as a buildslave manager, must I do to ready my server for the > migration?. Apart from having Mercurial installed and "hg" in the PATH (that will be important for Windows I assume), I don't think anything else is required. Georg From raymond.hettinger at gmail.com Sun Nov 21 19:17:57 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sun, 21 Nov 2010 10:17:57 -0800 Subject: [Python-Dev] len(chr(i)) = 2? 
In-Reply-To: <20101121173825.B1BFB235977@kimball.webabinitio.net>
References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net>
Message-ID: <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com>

On Nov 21, 2010, at 9:38 AM, R. David Murray wrote:
>
> I'm sorry, but I have to disagree. As a relative unicode ignoramus,
> "UCS-2" and "UCS-4" convey almost no information to me, and the bits I
> have heard about them on this list have only confused me.

From the users point of view, it doesn't much matter which encoding is used internally. Neither UTF-16 nor UCS-2 is exactly correct anyway. The former encodes the entire range of unicode characters in a variable length code (a character is usually 2 bytes but is sometimes 4 bytes long). The latter encodes only a subset of unicode (the basic multilingual plane) in a fixed-length code (2 bytes per character).

What we use internally looks like utf-16 but a character encoded with 4 bytes is treated as two 2-byte characters (hence the subject of this thread). Our hybrid internal coding lets us handle the entire range of unicode while getting speed and simplicity by doing len() and slicing with a surrogate pair being treated as two separate characters.

For the "wide" build, the entire range of unicode is encoded at 4 bytes per character and slicing/len operate correctly since every character is the same length. This used to be called UCS-4 and is now UTF-32. So, with "wide" builds there isn't much confusion (except perhaps unfamiliar terminology). The real issue seems to be that for "narrow" builds, none of the usual encoding names is exactly correct.

From a users point-of-view, the actual encoding or encoding name doesn't matter much.
They just need to be able to predict the relevant behaviors (memory consumption and len/slicing behavior). For the narrow build, that behavior is: - Characters in the BMP consume 2 bytes and count as one char for purposes of len and slicing. - Characters above the BMP consume 4 bytes and counts as two distinct chars for purpose of len and slicing. For wide builds, all characters are 4 bytes and count as a single char for len and slicing. Hope this helps, Raymond From martin at v.loewis.de Sun Nov 21 19:51:44 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Sun, 21 Nov 2010 19:51:44 +0100 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CE96A40.1050705@v.loewis.de> > > I disagree. Python does "conform" to "UTF-16" > > I'm sure the codecs do. But the Unicode standard doesn't care about > the parts of the process, it cares about what it does as a whole. Chapter and verse? > Python's internal coding does not conform to UTF-16, and that internal > coding can, under certain conditions, escape to the outside world as > invalid "Unicode" output. I'm fairly certain there are provisions in the Unicode standard for such behavior (taking into account "certain conditions"). > > What behavior specifically do you consider non-conforming, and what > > specific specification do you think it is not conforming to? For > > example, it *is* fully conforming with UTF-8. > > Oh, > > f = open('/tmp/broken','wt',encoding='utf8',errors='surrogateescape') > f.write(chr(int('dc80',16))) > f.close() > > for one. That produces a non-UTF-8 file Right. You are using an API that does not promise to create UTF-8, and hence isn't UTF-8. 
The Unicode standard certainly allows implementations to use character encoding schemes other than UTF-8; this one being "UTF-8 with surrogate escapes", which is different from "UTF-8" (IANA MIBEnum 106). > You can say, "oh, but that's not really a UTF-8 codec", and I'd agree. See above :-) > Nevertheless, the program is able to produce output from internal > "Unicode" strings that does not conform to Unicode at all. *Any* Unicode implementation will do that, since they all have to support legacy encodings in some form. This is certainly conforming to the Unicode standard, and in fact one of the primary Unicode design principles. > A Unicode- > conforming Python implementation would error at the chr() call, or > perhaps would not provide surrogateescape error handlers. Chapter and verse? > "Although practicality beats purity." The Unicode standard itself is based on practicality. It wouldn't have received the success it did if it was based on purity only (and indeed, was often rejected in cases where it put purity over practicality, e.g. with the Hangul syllables). Regards, Martin From rdmurray at bitdance.com Sun Nov 21 20:29:15 2010 From: rdmurray at bitdance.com (R. David Murray) Date: Sun, 21 Nov 2010 14:29:15 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> Message-ID: <20101121192915.0FFE1209B7A@kimball.webabinitio.net> On Sun, 21 Nov 2010 10:17:57 -0800, Raymond Hettinger wrote: > On Nov 21, 2010, at 9:38 AM, R. David Murray wrote: > > I'm sorry, but I have to disagree. 
As a relative unicode ignoramus, > > "UCS-2" and "UCS-4" convey almost no information to me, and the bits I > > have heard about them on this list have only confused me. [...] > From a users point-of-view, the actual encoding or encoding name > doesn't matter much. They just need to be able to predict the relevant > behaviors (memory consumption and len/slicing behavior). > > For the narrow build, that behavior is: > - Characters in the BMP consume 2 bytes and count as one char > for purposes of len and slicing. > - Characters above the BMP consume 4 bytes and counts as > two distinct chars for purpose of len and slicing. > > For wide builds, all characters are 4 bytes and count as a single > char for len and slicing. > > Hope this helps, Thank you, that nicely summarizes and confirms what I thought I knew about wide versus narrow build. And as I said, using the names UCS-2/UCS-4 would only *confuse* that understanding, not clarify it. -- R. David Murray www.bitdance.com From alexander.belopolsky at gmail.com Sun Nov 21 23:13:22 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 21 Nov 2010 17:13:22 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <4CE6EF91.1040803@v.loewis.de> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6EF91.1040803@v.loewis.de> Message-ID: On Fri, Nov 19, 2010 at 4:43 PM, "Martin v. Löwis" wrote: >> In my opinion, the question is more what was it not fixed in Python2. I suppose >> that the answer is something ugly like "backward compatibility" or "historical >> reasons" :-) > > No, there was a deliberate decision to not support that, see > > http://www.python.org/dev/peps/pep-0261/ > > There had been a long discussion on this specific detail when PEP 261 > was written, and in the end, an explicit, deliberate, considered > decision was made to raise a ValueError. > Yes, the existence of PEP 261 was one of the reasons I was surprised that a change like this was made without a deliberation.
Personally, I've never used chr() or ord() other than on the python command prompt. Processing text one character at a time is just too slow in Python. So for my own use cases, the change is quite welcome. I also find that with bytes() items being int in 3.x more or less removes the need for ord(). On the other hand any 2.x program that uses unichr() and ord() is very likely to exhibit subtly buggy behavior when ported to 3.x. I don't think len(chr(i)) = 2 is likely to cause problems, but map(ord, s) not being an iterator over code points is likely to break naive programs. This is especially true because as far as I can tell there is no easy way to iterate over code points in a Python string on a narrow build. From merwok at netwok.org Mon Nov 22 01:54:34 2010 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Mon, 22 Nov 2010 01:54:34 +0100 Subject: [Python-Dev] [Python-checkins] r86633 - in python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS In-Reply-To: <20101121034404.52924F20A@mail.python.org> References: <20101121034404.52924F20A@mail.python.org> Message-ID: <4CE9BF4A.1020302@netwok.org> > Author: nick.coghlan > New Revision: 86633 > > Issue #10220: Add inspect.getgeneratorstate(). Initial patch by Rodolpho Eckhardt > > Modified: python/branches/py3k/Doc/library/inspect.rst > ============================================================================== > --- python/branches/py3k/Doc/library/inspect.rst (original) > +++ python/branches/py3k/Doc/library/inspect.rst Sun Nov 21 04:44:04 2010 > @@ -620,3 +620,25 @@ > # in which case the descriptor itself will > # have to do > pass > + > +Current State of a Generator > +---------------------------- > + > +When implementing coroutine schedulers and for other advanced uses of > +generators, it is useful to determine whether a generator is currently > +executing, is waiting to start or resume or execution, or has already > +terminated. 
:func:`getgeneratorstate` allows the current state of a > +generator to be determined easily. > + > +.. function:: getgeneratorstate(generator) > + > + Get current state of a generator-iterator. > + > + Possible states are: > + GEN_CREATED: Waiting to start execution. > + GEN_RUNNING: Currently being executed by the interpreter. > + GEN_SUSPENDED: Currently suspended at a yield expression. > + GEN_CLOSED: Execution has completed. I wonder if those shouldn't be marked up as :data: or something to make them indexed. From v+python at g.nevcal.com Mon Nov 22 04:59:54 2010 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sun, 21 Nov 2010 19:59:54 -0800 Subject: [Python-Dev] Web servers, bytes, str, documentation, Python 3.2a4 In-Reply-To: <20101121171821.195552194AC@kimball.webabinitio.net> References: <4CE7452A.7050109@g.nevcal.com> <4CE7B34D.4020309@netwok.org> <4CE8111F.9060502@g.nevcal.com> <4CE8CFCD.4040906@g.nevcal.com> <20101121171821.195552194AC@kimball.webabinitio.net> Message-ID: <4CE9EABA.1090306@g.nevcal.com> On 11/21/2010 9:18 AM, R. David Murray wrote: > I want to look at the CGI issue, but I'm not sure when I'll get to it. Actually, since this code was working before 3.x, and if email.parser can now accept binary streams, it seems like maybe the only thing that might be wrong is that presently it is getting a text stream instead, so that is something cgi.py or the application program would have to switch, and then maybe some testing would discover correctness, or maybe a specification of UTF-8 as the encoding to use for the text parts would have to be done. From rdmurray at bitdance.com Mon Nov 22 05:39:57 2010 From: rdmurray at bitdance.com (R.
David Murray) Date: Sun, 21 Nov 2010 23:39:57 -0500 Subject: [Python-Dev] Web servers, bytes, str, documentation, Python 3.2a4 In-Reply-To: <4CE9EABA.1090306@g.nevcal.com> References: <4CE7452A.7050109@g.nevcal.com> <4CE7B34D.4020309@netwok.org> <4CE8111F.9060502@g.nevcal.com> <4CE8CFCD.4040906@g.nevcal.com> <20101121171821.195552194AC@kimball.webabinitio.net> <4CE9EABA.1090306@g.nevcal.com> Message-ID: <20101122043957.2A5D6235C7A@kimball.webabinitio.net> On Sun, 21 Nov 2010 19:59:54 -0800, Glenn Linderman wrote: > On 11/21/2010 9:18 AM, R. David Murray wrote: > > I want to look at the CGI issue, but I'm not sure when I'll get to it. > > Actually, since this code was working before 3.x, and if email.parser > can now accept binary streams, it seems like maybe the only thing that > might be wrong is that presently it is getting a text stream instead, so > that is something cgi.py or the application program would have to > switch, and then maybe some testing would discover correctness, or maybe > a specification of UTF-8 as the encoding to use for the text parts would > have to be done. Well, given the bytes/string split in Python3, code definitely has to be changed to make this work, since you have to explicitly call bytes processing routines (message_from_bytes, message_from_binary_file, BytesFeedparser, etc) to parse binary data, and likewise use BytesGenerator to emit binary data. -- R. David Murray www.bitdance.com From brian.curtin at gmail.com Mon Nov 22 06:14:24 2010 From: brian.curtin at gmail.com (Brian Curtin) Date: Sun, 21 Nov 2010 23:14:24 -0600 Subject: [Python-Dev] Bug week-end on the 20th-21st? In-Reply-To: <20101025220401.0406722b@pitrou.net> References: <20101023190828.47b7f03e@pitrou.net> <20101025153242.2FBEC219F92@kimball.webabinitio.net> <20101025220401.0406722b@pitrou.net> Message-ID: On Mon, Oct 25, 2010 at 15:04, Antoine Pitrou wrote: > On Mon, 25 Oct 2010 11:32:42 -0400 > "R. 
David Murray" wrote: > > On Mon, 25 Oct 2010 12:22:24 -0200, Rodrigo Bernardo Pimentel < > rbp at isnomore.net> wrote: > > >> On 23.10.2010 19:08, Antoine Pitrou wrote: > > >>> The first 3.2 beta is scheduled by Georg for November 13th. > > >>> What would you think of scheduling a bug week-end one week later, > that > > >>> is on November 20th and 21st? We would need enough core developers to > > >>> be available on #python-dev. > > > > > >FWIW, I'm +1, and I'll try to get the Sao Paulo users group to > participate. > > > > I think this is a great idea (both Antoine's initial suggestion and the > > idea of getting users groups to participate). > > > > I'll be around and able to participate that weekend except for evening > > US Eastern time. > > Ok, so 20th-21st of November it shall be! > > Regards > > Antoine. Although a few time zones are still celebrating Bug Weekend, it looks like at least 76 bugs got closed out [0]. Some of those happened thanks to a number of first time contributors. Thanks to everyone for their efforts! [0] http://bugs.python.org/issue?%40columns=title&%40columns=id&activity=from+2010-11-20+to+2010-11-22&%40columns=activity&%40sort=activity&%40group=priority&status=2&%40columns=status&%40pagesize=50&%40startwith=0&%40action=search From stephen at xemacs.org Mon Nov 22 06:28:13 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 22 Nov 2010 14:28:13 +0900 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <4CE96A40.1050705@v.loewis.de> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE96A40.1050705@v.loewis.de> Message-ID: <87ipzqc4gi.fsf@uwakimon.sk.tsukuba.ac.jp> "Martin v. Löwis" writes: > Chapter and verse?
Unicode 5.0, Chapter 3, verse C9: When a process generates a code unit sequence which purports to be in a Unicode character encoding form, it shall not emit ill-formed code sequences. I think anything called "UTF-8 something" is likely to be taken to "purport". Furthermore, users don't necessarily see which error handlers are being used. A user who specifies "utf8" as the output codec is likely to be rather surprised if non-UTF-8 is emitted because the app specified surrogateescape. Eg, consider a script which munges file descriptions into reasonable-length file names on Unix. Yes, technically the non-Unicode output is the app's fault, but I expect many users will put some blame on Python. I am in full agreement with you about the technicalities, but I am looking for ways to clue in users that (a) the technicalities matter, and (b) that Python does a *very* good job of making things as safe as possible without becoming unable to handle bytes. I think "wide" vs. "narrow" fails at both. It focuses on storage issues, which of course are important, but at the cost of ignoring the fact that for users of non-BMP characters 32-bit code units are much safer. Users who need non-BMP characters are relatively few, and at least at the present time most are painfully aware of the need to care for technicalities. I expect them to be pleasantly surprised by how easy it is to get reasonably safe behavior even from a 16-bit build. > > Python's internal coding does not conform to UTF-16, and that internal > > coding can, under certain conditions, escape to the outside world as > > invalid "Unicode" output. > > I'm fairly certain there are provisions in the Unicode standard for such > behavior (taking into account "certain conditions"). Sure. There's nothing in the Unicode standard that says you have to conform to it unless you claim to conform to it. So it is valid to say that Python's Unicode codecs without surrogateescape do conform. 
The point is that Python does not, even if all of the input is valid Unicode, because of the provision of surrogateescape and the lack of Unicode conformance-checking for certain internal functionality like chr() and slicing. You can say "we don't make any such claim", but IMO the distinction in question is too fine a point for most users, and requires a very large amount of Unicode knowledge (not to mention standards geekiness) to even understand the precise statement. "Unicode support" to users should mean that Python does the right thing, not that if you look hard enough in the documentation you will discover that Python doesn't claim to do the right thing even though in practice it mostly does. IMO, "UCS-2" is a pretty good description of what the user can leave up to Python in perfect safety. RDM's reply worries me a little, but I'll reply to his message separately. > *Any* Unicode implementation will do that, since they all have to > support legacy encodings in some form. This is certainly conforming to > the Unicode standard, and in fact one of the primary Unicode design > principles. No. Support for legacy encodings takes you outside of the realm of Unicode conformance by definition. Their names tell you that, however. "UTF-8 with surrogate escapes" on the other hand is an entirely different kettle of fish. It pretends to be UTF-8, but isn't. I think that users who give Python valid input should be able to expect valid output, but they can't. Chapter 3, verse C7: When a process purports not to modify the interpretation of a valid coded character sequence, it shall make no change to that coded character sequence other than the possible replacement of character sequences by their canonical-equivalent sequences, or the deletion of *noncharacter* code points. Sure, you can tell users the truth: "Python may modify your Unicode characters if you slice or index Unicode strings. It may even silently turn them into invalid codes which will eventually raise Errors." 
Then you are conformant, but why would anyone want to use such a program? If you tell them "UCS-2[sic] Python is safe to use with *no* extra care if you use only UCS-2 [or BMP] characters", suddenly Python looks very nice indeed again. "UCS-4" Python is even better; all you have to do is to avoid surrogateescape codecs. However, you're still vulnerable to hard-to-diagnose errors at the output stage in case of program bugs, because not enough checking of values is done by Python itself. > > A Unicode-conforming Python implementation would error at the > > chr() call, or perhaps would not provide surrogateescape error > > handlers. > > Chapter and verse? Chapter 3, verse C9 again. > > "Although practicality beats purity." > > The Unicode standard itself is based on practicality. It wouldn't > have received the success it did if it was based on purity only > (and indeed, was often rejected in cases where it put purity over > practicality, e.g. with the Hangul syllables). Python practicality is very different from Unicode practicality. From v+python at g.nevcal.com Mon Nov 22 06:40:22 2010 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sun, 21 Nov 2010 21:40:22 -0800 Subject: [Python-Dev] is this a bug? no environment variables Message-ID: <4CEA0246.9080607@g.nevcal.com> In reviewing my notes from my experimentations with CGIHTTPServer (Python2.6) and then http.server (Python 3.2a4), I note one behavior I haven't reported as a bug, nor do I know where to start to figure it out, other than experimentally. The experiment: launching CGIHTTPServer without environment variables, by the simple expedient of using a batch file to unset all the existing environment variables, and then launching Python2.6 with CGIHTTPServer. So it failed early: random.py fails at line 110 (Python 2.6). 
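The experiment described above can also be run without a batch file. What follows is only a sketch, assuming a Python new enough to have subprocess.run with capture_output (3.7 or later); the helper name run_without_env is made up for illustration. It starts a child interpreter with an empty environment and reports what the child printed. It makes no claim about which variables, if any, a given platform actually requires.

```python
import subprocess
import sys


def run_without_env(code):
    """Run `code` in a child interpreter that inherits no environment variables.

    Hypothetical helper for illustration; env={} hands the child an empty
    environment block instead of a copy of os.environ.
    """
    return subprocess.run(
        [sys.executable, "-c", code],
        env={},
        capture_output=True,
        text=True,
    )


# Import the module that failed in the report and see whether the child survives.
result = run_without_env("import random; print('imported random')")
print("exit status:", result.returncode)
print("stdout:", result.stdout.strip())
print("stderr:", result.stderr.strip())
```

If the child fails, adding variables back to the env dict one at a time gives the same bisection as editing the batch file, but in a form that is easy to loop over.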
I suppose it is possible that some environment variables are used by Python directly (but I can't seem to find a documented list of them) although I would expect that usage to be optional, with fall-back defaults when they don't exist. I suppose it is even possible that some Windows APIs might depend on some environment variables, but I expected that the registry had replaced such usage completely, by now, with the environment variables mostly being a convenience tool for batch files, or for optional, temporary alteration of particular settings. If anyone knows of documentation listing what environment variables are required by Python on Windows, I would appreciate a pointer, searches and doc browsing having not turned it up. I'll attempt to recreate the test situation later this week with Python 3.2a4, if no one responds, but the only debug technique I can think of is to slowly remove environment variables until I find the minimum set required to run http.server successfully for my tests with CGI files. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Mon Nov 22 07:14:46 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 22 Nov 2010 15:14:46 +0900 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <20101121173825.B1BFB235977@kimball.webabinitio.net> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> Message-ID: <87hbf9dgvd.fsf@uwakimon.sk.tsukuba.ac.jp> R. David Murray writes: > I'm sorry, but I have to disagree. As a relative unicode ignoramus, > "UCS-2" and "UCS-4" convey almost no information to me, and the bits I > have heard about them on this list have only confused me. OK, point taken. 
> On the other hand, I understand that 'narrow' means that fewer > bytes are used for each internal character, meaning that some > unicode characters need to be represented by more than one string > element, and thus that slicing strings containing such characters > on a narrow build causes problems. Now, you could tell me the same > information using the terms 'UCS-2' and 'UCS-4' instead of 'narrow' > and 'wide', but to my ear 'narrow' and 'wide' convey a better gut > level feeling for what is going on than 'UCS-2' and 'UCS-4' do. I think that is probably conditioned by your long experience with Python's Unicode features, specifically the knowledge that Python's Unicode strings are not arrays of characters, which often is referred to on this list. My guess is that very few newbies would know that, and it is not implied by "narrow". For example, both Emacs (for sure) and Perl (IIUC) index strings of variable-width character by characters (at great expense of performance in Emacs, at least), not as code units. > And it avoids any question of whether or not Python's internal > representation actually conforms to whatever standard it is that > UCS refers to, a point on which there seems to be some dissension. UCS-2 refers to ISO 10646, Annex 1 IIRC.[1] Anyway, it's somewhere in ISO 10646. I don't think there's actually dissension on conformance to UCS-2, as that's very easy to achieve. Rather, Guido explicitly pronounced that Python processes arrays of code units, not characters. My point is that if you pretend that Python is processing *characters* according to UCS-2 rules for characters, you'll always come to the same conclusion about what Python will do as if you use the technically correct terminology of code units. (At least for the BMP and UTF-16 private areas. 
There will necessarily be some confusion about surrogates, since in UCS-2 they are characters while in UTF-16 they're merely "code points", and the Unicode characters they represent can't be represented at all in UCS-2.) > Indeed, reading that article with my limited unicode knowledge, if > I were told Python used UCS-2, I would assume that non-BMP > characters could not be processed by a Python narrow build. Actually, I'm almost happy with that. That is, the precise formulation is "could not be processed *safely without extra care* by a Python narrow build." Specifically, AFAIK if you range check characters that have been indexed out of a string, or are located at slice boundaries, or produced by chr() or a surrogateescape input codec, you're safe. But practically speaking few apps will actually do those checks and therefore they are unsafe: processing non-BMP characters can easily lead to show-stopping Exceptions. It's very analogous to the kind of show-stopping "bad character in a header" exception that plagued Mailman for so long, and had to be fixed on a case-by-case basis. But the restriction to BMP characters is much more reasonable (at least for now) than RFC 822's restriction to ASCII! But evidently you take it much more stringently. So the question is, "what fraction of developers who think as you do would therefore be put off from using Python to build their applications?" If most would say "OK, we'll stick with BMP for now and use UCS-4 or some hack to deal with extended characters later -- it can't really be true that it's absolutely impossible to use non-BMP characters," I don't mind that misunderstanding. OTOH, yes, it would be bad if the use of "UCS-2" were to imply to more than a couple of developers that 16-bit builds of Python can't handle UTF-16 *at all*. Footnotes: [1] It simply says "we have a subset of the Unicode character set all of whose code points can be represented in 16 bits, excluding 0xFFFF." 
It goes on to define a private area, reserved for use by applications that will never be standardized, and it says that if you don't know what a code point in the character area is, don't change it (you can delete it, however). ISTR that a later Amendment added 0xFFFE to the short-list of non-characters. The surrogate area was taken out of the private area, so a UCS-2 application will simply consider each surrogate to be an unknown character and pass it through unchanged -- unless it deletes it, or inserts other characters between the code points of a surrogate pair. And that's why UCS-2 isn't UTF-16 conforming -- which is basically why Python isn't either. From martin at v.loewis.de Mon Nov 22 09:20:59 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 22 Nov 2010 09:20:59 +0100 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <87ipzqc4gi.fsf@uwakimon.sk.tsukuba.ac.jp> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE96A40.1050705@v.loewis.de> <87ipzqc4gi.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CEA27EB.8000104@v.loewis.de> > Unicode 5.0, Chapter 3, verse C9: > > When a process generates a code unit sequence which purports to be > in a Unicode character encoding form, it shall not emit ill-formed > code sequences. > > > A Unicode-conforming Python implementation would error at the > > > chr() call, or perhaps would not provide surrogateescape error > > > handlers. > > > > Chapter and verse? > > Chapter 3, verse C9 again. I agree that the surrogateescape error handler is non-conforming, but, as you say, it doesn't claim to, either (would your concern about utf-8 being misleading here been resolved if the thing had been called "utf-8b"?) 
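For readers following the surrogateescape part of the argument, here is a small sketch of the behaviour under discussion. It assumes a current Python 3 and uses only the standard codec machinery; the variable names are illustrative. It shows that the strict UTF-8 codec refuses a lone surrogate, that the surrogateescape handler emits a raw byte instead, and that the handler's purpose is to round-trip bytes that were never valid UTF-8 in the first place.

```python
# A lone low surrogate, the same value as chr(int('dc80', 16)) in the thread.
lone = "\udc80"

# The strict UTF-8 codec does not encode lone surrogates.
try:
    lone.encode("utf-8")
    strict_accepts = True
except UnicodeEncodeError:
    strict_accepts = False

# With surrogateescape, U+DC80 is written out as the single raw byte 0x80.
escaped = lone.encode("utf-8", errors="surrogateescape")

# That byte on its own is not well-formed UTF-8, so a strict decode fails.
try:
    escaped.decode("utf-8")
    output_is_utf8 = True
except UnicodeDecodeError:
    output_is_utf8 = False

# The round trip the handler exists for: undecodable input bytes survive.
raw = b"caf\xe9"  # Latin-1 bytes, invalid as UTF-8
text = raw.decode("utf-8", errors="surrogateescape")
restored = text.encode("utf-8", errors="surrogateescape")

print(strict_accepts, escaped, output_is_utf8, ascii(text), restored == raw)
```

So whether the output counts as "UTF-8" depends entirely on the error handler chosen, which is the distinction both sides of the thread are drawing.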
More interestingly (and to the subject) is chr: how did you arrive at C9 banning Python3's definition of chr? This chr function puts the code sequence into well-formed UTF-16; that's the whole point of UTF-16. Regards, Martin From stephen at xemacs.org Mon Nov 22 11:47:09 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 22 Nov 2010 19:47:09 +0900 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <4CEA27EB.8000104@v.loewis.de> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE96A40.1050705@v.loewis.de> <87ipzqc4gi.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA27EB.8000104@v.loewis.de> Message-ID: <87fwutd49e.fsf@uwakimon.sk.tsukuba.ac.jp> "Martin v. Löwis" writes: > More interestingly (and to the subject) is chr: how did you arrive > at C9 banning Python3's definition of chr? This chr function puts > the code sequence into well-formed UTF-16; that's the whole point of > UTF-16. No, it doesn't, in the specific case of surrogate code points. In 3.1.2 from MacPorts on a iBook G4 and from Gentoo on AMD64, chr(0xd800) returns "\ud800". I don't know if that's by design (eg, so that it can be used in the implementation of the surrogateescape error handler) or a correctable oversight, but it's not conformant. From stephen at xemacs.org Mon Nov 22 11:48:42 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 22 Nov 2010 19:48:42 +0900 Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> Message-ID: <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> Raymond Hettinger writes: > Neither UTF-16 nor UCS-2 is exactly correct anyway. From a standards lawyer point of view, UCS-2 is exactly correct, as far as I can tell upon rereading ISO 10646-1, especially Annexes H ("retransmitting devices") and Q ("UTF-16"). Annex Q makes it clear that UTF-16 was intentionally designed so that Python-style processing could be done in a UCS-2 context. > For the "wide" build, the entire range of unicode is encoded at > 4 bytes per character and slicing/len operate correctly since > every character is the same length. This used to be called UCS-4 > and is now UTF-32. That's inaccurate, I believe. UCS-4 is not a UTF, and doesn't satisfy the range restrictions of a UTF. > So, with "wide" builds there isn't much confusion (except perhaps > unfamiliar terminology). The real issue seems to be that for > "narrow" builds, none of the usual encoding names is exactly > correct. I disagree. I do see a problem with "UCS-2", because it fails to tell us that Python implements a large number of features that make it easy to do a very good job of working with non-BMP data in 16-bit builds of Python, with no extra effort. Python is not perfect, and (rarely) some of the imperfections may be very distressing. But it's very good, and deserves to be advertised as such. However, I don't see how "narrow" tells us more than "UCS-2" does. If "UCS-2" is equally (or more) informative, I prefer it because it is the technically precise, already well-defined, term.
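The narrow/wide behaviour being argued over can be reproduced on any current interpreter by counting code units explicitly. This is a sketch under stated assumptions: interpreters since Python 3.3 no longer have separate narrow and wide builds, so the two helper functions below (names invented here) emulate what len() would have reported on each build by encoding to UTF-16 and UTF-32. The byte counts are payload sizes only, not full object sizes.

```python
def utf16_code_units(s):
    """Count UTF-16 code units: what len() reported on a narrow build."""
    return len(s.encode("utf-16-le")) // 2


def utf32_code_units(s):
    """Count UTF-32 code units: what len() reported on a wide build."""
    return len(s.encode("utf-32-le")) // 4


bmp = "\u20ac"         # EURO SIGN, inside the BMP
astral = "\U0001d11e"  # MUSICAL SYMBOL G CLEF, above the BMP

for ch in (bmp, astral):
    print(
        hex(ord(ch)),
        "| narrow:", utf16_code_units(ch), "unit(s) in",
        len(ch.encode("utf-16-le")), "bytes",
        "| wide:", utf32_code_units(ch), "unit in",
        len(ch.encode("utf-32-le")), "bytes",
    )
```

The BMP character is one unit either way; the character above the BMP is two 16-bit units (a surrogate pair) in the narrow emulation and a single 32-bit unit in the wide one, which matches the summary quoted earlier in the thread.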
> From a users point-of-view, the actual encoding or encoding name > doesn't matter much. They just need to be able to predict the relevant > behaviors (memory consumption and len/slicing behavior). "UCS-2" indicates those behaviors precisely and concisely. The problems are (a) the lack of familiarity of users with this term, if David is reasonably representative, and (b) the fact that it fails to advertise Python's UTF-16 capabilities. "Narrow" suffers from both of those problems, and further from the fact that it has no independent standard definition. Furthermore, "wide" has a very widespread, platform-dependent meaning derived from wchar_t. If we have to document what the terms we choose mean anyway, why not document the existing terms and reduce entropy, rather than invent new ones and increase entropy? From martin at v.loewis.de Mon Nov 22 12:22:35 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 22 Nov 2010 12:22:35 +0100 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <87fwutd49e.fsf@uwakimon.sk.tsukuba.ac.jp> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE96A40.1050705@v.loewis.de> <87ipzqc4gi.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA27EB.8000104@v.loewis.de> <87fwutd49e.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CEA527B.4030002@v.loewis.de> On 22.11.2010 11:47, Stephen J. Turnbull wrote: > "Martin v. Löwis" writes: > > > More interestingly (and to the subject) is chr: how did you arrive > > at C9 banning Python3's definition of chr? This chr function puts > > the code sequence into well-formed UTF-16; that's the whole point of > > UTF-16. > > No, it doesn't, in the specific case of surrogate code points. In > 3.1.2 from MacPorts on a iBook G4 and from Gentoo on AMD64, > chr(0xd800) returns "\ud800".
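The observation about chr(0xd800) is easy to check. A minimal sketch, assuming a current Python 3: chr() happily builds a one-element string holding the lone surrogate, and it is the strict UTF-8 codec at the output boundary that objects. The "surrogatepass" error handler is shown only to make the contrast visible; it is not something the thread proposes.

```python
s = chr(0xD800)

# chr() does not range-check for surrogates; the result is a 1-element string.
print(len(s), ascii(s))

# The strict UTF-8 encoder rejects the lone surrogate at output time.
try:
    s.encode("utf-8")
    encoded_strictly = True
except UnicodeEncodeError as exc:
    encoded_strictly = False
    print("strict utf-8 refused:", exc.reason)

# An explicit opt-in handler lets the surrogate through as three bytes.
passed = s.encode("utf-8", errors="surrogatepass")
print(passed)
```

In other words the string type is permissive and the checking, where it exists, lives in the codecs.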
Ah, I see - this is *not* the subject's issue, right? > > I don't know if that's by design (eg, so that it can be used in the > implementation of the surrogateescape error handler) or a correctable > oversight, but it's not conformant. I disagree: Quoting from Unicode 5.0, section 5.4: # The individual components of implementations may have different # levels of support for surrogates, as long as those components are # assembled and communicate correctly. Low-level string processing, # where a Unicode string is not interpreted but is handled simply as an # array of code units, may ignore surrogate pairs. With such strings, # for example, a truncation operation with an arbitrary offset might # break a surrogate pair. (For further discussion, see Section 2.7, # Unicode Strings.) For performance in string operations, such behavior # is reasonable at a low level, but it requires higher-level processes # to ensure that offsets are on character boundaries so as to guarantee # the integrity of surrogate pairs. So lower-level routines (which I claim chr() is one) are allowed to create lone surrogates. The formal requirement behind this is C1: # A process shall not interpret a high-surrogate code point or a # low-surrogate code point as an abstract character. I also claim that Python, in both narrow and wide mode, conforms to this requirement. Notice that the requirement is a ban on interpreting the code point as a character. In particular, unicodedata.category claims that the code point is of class Cs (surrogate), which I consider conforming. By the same line of reasoning, it is also OK that chr() allows the creation of unassigned code points, even though C2 says that they must not be interpreted as abstract characters. 
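The point about unicodedata.category can be sketched as follows, assuming only the standard unicodedata module. It checks that every code point in the surrogate range is reported with general category Cs, that is, as a surrogate code point and not as some abstract character.

```python
import unicodedata

# General category of the first and last surrogate code points.
first = unicodedata.category(chr(0xD800))
last = unicodedata.category(chr(0xDFFF))
print(first, last)

# The whole range U+D800..U+DFFF is classified the same way.
all_surrogates = all(
    unicodedata.category(chr(cp)) == "Cs" for cp in range(0xD800, 0xE000)
)
print(all_surrogates)

# For contrast, an ordinary assigned character gets a letter category.
print(unicodedata.category("A"))
```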
The rationale for supporting these characters in chr() goes back much further than the surrogateescape handler - as Python unicode strings are sequences of code points, it would be impractical if you couldn't create some of them, or even would have to consult the UCD before determining whether they can be created. Regards, Martin From martin at v.loewis.de Mon Nov 22 12:43:00 2010 From: martin at v.loewis.de (=?windows-1252?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 22 Nov 2010 12:43:00 +0100 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CEA5744.3080308@v.loewis.de> On 22.11.2010 11:48, Stephen J. Turnbull wrote: > Raymond Hettinger writes: > > > Neither UTF-16 nor UCS-2 is exactly correct anyway. > From a standards lawyer point of view, UCS-2 is exactly correct, as > far as I can tell upon rereading ISO 10646-1, especially Annexes H > ("retransmitting devices") and Q ("UTF-16"). Annex Q makes it clear > that UTF-16 was intentionally designed so that Python-style processing > could be done in a UCS-2 context. I could only find the FCD of 10646:2010, where annex H was integrated into section 10: http://www.itscj.ipsj.or.jp/sc2/open/02n4125/FCD10646-Main.pdf There they have stopped using the term UCS-2, and added a note # NOTE - Former editions of this standard included references to a # two-octet BMP form called UCS-2 which would be a subset # of the UTF-16 encoding form restricted to the BMP UCS scalar values. # The UCS-2 form is deprecated.
I think they are now acknowledging that UCS-2 was a misleading term, making it ambiguous whether this refers to a CCS, a CEF, or a CES; like "ASCII", people have been using it for all three of them. Apparently, the ISO WG interprets earlier revisions as saying that UCS-2 is a CEF that restricted UTF-16 to the BMP. THIS IS NOT WHAT PYTHON DOES. In a narrow Python build, the character set is *not* restricted to the BMP. Instead, Unicode strings are meant to be interpreted (by applications) as UTF-16. > > For the "wide" build, the entire range of unicode is encoded at > > 4 bytes per character and slicing/len operate correctly since > > every character is the same length. This used to be called UCS-4 > > and is now UTF-32. > > That's inaccurate, I believe. UCS-4 is not a UTF, and doesn't satisfy > the range restrictions of a UTF. Not sure what it says in your copy; in mine, section 9.3 says # 9.3 UTF-32 (UCS-4) # UTF-32 (or UCS-4) is the UCS encoding form that assigns each UCS # scalar value to a single unsigned 32-bit code unit. The terms UTF-32 # and UCS-4 can be used interchangeably to designate this encoding # form. so they (now) view the two as synonyms. I think that when ISO 10646 started, they were also fairly confused about these issues (as the group/plane/row/cell structure demonstrates, IMO). This is not surprising, since the notion of byte-based character sets had been ingrained for so long. It took 20 years to learn that a UCS scalar value really is *not* a sequence of bytes, but a natural number. > However, I don't see how "narrow" tells us more than "UCS-2" does. If > "UCS-2" is equally (or more) informative, I prefer it because it is > the technically precise, already well-defined, term. But it's not. It is a confusing term, one that the relevant standards bodies are abandoning. After reading FCD 10646:2010, I could agree to call the two implementations UTF-16 and UTF-32 (as these terms designate CEFs). Unfortunately, they also designate CESs. 
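A short sketch of the distinction being drawn here, assuming nothing beyond the standard codecs: the same non-BMP code point becomes two 16-bit code units in UTF-16 and one 32-bit code unit in UTF-32. The helper name surrogate_pair is made up for illustration.

```python
import struct

ch = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the BMP

utf16 = ch.encode("utf-16-le")
utf32 = ch.encode("utf-32-le")

# Reinterpret the encoded bytes as 16-bit and 32-bit code units.
utf16_units = struct.unpack("<%dH" % (len(utf16) // 2), utf16)
utf32_units = struct.unpack("<%dI" % (len(utf32) // 4), utf32)


def surrogate_pair(code_point):
    """Split a non-BMP code point into its UTF-16 high and low surrogates."""
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)
```

A narrow build stores the two 16-bit units as two string items, which is why len() of such a character is 2 there; a wide build stores the single 32-bit unit.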
> If we have to document what the terms we choose mean anyway, why not
> document the existing terms and reduce entropy, rather than invent new
> ones and increase entropy?

Because the proposed existing term is deprecated.

Regards,
Martin

From mal at egenix.com Mon Nov 22 13:47:29 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 22 Nov 2010 13:47:29 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <4CEA5744.3080308@v.loewis.de>
References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA5744.3080308@v.loewis.de>
Message-ID: <4CEA6661.4080402@egenix.com>

Martin, it is really irrelevant whether the standards bodies have decided to no longer use the terms UCS-2 and UCS-4 in their latest standard documents.

The definitions still stand (just like Unicode 2.0 is still a valid standard, even if it's ten years old):

* UCS-2 is defined as "Universal Character Set coded in 2 octets" by ISO 10646 (see http://www.unicode.org/versions/Unicode5.2.0/appC.pdf)

* UCS-4 is defined as "Universal Character Set coded in 4 octets" by ISO 10646.

Those two terms have been in use for many years. They refer to the Unicode character set as it can be represented in 2 or 4 bytes. As such they don't include any of the special meanings associated with the UTF transfer encodings. There are no invalid sequences, no invalid code points, etc. as you can find in the UTF encodings. And that's an important detail.

If you interpret them as encodings, they are 1-1 mappings of Unicode code point ordinals to integers represented using 2 or 4 bytes.
UCS-2 only supports BMP code points and can conveniently be interpreted as UTF-16, if you need to encode non-BMP code points (which we do in the UTF codecs). UCS-4 also supports non-BMP code points directly. Now, from a ISO or Unicode Consortium point of view, deprecating the term UCS-2 in *their* standard papers is only natural, since they are actively starting to assign non-BMP code points which cannot be represented in UCS-2. However, this deprecation is only relevant for the purpose of defining the standard. The above definitions are still useful when it comes to defining code units, i.e. the used storage format, (as opposed to the transfer format). For the purpose of describing the code units we are using in Python they are (still) the most correct terms and that's also the reason why we chose to use them when introducing the configure options in Python2. There are no other accurate definitions we could use. The terms "narrow" and "wide" are simply too inaccurate to be used as description of UCS-2 and UCS-4 code units. Please also note that we have used the terms UCS-2 and UCS-4 in Python2 for 9+ years now and users are just starting to learn the difference and get acquainted with the fact that Python uses these two forms. Confronting them with "narrow" and "wide" builds is only going to cause more confusion, not less, and adding those strings to Python package files isn't going to help much either, since the terms don't convey any relationship to Unicode: package-3.1.3.linux-x86_64-py2.6_ucs2.egg vs. package-3.1.3.linux-x86_64-py2.6_narrow.egg I opt for switching to the following config options: --with-unicode=ucs2 (default) --with-unicode=ucs4 and using "UCS-2" and "UCS-4" in the Python documentation when describing the two different build modes. We can add glossary entries for the two which clarify the differences. 
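Whatever the configure options end up being called, a program can tell the two builds apart at run time through sys.maxunicode. The sketch below is only illustrative: the function name and the "ucs2"/"ucs4" labels are assumptions that echo the option values proposed above. On CPython 3.3 and later the distinction no longer exists and sys.maxunicode is always 0x10FFFF.

```python
import sys


def unicode_build():
    """Return a label for the build, judged by the largest ordinal chr() accepts."""
    if sys.maxunicode == 0xFFFF:
        return "ucs2"  # 16-bit code units; non-BMP characters use surrogate pairs
    return "ucs4"      # one string item per code point


build = unicode_build()

# On a 16-bit build this is 2 (a surrogate pair), otherwise 1.
astral_len = len(chr(0x10000))
```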
Python2 used --enable-unicode=ucs2/ucs4, but since Python3 doesn't build without Unicode support, the above two versions appear more appropriate.

We can keep the alternative --with-wide-unicode as an alias for --with-unicode=ucs4 to maintain 3.x backwards compatibility.

Cheers,
--
Marc-Andre Lemburg
eGenix.com

From foom at fuhm.net Mon Nov 22 15:18:02 2010
From: foom at fuhm.net (James Y Knight)
Date: Mon, 22 Nov 2010 09:18:02 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <4CEA6661.4080402@egenix.com>
References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA5744.3080308@v.loewis.de> <4CEA6661.4080402@egenix.com>
Message-ID:

Why don't y'all just call them "--unichar-width=16/32"? That describes precisely what the options do, and doesn't invite any quibbling over definitions.

James

From ncoghlan at gmail.com Mon Nov 22 16:14:46 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 23 Nov 2010 01:14:46 +1000
Subject: [Python-Dev] [Python-checkins] r86633 - in python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS
In-Reply-To: <4CE9BF4A.1020302@netwok.org>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org>
Message-ID:

On Mon, Nov 22, 2010 at 10:54 AM, Éric Araujo wrote:
>> +.. function:: getgeneratorstate(generator)
>> +
>> +    Get current state of a generator-iterator.
>> +
>> +    Possible states are:
>> +      GEN_CREATED: Waiting to start execution.
>> +      GEN_RUNNING: Currently being executed by the interpreter.
>> +      GEN_SUSPENDED: Currently suspended at a yield expression.
>> +      GEN_CLOSED: Execution has completed.
>
> I wonder if those shouldn't be marked up as :data: or something to make
> them indexed.

The same definitions are in the docstrings, and they're just integer constants so I'm not sure why anyone would be looking them up directly. Still, if someone with greater Sphinx-fu thinks additional markup would be helpful, I have no problem with them adding it :)

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From fuzzyman at voidspace.org.uk Mon Nov 22 16:19:04 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 22 Nov 2010 15:19:04 +0000
Subject: [Python-Dev] [Python-checkins] r86633 - in python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS
In-Reply-To:
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org>
Message-ID: <4CEA89E8.5090107@voidspace.org.uk>

On 22/11/2010 15:14, Nick Coghlan wrote:
> On Mon, Nov 22, 2010 at 10:54 AM, Éric Araujo wrote:
>>> +.. function:: getgeneratorstate(generator)
>>> +
>>> +    Get current state of a generator-iterator.
>>> +
>>> +    Possible states are:
>>> +      GEN_CREATED: Waiting to start execution.
>>> +      GEN_RUNNING: Currently being executed by the interpreter.
>>> +      GEN_SUSPENDED: Currently suspended at a yield expression.
>>> +      GEN_CLOSED: Execution has completed.
>> I wonder if those shouldn't be marked up as :data: or something to make
>> them indexed.
> The same definitions are in the docstrings, and they're just integer
> constants so I'm not sure why anyone would be looking them up
> directly. Still, if someone with greater Sphinx-fu thinks additional
> markup would be helpful, I have no problem with them adding it :)
>

Why not use string constants instead?
You lose comparability (less than / greater than) but gain readability. Comparability may be a requirement - of course if Python had an Enum type we could use that and have both. Michael > Cheers, > Nick. > -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From ncoghlan at gmail.com Mon Nov 22 16:37:21 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 23 Nov 2010 01:37:21 +1000 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <4CEA6661.4080402@egenix.com> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA5744.3080308@v.loewis.de> <4CEA6661.4080402@egenix.com> Message-ID: On Mon, Nov 22, 2010 at 10:47 PM, M.-A. Lemburg wrote: > Please also note that we have used the terms UCS-2 and UCS-4 in Python2 > for 9+ years now and users are just starting to learn the difference > and get acquainted with the fact that Python uses these two forms. 
> > Confronting them with "narrow" and "wide" builds is only > going to cause more confusion, not less, and adding those > strings to Python package files isn't going to help much either, > since the terms don't convey any relationship to Unicode: I was personally surprised to learn in this discussion that there had even been an *attempt* to change the names of the two build variants to anything other than UCS2/UCS4. The concrete API implementations certainly still use those two terms to prevent inadvertent linkage with the wrong version of the C API. For practical purposes, UCS2/UCS4 convey far more inherent information than narrow/wide: - many developers will recognise them as Unicode related, even if they don't know exactly what they mean - even those that don't recognise them, can soon learn that they're Unicode related just by plugging them into Google* - a bit more digging should reveal that they're Unicode storage formats closely related to the UTF-16 and UTF-32 transfer encodings respectively* *(The first Google hit for "ucs2" is the UTF-16/UCS-2 article on Wikipedia, the first hit for "ucs4" is the UTF-32/UCS-4 article) All that just armed with Google, without even looking at the Python docs specifically. So don't just think about "what will developers know?", also think about "what will developers know, and what will a quick trip to a search engine tell them?". And once you take that stance, the overly generic narrow/wide terms fail, badly. +1 for MAL's suggested tweaks to the Py3k configure options. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia

From solipsis at pitrou.net Mon Nov 22 16:37:22 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 22 Nov 2010 16:37:22 +0100
Subject: [Python-Dev] [Python-checkins] r86633 - in python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk>
Message-ID: <20101122163722.7e96d123@pitrou.net>

On Mon, 22 Nov 2010 15:19:04 +0000 Michael Foord wrote:
> On 22/11/2010 15:14, Nick Coghlan wrote:
> > On Mon, Nov 22, 2010 at 10:54 AM, Éric Araujo wrote:
> >>> +.. function:: getgeneratorstate(generator)
> >>> +
> >>> +    Get current state of a generator-iterator.
> >>> +
> >>> +    Possible states are:
> >>> +      GEN_CREATED: Waiting to start execution.
> >>> +      GEN_RUNNING: Currently being executed by the interpreter.
> >>> +      GEN_SUSPENDED: Currently suspended at a yield expression.
> >>> +      GEN_CLOSED: Execution has completed.
> >> I wonder if those shouldn't be marked up as :data: or something to make
> >> them indexed.
> > The same definitions are in the docstrings, and they're just integer
> > constants so I'm not sure why anyone would be looking them up
> > directly. Still, if someone with greater Sphinx-fu thinks additional
> > markup would be helpful, I have no problem with them adding it :)
> >
>
> Why not use string constants instead? You lose comparability (less than
> / greater than) but gain readability. Comparability may be a requirement
> - of course if Python had an Enum type we could use that and have both.

+1. The problem with int constants is that the int gets printed, not the name, when you dump them for debugging purposes :)

cheers

Antoine.
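For reference, a sketch of what the string-constant approach looks like from the caller's side. It assumes an interpreter where inspect.getgeneratorstate() returns the state names as strings, which is how the function behaves in released Python 3.2 and later; the countdown generator is a made-up example.

```python
import inspect


def countdown(n):
    while n:
        yield n
        n -= 1


gen = countdown(2)
states = [inspect.getgeneratorstate(gen)]      # before the first next()
next(gen)
states.append(inspect.getgeneratorstate(gen))  # paused at the yield
gen.close()
states.append(inspect.getgeneratorstate(gen))  # finished
```

Printing states shows the names directly, which is the debugging benefit mentioned above.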
From ncoghlan at gmail.com Mon Nov 22 16:45:28 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 23 Nov 2010 01:45:28 +1000
Subject: [Python-Dev] [Python-checkins] r86633 - in python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS
In-Reply-To: <4CEA89E8.5090107@voidspace.org.uk>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk>
Message-ID:

On Tue, Nov 23, 2010 at 1:19 AM, Michael Foord wrote:
> On 22/11/2010 15:14, Nick Coghlan wrote:
>> On Mon, Nov 22, 2010 at 10:54 AM, Éric Araujo wrote:
>>>> +    Possible states are:
>>>> +      GEN_CREATED: Waiting to start execution.
>>>> +      GEN_RUNNING: Currently being executed by the interpreter.
>>>> +      GEN_SUSPENDED: Currently suspended at a yield expression.
>>>> +      GEN_CLOSED: Execution has completed.
>>>
>>> I wonder if those shouldn't be marked up as :data: or something to make
>>> them indexed.
>>
>> The same definitions are in the docstrings, and they're just integer
>> constants so I'm not sure why anyone would be looking them up
>> directly. Still, if someone with greater Sphinx-fu thinks additional
>> markup would be helpful, I have no problem with them adding it :)
>>
>
> Why not use string constants instead? You lose comparability (less than /
> greater than) but gain readability. Comparability may be a requirement - of
> course if Python had an Enum type we could use that and have both.

With only 4 states, comparability isn't really necessary. I'm just so used to using the range() trick as a replacement for the lack of proper Enum type that using strings instead didn't even occur to me.

The lack of printability did bother me a bit, so yeah, +1 from me as well (I've reopened the relevant issue to remind me to change it before beta 1).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |
Brisbane, Australia From alexander.belopolsky at gmail.com Mon Nov 22 17:03:47 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 22 Nov 2010 11:03:47 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA5744.3080308@v.loewis.de> <4CEA6661.4080402@egenix.com> Message-ID: On Mon, Nov 22, 2010 at 10:37 AM, Nick Coghlan wrote: .. > *(The first Google hit for "ucs2" is the UTF-16/UCS-2 article on > Wikipedia, the first hit for "ucs4" is the UTF-32/UCS-4 article) > Do you think these articles are helpful for someone learning how to use chr() and ord() in Python for the first time? From hrvoje.niksic at avl.com Mon Nov 22 17:08:36 2010 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Mon, 22 Nov 2010 17:08:36 +0100 Subject: [Python-Dev] [Python-checkins] r86633 - in python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS In-Reply-To: <20101122163722.7e96d123@pitrou.net> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> Message-ID: <4CEA9584.7040301@avl.com> On 11/22/2010 04:37 PM, Antoine Pitrou wrote: > +1. The problem with int constants is that the int gets printed, not > the name, when you dump them for debugging purposes :) Well, it's trivial to subclass int to something with a nicer __repr__. 
PyGTK uses that technique for wrapping C enums: >>> gtk.PREVIEW_GRAYSCALE >>> isinstance(gtk.PREVIEW_GRAYSCALE, int) True >>> gtk.PREVIEW_GRAYSCALE + 0 1 From ncoghlan at gmail.com Mon Nov 22 17:13:39 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 23 Nov 2010 02:13:39 +1000 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA5744.3080308@v.loewis.de> <4CEA6661.4080402@egenix.com> Message-ID: On Tue, Nov 23, 2010 at 2:03 AM, Alexander Belopolsky wrote: > On Mon, Nov 22, 2010 at 10:37 AM, Nick Coghlan wrote: > .. >> *(The first Google hit for "ucs2" is the UTF-16/UCS-2 article on >> Wikipedia, the first hit for "ucs4" is the UTF-32/UCS-4 article) >> > > Do you think these articles are helpful for someone learning how to > use chr() and ord() in Python for the first time? No, that's what the documentation of chr() and ord() is for. For that use case, it doesn't matter *what* the terms are. They could say "in a FOO build this will do X, in a BAR build it will do Y, see for a detailed explanation of the differences between FOO and BAR builds of Python" and be perfectly adequate for the task. If there is no appropriate documentation link to point to (probably somewhere in the C API docs if it isn't anywhere else) then that is a key issue that needs to be fixed, rather than trying to change the terms that have been in use for the better part of a decade already. The raw meaning of UCS2/UCS4 mainly comes into the story when people are encountering this as a config option when building Python. 
The whole idea of changing the terms for the two build types *should* have been short circuited by the "status quo wins a stalemate" guideline, but apparently that didn't happen at the time. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From solipsis at pitrou.net Mon Nov 22 17:24:40 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 22 Nov 2010 17:24:40 +0100 Subject: [Python-Dev] [Python-checkins] r86633 - in python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> Message-ID: <20101122172440.77d27ed5@pitrou.net> On Mon, 22 Nov 2010 17:08:36 +0100 Hrvoje Niksic wrote: > On 11/22/2010 04:37 PM, Antoine Pitrou wrote: > > +1. The problem with int constants is that the int gets printed, not > > the name, when you dump them for debugging purposes :) > > Well, it's trivial to subclass int to something with a nicer __repr__. > PyGTK uses that technique for wrapping C enums: Nice. It might be useful to add a private _Constant class somewhere for stdlib purposes. Regards Antoine. From guido at python.org Mon Nov 22 17:33:57 2010 From: guido at python.org (Guido van Rossum) Date: Mon, 22 Nov 2010 08:33:57 -0800 Subject: [Python-Dev] is this a bug? no environment variables In-Reply-To: <4CEA0246.9080607@g.nevcal.com> References: <4CEA0246.9080607@g.nevcal.com> Message-ID: On Sun, Nov 21, 2010 at 9:40 PM, Glenn Linderman wrote: > In reviewing my notes from my experimentations with CGIHTTPServer > (Python2.6) and then http.server (Python 3.2a4), I note one behavior I > haven't reported as a bug, nor do I know where to start to figure it out, > other than experimentally. 
> > The experiment: launching CGIHTTPServer without environment variables, by > the simple expedient of using a batch file to unset all the existing > environment variables, and then launching Python2.6 with CGIHTTPServer. > > So it failed early: random.py fails at line 110 (Python 2.6). What specific traceback do you get? In my copy of the code that line says a = long(_hexlify(_urandom(16)), 16) and I could just imagine that _urandom() fails for some reason to do with the environment (it is a reference to os.urandom()), which, being part of the C library code, might depend on the environment. But you're not giving enough info to debug this. > I suppose it is possible that some environment variables are used by Python > directly (but I can't seem to find a documented list of them) although I > would expect that usage to be optional, with fall-back defaults when they > don't exist. That is certainly the idea, but the fallbacks may not always be nice. Environment variables used by Python or the stdlib itself are supposed to be named PYTHON if they are Python-specific, and there's a way to disable all of these (-E). But there are other environment variables (HOME and PATH come to mind) that have a broader definition and that are used in some part of the stdlib. Plus, as I mentioned, who knows what the non-Python C library uses (well, somebody probably knows, but I don't know of a central source that we can actually trust across the many platforms where Python runs). > I suppose it is even possible that some Windows APIs might > depend on some environment variables, but I expected that the registry had > replaced such usage completely, by now, with the environment variables > mostly being a convenience tool for batch files, or for optional, temporary > alteration of particular settings. That sounds like wishful thinking. 
:-) > If anyone knows of documentation listing what environment variables are > required by Python on Windows, I would appreciate a pointer, searches and > doc browsing having not turned it up. > > I'll attempt to recreate the test situation later this week with Python > 3.2a4, if no one responds, but the only debug technique I can think of is to > slowly remove environment variables until I find the minimum set required to > run http.server successfully for my tests with CGI files. -- --Guido van Rossum (python.org/~guido) From fuzzyman at voidspace.org.uk Mon Nov 22 17:58:56 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 22 Nov 2010 16:58:56 +0000 Subject: [Python-Dev] [Python-checkins] r86633 - in python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS In-Reply-To: <20101122172440.77d27ed5@pitrou.net> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> Message-ID: <4CEAA150.3020106@voidspace.org.uk> On 22/11/2010 16:24, Antoine Pitrou wrote: > On Mon, 22 Nov 2010 17:08:36 +0100 > Hrvoje Niksic wrote: >> On 11/22/2010 04:37 PM, Antoine Pitrou wrote: >>> +1. The problem with int constants is that the int gets printed, not >>> the name, when you dump them for debugging purposes :) >> Well, it's trivial to subclass int to something with a nicer __repr__. >> PyGTK uses that technique for wrapping C enums: > Nice. It might be useful to add a private _Constant class somewhere for > stdlib purposes. Why not just solve the problem properly and add it to the standard library... (Allowing for flag enums too that can be or'd together and still have a decent repr.) Michael > Regards > > Antoine. 
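The int-subclass idea discussed in this sub-thread can be sketched in a few lines. This is a hypothetical illustration, not PyGTK's implementation and not a proposed stdlib class; the name NamedInt and the repr format are invented here.

```python
class NamedInt(int):
    """An int that remembers a symbolic name and shows it in repr()."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __repr__(self):
        return "<%s: %d>" % (self._name, int(self))


GEN_CREATED = NamedInt("GEN_CREATED", 0)
GEN_RUNNING = NamedInt("GEN_RUNNING", 1)
```

Such a constant still compares and does arithmetic as a plain int, while repr() prints the name. The enum module that later entered the standard library (Python 3.4) covers the same ground, including int-compatible members.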
-- http://www.voidspace.org.uk/ From alexander.belopolsky at gmail.com Mon Nov 22 18:00:14 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 22 Nov 2010 12:00:14 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA5744.3080308@v.loewis.de> <4CEA6661.4080402@egenix.com> Message-ID: On Mon, Nov 22, 2010 at 11:13 AM, Nick Coghlan wrote: .. >> Do you think these articles are helpful for someone learning how to >> use chr() and ord() in Python for the first time? > > No, that's what the documentation of chr() and ord() is for. For that > use case, it doesn't matter *what* the terms are. I recently updated chr() and ord() documentation and used "narrow/wide" terms.
I thought UCS2/4 proponents objected to that on the basis that these terms are imprecise.

http://docs.python.org/dev/library/functions.html#chr
http://docs.python.org/dev/library/functions.html#ord

> They could say "in a
> FOO build this will do X, in a BAR build it will do Y, see for
> a detailed explanation of the differences between FOO and BAR builds
> of Python" and be perfectly adequate for the task. If there is no
> appropriate documentation link to point to (probably somewhere in the
> C API docs if it isn't anywhere else) then that is a key issue that
> needs to be fixed, rather than trying to change the terms that have
> been in use for the better part of a decade already.
>

That's the point that I was trying to make. Using somewhat vague narrow/wide terms gives us an opportunity to describe exactly what is going on without confusing the reader with the intricacies of the Unicode Standard or Python's compliance with a particular version of it.

> The raw meaning of UCS2/UCS4 mainly comes into the story when people
> are encountering this as a config option when building Python. The
> whole idea of changing the terms for the two build types *should* have
> been short circuited by the "status quo wins a stalemate" guideline,
> but apparently that didn't happen at the time.
>

It also comes up in the "Data model" reference section on strings, which is currently out of date:

"""
Strings
The items of a string object are Unicode code units. A Unicode code unit is represented by a string object of one item and can hold either a 16-bit or 32-bit value representing a Unicode ordinal (the maximum value for the ordinal is given in sys.maxunicode, and depends on how Python is configured at compile time). Surrogate pairs may be present in the Unicode object, and will be reported as two separate items. The built-in functions chr() and ord() convert between code units and nonnegative integers representing the Unicode ordinals as defined in the Unicode Standard 3.0.
Conversion from and to other encodings are possible through the string method encode(). """ http://docs.python.org/dev/reference/datamodel.html The out of date part is the reference to the Unicode Standard 3.0. I don't think we should refer to a specific version of Unicode here. It has little consequence for the "Python data model" and AFAICT does not come into play anywhere except unicodedata which is currently at version 6.0. The description of chr() and ord() is also not accurate on narrow builds and neither is the statement "The items of a string object are Unicode code units." From exarkun at twistedmatrix.com Mon Nov 22 17:46:54 2010 From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com) Date: Mon, 22 Nov 2010 16:46:54 -0000 Subject: [Python-Dev] [Python-checkins] r86633 - in python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS In-Reply-To: <20101122172440.77d27ed5@pitrou.net> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> Message-ID: <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> On 04:24 pm, solipsis at pitrou.net wrote: >On Mon, 22 Nov 2010 17:08:36 +0100 >Hrvoje Niksic wrote: >>On 11/22/2010 04:37 PM, Antoine Pitrou wrote: >> > +1. The problem with int constants is that the int gets printed, >>not >> > the name, when you dump them for debugging purposes :) >> >>Well, it's trivial to subclass int to something with a nicer __repr__. >>PyGTK uses that technique for wrapping C enums: > >Nice. It might be useful to add a private _Constant class somewhere for >stdlib purposes. http://www.python.org/dev/peps/pep-0354/ >Regards > >Antoine.
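For readers following along, the "subclass int with a nicer __repr__" idea quoted above could look roughly like the sketch below. This is only an illustration: the class name NamedInt and the constant name are made up here, and it is neither the PyGTK code nor the _Constant class being discussed.

```python
class NamedInt(int):
    """An int that remembers a symbolic name for debugging output.

    Hypothetical helper, not part of the stdlib or PyGTK.
    """

    def __new__(cls, value, name):
        # int is immutable, so the value has to be set in __new__.
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __repr__(self):
        # Show the symbolic name instead of the bare number.
        return self._name


# Example constant; the name is illustrative only.
CO_EXAMPLE_FLAG = NamedInt(4, "CO_EXAMPLE_FLAG")
```

The object still behaves like a plain int in arithmetic and comparisons; only the repr changes, which is what helps when dumping values while debugging.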
> > >_______________________________________________ >Python-Dev mailing list >Python-Dev at python.org >http://mail.python.org/mailman/listinfo/python-dev >Unsubscribe: http://mail.python.org/mailman/options/python- >dev/exarkun%40twistedmatrix.com From ezio.melotti at gmail.com Mon Nov 22 18:14:03 2010 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Mon, 22 Nov 2010 19:14:03 +0200 Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest Message-ID: <4CEAA4DB.6020904@gmail.com> I would like to re-enable by default warnings for regrtest and/or unittest. The reasons are: 1) these tools are used mainly by developers and they (should) care about warnings; 2) developers won't have to remember that warning are silenced and how to enable them manually; 3) developers won't have to enable them manually every time they run the tests; 4) some developers are not even aware that warnings have been silenced and might not notice things like DeprecationWarnings until the function/method/class/etc gets removed and breaks their code; 5) another developer tool -- the --with-pydebug flag -- already re-enables warnings when it's used; If this is fixed in unittest it won't be necessary to patch regrtest. If it's fixed in regrtest only the core developers will benefit from this. This could be fixed checking if any warning flags (-Wx) are passed to python. If no flags are passed the default will be -Wd, otherwise the behavior will be the one specified by the flag. This will allow developers to use `python -Wi` to ignore errors explicitly. Best Regards, Ezio Melotti From rdmurray at bitdance.com Mon Nov 22 18:30:29 2010 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 22 Nov 2010 12:30:29 -0500 Subject: [Python-Dev] len(chr(i)) = 2? 
In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA5744.3080308@v.loewis.de> <4CEA6661.4080402@egenix.com> Message-ID: <20101122173029.CB5AA235E1E@kimball.webabinitio.net> On Mon, 22 Nov 2010 12:00:14 -0500, Alexander Belopolsky wrote: > I recently updated chr() and ord() documentation and used > "narrow/wide" terms. I thought USC2/4 proponents objected to that on > the basis that these terms are imprecise. For reference, a grep in py3k/Doc reveals that there are currently exactly 23 lines mentioning UCS2 or UCS4 in the docs. Most are in the unicode part of the c-api, and 6 are in what's new for 2.2: c-api/arg.rst: Convert a null-terminated buffer of Unicode (UCS-2 or UCS-4) data to a Python c-api/arg.rst: Convert a Unicode (UCS-2 or UCS-4) data buffer and its length to a Python c-api/unicode.rst: for :c:type:`Py_UNICODE` and store Unicode values internally as UCS2. It is also c-api/unicode.rst: possible to build a UCS4 version of Python (most recent Linux distributions come c-api/unicode.rst: with UCS4 builds of Python). These builds then use a 32-bit type for c-api/unicode.rst: :c:type:`Py_UNICODE` and store Unicode data internally as UCS4. On platforms c-api/unicode.rst: short` (UCS2) or :c:type:`unsigned long` (UCS4). c-api/unicode.rst:Note that UCS2 and UCS4 Python builds are not binary compatible. Please keep c-api/unicode.rst: values is interpreted as an UCS-2 character. whatsnew/2.2.rst:usually stored as UCS-2, as 16-bit unsigned integers. 
Python 2.2 can also be whatsnew/2.2.rst:compiled to use UCS-4, 32-bit unsigned integers, as its internal encoding by whatsnew/2.2.rst:supplying :option:`--enable-unicode=ucs4` to the configure script. (It's also whatsnew/2.2.rst:When built to use UCS-4 (a "wide Python"), the interpreter can natively handle whatsnew/2.2.rst:compiled to use UCS-2 (a "narrow Python"), values greater than 65535 will still whatsnew/2.2.rst:Marc-André Lemburg. The changes to support using UCS-4 internally were howto/unicode.rst:.. comment Additional topic: building Python w/ UCS2 or UCS4 support howto/unicode.rst: - [ ] Building Python (UCS2, UCS4) library/sys.rst: characters are stored as UCS-2 or UCS-4. library/json.rst: specified. Encodings that are not ASCII based (such as UCS-2) are not faq/extending.rst:When importing module X, why do I get "undefined symbol: PyUnicodeUCS2*"? faq/extending.rst:If instead the name of the undefined symbol starts with ``PyUnicodeUCS4``, the faq/extending.rst: ... print('UCS4 build') faq/extending.rst: ... print('UCS2 build') -- R. David Murray www.bitdance.com From lukasz at langa.pl Mon Nov 22 18:35:16 2010 From: lukasz at langa.pl (Łukasz Langa) Date: Mon, 22 Nov 2010 18:35:16 +0100 Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest In-Reply-To: <4CEAA4DB.6020904@gmail.com> References: <4CEAA4DB.6020904@gmail.com> Message-ID: <4CEAA9D4.2020904@langa.pl> On 22.11.2010 18:14, Ezio Melotti wrote: > I would like to re-enable by default warnings for regrtest and/or > unittest. +1 Especially in regrtest it could help manage stdlib quality (currently we have a horde of ResourceWarnings, zipfile mostly). I would even be +1 on making warnings errors for regrtest but that seems to be unpopular on #python-dev.
Best regards, Łukasz Langa From alexander.belopolsky at gmail.com Mon Nov 22 18:37:59 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 22 Nov 2010 12:37:59 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <20101122173029.CB5AA235E1E@kimball.webabinitio.net> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA5744.3080308@v.loewis.de> <4CEA6661.4080402@egenix.com> <20101122173029.CB5AA235E1E@kimball.webabinitio.net> Message-ID: On Mon, Nov 22, 2010 at 12:30 PM, R. David Murray wrote: .. > For reference, a grep in py3k/Doc reveals that there are currently exactly > 23 lines mentioning UCS2 or UCS4 in the docs. Did you grep for USC-2 and USC-4 as well? I have to admit that my aversion to these terms is mostly due to the fact that I don't know how to spell them correctly. :-) From tjreedy at udel.edu Mon Nov 22 18:41:46 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 22 Nov 2010 12:41:46 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 11/22/2010 5:48 AM, Stephen J. Turnbull wrote: > I disagree.
I do see a problem with "UCS-2", because it fails to tell > us that Python implements a large number of features that make it easy > to do a very good job of working with non-BMP data in 16-bit builds of Yes. As I read the standard, UCS-2 is limited to BMP chars. So I was a bit confused when Python was described as UCS-2, until I realized that the term was inaccurate. Using that term punishes people like me who take the time to read the standard or otherwise learn what the term means. What Python does might be called USC-2+ or UCS-2e (xtended). -- Terry Jan Reedy From fuzzyman at voidspace.org.uk Mon Nov 22 18:45:58 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 22 Nov 2010 17:45:58 +0000 Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest In-Reply-To: <4CEAA9D4.2020904@langa.pl> References: <4CEAA4DB.6020904@gmail.com> <4CEAA9D4.2020904@langa.pl> Message-ID: <4CEAAC56.2090702@voidspace.org.uk> On 22/11/2010 17:35, Łukasz Langa wrote: > On 22.11.2010 18:14, Ezio Melotti wrote: >> I would like to re-enable by default warnings for regrtest and/or >> unittest. > > +1 > > Especially in regrtest it could help manage stdlib quality (currently > we have a horde of ResourceWarnings, zipfile mostly). I would even be > +1 on making warnings errors for regrtest but that seems to be > unpopular on #python-dev. > Enabling it for regrtest makes sense. For unittest I still think it is a choice that should be left to developers. Michael > Best regards, > Łukasz Langa > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY.
By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From raymond.hettinger at gmail.com Mon Nov 22 19:13:30 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 22 Nov 2010 10:13:30 -0800 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Nov 22, 2010, at 2:48 AM, Stephen J. Turnbull wrote: > Raymond Hettinger writes: > >> Neither UTF-16 nor UCS-2 is exactly correct anyway. > > From a standards lawyer point of view, UCS-2 is exactly correct, You're twisting yourself into definitional knots. Any explanation we give users needs to let them know two things: * that we cover the entire range of unicode not just BMP * that sometimes len(chr(i)) is one and sometimes two The term UCS-2 is a complete communications failure in that regard. If someone looks up the term, they will immediately see something like the wikipedia entry which says, "UCS-2 cannot represent code points outside the BMP". How is that helpful?
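To make the "sometimes one and sometimes two" point concrete: on a 16-bit build a code point above U+FFFF is stored as a UTF-16 surrogate pair, so chr(i) has two items. The sketch below shows the standard UTF-16 arithmetic only; it is not Python's internal code, and the helper name to_surrogates is invented for this illustration.

```python
def to_surrogates(code_point):
    """Return the UTF-16 code units for a code point as a tuple of ints.

    Hypothetical helper for illustration; BMP code points yield one
    unit, code points above U+FFFF yield a (high, low) surrogate pair.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if code_point < 0x10000:
        return (code_point,)
    offset = code_point - 0x10000       # 20 significant bits remain
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return (high, low)
```

The length of the returned tuple corresponds to what len(chr(i)) reports on a 16-bit build, while a 32-bit build always stores a single item.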
Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Mon Nov 22 19:29:33 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 22 Nov 2010 10:29:33 -0800 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Nov 22, 2010, at 9:41 AM, Terry Reedy wrote: > On 11/22/2010 5:48 AM, Stephen J. Turnbull wrote: > >> I disagree. I do see a problem with "UCS-2", because it fails to tell >> us that Python implements a large number of features that make it easy >> to do a very good job of working with non-BMP data in 16-bit builds of > > Yes. As I read the standard, UCS-2 is limited to BMP chars. So I was a bit confused when Python was described as UCS-2, until I realized that the term was inaccurate. Using that term punishes people like me who take the time to read the standard or otherwise learn what the term means. Bingo! Thanks for the excellent summary of the problem. > > What Python does might be called USC-2+ or UCS-2e (xtended). That would be a step in the right direction. Raymond From jcea at jcea.es Mon Nov 22 19:34:49 2010 From: jcea at jcea.es (Jesus Cea) Date: Mon, 22 Nov 2010 19:34:49 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling Message-ID: <4CEAB7C9.7020504@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 A Solaris installation contains ALWAYS 32 and 64 bits libraries. So in any Solaris you can run 32/64 bits programs, and compile in 32 and 64 bits. 
For this, libraries are stores in "/usr/lib", for instance, for 32 bits, while the same 64 bits libraries are stored in "/usr/lib/64". Currently, python do not considerate this. We have Solaris 10 buildslaves, but they compile in 32 bits, aparently. For instance . We now have 32 and 64 bits OpenIndiana buildslaves, so we can actually check this. They were deployed yesterday. Apparently the changes would be pretty simple, adding ".../64" to library paths, to try to find the extra libraries. What do you think?. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . _/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOq3yZlgi5GaxT1NAQLQhAP9G2liX+YveYmfYDOuVjWWS8PE7r2wM/XA 5rik9mJM4Z7/wDnY4wrWjG5l3B9sSyrhhNI1YmIcXm4klfYxV9xTkG9dMNL+2bVc +s98rlTdjNlMVTf8Xc7U3tMpdkG/JK0+XWmRfWsf52ATdtxPHazI9L6KvqdYjNuZ 2w3dXNXErZE= =oYXo -----END PGP SIGNATURE----- From mal at egenix.com Mon Nov 22 19:53:00 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 22 Nov 2010 19:53:00 +0100 Subject: [Python-Dev] len(chr(i)) = 2? 
In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CEABC0C.4080909@egenix.com> Raymond Hettinger wrote: > Any explanation we give users needs to let them know two things: > * that we cover the entire range of unicode not just BMP > * that sometimes len(chr(i)) is one and sometimes two > > The term UCS-2 is a complete communications failure > in that regard. If someone looks up the term, they will > immediately see something like the wikipedia entry which says, > "UCS-2 cannot represent code points outside the BMP". > How is that helpful? It's very helpful, since it explains why a UCS-2 build of Python requires a surrogates pair to represent a non-BMP code point and explains why chr(i) gives you a length 2 string rather than a length 1 string. A UCS-4 build does not need to use surrogates for this, hence you get a length 1 string from chr(i). There are two levels we have to explain to users: 1. the transfer level 2. the storage level The UTF encodings address the transfer level and is what you deal with in I/O. These provide variable length encodings of the complete Unicode code point range, regardless of whether you have a UCS-2 or a UCS-4 build. The storage level becomes important if you want to work on strings using indexing and slicing. Here you do have to know whether you're dealing with a UCS-2 or a UCS-4 build, since the indexes will vary if you're using non-BMP code points. 
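The difference between the two levels can be checked from Python itself. The sketch below is an illustration only, with made-up helper names: it counts code points through the UTF-32 encoding and 16-bit code units through the UTF-16 encoding. It assumes the string contains no lone surrogates (those would make encode() raise).

```python
def count_code_points(text):
    """Number of Unicode code points, counted via the UTF-32 encoding."""
    return len(text.encode("utf-32-le")) // 4


def count_utf16_code_units(text):
    """Number of 16-bit code units the text needs when stored as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


# LATIN SMALL LETTER A followed by MUSICAL SYMBOL G CLEF (a non-BMP code point)
sample = "a\U0001D11E"
```

For this sample the two counts differ (2 code points, 3 code units). On a 16-bit build len(sample) would be expected to match the code unit count, on a 32-bit build the code point count, which is exactly why indexes shift when non-BMP code points are present.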
Finally, to tie both together, we have to explain that UTF-16 (the transfer encoding) maps to UCS-2 in a straight-forward way, so it is possible to work with a UCS-2 build of Python and still use the complete Unicode code point range - you only have to take into consideration, that Python's string indexing will not necessarily point you to n-th code point in a string, but may well give you half or a surrogate. Note that while that last aspect may appear like a good argument for UCS-4 builds, in reality it is not. UCS-4 has the same issue on a different level: the letters that get printed on the screen or printer (graphemes) may well be made up of multiple combining code points, e.g. an "e" and an "?". Those again map to two indexes in the Python string, even though, the appear to be one character on output. Now try to explain all of the above using the terms "narrow" and "wide" (while remembering "explicit is better than implicit" and "avoid the temptation to guess") :-) It is not really helpful to replace a correct and accurate term with a fuzzy term: either way we're stuck with the semantics. However, the correct and accurate terms at least give you a chance to figure out and understand the reasoning behind the design. UCS-2 vs. UCS-4 is a trade-off, "narrow" and "wide" is marketing talk with an implicit emphasis on one side :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 22 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ezio.melotti at gmail.com Mon Nov 22 19:58:33 2010 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Mon, 22 Nov 2010 20:58:33 +0200 Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest In-Reply-To: <4CEAAC56.2090702@voidspace.org.uk> References: <4CEAA4DB.6020904@gmail.com> <4CEAA9D4.2020904@langa.pl> <4CEAAC56.2090702@voidspace.org.uk> Message-ID: <4CEABD59.6080005@gmail.com> On 22/11/2010 19.45, Michael Foord wrote: > On 22/11/2010 17:35, Łukasz Langa wrote: >> On 22.11.2010 18:14, Ezio Melotti wrote: >>> I would like to re-enable by default warnings for regrtest and/or >>> unittest. >> >> +1 >> >> Especially in regrtest it could help manage stdlib quality (currently >> we have a horde of ResourceWarnings, zipfile mostly). I would even be >> +1 on making warnings errors for regrtest but that seems to be >> unpopular on #python-dev. >> As I said on IRC I think it makes sense to turn them into errors once we have fixed/silenced all the ones that we have now. That would help keep the number of warnings at 0. > > Enabling it for regrtest makes sense. For unittest I still think it is > a choice that should be left to developers. If we consider that most of the developers want to see them, I'd prefer to have the warnings by default rather than having to use -Wd explicitly every time I run the tests (keep in mind that many developers out there don't even know/remember that now they should use -Wd).
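The check proposed earlier in this thread (behave like -Wd unless the user passed a -W flag) could be sketched as follows. This is a minimal illustration, not the patch that was or will be committed to regrtest or unittest; the function name enable_default_warnings is an assumption made for the example. It relies only on sys.warnoptions and warnings.simplefilter.

```python
import sys
import warnings


def enable_default_warnings(warnoptions=None):
    """Switch to the 'default' warning filter unless -W flags were given.

    Hypothetical helper: `warnoptions` defaults to sys.warnoptions, the
    list of -W arguments passed to the interpreter. Returns True if the
    filter was changed, False if the user's own flags were respected.
    """
    if warnoptions is None:
        warnoptions = sys.warnoptions
    if warnoptions:
        # The user chose explicitly (e.g. python -Wi), so leave it alone.
        return False
    warnings.simplefilter("default")
    return True
```

A test runner would call this once at startup, so that `python -Wi` still silences everything while a plain `python` run shows each warning once per location.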
> > Michael > >> Best regards, >> ?ukasz Langa >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk > > From alexander.belopolsky at gmail.com Mon Nov 22 20:09:14 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 22 Nov 2010 14:09:14 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Mon, Nov 22, 2010 at 12:41 PM, Terry Reedy wrote: .. > What Python does might be called USC-2+ or UCS-2e (xtended). > Wow! I am not the only one who can't get the order of letters right in these acronyms. (I am usually consistent within one sentence, though.) :-) I-can't-spell-three-letter-acronyms-right-ly yours ... From brett at python.org Mon Nov 22 20:12:26 2010 From: brett at python.org (Brett Cannon) Date: Mon, 22 Nov 2010 11:12:26 -0800 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEAB7C9.7020504@jcea.es> References: <4CEAB7C9.7020504@jcea.es> Message-ID: On Mon, Nov 22, 2010 at 10:34, Jesus Cea wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > A Solaris installation contains ALWAYS 32 and 64 bits libraries. So in > any Solaris you can run 32/64 bits programs, and compile in 32 and 64 bits. > > For this, libraries are stores in "/usr/lib", for instance, for 32 bits, > while the same 64 bits libraries are stored in "/usr/lib/64". > > Currently, python do not considerate this. 
> > We have Solaris 10 buildslaves, but they compile in 32 bits, aparently. > For instance > . > > We now have 32 and 64 bits OpenIndiana buildslaves, so we can actually > check this. They were deployed yesterday. > > Apparently the changes would be pretty simple, adding ".../64" to > library paths, to try to find the extra libraries. > > What do you think?. Are you asking about buildbots only or as a general policy? If you are asking about the buildbots then I definitely think we should use 64 bits. If you are asking about policy I would say it should be an option in case people are using C extensions that are not designed to work with 64 bits. From brett at python.org Mon Nov 22 20:24:34 2010 From: brett at python.org (Brett Cannon) Date: Mon, 22 Nov 2010 11:24:34 -0800 Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest In-Reply-To: <4CEABD59.6080005@gmail.com> References: <4CEAA4DB.6020904@gmail.com> <4CEAA9D4.2020904@langa.pl> <4CEAAC56.2090702@voidspace.org.uk> <4CEABD59.6080005@gmail.com> Message-ID: On Mon, Nov 22, 2010 at 10:58, Ezio Melotti wrote: > On 22/11/2010 19.45, Michael Foord wrote: >> >> On 22/11/2010 17:35, ?ukasz Langa wrote: >>> >>> Am 22.11.2010 18:14, schrieb Ezio Melotti: >>>> >>>> I would like to re-enable by default warnings for regrtest and/or >>>> unittest. >>> >>> +1 >>> >>> Especially in regrtest it could help manage stdlib quality (currently we >>> have a horde of ResourceWarnings, zipfile mostly). I would even be +1 on >>> making warnings errors for regrtest but that seems to be unpopular on >>> #python-dev. >>> > > As I said on IRC I think it makes sense to turn them into errors once we > fixed/silenced all the ones that we have now. That would help keeping the > number of warning to 0. I agree. > >> >> Enabling it for regrtest makes sense. For unittest I still think it is a >> choice that should be left to developers. 
> > If we consider that most of the developers want to see them, I'd prefer to > have the warnings by default rather than having to use -Wd explicitly every > time I run the tests (keep in mind that many developers out there don't even > know/remember that now they should use -Wd). The problem with that is it means developers who switch to Python 3.2 or whatever are suddenly going to have their tests fail until they update their code to turn the warnings off. Then again, if we make the switch for this dead simple to add and backwards-compatible so that turning them off doesn't trigger an error in older versions then I am all for turning warnings on by default. Another approach is to have unittest's runner, when run in verbose mode, print out what the warnings filter is set to so developers are aware that they are silencing warnings. -Brett > > >> >> Michael >> >>> Best regards, >>> ?ukasz Langa >>> _______________________________________________ >>> Python-Dev mailing list >>> Python-Dev at python.org >>> http://mail.python.org/mailman/listinfo/python-dev >>> Unsubscribe: >>> http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk >> >> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > From jcea at jcea.es Mon Nov 22 20:26:40 2010 From: jcea at jcea.es (Jesus Cea) Date: Mon, 22 Nov 2010 20:26:40 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: References: <4CEAB7C9.7020504@jcea.es> Message-ID: <4CEAC3F0.4040806@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 22/11/10 20:12, Brett Cannon wrote: > Are you asking about buildbots only or as a general policy? If you are > asking about the buildbots then I definitely think we should use 64 > bits. 
If you are asking about policy I would say it should be an > option in case people are using C extensions that are not designed to > work with 64 bits. The point is that building python in 64 bits under Solaris (family) is not easy, because the 64 bits libraries (zlib, openssl, berkeley db, curses, etc., etc., etc) are not is "/usr/lib", "/usr/local/lib", etc., but "/usr/lib/64", "/usr/local/lib/64", etc. Solaris overcomes most of the issue having separate library searchpath in 32 and 64 bits (via the "crle" command). But in some cases python try to find some library in "/usr/local/lib", and my point is that it should search TOO inside "/usr/local/lib/64". - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . _/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOrD8Jlgi5GaxT1NAQJhRQP/dd4q70eXsq5AUFrleqUx3A+AagChpCcp UDHAomaX26cMl0tLFwLOd4SaKizzRMvjdTJc3GhZDIqYrF3QuqZAyLPjr5tyogP8 /4KPM73l5L2cb7IdHdSHpruwMh8f2WJ4S6+ig8DzOj6qBcttXKMymrV/skum4ENJ yb4mbpH9q/0= =Oe2G -----END PGP SIGNATURE----- From barry at python.org Mon Nov 22 20:28:43 2010 From: barry at python.org (Barry Warsaw) Date: Mon, 22 Nov 2010 14:28:43 -0500 Subject: [Python-Dev] issue 9807 - abiflags in paths and symlinks (updated patch) In-Reply-To: <20101110162719.11ae7fe6@mission> References: <20101110162719.11ae7fe6@mission> Message-ID: <20101122142843.45ae45ae@mission> On Nov 10, 2010, at 04:27 PM, Barry Warsaw wrote: >I finally found a chance to address all the outstanding technical issues >mentioned in bug 9807: > > http://bugs.python.org/issue9807 > >I've uploaded a new patch which contains the rest of the changes 
I'm >proposing. I think we still need consensus about whether these changes are >good to commit. With 3.2b1 coming soon, now's the time to do that. > >If there are any remaining concerns about the details of the patch, please add >them to the tracker issue. If you have any remaining objections to the >change, please let me know or follow up here. The patch has now been updated to address the last few comments in the tracker issue. I am now ready to commit it to py3k. If there are any remaining objections or concerns, please reply here or update the tracker issue. Otherwise, I plan to commit this to py3k on Wednesday. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From martin at v.loewis.de Mon Nov 22 20:42:16 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 22 Nov 2010 20:42:16 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEAC3F0.4040806@jcea.es> References: <4CEAB7C9.7020504@jcea.es> <4CEAC3F0.4040806@jcea.es> Message-ID: <4CEAC798.5050707@v.loewis.de> > Solaris overcomes most of the issue having separate library searchpath > in 32 and 64 bits (via the "crle" command). But in some cases python try > to find some library in "/usr/local/lib", and my point is that it should > search TOO inside "/usr/local/lib/64". I don't think this will work. If the linker finds a library of the wrong ELF type, then it will choke. Before enabling anything on a build slave, a patch needs to be contributed to make it work in the first place. Regards, Martin From rdmurray at bitdance.com Mon Nov 22 20:50:14 2010 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 22 Nov 2010 14:50:14 -0500 Subject: [Python-Dev] len(chr(i)) = 2? 
In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA5744.3080308@v.loewis.de> <4CEA6661.4080402@egenix.com> <20101122173029.CB5AA235E1E@kimball.webabinitio.net> Message-ID: <20101122195014.B3D9A235C94@kimball.webabinitio.net> On Mon, 22 Nov 2010 12:37:59 -0500, Alexander Belopolsky wrote: > On Mon, Nov 22, 2010 at 12:30 PM, R. David Murray wrote: > .. > > For reference, a grep in py3k/Doc reveals that there are currently exactly > > 23 lines mentioning UCS2 or UCS4 in the docs. > > Did you grep for USC-2 and USC-4 as well? I have to admit that my > aversion to these terms is mostly due to the fact that I don't know > how to spell them correctly. :-) I grepped using "-ri ucs." and eliminated the false positives (of which there were only a few) by hand. -- R. David Murray www.bitdance.com From guido at python.org Mon Nov 22 22:08:57 2010 From: guido at python.org (Guido van Rossum) Date: Mon, 22 Nov 2010 13:08:57 -0800 Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest In-Reply-To: References: <4CEAA4DB.6020904@gmail.com> <4CEAA9D4.2020904@langa.pl> <4CEAAC56.2090702@voidspace.org.uk> <4CEABD59.6080005@gmail.com> Message-ID: On Mon, Nov 22, 2010 at 11:24 AM, Brett Cannon wrote: > The problem with that is it means developers who switch to Python 3.2 > or whatever are suddenly going to have their tests fail until they > update their code to turn the warnings off. That sounds like a feature to me... 
:-) -- --Guido van Rossum (python.org/~guido) From jcea at jcea.es Mon Nov 22 22:31:21 2010 From: jcea at jcea.es (Jesus Cea) Date: Mon, 22 Nov 2010 22:31:21 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEAC798.5050707@v.loewis.de> References: <4CEAB7C9.7020504@jcea.es> <4CEAC3F0.4040806@jcea.es> <4CEAC798.5050707@v.loewis.de> Message-ID: <4CEAE129.2060505@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 22/11/10 20:42, "Martin v. L?wis" wrote: > Before enabling anything on a build slave, a patch needs to be > contributed to make it work in the first place. I actually agree. I am not sure yet, but I am thinking that adding a "--build-64" parameter to "configure" could be an option under Solaris. Most OSs (let say, Linux) force you to choose 32/64 bits at install time, but Solaris can use both at the same time, and compilers allow to compile both (using -m32 or -m64). Since choosing 32 or 64 bits when compiling python under Solaris change the requirement, paths, etc., automating it should be a goal. PS: Martin, is there any reason to restrict the solaris 10 buildslaves to 32 bits, beside the said problems?. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . 
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOrhKZlgi5GaxT1NAQI0cAP+OUFGVDd7UV6MdHzMenBn8fO3h4M1n0dR UZrVyYJhUYvEX9p7MRBdDNFY/6LrUITb3WCVegD3PuOymQP16GgksRfIA/jGDXyl Fe+Ed5amlDgdVPeVVH/55OodrO4SuOrJZ846G6GB1wav2IjR7I9YGxZQ6PA0LR7l 4Iph6HfcMlw= =hTNy -----END PGP SIGNATURE----- From v+python at g.nevcal.com Mon Nov 22 22:54:47 2010 From: v+python at g.nevcal.com (Glenn Linderman) Date: Mon, 22 Nov 2010 13:54:47 -0800 Subject: [Python-Dev] is this a bug? no environment variables In-Reply-To: References: <4CEA0246.9080607@g.nevcal.com> Message-ID: <4CEAE6A7.3010902@g.nevcal.com> On 11/22/2010 8:33 AM, Guido van Rossum wrote: > On Sun, Nov 21, 2010 at 9:40 PM, Glenn Linderman wrote: >> In reviewing my notes from my experimentations with CGIHTTPServer >> (Python2.6) and then http.server (Python 3.2a4), I note one behavior I >> haven't reported as a bug, nor do I know where to start to figure it out, >> other than experimentally. >> >> The experiment: launching CGIHTTPServer without environment variables, by >> the simple expedient of using a batch file to unset all the existing >> environment variables, and then launching Python2.6 with CGIHTTPServer. >> >> So it failed early: random.py fails at line 110 (Python 2.6). > What specific traceback do you get? In my copy of the code that line says > > a = long(_hexlify(_urandom(16)), 16) > > and I could just imagine that _urandom() fails for some reason to do > with the environment (it is a reference to os.urandom()), which, being > part of the C library code, might depend on the environment. > > But you're not giving enough info to debug this. Yep, that's the line. 
I'll have to re-run the scenario, but will do it on 3.2a4, hopefully tonight or tomorrow, to get the traceback. >> I suppose it is possible that some environment variables are used by Python >> directly (but I can't seem to find a documented list of them) although I >> would expect that usage to be optional, with fall-back defaults when they >> don't exist. > That is certainly the idea, but the fallbacks may not always be nice. > > Environment variables used by Python or the stdlib itself are supposed > to be named PYTHON if they are Python-specific, and there's > a way to disable all of these (-E). But there are other environment > variables (HOME and PATH come to mind) that have a broader definition > and that are used in some part of the stdlib. Plus, as I mentioned, > who knows what the non-Python C library uses (well, somebody probably > knows, but I don't know of a central source that we can actually trust > across the many platforms where Python runs). OK, thanks for the philosophy statement. That's what I didn't know, being new. >> I suppose it is even possible that some Windows APIs might >> depend on some environment variables, but I expected that the registry had >> replaced such usage completely, by now, with the environment variables >> mostly being a convenience tool for batch files, or for optional, temporary >> alteration of particular settings. > That sounds like wishful thinking. :-) Well, wishful thinking from me regarding the Windows and the registry is that Windows would be better off without a registry. But it seemed like their direction was instead to do away with environment variables, but in any case, I have little idea if they've achieved it, but should have achieved something in 6.1 versions of Windows! -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fuzzyman at voidspace.org.uk Mon Nov 22 23:01:12 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 22 Nov 2010 22:01:12 +0000 Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest In-Reply-To: References: <4CEAA4DB.6020904@gmail.com> <4CEAA9D4.2020904@langa.pl> <4CEAAC56.2090702@voidspace.org.uk> <4CEABD59.6080005@gmail.com> Message-ID: <4CEAE828.5000801@voidspace.org.uk> On 22/11/2010 21:08, Guido van Rossum wrote: > On Mon, Nov 22, 2010 at 11:24 AM, Brett Cannon wrote: >> The problem with that is it means developers who switch to Python 3.2 >> or whatever are suddenly going to have their tests fail until they >> update their code to turn the warnings off. > That sounds like a feature to me... :-) > I think Ezio was suggesting just turning warnings on by default when unittest is run, not turning them into errors. Ezio is suggesting that developers could explicitly turn warnings off again, but when you use the default test runner warnings would be shown. His logic is that warnings are for developers, and so are tests... Michael -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. 
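Editorial sketch, not part of the original thread: the distinction drawn in the message above, between showing warnings and turning them into errors, can be illustrated with the standard `warnings` filters. The names `legacy_api` and `run_with_filter` are hypothetical and exist only for this illustration; the thread itself does not propose this code.

```python
import warnings


def legacy_api():
    """Hypothetical function that emits a DeprecationWarning."""
    warnings.warn("legacy_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


def run_with_filter(action):
    """Call legacy_api() under the given warnings filter action.

    Returns a tuple (result, recorded_warnings, raised_exception).
    """
    with warnings.catch_warnings(record=True) as caught:
        # "always" shows the warning, "ignore" hides it,
        # "error" turns it into a raised exception.
        warnings.simplefilter(action)
        try:
            result = legacy_api()
        except DeprecationWarning as exc:
            return None, caught, exc
        return result, caught, None
```

Under "always" the call still succeeds and the warning is merely recorded, which matches the "shown by default" behaviour being discussed; under "error" the same call fails, which is the stricter option debated for regrtest. Current `unittest` releases also accept a `warnings` argument on `unittest.main()` and `TextTestRunner` for selecting the filter.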
From martin at v.loewis.de Mon Nov 22 23:05:40 2010 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 22 Nov 2010 23:05:40 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEAE129.2060505@jcea.es> References: <4CEAB7C9.7020504@jcea.es> <4CEAC3F0.4040806@jcea.es> <4CEAC798.5050707@v.loewis.de> <4CEAE129.2060505@jcea.es> Message-ID: <4CEAE934.9000106@v.loewis.de> > I actually agree. I am not sure yet, but I am thinking that adding a > "--build-64" parameter to "configure" could be an option under Solaris. > Most OSs (let say, Linux) force you to choose 32/64 bits at install > time Actually, that's not at all the case. Most systems these days support 32-bit and 64-bit applications simultaneously, and also support compiler tool chains that allow building for either mode. Solaris, Linux, and Windows are about on-par in this respect; OS X is more advanced as it allows to have a single binary that supports both 32-bit and 64-bit execution (making the need for adjusted path names irrelevant). > Since choosing 32 or 64 bits when compiling python under Solaris change > the requirement, paths, etc., automating it should be a goal. > > PS: Martin, is there any reason to restrict the solaris 10 buildslaves > to 32 bits, beside the said problems?. I don't see that as a restriction. I have to make a choice, and there are sooo many choices to make: - gcc vs. SunPRO - 32-bit vs. 64-bit - GNU make vs. /usr/ccs/bin/make I picked the combination which was most easy to setup, and is therefore likely to be used by most users (except for those who think 64-bit is somehow "better" than 32-bit, when it is actually the other way 'round - IMO). As for configuration, I personally prefer that setting CC indicates what type of build you want. Set CC to "gcc -m64" to indicate a 64-build. Ideally, you will *not* have to adjust library paths, since the other compiler will know on its own where to search things. 
Regards, Martin From nad at acm.org Mon Nov 22 23:12:05 2010 From: nad at acm.org (Ned Deily) Date: Mon, 22 Nov 2010 14:12:05 -0800 Subject: [Python-Dev] Solaris family and 64 bits compiling References: <4CEAB7C9.7020504@jcea.es> <4CEAC3F0.4040806@jcea.es> <4CEAC798.5050707@v.loewis.de> <4CEAE129.2060505@jcea.es> Message-ID: In article <4CEAE129.2060505 at jcea.es>, Jesus Cea wrote: > On 22/11/10 20:42, "Martin v. Löwis" wrote: > > Before enabling anything on a build slave, a patch needs to be > > contributed to make it work in the first place. > > I actually agree. I am not sure yet, but I am thinking that adding a > "--build-64" parameter to "configure" could be an option under Solaris. > Most OSs (let say, Linux) force you to choose 32/64 bits at install > time, but Solaris can use both at the same time, and compilers allow to > compile both (using -m32 or -m64). > > Since choosing 32 or 64 bits when compiling python under Solaris change > the requirement, paths, etc., automating it should be a goal. You might want to look at the existing --with-universal-archs=ARCH in configure for how this is done for OS X builds. It's probably both simpler and more complicated than would be needed elsewhere: on OS X, a single file can contain object code for multiple architectures, e.g. 32-bit and 64-bit, rather than having to have multiple files. 
-- Ned Deily, nad at acm.org From brett at python.org Mon Nov 22 23:20:21 2010 From: brett at python.org (Brett Cannon) Date: Mon, 22 Nov 2010 14:20:21 -0800 Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest In-Reply-To: References: <4CEAA4DB.6020904@gmail.com> <4CEAA9D4.2020904@langa.pl> <4CEAAC56.2090702@voidspace.org.uk> <4CEABD59.6080005@gmail.com> Message-ID: On Mon, Nov 22, 2010 at 13:08, Guido van Rossum wrote: > On Mon, Nov 22, 2010 at 11:24 AM, Brett Cannon wrote: >> The problem with that is it means developers who switch to Python 3.2 >> or whatever are suddenly going to have their tests fail until they >> update their code to turn the warnings off. > > That sounds like a feature to me... :-) =) I meant update their tests with the switch to turn off the warnings, not update to make the warnings properly disappear. I guess it's a question of whether it will be errors by default or simply output the warning. I can get behind printing the warnings by default and adding a switch to make them errors or off otherwise. -Brett > > -- > --Guido van Rossum (python.org/~guido) > From anurag.chourasia at gmail.com Mon Nov 22 23:46:16 2010 From: anurag.chourasia at gmail.com (Anurag Chourasia) Date: Tue, 23 Nov 2010 04:16:16 +0530 Subject: [Python-Dev] Missing Python Symbols when Starting Python App (Apache/Django/Mod_Wsgi) Message-ID: All, I have a problem in starting my Python(Django) App using Apache and Mod_Wsgi I am using Django 1.2.3 and Python 2.6.6 running on Apache 2.2.17 with Mod_Wsgi 3.3 When I try to access the app from Web Browser, I am getting these errors. [Mon Nov 22 09:45:25 2010] [notice] Apache/2.2.17 (Unix) mod_wsgi/3.3 Python/2.6.6 configured -- resuming normal operations [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] mod_wsgi (pid=1273874): Target WSGI script '/u01/home/apli/wm/app/gdd/pyserver/ apache/django.wsgi' cannot be loaded as Python module. 
[Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] mod_wsgi (pid=1273874): Exception occurred processing WSGI script '/u01/home/ apli/wm/app/gdd/pyserver/apache/django.wsgi'. [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] Traceback (most recent call last): [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] File "/u01/ home/apli/wm/app/gdd/pyserver/apache/django.wsgi", line 19, in [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] import django.core.handlers.wsgi [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] File "/usr/ local/lib/python2.6/site-packages/django/core/handlers/wsgi.py", line 1, in [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] from threading import Lock [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] File "/usr/ local/lib/python2.6/threading.py", line 13, in [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] from functools import wraps [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] File "/usr/ local/lib/python2.6/functools.py", line 10, in [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] from _functools import partial, reduce [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] ImportError: rtld: 0712-001 Symbol PyArg_UnpackTuple was referenced [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] from module /usr/local/lib/python2.6/lib-dynload/_functools.so(), but a runtime definition [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] of the symbol was not found. [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] rtld: 0712-001 Symbol PyCallable_Check was referenced [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] from module /usr/local/lib/python2.6/lib-dynload/_functools.so(), but a runtime definition [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] of the symbol was not found. 
[Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] rtld: 0712-001 Symbol PyDict_Copy was referenced [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] from module /usr/local/lib/python2.6/lib-dynload/_functools.so(), but a runtime definition [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] of the symbol was not found. [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] rtld: 0712-001 Symbol PyDict_Merge was referenced [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] from module /usr/local/lib/python2.6/lib-dynload/_functools.so(), but a runtime definition [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] of the symbol was not found. [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] rtld: 0712-001 Symbol PyDict_New was referenced [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] from module /usr/local/lib/python2.6/lib-dynload/_functools.so(), but a runtime definition [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] of the symbol was not found. [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] rtld: 0712-001 Symbol PyErr_Occurred was referenced [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] from module /usr/local/lib/python2.6/lib-dynload/_functools.so(), but a runtime definition [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] of the symbol was not found. [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] rtld: 0712-001 Symbol PyErr_SetString was referenced [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] from module /usr/local/lib/python2.6/lib-dynload/_functools.so(), but a runtime definition [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] of the symbol was not found. [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] \t0509-021 Additional errors occurred but are not reported. I assume that those missing runtime definitions are supposed to be in the Python executable. Doing an nm on the first missing symbol reveals that it does exist. 
root [zibal]% nm /usr/local/bin/python | grep -i PyArg_UnpackTuple .PyArg_UnpackTuple T 268683204 524 PyArg_UnpackTuple D 537073500 PyArg_UnpackTuple d 537073500 12 PyArg_UnpackTuple:F-1 - 224 Please guide. Regards, Guddu From merwok at netwok.org Mon Nov 22 23:51:18 2010 From: merwok at netwok.org (Éric Araujo) Date: Mon, 22 Nov 2010 23:51:18 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEAB7C9.7020504@jcea.es> References: <4CEAB7C9.7020504@jcea.es> Message-ID: <4CEAF3E6.4080602@netwok.org> Hi, I think this bug is related: http://bugs.python.org/issue1294959 "Problems with /usr/lib64 builds." Regards From tlesher at gmail.com Mon Nov 22 23:56:25 2010 From: tlesher at gmail.com (Tim Lesher) Date: Mon, 22 Nov 2010 17:56:25 -0500 Subject: [Python-Dev] is this a bug? no environment variables In-Reply-To: <4CEAE6A7.3010902@g.nevcal.com> References: <4CEA0246.9080607@g.nevcal.com> <4CEAE6A7.3010902@g.nevcal.com> Message-ID: On Mon, Nov 22, 2010 at 16:54, Glenn Linderman wrote: > I suppose it is possible that some environment variables are used by Python > directly (but I can't seem to find a documented list of them) although I > would expect that usage to be optional, with fall-back defaults when they > don't exist. I can verify that that's the case: Python (at least through 3.1.2) runs fine on Windows platforms when environment variables are completely unavailable. I know that from running our port for Windows CE (which has no environment variables at all), cross-compiled for Windows XP. 
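Editorial sketch, not part of the original thread: the "optional, with fall-back defaults" behaviour described above is the pattern library code can follow when it reads the environment. The helper name `get_setting` and the example variable names are hypothetical; this is only a minimal illustration of the idea, assuming a missing or empty variable should both fall back.

```python
import os


def get_setting(name, default, environ=None):
    """Return environ[name] if it is set and non-empty, otherwise default.

    `environ` defaults to os.environ; passing a plain dict makes the
    helper easy to exercise with an empty environment.
    """
    env = os.environ if environ is None else environ
    value = env.get(name)
    return value if value else default
```

For example, `get_setting("HOME", "/tmp", environ={})` falls back to the default instead of raising `KeyError`, which is the behaviour one would want when a process is launched with every variable unset.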
-- Tim Lesher From martin at v.loewis.de Tue Nov 23 00:16:47 2010 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 23 Nov 2010 00:16:47 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEAF3E6.4080602@netwok.org> References: <4CEAB7C9.7020504@jcea.es> <4CEAF3E6.4080602@netwok.org> Message-ID: <4CEAF9DF.6070509@v.loewis.de> On 22.11.2010 23:51, Éric Araujo wrote: > Hi, > > I think this bug is related: http://bugs.python.org/issue1294959 > "Problems with /usr/lib64 builds." Perhaps more closely related: http://bugs.python.org/issue847812 http://bugs.python.org/issue1733484 http://bugs.python.org/issue1676121 http://bugs.python.org/issue1628484 Regards, Martin From jcea at jcea.es Tue Nov 23 00:41:19 2010 From: jcea at jcea.es (Jesus Cea) Date: Tue, 23 Nov 2010 00:41:19 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEAE934.9000106@v.loewis.de> References: <4CEAB7C9.7020504@jcea.es> <4CEAC3F0.4040806@jcea.es> <4CEAC798.5050707@v.loewis.de> <4CEAE129.2060505@jcea.es> <4CEAE934.9000106@v.loewis.de> Message-ID: <4CEAFF9F.5070503@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 22/11/10 23:05, "Martin v. Löwis" wrote: >> PS: Martin, is there any reason to restrict the solaris 10 buildslaves >> to 32 bits, beside the said problems?. > > I don't see that as a restriction. I have to make a choice, and there > are sooo many choices to make: > - gcc vs. SunPRO > - 32-bit vs. 64-bit > - GNU make vs. /usr/ccs/bin/make > > I picked the combination which was most easy to setup, and is therefore > likely to be used by most users (except for those who think 64-bit > is somehow "better" than 32-bit, when it is actually the other way > 'round - IMO). Do not think this is a personal attack. Not at all. I am deploying 32 and 64 bits buildslaves (in the same machine) and feeling the pain. You are far more experienced than me with buildbots and python. 
I want to know if I am missing something. > As for configuration, I personally prefer that setting CC indicates > what type of build you want. Set CC to "gcc -m64" to indicate a > 64-build. Ideally, you will *not* have to adjust library paths, since > the other compiler will know on its own where to search things. The problem is not with system library paths. Compilers overcome that. The problem is with things like "/usr/local/lib" and hardcoded library paths in Python. For example, checking : """ gcc -shared -m64 build/temp.solaris-2.11-i86pc-3.2-pydebug/export/home/buildbot/64bits/3.x.cea-indiana-amd64/build/Modules/readline.o - -L/usr/lib/termcap -L/usr/local/lib -lreadline -lncursesw -o build/lib.solaris-2.11-i86pc-3.2-pydebug/readline.so ld: fatal: file /usr/local/lib/libncursesw.so: wrong ELF class: ELFCLASS32 ld: fatal: file processing errors. No output written to build/lib.solaris-2.11-i86pc-3.2-pydebug/readline.so collect2: ld returned 1 exit status """ The "-L/usr/local/lib" should be "-L/usr/local/lib/64". An example of many. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . 
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOr/n5lgi5GaxT1NAQLzogP/Sb2VMe7UwK/YeB8/cQSxhuoKeNRre0pZ XCJDePusysqI3uXBHmH8vitEIILmUKd5kQ6vsFwErPIry7ikl2fbDHe7eQgNr2HK o5Xcul36bqtuKWGkDV+gIyBH/m9k4pkvc7Lfp3mvR7yiYTBB75V/azt64XSTC9si 7QjjetX5wnA= =NCtE -----END PGP SIGNATURE----- From benjamin at python.org Tue Nov 23 00:47:16 2010 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 22 Nov 2010 17:47:16 -0600 Subject: [Python-Dev] [Python-checkins] r86699 - python/branches/py3k/Lib/zipfile.py In-Reply-To: <20101122233126.C8BDBEE981@mail.python.org> References: <20101122233126.C8BDBEE981@mail.python.org> Message-ID: No test? 2010/11/22 lukasz.langa : > Author: lukasz.langa > Date: Tue Nov 23 00:31:26 2010 > New Revision: 86699 > > Log: > Issue #9846: ZipExtFile provides no mechanism for closing the underlying file object > > > > Modified: > ? python/branches/py3k/Lib/zipfile.py > > Modified: python/branches/py3k/Lib/zipfile.py > ============================================================================== > --- python/branches/py3k/Lib/zipfile.py (original) > +++ python/branches/py3k/Lib/zipfile.py Tue Nov 23 00:31:26 2010 > @@ -473,9 +473,11 @@ > ? ? # Search for universal newlines or line chunks. > ? ? PATTERN = re.compile(br'^(?P[^\r\n]+)|(?P\n|\r\n?)') > > - ? ?def __init__(self, fileobj, mode, zipinfo, decrypter=None): > + ? ?def __init__(self, fileobj, mode, zipinfo, decrypter=None, > + ? ? ? ? ? ? ? ? close_fileobj=False): > ? ? ? ? self._fileobj = fileobj > ? ? ? ? self._decrypter = decrypter > + ? ? ? ?self._close_fileobj = close_fileobj > > ? ? ? ? self._compress_type = zipinfo.compress_type > ? ? ? ? 
self._compress_size = zipinfo.compress_size > @@ -647,6 +649,12 @@ > ? ? ? ? self._offset += len(data) > ? ? ? ? return data > > + ? ?def close(self): > + ? ? ? ?try: > + ? ? ? ? ? ?if self._close_fileobj: > + ? ? ? ? ? ? ? ?self._fileobj.close() > + ? ? ? ?finally: > + ? ? ? ? ? ?super().close() > > > ?class ZipFile: > @@ -889,8 +897,10 @@ > ? ? ? ? # given a file object in the constructor > ? ? ? ? if self._filePassed: > ? ? ? ? ? ? zef_file = self.fp > + ? ? ? ? ? ?should_close = False > ? ? ? ? else: > ? ? ? ? ? ? zef_file = io.open(self.filename, 'rb') > + ? ? ? ? ? ?should_close = True > > ? ? ? ? # Make sure we have an info object > ? ? ? ? if isinstance(name, ZipInfo): > @@ -944,7 +954,7 @@ > ? ? ? ? ? ? if h[11] != check_byte: > ? ? ? ? ? ? ? ? raise RuntimeError("Bad password for file", name) > > - ? ? ? ?return ?ZipExtFile(zef_file, mode, zinfo, zd) > + ? ? ? ?return ?ZipExtFile(zef_file, mode, zinfo, zd, close_fileobj=should_close) > > ? ? def extract(self, member, path=None, pwd=None): > ? ? ? ? """Extract a member from the archive to the current working directory, > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > -- Regards, Benjamin From jcea at jcea.es Tue Nov 23 00:48:06 2010 From: jcea at jcea.es (Jesus Cea) Date: Tue, 23 Nov 2010 00:48:06 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEAE934.9000106@v.loewis.de> References: <4CEAB7C9.7020504@jcea.es> <4CEAC3F0.4040806@jcea.es> <4CEAC798.5050707@v.loewis.de> <4CEAE129.2060505@jcea.es> <4CEAE934.9000106@v.loewis.de> Message-ID: <4CEB0136.9050602@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I think this is probably trivial, but is there any foolproof way to detect 64 bit builds in python, beside "sys.maxint"?. And any macro useable for conditional compilation in C?. Checking Solaris 10 header files, I see macros like "_LP64". 
Portability would be nice, but in this personal case, probably unneeded... - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . _/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOsBNplgi5GaxT1NAQLkJwP+P1YyABBPGInHJXvwsU2ZLuj+u/OuZCRE m6hmbZgMajAyc5NtTie36qyHKAtVBcxFFvUdDeyfDZXV5gU+dF9Ha7/R16dclG3k b5W0CbccnGFcQJ/XypNPjH2dYPFDiqF8kCkDfeLJ7ZyL9ojA1YlRGFrgswN77/cF XM7Cwq1mh5k= =JXDq -----END PGP SIGNATURE----- From tjreedy at udel.edu Tue Nov 23 00:58:03 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 22 Nov 2010 18:58:03 -0500 Subject: [Python-Dev] Missing Python Symbols when Starting Python App (Apache/Django/Mod_Wsgi) In-Reply-To: References: Message-ID: On 11/22/2010 5:46 PM, Anurag Chourasia wrote: > > [Mon Nov 22 09:45:43 2010] [error] [client 108.10.0.191] mod_wsgi > (pid=1273874): Target WSGI script '/u01/home/apli/wm/app/gdd/pyserver/ > apache/django.wsgi' cannot be loaded as Python module. All the other errors probably stem from this. > Please guide. Ask usage questions like this on python-list or a django-specific list. python-dev is for discussion of development of future versions of Python, not usage of current versions. 
-- Terry Jan Reedy From martin at v.loewis.de Tue Nov 23 01:05:59 2010 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 23 Nov 2010 01:05:59 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEAFF9F.5070503@jcea.es> References: <4CEAB7C9.7020504@jcea.es> <4CEAC3F0.4040806@jcea.es> <4CEAC798.5050707@v.loewis.de> <4CEAE129.2060505@jcea.es> <4CEAE934.9000106@v.loewis.de> <4CEAFF9F.5070503@jcea.es> Message-ID: <4CEB0567.8040500@v.loewis.de> On 23.11.2010 00:41, Jesus Cea wrote: > On 22/11/10 23:05, "Martin v. Löwis" wrote: >>> PS: Martin, is there any reason to restrict the solaris 10 buildslaves >>> to 32 bits, beside the said problems?. > >> I don't see that as a restriction. I have to make a choice, and there >> are sooo many choices to make: >> - gcc vs. SunPRO >> - 32-bit vs. 64-bit >> - GNU make vs. /usr/ccs/bin/make > >> I picked the combination which was most easy to setup, and is therefore >> likely to be used by most users (except for those who think 64-bit >> is somehow "better" than 32-bit, when it is actually the other way >> 'round - IMO). > > Do not think this is a personal attack. No offense taken. If you really want to know the historical background: this was the very first build slave (before I actually announced it to python-dev), and I haven't changed much from the initial setup. I just point out that none of the binaries in /usr/bin is a 64-bit binary; this includes the Sun-provided /usr/sfw/bin/python. > The "-L/usr/local/lib" should be "-L/usr/local/lib/64". An example of many. Is that really the case? I.e. will ncurses automatically install into /usr/local/lib/64 if built with a 64-bit compiler? My installation doesn't even have a /usr/local/lib/64 folder. In any case: this shouldn't need a configure option. Instead, Python can find out for itself whether it's a 64-bit build, and make the modifications it considers necessary. 
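Editorial sketch, not part of the original thread: one way a running interpreter can determine its own pointer width, as the paragraph above suggests it could. The function names are hypothetical, and using `struct.calcsize("P")` and `sys.maxsize` here is an illustrative choice rather than what the build scripts actually do.

```python
import struct
import sys


def pointer_bits():
    """Return the pointer width of the running interpreter, in bits."""
    # "P" is the struct format code for a native void* pointer.
    return struct.calcsize("P") * 8


def is_64bit_build():
    """Return True when the interpreter was built with 64-bit pointers."""
    # sys.maxsize is the largest Py_ssize_t value, which tracks pointer size.
    return sys.maxsize > 2**32
```

A build script could branch on `is_64bit_build()` to decide whether a directory such as `/usr/local/lib/64` should be added to the library search path; whether that directory exists on a given system is a separate question, as the message notes.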
Regards, Martin From solipsis at pitrou.net Tue Nov 23 01:06:12 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 23 Nov 2010 01:06:12 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling References: <4CEAB7C9.7020504@jcea.es> <4CEAC3F0.4040806@jcea.es> <4CEAC798.5050707@v.loewis.de> <4CEAE129.2060505@jcea.es> <4CEAE934.9000106@v.loewis.de> <4CEB0136.9050602@jcea.es> Message-ID: <20101123010612.119d401c@pitrou.net> On Tue, 23 Nov 2010 00:48:06 +0100 Jesus Cea wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > I think this is probably trivial, but is there any foolproof way to > detect 64 bit builds in python, beside "sys.maxint"?. sys.maxsize > And any macro useable for conditional compilation in C?. SIZEOF_VOID_P > 4 From brian.curtin at gmail.com Tue Nov 23 01:06:33 2010 From: brian.curtin at gmail.com (Brian Curtin) Date: Mon, 22 Nov 2010 18:06:33 -0600 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEB0136.9050602@jcea.es> References: <4CEAB7C9.7020504@jcea.es> <4CEAC3F0.4040806@jcea.es> <4CEAC798.5050707@v.loewis.de> <4CEAE129.2060505@jcea.es> <4CEAE934.9000106@v.loewis.de> <4CEB0136.9050602@jcea.es> Message-ID: On Mon, Nov 22, 2010 at 17:48, Jesus Cea wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > I think this is probably trivial, but is there any foolproof way to > detect 64 bit builds in python, beside "sys.maxint"?. > import platform platform.architecture() -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From martin at v.loewis.de Tue Nov 23 01:12:16 2010 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 23 Nov 2010 01:12:16 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEB0136.9050602@jcea.es> References: <4CEAB7C9.7020504@jcea.es> <4CEAC3F0.4040806@jcea.es> <4CEAC798.5050707@v.loewis.de> <4CEAE129.2060505@jcea.es> <4CEAE934.9000106@v.loewis.de> <4CEB0136.9050602@jcea.es> Message-ID: <4CEB06E0.1080204@v.loewis.de> On 23.11.2010 00:48, Jesus Cea wrote: > I think this is probably trivial, but is there any foolproof way to > detect 64 bit builds in python, beside "sys.maxint"?. The canonical way is to use platform.architecture(). > And any macro useable for conditional compilation in C?. You need to be more specific than that. There are perhaps ten independent properties you may query, depending on what precise problem you try to solve. Most likely, you are looking for SIZEOF_VOID_P (but don't use that unless you literally want to know how many bytes a pointer uses, or whether it uses 4 or 8 bytes). Regards, Martin From lukasz at langa.pl Tue Nov 23 01:25:01 2010 From: lukasz at langa.pl (Łukasz Langa) Date: Tue, 23 Nov 2010 01:25:01 +0100 Subject: [Python-Dev] [Python-checkins] r86699 - python/branches/py3k/Lib/zipfile.py In-Reply-To: References: <20101122233126.C8BDBEE981@mail.python.org> Message-ID: <66720F75-169A-4702-AF53-69845701AA55@langa.pl> On 2010-11-23, at 00:47, Benjamin Peterson wrote: > No test? > The tests were there already, raising ResourceWarnings. After this change, they stopped doing that. 
You may say: now they pass for the first time :) Best regards, ?ukasz > 2010/11/22 lukasz.langa : >> Author: lukasz.langa >> Date: Tue Nov 23 00:31:26 2010 >> New Revision: 86699 >> >> Log: >> Issue #9846: ZipExtFile provides no mechanism for closing the underlying file object >> >> >> >> Modified: >> python/branches/py3k/Lib/zipfile.py >> >> Modified: python/branches/py3k/Lib/zipfile.py >> ============================================================================== >> --- python/branches/py3k/Lib/zipfile.py (original) >> +++ python/branches/py3k/Lib/zipfile.py Tue Nov 23 00:31:26 2010 >> @@ -473,9 +473,11 @@ >> # Search for universal newlines or line chunks. >> PATTERN = re.compile(br'^(?P[^\r\n]+)|(?P\n|\r\n?)') >> >> - def __init__(self, fileobj, mode, zipinfo, decrypter=None): >> + def __init__(self, fileobj, mode, zipinfo, decrypter=None, >> + close_fileobj=False): >> self._fileobj = fileobj >> self._decrypter = decrypter >> + self._close_fileobj = close_fileobj >> >> self._compress_type = zipinfo.compress_type >> self._compress_size = zipinfo.compress_size >> @@ -647,6 +649,12 @@ >> self._offset += len(data) >> return data >> >> + def close(self): >> + try: >> + if self._close_fileobj: >> + self._fileobj.close() >> + finally: >> + super().close() >> >> >> class ZipFile: >> @@ -889,8 +897,10 @@ >> # given a file object in the constructor >> if self._filePassed: >> zef_file = self.fp >> + should_close = False >> else: >> zef_file = io.open(self.filename, 'rb') >> + should_close = True >> >> # Make sure we have an info object >> if isinstance(name, ZipInfo): >> @@ -944,7 +954,7 @@ >> if h[11] != check_byte: >> raise RuntimeError("Bad password for file", name) >> >> - return ZipExtFile(zef_file, mode, zinfo, zd) >> + return ZipExtFile(zef_file, mode, zinfo, zd, close_fileobj=should_close) >> >> def extract(self, member, path=None, pwd=None): >> """Extract a member from the archive to the current working directory, >> 
_______________________________________________ >> Python-checkins mailing list >> Python-checkins at python.org >> http://mail.python.org/mailman/listinfo/python-checkins >> > > > > -- > Regards, > Benjamin > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins -- Pozdrawiam serdecznie, ?ukasz Langa tel. +48 791 080 144 WWW http://lukasz.langa.pl/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From reinout at vanrees.org Mon Nov 22 23:52:10 2010 From: reinout at vanrees.org (Reinout van Rees) Date: Mon, 22 Nov 2010 23:52:10 +0100 Subject: [Python-Dev] Missing Python Symbols when Starting Python App (Apache/Django/Mod_Wsgi) In-Reply-To: References: Message-ID: On 11/22/2010 11:46 PM, Anurag Chourasia wrote: > > I have a problem in starting my Python(Django) App using Apache and Mod_Wsgi I'm pretty sure you're asking on the wrong list. This one is for discussing development of python-the-language :-) You'd better head over to the django-user mailinglist, for instance via http://groups.google.com/group/django-users Reinout -- Reinout van Rees - reinout at vanrees.org - http://reinout.vanrees.org Collega's gezocht! Django/python vacature in Utrecht: http://tinyurl.com/35v34f9 From lukasz at langa.pl Tue Nov 23 01:43:21 2010 From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=) Date: Tue, 23 Nov 2010 01:43:21 +0100 Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest In-Reply-To: <4CEAE828.5000801@voidspace.org.uk> References: <4CEAA4DB.6020904@gmail.com> <4CEAA9D4.2020904@langa.pl> <4CEAAC56.2090702@voidspace.org.uk> <4CEABD59.6080005@gmail.com> <4CEAE828.5000801@voidspace.org.uk> Message-ID: Wiadomo?? napisana przez Michael Foord w dniu 2010-11-22, o godz. 
23:01: > On 22/11/2010 21:08, Guido van Rossum wrote: >> On Mon, Nov 22, 2010 at 11:24 AM, Brett Cannon wrote: >>> The problem with that is it means developers who switch to Python 3.2 >>> or whatever are suddenly going to have their tests fail until they >>> update their code to turn the warnings off. >> That sounds like a feature to me... :-) >> > I think Ezio was suggesting just turning warnings on by default when unittest is run, not turning them into errors. Ezio is suggesting that developers could explicitly turn warnings off again, but when you use the default test runner warnings would be shown. His logic is that warnings are for developers, and so are tests... Then again, he is not against the idea to turn those warnings into errors, at least for regrtest. If you agree to do that for regrtest I will clean up the tests for warnings. Already did that for zipfile so it doesn't raise ResourceWarnings anymore. I just need to correct multiprocessing and xmlrpc ResourceWarnings, silence some DeprecationWarnings in the tests and we're all set. Ah, I see a couple more with -uall but nothing scary. Anyway, I find warnings as errors in regrtest a welcome feature. Let's make it happen :) -- Best regards, ?ukasz Langa tel. +48 791 080 144 WWW http://lukasz.langa.pl/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcea at jcea.es Tue Nov 23 01:47:01 2010 From: jcea at jcea.es (Jesus Cea) Date: Tue, 23 Nov 2010 01:47:01 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEB0567.8040500@v.loewis.de> References: <4CEAB7C9.7020504@jcea.es> <4CEAC3F0.4040806@jcea.es> <4CEAC798.5050707@v.loewis.de> <4CEAE129.2060505@jcea.es> <4CEAE934.9000106@v.loewis.de> <4CEAFF9F.5070503@jcea.es> <4CEB0567.8040500@v.loewis.de> Message-ID: <4CEB0F05.1040700@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 23/11/10 01:05, "Martin v. L?wis" wrote: > No offense taken. 
If you really want to know the historical background: > this was the very first build slave (before I actually announced it to > python-dev), and I haven't changed much from the initial setup. I do really want to know. I love trivia :-). Thanks. > I just point out that none of the binaries in /usr/bin is a 64-bit > binary; this includes the Sun-provided /usr/sfw/bin/python > >> The "-L/usr/local/lib" should be "-L/usr/local/lib/64". An example of many. > > Is that really the case? I.e. will ncurses automatically install into > /usr/local/lib/64 if built with a 64-bit compiler? My installation > doesn't even have a /usr/local/lib/64 folder. A fresh Solaris 10 install doesn't even have a "/usr/local" directory :). Sadly today most Open Source code is written like if Linux were the only Unix system out there. I was amazed that OpenSSL 1.0 installs automatically in "/usr/local/ssl/lib" when compiled in 32 bits, and in "/usr/local/ssl/lib/64" when compiled in 64 bits. I almost cry. > In any case: this shouldn't need a configure option. Instead, Python can > find out itself whether it's a 64-bit build, and make modifications > it considers necessary. I agree. Python should detect it automatically and update the paths when compiling. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . 
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOsPBZlgi5GaxT1NAQIw+QP/ZuxpWo2WZYUUcDfARRnOtp60n4PbIGMf fqQ4ZnC9JnelzKDU9kBo0yReL2zYAw0ZwezsGwZ98M9i3XyKkFCtcJcM1vXpIsDL eBwga8kPDpab5loP/vuac5kVC0wn0Z0z8x+BRMW6mwoOMHJzd463E8GTQywdx3x1 06FUHwJ0Hv4= =PV43 -----END PGP SIGNATURE----- From jcea at jcea.es Tue Nov 23 01:58:46 2010 From: jcea at jcea.es (Jesus Cea) Date: Tue, 23 Nov 2010 01:58:46 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEB0567.8040500@v.loewis.de> References: <4CEAB7C9.7020504@jcea.es> <4CEAC3F0.4040806@jcea.es> <4CEAC798.5050707@v.loewis.de> <4CEAE129.2060505@jcea.es> <4CEAE934.9000106@v.loewis.de> <4CEAFF9F.5070503@jcea.es> <4CEB0567.8040500@v.loewis.de> Message-ID: <4CEB11C6.1010504@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 23/11/10 01:05, "Martin v. L?wis" wrote: > I just point out that none of the binaries in /usr/bin is a 64-bit > binary; this includes the Sun-provided /usr/sfw/bin/python True. This is for simplicity reasons (provide only one binary valid for 32 and 64 bits CPUs) and because 64 bits is overkill for a lot of stuff. In my own system my only 64 bits libraries are OpenSSL, GMP, and some multimedia stuff like mencoder, vorbis, etc, where the difference is big. And the GCC 4.5.x install, that installs libraries (fortran, stdc++, objective C, etc) automatically under "/usr/local/lib/64". GOOD. But if we say the Python can be compiled as 64 bits under Solaris, would be nice if that was actually true. Now that we have a buildbot (under OpenIndiana) to test, it is doable. If not, we could say that Solaris+64 bits is unsupported. I don't think we should go that way. 
Solaris+64 bits should be a full citizen. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . _/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOsRxplgi5GaxT1NAQKqqAP/fkiPpnPswMYOWc30Bflg3nDqRf6ih1bW ZZYHEMuJN9C8rm419LnRtoTyeAruHQYJ3o/dAoA2xDZu1xDYz8OOJKzG1L8hRVce OGm9TmziS4zuwWS4sYdmh21/ZCuD0MVq3gqD1h8zYPwrqbTTA6shYr6/He5hAo6j 5PsYWj4gIAE= =Rr80 -----END PGP SIGNATURE----- From benjamin at python.org Tue Nov 23 05:00:08 2010 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 22 Nov 2010 22:00:08 -0600 Subject: [Python-Dev] [Python-checkins] r86699 - python/branches/py3k/Lib/zipfile.py In-Reply-To: <66720F75-169A-4702-AF53-69845701AA55@langa.pl> References: <20101122233126.C8BDBEE981@mail.python.org> <66720F75-169A-4702-AF53-69845701AA55@langa.pl> Message-ID: 2010/11/22 ?ukasz Langa : > Wiadomo?? napisana przez Benjamin Peterson w dniu 2010-11-23, o godz. 00:47: > > No test? > > > The tests were there already, raising ResourceWarnings. After this change, > they stopped doing that. You may say: now they pass for the first time :) It looks like you added new API, though. For that, we would expect new tests. -- Regards, Benjamin From ocean-city at m2.ccsnet.ne.jp Tue Nov 23 05:13:38 2010 From: ocean-city at m2.ccsnet.ne.jp (Hirokazu Yamamoto) Date: Tue, 23 Nov 2010 13:13:38 +0900 Subject: [Python-Dev] OpenSSL Voluntarily (openssl-1.0.0a) Message-ID: <4CEB3F72.7000006@m2.ccsnet.ne.jp> Hello. Does this affect python? Thank you. 
http://www.openssl.org/news/secadv_20101116.txt From glyph at twistedmatrix.com Tue Nov 23 06:07:09 2010 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Tue, 23 Nov 2010 00:07:09 -0500 Subject: [Python-Dev] OpenSSL Voluntarily (openssl-1.0.0a) In-Reply-To: <4CEB3F72.7000006@m2.ccsnet.ne.jp> References: <4CEB3F72.7000006@m2.ccsnet.ne.jp> Message-ID: On Mon, Nov 22, 2010 at 11:13 PM, Hirokazu Yamamoto < ocean-city at m2.ccsnet.ne.jp> wrote: > Hello. Does this affect python? Thank you. > > http://www.openssl.org/news/secadv_20101116.txt > No. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Tue Nov 23 07:13:44 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 23 Nov 2010 01:13:44 -0500 Subject: [Python-Dev] [Python-checkins] r86702 - python/branches/py3k/Lib/idlelib/IOBinding.py In-Reply-To: <20101123060131.EB345EE9C0@mail.python.org> References: <20101123060131.EB345EE9C0@mail.python.org> Message-ID: <4CEB5B98.6070003@udel.edu> On 11/23/2010 1:01 AM, terry.reedy wrote: > Author: terry.reedy > Date: Tue Nov 23 07:01:31 2010 > New Revision: 86702 > > Log: Issue 9222 Fix filetypes for open dialog Sorry, forgot to add this before clicking [go] or whatever the button is. Is there any way to revise a revision ;-? 
> Modified: > python/branches/py3k/Lib/idlelib/IOBinding.py > > Modified: python/branches/py3k/Lib/idlelib/IOBinding.py > ============================================================================== > --- python/branches/py3k/Lib/idlelib/IOBinding.py (original) > +++ python/branches/py3k/Lib/idlelib/IOBinding.py Tue Nov 23 07:01:31 2010 > @@ -476,8 +476,8 @@ > savedialog = None > > filetypes = [ > - ("Python and text files", "*.py *.pyw *.txt", "TEXT"), > - ("All text files", "*", "TEXT"), > + ("Python files", "*.py *.pyw", "TEXT"), > + ("Text files", "*.txt", "TEXT"), > ("All files", "*"), > ] From orsenthil at gmail.com Tue Nov 23 07:16:12 2010 From: orsenthil at gmail.com (Senthil Kumaran) Date: Tue, 23 Nov 2010 14:16:12 +0800 Subject: [Python-Dev] [Python-checkins] r86703 - python/branches/release31-maint/Lib/idlelib/IOBinding.py In-Reply-To: <20101123060705.0651CEE9C0@mail.python.org> References: <20101123060705.0651CEE9C0@mail.python.org> Message-ID: Hi Terry, On Tue, Nov 23, 2010 at 2:07 PM, terry.reedy wrote: > Author: terry.reedy > Date: Tue Nov 23 07:07:04 2010 > New Revision: 86703 > > Log: > Issue 9222 Fix filetypes for open dialog > > Modified: > ? python/branches/release31-maint/Lib/idlelib/IOBinding.py You should be using svnmerge.py script ( referenced in the dev FAQ), to merge your changes to release31-maint. This helps in merge tracking and helpful to release managers when they do the release. It is pretty simple, in your release31-maint checkout: Just run python svnmerge.py merge -r 9221 (your py3k revision value) If successful, do a svn commit -F svnmerge-output-filename ( this file is autogenerated) If any conflicts occur, resolve them and then do the step 2. 
Thanks, Senthil From g.brandl at gmx.net Tue Nov 23 07:44:43 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 23 Nov 2010 07:44:43 +0100 Subject: [Python-Dev] [Python-checkins] r86702 - python/branches/py3k/Lib/idlelib/IOBinding.py In-Reply-To: <4CEB5B98.6070003@udel.edu> References: <20101123060131.EB345EE9C0@mail.python.org> <4CEB5B98.6070003@udel.edu> Message-ID: Am 23.11.2010 07:13, schrieb Terry Reedy: > > > On 11/23/2010 1:01 AM, terry.reedy wrote: >> Author: terry.reedy >> Date: Tue Nov 23 07:01:31 2010 >> New Revision: 86702 >> >> Log: > Issue 9222 Fix filetypes for open dialog > > Sorry, forgot to add this before clicking [go] or whatever the button > is. Is there any way to revise a revision ;-? Yes, with SVN there is. I don't know if you can do it with whatever GUI tool you use, but the command is the following: svn propedit --revprop -r 86702 svn:log In a short time however, after switching to Mercurial, commits will be truly immutable. However, since the equivalent to committing in SVN is a two-step process (commit locally and then push one or more commits to the public repo on the server), you can review your commits locally before pushing them, and fix mistakes by "rewriting history" (you can see from that description that it won't work when the changes are already public). 
Georg From tjreedy at udel.edu Tue Nov 23 07:49:56 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 23 Nov 2010 01:49:56 -0500 Subject: [Python-Dev] [Python-checkins] r86703 - python/branches/release31-maint/Lib/idlelib/IOBinding.py In-Reply-To: References: <20101123060705.0651CEE9C0@mail.python.org> Message-ID: <4CEB6414.9020606@udel.edu> On 11/23/2010 1:16 AM, Senthil Kumaran wrote: > Hi Terry, > > On Tue, Nov 23, 2010 at 2:07 PM, terry.reedy wrote: >> Author: terry.reedy >> Date: Tue Nov 23 07:07:04 2010 >> New Revision: 86703 >> >> Log: >> Issue 9222 Fix filetypes for open dialog >> >> Modified: >> python/branches/release31-maint/Lib/idlelib/IOBinding.py > > > You should be using svnmerge.py script ( referenced in the dev FAQ), > to merge your changes to release31-maint. This helps in merge tracking > and helpful to release managers when they do the release. > > It is pretty simple, in your release31-maint checkout: > > Just run python svnmerge.py merge -r 9221 (your py3k revision value) > If successful, do a svn commit -F svnmerge-output-filename ( this file > is autogenerated) I am using TortoiseSVN which has a similar merge but does not seem to autogenerate anything. I did use its merge + commit for the 2.7 backport. Terry From martin at v.loewis.de Tue Nov 23 07:55:20 2010 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 23 Nov 2010 07:55:20 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEB11C6.1010504@jcea.es> References: <4CEAB7C9.7020504@jcea.es> <4CEAC3F0.4040806@jcea.es> <4CEAC798.5050707@v.loewis.de> <4CEAE129.2060505@jcea.es> <4CEAE934.9000106@v.loewis.de> <4CEAFF9F.5070503@jcea.es> <4CEB0567.8040500@v.loewis.de> <4CEB11C6.1010504@jcea.es> Message-ID: <4CEB6558.3000600@v.loewis.de> > But if we say the Python can be compiled as 64 bits under Solaris, would > be nice if that was actually true. Now that we have a buildbot (under > OpenIndiana) to test, it is doable. 
But it is true, and always has been true. The lib/64 issue did not prevent one building Python on Solaris/SPARC64 at all, including the extension modules. Just edit Modules/Setup to suit your needs - that works since 1995 (before distutils was even written). > If not, we could say that Solaris+64 bits is unsupported. I don't think > we should go that way. Solaris+64 bits should be a full citizen. There we go again: "supported". Python builds on many systems which we don't have buildbots for, including obscure systems (although Guido has ruled that we won't specifically accept code for obscure systems anymore, unlike we did before). It is never fully automatic (you always have to at least make sure manually that the dependencies are installed). Regards, Martin From tjreedy at udel.edu Tue Nov 23 08:16:11 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 23 Nov 2010 02:16:11 -0500 Subject: [Python-Dev] [Python-checkins] r86702 - python/branches/py3k/Lib/idlelib/IOBinding.py In-Reply-To: References: <20101123060131.EB345EE9C0@mail.python.org> <4CEB5B98.6070003@udel.edu> Message-ID: On 11/23/2010 1:44 AM, Georg Brandl wrote: > Am 23.11.2010 07:13, schrieb Terry Reedy: >> >> >> On 11/23/2010 1:01 AM, terry.reedy wrote: >>> Author: terry.reedy >>> Date: Tue Nov 23 07:01:31 2010 >>> New Revision: 86702 >>> >>> Log: >> Issue 9222 Fix filetypes for open dialog >> >> Sorry, forgot to add this before clicking [go] or whatever the button >> is. Is there any way to revise a revision ;-? > > Yes, with SVN there is. I don't know if you can do it with whatever > GUI tool you use, but the command is the following: > > svn propedit --revprop -r 86702 svn:log (followed by new message?) OK, done. TortoiseSVN has a nice revision log dialog. Right click and one of the choices is 'edit log message'. Easy. I see that there is a TortoiseHg as well. 
-- Terry Jan Reedy From g.brandl at gmx.net Tue Nov 23 09:10:46 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 23 Nov 2010 09:10:46 +0100 Subject: [Python-Dev] [Python-checkins] r86703 - python/branches/release31-maint/Lib/idlelib/IOBinding.py In-Reply-To: <4CEB6414.9020606@udel.edu> References: <20101123060705.0651CEE9C0@mail.python.org> <4CEB6414.9020606@udel.edu> Message-ID: Am 23.11.2010 07:49, schrieb Terry Reedy: > > > On 11/23/2010 1:16 AM, Senthil Kumaran wrote: >> Hi Terry, >> >> On Tue, Nov 23, 2010 at 2:07 PM, terry.reedy wrote: >>> Author: terry.reedy >>> Date: Tue Nov 23 07:07:04 2010 >>> New Revision: 86703 >>> >>> Log: >>> Issue 9222 Fix filetypes for open dialog >>> >>> Modified: >>> python/branches/release31-maint/Lib/idlelib/IOBinding.py >> >> >> You should be using svnmerge.py script ( referenced in the dev FAQ), >> to merge your changes to release31-maint. This helps in merge tracking >> and helpful to release managers when they do the release. >> >> It is pretty simple, in your release31-maint checkout: >> >> Just run python svnmerge.py merge -r 9221 (your py3k revision value) >> If successful, do a svn commit -F svnmerge-output-filename ( this file >> is autogenerated) > > I am using TortoiseSVN which has a similar merge but does not seem to > autogenerate anything. I did use its merge + commit for the 2.7 backport. While the policy is to use svnmerge and I'd expect developers to follow this policy, in this specific case it's not as important anymore since we use neither svnmerge's mass merging nor its blocking feature anymore. 
Georg From trent at snakebite.org Tue Nov 23 09:40:50 2010 From: trent at snakebite.org (Trent Nelson) Date: Tue, 23 Nov 2010 03:40:50 -0500 Subject: [Python-Dev] Stable buildbots In-Reply-To: References: <20101113133712.60e9be27@pitrou.net> Message-ID: <4CEB7E12.1070201@snakebite.org> On 14-Nov-10 3:48 AM, David Bolen wrote: > This is a completely separate issue, though probably around just as > long, and like the popup problem its frequency changes over time. By > "hung" here I'm referring to cases where something must go wrong with > a test and/or its cleanup such that a python_d process remains > running, usually several of them at the same time. My guess: the "hung" (single-threaded) Python process has called select() without a timeout in order to wait for some data. However, the data never arrives (due to a broken/failed test), and the select() never returns. On Windows, processes seem harder to kill when they get into this state. If I purposely wedge a Windows process via select() via the interactive interpreter, ctrl-c has absolutely no effect (whereas on Unix, ctrl-c will interrupt the select()). As for why kill_python.exe doesn't seem to be able to kill said wedged processes, the MSDN documentation on TerminateProcess[1] states the following: The terminated process cannot exit until all pending I/O has been completed or canceled. (sic) It's not unreasonable to assume a wedged select() constitutes pending I/O, so that's a possible explanation as to why kill_python.exe isn't able to terminate the processes. (Also, kill_python currently assumes TerminateProcess() always works; perhaps this optimism is misplaced. Also note the XXX TODO regarding the fact that we don't kill processes that have loaded our python*.dll, but may not be named python_d.exe. I don't think that's the issue here, though.) On 14-Nov-10 5:32 AM, David Bolen wrote: > "Martin v. L?wis" writes: > >> This is what kill_python.exe is supposed to solve. 
So I recommend to >> investigate why it fails to kill the hanging Pythons. > > Yeah, I know, and I can't say I disagree in principle - not sure why > Windows doesn't let the kill in that module work (or if there's an > issue actually running it under all conditions). > > At the moment though, I do know that using the sysinternals pskill > utility externally (which is what I currently do interactively) > definitely works so to be honest, That's interesting. (That kill_python.exe doesn't kill the wedged processes, but pskill does.) kill_python is pretty simple, it just calls TerminateProcess() after acquiring a handle with the relevant PROCESS_TERMINATE access right. That being said, that's the recommended way to kill a process -- I doubt pskill would be going about it any differently (although, it is sysinternals... you never know what kind of crazy black magic it's doing behind the scenes). Are you calling pskill with the -t flag? i.e. kill process and all dependents? That might be the ticket, especially if killing the child process that wedged select() is waiting on causes it to return, and thus, makes it killable. Otherwise, if it happens again, can you try kill_python.exe first, then pskill, and confirm if the former fails but the latter succeeds? Trent. [1]: http://msdn.microsoft.com/en-us/library/ms686714(VS.85).aspx From v+python at g.nevcal.com Tue Nov 23 11:30:31 2010 From: v+python at g.nevcal.com (Glenn Linderman) Date: Tue, 23 Nov 2010 02:30:31 -0800 Subject: [Python-Dev] is this a bug? 
no environment variables In-Reply-To: References: <4CEA0246.9080607@g.nevcal.com> Message-ID: <4CEB97C7.1070708@g.nevcal.com> On 11/22/2010 8:33 AM, Guido van Rossum wrote: > On Sun, Nov 21, 2010 at 9:40 PM, Glenn Linderman wrote: >> > In reviewing my notes from my experimentations with CGIHTTPServer >> > (Python2.6) and then http.server (Python 3.2a4), I note one behavior I >> > haven't reported as a bug, nor do I know where to start to figure it out, >> > other than experimentally. >> > >> > The experiment: launching CGIHTTPServer without environment variables, by >> > the simple expedient of using a batch file to unset all the existing >> > environment variables, and then launching Python2.6 with CGIHTTPServer. >> > >> > So it failed early: random.py fails at line 110 (Python 2.6). > What specific traceback do you get? In my copy of the code that line says > > a = long(_hexlify(_urandom(16)), 16) > > and I could just imagine that _urandom() fails for some reason to do > with the environment (it is a reference to os.urandom()), which, being > part of the C library code, might depend on the environment. > > But you're not giving enough info to debug this. OK, here is the traceback. I've upgraded the application from Python 2.6 + CGIHTTPServer.py + bugfixes to Python 3.2a4 + http.server + bugfixes, hoping that it would fix it, but since it didn't that the traceback would be more relevant. It seems that _urandom is the likely culprit. 
Traceback (most recent call last): File "d:\my\web\areliabl\0test\https.py", line 5, in import server File "d:\my\web\areliabl\0test\server.py", line 88, in import email.message File "C:\Python32\lib\email\message.py", line 17, in from email import utils File "C:\Python32\lib\email\utils.py", line 27, in import random File "C:\Python32\lib\random.py", line 698, in _inst = Random() File "C:\Python32\lib\random.py", line 90, in __init__ self.seed(x) File "C:\Python32\lib\random.py", line 108, in seed a = int.from_bytes(_urandom(32), 'big') WindowsError: [Error -2146893818] Invalid Signature -------------- next part -------------- An HTML attachment was scrubbed... URL: From amauryfa at gmail.com Tue Nov 23 11:55:08 2010 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 23 Nov 2010 11:55:08 +0100 Subject: [Python-Dev] is this a bug? no environment variables In-Reply-To: <4CEB97C7.1070708@g.nevcal.com> References: <4CEA0246.9080607@g.nevcal.com> <4CEB97C7.1070708@g.nevcal.com> Message-ID: Hi, 2010/11/23 Glenn Linderman : > ? File "C:\Python32\lib\random.py", line 108, in seed > ??? a = int.from_bytes(_urandom(32), 'big') > WindowsError: [Error -2146893818] Invalid Signature In the subprocess documentation http://docs.python.org/library/subprocess.html """On Windows, in order to run a side-by-side assembly the specified env *must* include a valid SystemRoot.""" Can you keep this variable and start again? -- Amaury Forgeot d'Arc From martin at v.loewis.de Tue Nov 23 12:55:38 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 23 Nov 2010 12:55:38 +0100 Subject: [Python-Dev] is this a bug? 
no environment variables In-Reply-To: References: <4CEA0246.9080607@g.nevcal.com> <4CEB97C7.1070708@g.nevcal.com> Message-ID: <4CEBABBA.9050002@v.loewis.de> On 23.11.2010 11:55, Amaury Forgeot d'Arc wrote: > Hi, > > 2010/11/23 Glenn Linderman : >> File "C:\Python32\lib\random.py", line 108, in seed >> a = int.from_bytes(_urandom(32), 'big') >> WindowsError: [Error -2146893818] Invalid Signature > > In the subprocess documentation http://docs.python.org/library/subprocess.html > """On Windows, in order to run a side-by-side assembly the specified > env *must* include a valid SystemRoot.""" Indeed, setting SystemRoot might solve this problem. According to http://jpassing.com/2009/12/28/the-hidden-danger-of-forgetting-to-specify-systemroot-in-a-custom-environment-block/ CryptoAPI, in Windows 7, requires this variable be set. Failure to find the enhanced crypto provider would explain why the "random" module of Python fails to work. The specific cause is in the registry: HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Cryptography\Defaults\Provider\Microsoft Strong Cryptographic Provider has as its ImagePath value %SystemRoot%\system32\rsaenh.dll So the registry (and COM) do rely on environment variables. Regards, Martin From stephen at xemacs.org Tue Nov 23 13:15:20 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 23 Nov 2010 21:15:20 +0900 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <877hg4ck2v.fsf@uwakimon.sk.tsukuba.ac.jp> Terry Reedy writes: > Yes. As I read the standard, UCS-2 is limited to BMP chars. Et tu, Terry? 
OK, I change my vote on the suggestion of "UCS2" to -1. If a couple of conscientious blokes like you and David both understand it that way, I can't see any way to fight it. FWIW, ISO/IEC 10646 (which is authoritative for UCS-2 and UCS-4) is available via http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html Probably I'm the last non-author to ever read that document! From nadeem.vawda at gmail.com Tue Nov 23 13:15:18 2010 From: nadeem.vawda at gmail.com (Nadeem Vawda) Date: Tue, 23 Nov 2010 14:15:18 +0200 Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest In-Reply-To: References: <4CEAA4DB.6020904@gmail.com> <4CEAA9D4.2020904@langa.pl> <4CEAAC56.2090702@voidspace.org.uk> <4CEABD59.6080005@gmail.com> <4CEAE828.5000801@voidspace.org.uk> Message-ID: 2010/11/23 ?ukasz Langa : > If you agree to do that for regrtest I will clean up the tests for warnings. > Already did that for zipfile so it doesn't raise ResourceWarnings anymore. I > just need to correct multiprocessing and xmlrpc ResourceWarnings, silence > some DeprecationWarnings in the tests and we're all set. Ah, I see a couple > more with -uall but nothing scary. There are also some in test_socket - I've submitted a patch on Roundup: http://bugs.python.org/issue10512 Looking at the multiprocessing warnings, they seem to be caused by leaks in the underlying package, unlike xmlrpc and socket, where it's just a matter of the test code neglecting to close the connection. So +1 to: > Anyway, I find warnings as errors in regrtest a welcome feature. 
Let's make > it happen :) Nadeem From jcea at jcea.es Tue Nov 23 13:19:39 2010 From: jcea at jcea.es (Jesus Cea) Date: Tue, 23 Nov 2010 13:19:39 +0100 Subject: [Python-Dev] Solaris family and 64 bits compiling In-Reply-To: <4CEB6558.3000600@v.loewis.de> References: <4CEAB7C9.7020504@jcea.es> <4CEAC3F0.4040806@jcea.es> <4CEAC798.5050707@v.loewis.de> <4CEAE129.2060505@jcea.es> <4CEAE934.9000106@v.loewis.de> <4CEAFF9F.5070503@jcea.es> <4CEB0567.8040500@v.loewis.de> <4CEB11C6.1010504@jcea.es> <4CEB6558.3000600@v.loewis.de> Message-ID: <4CEBB15B.1010800@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 23/11/10 07:55, "Martin v. Löwis" wrote: >> But if we say the Python can be compiled as 64 bits under Solaris, would >> be nice if that was actually true. Now that we have a buildbot (under >> OpenIndiana) to test, it is doable. > > But it is true, and always has been true. The lib/64 issue did not > prevent one building Python on Solaris/SPARC64 at all, including the > extension modules. Just edit Modules/Setup to suit your needs - that > works since 1995 (before distutils was even written). Would it be acceptable to change something like:
"""
add_library_path("/usr/local/lib")
"""
to something similar to:
"""
if platform.system() == "SunOS" and platform.architecture()[0] == "64bit":
    add_library_path("/usr/local/lib/64")
else:
    add_library_path("/usr/local/lib")
"""
Would python-dev consider that change OK? - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . 
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOuxW5lgi5GaxT1NAQJuDwP/dzbhDZScanoSnPeF4Ze5XHm+WnSmowx+ x9qvM782i4bYzqYNsbpPHflshROpUwdl9dC0/dFySLFWmMYo12hYogbM6vr5RD6k vEgq1iriIfsei9yNrtt2Ou6+1LVxJ2FMsbpY0Av5hDQVfuJpvB5WRML/mbyYj4T7 9w/jmPT2+rc= =riDG -----END PGP SIGNATURE----- From ncoghlan at gmail.com Tue Nov 23 14:41:05 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 23 Nov 2010 23:41:05 +1000 Subject: [Python-Dev] [Python-checkins] r86633 - in python/branches/py3k: Doc/library/inspect.rst Doc/whatsnew/3.2.rst Lib/inspect.py Lib/test/test_inspect.py Misc/NEWS In-Reply-To: <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> Message-ID: On Tue, Nov 23, 2010 at 2:46 AM, wrote: > On 04:24 pm, solipsis at pitrou.net wrote: >> >> On Mon, 22 Nov 2010 17:08:36 +0100 >> Hrvoje Niksic wrote: >>> >>> On 11/22/2010 04:37 PM, Antoine Pitrou wrote: >>> > +1. ?The problem with int constants is that the int gets printed, not >>> > the name, when you dump them for debugging purposes :) >>> >>> Well, it's trivial to subclass int to something with a nicer __repr__. >>> PyGTK uses that technique for wrapping C enums: >> >> Nice. It might be useful to add a private _Constant class somewhere for >> stdlib purposes. 
> > http://www.python.org/dev/peps/pep-0354/ Indeed, it is difficult to do enums in such a way that they feel sufficiently robust to be worth the effort of including them (although these days, I would be inclined to follow the namedtuple API style rather than that presented in PEP 354). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From fuzzyman at voidspace.org.uk Tue Nov 23 14:50:53 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 23 Nov 2010 13:50:53 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> Message-ID: <4CEBC6BD.9060402@voidspace.org.uk> On 23/11/2010 13:41, Nick Coghlan wrote: > On Tue, Nov 23, 2010 at 2:46 AM, wrote: >> On 04:24 pm, solipsis at pitrou.net wrote: >>> On Mon, 22 Nov 2010 17:08:36 +0100 >>> Hrvoje Niksic wrote: >>>> On 11/22/2010 04:37 PM, Antoine Pitrou wrote: >>>>> +1. The problem with int constants is that the int gets printed, not >>>>> the name, when you dump them for debugging purposes :) >>>> Well, it's trivial to subclass int to something with a nicer __repr__. >>>> PyGTK uses that technique for wrapping C enums: >>> Nice. It might be useful to add a private _Constant class somewhere for >>> stdlib purposes. >> http://www.python.org/dev/peps/pep-0354/ > Indeed, it is difficult to do enums in such a way that they feel > sufficiently robust to be worth the effort of including them (although > these days, I would be inclined to follow the namedtuple API style > rather than that presented in PEP 354). Right. 
As it happens I just submitted a patch to Barry Warsaw's enum package (nice), flufl.enum [1], to allow namedtuple style creation of named constants: >>> from flufl.enum import make_enum >>> Colors = make_enum('Colors', 'red green blue') >>> Colors PEP 354 was rejected for two primary reasons - lack of interest and nowhere obvious to put it. Would it be *so bad* if an enum type lived in its own module? There is certainly more interest now, and if we are to use something like this in the standard library it *has* to be in the standard library (unless every module implements their own private _Constant class). Time to revisit the PEP? All the best, Michael [1] https://launchpad.net/flufl.enum > Cheers, > Nick. > -- http://www.voidspace.org.uk/ From solipsis at pitrou.net Tue Nov 23 15:02:19 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 23 Nov 2010 15:02:19 +0100 Subject: [Python-Dev] OpenSSL Voluntarily (openssl-1.0.0a) References: <4CEB3F72.7000006@m2.ccsnet.ne.jp> Message-ID: <20101123150219.29e20374@pitrou.net> On Tue, 23 Nov 2010 00:07:09 -0500 Glyph Lefkowitz wrote: > On Mon, Nov 22, 2010 at 11:13 PM, Hirokazu Yamamoto < > ocean-city at m2.ccsnet.ne.jp> wrote: > > > Hello. Does this affect python? Thank you. > > > > http://www.openssl.org/news/secadv_20101116.txt > > > > No. Well, actually it does, but Python links against the system OpenSSL on most platforms (except Windows), so it's up to the OS vendor to apply the patch. Regards Antoine. 
From ncoghlan at gmail.com Tue Nov 23 15:03:53 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 24 Nov 2010 00:03:53 +1000 Subject: [Python-Dev] Re-enable warnings in regrtest and/or unittest In-Reply-To: <4CEAE828.5000801@voidspace.org.uk> References: <4CEAA4DB.6020904@gmail.com> <4CEAA9D4.2020904@langa.pl> <4CEAAC56.2090702@voidspace.org.uk> <4CEABD59.6080005@gmail.com> <4CEAE828.5000801@voidspace.org.uk> Message-ID: On Tue, Nov 23, 2010 at 8:01 AM, Michael Foord wrote: > On 22/11/2010 21:08, Guido van Rossum wrote: >> >> On Mon, Nov 22, 2010 at 11:24 AM, Brett Cannon ?wrote: >>> >>> The problem with that is it means developers who switch to Python 3.2 >>> or whatever are suddenly going to have their tests fail until they >>> update their code to turn the warnings off. >> >> That sounds like a feature to me... :-) >> > I think Ezio was suggesting just turning warnings on by default when > unittest is run, not turning them into errors. Ezio is suggesting that > developers could explicitly turn warnings off again, but when you use the > default test runner warnings would be shown. His logic is that warnings are > for developers, and so are tests... Having at least the default test runner change the default warnings behaviour to -Wd (while still respecting sys.warnoptions) sounds like a good idea. That way users won't see the warnings (as intended with that change), but developers are less likely to get nasty surprises when things break in future releases (which was one of our major concerns when we made the decision to change the default handling of DeprecationWarning). A similar change may be appropriate for doctest as well. Printing out the list of suppressed warnings in verbose mode may also be useful. A blanket -We is unlikely to work for the test suite, since generating warnings on some platforms is expected behaviour (e.g. due to the ongoing argument between multiprocessing and FreeBSD as to the appropriate behaviour of semaphores). 
However, we may be able to get to the point where it is run that way by default and then affected tests use check_warnings() to alter the filter configuration (something that many such affected tests already do). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Tue Nov 23 15:02:57 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 23 Nov 2010 15:02:57 +0100 Subject: [Python-Dev] r86699 - python/branches/py3k/Lib/zipfile.py References: <20101122233126.C8BDBEE981@mail.python.org> <66720F75-169A-4702-AF53-69845701AA55@langa.pl> Message-ID: <20101123150257.76a423ad@pitrou.net> On Mon, 22 Nov 2010 22:00:08 -0600 Benjamin Peterson wrote: > 2010/11/22 Łukasz Langa : > > Wiadomość napisana przez Benjamin Peterson w dniu 2010-11-23, o godz. 00:47: > > > > No test? > > > > > > The tests were there already, raising ResourceWarnings. After this change, > > they stopped doing that. You may say: now they pass for the first time :) > > It looks like you added new API, though. For that, we would expect new tests. It's an internal API, although ZipExtFile doesn't begin with an underscore. Regards Antoine. From ncoghlan at gmail.com Tue Nov 23 15:16:15 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 24 Nov 2010 00:16:15 +1000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CEBC6BD.9060402@voidspace.org.uk> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> Message-ID: On Tue, Nov 23, 2010 at 11:50 PM, Michael Foord wrote: > PEP 354 was rejected for two primary reasons - lack of interest and nowhere > obvious to put it. Would it be *so bad* if an enum type lived in its own > module? 
There is certainly more interest now, and if we are to use something > like this in the standard library it *has* to be in the standard library > (unless every module implements their own private _Constant class). > > Time to revisit the PEP? If you (or anyone else) wanted to revisit the PEP, then I would advise trawling through the standard library looking for constants that could be sensibly converted to enum values. A decision would also need to be made as to whether or not to subclass int, or just provide __index__ (the former has the advantage of being able to drop cleanly into OS level APIs that expect a numerical constant). Whether enums should provide arbitrary name-value mappings (ala C enums) or were restricted to sequential indices starting from zero would be another question best addressed by a code survey of at least the stdlib. And getgeneratorstate() doesn't count as a use case, since the ordering isn't needed and using string literals instead of integers will cover the debugging aspect :) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From fuzzyman at voidspace.org.uk Tue Nov 23 15:24:18 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 23 Nov 2010 14:24:18 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> Message-ID: <4CEBCE92.40801@voidspace.org.uk> On 23/11/2010 14:16, Nick Coghlan wrote: > On Tue, Nov 23, 2010 at 11:50 PM, Michael Foord > wrote: >> PEP 354 was rejected for two primary reasons - lack of interest and nowhere >> obvious to put it. Would it be *so bad* if an enum type lived in its own >> module? 
There is certainly more interest now, and if we are to use something >> like this in the standard library it *has* to be in the standard library >> (unless every module implements their own private _Constant class). >> >> Time to revisit the PEP? > If you (or anyone else) wanted to revisit the PEP, then I would advise > trawling through the standard library looking for constants that could > be sensibly converted to enum values. > > A decision would also need to be made as to whether or not to subclass > int, or just provide __index__ (the former has the advantage of being > able to drop cleanly into OS level APIs that expect a numerical > constant). > > Whether enums should provide arbitrary name-value mappings (ala C > enums) or were restricted to sequential indices starting from zero > would be another question best addressed by a code survey of at least > the stdlib. > > And getgeneratorstate() doesn't count as a use case, since the > ordering isn't needed and using string literals instead of integers > will cover the debugging aspect :) > Well, for backwards compatibility reasons the new constants would have to *behave* like the old ones (including having the same underlying value and comparing equal to it). In many cases it is *likely* that subclassing int is a better way of achieving that. Actually looking through the standard library to evaluate it is the only way of confirming that. Another API, that reduces the duplication of creating the enum and setting the names, could be something like: make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", base_type=int, module=__name__) Using __name__ we can set the module globals in the call to make_enums. All the best, Michael > Cheers, > Nick. 
> -- http://www.voidspace.org.uk/ From solipsis at pitrou.net Tue Nov 23 15:42:29 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 23 Nov 2010 15:42:29 +0100 Subject: [Python-Dev] constant/enum type in stdlib References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> Message-ID: <20101123154229.474f7a90@pitrou.net> On Tue, 23 Nov 2010 14:24:18 +0000 Michael Foord wrote: > Well, for backwards compatibility reasons the new constants would have > to *behave* like the old ones (including having the same underlying > value and comparing equal to it). > > In many cases it is *likely* that subclassing int is a better way of > achieving that. Actually looking through the standard library to > evaluate it is the only way of confirming that. > > Another API, that reduces the duplication of creating the enum and > setting the names, could be something like: > > make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", base_type=int, > module=__name__) > > Using __name__ we can set the module globals in the call to make_enums. I don't understand why people insist on calling that an "enum". enum is a C legacy and it doesn't bring anything useful as I can tell. Instead, just assign the values explicitly. Antoine. 
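A minimal sketch of the make_enums() call quoted above, for readers who want to see what "set the module globals" could mean in practice. Only the call signature comes from the thread; the implementation, the start=0 default, the repr format and the stand-in module name "demo_constants" are assumptions made for illustration, and this is not the code of flufl.enum or of any stdlib module.

```python
import sys
import types


def make_enums(type_name, names, base_type=int, module=None, start=0):
    """Hypothetical helper, not an existing stdlib or flufl.enum function."""

    def __repr__(self):
        return "<%s.%s: %s>" % (type_name, self._name, base_type.__repr__(self))

    # Subclass base_type so members still compare equal to the plain values.
    cls = type(type_name, (base_type,), {"__repr__": __repr__})
    for value, name in enumerate(names.split(), start):
        member = cls(value)
        member._name = name
        setattr(cls, name, member)
        if module is not None:
            # Export the member as a global of the named module.
            setattr(sys.modules[module], name, member)
    return cls


# Stand-in module for the demo; real code would pass module=__name__.
demo = types.ModuleType("demo_constants")
sys.modules[demo.__name__] = demo

Names = make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", module="demo_constants")
print(repr(Names.NAME_TWO))   # <Names.NAME_TWO: 1>
print(demo.NAME_THREE == 2)   # True
```

Because the members subclass int, existing code that compares against or passes along the old integer values should keep working, which is the backwards-compatibility point made in the message.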
From benjamin at python.org Tue Nov 23 15:49:37 2010 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 23 Nov 2010 08:49:37 -0600 Subject: [Python-Dev] r86699 - python/branches/py3k/Lib/zipfile.py In-Reply-To: <20101123150257.76a423ad@pitrou.net> References: <20101122233126.C8BDBEE981@mail.python.org> <66720F75-169A-4702-AF53-69845701AA55@langa.pl> <20101123150257.76a423ad@pitrou.net> Message-ID: 2010/11/23 Antoine Pitrou : > On Mon, 22 Nov 2010 22:00:08 -0600 > Benjamin Peterson wrote: >> 2010/11/22 Łukasz Langa : >> > Wiadomość napisana przez Benjamin Peterson w dniu 2010-11-23, o godz. 00:47: >> > >> > No test? >> > >> > >> > The tests were there already, raising ResourceWarnings. After this change, >> > they stopped doing that. You may say: now they pass for the first time :) >> >> It looks like you added new API, though. For that, we would expect new tests. > > It's an internal API, although ZipExtFile doesn't begin with an > underscore. Why is it internal API then? -- Regards, Benjamin From benjamin at python.org Tue Nov 23 15:52:09 2010 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 23 Nov 2010 08:52:09 -0600 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <20101123154229.474f7a90@pitrou.net> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> Message-ID: 2010/11/23 Antoine Pitrou : > On Tue, 23 Nov 2010 14:24:18 +0000 > Michael Foord wrote: >> Well, for backwards compatibility reasons the new constants would have >> to *behave* like the old ones (including having the same underlying >> value and comparing equal to it). 
>> >> In many cases it is *likely* that subclassing int is a better way of >> achieving that. Actually looking through the standard library to >> evaluate it is the only way of confirming that. >> >> Another API, that reduces the duplication of creating the enum and >> setting the names, could be something like: >> >> make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", base_type=int, >> module=__name__) >> >> Using __name__ we can set the module globals in the call to make_enums. > > I don't understand why people insist on calling that an "enum". enum is > a C legacy and it doesn't bring anything useful as I can tell. Instead, > just assign the values explicitly. The concept of an "enumeration" of values is still useful outside its stunted C incarnation. Out of curiosity, why is enum "legacy" in C? -- Regards, Benjamin From fuzzyman at voidspace.org.uk Tue Nov 23 15:56:36 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 23 Nov 2010 14:56:36 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <20101123154229.474f7a90@pitrou.net> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> Message-ID: <4CEBD624.9000402@voidspace.org.uk> On 23/11/2010 14:42, Antoine Pitrou wrote: > On Tue, 23 Nov 2010 14:24:18 +0000 > Michael Foord wrote: >> Well, for backwards compatibility reasons the new constants would have >> to *behave* like the old ones (including having the same underlying >> value and comparing equal to it). >> >> In many cases it is *likely* that subclassing int is a better way of >> achieving that. 
Actually looking through the standard library to >> evaluate it is the only way of confirming that. >> >> Another API, that reduces the duplication of creating the enum and >> setting the names, could be something like: >> >> make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", base_type=int, >> module=__name__) >> >> Using __name__ we can set the module globals in the call to make_enums. > I don't understand why people insist on calling that an "enum". enum is > a C legacy and it doesn't bring anything useful as I can tell. Instead, > just assign the values explicitly. > enum isn't only in C. (They are in C# as well at least.) Wikipedia links enum to "enumerated type" and says: an enumerated type (also called enumeration or enum) is a data type consisting of a set of named values It sounds entirely appropriate. I have no problem with explicitly assigning values instead of doing it automagically. All the best, Michael > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ From stephen at xemacs.org Tue Nov 23 16:00:22 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 24 Nov 2010 00:00:22 +0900 Subject: [Python-Dev] len(chr(i)) = 2? 
In-Reply-To: <4CEA5744.3080308@v.loewis.de> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA5744.3080308@v.loewis.de> Message-ID: <8762voccft.fsf@uwakimon.sk.tsukuba.ac.jp> If you don't care about the ISO standard, but only about Python, Martin's right, I was wrong. You can stop reading now. "Martin v. L?wis" writes: > I could only find the FCD of 10646:2010, where annex H was integrated > into section 10: Thank you for the reference. I referred to two older versions, 10646-1:1993 (for the annexes and Amendment, and my basic understanding) and 10646:2003 (for the detailed definition of UCS-2 in Sections 7, 8 and 13; unfortunately, I missed the most important detail, which is in Section 9). In :2003 the Annex I referred to as "Annex H" is Annex J, and "Annex Q" is partly in Section 9.1 and mostly in Annex C. I don't know where the former is in the 2010 FCD, and the latter is section 9.2. > I think they are now acknowledging that UCS-2 was a misleading term, > making it ambiguous whether this refers to a CCS, a CEF, or a CES; > like "ASCII", people have been using it for all three of them. In :1993 it wasn't ambiguous, they simply didn't make those distinctions. They were not needed for ISO 10646's published versions, although they certainly are for Unicode. Now, quite clearly, the ISO has *changed the definition* in every new version, progressively adding new restrictions that go beyond clarifying ambiguity. But even in :2003, in view of 4.2, 6.2, 6.3, and 13.1, UCS-2 is clearly well-defined as a CM according to UTR#17, which can probably be identified with CCS in :2003 terminology. 
Ie, returning to UTR#17 terminology, it is the composition of a CES, a CEF, and a CCS, which are not defined individually. Note: The definition of "coded character" changed between :2003 and the 2010 FCD, from "character with representation" to "character with integer". There is a NOTE indicating that 16-bit integers may be used in processing. Given that this is a non-normative note, I take it to mean that in an array of 16-bit integers, "most significant octet" is to be interpreted in the natural way for the architecture rather than by the representation in memory, which might be little-endian. IMO it's unnatural to think that that changes the definition of UCS-2 to be either a CEF, or a composition of a CEF and a CCS. > Apparently, the ISO WG interprets earlier revisions as saying that > UCS-2 is a CEF that restricted UTF-16 to the BMP. I think that ISO 10646-1:1993 admits only one interpretation, a CM restricted to the BMP (including surrogates), and ISO 10646:2003 admits only one interpretation, a CM restricted to the BMP (not including surrogates). The note under Table 4 on p.24 of the FCD is, uh, well, a lie. Earlier versions certainly did not restrict to "scalar values"; they had no such concept. > THIS IS NOT WHAT PYTHON DOES. Well, no shit, Sherlock. You don't have to yell at me, I know what Python does. The question is, is what does UCS-2 do? The answer is that in :1993, AFAICT it did what Python does. In :2003, they added (last sentence, section 9.1): UCS-2 cannot be used to represent any characters on the supplementary planes. I assume they maintain that position in 2010, so End Of Thread. I apologize for missing that when I was reviewing the standard earlier, but I expected restrictions on UCS-2 to be explained in 13.1 or perhaps 14. And 13.1 simply requires that characters in the BMP be represented by their defined code positions, truncated to two octets. 
Like earlier versions, it doesn't prohibit use of surrogates or say that non-BMP characters can't be represented. > Not sure what it says in your copy; in mine, section 9.3 says [snip] Mine (:2003) says "NOTE 2 - When confined to the code positions in Planes 00 to 10, UCS-4 is also referred to as UCS Transformation Format 32 (UTF-32)." Then it references the Unicode Standard (v4.0) as the authority for UTF-32. Obviously they continued to be confused at this point in time; by the draft you have, apparently the WG had decided to pretty much completely synchronize the whole standard to a subset of Unicode. This seems pointless to me (unlike, say, the work that has been done on standardizing criteria for repertoire changes). In particular, the :1993 definition of UCS-2 was a perfectly good standard for describing the processing Python actually does internally. The current definition of UCS-2 as identical to the BMP is useless, and good riddance, I'm perfectly happy to have them deprecate it. From solipsis at pitrou.net Tue Nov 23 16:01:06 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 23 Nov 2010 16:01:06 +0100 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> Message-ID: <1290524466.3642.4.camel@localhost.localdomain> Le mardi 23 novembre 2010 ? 
08:52 -0600, Benjamin Peterson a ?crit : > 2010/11/23 Antoine Pitrou : > > On Tue, 23 Nov 2010 14:24:18 +0000 > > Michael Foord wrote: > >> Well, for backwards compatibility reasons the new constants would have > >> to *behave* like the old ones (including having the same underlying > >> value and comparing equal to it). > >> > >> In many cases it is *likely* that subclassing int is a better way of > >> achieving that. Actually looking through the standard library to > >> evaluate it is the only way of confirming that. > >> > >> Another API, that reduces the duplication of creating the enum and > >> setting the names, could be something like: > >> > >> make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", base_type=int, > >> module=__name__) > >> > >> Using __name__ we can set the module globals in the call to make_enums. > > > > I don't understand why people insist on calling that an "enum". enum is > > a C legacy and it doesn't bring anything useful as I can tell. Instead, > > just assign the values explicitly. > > The concept of a "enumeration" of values is still useful outside its > stunted C incarnation. Well, it is easy to assign range(N) to a tuple of names when desired. I don't think an automatically-enumerating constant generator is needed. Regards Antoine. 
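The "assign range(N) to a tuple of names" idiom mentioned above is short enough to show in full. The names are borrowed from the make_enums example earlier in the thread and are purely illustrative.

```python
# Plain integer constants, numbered explicitly with range().
NAME_ONE, NAME_TWO, NAME_THREE = range(3)

# The values are ordinary ints, so a debugging dump shows the number only.
print(NAME_TWO)  # prints 1
```

This is the behaviour the thread is weighing: the idiom needs no new API, but the printed value carries no hint of which constant it was.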
From solipsis at pitrou.net Tue Nov 23 16:01:59 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 23 Nov 2010 16:01:59 +0100 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CEBD624.9000402@voidspace.org.uk> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <4CEBD624.9000402@voidspace.org.uk> Message-ID: <1290524519.3642.5.camel@localhost.localdomain> Le mardi 23 novembre 2010 à 14:56 +0000, Michael Foord a écrit : > On 23/11/2010 14:42, Antoine Pitrou wrote: > > On Tue, 23 Nov 2010 14:24:18 +0000 > > Michael Foord wrote: > >> Well, for backwards compatibility reasons the new constants would have > >> to *behave* like the old ones (including having the same underlying > >> value and comparing equal to it). > >> > >> In many cases it is *likely* that subclassing int is a better way of > >> achieving that. Actually looking through the standard library to > >> evaluate it is the only way of confirming that. > >> > >> Another API, that reduces the duplication of creating the enum and > >> setting the names, could be something like: > >> > >> make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", base_type=int, > >> module=__name__) > >> > >> Using __name__ we can set the module globals in the call to make_enums. > > I don't understand why people insist on calling that an "enum". enum is > > a C legacy and it doesn't bring anything useful as I can tell. Instead, > > just assign the values explicitly. > > > > enum isn't only in C. (They are in C# as well at least.) Well, it's been inherited by C-like languages, no doubt. Like braces and semicolons :) Regards Antoine. 
From solipsis at pitrou.net Tue Nov 23 15:59:59 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 23 Nov 2010 15:59:59 +0100 Subject: [Python-Dev] r86699 - python/branches/py3k/Lib/zipfile.py In-Reply-To: References: <20101122233126.C8BDBEE981@mail.python.org> <66720F75-169A-4702-AF53-69845701AA55@langa.pl> <20101123150257.76a423ad@pitrou.net> Message-ID: <1290524399.3642.3.camel@localhost.localdomain> Le mardi 23 novembre 2010 à 08:49 -0600, Benjamin Peterson a écrit : > 2010/11/23 Antoine Pitrou : > > On Mon, 22 Nov 2010 22:00:08 -0600 > > Benjamin Peterson wrote: > >> 2010/11/22 Łukasz Langa : > >> > Wiadomość napisana przez Benjamin Peterson w dniu 2010-11-23, o godz. 00:47: > >> > > >> > No test? > >> > > >> > > >> > The tests were there already, raising ResourceWarnings. After this change, > >> > they stopped doing that. You may say: now they pass for the first time :) > >> > >> It looks like you added new API, though. For that, we would expect new tests. > > > > It's an internal API, although ZipExtFile doesn't begin with an > > underscore. > > Why is it internal API then? Because it's for use by ZipFile.open(). The ZipExtFile constructor is not supposed to be called by the user. You might instead have asked why ZipExtFile isn't called _ZipExtFile, and I have no idea. Regards Antoine. 
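To illustrate the point that ZipExtFile objects come from ZipFile.open() rather than from calling the constructor directly, here is a small self-contained sketch. It assumes a current CPython 3 zipfile module; the archive is built in memory and the member name "hello.txt" is made up for the example.

```python
import io
import zipfile

# Build a tiny archive in memory so the example needs no files on disk.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hello world")

# Reading a member: ZipFile.open() constructs the ZipExtFile for us.
with zipfile.ZipFile(buf) as zf:
    with zf.open("hello.txt") as member:
        kind = type(member).__name__
        data = member.read()

print(kind, data)
```

User code only ever names the member it wants; the file-like object returned by open() is closed by the with block, which is the kind of cleanup the ResourceWarning discussion is about.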
From fuzzyman at voidspace.org.uk Tue Nov 23 16:15:29 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 23 Nov 2010 15:15:29 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <1290524466.3642.4.camel@localhost.localdomain> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> Message-ID: <4CEBDA91.4050205@voidspace.org.uk> On 23/11/2010 15:01, Antoine Pitrou wrote: > Le mardi 23 novembre 2010 ? 08:52 -0600, Benjamin Peterson a ?crit : >> 2010/11/23 Antoine Pitrou: >>> On Tue, 23 Nov 2010 14:24:18 +0000 >>> Michael Foord wrote: >>>> Well, for backwards compatibility reasons the new constants would have >>>> to *behave* like the old ones (including having the same underlying >>>> value and comparing equal to it). >>>> >>>> In many cases it is *likely* that subclassing int is a better way of >>>> achieving that. Actually looking through the standard library to >>>> evaluate it is the only way of confirming that. >>>> >>>> Another API, that reduces the duplication of creating the enum and >>>> setting the names, could be something like: >>>> >>>> make_enums("Names", "NAME_ONE NAME_TWO NAME_THREE", base_type=int, >>>> module=__name__) >>>> >>>> Using __name__ we can set the module globals in the call to make_enums. >>> I don't understand why people insist on calling that an "enum". enum is >>> a C legacy and it doesn't bring anything useful as I can tell. Instead, >>> just assign the values explicitly. >> The concept of a "enumeration" of values is still useful outside its >> stunted C incarnation. 
> Well, it is easy to assign range(N) to a tuple of names when desired. I > don't think an automatically-enumerating constant generator is needed. > Right, and that is current practice. It has the disadvantage (that you seemed to acknowledge) that when debugging the integer values are seen instead of something with a useful repr. Having a *simple* class (and API to create them) that produces named constants with a useful repr, is what we are discussing, and that seems awfully like an enum (in the general sense not in a C specific sense). For backwards compatibility these constants, where they replace integer constants, would need to be integer subclasses with the same behaviour. Like the Qt example you appreciated so much. ;-) There are still two reasonable APIs (unless you have changed your mind and think that sticking with plain integers is best), of which I prefer the latter: SOME_CONST = Constant('SOME_CONST', 1) OTHER_CONST = Constant('OTHER_CONST', 2) or: Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1) SOME_CONST = Constants.SOME_CONST OTHER_CONST = Constants.OTHER_CONST (Well, there is a third option that takes __name__ and sets the constants in the module automagically. I can understand why people would dislike that though.) All the best, Michael Foord > Regards > > Antoine. 
-- http://www.voidspace.org.uk/ From solipsis at pitrou.net Tue Nov 23 16:30:53 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 23 Nov 2010 16:30:53 +0100 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CEBDA91.4050205@voidspace.org.uk> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> Message-ID: <1290526253.3642.9.camel@localhost.localdomain> Le mardi 23 novembre 2010 à 15:15 +0000, Michael Foord a écrit : > There are still two reasonable APIs (unless you have changed your mind > and think that sticking with plain integers is best), of which I prefer > the latter: > > SOME_CONST = Constant('SOME_CONST', 1) > OTHER_CONST = Constant('OTHER_CONST', 2) > > or: > > Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1) Or: Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', values=range(1, 3)) Again, auto-enumeration is useless since it's trivial to achieve explicitly. Regards Antoine. 
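The two spellings discussed above, Constant(name, value) and make_constants(..., values=...), can be sketched together as follows. Both names and call shapes come from the thread; the bodies, the repr format and the length check are assumptions for illustration, not an agreed design.

```python
class Constant(int):
    """Hypothetical int subclass that carries a name for its repr."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self.name = name
        return self

    def __repr__(self):
        return "<Constant %s=%d>" % (self.name, int(self))


def make_constants(type_name, names, values):
    """Hypothetical factory: pair each name with an explicitly supplied value."""
    names = names.split()
    values = list(values)
    if len(names) != len(values):
        raise ValueError("need exactly one value per name")
    namespace = {n: Constant(n, v) for n, v in zip(names, values)}
    # A plain class used only as a namespace for the constants.
    return type(type_name, (), namespace)


Constants = make_constants("Constants", "SOME_CONST OTHER_CONST", values=range(1, 3))
SOME_CONST = Constants.SOME_CONST
OTHER_CONST = Constants.OTHER_CONST

print(repr(OTHER_CONST))   # <Constant OTHER_CONST=2>
print(SOME_CONST + 1)      # 2
```

Passing values explicitly keeps the numbering visible at the call site, which is the preference expressed in the message, while the int subclass keeps arithmetic and comparisons working as before.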
From fuzzyman at voidspace.org.uk Tue Nov 23 16:40:28 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Tue, 23 Nov 2010 15:40:28 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290526253.3642.9.camel@localhost.localdomain>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain>
Message-ID: <4CEBE06C.9030101@voidspace.org.uk>

On 23/11/2010 15:30, Antoine Pitrou wrote:
> On Tuesday 23 November 2010 at 15:15 +0000, Michael Foord wrote:
>> There are still two reasonable APIs (unless you have changed your mind
>> and think that sticking with plain integers is best), of which I prefer
>> the latter:
>>
>> SOME_CONST = Constant('SOME_CONST', 1)
>> OTHER_CONST = Constant('OTHER_CONST', 2)
>>
>> or:
>>
>> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1)
> Or:
>
> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST',
> values=range(1, 3))
>
> Again, auto-enumeration is useless since it's trivial to achieve
> explicitly.

Ah, I see. It is the auto-enumeration you disliked. Sure - not a problem.

I think the step that Nick described, of evaluating places in the
standard library that this could be used, is a good one. I'll try to get
around to it and perhaps attempt to resuscitate the PEP. (Any
suggestions as to an appropriate module if having it live in its own
module is still an objection?)

Michael

> Regards
>
> Antoine.

--
http://www.voidspace.org.uk/

READ CAREFULLY. By accepting and reading this email you agree, on behalf of
your employer, to release me from all obligations and waivers arising from
any and all NON-NEGOTIATED agreements, licenses, terms-of-service,
shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure,
non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have
entered into with your employer, its partners, licensors, agents and
assigns, in perpetuity, without prejudice to my ongoing rights and
privileges. You further represent that you have the authority to release me
from any BOGUS AGREEMENTS on behalf of your employer.

From solipsis at pitrou.net Tue Nov 23 17:05:19 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 23 Nov 2010 17:05:19 +0100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <4CEBE06C.9030101@voidspace.org.uk>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk>
Message-ID: <1290528319.3642.11.camel@localhost.localdomain>

On Tuesday 23 November 2010 at 15:40 +0000, Michael Foord wrote:
> On 23/11/2010 15:30, Antoine Pitrou wrote:
> > On Tuesday 23 November 2010 at 15:15 +0000, Michael Foord wrote:
> >> There are still two reasonable APIs (unless you have changed your mind
> >> and think that sticking with plain integers is best), of which I prefer
> >> the latter:
> >>
> >> SOME_CONST = Constant('SOME_CONST', 1)
> >> OTHER_CONST = Constant('OTHER_CONST', 2)
> >>
> >> or:
> >>
> >> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1)
> > Or:
> >
> > Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST',
> > values=range(1, 3))
> >
> > Again, auto-enumeration is useless since it's trivial to achieve
> > explicitly.
>
> Ah, I see. It is the auto-enumeration you disliked. Sure - not a problem.
>
> I think the step that Nick described, of evaluating places in the
> standard library that this could be used, is a good one. I'll try to get
> around to it and perhaps attempt to resuscitate the PEP. (Any
> suggestions as to an appropriate module if having it live in its own
> module is still an objection?)

We already have a bunch of bizarrely unrelated stuff in collections
(such as Callable), so we could put enum there too.

Regards

Antoine.
From fuzzyman at voidspace.org.uk Tue Nov 23 17:07:30 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Tue, 23 Nov 2010 16:07:30 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290528319.3642.11.camel@localhost.localdomain>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain>
Message-ID: <4CEBE6C2.1070204@voidspace.org.uk>

On 23/11/2010 16:05, Antoine Pitrou wrote:
> On Tuesday 23 November 2010 at 15:40 +0000, Michael Foord wrote:
>> On 23/11/2010 15:30, Antoine Pitrou wrote:
>>> On Tuesday 23 November 2010 at 15:15 +0000, Michael Foord wrote:
>>>> There are still two reasonable APIs (unless you have changed your mind
>>>> and think that sticking with plain integers is best), of which I prefer
>>>> the latter:
>>>>
>>>> SOME_CONST = Constant('SOME_CONST', 1)
>>>> OTHER_CONST = Constant('OTHER_CONST', 2)
>>>>
>>>> or:
>>>>
>>>> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1)
>>> Or:
>>>
>>> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST',
>>> values=range(1, 3))
>>>
>>> Again, auto-enumeration is useless since it's trivial to achieve
>>> explicitly.
>> Ah, I see. It is the auto-enumeration you disliked. Sure - not a problem.
>>
>> I think the step that Nick described, of evaluating places in the
>> standard library that this could be used, is a good one. I'll try to get
>> around to it and perhaps attempt to resuscitate the PEP.
(Any
>> suggestions as to an appropriate module if having it live in its own
>> module is still an objection?)
> We already have a bunch of bizarrely unrelated stuff in collections
> (such as Callable), so we could put enum there too.
>
I guess it creates collections of constants...

Michael

> Regards
>
> Antoine.

--
http://www.voidspace.org.uk/

From Ben.Cottrell at nominum.com Tue Nov 23 16:37:43 2010
From: Ben.Cottrell at nominum.com (Ben.Cottrell at nominum.com)
Date: Tue, 23 Nov 2010 07:37:43 -0800
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: Your message of "Tue, 23 Nov 2010 15:15:29 GMT."
<4CEBDA91.4050205@voidspace.org.uk>
Message-ID: <20101123153743.3D9451B8ED4@shell-too.nominum.com>

On Tue, 23 Nov 2010 15:15:29 +0000, Michael Foord wrote:
> There are still two reasonable APIs (unless you have changed your mind
> and think that sticking with plain integers is best), of which I prefer
> the latter:
>
> SOME_CONST = Constant('SOME_CONST', 1)
> OTHER_CONST = Constant('OTHER_CONST', 2)
>
> or:
>
> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1)
> SOME_CONST = Constants.SOME_CONST
> OTHER_CONST = Constants.OTHER_CONST

I prefer the latter too, because that makes it possible to have
'Constants' be a rendezvous point for making sure that you're
passing something valid. Perhaps using 'in':

    def func(foo):
        if foo not in Constants:
            raise ValueError('foo must be SOME_CONST or OTHER_CONST')
        ...

I know this is probably not going to happen, but I would *so much*
like it if functions would start rejecting "the wrong kind of 2".
Constants that are valid, integer-wise, but which aren't part of
the set of constants allowed for that argument. I'd prefer not to
think of the number of times I've made the following mistake:

    s = socket.socket(socket.SOCK_DGRAM, socket.AF_INET)

~Ben

From turnbull at sk.tsukuba.ac.jp Tue Nov 23 17:16:55 2010
From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Wed, 24 Nov 2010 01:16:55 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <4CEA527B.4030002@v.loewis.de>
References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE96A40.1050705@v.loewis.de> <87ipzqc4gi.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA27EB.8000104@v.loewis.de> <87fwutd49e.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA527B.4030002@v.loewis.de>
Message-ID: <871v6cc8w8.fsf@uwakimon.sk.tsukuba.ac.jp>

"Martin v. 
Löwis" writes:

 > I disagree: Quoting from Unicode 5.0, section 5.4:
 >
 > # The individual components of implementations may have different
 > # levels of support for surrogates, as long as those components are
 > # assembled and communicate correctly.

"Assembly" is the problem. If chr() or a slice creates a lone
surrogate and surrogateescape passes it back out, Python as a whole is
non-conforming. Technically, you can hide behind "none of slicing,
chr(), or surrogateescape promises to conform", and maybe that would
fly to a standards lawyer; I'd have to see the precise statement.

Here's a more convincing example. A user specifies "utf8" as her
locale charset. Then she specifies a string containing a non-BMP
character as the "description" of a file, and internal code munges
this via slicing into a file name conforming to some specification
(eg, length limit + uniquifier if needed). Then if the non-BMP
character is in the "right" place, she will get either a broken file
name, which will either get written to disk or raise an exception,
depending on whether the munging program has enabled surrogateescape
or not.

I claim both of those results are non-conforming to the specification
of UTF-16, and therefore Python Unicode processing as a whole must be
considered non-conforming. It's still pretty damn good. But I've
elaborated that point elsewhere.

 > The rationale for supporting these characters in chr() goes back much
 > further than the surrogateescape handler - as Python unicode strings
 > are sequences of code points, it would be impractical if you couldn't
 > create some of them, or even would have to consult the UCD before
 > determining whether they can be created.

The Zen is irrelevant to determining conformance to Unicode, which has
its own Zen.

From stephen at xemacs.org Tue Nov 23 17:18:57 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 24 Nov 2010 01:18:57 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEA5744.3080308@v.loewis.de> <4CEA6661.4080402@egenix.com> Message-ID: <87zkt0au8e.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > For practical purposes, UCS2/UCS4 convey far more inherent information > than narrow/wide: That was my stance, but in fact (1) the ISO JTC1/SC2 has deliberately made them ambiguous by changing their definitions over the years[1], and (2) the more recent definitions and "interpretations" of UCS-2 *prohibit* use of surrogates in UCS-2 as far as I can tell. And that's what you'll see everywhere you look, because Wikipedia and friends pick up the most recent versions of everything. > So don't just think about "what will developers know?", also think > about "what will developers know, and what will a quick trip to a > search engine tell them?". It will tell them that UCS-2 cannot even *express* non-BMP characters. Terry and David are *not* dummies, and that's what they got from more or less careful study of the issue. > And once you take that stance, the overly > generic narrow/wide terms fail, badly. I still agree that something more accurate would be nice, but face it: the ISO will redefine and deprecate such terms as soon as they notice us using them. > +1 for MAL's suggested tweaks to the Py3k configure options. Despite my natural sympathy for your arguments, and MAL's, I'm still -1. I really wish I could switch back, but it seems to me that "UCS-2" is a liability we don't need, *especially* on Windows where the default build is presumably going to be UCS2 forever. 
Footnotes: [1] You'd think it would be hard to change the definition of UCS-4, but they managed. :-( From fuzzyman at voidspace.org.uk Tue Nov 23 17:19:16 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 23 Nov 2010 16:19:16 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <20101123153743.3D9451B8ED4@shell-too.nominum.com> References: <20101123153743.3D9451B8ED4@shell-too.nominum.com> Message-ID: <4CEBE984.4050807@voidspace.org.uk> On 23/11/2010 15:37, Ben.Cottrell at nominum.com wrote: > On Tue, 23 Nov 2010 15:15:29 +0000, Michael Foord wrote: >> There are still two reasonable APIs (unless you have changed your mind >> and think that sticking with plain integers is best), of which I prefer >> the latter: >> >> SOME_CONST = Constant('SOME_CONST', 1) >> OTHER_CONST = Constant('OTHER_CONST', 2) >> >> or: >> >> Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', start=1) >> SOME_CONST = Constants.SOME_CONST >> OTHER_CONST = Constants.OTHER_CONST > I prefer the latter too, because that makes it possible to have > 'Constants' be a rendezvous point for making sure that you're > passing something valid. Perhaps using 'in': > > def func(foo): > if foo not in Constants: > raise ValueError('foo must be SOME_CONST or OTHER_CONST') > ... > > I know this is probably not going to happen, but I would *so much* > like it if functions would start rejecting "the wrong kind of 2". > Constants that are valid, integer-wise, but which aren't part of > the set of constants allowed for that argument. I'd prefer not to > think of the number of times I've made the following mistake: > > s = socket.socket(socket.SOCK_DGRAM, socket.AF_INET) Well it would be perfectly possible for the __contains__ method (on the metaclass so that a Constants class can act as a container) to permit a *raw integer* (to be backwards compatible with code using hard coded values) but not permit other constants that aren't valid. 
Code that is *deliberately* using the wrong constants would be screwed
of course...

All the best,

Michael

> ~Ben

--
http://www.voidspace.org.uk/

From barry at python.org Tue Nov 23 17:27:03 2010
From: barry at python.org (Barry Warsaw)
Date: Tue, 23 Nov 2010 11:27:03 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <4CEBC6BD.9060402@voidspace.org.uk>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk>
Message-ID: <20101123112703.42b42812@mission>

On Nov 23, 2010, at 01:50 PM, Michael Foord wrote:

>Right. As it happens I just submitted a patch to Barry Warsaw's enum package
>(nice), flufl.enum [1], to allow namedtuple style creation of named
>constants:

Thanks for the plug (and the nice patch).

FWIW, the documentation for the package is here:

http://packages.python.org/flufl.enum/

I made some explicit decisions about the API and semantics of this package, to
fit my own use cases and sensibilities. I guess you wouldn't expect anything
else, but I'm willing to acknowledge that others would make different
decisions, and certainly the number of existing enum implementations out there
proves that there are lots of interesting ways to go about it.

That said, there are several things I like about my package:

* Enums are not subclassed from ints or strs. They are a distinct data type
  that can be converted to and from ints and strs. EIBTI.

* The typical way to create them is through a simple, but explicit class
  definition. I personally like being explicit about the item values, and the
  assignments are required to make the metaclass work properly, but Michael's
  convenience patch is totally appropriate for cases where you don't care, or
  you want a one-liner.

* Enum items are singletons and are intended to be compared by identity. They
  can be compared by equality but are not ordered.

* Enum items have an unambiguous symbolic repr and a nice human readable str.

* Given an enum item, you can get to its enum class, and given the class you
  can get to the set of items.

* Enums can be subclassed (though all items in the subclass must have unique
  values).

In any case it may be that enums are too tied to specific use cases to find a
good common ground for the stdlib. I've been using my module for years and if
there's interest I would of course be happy to donate it for use in the
stdlib. Like the original sets implementation, it makes perfect sense to
provide them in a separate module rather than as a built-in type.

-Barry

From barry at python.org Tue Nov 23 17:31:27 2010
From: barry at python.org (Barry Warsaw)
Date: Tue, 23 Nov 2010 11:31:27 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <4CEBDA91.4050205@voidspace.org.uk>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk>
Message-ID: <20101123113127.78506cb5@mission>

On Nov 23, 2010, at 03:15 PM, Michael Foord wrote:

>(Well, there is a third option that takes __name__ and sets the constants in
>the module automagically. I can understand why people would dislike that
>though.)

Personally, I think if you want that, then the explicit class definition is a
better way to go.

-Barry

From pje at telecommunity.com Tue Nov 23 17:52:37 2010
From: pje at telecommunity.com (P.J. 
Eby)
Date: Tue, 23 Nov 2010 11:52:37 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <20101123113127.78506cb5@mission>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <20101123113127.78506cb5@mission>
Message-ID: <20101123165252.0C0743A4114@sparrow.telecommunity.com>

At 11:31 AM 11/23/2010 -0500, Barry Warsaw wrote:
>On Nov 23, 2010, at 03:15 PM, Michael Foord wrote:
>
> >(Well, there is a third option that takes __name__ and sets the constants in
> >the module automagically. I can understand why people would dislike that
> >though.)
>
>Personally, I think if you want that, then the explicit class definition is a
>better way to go.

This reminds me: a stdlib enum should support proper pickling and
copying; i.e.:

    assert SomeEnum.anEnum is pickle.loads(pickle.dumps(SomeEnum.anEnum))

This could probably be implemented by adding something like:

    def __reduce__(self):
        return getattr, (self._class, self._enumname)

in the EnumValue class.
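
[Editorial sketch] The `__reduce__` idea above can be tried out in a few lines. This is a minimal sketch, not the flufl.enum implementation: `EnumValue` and `SomeEnum` here are hypothetical stand-ins, and only the `_class` and `_enumname` attribute names are taken from the message. The point is that reducing an item to `getattr(class, name)` makes pickle and copy hand back the existing singleton instead of building a new object.

```python
import copy
import pickle


class EnumValue:
    """Hypothetical enum item; attribute names follow the message above."""

    def __init__(self, cls, enumname, value):
        self._class = cls
        self._enumname = enumname
        self._value = value

    def __repr__(self):
        return "<EnumValue: %s.%s>" % (self._class.__name__, self._enumname)

    def __reduce__(self):
        # Rebuild by attribute lookup on the owning class, so unpickling
        # and copying return the one existing item.
        return getattr, (self._class, self._enumname)


class SomeEnum:
    """Hypothetical enum class; items are attached after creation."""


SomeEnum.anEnum = EnumValue(SomeEnum, "anEnum", 1)


if __name__ == "__main__":
    # The pickle round trip needs SomeEnum to be importable by name,
    # which holds when this file is run as a script.
    restored = pickle.loads(pickle.dumps(SomeEnum.anEnum))
    assert restored is SomeEnum.anEnum
    assert copy.copy(SomeEnum.anEnum) is SomeEnum.anEnum
    assert copy.deepcopy(SomeEnum.anEnum) is SomeEnum.anEnum
```

The copy module goes through the same reduce protocol, which is why one method covers both pickling and copying.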
From fuzzyman at voidspace.org.uk Tue Nov 23 18:02:33 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 23 Nov 2010 17:02:33 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <20101123112703.42b42812@mission> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <20101123112703.42b42812@mission> Message-ID: <4CEBF3A9.3060604@voidspace.org.uk> On 23/11/2010 16:27, Barry Warsaw wrote: > On Nov 23, 2010, at 01:50 PM, Michael Foord wrote: > >> Right. As it happens I just submitted a patch to Barry Warsaw's enum package >> (nice), flufl.enum [1], to allow namedtuple style creation of named >> constants: > Thanks for the plug (and the nice patch). > > FWIW, the documentation for the package is here: > > http://packages.python.org/flufl.enum/ > > I made some explicit decisions about the API and semantics of this package, to > fit my own use cases and sensibilities. I guess you wouldn't expect anything > else, but I'm willing to acknowledge that others would make different > decisions, and certainly the number of existing enum implementations out there > proves that there are lots of interesting ways to go about it. > > That said, there are several things I like about my package: > > * Enums are not subclassed from ints or strs. They are a distinct data type > that can be converted to and from ints and strs. EIBTI. But if we are to use it *in* the standard library (as opposed to merely adding a module *to* the standard library) there are backwards compatibility concerns. Where modules are already using integers for constants then integers still need to work. One easy way to achieve this is to subclass integer. 
If we don't do that (assuming we decide that putting a solution in the standard library is appropriate) then we'll have to evaluate what we mean by backwards compatible. If the modules that use the constants aren't to change then comparing equal to the underlying value is the minimum (so that the original value can still be used in place of the new named constant). Not sure if you'd be happy to make that change in flufl.enum. > * The typical way to create them is through a simple, but explicit class > definition. I personally like being explicit about the item values, and the > assignments are required to make the metaclass work properly, but Michael's > convenience patch is totally appropriate for cases where you don't care, or > you want a one-liner. If make_enum was to take a set of values to use (as Antoine suggested) I don't see what's un-explicit about it. All the best, Michael > * Enum items are singletons and are intended to be compared by identity. They > can be compared by equality but are not ordered. > > * Enum items have an unambiguous symbolic repr and a nice human readable str. > > * Given an enum item, you can get to its enum class, and given the class you > can get to the set of items. > > * Enums can be subclassed (though all items in the subclass must have unique > values). > > In any case it may be that enums are too tied to specific use cases to find a > good common ground for the stdlib. I've been using my module for years and if > there's interest I would of course be happy to donate it for use in the > stdlib. Like the original sets implementation, it makes perfect sense to > provide them in a separate module rather than as a built-in type. 
>
> -Barry

--
http://www.voidspace.org.uk/

From solipsis at pitrou.net Tue Nov 23 18:37:40 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 23 Nov 2010 18:37:40 +0100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To:
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain>
Message-ID: <1290533860.3642.73.camel@localhost.localdomain>

On Tuesday 23 November 2010 at 12:32 -0500, Isaac Morland wrote:
> On Tue, 23 Nov 2010, Antoine Pitrou wrote:
>
> > We already have a bunch of bizarrely unrelated stuff in collections
> > (such as Callable), so we could put enum there too.
>
> Why not just "enum" (i.e., "from enum import [...]" or "import
> enum.[...]")? Enumerations are one of the basic kinds of types overall
> (speaking informally and independent of any specific language) - they
> aren't at all exotic.

Enumerations aren't a type at all (they have no distinguishing
property).

> And "Flat is better than nested", after all.

Not when it means creating a separate module for every micro-feature.

Regards

Antoine.

From ijmorlan at uwaterloo.ca Tue Nov 23 18:32:15 2010
From: ijmorlan at uwaterloo.ca (Isaac Morland)
Date: Tue, 23 Nov 2010 12:32:15 -0500 (EST)
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290528319.3642.11.camel@localhost.localdomain>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain>
Message-ID:

On Tue, 23 Nov 2010, Antoine Pitrou wrote:

> We already have a bunch of bizarrely unrelated stuff in collections
> (such as Callable), so we could put enum there too.

Why not just "enum" (i.e., "from enum import [...]" or "import
enum.[...]")? Enumerations are one of the basic kinds of types overall
(speaking informally and independent of any specific language) - they
aren't at all exotic.

And "Flat is better than nested", after all.

Isaac Morland			CSCF Web Guru
DC 2554C, x36650		WWW Software Specialist

From ijmorlan at uwaterloo.ca Tue Nov 23 18:50:31 2010
From: ijmorlan at uwaterloo.ca (Isaac Morland)
Date: Tue, 23 Nov 2010 12:50:31 -0500 (EST)
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290533860.3642.73.camel@localhost.localdomain>
References: <20101121034404.52924F20A@mail.python.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain>
Message-ID:

On Tue, 23 Nov 2010, Antoine Pitrou wrote:

> On Tuesday 23 November 2010 at 12:32 -0500, Isaac Morland wrote:
>> On Tue, 23 Nov 2010, Antoine Pitrou wrote:
>>
>>> We already have a bunch of bizarrely unrelated stuff in collections
>>> (such as Callable), so we could put enum there too.
>>
>> Why not just "enum" (i.e., "from enum import [...]" or "import
>> enum.[...]")? Enumerations are one of the basic kinds of types overall
>> (speaking informally and independent of any specific language) - they
>> aren't at all exotic.
>
> Enumerations aren't a type at all (they have no distinguishing
> property).

Each enumeration is a type (well, OK, not in every language, presumably,
but certainly in many languages). The word "basic" is more important than
"types" in my sentence - the point is that an enumeration capability is a
very common one in a type system, and is very general, not specific to any
particular application.

>> And "Flat is better than nested", after all.
> > Not when it means creating a separate module for every micro-feature. Classes have their own keyword. I don't think it's disproportionate to give enums a top-level module name. Having said that, I understand we're trying to have a not-too-flat module namespace and I can see the sense in putting it in "collections". But I think the idea that enumerations are of very wide applicability and hence deserve a shorter name should be seriously considered. I'll leave it at that, except for: Hey, how about this syntax: enum Colors: red = 0 green = 10 blue (blue gets the value 11) ;-) Isaac Morland CSCF Web Guru DC 2554C, x36650 WWW Software Specialist From fdrake at acm.org Tue Nov 23 18:57:20 2010 From: fdrake at acm.org (Fred Drake) Date: Tue, 23 Nov 2010 12:57:20 -0500 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <1290533860.3642.73.camel@localhost.localdomain> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> Message-ID: On Tue, Nov 23, 2010 at 12:37 PM, Antoine Pitrou wrote: > Enumerations aren't a type at all (they have no distinguishing > property). In any given language, this may be true, or not. Whether they should be distinct in Python is core to the current discussion. >From a backward-compatibility perspective, what makes sense depends on whether they're used to implement existing constants (socket.AF_INET, etc.) 
or if they are reserved for new features only. -Fred -- Fred L. Drake, Jr. "A storm broke loose in my mind." --Albert Einstein From solipsis at pitrou.net Tue Nov 23 19:06:42 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 23 Nov 2010 19:06:42 +0100 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> Message-ID: <1290535602.3642.87.camel@localhost.localdomain> On Tuesday 23 November 2010 at 12:57 -0500, Fred Drake wrote: > On Tue, Nov 23, 2010 at 12:37 PM, Antoine Pitrou wrote: > > Enumerations aren't a type at all (they have no distinguishing > > property). > > In any given language, this may be true, or not. Whether they should > be distinct in Python is core to the current discussion. I meant "type" in the structural sense (hence the parenthesis). enums are just auto-generated constants. Since Python makes it trivial to generate sequential integers, there's no need for a specific "enum" construct. Now you may argue that enums should be strongly-typed, but that would be a bit backwards given Python's preference for duck-typing. > From a backward-compatibility perspective, what makes sense depends on > whether they're used to implement existing constants (socket.AF_INET, > etc.) or if they are reserved for new features only. It's not only backwards compatibility.
New features relying on C APIs have to be able to map constants to the integers used in the C library. It would be much better if this were done naturally rather than through explicit conversion maps. (this really means subclassing int, if we don't want to complicate C-level code) Regards Antoine. From solipsis at pitrou.net Tue Nov 23 19:07:56 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 23 Nov 2010 19:07:56 +0100 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> Message-ID: <1290535676.3642.89.camel@localhost.localdomain> Le mardi 23 novembre 2010 ? 12:50 -0500, Isaac Morland a ?crit : > Each enumeration is a type (well, OK, not in every language, presumably, > but certainly in many languages). The word "basic" is more important than > "types" in my sentence - the point is that an enumeration capability is a > very common one in a type system, and is very general, not specific to any > particular application. Python already has an enumeration capability. It's called range(). There's nothing else that C enums have. AFAICT, neither do enums in other mainstream languages (assuming they even exist; I don't remember Perl, PHP or Javascript having anything like that, but perhaps I'm mistaken). Regards Antoine. 
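As a minimal sketch of the range() argument made in the message above: sequential constants can be generated in one line, and they remain plain ints. The names RED, GREEN and BLUE are illustrative only and do not come from any module discussed in this thread.

```python
# Sketch of the "enums are just range()" argument from the thread.
# RED, GREEN and BLUE are hypothetical names chosen for illustration.
RED, GREEN, BLUE = range(3)

# The values are ordinary ints, so they can be passed straight to
# integer-based C APIs without any conversion map.
assert GREEN == 1 and isinstance(GREEN, int)

# The trade-off raised elsewhere in the thread: the constant carries
# no name, so repr() shows only the number.
print(repr(GREEN))  # prints 1
```

Whether that missing name matters is the point under debate in the rest of the thread.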
From v+python at g.nevcal.com Tue Nov 23 19:56:20 2010 From: v+python at g.nevcal.com (Glenn Linderman) Date: Tue, 23 Nov 2010 10:56:20 -0800 Subject: [Python-Dev] is this a bug? no environment variables In-Reply-To: <4CEBABBA.9050002@v.loewis.de> References: <4CEA0246.9080607@g.nevcal.com> <4CEB97C7.1070708@g.nevcal.com> <4CEBABBA.9050002@v.loewis.de> Message-ID: <4CEC0E54.5070101@g.nevcal.com> On 11/23/2010 3:55 AM, "Martin v. Löwis" wrote: > On 23.11.2010 11:55, Amaury Forgeot d'Arc wrote: >> Hi, >> >> 2010/11/23 Glenn Linderman: >>> File "C:\Python32\lib\random.py", line 108, in seed >>> a = int.from_bytes(_urandom(32), 'big') >>> WindowsError: [Error -2146893818] Invalid Signature >> In the subprocess documentation http://docs.python.org/library/subprocess.html >> """On Windows, in order to run a side-by-side assembly the specified >> env *must* include a valid SystemRoot.""" > Indeed, setting SystemRoot might solve this problem. According to > > http://jpassing.com/2009/12/28/the-hidden-danger-of-forgetting-to-specify-systemroot-in-a-custom-environment-block/ > > CryptoAPI, in Windows 7, requires this variable be set. Failure to > find the enhanced crypto provider would explain why the "random" > module of Python fails to work. > > The specific cause is in the registry: > HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Cryptography\Defaults\Provider\Microsoft > Strong Cryptographic Provider has as its ImagePath value > > %SystemRoot%\system32\rsaenh.dll > > So the registry (and COM) do rely on environment variables. > > Regards, > Martin I find it sad but hilarious that after working so hard to remove the need for environment variables from Windows that M$ has introduced new dependencies on them. I wonder if this particular registry variable is simply an oversight/bug on M$' part, that they will eventually fix, or if it is a turnaround toward the use of more environment variables in the future. Hmm. Time will tell, I suppose.
I'm unaware of any benefits in _changing_ SystemRoot to other values, so not pre-expanding it in that registry location seems only to add an unnecessary dependency on the environment. Indeed, preserving that one environment variable allows my version of http.server to proceed with, as far as initial testing can determine, proper behavior. Thanks for your help in figuring this out. That was a lot faster than a "binary search" to choose which variable(s) to preserve. My purpose in such testing was two-fold: firstly, web servers, for security purposes, generally limit the number of environment variables that are seen by CGI programs, and secondly, in debugging whether or not http.server was properly setting the necessary environment variables, the many other environment variables were cluttering up log dumps of all environment variables. It will be nicer to limit the "passed through" environment variables to SystemRoot, and see how things go. I have read some about side-by-side assemblies but had considered them a good reason to stick with the outdated M$VC 6.0 compiler, which doesn't seem to need to create them, and their myriad requirements, which seem far from necessary for simply compiling a program. I was disappointed to realize that Python was heading down the path of using the newer tools that create side-by-side assemblies, but I suppose using an old and crufty compiler like M$VC 6.0 cannot support some of the newer features of Windows, which may seem to be necessary to some.... like 64-bit support, which does seem necessary, even to me.
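The idea of passing only SystemRoot through to child processes, mentioned above, could look roughly like the sketch below. The helper name minimal_env and its keep parameter are hypothetical and are not taken from http.server or from any patch in this thread. Names are compared case-insensitively here because Windows treats environment variable names that way.

```python
import os


def minimal_env(source=None, keep=("SystemRoot",)):
    """Return a new dict holding only the variables named in `keep`.

    `source` defaults to the current process environment. This is a
    hypothetical helper written for illustration, not an stdlib API.
    """
    if source is None:
        source = os.environ
    # Compare names case-insensitively, as Windows does.
    wanted = {name.upper() for name in keep}
    return {key: value for key, value in source.items() if key.upper() in wanted}


# Example with a made-up environment block:
sample = {"SYSTEMROOT": r"C:\Windows", "PATH": r"C:\bin", "TEMP": r"C:\tmp"}
print(minimal_env(sample))
```

The resulting dict is the kind of mapping one would pass as the env argument of subprocess.Popen when launching a CGI script. Any further variables a particular script needs would have to be added to keep.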
I was well aware that shortcuts and the registry _may_ refer to environment variables, and have a number of environment variables of my own which leverage that capability, to avoid hard-coded drive letters and paths in certain areas, and for the convenience of shortening the specification of some of the long-winded path names that Windows foists upon us (some of those have been significantly shortened in Windows 6.1, and maybe 6.0 which I used only for 2 months with disgust; 6.1 has helped alleviate the disgust, but I still recommend XP for people that don't need 64-bit capabilities). From v+python at g.nevcal.com Tue Nov 23 19:58:37 2010 From: v+python at g.nevcal.com (Glenn Linderman) Date: Tue, 23 Nov 2010 10:58:37 -0800 Subject: [Python-Dev] is this a bug? no environment variables In-Reply-To: References: <4CEA0246.9080607@g.nevcal.com> <4CEAE6A7.3010902@g.nevcal.com> Message-ID: <4CEC0EDD.5080604@g.nevcal.com> On 11/22/2010 2:56 PM, Tim Lesher wrote: > On Mon, Nov 22, 2010 at 16:54, Glenn Linderman wrote: >> I suppose it is possible that some environment variables are used by Python >> directly (but I can't seem to find a documented list of them) although I >> would expect that usage to be optional, with fall-back defaults when they >> don't exist. > I can verify that that's the case: Python (at least through 3.1.2) > runs fine on Windows platforms when environment variables are > completely unavailable. I know that from running our port for Windows > CE (which has no environment variables at all), cross-compiled for > Windows XP. Is the Windows CE port generally available? From where? The CE ports I have found in past searches seem to have been quite outdated, without much ongoing activity. -------------- next part -------------- An HTML attachment was scrubbed...
From alexander.belopolsky at gmail.com Tue Nov 23 20:11:06 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 23 Nov 2010 14:11:06 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Mon, Nov 22, 2010 at 1:13 PM, Raymond Hettinger wrote: .. > Any explanation we give users needs to let them know two things: > * that we cover the entire range of unicode not just BMP > * that sometimes len(chr(i)) is one and sometimes two This discussion motivated me to start looking into how well the Python library itself is prepared to deal with len(chr(i)) = 2. I was not surprised to find that textwrap does not handle the issue that well: >>> len(wrap(' \U00010140' * 80, 20)) 12 >>> len(wrap(' \U00000140' * 80, 20)) 8 That module should probably be rewritten to properly implement the Unicode line breaking algorithm. Yet finding a bug in a str object method after a 5 min review was a bit discouraging: >>> 'xyz'.center(20, '\U00010140') Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: The fill character must be exactly one character long Given the apparent difficulty of writing even basic text processing algorithms in the presence of surrogate pairs, I wonder how wise it is to expose Python users to them. As Wikipedia explains, [1] """ Because the most commonly used characters are all in the Basic Multilingual Plane, converting between surrogate pairs and the original values is often not tested thoroughly. This leads to persistent bugs, and potential security holes, even in popular and well-reviewed application software.
""" Since UCS-2 (the Character Encoding Form (CEF)) is now defined [1] to cover only BMP, maybe rather than changing the terms used in the reference manual, we should tighten the code to conform to the updated standards? Again, given that the str object itself has at least one non-BMP character bug as we are closing on the third major release of py3k, how likely are 3rd party developers to get their libraries right as they port to 3.x? [1] http://en.wikipedia.org/wiki/UTF-16/UCS-2 [2] http://unicode.org/reports/tr17/#CharacterEncodingForm From amauryfa at gmail.com Tue Nov 23 20:19:28 2010 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Tue, 23 Nov 2010 20:19:28 +0100 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: 2010/11/23 Alexander Belopolsky : > This discussion motivated me to start looking into how well Python > library itself is prepared to deal with len(chr(i)) = 2. ?I was not > surprised to find that textwrap does not handle the issue that well: > >>>> len(wrap(' \U00010140' * 80, 20)) > 12 >>>> len(wrap(' \U00000140' * 80, 20)) > 8 > > That module should probably be rewritten to properly implement ?the > Unicode line breaking algorithm > . 
> > Yet finding a bug in a str object method after a 5 min review was a > bit discouraging: > >>>> 'xyz'.center(20, '\U00010140') > Traceback (most recent call last): > ?File "", line 1, in > TypeError: The fill character must be exactly one character long > > Given the apparent difficulty of writing even basic text processing > algorithms in presence of surrogate pairs, I wonder how wise it is to > expose Python users to them. This was already discussed two years ago: http://mail.python.org/pipermail/python-dev/2008-July/080900.html So yes, wrap() and center() should be fixed. -- Amaury Forgeot d'Arc From janssen at parc.com Tue Nov 23 20:26:57 2010 From: janssen at parc.com (Bill Janssen) Date: Tue, 23 Nov 2010 11:26:57 PST Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> Message-ID: <58396.1290540417@parc.com> Isaac Morland wrote: > On Tue, 23 Nov 2010, Antoine Pitrou wrote: > > > Le mardi 23 novembre 2010 ? 12:32 -0500, Isaac Morland a ?crit : > >> On Tue, 23 Nov 2010, Antoine Pitrou wrote: > >> > >>> We already have a bunch of bizarrely unrelated stuff in collections > >>> (such as Callable), so we could put enum there too. > >> > >> Why not just "enum" (i.e., "from enum import [...]" or "import > >> enum.[...]")? 
Enumerations are one of the basic kinds of types overall > >> (speaking informally and independent of any specific language) - they > >> aren't at all exotic. > > > > Enumerations aren't a type at all (they have no distinguishing > > property). Not in C, but in some other languages. > Each enumeration is a type (well, OK, not in every language, > presumably, but certainly in many languages). The main purpose of that is to be able to catch type mismatches with static typing, though. Seems kind of pointless for Python. > Classes have their own keyword. I don't think it's disproportionate > to give enums a top-level module name. I do. > Hey, how about this syntax: > > enum Colors: > red = 0 > green = 10 > blue Why not class Color: red = (255, 0, 0) green = (0, 255, 0) blue = (0, 0, 255) Seems to handle the situation OK. Bill From mal at egenix.com Tue Nov 23 20:31:37 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 23 Nov 2010 20:31:37 +0100 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CEC1699.40700@egenix.com> Alexander Belopolsky wrote: > On Mon, Nov 22, 2010 at 1:13 PM, Raymond Hettinger > wrote: > .. >> Any explanation we give users needs to let them know two things: >> * that we cover the entire range of unicode not just BMP >> * that sometimes len(chr(i)) is one and sometimes two > > This discussion motivated me to start looking into how well Python > library itself is prepared to deal with len(chr(i)) = 2. 
I was not > surprised to find that textwrap does not handle the issue that well: > >>>> len(wrap(' \U00010140' * 80, 20)) > 12 >>>> len(wrap(' \U00000140' * 80, 20)) > 8 > > That module should probably be rewritten to properly implement the > Unicode line breaking algorithm > . > > Yet finding a bug in a str object method after a 5 min review was a > bit discouraging: > >>>> 'xyz'.center(20, '\U00010140') > Traceback (most recent call last): > File "", line 1, in > TypeError: The fill character must be exactly one character long > > Given the apparent difficulty of writing even basic text processing > algorithms in presence of surrogate pairs, I wonder how wise it is to > expose Python users to them. What's the alternative ? Without surrogates, Python users with UCS-2 build (e.g. the Windows Python users) would not be allowed to play with non-BMP code points. IMHO, it's better to fix the stdlib. This is a long process, as you can see with the Python3 stdlib evolution, but Python will eventually get there. > As Wikipedia explains, [1] > > """ > Because the most commonly used characters are all in the Basic > Multilingual Plane, converting between surrogate pairs and the > original values is often not tested thoroughly. This leads to > persistent bugs, and potential security holes, even in popular and > well-reviewed application software. > """ > > Since UCS-2 (the Character Encoding Form (CEF)) is now defined [1] to > cover only BMP, maybe rather than changing the terms used in the > reference manual, we should tighten the code to conform to the updated > standards? Can we please stop turning this around over and over again :-) UCS-2 has never supported anything other than the BMP. However, you can interpret sequences of UCS-2 code unit as UTF-16 and then get access to the full Unicode character set. We've been doing this in codecs ever since UCS-4 builds were introduced some 8-9 years ago. 
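To make the arithmetic behind "interpreting UCS-2 code units as UTF-16" concrete, here is a small sketch of how a non-BMP code point such as U+10140 (the one used in the examples quoted above) splits into two surrogate code units. The function names are hypothetical and written for illustration only. The second helper counts UTF-16 code units by encoding, so its result should not depend on how a given interpreter build stores strings internally.

```python
def to_surrogate_pair(code_point):
    """Split a non-BMP code point into its UTF-16 (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = code_point - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def utf16_code_units(text):
    """Count the 16-bit code units needed to encode `text` as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


high, low = to_surrogate_pair(0x10140)
print(hex(high), hex(low))                 # prints 0xd800 0xdd40
print(utf16_code_units("\U00010140"))      # prints 2
print(utf16_code_units("\u0140"))          # prints 1
```

The two-versus-one result is the same distinction that the len(chr(i)) discussion is about on builds that store strings as 16-bit units.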
The change to have chr(i) return surrogates on UCS-2 builds was perhaps done too early, but then, without such changes you'd never notice that your code doesn't work well with surrogates. It's just one piece of the puzzle when going from 8-bit strings to Unicode. > Again, given that the str object itself has at least one non-BMP > character bug as we are closing on the third major release of py3k, > how likely are 3rd party developers to get their libraries right as > they port to 3.x? > > [1] http://en.wikipedia.org/wiki/UTF-16/UCS-2 > [2] http://unicode.org/reports/tr17/#CharacterEncodingForm -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 23 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From guido at python.org Tue Nov 23 20:34:17 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 23 Nov 2010 11:34:17 -0800 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <1290535602.3642.87.camel@localhost.localdomain> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <1290535602.3642.87.camel@localhost.localdomain> Message-ID: On Tue, Nov 23, 2010 at 10:06 AM, Antoine Pitrou wrote: > Le mardi 23 novembre 2010 ? 12:57 -0500, Fred Drake a ?crit : >> On Tue, Nov 23, 2010 at 12:37 PM, Antoine Pitrou wrote: >> > Enumerations aren't a type at all (they have no distinguishing >> > property). >> >> In any given language, this may be true, or not. ?Whether they should >> be distinct in Python is core to the current discussion. > > I meant "type" in the structural sense (hence the parenthesis). enums > are just auto-generated constants. Since Python makes it trivial to > generate sequential integers, there's no need for a specific "enum" > construct. > > Now you may argue that enums should be strongly-typed, but that would be > a bit backwards given Python's preference for duck-typing. Please take a step back. The best example of the utility of enums even for Python is bool. 
I resisted this for the longest time but people kept asking for it. Some properties of bool: (a) bool is a (final) subclass of int, and an int is acceptable in a pinch where a bool is expected (b) bool values are guaranteed unique -- there is only one instance with value True, and only one with value False (c) bool values have a str() and repr() that shows their name instead of their value (but not their class -- that's rarely an issue, and makes the output more compact) I think it makes sense to add a way to the stdlib to add other types like bool. I think (c) is probably the most important feature, followed by (a) -- except the *final* part: I want to subclass enums. (b) is probably easy to do but I don't think it matters that much in practice. >> From a backward-compatibility perspective, what makes sense depends on >> whether they're used to implement existing constants (socket.AF_INET, >> etc.) or if they reserved for new features only. > > It's not only backwards compatibility. New features relying on C APIs > have to be able to map constants to the integers used in the C library. > It would be much better if this were done naturally rather than through > explicit conversion maps. I'm not sure what you mean here. Can you give an example of what you mean? I agree that it should be possible to make pretty much any constant in the OS modules enums -- even if the values vary across platforms. > (this really means subclassing int, if we don't want to complicate > C-level code) Right. FWIW I don't think I'm particular about the exact API to construct a new enum type in Python code; I think in most cases explicitly assigning values is fine. Often the values are constrained by something external anyway; it should be easy to dynamically set the values of a particular enum type (even add new values after the fact). There might also be enums with the same value (even though the mapping from int to enum will then have to pick one). 
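As a rough sketch of properties (a) and (c) listed above, here is an int subclass whose str() and repr() show a name while the value still behaves as an ordinary integer. The class name NamedInt and the constant AF_EXAMPLE are hypothetical. This is not flufl.enum and not a proposed stdlib API, and it deliberately leaves out property (b), uniqueness of instances.

```python
class NamedInt(int):
    """An int subclass that prints as its name, the way bool does.

    Hypothetical illustration only; it does not enforce uniqueness.
    """

    def __new__(cls, name, value):
        # int is immutable, so the value must be fixed in __new__.
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __repr__(self):
        return self._name

    __str__ = __repr__


AF_EXAMPLE = NamedInt("AF_EXAMPLE", 2)

print(AF_EXAMPLE)             # prints AF_EXAMPLE
print(AF_EXAMPLE == 2)        # prints True
print(int(AF_EXAMPLE) + 1)    # prints 3
```

Because the constant is an int, code that still passes or compares the bare number keeps working, which is the backward-compatibility concern raised in this thread.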
I expect that the API to convert between enums and bare ints should be i = int(e) and e = (i). It would be nice if s = str(e) and e = (s) would work too. -- --Guido van Rossum (python.org/~guido) From barry at python.org Tue Nov 23 20:40:45 2010 From: barry at python.org (Barry Warsaw) Date: Tue, 23 Nov 2010 14:40:45 -0500 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> Message-ID: <20101123144045.17b00ac4@mission> On Nov 23, 2010, at 12:57 PM, Fred Drake wrote: >>From a backward-compatibility perspective, what makes sense depends on >whether they're used to implement existing constants (socket.AF_INET, >etc.) or if they reserved for new features only. As is usually the case, there's little reason to change existing working code. Enums can be used whenever a module or API is updated. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From barry at python.org Tue Nov 23 20:47:47 2010 From: barry at python.org (Barry Warsaw) Date: Tue, 23 Nov 2010 14:47:47 -0500 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CEBF3A9.3060604@voidspace.org.uk> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <20101123112703.42b42812@mission> <4CEBF3A9.3060604@voidspace.org.uk> Message-ID: <20101123144747.44a2f4c9@mission> On Nov 23, 2010, at 05:02 PM, Michael Foord wrote: >> * Enums are not subclassed from ints or strs. They are a distinct data type >> that can be converted to and from ints and strs. EIBTI. > >But if we are to use it *in* the standard library (as opposed to merely >adding a module *to* the standard library) there are backwards compatibility >concerns. Where modules are already using integers for constants then >integers still need to work. Is int(enum_value) enough, or must the enum value actually *be* an int? >One easy way to achieve this is to subclass integer. If we don't do that >(assuming we decide that putting a solution in the standard library is >appropriate) then we'll have to evaluate what we mean by backwards >compatible. If the modules that use the constants aren't to change then >comparing equal to the underlying value is the minimum (so that the original >value can still be used in place of the new named constant). Not sure if >you'd be happy to make that change in flufl.enum. I'm not sure either. In flufl.enum enum_class(i) also works as expected. >> * The typical way to create them is through a simple, but explicit class >> definition. 
I personally like being explicit about the item values, and >> the assignments are required to make the metaclass work properly, but >> Michael's convenience patch is totally appropriate for cases where you >> don't care, or you want a one-liner. > >If make_enum was to take a set of values to use (as Antoine suggested) I >don't see what's un-explicit about it. When I saw your patch I immediately thought that I could add a default argument that was something like `int_iter`, i.e. an iterator of integers for the values in the string. I suspect YAGNI, which is why I didn't just add it, but I'm not totally opposed to it. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From barry at python.org Tue Nov 23 21:01:02 2010 From: barry at python.org (Barry Warsaw) Date: Tue, 23 Nov 2010 15:01:02 -0500 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <20101123165252.0C0743A4114@sparrow.telecommunity.com> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <20101123113127.78506cb5@mission> <20101123165252.0C0743A4114@sparrow.telecommunity.com> Message-ID: <20101123150102.75f6256c@mission> On Nov 23, 2010, at 11:52 AM, P.J. 
Eby wrote: >This reminds me: a stdlib enum should support proper pickling and copying; >i.e.: > > assert SomeEnum.anEnum is pickle.loads(pickle.dumps(SomeEnum.anEnum)) > >This could probably be implemented by adding something like: > > def __reduce__(self): > return getattr, (self._class, self._enumname) > >in the EnumValue class. Excellent idea, thanks. Added to flufl.enum in r38. However, only enums created with the class syntax can be pickled though. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From guido at python.org Tue Nov 23 21:00:51 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 23 Nov 2010 12:00:51 -0800 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <20101123144747.44a2f4c9@mission> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <20101123112703.42b42812@mission> <4CEBF3A9.3060604@voidspace.org.uk> <20101123144747.44a2f4c9@mission> Message-ID: On Tue, Nov 23, 2010 at 11:47 AM, Barry Warsaw wrote: > On Nov 23, 2010, at 05:02 PM, Michael Foord wrote: > >>> * Enums are not subclassed from ints or strs. ?They are a distinct data type >>> ? ?that can be converted to and from ints and strs. ?EIBTI. >> >>But if we are to use it *in* the standard library (as opposed to merely >>adding a module *to* the standard library) there are backwards compatibility >>concerns. Where modules are already using integers for constants then >>integers still need to work. > > Is int(enum_value) enough, or must the enum value actually *be* an int? I vote for *be*, following bool's example. 
>>One easy way to achieve this is to subclass integer. If we don't do that >>(assuming we decide that putting a solution in the standard library is >>appropriate) then we'll have to evaluate what we mean by backwards >>compatible. If the modules that use the constants aren't to change then >>comparing equal to the underlying value is the minimum (so that the original >>value can still be used in place of the new named constant). Not sure if >>you'd be happy to make that change in flufl.enum. > > I'm not sure either. ?In flufl.enum enum_class(i) also works as expected. > >>> * The typical way to create them is through a simple, but explicit class >>> ? ?definition. ?I personally like being explicit about the item values, and >>> ? ?the assignments are required to make the metaclass work properly, but >>> ? ?Michael's convenience patch is totally appropriate for cases where you >>> ? ?don't care, or you want a one-liner. >> >>If make_enum was to take a set of values to use (as Antoine suggested) I >>don't see what's un-explicit about it. > > When I saw your patch I immediately thought that I could add a default > argument that was something like `int_iter`, i.e. an iterator of integers for > the values in the string. ?I suspect YAGNI, which is why I didn't just add it, > but I'm not totally opposed to it. > > -Barry > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > > -- --Guido van Rossum (python.org/~guido) From jcea at jcea.es Tue Nov 23 21:33:02 2010 From: jcea at jcea.es (Jesus Cea) Date: Tue, 23 Nov 2010 21:33:02 +0100 Subject: [Python-Dev] Sporadic problems with bugs.python.org Message-ID: <4CEC24FE.70107@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Happen to me last Sunday, and happening just now. 
I can access http://bugs.python.org/ just fine, but trying to post a message, open a new bug, change nosy, etc., takes a LONG time (minutes) and it is finally failing with a "400 Bad Request" error: """ Bad Request Your browser sent a request that this server could not understand. Apache/2.2.9 (Debian) mod_python/3.3.1 Python/2.5.2 mod_ssl/2.2.9 OpenSSL/0.9.8g mod_wsgi/2.5 Server at bugs.python.org Port 80 """ Last sunday I was able to open the bug after a time. Today I have been retrying for while, with no luck yet. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . _/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOwk/plgi5GaxT1NAQJYuQP+LhEUtOXyaz0Ut6586/cwura87jq/XVxn XatNzwadYNH4yF3ewXVkLk6eSjXOnEszr8kWX3inoLY9ND7o3TCMn5uCKOF2G4Lh sgogv7eB5KEffAaXoxZxT+ZJVYBEPyUISgMeD40DL/tQJIcMBtyZtU1nY5QxwPzN O8mGHBlEGpQ= =i/s7 -----END PGP SIGNATURE----- From martin at v.loewis.de Tue Nov 23 21:33:19 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 23 Nov 2010 21:33:19 +0100 Subject: [Python-Dev] is this a bug? no environment variables In-Reply-To: <4CEC0E54.5070101@g.nevcal.com> References: <4CEA0246.9080607@g.nevcal.com> <4CEB97C7.1070708@g.nevcal.com> <4CEBABBA.9050002@v.loewis.de> <4CEC0E54.5070101@g.nevcal.com> Message-ID: <4CEC250F.6060102@v.loewis.de> > I have read some about side-by-side assemblies but had considered them a > good reason to stick with the outdated M$VC 6.0 compiler, which doesn't > seem to need to create them, and their myriad requirements, which seem > far from necessary for simply compiling a program. 
I was disappointed > to realize that Python was heading down the path of using the newer > tools that create side-by-side assemblies, but I suppose using an old > and crufty compiler like M$VC 6.0 cannot support some of the newer > features of Windows, which may seem to be necessary to some.... like > 64-bit support, which does seem necessary, even to me. The rationale for moving along with the releases is different, though: you cannot obtain the old versions anymore, except perhaps on Ebay. So new developers coming to Python would not be able to build Python extensions if we didn't always try to use a compiler that is still available (and we are stressing that a little bit: 3.2 will use VS 2008, even though it has been already superceded). In any case, VS 2010 will stop using SxS for the CRT. Regards, Martin From v+python at g.nevcal.com Tue Nov 23 21:42:40 2010 From: v+python at g.nevcal.com (Glenn Linderman) Date: Tue, 23 Nov 2010 12:42:40 -0800 Subject: [Python-Dev] is this a bug? no environment variables In-Reply-To: <4CEC250F.6060102@v.loewis.de> References: <4CEA0246.9080607@g.nevcal.com> <4CEB97C7.1070708@g.nevcal.com> <4CEBABBA.9050002@v.loewis.de> <4CEC0E54.5070101@g.nevcal.com> <4CEC250F.6060102@v.loewis.de> Message-ID: <4CEC2740.7@g.nevcal.com> On 11/23/2010 12:33 PM, "Martin v. L?wis" wrote: > In any case, VS 2010 will stop using SxS for the CRT. Good news! Maybe M$VC will become a useful compiler yet again :) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From v+python at g.nevcal.com Tue Nov 23 21:43:05 2010 From: v+python at g.nevcal.com (Glenn Linderman) Date: Tue, 23 Nov 2010 12:43:05 -0800 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <1290535602.3642.87.camel@localhost.localdomain> Message-ID: <4CEC2759.40203@g.nevcal.com> On 11/23/2010 11:34 AM, Guido van Rossum wrote: > The best example of the utility of enums even for Python is bool. I > resisted this for the longest time but people kept asking for it. Some > properties of bool: > > (a) bool is a (final) subclass of int, and an int is acceptable in a > pinch where a bool is expected > (b) bool values are guaranteed unique -- there is only one instance > with value True, and only one with value False > (c) bool values have a str() and repr() that shows their name instead > of their value (but not their class -- that's rarely an issue, and > makes the output more compact) > > I think it makes sense to add a way to the stdlib to add other types > like bool. I think (c) is probably the most important feature, > followed by (a) -- except the *final* part: I want to subclass enums. > (b) is probably easy to do but I don't think it matters that much in > practice. I was concerned about uniqueness constraints some were touting. While that can be a useful property for some enumerations, it can also be convenient for other enumerations to have multiple names map to the same value. 
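
As a small illustration of several names mapping to one value, here is a sketch using a plain class as the namespace. The class Color, its members, and the helper names_of are hypothetical and exist only for this example.

```python
class Color:
    """Hypothetical constants namespace; GRAY and GREY share one value."""
    BLACK = 0
    GREY = 8
    GRAY = 8   # alias for GREY


def names_of(namespace, value):
    """Return every public name in `namespace` bound to `value`, sorted."""
    return sorted(
        name
        for name, bound in vars(namespace).items()
        if not name.startswith("_") and bound == value
    )


aliases = names_of(Color, 8)
```

With aliases allowed, any value-to-name lookup (for a repr, say) needs a rule for which of the names is the canonical one.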
Bool seems appropriately not extensible to additional values. While there are tri-valued (and other) logic systems, they deserve a separate namespace.

Bool seems to be an example, then, of a "set of distinguished names, with values associated to the names", and is restricted to [two] [unique] integer values. C/C++/C# enum is somewhat like that, and is also restricted to integer values [not necessarily unique].

I wonder if a set of distinguished names needs to be restricted to integer values to be useful, although I have no doubt that distinguished names with integer values are useful. Someone used an example of a color-names class having RGB tuple values, which is a counterexample to a restriction to integers. I can think of others as well.

Perhaps a "set of distinguished names, with values associated to the names" is really a dict, with the unique names restricted to Python identifier syntax (to be useful), and the values unrestricted. The type of the named value, and the value of the named value, seem not to need to be restricted.

But the implementations

    Bool = {'False': 0, 'True': 1}

or alternately

    class Bool():
        self.False = 0
        self.True = 1

are missing a couple of characteristics of Python's present bool: the names are not special, and the values are not immutable. Perhaps games could be played to make the second implementation effectively immutable.

So I think the real trick of the "enum" (or a generalized "distinguished names") is in the naming. A technique to import the keys that are legal Python identifiers from a dict into a namespace, and retain henceforth immutable values for those names, would permit the syntactical usage that people are accustomed to from the C/C++/C# enum, but with extended ranges and types of values, and it seems Bool could be mostly reimplemented via that technique.

What is still missing?
The "debugging" help: the values, once imported, should not become "just" values of their type, but rather a new type of value, that has an associated name (and type, I think). Whatever magic is worked under the covers to make sure that there is just one True and just one False, so that they can be distinguished from the values 1 and 0, and so reported, should also be applied to these values. So there need not be new syntax for creating the name/value pairs; just use dict. The only new API would be the code that "imports" the dict into the local namespace. Note that other scoped definitions of True and False are not possible today because True and False are keywords. It would be inappropriate to define these distinguished names as all being keywords, so it seems like one could still override the names, even once defined, but such overridden names would lose their special value that makes them a distinguished name. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solipsis at pitrou.net Tue Nov 23 21:48:43 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 23 Nov 2010 21:48:43 +0100 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <1290535602.3642.87.camel@localhost.localdomain> Message-ID: <1290545323.3642.101.camel@localhost.localdomain> Le mardi 23 novembre 2010 ? 11:34 -0800, Guido van Rossum a ?crit : > >> From a backward-compatibility perspective, what makes sense depends on > >> whether they're used to implement existing constants (socket.AF_INET, > >> etc.) or if they reserved for new features only. > > > > It's not only backwards compatibility. New features relying on C APIs > > have to be able to map constants to the integers used in the C library. > > It would be much better if this were done naturally rather than through > > explicit conversion maps. > > I'm not sure what you mean here. Can you give an example of what you > mean? I agree that it should be possible to make pretty much any > constant in the OS modules enums -- even if the values vary across > platforms. I mean that PyArg_ParseTuple should continue to be pratical even if e.g. os.SEEK_SET and friends become named constants. It implies that the various format codes such as "i", "l", etc. are still usable with those constants. 
Hence: > > (this really means subclassing int, if we don't want to complicate > > C-level code) > > Right. :-) Regards Antoine. From rrr at ronadam.com Tue Nov 23 22:03:21 2010 From: rrr at ronadam.com (Ron Adam) Date: Tue, 23 Nov 2010 15:03:21 -0600 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <1290535676.3642.89.camel@localhost.localdomain> References: <20101121034404.52924F20A@mail.python.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <1290535676.3642.89.camel@localhost.localdomain> Message-ID: On 11/23/2010 12:07 PM, Antoine Pitrou wrote: > Le mardi 23 novembre 2010 ? 12:50 -0500, Isaac Morland a ?crit : >> Each enumeration is a type (well, OK, not in every language, presumably, >> but certainly in many languages). The word "basic" is more important than >> "types" in my sentence - the point is that an enumeration capability is a >> very common one in a type system, and is very general, not specific to any >> particular application. > > Python already has an enumeration capability. It's called range(). > There's nothing else that C enums have. AFAICT, neither do enums in > other mainstream languages (assuming they even exist; I don't remember > Perl, PHP or Javascript having anything like that, but perhaps I'm > mistaken). Aren't we forgetting enumerate? 
    >>> colors = 'BLACK BROWN RED ORANGE YELLOW GREEN BLUE VIOLET GREY WHITE'

    >>> dict(e for e in enumerate(colors.split()))
    {0: 'BLACK', 1: 'BROWN', 2: 'RED', 3: 'ORANGE', 4: 'YELLOW', 5: 'GREEN', 6: 'BLUE', 7: 'VIOLET', 8: 'GREY', 9: 'WHITE'}

    >>> dict((f, n) for (n, f) in enumerate(colors.split()))
    {'BLUE': 6, 'BROWN': 1, 'GREY': 8, 'YELLOW': 4, 'GREEN': 5, 'VIOLET': 7, 'ORANGE': 3, 'BLACK': 0, 'WHITE': 9, 'RED': 2}

Most other languages that use numbered constants number them by base n^2.

    >>> [x**2 for x in range(10)]
    [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Binary flags have the advantage of saving memory because you can assign more than one to a single integer. Another advantage is that other languages use them, so it can make it easier to interface with them. There may also be some performance advantages, since you can test for multiple flags with a single comparison.

Sets of strings can also work when you don't need to associate a numeric value with the constant, i.e. the constant is the value. In this case the set supplies the API.

Cheers,
   Ron

From glyph at twistedmatrix.com Tue Nov 23 22:06:41 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Tue, 23 Nov 2010 16:06:41 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <20101123153743.3D9451B8ED4@shell-too.nominum.com>
References: <20101123153743.3D9451B8ED4@shell-too.nominum.com>
Message-ID:

On Nov 23, 2010, at 10:37 AM, Ben.Cottrell at nominum.com wrote:

> I'd prefer not to think of the number of times I've made the following mistake:
>
> s = socket.socket(socket.SOCK_DGRAM, socket.AF_INET)

If it's any consolation, it's fewer than the number of times I have :).

(More fun, actually, is where you pass a file descriptor to the wrong argument of 'fromfd'...)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From steve at pearwood.info Tue Nov 23 22:06:45 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 24 Nov 2010 08:06:45 +1100 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <1290526253.3642.9.camel@localhost.localdomain> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> Message-ID: <4CEC2CE5.8000302@pearwood.info> Antoine Pitrou wrote: > Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', > values=range(1, 3)) > > Again, auto-enumeration is useless since it's trivial to achieve > explicitly. That doesn't make auto-enumeration "useless". Unnecessary, perhaps, but not useless. But even then it's only unnecessary if the number of constants are small enough that you can see how many there are without counting (essentially, 4 or fewer). 
When you have more, it becomes error-prone and a nuisance to have to count them by hand:

    Constants = make_constants(
        'Constants',
        'ST_MODE ST_INO ST_DEV ST_NLINK ST_UID ST_GID '
        'ST_SIZE ST_ATIME ST_MTIME ST_CTIME',
        values=range(10)
        )

-- 
Steven

From glyph at twistedmatrix.com Tue Nov 23 22:10:00 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Tue, 23 Nov 2010 16:10:00 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1290524466.3642.4.camel@localhost.localdomain>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain>
Message-ID: <935CA187-6799-437E-8F18-2A35886B5117@twistedmatrix.com>

On Nov 23, 2010, at 10:01 AM, Antoine Pitrou wrote:

> Well, it is easy to assign range(N) to a tuple of names when desired. I
> don't think an automatically-enumerating constant generator is needed.

I don't think that numerical enumerations are the only kind of constants we're talking about. Others have already mentioned strings. Also, see for some other use-cases. Since this isn't coming to 2.x, we're probably going to do our own thing anyway (unless it turns out that flufl.enum is so great that we want to add another dependency...) but I'm hoping that the outcome of this discussion will point to something we can be compatible with.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From solipsis at pitrou.net Tue Nov 23 22:15:20 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 23 Nov 2010 22:15:20 +0100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <935CA187-6799-437E-8F18-2A35886B5117@twistedmatrix.com>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <935CA187-6799-437E-8F18-2A35886B5117@twistedmatrix.com>
Message-ID: <1290546920.3642.104.camel@localhost.localdomain>

On Tuesday 23 November 2010 at 16:10 -0500, Glyph Lefkowitz wrote:
>
> On Nov 23, 2010, at 10:01 AM, Antoine Pitrou wrote:
>
> > Well, it is easy to assign range(N) to a tuple of names when
> > desired. I don't think an automatically-enumerating constant
> > generator is needed.
>
> I don't think that numerical enumerations are the only kind of
> constants we're talking about. Others have already mentioned strings.
> Also, see for some other use-cases. Since this
> isn't coming to 2.x, we're probably going to do our own thing anyway
> (unless it turns out that flufl.enum is so great that we want to add
> another dependency...) but I'm hoping that the outcome of this
> discussion will point to something we can be compatible with.

I think that asking for too many features would get in the way, and also make the API quite un-Pythonic. If you want your values to be e.g. OR'able, just choose your values wisely ;)

Regards

Antoine.
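
One way to read "choose your values wisely" for OR'able constants is to give each name its own bit. The sketch below assumes made-up flag names and a hypothetical helper decode; it illustrates the idea and is not a proposed API.

```python
# Hypothetical flag names; each one gets a distinct power of two.
names = "READ WRITE EXECUTE".split()
flags = {name: 1 << position for position, name in enumerate(names)}

# Distinct bits can be combined with | and tested with &.
mode = flags["READ"] | flags["WRITE"]


def decode(value):
    """List the flag names whose bit is set in `value`."""
    return [name for name in names if value & flags[name]]


active = decode(mode)
```

The combined value is still an ordinary integer, so it loses its names unless something like decode is applied when displaying it.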
From rrr at ronadam.com Tue Nov 23 22:21:17 2010 From: rrr at ronadam.com (Ron Adam) Date: Tue, 23 Nov 2010 15:21:17 -0600 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <1290535676.3642.89.camel@localhost.localdomain> Message-ID: Oops.. x**2 should have been 2**x below. On 11/23/2010 03:03 PM, Ron Adam wrote: > > > On 11/23/2010 12:07 PM, Antoine Pitrou wrote: >> Le mardi 23 novembre 2010 ? 12:50 -0500, Isaac Morland a ?crit : >>> Each enumeration is a type (well, OK, not in every language, presumably, >>> but certainly in many languages). The word "basic" is more important than >>> "types" in my sentence - the point is that an enumeration capability is a >>> very common one in a type system, and is very general, not specific to any >>> particular application. >> >> Python already has an enumeration capability. It's called range(). >> There's nothing else that C enums have. AFAICT, neither do enums in >> other mainstream languages (assuming they even exist; I don't remember >> Perl, PHP or Javascript having anything like that, but perhaps I'm >> mistaken). > > > Aren't we forgetting enumerate? 
> > >>> colors = 'BLACK BROWN RED ORANGE YELLOW GREEN BLUE VIOLET GREY WHITE' > > >>> dict(e for e in enumerate(colors.split())) > {0: 'BLACK', 1: 'BROWN', 2: 'RED', 3: 'ORANGE', 4: 'YELLOW', 5: 'GREEN', 6: > 'BLUE', 7: 'VIOLET', 8: 'GREY', 9: 'WHITE'} > > >>> dict((f, n) for (n, f) in enumerate(colors.split())) > {'BLUE': 6, 'BROWN': 1, 'GREY': 8, 'YELLOW': 4, 'GREEN': 5, 'VIOLET': 7, > 'ORANGE': 3, 'BLACK': 0, 'WHITE': 9, 'RED': 2} > > > Most other languages that use numbered constants number them by base n^2. > > >>> [x**2 for x in range(10)] > [0, 1, 4, 9, 16, 25, 36, 49, 64, 81] >>> [2**x for x in range(10)] [1, 2, 4, 8, 16, 32, 64, 128, 256, 512] > Binary flags have the advantage of saving memory because you can assign > more than one to a single integer. Another advantage is other languages use > them so it can make it easier interface with them. There also may be some > performance advantages as well since you can test for multiple flags with a > single comparison. > > Sets of strings can also work when you don't need to associate a numeric > value to the constant. ie... the constant is the value. In this case the > set supplies the api. 
> > Cheers, > Ron > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/python-python-dev%40m.gmane.org > From steve at pearwood.info Tue Nov 23 22:30:37 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 24 Nov 2010 08:30:37 +1100 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <1290535676.3642.89.camel@localhost.localdomain> References: <20101121034404.52924F20A@mail.python.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <1290535676.3642.89.camel@localhost.localdomain> Message-ID: <4CEC327D.1050503@pearwood.info> Antoine Pitrou wrote: > Python already has an enumeration capability. It's called range(). > There's nothing else that C enums have. AFAICT, neither do enums in > other mainstream languages (assuming they even exist; I don't remember > Perl, PHP or Javascript having anything like that, but perhaps I'm > mistaken). In Pascal, enumerations are a type, and the value of the named values are an implementation detail. E.g. one would define an enumerated type: type flavour = (sweet, salty, sour, bitter, umame); var x: flavour; and then you would write something like: x := sour; Notice that the constants sweet etc. 
aren't explicitly predefined, since they're purely internal details and the compiler is allowed to number them any way it likes. In Python, we would need stronger guarantees about the values chosen, so that they could be exposed to external modules, pickled, etc. But that doesn't mean we should be forced to specify the values ourselves. -- Steven From greg.ewing at canterbury.ac.nz Tue Nov 23 22:26:58 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 24 Nov 2010 10:26:58 +1300 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <20101123154229.474f7a90@pitrou.net> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> Message-ID: <4CEC31A2.5080809@canterbury.ac.nz> Antoine Pitrou wrote: > I don't understand why people insist on calling that an "enum". enum is > a C legacy and it doesn't bring anything useful as I can tell. The usefulness is that they can have a str() or repr() that displays the name of the value instead of an integer. The bool type was added for much the same reason -- otherwise we would simply have gotten builtin names False = 0 and True = 1. 
-- Greg From greg.ewing at canterbury.ac.nz Tue Nov 23 22:27:02 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 24 Nov 2010 10:27:02 +1300 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <1290524519.3642.5.camel@localhost.localdomain> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <4CEBD624.9000402@voidspace.org.uk> <1290524519.3642.5.camel@localhost.localdomain> Message-ID: <4CEC31A6.5090505@canterbury.ac.nz> Antoine Pitrou wrote: > Well, it's been inherited by C-like languages, no doubt. Like braces and > semicolumns :) The idea isn't confined to the C family. Pascal and many of the languages inspired by it also have enumerated types. -- Greg From tjreedy at udel.edu Tue Nov 23 23:44:07 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 23 Nov 2010 17:44:07 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 11/23/2010 2:11 PM, Alexander Belopolsky wrote: > This discussion motivated me to start looking into how well Python > library itself is prepared to deal with len(chr(i)) = 2. I was not Good idea! 
> surprised to find that textwrap does not handle the issue that well:
>
> >>> len(wrap(' \U00010140' * 80, 20))
> 12
> >>> len(wrap(' \U00000140' * 80, 20))
> 8

How well does textwrap handle composable pairs (letter + accent)? Does it count two codepoints as one char space, and avoid putting line breaks between them? I suspect textwrap should be regarded as (extended?)_ascii_textwrap.

>
> That module should probably be rewritten to properly implement the
> Unicode line breaking algorithm.

Probably a good idea

> Yet finding a bug in a str object method after a 5 min review was a
> bit discouraging:
>
> >>> 'xyz'.center(20, '\U00010140')
> Traceback (most recent call last):
>   File "", line 1, in
> TypeError: The fill character must be exactly one character long

Again, what does it do with letter + decorator combinations? It seems to me that the whole notion that one code point == one printed character space is broken once one leaves ASCII. Perhaps we need an is_uchar function to recognize multi-code sequences, including surrogate pairs, that represent one char for the purpose of character-oriented functions.

> Given the apparent difficulty of writing even basic text processing
> algorithms in presence of surrogate pairs, I wonder how wise it is to
> expose Python users to them. As Wikipedia explains, [1]
>
> """
> Because the most commonly used characters are all in the Basic
> Multilingual Plane, converting between surrogate pairs and the
> original values is often not tested thoroughly. This leads to
> persistent bugs, and potential security holes, even in popular and
> well-reviewed application software.
> """

So we did not test thoroughly enough and need to add appropriate unit tests as bugs are fixed.
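
A rough sketch of the letter + accent question, assuming a hypothetical helper named approx_char_count. It only skips combining marks, using unicodedata.combining; it does not implement full Unicode grapheme-cluster segmentation, and it does nothing about surrogate pairs on a narrow build.

```python
import unicodedata


def approx_char_count(text):
    """Count code points that are not combining marks (an approximation)."""
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT
code_points = len(decomposed)
perceived = approx_char_count(decomposed)
```

Here len() reports two code points for what a reader sees as a single accented letter, which is the mismatch that width-based functions such as wrap() and center() would have to account for.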
-- 
Terry Jan Reedy

From tjreedy at udel.edu Wed Nov 24 00:07:03 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 23 Nov 2010 18:07:03 -0500
Subject: [Python-Dev] [Python-checkins] r86720 - python/branches/py3k/Misc/ACKS
In-Reply-To: <4CEC43A4.80907@netwok.org>
References: <20101123203252.39BE7EE9CF@mail.python.org> <4CEC43A4.80907@netwok.org>
Message-ID: <4CEC4917.2070508@udel.edu>

On 11/23/2010 5:43 PM, Éric Araujo wrote:
>> Modified: python/branches/py3k/Misc/ACKS
>> ==============================================================================
>> --- python/branches/py3k/Misc/ACKS (original)
>> +++ python/branches/py3k/Misc/ACKS Tue Nov 23 21:32:47 2010
>> @@ -1,4 +1,4 @@
>> -Acknowledgements
>> +?Acknowledgements
>
> This change introduced a so-called UTF-8 BOM in the file. Is
> TortoiseSvn the culprit or a text editor?

I used Notepad to edit the file, TortoiseSvn to commit, the same as I did for #9222, rev86702, Lib\idlelib\IOBinding.py, yesterday. If the latter is OK, perhaps *.py gets filtered better than misc. text files.

I believe I have the config as specified in dev/faq.
[miscellany]
enable-auto-props = yes

[auto-props]
* = svn:eol-style=native
*.c = svn:keywords=Id
*.h = svn:keywords=Id
*.py = svn:keywords=Id
*.txt = svn:keywords=Author Date Id Revision

Terry

From ijmorlan at uwaterloo.ca Wed Nov 24 00:15:03 2010
From: ijmorlan at uwaterloo.ca (Isaac Morland)
Date: Tue, 23 Nov 2010 18:15:03 -0500 (EST)
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <58396.1290540417@parc.com>
References: <20101121034404.52924F20A@mail.python.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <58396.1290540417@parc.com>
Message-ID:

On Tue, 23 Nov 2010, Bill Janssen wrote:

> The main purpose of that is to be able to catch type mismatches with
> static typing, though. Seems kind of pointless for Python.

The concept can work dynamically. In fact, the flufl.enum package which has been discussed here makes each enumeration into a separate class, so many of the advantages of catching type mismatches are obtained.

>> Hey, how about this syntax:
>>
>> enum Colors:
>>     red = 0
>>     green = 10
>>     blue
>
> Why not
>
> class Color:
>     red = (255, 0, 0)
>     green = (0, 255, 0)
>     blue = (0, 0, 255)
>
> Seems to handle the situation OK.

Yes, this looks almost exactly like flufl.enum syntax.

In any case my suggestion of a new keyword was not meant to be taken seriously.
If I ever think I have a good reason to suggest a new keyword I'll sleep on it, take a vacation, and then if I still think a new keyword is justified I will specifically disclaim any possibility of the suggestion being a joke. Isaac Morland CSCF Web Guru DC 2554C, x36650 WWW Software Specialist From db3l.net at gmail.com Wed Nov 24 00:18:33 2010 From: db3l.net at gmail.com (David Bolen) Date: Tue, 23 Nov 2010 18:18:33 -0500 Subject: [Python-Dev] Stable buildbots References: <20101113133712.60e9be27@pitrou.net> <4CEB7E12.1070201@snakebite.org> Message-ID: Trent Nelson writes: > That's interesting. (That kill_python.exe doesn't kill the wedged > processes, but pskill does.) kill_python is pretty simple, it just > calls TerminateProcess() after acquiring a handle with the relevant > PROCESS_TERMINATE access right. (...) > > Are you calling pskill with the -t flag? i.e. kill process and all > dependents? That might be the ticket, especially if killing the child > process that wedged select() is waiting on causes it to return, and > thus, makes it killable. Nope, just "pskill python_d". Haven't bothered to check the pskill source but I'm assuming it's just a basic TerminateProcess. Ideally my quickest workaround would just be to replace the kill_python in the buildbot tools script with that command but of course they could get updated on checkouts and I'm not arguing it's generally appropriate enough to belong in the source. I suspect the problem may be on the "identify which process to kill" rather than the "kill it" part, but it's definitely going to take time to figure that out for sure. While the approach kill_python takes is much more appropriate, since we don't currently have multiple builds running simultaneously (and for me the machines are dedicated as build slaves, so I won't be having my own python_d), a more blanket kill operation is safe enough. 
> Otherwise, if it happens again, can you try kill_python.exe first, > then pskill, and confirm if the former fails but the latter succeeds? Yeah, I've got a temporary tree with a built-binary around, but still have to make sure of the right way to run it manually in a way that it will do the identification right (which I think also means I need to figure out from which build tree the hung process started). Up until now, typically when I've found a hung setup, the rest of the build tree which originally applied to that process has been cleaned. I definitely sympathize with Martin's position though - it wasn't the simplest tool to write (and I still have some email from him about the week+ it took just to test the process identification part remotely through buildbots at the time), so I regret not jumping right in to try to fix it. But it's just way more effort than typing "pskill python_d", at least with my current availability. -- David From greg.ewing at canterbury.ac.nz Wed Nov 24 00:32:39 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 24 Nov 2010 12:32:39 +1300 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <1290546920.3642.104.camel@localhost.localdomain> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <935CA187-6799-437E-8F18-2A35886B5117@twistedmatrix.com> <1290546920.3642.104.camel@localhost.localdomain> Message-ID: <4CEC4F17.7030600@canterbury.ac.nz> Antoine Pitrou wrote: > I think that asking for too many features would get in the way, and also > make the API quite un-Pythonic. If you want your values to be e.g. 
> OR'able, just choose your values wisely ;) On the other hand it could be useful to have an easy way to request power-of-2 value assignment, seeing as it's another common pattern. -- Greg From greg.ewing at canterbury.ac.nz Wed Nov 24 00:32:56 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 24 Nov 2010 12:32:56 +1300 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <58396.1290540417@parc.com> References: <20101121034404.52924F20A@mail.python.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <58396.1290540417@parc.com> Message-ID: <4CEC4F28.7010904@canterbury.ac.nz> Bill Janssen wrote: > The main purpose of that is to be able to catch type mismatches with > static typing, though. Seems kind of pointless for Python. But catching type mismatches with dynamic typing doesn't seem pointless for Python. There's nothing static about the proposals being made here that I can see. > Why not > > class Color: > red = (255, 0, 0) > green = (0, 255, 0) > blue = (0, 0, 255) If all you want is a bunch of named constants, that's fine. But the facilities being discussed here are designed to give you other things as well, such as c = Color.red print(c) printing "red" rather than "(255, 0, 0)". 
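A minimal sketch of the behaviour described above, assuming a hypothetical NamedConstant helper (the name and design are illustrative only, not any of the APIs proposed in this thread):

```python
class NamedConstant:
    """Hypothetical helper: a value that remembers the name it was given."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __repr__(self):
        # Show the symbolic name rather than the underlying value.
        return self.name

    __str__ = __repr__


class Color:
    red = NamedConstant("red", (255, 0, 0))
    green = NamedConstant("green", (0, 255, 0))
    blue = NamedConstant("blue", (0, 0, 255))


c = Color.red
print(c)        # shows the name of the constant
print(c.value)  # the underlying tuple is still available
```

The plain class of tuples loses the name as soon as the value is passed around; wrapping each value keeps the name attached to it, at the cost of writing the name twice.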
-- Greg From greg.ewing at canterbury.ac.nz Wed Nov 24 00:33:02 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 24 Nov 2010 12:33:02 +1300 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <1290526253.3642.9.camel@localhost.localdomain> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> Message-ID: <4CEC4F2E.6080601@canterbury.ac.nz> Antoine Pitrou wrote: > Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST', > values=range(1, 3)) > > Again, auto-enumeration is useless since it's trivial to achieve > explicitly. But seeing as it's going to be a common thing to do, why not make it the default? When defining an enum, often you don't *care* what the underlying values are, so assigning sequential natural numbers is as good a default as any. In fact, with the Pascal concept of an enumerated type you don't get any choice in the matter. It's only in the C family that you get this bastardised conflation of enumerations with arbitrary named constants... 
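As a rough sketch of what "sequential by default" could look like: the make_constants name and call shape are borrowed from the quoted example, but the body below is purely an assumption for illustration, since no implementation was posted in this thread.

```python
from itertools import count


def make_constants(typename, names, values=None):
    """Hypothetical factory: build a class holding the named constants.

    If no values are given, assign 0, 1, 2, ... in definition order.
    """
    name_list = names.split()
    if values is None:
        values = count()  # sequential natural numbers by default
    namespace = dict(zip(name_list, values))
    return type(typename, (), namespace)


# Default: the caller does not care about the underlying values.
Constants = make_constants('Constants', 'SOME_CONST OTHER_CONST')

# Explicit values remain possible, as in the quoted call.
Explicit = make_constants('Explicit', 'SOME_CONST OTHER_CONST',
                          values=range(1, 3))
```

With this shape, the explicit form stays available for wrapping existing numeric codes, while the common case needs no values at all.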
-- Greg From greg.ewing at canterbury.ac.nz Wed Nov 24 00:41:50 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 24 Nov 2010 12:41:50 +1300 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <58396.1290540417@parc.com> Message-ID: <4CEC513E.4050603@canterbury.ac.nz> Isaac Morland wrote: > In any case my > suggestion of a new keyword was not meant to be taken seriously. I don't think it need be taken entirely as a joke, either. All the proposed patterns for creating enums that I've seen end up leaving something to be desired. They violate DRY by requiring you to write the class name twice, or they make you write the names of the values in quotes, or some other minor ugliness. While it may be possible to work around these things with sufficient levels of metaclass hackery and black magic, at some point one has to consider whether new syntax might be the least worst option. -- Greg From greg.ewing at canterbury.ac.nz Wed Nov 24 00:49:42 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 24 Nov 2010 12:49:42 +1300 Subject: [Python-Dev] len(chr(i)) = 2? 
In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CEC5316.4010608@canterbury.ac.nz> Alexander Belopolsky wrote: > """ > Because the most commonly used characters are all in the Basic > Multilingual Plane, converting between surrogate pairs and the > original values is often not tested thoroughly. This leads to > persistent bugs, and potential security holes, even in popular and > well-reviewed application software. > """ Maybe Python should have used UTF-8 as its internal unicode representation. Then people who were foolish enough to assume one character per string item would have their programs break rather soon under only light unicode testing. :-) -- Greg From foom at fuhm.net Wed Nov 24 01:22:23 2010 From: foom at fuhm.net (James Y Knight) Date: Tue, 23 Nov 2010 19:22:23 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <4CEC5316.4010608@canterbury.ac.nz> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> Message-ID: <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> On Nov 23, 2010, at 6:49 PM, Greg Ewing wrote: > Maybe Python should have used UTF-8 as its internal unicode > representation. 
Then people who were foolish enough to assume > one character per string item would have their programs break > rather soon under only light unicode testing. :-) You put a smiley, but, in all seriousness, I think that's actually the right thing to do if anyone writes a new programming language. It is clearly the right thing if you don't have to be concerned with backwards-compatibility: nobody really needs to be able to access the Nth codepoint in a string in constant time, so there's not really any point in storing a vector of codepoints. Instead, provide bidirectional iterators which can traverse the string by byte, codepoint, or by grapheme (that is: the set of combining characters + base character that go together, making up one thing which a human would think of as a character). James From jcea at jcea.es Wed Nov 24 01:31:01 2010 From: jcea at jcea.es (Jesus Cea) Date: Wed, 24 Nov 2010 01:31:01 +0100 Subject: [Python-Dev] Sporadic problems with bugs.python.org In-Reply-To: <4CEC24FE.70107@jcea.es> References: <4CEC24FE.70107@jcea.es> Message-ID: <4CEC5CC5.5070305@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 23/11/10 21:33, Jesus Cea wrote: > Happen to me last Sunday, and happening just now. > > I can access http://bugs.python.org/ just fine, but trying to post a > message, open a new bug, change nosy, etc., takes a LONG time (minutes) > and it is finally failing with a "400 Bad Request" error: > > """ > Bad Request > > Your browser sent a request that this server could not understand. > Apache/2.2.9 (Debian) mod_python/3.3.1 Python/2.5.2 mod_ssl/2.2.9 > OpenSSL/0.9.8g mod_wsgi/2.5 Server at bugs.python.org Port 80 > """ > > Last sunday I was able to open the bug after a time. Today I have been > retrying for while, with no luck yet. Still retrying, with no luck. Anybody else can reproduce?. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . 
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOxcxZlgi5GaxT1NAQJGEQQApyTPFFyPbzc45v5AfeLwT0YHvIcFyT5a lZVZIJ+TVeI1PY/bZpebO4YnjQ6JrHIIedXf8IUqBi9sD8UUDY5tST8TikZPwvvk pGvdCRwa2A6slGG5zgnA4u4+H2MiOiRhua0sTELNQJYAgzTNER+LDTWQ04p31kOD D++Hjb2mBs8= =TI1J -----END PGP SIGNATURE----- From fuzzyman at voidspace.org.uk Wed Nov 24 01:41:37 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 24 Nov 2010 00:41:37 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <1290546920.3642.104.camel@localhost.localdomain> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <935CA187-6799-437E-8F18-2A35886B5117@twistedmatrix.com> <1290546920.3642.104.camel@localhost.localdomain> Message-ID: <4CEC5F41.8060806@voidspace.org.uk> On 23/11/2010 21:15, Antoine Pitrou wrote: > On Tuesday 23 November 2010 at 16:10 -0500, Glyph Lefkowitz wrote: >> On Nov 23, 2010, at 10:01 AM, Antoine Pitrou wrote: >> >>> Well, it is easy to assign range(N) to a tuple of names when >>> desired. I >>> don't think an automatically-enumerating constant generator is >>> needed. >> I don't think that numerical enumerations are the only kind of >> constants we're talking about. Others have already mentioned strings. >> Also, see for some other use-cases.
Since this >> isn't coming to 2.x, we're probably going to do our own thing anyway >> (unless it turns out that flufl.enum is so great that we want to add >> another dependency...) but I'm hoping that the outcome of this >> discussion will point to something we can be compatible with. > I think that asking for too many features would get in the way, and also > make the API quite un-Pythonic. If you want your values to be e.g. > OR'able, just choose your values wisely ;) > Well, the point of an OR'able flag is that the result shows the OR'd values in the repr. Raymond suggests using a set of strings where you need flag constants. For new apis (so no backwards compatibility constraints) where you don't need to use integers (i.e. not wrapping a C library) that's a great suggestion: flags = {'FOO', 'BAR'} Michael > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From lukasz at langa.pl Wed Nov 24 01:50:23 2010 From: lukasz at langa.pl (Łukasz Langa) Date: Wed, 24 Nov 2010 01:50:23 +0100 Subject: [Python-Dev] Centos 5.5 freeze during test_concurrent_futures Message-ID: Hi there!
py3k built from trunk on Centos 5.5 freezes during regrtest on test_concurrent_futures with "Fatal Python error: Invalid thread state for this thread". As in a typical concurrent problem, subsequent calls freeze in different test cases, but the freeze itself is always reproducible and always during this test. A colorful example: http://bpaste.net/show/11493/ I created an issue for that here: http://bugs.python.org/issue10517 If necessary, I can provide Centos 5.5 shell access. I would also like to donate a Centos 5.5 buildbot. -- Best regards, Łukasz Langa tel. +48 791 080 144 WWW http://lukasz.langa.pl/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcea at jcea.es Wed Nov 24 02:32:05 2010 From: jcea at jcea.es (Jesus Cea) Date: Wed, 24 Nov 2010 02:32:05 +0100 Subject: [Python-Dev] Sporadic problems with bugs.python.org In-Reply-To: <4CEC5CC5.5070305@jcea.es> References: <4CEC24FE.70107@jcea.es> <4CEC5CC5.5070305@jcea.es> Message-ID: <4CEC6B15.6060606@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 24/11/10 01:31, Jesus Cea wrote: > Still retrying, with no luck. > > Anybody else can reproduce?. One of my tracker changes was just processed. The important one still retrying every 5 minutes... I hope I can go sleep before dawn :-P. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ .
_/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "El amor es poner tu felicidad en la felicidad de otro" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOxrFZlgi5GaxT1NAQLHUQP+IyN3X/vt5AQKpg/fTjSUpfX2f3wTzeOp 8+5Gnb2ktyZQEF0ELBo0wiWNReJcxicw3ZD9Zqy05cprJ8VL7QZSRHkom+BiXrKK P+Rllulp8Eu+wq59NKJb5DGk8tfDt6zywepUAHB449Dkcyq9p8gt8L5LAiABTfsy dFaQPP2w1Kg= =ERTw -----END PGP SIGNATURE----- From tjreedy at udel.edu Wed Nov 24 02:51:20 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 23 Nov 2010 20:51:20 -0500 Subject: [Python-Dev] Sporadic problems with bugs.python.org In-Reply-To: <4CEC6B15.6060606@jcea.es> References: <4CEC24FE.70107@jcea.es> <4CEC5CC5.5070305@jcea.es> <4CEC6B15.6060606@jcea.es> Message-ID: On 11/23/2010 8:32 PM, Jesus Cea wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 24/11/10 01:31, Jesus Cea wrote: >> Still retrying, with no luck. >> >> Anybody else can reproduce?. > > One of my tracker changes was just processed. > > The important one still retrying every 5 minutes... > > I hope I can go sleep before dawn :-P. I added a comment to one issue and opened another with no problem during the last couple of hours. -- Terry Jan Reedy From glyph at twistedmatrix.com Wed Nov 24 02:52:13 2010 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Tue, 23 Nov 2010 20:52:13 -0500 Subject: [Python-Dev] len(chr(i)) = 2? 
In-Reply-To: <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> Message-ID: On Nov 23, 2010, at 7:22 PM, James Y Knight wrote: > On Nov 23, 2010, at 6:49 PM, Greg Ewing wrote: >> Maybe Python should have used UTF-8 as its internal unicode >> representation. Then people who were foolish enough to assume >> one character per string item would have their programs break >> rather soon under only light unicode testing. :-) > > You put a smiley, but, in all seriousness, I think that's actually the right thing to do if anyone writes a new programming language. It is clearly the right thing if you don't have to be concerned with backwards-compatibility: nobody really needs to be able to access the Nth codepoint in a string in constant time, so there's not really any point in storing a vector of codepoints. > > Instead, provide bidirectional iterators which can traverse the string by byte, codepoint, or by grapheme (that is: the set of combining characters + base character that go together, making up one thing which a human would think of as a character). I really hope that this idea is not just for new programming languages. If you switch from doing unicode "wrong" to doing unicode "right" in Python, you quadruple the memory footprint of programs which primarily store and manipulate large amounts of text. 
This is especially ridiculous in PyGTK applications, where the internal representation required by the GUI is UTF-8 anyway, so the round-tripping of string data back and forth to the exploded UTF-32 representation is wasting gobs of memory and time. It at least makes sense when your C library's idea about character width and your Python build match up. But, in a desktop app this is unlikely to be a performance concern; in servers, it's a big deal; measurably so. I am pretty sure that in the server apps that I work on, we are eventually going to need our own string type and UTF-8 logic that does exactly what James suggested - certainly if we ever hope to support Py3. (I dimly recall that both James and I have made this point before, but it's pretty important, so it bears repeating.) From glyph at twistedmatrix.com Wed Nov 24 02:56:57 2010 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Tue, 23 Nov 2010 20:56:57 -0500 Subject: [Python-Dev] OpenSSL Voluntarily (openssl-1.0.0a) In-Reply-To: <20101123150219.29e20374@pitrou.net> References: <4CEB3F72.7000006@m2.ccsnet.ne.jp> <20101123150219.29e20374@pitrou.net> Message-ID: <720EFE43-119F-4F2F-BCB1-939275B5FA6E@twistedmatrix.com> On Nov 23, 2010, at 9:02 AM, Antoine Pitrou wrote: > On Tue, 23 Nov 2010 00:07:09 -0500 > Glyph Lefkowitz wrote: >> On Mon, Nov 22, 2010 at 11:13 PM, Hirokazu Yamamoto < >> ocean-city at m2.ccsnet.ne.jp> wrote: >> >>> Hello. Does this affect python? Thank you. >>> >>> http://www.openssl.org/news/secadv_20101116.txt >>> >> >> No. > > Well, actually it does, but Python links against the system OpenSSL on > most platforms (except Windows), so it's up to the OS vendor to apply > the patch. It does? If so, I must have misunderstood the vulnerability. Can you explain how it affects Python? From stephen at xemacs.org Wed Nov 24 03:29:47 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 24 Nov 2010 11:29:47 +0900 Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87tyj7bgis.fsf@uwakimon.sk.tsukuba.ac.jp> Alexander Belopolsky writes: > Yet finding a bug in a str object method after a 5 min review was a > bit discouraging: > > >>> 'xyz'.center(20, '\U00010140') > Traceback (most recent call last): > File "", line 1, in > TypeError: The fill character must be exactly one character long > > Given the apparent difficulty of writing even basic text processing > algorithms in presence of surrogate pairs, I wonder how wise it is to > expose Python users to them. "Consenting adults" applies here. What to do? Write tests, fix the stdlib. Raise the probability of surrogate pair tests in the fuzzer. But "expose the users to surrogate pairs in an efficient (ie, UCS-2) implementation" is a fundamental design principle of Python. Tightening up the internal implementation is -10 unacceptable IMO YMMV. > Again, given that the str object itself has at least one non-BMP > character bug as we are closing on the third major release of py3k, > how likely are 3rd party developers to get their libraries right as > they port to 3.x? Not our problem, really. We need to fix the stdlib, but 3rd party libraries know what they're doing. I guess we could provide a fuzztest module that generates known nasty data (zero, very big numbers, "\0x00", "\U00010140", etc) that people would be able to plug in as a data source for their own code. Of course that doesn't replace conventional unittests based on analysis of edge cases and tests designed to tickle them, but it would be a start for many projects. 
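A small sketch of the kind of "known nasty data" source meant here. The names utf16_code_units and NASTY_STRINGS are made up for illustration; no such fuzztest module exists. Counting UTF-16 code units explicitly keeps the example independent of how a given Python build stores strings internally.

```python
def utf16_code_units(s):
    """Return how many UTF-16 code units are needed to represent s."""
    # Each UTF-16 code unit is two bytes; a non-BMP character needs two units
    # (a surrogate pair).
    return len(s.encode('utf-16-le')) // 2


# Hypothetical list of awkward inputs a library could run its tests against.
NASTY_STRINGS = [
    '',            # empty string
    '\x00',        # embedded NUL
    'a',           # plain ASCII baseline
    '\U00010140',  # non-BMP character: a surrogate pair in UTF-16
    'e\u0301',     # base character plus combining accent
]

for s in NASTY_STRINGS:
    # ascii() avoids depending on the terminal's output encoding.
    print(ascii(s), utf16_code_units(s))
```

Code that assumes "one string item per character" is most likely to go wrong on the last two entries, so they are the ones worth feeding into functions such as padding, slicing and width calculations.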
From raymond.hettinger at gmail.com Wed Nov 24 03:35:35 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 23 Nov 2010 18:35:35 -0800 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CEC513E.4050603@canterbury.ac.nz> References: <20101121034404.52924F20A@mail.python.org> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <58396.1290540417@parc.com> <4CEC513E.4050603@canterbury.ac.nz> Message-ID: <6A9ADF09-971A-4CD7-B583-3BF264E47CF2@gmail.com> On Nov 23, 2010, at 3:41 PM, Greg Ewing wrote: > While it may be possible to work around these things with > sufficient levels of metaclass hackery and black magic, at > some point one has to consider whether new syntax might > be the least worst option. The least worst option is to do nothing at all. That's better than creating a new little monster with its own nuances and limitations. We've gotten by well for almost two decades without this particular static language feature creeping into Python. For the most part, strings work well enough (see decimal.ROUND_UP for example). They are self-documenting and work well with the rest of the language. When a cluster of names cries out for its own namespace, the usual technique is to put the names in class (see the examples in the namedtuple docs for a way to make this a one-liner) or in a module (see opcode.py for example). 
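For reference, the namedtuple one-liner alluded to above looks roughly like this (following the enumerated-constants example in the collections documentation; the Status names are just sample data):

```python
from collections import namedtuple

# One line: a namespace of sequentially numbered constants.
Status = namedtuple('Status', 'open pending closed')._make(range(3))

print(Status.open, Status.pending, Status.closed)
print(Status._fields)
```

The result is an ordinary tuple instance, so the values are plain integers and carry no name of their own once they leave the namespace.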
For xor'able and or'able flags, sets of strings work well: flags = {'runnable', 'callable'} flags |= {'runnable', 'kissable'} if 'callable' in flags: . . . We have a hard enough time getting people to not program Java in Python. IMO, adding a new enumeration type would make this situation worse. Also, it adds weight to the language -- Python is not in need of yet another fundamental construct. Raymond P.S. I do recognize that lots of people have written their own versions of Enum(), but I think they do it either out of habits formed from statically compiled languages that lack all of our namespace mechanisms or they do it because it is easy and fun to write (just like people seem to enjoy writing flatten() recipes more than they like actually using them). One other thought: With Py3.x, the language had its one chance to get smaller. Old-style classes were tossed, some built-ins vanished, and a few obsolete modules got nuked. It would be easy to have a "let's add thingie x" fest and lose those benefits. There are many devs who find that the language does not fit-in-their-heads anymore, so considerable restraint needs to be exercised before adding a new language feature that would soon permeate everyone's code base and add yet another thing that infrequent users have to learn before being able to read code. From stephen at xemacs.org Wed Nov 24 03:44:40 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 24 Nov 2010 11:44:40 +0900 Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> Message-ID: <87sjyrbftz.fsf@uwakimon.sk.tsukuba.ac.jp> James Y Knight writes: > You put a smiley, but, in all seriousness, I think that's actually > the right thing to do if anyone writes a new programming > language. It is clearly the right thing if you don't have to be > concerned with backwards-compatibility: nobody really needs to be > able to access the Nth codepoint in a string in constant time, so > there's not really any point in storing a vector of codepoints. A sad commentary on the state of Emacs usage, "nobody". The theory is that accessing the first character of a region in a string often occurs as a primitive operation in O(N) or worse algorithms, sometimes without enough locality at the "collection of regions" level to give a reasonably small average access time. In practice, any *Emacs user can tell you that yes, we do need to be able to access the Nth codepoint in a buffer in constant time. The O(N) behavior of current Emacs implementations means that people often use a binary coding system on large files. Yes, some position caching is done, but if you have a large file (eg, a mail file) which is virtually segmented using pointers to regions, locality gets lost. (This is not a design bug, this is a fundamental requirement: consider fast switching between threaded view and author-sorted view.) And of course an operation that sorts regions in a buffer using character pointers will have the same problem. 
Working with memory pointers, OTOH, sucks more than that; GNU Emacs recently bit the bullet and got rid of their higher-level memory-oriented APIs, all of the Lisp structures now work with pointers, and only the very low-level structures know about character-to-memory pointer translation. This performance issue is perceptible even on 3GHz machines with not so large (50MB) mbox files. It's *horrid* if you do something like "occur" on a 1GB log file, then try randomly jumping to detected log entries. From fdrake at acm.org Wed Nov 24 03:58:47 2010 From: fdrake at acm.org (Fred Drake) Date: Tue, 23 Nov 2010 21:58:47 -0500 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <6A9ADF09-971A-4CD7-B583-3BF264E47CF2@gmail.com> References: <20101121034404.52924F20A@mail.python.org> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <58396.1290540417@parc.com> <4CEC513E.4050603@canterbury.ac.nz> <6A9ADF09-971A-4CD7-B583-3BF264E47CF2@gmail.com> Message-ID: On Tue, Nov 23, 2010 at 9:35 PM, Raymond Hettinger wrote: > The least worst option is to do nothing at all. For the standard library, I agree. There are enough variants that are needed/desired in different contexts, and there isn't a single clear winner. Nor is there any compelling reason to have a winner. I'm generally in favor of enums (or whatever you want to call them), and I'm in favor of importing support for the flavor you need, or just defining constants in whatever way makes sense for your library or application. 
I don't see any problems that aren't solved by that. -Fred -- Fred L. Drake, Jr. "A storm broke loose in my mind." --Albert Einstein From jcea at jcea.es Wed Nov 24 04:03:36 2010 From: jcea at jcea.es (Jesus Cea) Date: Wed, 24 Nov 2010 04:03:36 +0100 Subject: [Python-Dev] Sporadic problems with bugs.python.org In-Reply-To: References: <4CEC24FE.70107@jcea.es> <4CEC5CC5.5070305@jcea.es> <4CEC6B15.6060606@jcea.es> Message-ID: <4CEC8088.7010709@jcea.es> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 24/11/10 02:51, Terry Reedy wrote: >> I hope I can go sleep before dawn :-P. > > I added a comment to one issue and opened another with no problem during > the last couple of hours. My changes have gone through now, after about 8 hours and a retry every five minutes. - -- Jesus Cea Avion _/_/ _/_/_/ _/_/_/ jcea at jcea.es - http://www.jcea.es/ _/_/ _/_/ _/_/ _/_/ _/_/ jabber / xmpp:jcea at jabber.org _/_/ _/_/ _/_/_/_/_/ . _/_/ _/_/ _/_/ _/_/ _/_/ "Things are not so easy" _/_/ _/_/ _/_/ _/_/ _/_/ _/_/ "My name is Dump, Core Dump" _/_/_/ _/_/_/ _/_/ _/_/ "Love is placing your happiness in the happiness of another" - Leibniz -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iQCVAwUBTOyAiJlgi5GaxT1NAQLavgP/ZmlKIu+luLw7DpJAVk/p3BCF7wmciE0J KW5SmCHVsyPuKFgOY45f5PM0q7+iXiv3m59zrDNbk0yBvLnVbmGwEeeV1/kGsZ94 NrYuHqnwW6h19tbrFTmVZ5BVKBSc4pdvBhV3+0Zx9hAfkkH/heE4WKJEFd7tIzTu h9jsvAI8pR8= =sG82 -----END PGP SIGNATURE----- From glyph at twistedmatrix.com Wed Nov 24 04:27:38 2010 From: glyph at twistedmatrix.com (Glyph Lefkowitz) Date: Tue, 23 Nov 2010 22:27:38 -0500 Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <87sjyrbftz.fsf@uwakimon.sk.tsukuba.ac.jp> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> <87sjyrbftz.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <635C265A-90A8-4B92-A65C-59EF3E8EFD68@twistedmatrix.com> On Nov 23, 2010, at 9:44 PM, Stephen J. Turnbull wrote: > James Y Knight writes: > >> You put a smiley, but, in all seriousness, I think that's actually >> the right thing to do if anyone writes a new programming >> language. It is clearly the right thing if you don't have to be >> concerned with backwards-compatibility: nobody really needs to be >> able to access the Nth codepoint in a string in constant time, so >> there's not really any point in storing a vector of codepoints. > > A sad commentary on the state of Emacs usage, "nobody". > > The theory is that accessing the first character of a region in a > string often occurs as a primitive operation in O(N) or worse > algorithms, sometimes without enough locality at the "collection of > regions" level to give a reasonably small average access time. I'm not sure what you mean by "the theory is". Whose theory? About what? > In practice, any *Emacs user can tell you that yes, we do need to be > able to access the Nth codepoint in a buffer in constant time. The > O(N) behavior of current Emacs implementations means that people often > use a binary coding system on large files. Yes, some position caching > is done, but if you have a large file (eg, a mail file) which is > virtually segmented using pointers to regions, locality gets lost. 
> (This is not a design bug, this is a fundamental requirement: consider > fast switching between threaded view and author-sorted view.) Sounds like a design bug to me. Personally, I'd implement "fast switching between threaded view and author-sorted view" the same way I'd address any other multiple-views-on-the-same-data problem. I'd retain data structures for both, and update them as the underlying model changed. These representations may need to maintain cursors into the underlying character data, if they must retain giant wads of character data as an underlying representation (arguably the _main_ design bug in Emacs, that it encourages you to do that for everything, rather than imposing a sensible structure), but those cursors don't need to be code-point counters; they could be byte offsets, or opaque handles whose precise meaning varied with the potentially variable underlying storage. Also, please remember that Emacs couldn't be implemented with giant Python strings anyway: crucially, all of this stuff is _mutable_ in Emacs. > And of course an operation that sorts regions in a buffer using > character pointers will have the same problem. Working with memory > pointers, OTOH, sucks more than that; GNU Emacs recently bit the > bullet and got rid of their higher-level memory-oriented APIs, all of > the Lisp structures now work with pointers, and only the very > low-level structures know about character-to-memory pointer > translation. > > This performance issue is perceptible even on 3GHz machines with not > so large (50MB) mbox files. It's *horrid* if you do something like > "occur" on a 1GB log file, then try randomly jumping to detected log > entries. Case in point: "occur" needs to scan the buffer anyway; you can't do better than linear time there. So you're going to iterate through the buffer, using one of the techniques that James proposed, and remember some locations. Why not just have those locations be opaque cursors into your data? 
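A minimal sketch of the "opaque cursor" idea for an "occur"-style scan, assuming the buffer is an immutable UTF-8 `bytes` object and that a cursor is simply the byte offset of a matching line. The names `occur` and `line_at` are hypothetical and only illustrate the argument; they are not Emacs or Python APIs.

```python
import re


def occur(buf, pattern):
    """Scan a UTF-8 buffer once; return the byte offset of each matching line.

    The returned integers are meant to be treated as opaque cursors:
    they are byte offsets, not code point counts.
    """
    rx = re.compile(pattern)
    cursors = []
    pos = 0
    while pos < len(buf):
        end = buf.find(b"\n", pos)
        if end == -1:
            end = len(buf)
        # Search only within the current line [pos, end).
        if rx.search(buf, pos, end):
            cursors.append(pos)
        pos = end + 1
    return cursors


def line_at(buf, cursor):
    """Jump straight to a remembered cursor and decode just that line."""
    end = buf.find(b"\n", cursor)
    if end == -1:
        end = len(buf)
    return buf[cursor:end].decode("utf-8")


log = "naïve error\nok\nerror: café\n".encode("utf-8")
hits = occur(log, b"error")
print([line_at(log, c) for c in hits])
```

Jumping to a remembered hit needs no code point counting, but the cursors are only meaningful for this exact buffer and cannot be compared with character positions.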
In summary: you're right, in that James missed a spot. You need bidirectional, *copyable* iterators that can traverse the string by byte, codepoint, grapheme, or decomposed glyph. From v+python at g.nevcal.com Wed Nov 24 05:28:19 2010 From: v+python at g.nevcal.com (Glenn Linderman) Date: Tue, 23 Nov 2010 20:28:19 -0800 Subject: [Python-Dev] http.server - reference to bug #427345 Message-ID: <4CEC9463.8030302@g.nevcal.com> Where might I find the bug #427345 that is referred to in a comment inside http.server ? Here is a code excerpt: # throw away additional data [see bug #427345] while select.select([self.rfile._sock], [], [], 0)[0]: if not self.rfile._sock.recv(1): break -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.curtin at gmail.com Wed Nov 24 05:35:10 2010 From: brian.curtin at gmail.com (Brian Curtin) Date: Tue, 23 Nov 2010 22:35:10 -0600 Subject: [Python-Dev] http.server - reference to bug #427345 In-Reply-To: <4CEC9463.8030302@g.nevcal.com> References: <4CEC9463.8030302@g.nevcal.com> Message-ID: On Tue, Nov 23, 2010 at 22:28, Glenn Linderman > wrote: > Where might I find the bug #427345 that is referred to in a comment inside > http.server ? Here is a code excerpt: > > # throw away additional data [see bug #427345] > while select.select([self.rfile._sock], [], [], 0)[0]: > if not self.rfile._sock.recv(1): > break > http://bugs.python.org/issue427345 http://bugs.python.org/ has a box on the left-hand side where you can enter issue numbers. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Wed Nov 24 06:07:52 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 24 Nov 2010 14:07:52 +0900 Subject: [Python-Dev] len(chr(i)) = 2? 
In-Reply-To: <635C265A-90A8-4B92-A65C-59EF3E8EFD68@twistedmatrix.com> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> <87sjyrbftz.fsf@uwakimon.sk.tsukuba.ac.jp> <635C265A-90A8-4B92-A65C-59EF3E8EFD68@twistedmatrix.com> Message-ID: <87oc9fb97b.fsf@uwakimon.sk.tsukuba.ac.jp> Note that I'm not saying that there shouldn't be a UTF-8 string type; I'm just saying that for some purposes it might be a good idea to keep UTF-16 and UTF-32 string types around. Glyph Lefkowitz writes: > > The theory is that accessing the first character of a region in a > > string often occurs as a primitive operation in O(N) or worse > > algorithms, sometimes without enough locality at the "collection of > > regions" level to give a reasonably small average access time. > > I'm not sure what you mean by "the theory is". Whose theory? About what? Mine. About why somebody somewhere someday would need fast random access to character positions. "Nobody ever needs that" is a strong claim. > > In practice, any *Emacs user can tell you that yes, we do need to be > > able to access the Nth codepoint in a buffer in constant time. The > > O(N) behavior of current Emacs implementations means that people often > > use a binary coding system on large files. Yes, some position caching > > is done, but if you have a large file (eg, a mail file) which is > > virtually segmented using pointers to regions, locality gets lost. > > (This is not a design bug, this is a fundamental requirement: consider > > fast switching between threaded view and author-sorted view.) 
> > Sounds like a design bug to me. Personally, I'd implement "fast > switching between threaded view and author-sorted view" the same > way I'd address any other multiple-views-on-the-same-data problem. > I'd retain data structures for both, and update them as the > underlying model changed. Um, that's precisely the design I'm talking about. But as you recognize later, the message content is not part of those structures because there's no real point in copying it *if you have fast access to character positions*. In a variable width character, character- addressed design, there can be a perceptible delay in accessing even the "next" message's content if you're in the wrong view. > These representations may need to maintain cursors into the > underlying character data, if they must retain giant wads of > character data as an underlying representation (arguably the _main_ > design bug in Emacs, that it encourages you to do that for > everything, rather than imposing a sensible structure), but those > cursors don't need to be code-point counters; they could be byte > offsets, or opaque handles whose precise meaning varied with the > potentially variable underlying storage. Both byte offsets and opaque handles really really suck to design, implement, and maintain, if Lisp or Python level users can use them. They're hard enough to do when you can hide them behind internal APIs, but if they're accessible to users they're an endless source of user bugs. What was that you were saying about the difficulty of remembering which argument is the fd? It's like that. Sure, you can design APIs to help get that right, but it's not easy to provide one that can be used for all the different applications out there. > Also, please remember that Emacs couldn't be implemented with giant > Python strings anyway: crucially, all of this stuff is _mutable_ in > Emacs. No, that's a red herring. 
The use cases that Emacs users complain about most are browsing giant logs and reading old mail; neither needs the content to be mutable (although of course it's a convenience in the mail case if you delete messages or fetch new mail, but that could be done with transaction logs that get appended to the on-disk file). > Case in point: "occur" needs to scan the buffer anyway; you can't > do better than linear time there. So you're going to iterate > through the buffer, using one of the techniques that James > proposed, and remember some locations. Why not just have those > locations be opaque cursors into your data? They are. But unless you're willing to implement correct character motion, they need to be character indices, which makes accessing the actual locations slow. We've implemented caches, as does Emacs, but they don't always get hits. Finding an arbitrary position once can involve perceptible delay on up to 1GHz machines; doing it in a loop (which mail programs have a habit of doing) could be very painful. > In summary: you're right, in that James missed a spot. You need > bidirectional, *copyable* iterators that can traverse the string by > byte, codepoint, grapheme, or decomposed glyph. That's a good start, yes. But once you talk about "remembering some locations", you're implicitly talking about random access. Either you maintain position indexes which, naively implemented, can easily be close to the size of the text buffer (indexes are going to be at least 4 bytes, possibly 8, per position, and something like "occur" can generate a lot of positions) -- in which case you might as well just use a representation that is an array in the first place -- or you need to implement a position cache which can be very hairy to do well.
Or you can give user programs memory indices, and enjoy the fun as the poor developers do things like "pos += 1" which works fine on the ASCII data they have lying around, then wonder why they get Unicode errors when they take substrings. I'm sure it all can be done, but I don't think it will be done right the first time around. You may be right that designs better adapted to large data sets than Emacs's "everything is a big buffer" will almost always be available with reasonable effort. But remember, a lot of good applications start small, when a flat array might make lots of sense as the underlying structure, and then need to scale. If you need to scale for the paying customers, well, "ouch!", but you can afford it; for many volunteer or startup projects that takes the wind right out of your sails. Note that if the user doesn't use private space, in a UCS-2 build you have about 6.4K code points (the BMP private use area) available for compressing non-BMP characters into a 2-byte, valid Unicode representation (of course you need to save off the table somewhere if that ever gets out of your program, but that's easy). I find it hard to imagine that there will be many use cases that need more than that many non-BMP characters. So probably you can tell those few users who care to use a UCS-4 build; most of the array use cases can be served by UCS-2. Note that in my Japanese corpora, UTF-8 averages just about 2 bytes per character anyway, and those are mail files, where two lines of Japanese may be preceded by 2KB of ASCII-only header. I suspect Hebrew, Arabic, and Cyrillic users will have similar experiences. By the way, to send the ball back into your court, I have this feeling that the demand for UTF-8 is once again driven by native English speakers who are very shortly going to find themselves, and the data they are most familiar with, very much in the minority.
Of course the market that benefits from UTF-8 compression will remain very large for the immediate future, but in the grand scheme of things, most of the world is going to prefer UTF-16 by a substantial margin. N.B. I'm not talking about persistent storage, where it's 6 of one and half a dozen of the other; you can translate UTF-8 to UTF-16 way faster than you can read content from disk, of course. From foom at fuhm.net Wed Nov 24 07:26:11 2010 From: foom at fuhm.net (James Y Knight) Date: Wed, 24 Nov 2010 01:26:11 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <87oc9fb97b.fsf@uwakimon.sk.tsukuba.ac.jp> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> <87sjyrbftz.fsf@uwakimon.sk.tsukuba.ac.jp> <635C265A-90A8-4B92-A65C-59EF3E8EFD68@twistedmatrix.com> <87oc9fb97b.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <3C1ADB64-63F3-4165-926D-EDE9846E0DBD@fuhm.net> On Nov 24, 2010, at 12:07 AM, Stephen J. Turnbull wrote: > Or you can give user programs memory indices, and enjoy the fun as > the poor developers do things like "pos += 1" which works fine on > the ASCII data they have lying around, then wonder why they get > Unicode errors when they take substrings. a) You seem to be hung up on implementation details of Emacs. But yes, positions should be stored as a byte offset into the utf8 string. NOT as number of codepoints since the beginning of the string. Probably you want it to be somewhat opaque, so that you actually have to specify whether you wanted to go to +1 byte, codepoint, or grapheme.
b) Those poor developers are *already* screwed if they're using pos += 1 when pos is a codepoint index and they then take a substring based on that! They will get half a character when the string contains combining characters... Pretending that "codepoints" are a useful abstraction just makes poor developers get by without doing the correct thing (incrementing to the next grapheme boundary) for a little bit longer. But once you [the language implementor] are providing correct abstractions for grapheme movement, it's just as easy to also provide an abstraction for codepoint movement, and make your low-level implementation of the iterator object be a byte-offset into a UTF8 buffer. James From foom at fuhm.net Wed Nov 24 07:27:52 2010 From: foom at fuhm.net (James Y Knight) Date: Wed, 24 Nov 2010 01:27:52 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <87oc9fb97b.fsf@uwakimon.sk.tsukuba.ac.jp> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> <87sjyrbftz.fsf@uwakimon.sk.tsukuba.ac.jp> <635C265A-90A8-4B92-A65C-59EF3E8EFD68@twistedmatrix.com> <87oc9fb97b.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Nov 24, 2010, at 12:07 AM, Stephen J. Turnbull wrote: > By the way, to send the ball back into your court, I have this feeling > that the demand for UTF-8 is once again driven by native English > speakers who are very shortly going to find themselves, and the data > they are most familiar with, very much in the minority. 
Of course the > market that benefits from UTF-8 compression will remain very large for > the immediate future, but in the grand scheme of things, most of the > world is going to prefer UTF-16 by a substantial margin. No, the demand for UTF-8 is because that's what much of the internet (and not coincidentally, unix) world has standardized on. The main pieces of software using UTF-16 (Windows, Java) started doing so before it became apparent that 16 bits wasn't enough to actually hold a unicode codepoint, so they were actually implementing UCS-2. In those days, UCS-2 was a fairly sensible choice. But, now, if your choices are UTF-8 or UTF-16, UTF-8 is clearly superior. Not because it's smaller -- it's pretty much a tossup -- but because it is an ASCII superset, and thus more easily compatible with other software. That also makes it most commonly used for internet communication. (So, there's a huge advantage for using it internally as well right there: no transcoding necessary for writing your HTML output). UTF-16 is incompatible with ASCII, and furthermore, it's still a variable-width encoding, with all the same issues that causes. As such, there's really very little to be said in favor of it. If you really want a fixed-width encoding, you have to go to UTF-32, which is excessively large. UTF-32 is a losing choice, simply because of the wasted memory usage. But that's all a side issue: even if you do choose UTF-16 as your underlying encoding, you *still* need to provide iterators that work by "byte" (only now bytes are 16-bits), by codepoint, and by grapheme. Of course, people who implement UTF-16 (such as python, java, and windows) often pretend they're still implementing UCS-2, and don't bother even providing their users with the necessary APIs to do things correctly. Which, you can often get away with...just so long as you don't mind that you sometimes end up splitting a string in the middle of a codepoint and causing a unicode error! 
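A minimal sketch of the failure described above, assuming text stored as little-endian UTF-16 and an API that cuts strings by 16-bit code unit as if the data were UCS-2. The helper names `utf16_units`, `split_units`, and `decodes_cleanly` are hypothetical, chosen only for this illustration.

```python
def utf16_units(s):
    """Return the UTF-16 code units of s as a list of ints."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def split_units(s, n):
    """Cut s after n 16-bit code units, ignoring surrogate pairs."""
    data = s.encode("utf-16-le")
    return data[:2 * n], data[2 * n:]


def decodes_cleanly(data):
    """True if data is well-formed UTF-16-LE."""
    try:
        data.decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False


# 'a' followed by U+1D11E MUSICAL SYMBOL G CLEF, which lies outside the BMP.
text = "a\U0001D11E"
print([hex(u) for u in utf16_units(text)])  # two code points, three code units

head, tail = split_units(text, 2)  # cut lands between the two surrogates
print(decodes_cleanly(head), decodes_cleanly(tail))
```

Cutting after one code unit instead of two keeps both halves well-formed, which is why this kind of bug tends to stay hidden until non-BMP data shows up.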
James From v+python at g.nevcal.com Wed Nov 24 08:43:18 2010 From: v+python at g.nevcal.com (Glenn Linderman) Date: Tue, 23 Nov 2010 23:43:18 -0800 Subject: [Python-Dev] Web servers, bytes, str, documentation, Python 3.2a4 In-Reply-To: <20101122043957.2A5D6235C7A@kimball.webabinitio.net> References: <4CE7452A.7050109@g.nevcal.com> <4CE7B34D.4020309@netwok.org> <4CE8111F.9060502@g.nevcal.com> <4CE8CFCD.4040906@g.nevcal.com> <20101121171821.195552194AC@kimball.webabinitio.net> <4CE9EABA.1090306@g.nevcal.com> <20101122043957.2A5D6235C7A@kimball.webabinitio.net> Message-ID: <4CECC216.8090802@g.nevcal.com> On 11/21/2010 8:39 PM, R. David Murray wrote: > On Sun, 21 Nov 2010 19:59:54 -0800, Glenn Linderman wrote: >> On 11/21/2010 9:18 AM, R. David Murray wrote: >>> I want to look at the CGI issue, but I'm not sure when I'll get to it. >> Actually, since this code was working before 3.x, and if email.parser >> can now accept binary streams, it seems like maybe the only thing that >> might be wrong is that presently it is getting a text stream instead, so >> that is something cgi.py or the application program would have to >> switch, and then maybe some testing would discover correctness, or maybe >> a specification of UTF-8 as the encoding to use for the text parts would >> have to be done. > Well, given the bytes/string split in Python3, code definitely has to > be changed to make this work, since you have to explicitly call bytes > processing routines (message_from_bytes, message_from_binary_file, > BytesFeedparser, etc) to parse binary data, and likewise use > BytesGenerator to emit binary data. Looks like cgi.py also calls http.client and both of them would need to be changed to deal with bytes. I don't have the full translation of API calls in my head, nor have I ever used the email.parser API to know what the calls actually do... just read a bit about it... but that is different than using it... 
However, I find code in http.client.parse_headers that is attempting to work around reading a binary stream and feeding email.parser a string. So definitely some work to be done to fix things. I did add some explicit threads to http.server CGI script code that I think work around the deadlocks that can result from attempting to serialize 3 pipes, and yet not require full buffering of stdin or stdout. At the moment, I still am doing full buffering of stderr, but that is thought to be small potatoes in an http.server environment, generally. But since my test case is CGI form data, I'm stuck until this is fixed, or I wrap my head around the code in http.client and email.parser. But not tonight (yawn!). From solipsis at pitrou.net Wed Nov 24 09:02:13 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 24 Nov 2010 09:02:13 +0100 Subject: [Python-Dev] OpenSSL Voluntarily (openssl-1.0.0a) In-Reply-To: <720EFE43-119F-4F2F-BCB1-939275B5FA6E@twistedmatrix.com> References: <4CEB3F72.7000006@m2.ccsnet.ne.jp> <20101123150219.29e20374@pitrou.net> <720EFE43-119F-4F2F-BCB1-939275B5FA6E@twistedmatrix.com> Message-ID: <1290585733.3642.2.camel@localhost.localdomain> On Tuesday, 23 November 2010 at 20:56 -0500, Glyph Lefkowitz wrote: > On Nov 23, 2010, at 9:02 AM, Antoine Pitrou wrote: > > > On Tue, 23 Nov 2010 00:07:09 -0500 > > Glyph Lefkowitz wrote: > >> On Mon, Nov 22, 2010 at 11:13 PM, Hirokazu Yamamoto < > >> ocean-city at m2.ccsnet.ne.jp> wrote: > >> > >>> Hello. Does this affect python? Thank you. > >>> > >>> http://www.openssl.org/news/secadv_20101116.txt > >>> > >> > >> No. > > > > Well, actually it does, but Python links against the system OpenSSL on > > most platforms (except Windows), so it's up to the OS vendor to apply > > the patch. > > > It does? If so, I must have misunderstood the vulnerability. Can you > explain how it affects Python?
If I believe the link above: "Any OpenSSL based TLS server is vulnerable if it is multi-threaded and uses OpenSSL's internal caching mechanism. Servers that are multi-process and/or disable internal session caching are NOT affected." So, you just have to create a multithreaded TLS server which doesn't disable server-side session caching (it is enabled by default according to http://www.openssl.org/docs/ssl/SSL_CTX_set_session_cache_mode.html ) Regards Antoine. From solipsis at pitrou.net Wed Nov 24 09:42:07 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 24 Nov 2010 09:42:07 +0100 Subject: [Python-Dev] Centos 5.5 freeze during test_concurrent_futures References: Message-ID: <20101124094207.33ac093f@pitrou.net> Hi, > py3k built from trunk on Centos 5.5 freezes during regrtest on test_concurrent_futures with "Fatal Python error: Invalid thread state for this thread". As in a typical concurrent problem, subsequent calls freeze in different test cases, but the freeze itself is always reproducible and always during this test. Well, could you run this under gdb and report the stacks for the various threads when the process crashes? (when compiled --with-pydebug, if possible) Thank you Antoine. From solipsis at pitrou.net Wed Nov 24 09:43:12 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 24 Nov 2010 09:43:12 +0100 Subject: [Python-Dev] http.server - reference to bug #427345 References: <4CEC9463.8030302@g.nevcal.com> Message-ID: <20101124094312.06bec373@pitrou.net> On Tue, 23 Nov 2010 22:35:10 -0600 Brian Curtin wrote: > On Tue, Nov 23, 2010 at 22:28, Glenn Linderman > > > wrote: > > > Where might I find the bug #427345 that is referred to in a comment inside > > http.server ?
Here is a code excerpt: > > > > # throw away additional data [see bug #427345] > > while select.select([self.rfile._sock], [], [], 0)[0]: > > if not self.rfile._sock.recv(1): > > break > > > > http://bugs.python.org/issue427345 > > http://bugs.python.org/ has a box on the left-hand side where you can enter > issue numbers. And of course you can also reverse-engineer the clever URL scheme used by Roundup bug entries ;) Regards Antoine. From stephen at xemacs.org Wed Nov 24 10:03:29 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 24 Nov 2010 18:03:29 +0900 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <3C1ADB64-63F3-4165-926D-EDE9846E0DBD@fuhm.net> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> <87sjyrbftz.fsf@uwakimon.sk.tsukuba.ac.jp> <635C265A-90A8-4B92-A65C-59EF3E8EFD68@twistedmatrix.com> <87oc9fb97b.fsf@uwakimon.sk.tsukuba.ac.jp> <3C1ADB64-63F3-4165-926D-EDE9846E0DBD@fuhm.net> Message-ID: <87mxozayam.fsf@uwakimon.sk.tsukuba.ac.jp> James Y Knight writes: > a) You seem to be hung up implementation details of emacs. Hung up? No. It's the program whose text model I know best, and even if its design could theoretically be a lot better for this purpose, I can't say I've seen a real program whose model is obviously better for the purpose of a language for implementing text editors.[1] So it's not obvious to me that its model can be ruled out on a priori grounds. If not, it would be nice if your new language could implement it efficiently without contorted programming. 
> But yes, positions should be stored as a byte offset into the > utf8 string. NOT as number of codepoints since the beginning of > the string. Probably you want it to be somewhat opaque, so that > you actually have to specify whether you wanted to go to +1 > byte, codepoint, or grapheme. Well, first of all, +1 byte should not be available to a text iterator, at least not with the same iterator/position object that implements character and/or grapheme movement. (You seem to have thought about this issue a lot, but mixing bytes with text units makes me wonder how much practical implementation you've done.) Second, incrementing to grapheme boundaries is relatively easy to do efficiently, just as incrementing to a UTF-8 character boundary is easy to do. We already do the latter; the former is pragmatically harder, but not a conceptual stretch. That's not the question. The question is how do we identify an arbitrary position in the text? Sometimes it's nice to have a numerical measure of size or location. It is not clear that position by grapheme count is going to be the obvious way to determine position in a text. E.g., for languages with variable metric characters, the use of character counts to line up table columns is going the way of Tyrannosaurus. In the Han-using languages, yes, column counts within lines are going to be important forever, because the characters are literally square for most practical purposes ... but they don't use composing characters (all the Japanese kana are precomposed, for example), so position by grapheme is going to be very close to position by character, and fine positioning will be done either by mouse or by incrementing the last few characters. Nor do I think operations like "advance 1,000,000 characters" will have less meaning than "advance 1,000,000 graphemes."
Both of them are just a way of saying "go way far away", end up in about the same place, and where there's a bias, it will be pretty consistent in a statistical sense for any given natural language (and therefore, for 99% of users). > But once you [the language implementor] are providing correct > abstractions for grapheme movement, it's just as easy to also > provide an abstraction for codepoint movement, and make your > low-level implementation of the iterator object be a byte-offset > into a UTF8 buffer. Sure, that's fine for something that just iterates over the text. But if you actually need to remember positions, or regions, to jump to later or to communicate to other code that manipulates them, doing this stuff the straightforward way (just copying the whole iterator object to hang on to its state) becomes expensive. You end up proliferating types that all do the same kind of thing. Judicious use of inheritance helps, but getting the fundamental abstraction right is hard. Or at least, Emacs hasn't found it in 20 years of trying. OTOH, all that stuff "just works" and just works efficiently, up to the grapheme vs. character issue, with an array. About that issue, to go back to tired old Emacs, *all* of the things I can think of that I might want to do by grapheme (display, insert, delete, move a few places) do fit the "increment until done" model. These things already work quite well for the variable-width buffer that "multilingual" Emacsen use, whether the old Mule encoding or UTF-8. So I can see how the UTF-8 model with appropriate iterators for characters and graphemes can work well for lots of applications and use cases. But Emacs already has opaque "markers", yet the use of integer character positions in strings and buffers has survived. That *may* have to do with mutability, and the "all the world is a buffer" design, as Glyph suggested, but I think it more likely that markers are very expensive to create and use compared to integers.
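A minimal sketch of the two access patterns being contrasted here, assuming a well-formed UTF-8 `bytes` buffer. The function names `next_codepoint` and `codepoint_to_offset` are hypothetical, and this is not how Emacs implements character motion; it only shows why "increment until done" is cheap while "go to character N" is linear without a cache.

```python
def next_codepoint(buf, pos):
    """Return the byte offset of the code point after the one starting at pos.

    Assumes buf is well-formed UTF-8 and pos is on a code point boundary.
    """
    if pos >= len(buf):
        return len(buf)
    pos += 1
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx; skip them.
    while pos < len(buf) and (buf[pos] & 0xC0) == 0x80:
        pos += 1
    return pos


def codepoint_to_offset(buf, n):
    """Translate a code point index into a byte offset by walking the buffer.

    This costs O(n) steps, whereas indexing a fixed-width array is O(1).
    """
    pos = 0
    for _ in range(n):
        pos = next_codepoint(buf, pos)
    return pos


buf = "aé€b".encode("utf-8")        # 1 + 2 + 3 + 1 bytes
offset = codepoint_to_offset(buf, 3)
print(offset, buf[offset:].decode("utf-8"))
```

A position cache would shortcut `codepoint_to_offset` by starting the walk from the nearest remembered (index, offset) pair rather than from zero.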
Perhaps an editor of power similar to Emacs could be implemented with string operations on lines, or the like, and these issues would go away. But it's not obvious to me. Footnotes: [1] Yes, I know that not all programs are text editors. So shoot me. It's still the text manipulation program I know best, and it's not obvious to me that it's the unique class that would need these features. From stephen at xemacs.org Wed Nov 24 10:51:49 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 24 Nov 2010 18:51:49 +0900 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> <87sjyrbftz.fsf@uwakimon.sk.tsukuba.ac.jp> <635C265A-90A8-4B92-A65C-59EF3E8EFD68@twistedmatrix.com> <87oc9fb97b.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87lj4jaw22.fsf@uwakimon.sk.tsukuba.ac.jp> James Y Knight writes: > But, now, if your choices are UTF-8 or UTF-16, UTF-8 is clearly > superior [...]a because it is an ASCII superset, and thus more > easily compatible with other software. That also makes it most > commonly used for internet communication. Sure, UTF-8 is very nice as a protocol for communicating text. So what? If your application involves shoveling octets real fast, don't convert and shovel those octets. If your application involves significant text processing, well, conversion can almost always be done as fast as you can do I/O so it doesn't cost wallclock time, and generally doesn't require a huge percentage of CPU time compared to the actual text processing. 
It's just a specialization of serialization, that we do all the time for more complex data structures. So wire protocols are not a killer argument for or against any particular internal representation of text. > (So, there's a huge advantage for using it internally as well right > there: no transcoding necessary for writing your HTML output). I don't know your use cases but for mine, transcoding (whether in Lisp or Python or C) is invariably the least of my worries. *Especially* transcoding to UTF-8, which is the default codec for me, and I *never* mix bytes and text, so having not bothered to set the codec, I don't bother to transcode explicitly. > If you really want a fixed-width encoding, you have to go to > UTF-32 Not really. I never bothered implementing the codec, because I haven't yet seen a non-BMP Unicode character in the wild (I still see a lot of non-Unicode characters, but hey, that's the price you pay for living in the land that invented sushi, sake, and anime). For most use cases, those are going to be rare, where by "rare" I mean "you aren't going to see 6400 *different* non-BMP characters."[1] So instead of having the codec produce UTF-16, you have it produce (Holy CEF, Batman!) "pure" UCS-2 with the non-BMP characters registered on demand and encoded in the BMP private area. Python, of course, will never know the difference, and your language won't need to care, either. > But that's all a side issue: even if you do choose UTF-16 as your > underlying encoding, you *still* need to provide iterators that > work by "byte" (only now bytes are 16-bits), by codepoint, Nope, see above. Codepoints can be bytes and vice versa. The needed codec is no harder to use than any other codec, and only slightly less efficient than the normal UTF-8 codec unless you're basically restricted to a rather uncommon script (and even then there are optimizations). > and by grapheme. 
Sure, but as I point out elsewhere, the use cases where grapheme movement is distinguished from character movement I can come up with are all iterative, and I don't need array behavior for both anyway. So since I *can* have a character array in Unicode, and I *can't* have a grapheme array (except maybe by a scheme like the above), I'll go for the character array. Unless maybe you convince me I don't need it, but I'm yet to be convinced. > away with...just so long as you don't mind that you sometimes end > up splitting a string in the middle of a codepoint and causing a > unicode error! I *do* mind, but I like Python anyway. Footnotes: [1] OK, in practice a lot of the private space will be taken by existing system characters, such as the Apple logo (absolutely essential for writing email on Mac, at least in Japan). Whose use-case is going to see 1000 different non-BMP characters in a session? I do know a couple of Buddhist dictionary editors, but aside from them, I can't think of anybody. Lara Croft, maybe. From solipsis at pitrou.net Wed Nov 24 11:27:30 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 24 Nov 2010 11:27:30 +0100 Subject: [Python-Dev] len(chr(i)) = 2? References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> <87sjyrbftz.fsf@uwakimon.sk.tsukuba.ac.jp> <635C265A-90A8-4B92-A65C-59EF3E8EFD68@twistedmatrix.com> <87oc9fb97b.fsf@uwakimon.sk.tsukuba.ac.jp> <87lj4jaw22.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20101124112730.6867fb17@pitrou.net> On Wed, 24 Nov 2010 18:51:49 +0900 "Stephen J. 
Turnbull" wrote: > James Y Knight writes: > > > But, now, if your choices are UTF-8 or UTF-16, UTF-8 is clearly > > superior [...]a because it is an ASCII superset, and thus more > > easily compatible with other software. That also makes it most > > commonly used for internet communication. > > Sure, UTF-8 is very nice as a protocol for communicating text. So > what? If your application involves shoveling octets real fast, don't > convert and shovel those octets. If your application involves > significant text processing, well, conversion can almost always be > done as fast as you can do I/O so it doesn't cost wallclock time, and > generally doesn't require a huge percentage of CPU time compared to > the actual text processing. It's just a specialization of > serialization, that we do all the time for more complex data > structures. > > So wire protocols are not a killer argument for or against any > particular internal representation of text. Agreed. Decoding and encoding utf-8 is so fast that it should be dwarfed by any actual processing done on the text. Regards Antoine. From solipsis at pitrou.net Wed Nov 24 12:37:54 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 24 Nov 2010 12:37:54 +0100 Subject: [Python-Dev] r86726 - python/branches/release27-maint/Objects/setobject.c References: <20101124103923.DC18EDE50@mail.python.org> Message-ID: <20101124123754.3b60d3a3@pitrou.net> On Wed, 24 Nov 2010 11:39:23 +0100 (CET) armin.rigo wrote: > Author: armin.rigo > Date: Wed Nov 24 11:39:23 2010 > New Revision: 86726 > > Log: > A no-op change. It looks like this call was not meant to be a recursive > call, but just call the helper (which the recursive call ends up doing). Since it's allegedly a no-op change, it doesn't come with a test, and 2.7.1 is in rc phase, is it really the right time to do it? What is the motivation for it? Thanks Antoine. 
> > > Modified: > python/branches/release27-maint/Objects/setobject.c > > Modified: python/branches/release27-maint/Objects/setobject.c > ============================================================================== > --- python/branches/release27-maint/Objects/setobject.c (original) > +++ python/branches/release27-maint/Objects/setobject.c Wed Nov 24 11:39:23 2010 > @@ -1858,7 +1858,7 @@ > tmpkey = make_new_set(&PyFrozenSet_Type, key); > if (tmpkey == NULL) > return -1; > - rv = set_contains(so, tmpkey); > + rv = set_contains_key(so, tmpkey); > Py_DECREF(tmpkey); > } > return rv; From fuzzyman at voidspace.org.uk Wed Nov 24 13:30:15 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 24 Nov 2010 12:30:15 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> Message-ID: <4CED0557.9090101@voidspace.org.uk> On 23/11/2010 14:16, Nick Coghlan wrote: > On Tue, Nov 23, 2010 at 11:50 PM, Michael Foord > wrote: >> PEP 354 was rejected for two primary reasons - lack of interest and nowhere >> obvious to put it. Would it be *so bad* if an enum type lived in its own >> module? There is certainly more interest now, and if we are to use something >> like this in the standard library it *has* to be in the standard library >> (unless every module implements their own private _Constant class). >> >> Time to revisit the PEP? > If you (or anyone else) wanted to revisit the PEP, then I would advise > trawling through the standard library looking for constants that could > be sensibly converted to enum values. 
Based on a non-exhaustive search, Python standard library modules currently using integers for constants: * re - has flags (OR'able constants) defined in sre_constants, each flag has two names (e.g. re.IGNORECASE and re.I) * os has SEEK_SET, SEEK_CUR, SEEK_END - *plus* those implemented in posix / nt * doctest has its own flag system, but is really just using integer flags / constants (quite a few of them) * token has a tonne of constants (autogenerated) * socket exports a bunch of constants defined in _socket * gzip has flags: FTEXT, FHCRC, FEXTRA, FNAME, FCOMMENT * errno (builtin module) EALREADY, EINPROGRESS, EWOULDBLOCK, ECONNRESET, EINVAL, ENOTCONN, ESHUTDOWN, EINTR, EISCONN, EBADF, ECONNABORTED * opcode has HAVE_ARGUMENT, EXTENDED_ARG. In fact pretty much the whole of opcode is about defining and exposing named constants * msilib uses flag constants * multiprocessing.pool - RUN, CLOSE, TERMINATE * multiprocessing.util - NOTSET, SUBDEBUG, DEBUG, INFO, SUBWARNING * xml.dom and xml.dom.Node (in __init__.py) have a bunch of constants * xml.dom.NodeFilter.NodeFilter holds a bunch of constants (some of them flags) * xmlrpc.client has a bunch of error constants * calendar uses constants to represent weekdays, plus one for the EPOCH that is best left alone * http.client has a tonne of constants - recognisable as ports / error codes though * dis has flags in COMPILER_FLAG_NAMES, which are then set as locals in inspect * io defines SEEK_SET, SEEK_CUR, SEEK_END (same as os) Where constants are implemented in C but exported via a Python module (the constants exported by os and socket for example) they could be wrapped. Where they are exported directly by a C extension or builtin module (e.g. errno) they are probably best left. Raymond feels that having an enum / constant type would be Javaesque and unused. If we used it in the standard library the unused fear at least would be unwarranted. 
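As a rough illustration of what "wrapping" an integer constant could look like, here is a minimal sketch. The class name NamedInt and its repr format are assumptions made for this example only; they are not an existing stdlib API and not the design proposed in the thread. The point is just that an int subclass keeps working everywhere an int is expected while printing its name.

```python
class NamedInt(int):
    """Hypothetical int subclass that remembers the name it was given."""

    def __new__(cls, name, value):
        # int is immutable, so the value has to be set in __new__.
        self = super().__new__(cls, value)
        self._name = name
        return self

    def __repr__(self):
        # Show the name for debugging; the numeric value is unchanged.
        return "<%s: %d>" % (self._name, int(self))


# Stand-ins for constants such as os.SEEK_SET / os.SEEK_END.
SEEK_SET = NamedInt("SEEK_SET", 0)
SEEK_END = NamedInt("SEEK_END", 2)

print(repr(SEEK_END))   # the name shows up in debugging output
print(SEEK_END + 1)     # arithmetic still yields a plain int
```

Because the object is still an int, code that compares it to, or passes it as, a plain integer should be unaffected; only the repr changes.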
The change would be largely transparent to developers, except they get better debugging info.

Twisted is also looking for an enum / constant type:

http://twistedmatrix.com/trac/ticket/4671

Because we would need to subclass int for backwards compatibility, it couldn't replace float / string constants (unless the base class is set dynamically, which I don't propose). Hopefully it would still be sufficient to allow Twisted to use it. (Although they do so love reimplementing parts of the standard library - usually better than the standard library it has to be said.)

All the best,

Michael

There are a tonne of constants that are used as numbers (MAX_LINE_LENGTH appears in a few places) and aren't just arbitrary constants. There are also some other interesting ones:

* pty has STDIN_FILENO, STDOUT_FILENO, STDERR_FILENO, CHILD
* poplib has POP3_PORT, POP3_SSL_PORT - recognisable as port numbers, should be left as ints
* datetime.py has MINYEAR and MAXYEAR
* colorsys has float constants
* tty uses constants for termios list indexes (used as numbers I guess)
* curses.ascii has a whole bunch of integer constants referring to ascii characters
* Several modules - decimal, concurrent.futures, uuid (and now inspect) already use strings

> A decision would also need to be made as to whether or not to subclass
> int, or just provide __index__ (the former has the advantage of being
> able to drop cleanly into OS level APIs that expect a numerical
> constant).
>
> Whether enums should provide arbitrary name-value mappings (ala C
> enums) or were restricted to sequential indices starting from zero
> would be another question best addressed by a code survey of at least
> the stdlib.
>
> And getgeneratorstate() doesn't count as a use case, since the
> ordering isn't needed and using string literals instead of integers
> will cover the debugging aspect :)
>
> Cheers,
> Nick.
>

--
http://www.voidspace.org.uk/

READ CAREFULLY.
By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

From ncoghlan at gmail.com Wed Nov 24 15:08:04 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 25 Nov 2010 00:08:04 +1000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <4CED0557.9090101@voidspace.org.uk>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk>
Message-ID:

On Wed, Nov 24, 2010 at 10:30 PM, Michael Foord wrote:
> Based on a non-exhaustive search, Python standard library modules currently
> using integers for constants:

Thanks for that review. I think following up on the "NamedConstant" idea may make more sense than pursuing enums in their own right. That way we could get the debugging benefits on the Python side regardless of any type constraints on the value (e.g. needing to be an integer in order to interface to C code), without needing to design an enum API that suited all purposes.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From exarkun at twistedmatrix.com Wed Nov 24 16:01:06 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Wed, 24 Nov 2010 15:01:06 -0000
Subject: [Python-Dev] OpenSSL Voluntarily (openssl-1.0.0a)
In-Reply-To: <1290585733.3642.2.camel@localhost.localdomain>
References: <4CEB3F72.7000006@m2.ccsnet.ne.jp> <20101123150219.29e20374@pitrou.net> <720EFE43-119F-4F2F-BCB1-939275B5FA6E@twistedmatrix.com> <1290585733.3642.2.camel@localhost.localdomain>
Message-ID: <20101124150106.2109.660794265.divmod.xquotient.197@localhost.localdomain>

On 08:02 am, solipsis at pitrou.net wrote:
>On Tuesday 23 November 2010 at 20:56 -0500, Glyph Lefkowitz wrote:
>>On Nov 23, 2010, at 9:02 AM, Antoine Pitrou wrote:
>>
>> > On Tue, 23 Nov 2010 00:07:09 -0500
>> > Glyph Lefkowitz wrote:
>> >> On Mon, Nov 22, 2010 at 11:13 PM, Hirokazu Yamamoto <
>> >> ocean-city at m2.ccsnet.ne.jp> wrote:
>> >>
>> >>> Hello. Does this affect python? Thank you.
>> >>>
>> >>> http://www.openssl.org/news/secadv_20101116.txt
>> >>>
>> >>
>> >> No.
>> >
>> > Well, actually it does, but Python links against the system OpenSSL on
>> > most platforms (except Windows), so it's up to the OS vendor to apply
>> > the patch.
>>
>>
>>It does? If so, I must have misunderstood the vulnerability. Can you
>>explain how it affects Python?
>
>If I believe the link above:
>"Any OpenSSL based TLS server is vulnerable if it is multi-threaded and
>uses OpenSSL's internal caching mechanism. Servers that are
>multi-process and/or disable internal session caching are NOT
>affected."
>
>So, you just have to create a multithreaded TLS server which doesn't
>disable server-side session caching (it is enabled by default according
>to http://www.openssl.org/docs/ssl/SSL_CTX_set_session_cache_mode.html
>)

Hm. The session cache is enabled by default, but nothing will ever use it unless the server specifies a session id using SSL_set_session_id_context or SSL_CTX_set_session_id_context.
Python doesn't expose these, so I don't think any Python SSL server can set them. The vulnerability announcement isn't 100% clear on this, but I took a look at the patch which fixes the issue and it /appears/ as though if a client never tries to re-use a session then you will be safe from this bug. However, perhaps this only means that only malicious clients (which send a session id even when they can't actually have one) will be able to trigger the bug. Or I may misunderstand how SSL sessions work in OpenSSL entirely. The documentation for them is on par with that for most of the rest of OpenSSL. Jean-Paul From solipsis at pitrou.net Wed Nov 24 16:11:20 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 24 Nov 2010 16:11:20 +0100 Subject: [Python-Dev] OpenSSL Voluntarily (openssl-1.0.0a) References: <4CEB3F72.7000006@m2.ccsnet.ne.jp> <20101123150219.29e20374@pitrou.net> <720EFE43-119F-4F2F-BCB1-939275B5FA6E@twistedmatrix.com> <1290585733.3642.2.camel@localhost.localdomain> <20101124150106.2109.660794265.divmod.xquotient.197@localhost.localdomain> Message-ID: <20101124161120.5ddd106c@pitrou.net> On Wed, 24 Nov 2010 15:01:06 -0000 exarkun at twistedmatrix.com wrote: > > > >If I believe the link above: > > 1CAny OpenSSL based TLS server is vulnerable if it is multi-threaded and > >uses OpenSSL's internal caching mechanism. Servers that are > >multi-process and/or disable internal session caching are NOT > >affected. 1D > > > >So, you just have to create a multithreaded TLS server which doesn't > >disable server-side session caching (it is enabled by default according > >to http://www.openssl.org/docs/ssl/SSL_CTX_set_session_cache_mode.html > >) > > Hm. The session cache is enabled by default, but nothing will ever use > it unless the server specifies a session id using > SSL_set_session_id_context or SSL_CTX_set_session_id_context. Python > doesn't expose these, so I don't think any Python SSL server can set > them. 
Well, Python calls SSL_CTX_set_session_id_context() implicitly, starting from 3.2 (precisely so that the session cache gets used). The "documentation" I've found about the "session id context" seems to suggest that a process-wide constant is enough.

(and you can verify that caching occurs using the new SSLContext.session_stats() method)

> Or I may misunderstand how SSL sessions work in OpenSSL entirely. The
> documentation for them is on par with that for most of the rest of
> OpenSSL.

Agreed.

Regards

Antoine.

From steve at pearwood.info Wed Nov 24 16:44:57 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 25 Nov 2010 02:44:57 +1100
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To:
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk>
Message-ID: <4CED32F9.5050004@pearwood.info>

Nick Coghlan wrote:
> On Wed, Nov 24, 2010 at 10:30 PM, Michael Foord
> wrote:
>> Based on a non-exhaustive search, Python standard library modules currently
>> using integers for constants:
>
> Thanks for that review. I think following up on the "NamedConstant"
> idea may make more sense than pursuing enums in their own right.

Pardon me if I've missed something in this thread, but when you say "NamedConstant", do you mean actual constants that can only be bound once but not re-bound? If so, +1. If not, what do you mean?

I thought PEP 3115 could be used to implement such constants, but I can't get it to work...

class readonlydict(dict):
    def __setitem__(self, key, value):
        if key in self:
            raise TypeError("can't rebind constant")
        dict.__setitem__(self, key, value)
    # Need to also handle updates, del, pop, etc.

class MetaConstant(type):
    @classmethod
    def __prepare__(metacls, name, bases):
        return readonlydict()
    def __new__(cls, name, bases, classdict):
        assert type(classdict) is readonlydict
        return type.__new__(cls, name, bases, classdict)

class Constant(metaclass=MetaConstant):
    a = 1
    b = 2
    c = 3

What I expect is that Constant.a should return 1, and Constant.a=2 should raise TypeError, but what I get is a normal class __dict__.

>>> Constant.a
1
>>> Constant.a = 2
>>> Constant.a
2

--
Steven

From exarkun at twistedmatrix.com Wed Nov 24 17:23:12 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Wed, 24 Nov 2010 16:23:12 -0000
Subject: [Python-Dev] OpenSSL Vulnerability (openssl-1.0.0a)
In-Reply-To: <20101124161120.5ddd106c@pitrou.net>
References: <4CEB3F72.7000006@m2.ccsnet.ne.jp> <20101123150219.29e20374@pitrou.net> <720EFE43-119F-4F2F-BCB1-939275B5FA6E@twistedmatrix.com> <1290585733.3642.2.camel@localhost.localdomain> <20101124150106.2109.660794265.divmod.xquotient.197@localhost.localdomain> <20101124161120.5ddd106c@pitrou.net>
Message-ID: <20101124162312.2109.1025683352.divmod.xquotient.215@localhost.localdomain>

On 03:11 pm, solipsis at pitrou.net wrote:
>On Wed, 24 Nov 2010 15:01:06 -0000
>exarkun at twistedmatrix.com wrote:
>> >
>> >If I believe the link above:
>> >"Any OpenSSL based TLS server is vulnerable if it is multi-threaded and
>> >uses OpenSSL's internal caching mechanism. Servers that are
>> >multi-process and/or disable internal session caching are NOT
>> >affected."
>> >
>> >So, you just have to create a multithreaded TLS server which doesn't
>> >disable server-side session caching (it is enabled by default according
>> >to http://www.openssl.org/docs/ssl/SSL_CTX_set_session_cache_mode.html
>> >)
>>
>>Hm. The session cache is enabled by default, but nothing will ever use
>>it unless the server specifies a session id using
>>SSL_set_session_id_context or SSL_CTX_set_session_id_context.
>>Python doesn't expose these, so I don't think any Python SSL server can set
>>them.
>
>Well, Python calls SSL_CTX_set_session_id_context() implicitly, starting
>from 3.2 (precisely so that the session cache gets used). The
>"documentation" I've found about the "session id context" seems to
>suggest that a process-wide constant is enough.

Ah. Okay, then Python 3.2 would be vulnerable. Good thing it isn't released yet. ;)

Jean-Paul

From benjamin at python.org Wed Nov 24 17:32:56 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Wed, 24 Nov 2010 10:32:56 -0600
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <4CED32F9.5050004@pearwood.info>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED32F9.5050004@pearwood.info>
Message-ID:

2010/11/24 Steven D'Aprano :
> Nick Coghlan wrote:
>>
>> On Wed, Nov 24, 2010 at 10:30 PM, Michael Foord
>> wrote:
>>>
>>> Based on a non-exhaustive search, Python standard library modules
>>> currently
>>> using integers for constants:
>>
>> Thanks for that review. I think following up on the "NamedConstant"
>> idea may make more sense than pursuing enums in their own right.
>
> Pardon me if I've missed something in this thread, but when you say
> "NamedConstant", do you mean actual constants that can only be bound once
> but not re-bound? If so, +1. If not, what do you mean?
>
> I thought PEP 3115 could be used to implement such constants, but I can't
> get it to work...
>
> class readonlydict(dict):
>     def __setitem__(self, key, value):
>         if key in self:
>             raise TypeError("can't rebind constant")
>         dict.__setitem__(self, key, value)
>     # Need to also handle updates, del, pop, etc.
>
> class MetaConstant(type):
>     @classmethod
>     def __prepare__(metacls, name, bases):
>         return readonlydict()
>     def __new__(cls, name, bases, classdict):
>         assert type(classdict) is readonlydict
>         return type.__new__(cls, name, bases, classdict)
>
> class Constant(metaclass=MetaConstant):
>     a = 1
>     b = 2
>     c = 3
>
>
> What I expect is that Constant.a should return 1, and Constant.a=2 should
> raise TypeError, but what I get is a normal class __dict__.

The construction namespace can be customized, but class.__dict__ must always be a real dict.

--
Regards,
Benjamin

From jsbueno at python.org.br Wed Nov 24 18:23:57 2010
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Wed, 24 Nov 2010 15:23:57 -0200
Subject: [Python-Dev] Fwd: constant/enum type in stdlib
In-Reply-To:
References: <20101121034404.52924F20A@mail.python.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain>
Message-ID:

Hi -- If I may add my 0.02 cents - this sample has a sample implementation of the proposed features I found most interesting up to now:

1) inherit from int
2) display the constant's name on 'repr'
3) optionally populate a module with the constants
4) Optionally provide a starting value for the enum
5) Optionally provide a mapping with the values

http://pastebin.com/6f1u35qJ (implementation is in python 2)

Todo here:
6) Make them "read only"
7) Make the base type optional,
with "int" as default - but also being able to create "constants" inheriting from other objects
8) more ideas?

I am willing to play along with this sample code as discussion goes on if there is any feedback.

js
-><-

From alexander.belopolsky at gmail.com Wed Nov 24 18:37:43 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 24 Nov 2010 12:37:43 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To:
References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

On Tue, Nov 23, 2010 at 2:18 PM, Amaury Forgeot d'Arc wrote:
..
>> Given the apparent difficulty of writing even basic text processing
>> algorithms in presence of surrogate pairs, I wonder how wise it is to
>> expose Python users to them.
>
> This was already discussed two years ago:
>
> http://mail.python.org/pipermail/python-dev/2008-July/080900.html
>

Thanks for the link. Let me summarize that discussion as I read it.

The discussion starts with a reference to Guido's 2001 post which concluded with

"""
... if we had wanted to use a variable-length internal representation, we should have picked UTF-8 way back, like Perl did. Moving to a UTF-16-based internal representation now will give us all the problems of the Perl choice without any of the benefits.
""" [1]

and proposes to move to UCS-4 completely for Python 3.0. Note that this is not the option that I would like to discuss here. I don't propose to discuss abandoning narrow builds. Instead, I would like to discuss the costs and benefits associated with using variable width CES as an internal representation. This is where the 2008 discussion moved.
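To make the variable-width point concrete, here is a small sketch (not from the thread) that counts how many code units a single non-BMP code point needs in each encoding form. It uses only str.encode and int.from_bytes, so it assumes Python 3.2 or later; the variable names are illustrative.

```python
# One non-BMP code point, the same one used as an example later in this message.
ch = "\U00012345"

utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-le")   # the "-le" variants emit no byte order mark
utf32 = ch.encode("utf-32-le")

# Code units per encoding form (unit sizes are 1, 2 and 4 bytes).
units_utf8 = len(utf8)
units_utf16 = len(utf16) // 2
units_utf32 = len(utf32) // 4

# In UTF-16 the two units are a high and a low surrogate.
high = int.from_bytes(utf16[0:2], "little")
low = int.from_bytes(utf16[2:4], "little")

print(units_utf8, units_utf16, units_utf32)
print(hex(high), hex(low))
```

Only the four-byte form is fixed width for this character; in both UTF-8 and UTF-16 a "position" counted in code units differs from one counted in code points.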
OP did not realize that narrow build supported UTF-16 and like myself was surprised that application developers should be aware of surrogates if they want to use narrow builds. It was also suggested that Python itself is likely to have many bugs that can be triggered by non-BMP characters on narrow builds. Guido's response was: """ I'd also prefer to receive bug reports about breakages actually encountered in the wild than purely theoretical issues """ I don't think this is a good position to take. Programs that expect one code unit where Python may produce two are likely to have security holes. Even when programmers carefully sanitize their input, they are likely to do it at the code point level based on Unicode category and 0xFFFF boundary does not mean anything special for their applications. I think anyone who wants to write a robust application has two choices in practice: (a) use wide Unicode build; (b) restrict all text to BMP. Supporting surrogates at the application level is likely to be prohibitively expensive. It was later suggested that the main benefit of "UTF-16" builds is that they can easily interface with system libraries that are "UTF-16" based. However, how likely are these libraries be bug-free when it comes to non-BMP characters? The history teaches us that not very likely. Daniel Arbuckle presented arguments against imposing the burden of dealing with surrogates on application writers. [2] The recurrent theme on the thread was that non-BMP characters are rare and those who need them can afford the extra development cost associated with the surrogates. This point was very eloquently articulated by Guido: """ Who are the many here? Who are the few? 
I'd venture that (at least for the foreseeable future, say, until China will finally have taken over the role of the US as the de-facto dominant super power :-) the many are people whose app will never see a Unicode character outside the BMP, or who do such minimal string processing that their code doesn't care whether it's handling UTF-16-encoded data. """ [3] This argument can also be used to support the position that narrow builds should not support non-BMP characters. Later the discussion started resembling this thread when it went into a scholastic dispute over fine points in Unicode Standard terminology. :-) Then BDFL vetoed len(u"\U00012345") returning 1 on narrow builds. [4] I would be against that as well. I don't see len("\U00012345") == 2 as a big problem because application developers can simply avoid using \U literals if they don't want to support non-BMP characters. On the other hand, an option to warn users about non-BMP literals on a narrow build may be useful but it is easy to implement in lint-like tools. There were multiple suggestions for standard library additions to help application writers to deal with surrogate pairs, but as far as I can tell, nothing has been done in this area in the following two years. I don't think there is a recipe on how to fix legacy character-by-character processing loop such as for c in string: ... to make it iterate over code points consistently in wide and narrow builds. (Note that I am not asking for a grapheme iterator here. This is clearly an application level feature.) > So yes, wrap() and center() should be fixed. I opened an issue 10521 for that. [5] I am fully prepared to see it dismissed as "theoretical" and be closed with "won't fix" or linger indefinitely. Fixing it would most likely involve writing the second version of pad() utility function specifically for the narrow build. 
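For the missing recipe mentioned above (iterating over code points rather than code units), a minimal sketch follows. It assumes the text is available as a sequence of 16-bit code units given as integers, which is roughly what a narrow build stores; the function name iter_code_points is hypothetical, and unpaired surrogates are simply passed through unchanged, which is one of several possible policies.

```python
def iter_code_points(units):
    """Yield code points from an iterable of 16-bit code units (ints).

    A high surrogate followed by a low surrogate is combined into one
    non-BMP code point; unpaired surrogates are yielded as they are.
    """
    pending = None  # a high surrogate waiting for its low half
    for u in units:
        if pending is not None:
            if 0xDC00 <= u <= 0xDFFF:
                # Valid pair: combine both halves into one code point.
                yield 0x10000 + ((pending - 0xD800) << 10) + (u - 0xDC00)
                pending = None
                continue
            # The previous high surrogate was unpaired: emit it as is.
            yield pending
            pending = None
        if 0xD800 <= u <= 0xDBFF:
            pending = u
        else:
            yield u
    if pending is not None:
        yield pending


# "A", U+12345 as a surrogate pair, "B"
units = [0x41, 0xD808, 0xDF45, 0x42]
print([hex(cp) for cp in iter_code_points(units)])
```

A loop written against such an iterator sees three items for the sample input instead of four, regardless of how the code units are stored underneath.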
All examples I've seen in Python C code of dealing with surrogates
came with hand-coded #ifndef Py_UNICODE_WIDE fragments and no
user-friendly macros or APIs that would abstract it away.

A quick grep for maxunicode in the standard library revealed only one
case of "narrow-build aware" code:

    if sys.maxunicode != 65535:
        # XXX: negation does not work with big charsets
        return charset

See Lib/sre_compile.py.  Not exactly a model to follow.

To conclude, I feel that rather than trying to fully support non-BMP
characters as surrogate pairs in narrow builds, we should make it
easier for application developers to avoid them.  If abandoning
internal use of UTF-16 is not an option, I think we should at least
add an option for decoders that currently produce surrogate pairs to
treat non-BMP characters as errors and handle them according to user's
choice.

[1] http://mail.python.org/pipermail/i18n-sig/2001-June/001107.html
[2] http://mail.python.org/pipermail/python-dev/2008-July/080912.html
[3] http://mail.python.org/pipermail/python-dev/2008-July/080940.html
[4] http://mail.python.org/pipermail/python-dev/2008-July/080916.html
[5] http://bugs.python.org/issue10521

From fuzzyman at voidspace.org.uk Wed Nov 24 18:41:08 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Wed, 24 Nov 2010 17:41:08 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: 
References: <20101121034404.52924F20A@mail.python.org>
 <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk>
 <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com>
 <20101122172440.77d27ed5@pitrou.net>
 <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>
 <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk>
Message-ID: <4CED4E34.5060400@voidspace.org.uk>

On 24/11/2010 14:08, Nick Coghlan wrote:
> On Wed, Nov 24, 2010 at 10:30 PM, Michael Foord
> wrote:
>> Based on a non-exhaustive search, Python standard library modules currently
>> using integers for
constants:
> Thanks for that review. I think following up on the "NamedConstant"
> idea may make more sense than pursuing enums in their own right. That
> way we could get the debugging benefits on the Python side regardless
> of any type constraints on the value (e.g. needing to be an integer in
> order to interface to C code), without needing to design an enum API
> that suited all purposes.

Can you explain what you see as the difference?

I'm not particularly interested in type validation but I like the fact
that typical enum APIs allow you to group constants: the generated
constant class acts as a namespace for all the defined constants.

Are you just suggesting something along the lines of:

    class NamedConstant(int):
        def __new__(cls, name, val):
            return int.__new__(cls, val)

        def __init__(self, name, val):
            self._name = name

        def __repr__(self):
            # '<%s>' is assumed; the archived copy shows an empty ''
            # here, which would raise TypeError when formatted.
            return '<%s>' % self._name

    FOO = NamedConstant('FOO', 3)

In general the fewer features the better, but I'd like a few more
features than that. :-)

All the best,

Michael

> Cheers,
> Nick.
>

-- 
http://www.voidspace.org.uk/

READ CAREFULLY. By accepting and reading this email you agree, on behalf
of your employer, to release me from all obligations and waivers arising
from any and all NON-NEGOTIATED agreements, licenses, terms-of-service,
shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure,
non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have
entered into with your employer, its partners, licensors, agents and
assigns, in perpetuity, without prejudice to my ongoing rights and
privileges. You further represent that you have the authority to release
me from any BOGUS AGREEMENTS on behalf of your employer.

From mal at egenix.com Wed Nov 24 19:50:57 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 24 Nov 2010 19:50:57 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: 
References: <201011192123.14169.victor.stinner@haypocalc.com>
 <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de>
 <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de>
 <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20101121173825.B1BFB235977@kimball.webabinitio.net>
 <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com>
 <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4CED5E91.9070705@egenix.com>

Alexander Belopolsky wrote:
> To conclude, I feel that rather than trying to fully support non-BMP
> characters as surrogate pairs in narrow builds, we should make it
> easier for application developers to avoid them.

I don't understand what you're after here. Programmers can easily
avoid them by not using them :-)

> If abandoning
> internal use of UTF-16 is not an option, I think we should at least
> add an option for decoders that currently produce surrogate pairs to
> treat non-BMP characters as errors and handle them according to user's
> choice.

But what do you gain by doing this? You'd lose the round-trip
safety of those codecs and that's not a good thing.

Note that most text processing APIs in Python work based on code
units, which in most cases represent single code points, but in
some cases can also represent surrogates (both on UCS-2 and on
UCS-4 builds).

E.g. str.center(n) centers the string in a padded string that
is composed of n code units. Whether that operation will result
in a text that's centered visually on output is a completely
different story. The original string could contain surrogates,
it could also contain combining code points, so the visual
presentation of the result may very well not be centered at all;
it may not even appear as having the length n to the user.

Since we're not going to change the semantics of those APIs,
it is OK to not support padding with non-BMP code points on
UCS-2 builds.

Supporting such cases would only cause problems:

* if the methods would pad with surrogates, the resulting
  string would no longer have length n; breaking the
  assumption that len(str.center(n)) == n

* if the methods would pad with half the number of surrogates
  to make sure that len(str.center(n)) == n, the resulting
  output to e.g. a terminal would be further off than what
  you already have with surrogates and combining code points
  in the original string.

More on codecs supporting surrogates:

  http://mail.python.org/pipermail/python-dev/2008-July/080915.html

Perhaps it's time to reconsider a project I once started
but that never got off the ground:

  http://mail.python.org/pipermail/python-dev/2008-July/080911.html

Here's the pre-PEP:

  http://mail.python.org/pipermail/python-dev/2001-July/015938.html

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Nov 24 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From brett at python.org Wed Nov 24 20:04:01 2010
From: brett at python.org (Brett Cannon)
Date: Wed, 24 Nov 2010 11:04:01 -0800
Subject: [Python-Dev] [Python-checkins] r86720 - python/branches/py3k/Misc/ACKS
In-Reply-To: <4CEC4917.2070508@udel.edu>
References: <20101123203252.39BE7EE9CF@mail.python.org>
 <4CEC43A4.80907@netwok.org> <4CEC4917.2070508@udel.edu>
Message-ID: 

On Tue, Nov 23, 2010 at 15:07, Terry Reedy wrote:
>
>
> On 11/23/2010 5:43 PM, Éric Araujo wrote:
>>>
>>> Modified: python/branches/py3k/Misc/ACKS
>>>
>>> ==============================================================================
>>> --- python/branches/py3k/Misc/ACKS      (original)
>>> +++ python/branches/py3k/Misc/ACKS      Tue Nov 23 21:32:47 2010
>>> @@ -1,4 +1,4 @@
>>> -Acknowledgements
>>> +?Acknowledgements
>>
>> This change introduced a so-called UTF-8 BOM in the file.  Is
>> TortoiseSvn the culprit or a text editor?
>
> I used Notepad to edit the file, TortoiseSvn to commit, the same as I did
> for #9222, rev86702, Lib\idlelib\IOBinding.py, yesterday.
> If the latter is OK, perhaps *.py gets filtered better than misc. text
> files. I believe I have the config as specified in dev/faq.

Adding the BOM will be an editor thing, not a svn thing. Doing a
Google search for [ms notepad bom] shows that Notepad did the
"helpful", invisible edit.
-Brett > > [miscellany] > enable-auto-props = yes > > [auto-props] > * = svn:eol-style=native > *.c = svn:keywords=Id > *.h = svn:keywords=Id > *.py = svn:keywords=Id > *.txt = svn:keywords=Author Date Id Revision > > Terry > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/brett%40python.org > From tjreedy at udel.edu Wed Nov 24 20:25:17 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 24 Nov 2010 14:25:17 -0500 Subject: [Python-Dev] [Python-checkins] r86720 - python/branches/py3k/Misc/ACKS In-Reply-To: References: <20101123203252.39BE7EE9CF@mail.python.org> <4CEC43A4.80907@netwok.org> <4CEC4917.2070508@udel.edu> Message-ID: On 11/24/2010 2:04 PM, Brett Cannon wrote: > On Tue, Nov 23, 2010 at 15:07, Terry Reedy wrote: >> I used Notepad to edit the file, TortoiseSvn to commit, the same as I did >> for #9222, rev86702, Lib\idlelib\IOBinding.py, yesterday. >> If the latter is OK, perhaps *.py gets filtered better than misc. text >> files. I believe I have the config as specified in dev/faq. > > Adding the BOM will be an editor thing, not a svn thing. Doing a > Google search for [ms notepad bom] shows that Notepad did the > "helpful", invisible edit. So I presume it did the same with IOBinding.py. Does *.py get filtered is a way that could be extended to no-extention files? Do *.txt files get BOM filtered off? Should all text files in repository have some extension (default .txt)? More to the point, can better filtering be added to the new hg repository? Or can a local Windows hg setup have such filtering on local commits before pushing? I know now that I could always edit with IDLE's editor, but it is a lot easier to right click and select edit than it is to run thru the directory tree in an open dialog. 
And of course, since the pseudo-BOM addition is undocumented within notepad itself, and probably other editors, it is easy to not know. -- Terry Jan Reedy From g.brandl at gmx.net Wed Nov 24 21:04:40 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 24 Nov 2010 21:04:40 +0100 Subject: [Python-Dev] [Python-checkins] r86720 - python/branches/py3k/Misc/ACKS In-Reply-To: References: <20101123203252.39BE7EE9CF@mail.python.org> <4CEC43A4.80907@netwok.org> <4CEC4917.2070508@udel.edu> Message-ID: Am 24.11.2010 20:25, schrieb Terry Reedy: > On 11/24/2010 2:04 PM, Brett Cannon wrote: >> On Tue, Nov 23, 2010 at 15:07, Terry Reedy wrote: > >>> I used Notepad to edit the file, TortoiseSvn to commit, the same as I did >>> for #9222, rev86702, Lib\idlelib\IOBinding.py, yesterday. >>> If the latter is OK, perhaps *.py gets filtered better than misc. text >>> files. I believe I have the config as specified in dev/faq. >> >> Adding the BOM will be an editor thing, not a svn thing. Doing a >> Google search for [ms notepad bom] shows that Notepad did the >> "helpful", invisible edit. > > So I presume it did the same with IOBinding.py. Does *.py get filtered > is a way that could be extended to no-extention files? Do *.txt files > get BOM filtered off? Should all text files in repository have some > extension (default .txt)? > > More to the point, can better filtering be added to the new hg > repository? Or can a local Windows hg setup have such filtering on local > commits before pushing? Of course it can; it's just a matter of writing the respective hooks. What we *can* do in any case is to check for UTF-8 "BOMs" server-side in the whitespace checking hook. > I know now that I could always edit with IDLE's editor, but it is a lot > easier to right click and select edit than it is to run thru the > directory tree in an open dialog. And of course, since the pseudo-BOM > addition is undocumented within notepad itself, and probably other > editors, it is easy to not know. 
It should show up as an invisible change in the first line of a file when you look at a "svn diff". (It is a very good practice to look at a diff before committing anyway.) Georg From alexander.belopolsky at gmail.com Wed Nov 24 21:06:25 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 24 Nov 2010 15:06:25 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <4CED5E91.9070705@egenix.com> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CED5E91.9070705@egenix.com> Message-ID: On Wed, Nov 24, 2010 at 1:50 PM, M.-A. Lemburg wrote: .. >> add an option for decoders that currently produce surrogate pairs to >> treat non-BMP characters as errors and handle them according to user's >> choice. > > But what do you gain by doing this ? You'd lose the round-trip > safety of those codecs and that's not a good thing. > Any non-trivial text processing is likely to be broken in presence of surrogates. Producing them on input is just trading known issue for an unknown one. Processing surrogate pairs in python code is hard. Software that has to support non-BMP characters will most likely be written for a wide build and contain subtle bugs when run under a narrow build. Note that my latest proposal does not abolish surrogates outright. Users who want them can still use something like "surrogateescape" error handler for non-BMP characters. > Since we're not going change the semantics of those APIs, > it is OK to not support padding with non-BMP code points on > UCS-2 builds. 
> Well, I think more users are willing to accept slightly misaligned text in their web-app logs than those willing to cope with Traceback (most recent call last): ... TypeError: The fill character must be exactly one character long there. Yes, allowing non-trusted users to specify fill character is unlikely, but it is quite likely that naive slicing or iteration over string units would result in Traceback (most recent call last): ... UnicodeEncodeError: 'utf-8' codec can't encode character '\ud800' in position 0: surrogates not allowed > Supporting such cases would only cause problems: > > * if the methods would pad with surrogates, the resulting > ?string would no longer have length n; breaking the > ?assumption that len(str.center(n)) == n > I agree, but how is this different from breaking the assumption that len(chr(i)) == 1? > * if the methods would pad with half the number of surroagtes > ?to make sure that len(str.center(n)) == n, the resulting > ?output to e.g. a terminal would be further off, than what > ?you already have with surrogates and combining code points > ?in the original string. > I agree again. What I suggested on the tracker, supporting non-BMP characters in narrow builds should mean that library functions given input with the same UCS-4 encoding should produce output with the same UCS-4 encoding. > Perhaps it's time to reconsider a project I once started > but that never got off the ground: > > ?http://mail.python.org/pipermail/python-dev/2008-July/080911.html > > Here's the pre-PEP: > > ?http://mail.python.org/pipermail/python-dev/2001-July/015938.html I agree again, but I feel that exposing code units rather than code points at the Python string level takes us back to 2.x days of mixing bytes and strings. Let me quote Guido circa 2001 again: """ ... if we had wanted to use a variable-lenth internal representation, we should have picked UTF-8 way back, like Perl did. 
Moving to a UTF-16-based internal representation now will give us all the problems of the Perl choice without any of the benefits. """ I don't understand what changed since 2001 that made this argument invalid. I note that an opinion has been raised on this thread that if we want compressed internal representation for strings, we should use UTF-8. I tend to agree, but UTF-8 has been repeatedly rejected as too hard to implement. What makes UTF-16 easier than UTF-8? Only the fact that you can ignore bugs longer, in my view. From g.brandl at gmx.net Wed Nov 24 21:24:49 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 24 Nov 2010 21:24:49 +0100 Subject: [Python-Dev] [Preview] Comments and change proposals on documentation Message-ID: Hi, at , you can look at a version of the 3.2 docs that has the upcoming commenting feature. JavaScript is mandatory. I've switched on anonymous comments for testing, but usually at least comments from anonymous users can be moderated. Be sure to test the "propose a change" feature too. Login currently allows OpenID exclusively. Credits go to Jacob Mason, whose GSOC project is responsible for almost all of what you see there. [1] Please test on a smaller page, such as , there is currently a speed issue with larger pages. (Helpful tips from JS experts are welcome.) Other things I have to do before this can go live: * reuse existing logins from either wiki or tracker? * (re)Captcha integration for anonymous comments * easier moderation (currently emails are sent on new comments) * facility for (semi)automatic applying of proposals (once Hg is live, this should be easy to do due to the separation between commit and merge) * allow commenting on code blocks (figure out where to place the "bubble") Any feedback is appreciated (I'd suggest mailing it to doc-SIG only, to avoid cluttering up python-dev). Have fun, Georg [1] The source for the webapp is at , but most of the functionality is implemented in Sphinx trunk. 
From anurag.chourasia at gmail.com Wed Nov 24 22:01:32 2010 From: anurag.chourasia at gmail.com (Anurag Chourasia) Date: Thu, 25 Nov 2010 02:31:32 +0530 Subject: [Python-Dev] collect2: library libpython2.6 not found while building extensions (--enable-shared) Message-ID: All, When I configure python to enable shared libraries, none of the extensions are getting built during the make step due to this error. building 'cStringIO' extension gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/u01/home/apli/wm/GDD/Python-2.6.6/./Include -I. -IInclude -I./Include -I/opt/freeware/include -I/opt/freeware/include/readline -I/opt/freeware/include/ncurses -I/usr/local/include -I/u01/home/apli/wm/GDD/Python-2.6.6/Include -I/u01/home/apli/wm/GDD/Python-2.6.6 -c /u01/home/apli/wm/GDD/Python-2.6.6/Modules/cStringIO.c -o build/temp.aix-5.3-2.6/u01/home/apli/wm/GDD/Python-2.6.6/Modules/cStringIO.o ./Modules/ld_so_aix gcc -pthread -bI:Modules/python.exp build/temp.aix-5.3-2.6/u01/home/apli/wm/GDD/Python-2.6.6/Modules/cStringIO.o -L/usr/local/lib *-lpython2.6* -o build/lib.aix-5.3-2.6/cStringIO.so *collect2: library libpython2.6 not found* building 'cPickle' extension gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/u01/home/apli/wm/GDD/Python-2.6.6/./Include -I. 
-IInclude -I./Include -I/opt/freeware/include -I/opt/freeware/include/readline -I/opt/freeware/include/ncurses -I/usr/local/include -I/u01/home/apli/wm/GDD/Python-2.6.6/Include -I/u01/home/apli/wm/GDD/Python-2.6.6 -c /u01/home/apli/wm/GDD/Python-2.6.6/Modules/cPickle.c -o build/temp.aix-5.3-2.6/u01/home/apli/wm/GDD/Python-2.6.6/Modules/cPickle.o ./Modules/ld_so_aix gcc -pthread -bI:Modules/python.exp build/temp.aix-5.3-2.6/u01/home/apli/wm/GDD/Python-2.6.6/Modules/cPickle.o -L/usr/local/lib *-lpython2.6* -o build/lib.aix-5.3-2.6/cPickle.so *collect2: library libpython2.6 not found* This is on AIX 5.3, GCC 4.2, Python 2.6.6 I can confirm that there is a libpython2.6.a file in the top level directory from where I am doing the configure/make etc Here are the options supplied to the configure command ./configure --enable-shared --disable-ipv6 --with-gcc=gcc CPPFLAGS="-I /opt/freeware/include -I /opt/freeware/include/readline -I /opt/freeware/include/ncurses" Please guide me in getting past this error. Thanks for your help on this. Regards, Anurag -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin at v.loewis.de Wed Nov 24 23:13:50 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Wed, 24 Nov 2010 23:13:50 +0100 Subject: [Python-Dev] [Python-checkins] r86720 - python/branches/py3k/Misc/ACKS In-Reply-To: References: <20101123203252.39BE7EE9CF@mail.python.org> <4CEC43A4.80907@netwok.org> <4CEC4917.2070508@udel.edu> Message-ID: <4CED8E1E.5050400@v.loewis.de> > So I presume it did the same with IOBinding.py. No. This file contains only ASCII characters, so notepad has decided to not add the BOM. 
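
For anyone who wants to check a working copy for this before committing, here is a minimal sketch of such a test. The helper names has_utf8_bom and strip_utf8_bom are hypothetical (they are not part of any svn or hg tooling mentioned in this thread); the only library feature relied on is codecs.BOM_UTF8 from the standard library.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file at `path` starts with the UTF-8 signature EF BB BF."""
    with open(path, 'rb') as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def strip_utf8_bom(path):
    """Rewrite the file without its leading UTF-8 signature.

    Returns True if a signature was found and removed, False otherwise.
    """
    with open(path, 'rb') as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    with open(path, 'wb') as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True
```

Working on raw bytes keeps the check independent of whatever encoding an editor believes the file has, which seems to be the safer choice for a pre-commit check.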
Regards, Martin From dreamingforward at gmail.com Thu Nov 25 00:38:01 2010 From: dreamingforward at gmail.com (average) Date: Wed, 24 Nov 2010 16:38:01 -0700 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> Message-ID: Is immutability a general need that should have general solution? By generalizing the idea to lists/tuples, set/frozenset, dicts, and strings (for example), it seems one could simplify the container classes, eliminate code complexity, and perhaps improve resource utilization. mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Nov 25 00:41:58 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 25 Nov 2010 00:41:58 +0100 Subject: [Python-Dev] r86731 - in python/branches/py3k: Lib/distutils/command/install.py Lib/distutils/sysconfig.py Lib/sysconfig.py Makefile.pre.in Misc/python.pc.in configure configure.in References: <20101124194347.C5C86EEA56@mail.python.org> Message-ID: <20101125004158.32b1ceaa@pitrou.net> On Wed, 24 Nov 2010 20:43:47 +0100 (CET) barry.warsaw wrote: > Author: barry.warsaw > Date: Wed Nov 24 20:43:47 2010 > New Revision: 86731 > > Log: > Final patch for issue 9807. 
This seems to have broken compilation under Windows: Build started: Project: ssl, Configuration: Debug|Win32 Performing Makefile project actions Traceback (most recent call last): File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\site.py", line 519, in main() File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\site.py", line 507, in main known_paths = addusersitepackages(known_paths) File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\site.py", line 253, in addusersitepackages user_site = getusersitepackages() File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\site.py", line 228, in getusersitepackages user_base = getuserbase() # this will also set USER_BASE File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\site.py", line 218, in getuserbase USER_BASE = get_config_var('userbase') File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\sysconfig.py", line 586, in get_config_var return get_config_vars().get(name) File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\sysconfig.py", line 478, in get_config_vars _CONFIG_VARS['abiflags'] = sys.abiflags AttributeError: 'module' object has no attribute 'abiflags' Regards Antoine. From barry at python.org Thu Nov 25 00:50:25 2010 From: barry at python.org (Barry Warsaw) Date: Wed, 24 Nov 2010 18:50:25 -0500 Subject: [Python-Dev] r86731 - in python/branches/py3k: Lib/distutils/command/install.py Lib/distutils/sysconfig.py Lib/sysconfig.py Makefile.pre.in Misc/python.pc.in configure configure.in In-Reply-To: <20101125004158.32b1ceaa@pitrou.net> References: <20101124194347.C5C86EEA56@mail.python.org> <20101125004158.32b1ceaa@pitrou.net> Message-ID: <20101124185025.6cb67127@mission> On Nov 25, 2010, at 12:41 AM, Antoine Pitrou wrote: >On Wed, 24 Nov 2010 20:43:47 +0100 (CET) >barry.warsaw wrote: >> Author: barry.warsaw >> Date: Wed Nov 24 20:43:47 2010 >> New Revision: 86731 >> >> Log: >> Final patch for issue 9807. 
> >This seems to have broken compilation under Windows: > >Build started: Project: ssl, Configuration: Debug|Win32 >Performing Makefile project actions >Traceback (most recent call last): > File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\site.py", line 519, in > main() > File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\site.py", line 507, in main > known_paths = addusersitepackages(known_paths) > File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\site.py", line 253, in addusersitepackages > user_site = getusersitepackages() > File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\site.py", line 228, in getusersitepackages > user_base = getuserbase() # this will also set USER_BASE > File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\site.py", line 218, in getuserbase > USER_BASE = get_config_var('userbase') > File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\sysconfig.py", line 586, in get_config_var > return get_config_vars().get(name) > File "d:\cygwin\home\db3l\buildarea\3.x.bolen-windows\build\lib\sysconfig.py", line 478, in get_config_vars > _CONFIG_VARS['abiflags'] = sys.abiflags >AttributeError: 'module' object has no attribute 'abiflags' As discussed on IRC, _CONFIG_VARS['abiflags'] = '' if sys.abiflags is not defined. Amaury is going to test that. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From greg.ewing at canterbury.ac.nz Thu Nov 25 01:19:37 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 25 Nov 2010 13:19:37 +1300 Subject: [Python-Dev] len(chr(i)) = 2? 
In-Reply-To: <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> Message-ID: <4CEDAB99.2000005@canterbury.ac.nz> On 24/11/10 13:22, James Y Knight wrote: > Instead, provide bidirectional iterators which can traverse the string by byte, > codepoint, or by grapheme Maybe it would be a good idea to add some iterators like this to Python. (Or has the time machine beaten me there?) -- Greg From stephen at xemacs.org Thu Nov 25 03:17:44 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 25 Nov 2010 11:17:44 +0900 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CED5E91.9070705@egenix.com> Message-ID: <87bp5eb0zb.fsf@uwakimon.sk.tsukuba.ac.jp> Alexander Belopolsky writes: > Any non-trivial text processing is likely to be broken in presence of > surrogates. If you're worried about this, write a UCS-2-producing codec that rejects surrogates or stuffs them into the private zone of the BMP. Maybe such a codec should be default, but so far nobody seems to want one enough; they want UTF-16 even though they know it's wrong. 
One of the things that makes the 16-bit code unit attractive to me is that the options for working around the variable-width nature of UTF-16 (without actually implementing conformance to UTF-16 in internal operations!) are many. If you use octets as code units, you don't have such options: you have to do it right. > Processing surrogate pairs in python code is hard. Sure, but as James Knight and MAL point out, so is processing compose characters, and those errors will go undetected in your proposals, even with a strict UCS-2 definition. What can you do? Banning composing characters isn't going to fly! > Yes, allowing non-trusted users to specify fill character is unlikely, > but it is quite likely that naive slicing or iteration over string > units would result in > > Traceback (most recent call last): Naive slicing yes, but naive iteration (ie, iteration that consumes the whole string, or up to a known character, rather than up to a specified position) is highly unlikely to result in such a traceback. It is precisely that property (non-BMP characters get passed through unchanged, or ignored) that makes extension to non-BMP code points attractive. > I agree again, but I feel that exposing code units rather than code > points at the Python string level takes us back to 2.x days of mixing > bytes and strings. It does, but there's a difference. With bytes as UTF-8, only ASCII values have defined semantics in Unicode. The rest have semantics that is context-dependent, and they are frequent in any non-English processing and many English use cases (math symbols, correctly- oriented punctuation). With 16-bit code units, all values have well- defined semantics in Unicode, and non-characters are going to be extremely rare in the vast majority of use cases. IOW, you can think of Python as a UCS-2 device processing characters, and let surrounding UTF-16 processors deal with the errors. > Let me quote Guido circa 2001 again: > > """ > ... 
if we had wanted to use a > variable-lenth internal representation, we should have picked UTF-8 > way back, like Perl did. Moving to a UTF-16-based internal > representation now will give us all the problems of the Perl choice > without any of the benefits. > """ > > I don't understand what changed since 2001 that made this argument > invalid. Nothing. The internal representation of Python is UCS-2, not UTF-16. People who want to think otherwise are kidding themselves. The presence of surrogates is not sufficient to call something UTF-16. Preserving the Unicode code points through any builtin operations is a necessary condition, and Python doesn't do that. *However*, in my opinion, it's not a big deal to allow surrogates in UCS-2 a la ISO 10646-1:1996. That lets people who want a quick and dirty way to handle BMP text that *might* (but usually won't) contain some non-BMP characters go a long way fast. "Although practicality beats purity." > I note that an opinion has been raised on this thread that > if we want compressed internal representation for strings, we should > use UTF-8. I tend to agree, but UTF-8 has been repeatedly rejected as > too hard to implement. What makes UTF-16 easier than UTF-8? Only the > fact that you can ignore bugs longer, in my view. That's mostly true. My guess is that we can probably ignore those bugs for as long as it takes someone to write the higher-level libraries that James suggests and MAL has actually proposed and started a PEP for. From greg.ewing at canterbury.ac.nz Thu Nov 25 03:35:50 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 25 Nov 2010 15:35:50 +1300 Subject: [Python-Dev] len(chr(i)) = 2? 
In-Reply-To: <87mxozayam.fsf@uwakimon.sk.tsukuba.ac.jp> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> <87sjyrbftz.fsf@uwakimon.sk.tsukuba.ac.jp> <635C265A-90A8-4B92-A65C-59EF3E8EFD68@twistedmatrix.com> <87oc9fb97b.fsf@uwakimon.sk.tsukuba.ac.jp> <3C1ADB64-63F3-4165-926D-EDE9846E0DBD@fuhm.net> <87mxozayam.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CEDCB86.9030506@canterbury.ac.nz> On 24/11/10 22:03, Stephen J. Turnbull wrote: > But > if you actually need to remember positions, or regions, to jump to > later or to communicate to other code that manipulates them, doing > this stuff the straightforward way (just copying the whole iterator > object to hang on to its state) becomes expensive. If the internal representation of a text pointer (I won't call it an iterator because that means something else in Python) is a byte offset or something similar, it shouldn't take up any more space than a Python int, which is what you'd be using anyway if you represented text positions by grapheme indexes or whatever. If you want the text pointer to also remember which string it points into, it'll be a bit bigger, but again, no bigger than you would need to get the same functionality using a grapheme index plus a reference to the original string. Probably smaller, because it would all be encapsulated in one object. So I don't really see what you're arguing for here. How do *you* think positions in unicode strings should be represented? 
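A rough sketch of the kind of text pointer described above, assuming the text is held as well-formed UTF-8 bytes. The names TextPointer, advance, at_end and chars_from are hypothetical and not an existing Python API; the only point is that the stored state is a plain int plus a reference to the buffer.

```python
class TextPointer:
    """Hypothetical position in UTF-8 text: buffer reference plus byte offset."""

    __slots__ = ("data", "offset")

    def __init__(self, data, offset=0):
        self.data = data      # bytes assumed to hold well-formed UTF-8
        self.offset = offset  # byte offset of the start of a code point

    def _width(self):
        # Length of the UTF-8 sequence, read from its lead byte.
        lead = self.data[self.offset]
        if lead < 0x80:
            return 1
        if lead >> 5 == 0b110:
            return 2
        if lead >> 4 == 0b1110:
            return 3
        if lead >> 3 == 0b11110:
            return 4
        raise ValueError("offset does not point at the start of a code point")

    def char(self):
        """Return the code point under the pointer as a one-character str."""
        end = self.offset + self._width()
        return self.data[self.offset:end].decode("utf-8")

    def advance(self):
        """Return a new pointer one code point further on."""
        return TextPointer(self.data, self.offset + self._width())

    def at_end(self):
        return self.offset >= len(self.data)


def chars_from(pointer):
    """Yield code points from the pointer to the end of the buffer."""
    while not pointer.at_end():
        yield pointer.char()
        pointer = pointer.advance()
```

A pointer that lands inside a multibyte sequence is rejected by _width() rather than silently realigned; whether to raise or to realign is a design choice this sketch does not settle.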
--
Greg

From greg.ewing at canterbury.ac.nz Thu Nov 25 04:19:33 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 25 Nov 2010 16:19:33 +1300
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To:
References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4CEDD5C5.9050306@canterbury.ac.nz>

On 25/11/10 06:37, Alexander Belopolsky wrote:

> I don't think there is a recipe on how to fix legacy
> character-by-character processing loop such as
>
> for c in string:
>     ...
>
> to make it iterate over code points consistently in wide and narrow
> builds.

A couple of possibilities:

1) Make things so that 'for c in string' does actually iterate
over characters rather than code units. This could break existing
code, though.

2) Provide some things like

    for c in string.chars():
        ...

    for c in string.graphemes():
        ...

where chars() and graphemes() return appropriate iterators.
(Or possibly iterable views, but that would raise the expectation
that the views could also be randomly indexed by char or grapheme,
which we probably wouldn't want to support.)
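For illustration only, option 2 could be approximated today with a free function instead of a str method. The name code_points is hypothetical. It walks the string by index and joins each high/low surrogate pair into one item, so the loop body sees the same items whether the string stores code units or code points. On builds where a str item is already a full code point, only stray surrogates are affected.

```python
def code_points(s):
    """Yield the characters of s, joining each surrogate pair into one item."""
    i, n = 0, len(s)
    while i < n:
        unit = s[i]
        is_high = 0xD800 <= ord(unit) <= 0xDBFF
        if is_high and i + 1 < n and 0xDC00 <= ord(s[i + 1]) <= 0xDFFF:
            # Combine the pair into the code point it encodes.
            high = ord(unit) - 0xD800
            low = ord(s[i + 1]) - 0xDC00
            yield chr(0x10000 + (high << 10) + low)
            i += 2
        else:
            # BMP character, or a lone surrogate passed through unchanged.
            yield unit
            i += 1


for c in code_points("a\ud83d\ude00b"):
    print(hex(ord(c)) if len(c) == 1 else c)
```

Grapheme iteration is deliberately left out: it needs the Unicode segmentation rules and character property data, which is well beyond a sketch.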
-- Greg From greg.ewing at canterbury.ac.nz Thu Nov 25 04:46:53 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 25 Nov 2010 16:46:53 +1300 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> Message-ID: <4CEDDC2D.204@canterbury.ac.nz> On 25/11/10 12:38, average wrote: > Is immutability a general need that should have general solution? I don't think it really generalizes. Tuples are not just frozen lists, for example -- they have a different internal structure that's more efficient to create and access. -- Greg From stephen at xemacs.org Thu Nov 25 04:55:40 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 25 Nov 2010 12:55:40 +0900 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <4CEDCB86.9030506@canterbury.ac.nz> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> <87sjyrbftz.fsf@uwakimon.sk.tsukuba.ac.jp> <635C265A-90A8-4B92-A65C-59EF3E8EFD68@twistedmatrix.com> <87oc9fb97b.fsf@uwakimon.sk.tsukuba.ac.jp> <3C1ADB64-63F3-4165-926D-EDE9846E0DBD@fuhm.net> <87mxozayam.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEDCB86.9030506@canterbury.ac.nz> Message-ID: <87ipzm6oqr.fsf@uwakimon.sk.tsukuba.ac.jp> Greg Ewing writes: > On 24/11/10 22:03, Stephen J. 
Turnbull wrote: > > But > > if you actually need to remember positions, or regions, to jump to > > later or to communicate to other code that manipulates them, doing > > this stuff the straightforward way (just copying the whole iterator > > object to hang on to its state) becomes expensive. > > If the internal representation of a text pointer (I won't call it > an iterator because that means something else in Python) is a byte > offset or something similar, it shouldn't take up any more space > than a Python int, which is what you'd be using anyway if you > represented text positions by grapheme indexes or whatever. That's not necessarily true. Eg, in Emacs ("there you go again"), Lisp integers are not only immediate (saving one pointer), but the type is encoded in the lower bits, so that there is no need for a type pointer -- the representation is smaller than the opaque marker type. Altogether, up to 8 of 12 bytes saved on a 32-bit platform, or 16 of 24 bytes on a 64-bit platform. In Python it's true that markers can use the same data structure as integers and simply provide different methods, and it's arguable that Python's design is better. But if you use bytes internally, then you have problems. Do you expose that byte value to the user? Can users (programmers using the language and end users) specify positions in terms of byte values? If so, what do you do if the user specifies a byte value that points into a multibyte character? What if the user wants to specify position by number of characters? Can you translate efficiently? As I say elsewhere, it's possible that there really never is a need to efficiently specify an absolute position in a large text as a character (grapheme, whatever) count. But I think it would be hard to implement an efficient text-processing *language*, eg, a Python module for *full conformance* in handling Unicode, on top of UTF-8. 
Any time you have an algorithm that requires efficient access to arbitrary text positions, you'll spend all your skull sweat fighting the representation. At least, that's been my experience with Emacsen. > So I don't really see what you're arguing for here. How do > *you* think positions in unicode strings should be represented? I think what users should see is character positions, and they should be able to specify them numerically as well as via an opaque marker object. I don't care whether that position is represented as bytes or characters internally, except that the experience of Emacsen is that representation as byte positions is both inefficient and fragile. The representation as character positions is more robust but slightly more inefficient. From alexander.belopolsky at gmail.com Thu Nov 25 05:37:33 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 24 Nov 2010 23:37:33 -0500 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: <87bp5eb0zb.fsf@uwakimon.sk.tsukuba.ac.jp> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CED5E91.9070705@egenix.com> <87bp5eb0zb.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Nov 24, 2010 at 9:17 PM, Stephen J. Turnbull wrote: .. > ?> I note that an opinion has been raised on this thread that > ?> if we want compressed internal representation for strings, we should > ?> use UTF-8. ?I tend to agree, but UTF-8 has been repeatedly rejected as > ?> too hard to implement. ?What makes UTF-16 easier than UTF-8? ?Only the > ?> fact that you can ignore bugs longer, in my view. > > That's mostly true. 
My guess is that we can probably ignore those
> bugs for as long as it takes someone to write the higher-level
> libraries that James suggests and MAL has actually proposed and
> started a PEP for.
>

As far as I can tell, that PEP generated a grand total of one comment in
nine years. This may or may not be indicative of how far away we are
from seeing it implemented. :-)

As for the UTF-8 vs. UCS-2/4 debate, I have an idea that may be even
more far-fetched. Once upon a time, Python Unicode strings supported
the buffer protocol and would lazily fill an internal buffer with bytes
in the default encoding. In 3.x the default encoding has been fixed as
UTF-8 and buffer protocol support was removed from strings, but the
internal buffer caching the (now UTF-8) encoded representation remained.
Maybe we can now implement the defenc logic in reverse. Recall that
strings are stored as UCS-2/4 sequences, but once a buffer is requested
in 2.x Python code or a char* is obtained via
_PyUnicode_AsStringAndSize() at the C level in 3.x, an internal buffer
is filled with UTF-8 bytes and defenc is set to point to that buffer.

So the idea is for strings to store their data as a UTF-8 buffer
pointed to by defenc upon construction. If an application uses string
indexing, UTF-8 only strings will lazily fill their UCS-2/4 buffer.
Proper, Unicode-aware algorithms such as grapheme, word or line
iteration, or simple operations such as concatenation, search or
substitution, would operate directly on defenc buffers. Presumably
over time fewer and fewer applications would use code unit indexing
that requires the UCS-2/4 buffer, and eventually Python strings could
stop supporting indexing altogether, just like they stopped supporting
the buffer protocol in 3.x.
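The proposal above can be modelled in pure Python, purely as an illustration. LazyText and its attributes are hypothetical names and have nothing to do with the real C-level defenc field. The sketch assumes the UTF-8 bytes are the primary storage and that the fixed-width form is built only when indexing is first requested.

```python
class LazyText:
    """Toy model: UTF-8 is the primary storage, fixed-width form is lazy."""

    def __init__(self, text):
        self._utf8 = text.encode("utf-8")  # always present
        self._points = None                # filled on first indexing

    @classmethod
    def _from_utf8(cls, data):
        obj = cls.__new__(cls)
        obj._utf8 = data
        obj._points = None
        return obj

    def __add__(self, other):
        # Concatenation works on the UTF-8 bytes directly.
        return LazyText._from_utf8(self._utf8 + other._utf8)

    def __contains__(self, needle):
        # Substring search is safe on UTF-8 bytes: a well-formed sequence
        # cannot match starting in the middle of another character.
        return needle._utf8 in self._utf8

    def __getitem__(self, index):
        # Only indexing pays for the fixed-width representation.
        if self._points is None:
            self._points = self._utf8.decode("utf-8")
        return self._points[index]

    def __str__(self):
        return self._utf8.decode("utf-8")


a = LazyText("h\u00e9llo ")
b = LazyText("w\u00f6rld")
c = a + b
print(LazyText("w\u00f6r") in c, c._points is None)
print(c[1], c._points is None)
```

The trade-off is visible even in the toy: concatenation and search never touch the fixed-width buffer, while the first index operation costs a full decode and doubles the memory held by that string.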
From tjreedy at udel.edu Thu Nov 25 06:22:01 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 25 Nov 2010 00:22:01 -0500
Subject: [Python-Dev] [Python-checkins] r86720 - python/branches/py3k/Misc/ACKS
In-Reply-To:
References: <20101123203252.39BE7EE9CF@mail.python.org> <4CEC43A4.80907@netwok.org> <4CEC4917.2070508@udel.edu>
Message-ID:

On 11/24/2010 3:04 PM, Georg Brandl wrote:

>>> Adding the BOM will be an editor thing, not a svn thing. Doing a

> It should show up as an invisible change in the first line of a file when you
> look at a "svn diff". (It is a very good practice to look at a diff before
> committing anyway.)

It does show up, and yes I agree. That should be in dev/faq if not already.

--
Terry Jan Reedy

From tjreedy at udel.edu Thu Nov 25 06:23:27 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 25 Nov 2010 00:23:27 -0500
Subject: [Python-Dev] [Python-checkins] r86720 - python/branches/py3k/Misc/ACKS
In-Reply-To: <4CED8E1E.5050400@v.loewis.de>
References: <20101123203252.39BE7EE9CF@mail.python.org> <4CEC43A4.80907@netwok.org> <4CEC4917.2070508@udel.edu> <4CED8E1E.5050400@v.loewis.de>
Message-ID:

On 11/24/2010 5:13 PM, "Martin v. Löwis" wrote:

>> So I presume it did the same with IOBinding.py.
>
> No. This file contains only ASCII characters, so notepad has decided
> to not add the BOM.

Or it somehow got removed from the .py file. I tried with another .py
file (and reverted!) and the diff showed the invisible change to the
first line that Georg predicted.

--
Terry Jan Reedy

From tjreedy at udel.edu Thu Nov 25 06:39:30 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 25 Nov 2010 00:39:30 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CED5E91.9070705@egenix.com> Message-ID: On 11/24/2010 3:06 PM, Alexander Belopolsky wrote: > Any non-trivial text processing is likely to be broken in presence of > surrogates. Producing them on input is just trading known issue for > an unknown one. Processing surrogate pairs in python code is hard. > Software that has to support non-BMP characters will most likely be > written for a wide build and contain subtle bugs when run under a > narrow build. Note that my latest proposal does not abolish > surrogates outright. Users who want them can still use something like > "surrogateescape" error handler for non-BMP characters. It seems to me that what you are asking for is an alternate, optional, utf-8-bmp codec that would raise an error, in either direction, for non-bmp chars. Then, as you suggest, if one is not prepared for surrogates, they are not allowed. -- Terry Jan Reedy From anurag.chourasia at gmail.com Thu Nov 25 10:24:34 2010 From: anurag.chourasia at gmail.com (Anurag Chourasia) Date: Thu, 25 Nov 2010 14:54:34 +0530 Subject: [Python-Dev] AIX 5.3 - Enabling Shared Library Support Vs Extensions Message-ID: All, When I configure python to enable shared libraries, none of the extensions are getting built during the make step due to this error. building 'cStringIO' extension gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/u01/home/apli/wm/GDD/Python-2.6.6/./Include -I. 
-IInclude -I./Include -I/opt/freeware/include -I/opt/freeware/include/readline -I/opt/freeware/include/ncurses -I/usr/local/include -I/u01/home/apli/wm/GDD/Python-2.6.6/Include -I/u01/home/apli/wm/GDD/Python-2.6.6 -c /u01/home/apli/wm/GDD/Python-2.6.6/Modules/cStringIO.c -o build/temp.aix-5.3-2.6/u01/home/apli/wm/GDD/Python-2.6.6/Modules/cStringIO.o ./Modules/ld_so_aix gcc -pthread -bI:Modules/python.exp build/temp.aix-5.3-2.6/u01/home/apli/wm/GDD/Python-2.6.6/Modules/cStringIO.o -L/usr/local/lib *-lpython2.6* -o build/lib.aix-5.3-2.6/cStringIO.so *collect2: library libpython2.6 not found* building 'cPickle' extension gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/u01/home/apli/wm/GDD/Python-2.6.6/./Include -I. -IInclude -I./Include -I/opt/freeware/include -I/opt/freeware/include/readline -I/opt/freeware/include/ncurses -I/usr/local/include -I/u01/home/apli/wm/GDD/Python-2.6.6/Include -I/u01/home/apli/wm/GDD/Python-2.6.6 -c /u01/home/apli/wm/GDD/Python-2.6.6/Modules/cPickle.c -o build/temp.aix-5.3-2.6/u01/home/apli/wm/GDD/Python-2.6.6/Modules/cPickle.o ./Modules/ld_so_aix gcc -pthread -bI:Modules/python.exp build/temp.aix-5.3-2.6/u01/home/apli/wm/GDD/Python-2.6.6/Modules/cPickle.o -L/usr/local/lib *-lpython2.6* -o build/lib.aix-5.3-2.6/cPickle.so *collect2: library libpython2.6 not found* This is on AIX 5.3, GCC 4.2, Python 2.6.6 I can confirm that there is a libpython2.6.a file in the top level directory from where I am doing the configure/make etc Here are the options supplied to the configure command ./configure --enable-shared --disable-ipv6 --with-gcc=gcc CPPFLAGS="-I /opt/freeware/include -I /opt/freeware/include/readline -I /opt/freeware/include/ncurses" Please guide me in getting past this error. Thanks for your help on this. Regards, Anurag -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From v+python at g.nevcal.com Thu Nov 25 10:34:51 2010 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 25 Nov 2010 01:34:51 -0800 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CEC2759.40203@g.nevcal.com> References: <20101121034404.52924F20A@mail.python.org> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <1290535602.3642.87.camel@localhost.localdomain> <4CEC2759.40203@g.nevcal.com> Message-ID: <4CEE2DBB.3040502@g.nevcal.com> So the following code defines constants with associated names that get put in the repr. I'm still a Python newbie in some areas, particularly classes and metaclasses, maybe more. But this Python 3 code seems to create constants with names ... works for int and str at least. Special case for int defines a special __or__ operator to OR both the values and the names, which some might like. Dunno why it doesn't work for dict, and it is too late to research that today. That's the last test case in the code below, so you can see how it works for int and string before it bombs. There's some obvious cleanup work to be done, and it would be nice to make the names actually be constant... but they do lose their .name if you ignorantly assign the base type, so at least it is hard to change the value and keep the associated .name that gets reported by repr, which might reduce some confusion at debug time. 
An idea I had, but have no idea how to implement, is that it might be
nice to say:

    with imported_constants_from_module:
        do_stuff

where do_stuff could reference the constants without qualifying them by
module. Of course, if you knew it was just a module of constants, you
could "import * from module" :) But the idea of with is that they'd go
away at the end of that scope.

Some techniques here came from Raymond's namedtuple code.

def constant( name, val ):
    typ = str( type( val ))
    if typ.startswith("<class '") and typ.endswith("'>"):
        typ = typ[ 8:-2 ]
    ev = '''
class constant_%s( %s ):
    def __new__( cls, val, name ):
        self = %s.__new__( cls, val )
        self.name = name
        return self
    def __repr__( self ):
        return self.name + ': ' + str( self )
'''
    if typ == 'int':
        ev += '''
    def __or__( self, other ):
        if isinstance( other, constant_int ):
            return constant_int( int( self ) | int( other ),
                                 self.name + ' | ' + other.name )
'''
    ev += '''
%s = constant_%s( %s, '%s' )
'''
    ev = ev % ( typ, typ, typ, name, typ, repr( val ), name )
    print( ev )
    exec( ev, globals())

constant('O_RANDOM', val=16 )
constant('O_SEQUENTIAL', val=32 )
constant("O_STRING", val="string")

def foo( x ):
    print( str( x ))
    print( repr( x ))
    print( type( x ))

foo( O_RANDOM )
foo( O_SEQUENTIAL )
foo( O_STRING )
zz = O_RANDOM | O_SEQUENTIAL
foo( zz )
y = {'ab': 2, 'yz': 3 }
constant('O_DICT', y )

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mal at egenix.com Thu Nov 25 10:51:09 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 25 Nov 2010 10:51:09 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CED5E91.9070705@egenix.com> Message-ID: <4CEE318D.5000705@egenix.com> Terry Reedy wrote: > On 11/24/2010 3:06 PM, Alexander Belopolsky wrote: > >> Any non-trivial text processing is likely to be broken in presence of >> surrogates. Producing them on input is just trading known issue for >> an unknown one. Processing surrogate pairs in python code is hard. >> Software that has to support non-BMP characters will most likely be >> written for a wide build and contain subtle bugs when run under a >> narrow build. Note that my latest proposal does not abolish >> surrogates outright. Users who want them can still use something like >> "surrogateescape" error handler for non-BMP characters. > > It seems to me that what you are asking for is an alternate, optional, > utf-8-bmp codec that would raise an error, in either direction, for > non-bmp chars. Then, as you suggest, if one is not prepared for > surrogates, they are not allowed. That would be a possibility as well... but I doubt that many users are going to bother, since slicing surrogates is just as bad as slicing combining code points and the latter are much more common in real life and they do happen to mostly live in the BMP. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 25 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From mal at egenix.com Thu Nov 25 10:57:17 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 25 Nov 2010 10:57:17 +0100 Subject: [Python-Dev] len(chr(i)) = 2? In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CED5E91.9070705@egenix.com> <87bp5eb0zb.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CEE32FD.90507@egenix.com> Alexander Belopolsky wrote: > On Wed, Nov 24, 2010 at 9:17 PM, Stephen J. Turnbull wrote: > .. >> > I note that an opinion has been raised on this thread that >> > if we want compressed internal representation for strings, we should >> > use UTF-8. I tend to agree, but UTF-8 has been repeatedly rejected as >> > too hard to implement. What makes UTF-16 easier than UTF-8? Only the >> > fact that you can ignore bugs longer, in my view. >> >> That's mostly true. My guess is that we can probably ignore those >> bugs for as long as it takes someone to write the higher-level >> libraries that James suggests and MAL has actually proposed and >> started a PEP for. >> > > As far as I can tell, that PEP generated grand total of one comment in > nine years. This may or may not be indicative of how far away we are > from seeing it implemented. :-) At the time it was too early for people to start thinking about these issues. 
Actual use of Unicode really only started a few years ago. Since I didn't have a need for such an indexing module myself (and didn't have much time to work on it anyway), I punted on the idea. If someone else wants to pick up the idea, I'd gladly help out with the details. > As far as UTF-8 vs. UCS-2/4 debate, I have an idea that may be even > more far fetched. Once upon a time, Python Unicode strings supported > buffer protocol and would lazily fill an internal buffer with bytes in > the default encoding. In 3.x the default encoding has been fixed as > UTF-8, buffer protocol support was removed from strings, but the > internal buffer caching (now UTF-8) encoded representation remained. > Maybe we can now implement defenc logic in reverse. Recall that > strings are stored as UCS-2/4 sequences, but once buffer is requested > in 2.x Python code or char* is obtained via > _PyUnicode_AsStringAndSize() at the C level in 3.x, an internal buffer > is filled with UTF-8 bytes and defenc is set to point to that buffer. The original idea was for that buffer to go away once we moved to Unicode for strings. Reality has shown that we still need to stick the buffer, though, since the UTF-8 representation of Unicode objects is used a lot. > So the idea is for strings to store their data as UTF-8 buffer > pointed by defenc upon construction. If an application uses string > indexing, UTF-8 only strings will lazily fill their UCS-2/4 buffer. > Proper, Unicode-aware algorithms such as grapheme, word or line > iteration or simple operations such as concatenation, search or > substitution would operate directly on defenc buffers. Presumably > over time fewer and fewer applications would use code unit indexing > that require UCS-2/4 buffer and eventually Python strings can stop > supporting indexing altogether just like they stopped supporting the > buffer protocol in 3.x. 
I don't follow you: how would UTF-8, which has even more issues with
variable length representation of code points, make something easier
compared to UTF-16, which has far fewer such issues and then only for
non-BMP code points?

Please note that we can only provide one way of string indexing in
Python using the standard s[1] notation, and since we want that
operation to be fast and no worse than O(1), using the code units as
items is the only reasonable way to implement it.

With an indexing module, we could then let applications work based on
higher level indexing schemes such as complete code points (skipping
surrogates), combined code points, graphemes (ignoring e.g. most
control code points and zero width code points), words (with some
customizations as to where to break words, which will likely have to
be language dependent), lines (which can be complicated for scripts
that use columns instead ;-)), paragraphs, etc.

It would also help to add transparent indexing for right-to-left
scripts and text that uses both left-to-right and right-to-left text
(BIDI).

However, in order for these indexing methods to actually work, they
will need to return references to the code units, so we cannot just
drop that access method.

* Back on the surrogates topic:

In any case, I think this discussion is losing its grip on reality.
By far, most strings you find in actual applications don't use
surrogates at all, so the problem is being exaggerated.

If you need to be careful about surrogates for some reason, I think a
single new method .hassurrogates() on string objects would go a long
way in making detection and adding special-casing for these a lot
easier.

If adding support for surrogates doesn't make sense (e.g. in the case
of the formatting methods), then we simply punt on that and leave such
handling to other tools.
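As a sketch of what such a check might look like, here is a pure-Python stand-in. has_surrogates and pad_to_width are hypothetical names; the proposal above is for a str method, which would presumably be implemented in C and run much faster than this linear scan.

```python
def has_surrogates(s):
    """Return True if any item of s lies in the surrogate range U+D800..U+DFFF."""
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in s)


def pad_to_width(s, width, fill=" "):
    """Example of guarding code that is known not to handle surrogates."""
    if has_surrogates(s) or has_surrogates(fill):
        raise ValueError("surrogates are not supported by pad_to_width()")
    return s.ljust(width, fill)


print(has_surrogates("plain text"))
print(has_surrogates("broken \ud800 text"))
print(pad_to_width("ab", 4, "."))
```

The guard follows the "punt" strategy described above: code that cannot give a sensible answer for surrogates refuses them explicitly instead of producing a wrong result.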
* Regarding preventing surrogates from entering the Python runtime: It is by far more important to maintain round-trip safety for Unicode data, than getting every bit of code work correctly with surrogates (often, there won't be a single correct way). With a new method for fast detection of surrogates, we could protect code which obviously doesn't work with surrogates and then consider each case individually by either adding special cases as necessary or punting on the support. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 25 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From nadeem.vawda at gmail.com Thu Nov 25 11:12:20 2010 From: nadeem.vawda at gmail.com (Nadeem Vawda) Date: Thu, 25 Nov 2010 12:12:20 +0200 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CEE2DBB.3040502@g.nevcal.com> References: <20101121034404.52924F20A@mail.python.org> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <1290535602.3642.87.camel@localhost.localdomain> <4CEC2759.40203@g.nevcal.com> <4CEE2DBB.3040502@g.nevcal.com> Message-ID: On Thu, Nov 25, 2010 at 11:34 AM, Glenn Linderman wrote: > So the following code defines constants with associated names that get put > in the repr. The code you gave doesn't work if the constant() function is moved into a separate module from the code that calls it. The globals() function, as I understand it, gives you access to the global namespace *of the current module*, so the constants end up being defined in the module containing constant(), not the module you're calling it from. You could get around this by passing the globals of the calling module to constant(), but I think it's cleaner to use a class to provide a distinct namespace for the constants. > An idea I had, but have no idea how to implement, is that it might be nice > to say: > > ??? with imported_constants_from_module: > ??? ?????? do_stuff > > where do_stuff could reference the constants without qualifying them by > module.? 
Of course, if you knew it was just a module of constants, you could
> "import * from module" :) But the idea of with is that they'd go away at
> the end of that scope.

I don't think this is possible - the context manager protocol doesn't
allow you to modify the namespace of the caller like that. Also, a
with statement does not have its own namespace; any names defined
inside its body will continue to be visible in the containing scope.

Of course, if you want to achieve something similar (at function
scope), you could say:

    def foo(bar, baz):
        from module import *
        ...

From fuzzyman at voidspace.org.uk Thu Nov 25 11:34:25 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Thu, 25 Nov 2010 10:34:25 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To:
References: <20101121034404.52924F20A@mail.python.org> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <1290535602.3642.87.camel@localhost.localdomain> <4CEC2759.40203@g.nevcal.com> <4CEE2DBB.3040502@g.nevcal.com>
Message-ID: <4CEE3BB1.5090308@voidspace.org.uk>

On 25/11/2010 10:12, Nadeem Vawda wrote:
> On Thu, Nov 25, 2010 at 11:34 AM, Glenn Linderman wrote:
>> So the following code defines constants with associated names that get put
>> in the repr.
> The code you gave doesn't work if the constant() function is moved
> into a separate module from the code that calls it. The globals()
> function, as I understand it, gives you access to the global namespace
> *of the current module*, so the constants end up being defined in the
> module containing constant(), not the module you're calling it from.
> > You could get around this by passing the globals of the calling module > to constant(), but I think it's cleaner to use a class to provide a > distinct namespace for the constants. > >> An idea I had, but have no idea how to implement, is that it might be nice >> to say: >> >> with imported_constants_from_module: >> do_stuff >> >> where do_stuff could reference the constants without qualifying them by >> module. Of course, if you knew it was just a module of constants, you could >> "import * from module" :) But the idea of with is that they'd go away at >> the end of that scope. > I don't think this is possible - the context manager protocol doesn't > allow you to modify the namespace of the caller like that. Also, a > with statement does not have its own namespace; any names defined > inside its body will continue to be visible in the containing scope. > > Of course, if you want to achieve something similar (at function > scope), you could say: > > def foo(bar, baz): > from module import * > ... Not in Python 3 you can't. :-) That's invalid syntax, import * can only be used at module level. This makes *testing* import * (i.e. testing your __all__) annoying - you have to exec('from module import *') instead. Michael > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) 
that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From fuzzyman at voidspace.org.uk Thu Nov 25 11:37:13 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 25 Nov 2010 10:37:13 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CEE2DBB.3040502@g.nevcal.com> References: <20101121034404.52924F20A@mail.python.org> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CEBCE92.40801@voidspace.org.uk> <20101123154229.474f7a90@pitrou.net> <1290524466.3642.4.camel@localhost.localdomain> <4CEBDA91.4050205@voidspace.org.uk> <1290526253.3642.9.camel@localhost.localdomain> <4CEBE06C.9030101@voidspace.org.uk> <1290528319.3642.11.camel@localhost.localdomain> <1290533860.3642.73.camel@localhost.localdomain> <1290535602.3642.87.camel@localhost.localdomain> <4CEC2759.40203@g.nevcal.com> <4CEE2DBB.3040502@g.nevcal.com> Message-ID: <4CEE3C59.1030002@voidspace.org.uk> On 25/11/2010 09:34, Glenn Linderman wrote: > So the following code defines constants with associated names that get > put in the repr. > > I'm still a Python newbie in some areas, particularly classes and > metaclasses, maybe more. > But this Python 3 code seems to create constants with names ... works > for int and str at least. > > Special case for int defines a special __or__ operator to OR both the > values and the names, which some might like. > > Dunno why it doesn't work for dict, and it is too late to research > that today. That's the last test case in the code below, so you can > see how it works for int and string before it bombs. > > There's some obvious cleanup work to be done, and it would be nice to > make the names actually be constant... 
but they do lose their .name if
> you ignorantly assign the base type, so at least it is hard to change
> the value and keep the associated .name that gets reported by repr,
> which might reduce some confusion at debug time.
>
> An idea I had, but have no idea how to implement, is that it might be
> nice to say:
>
>     with imported_constants_from_module:
>         do_stuff
>
> where do_stuff could reference the constants without qualifying them
> by module. Of course, if you knew it was just a module of constants,
> you could "import * from module" :) But the idea of with is that
> they'd go away at the end of that scope.
>
> Some techniques here came from Raymond's namedtuple code.
>
>
> def constant( name, val ):
>     typ = str( type( val ))
>     if typ.startswith("<class '") and typ[ -2: ] == "'>":
>         typ = typ[ 8:-2 ]
>     ev = '''
> class constant_%s( %s ):
>     def __new__( cls, val, name ):
>         self = %s.__new__( cls, val )
>         self.name = name
>         return self
>     def __repr__( self ):
>         return self.name + ': ' + str( self )
> '''
>     if typ == 'int':
>         ev += '''
>     def __or__( self, other ):
>         if isinstance( other, constant_int ):
>             return constant_int( int( self ) | int( other ),
>                                  self.name + ' | ' + other.name )
> '''

Not quite correct. If you OR a value with itself you should get back
just the value, not something with "name|name" as the repr.

We can hold off on implementations until we have general agreement that
some kind of named constant *should* be added, and what the feature set
should look like.
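Purely to illustrate the self-OR point above (a rough sketch only, not a
proposed implementation; the NamedInt class and its repr format are made
up for this example and are not the quoted code):

```python
class NamedInt(int):
    """Hypothetical int constant that carries a display name."""

    def __new__(cls, value, name):
        self = super().__new__(cls, value)
        self.name = name
        return self

    def __repr__(self):
        return "%s: %d" % (self.name, int(self))

    def __or__(self, other):
        if not isinstance(other, int):
            return NotImplemented
        if not isinstance(other, NamedInt):
            # OR with a plain int degrades to a plain int.
            return int(self) | other
        value = int(self) | int(other)
        # If the result equals one operand (e.g. x | x), keep that
        # operand and its name instead of building "name | name".
        if value == int(self):
            return self
        if value == int(other):
            return other
        return NamedInt(value, self.name + " | " + other.name)


O_RANDOM = NamedInt(16, "O_RANDOM")
O_SEQUENTIAL = NamedInt(32, "O_SEQUENTIAL")

same = O_RANDOM | O_RANDOM          # the very same object comes back
both = O_RANDOM | O_SEQUENTIAL      # a new constant with a combined name
print(repr(same), "/", repr(both))
```

Whether the combined name should be kept, sorted, or dropped is exactly
the sort of feature-set question that would need agreeing first.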
All the best,

Michael

>     ev += '''
> %s = constant_%s( %s, '%s' )
>
> '''
>     ev = ev % ( typ, typ, typ, name, typ, repr( val ), name )
>     print( ev )
>     exec( ev, globals())
>
> constant('O_RANDOM', val=16 )
>
> constant('O_SEQUENTIAL', val=32 )
>
> constant("O_STRING", val="string")
>
> def foo( x ):
>     print( str( x ))
>     print( repr( x ))
>     print( type( x ))
>
> foo( O_RANDOM )
> foo( O_SEQUENTIAL )
> foo( O_STRING )
>
> zz = O_RANDOM | O_SEQUENTIAL
>
> foo( zz )
>
> y = {'ab': 2, 'yz': 3 }
> constant('O_DICT', y )
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk

--
http://www.voidspace.org.uk/

READ CAREFULLY. By accepting and reading this email you agree, on behalf
of your employer, to release me from all obligations and waivers arising
from any and all NON-NEGOTIATED agreements, licenses, terms-of-service,
shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure,
non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have
entered into with your employer, its partners, licensors, agents and
assigns, in perpetuity, without prejudice to my ongoing rights and
privileges. You further represent that you have the authority to release
me from any BOGUS AGREEMENTS on behalf of your employer.

From merwok at netwok.org Thu Nov 25 12:47:00 2010
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Thu, 25 Nov 2010 12:47:00 +0100
Subject: [Python-Dev] [Python-checkins] r86748 - in python/branches/py3k-urllib/Lib: http/client.py urllib/request.py
In-Reply-To: <20101125081820.7FA2EEEA97@mail.python.org>
References: <20101125081820.7FA2EEEA97@mail.python.org>
Message-ID: <4CEE4CB4.6010107@netwok.org>

> Author: senthil.kumaran
> New Revision: 86748
>
> Log:
> Experimental - Transparent gzip Encoding in urllib2. There should be a good way to deal with Content-Length.

Cool feature! But...

> Modified:
> python/branches/py3k-urllib/Lib/http/client.py
> python/branches/py3k-urllib/Lib/urllib/request.py

No tests? Misc/NEWS? :)

Regards

From rob.cliffe at btinternet.com Thu Nov 25 13:52:44 2010
From: rob.cliffe at btinternet.com (Rob Cliffe)
Date: Thu, 25 Nov 2010 12:52:44 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <4CEDDC2D.204@canterbury.ac.nz>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CEDDC2D.204@canterbury.ac.nz>
Message-ID: <4CEE5C1C.9000905@btinternet.com>

On 25/11/2010 03:46, Greg Ewing wrote:
> On 25/11/10 12:38, average wrote:
>> Is immutability a general need that should have general solution?
>
Yes, I have sometimes thought this. Might be nice to have a "mutable"
attribute that could be read and could be changed from True to False,
though presumably not vice versa.

> I don't think it really generalizes. Tuples are not just frozen
> lists, for example -- they have a different internal structure
> that's more efficient to create and access.
>
But couldn't they be presented to the Python programmer as a single
type, with the implementation details hidden "under the hood"?
So
    MyList.__mutable__ = False
would have the same effect as the present
    MyList = tuple(MyList)
This would simplify some code that copes with either list(s) or
tuple(s) as input data.
One would need syntax for (im)mutable literals, e.g.
    []i    # immutable list (really a tuple). Bit of a shame that "i[]" doesn't work.
or
    []f    # frozen list (same thing)
    []     # mutable list (same as now)
    []m    # alternative syntax for mutable list
This would reduce the overloading on parentheses and avoid having to
write a tuple of one item as (t,) which often trips up newbies. It
would also avoid one FAQ: Why does Python have separate list and tuple
types?
Also the syntax could be extended, e.g.
    {a,b,c}f      # frozen set with 3 objects
    {p:x,q:y}f    # frozen dictionary with 2 items
    {:}f, {}f     # (re the thread on set literals) frozen empty dictionary and frozen empty set!

Just some thoughts for Python 4.
Best wishes
Rob Cliffe

From g.brandl at gmx.net Thu Nov 25 14:27:14 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 25 Nov 2010 14:27:14 +0100
Subject: [Python-Dev] [Python-checkins] r86748 - in python/branches/py3k-urllib/Lib: http/client.py urllib/request.py
In-Reply-To: <4CEE4CB4.6010107@netwok.org>
References: <20101125081820.7FA2EEEA97@mail.python.org> <4CEE4CB4.6010107@netwok.org>
Message-ID:

On 25.11.2010 12:47, Éric Araujo wrote:
>> Author: senthil.kumaran
>> New Revision: 86748
>>
>> Log:
>> Experimental - Transparent gzip Encoding in urllib2. There should be a good way to deal with Content-Length.
> Cool feature! But...
>
>> Modified:
>> python/branches/py3k-urllib/Lib/http/client.py
>> python/branches/py3k-urllib/Lib/urllib/request.py
> No tests? Misc/NEWS? :)

Note that this is work in a separate branch.
Georg

From emile.anclin at logilab.fr Thu Nov 25 15:30:23 2010
From: emile.anclin at logilab.fr (Emile Anclin)
Date: Thu, 25 Nov 2010 15:30:23 +0100
Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError
Message-ID: <201011251530.23947.emile.anclin@logilab>

hello,

working on Pylint, we have a lot of deliberately corrupted files to test
Pylint behavior; for instance

$ cat /home/emile/var/pylint/test/input/func_unknown_encoding.py
# -*- coding: IBO-8859-1 -*-
""" check correct unknown encoding declaration
"""

__revision__ = '????'


and we try to find that module :
find_module('func_unknown_encoding', None). But python3 raises SyntaxError
in that case ; it didn't raise SyntaxError on python2 nor does so on our
func_nonascii_noencoding and func_wrong_encoding modules (with obvious
names)

Python 3.2a2 (r32a2:84522, Sep 14 2010, 15:22:36)
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from imp import find_module
>>> find_module('func_unknown_encoding', None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SyntaxError: encoding problem: with BOM
>>> find_module('func_wrong_encoding', None)
(<_io.TextIOWrapper name=5 encoding='utf-8'>, 'func_wrong_encoding.py',
('.py', 'U', 1))
>>> find_module('func_nonascii_noencoding', None)
(<_io.TextIOWrapper name=6 encoding='utf-8'>,
'func_nonascii_noencoding.py', ('.py', 'U', 1))


So what is the reason for this selective behavior?
Furthermore, there is BOM in our func_unknown_encoding.py module.
--
Emile Anclin
http://www.logilab.fr/ http://www.logilab.org/
Informatique scientifique & et gestion de connaissances

From rrr at ronadam.com Thu Nov 25 18:22:58 2010
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 25 Nov 2010 11:22:58 -0600
Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError
In-Reply-To: <201011251530.23947.emile.anclin@logilab>
References: <201011251530.23947.emile.anclin@logilab>
Message-ID: <4CEE9B72.1070002@ronadam.com>

On 11/25/2010 08:30 AM, Emile Anclin wrote:
>
> hello,
>
> working on Pylint, we have a lot of voluntary corrupted files to test
> Pylint behavior; for instance
>
> $ cat /home/emile/var/pylint/test/input/func_unknown_encoding.py
> # -*- coding: IBO-8859-1 -*-
> """ check correct unknown encoding declaration
> """
>
> __revision__ = '????'
>
>
> and we try to find that module :
> find_module('func_unknown_encoding', None). But python3 raises SyntaxError
> in that case ; it didn't raise SyntaxError on python2 nor does so on our
> func_nonascii_noencoding and func_wrong_encoding modules (with obvious
> names)
>
> Python 3.2a2 (r32a2:84522, Sep 14 2010, 15:22:36)
> [GCC 4.3.4] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> from imp import find_module
>>>> find_module('func_unknown_encoding', None)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> SyntaxError: encoding problem: with BOM
>>>> find_module('func_wrong_encoding', None)
> (<_io.TextIOWrapper name=5 encoding='utf-8'>, 'func_wrong_encoding.py',
> ('.py', 'U', 1))
>>>> find_module('func_nonascii_noencoding', None)
> (<_io.TextIOWrapper name=6 encoding='utf-8'>,
> 'func_nonascii_noencoding.py', ('.py', 'U', 1))
>
>
> So what is the reason of this selective behavior?
> Furthermore, there is BOM in our func_unknown_encoding.py module.

I don't think there is a clear reason by design. Also try importing the
same modules directly and noting the differences in the errors you get.

For example, the problem that brought this to my attention in python3.2.

>>> find_module('test/badsyntax_pep3120')
Segmentation fault

>>> from test import badsyntax_pep3120
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.2/test/badsyntax_pep3120.py", line 1
SyntaxError: Non-UTF-8 code starting with '\xf6' in file
/usr/local/lib/python3.2/test/badsyntax_pep3120.py on line 1, but no
encoding declared; see http://python.org/dev/peps/pep-0263/ for details

The import statement uses parser.c, and tokenizer.c indirectly, to import
a file, but the imp module uses tokenizer.c directly. They aren't
consistent in how they handle errors because the different error messages
are generated in different places depending on what the error is, *and*
what the code path to get to that point was, *and* whether or not a
filename was set.

For the example above with imp.find_module(), the filename isn't set, so
you get a different error than if you used import, which uses the parser
module and that does set the filename.

From what I've seen, it would help if the imp module was rewritten to use
parser.c like the import statement does, rather than tokenizer.c directly.
The error handling in parser.c is much better than in tokenizer.c.
Possibly tokenizer.c could be cleaned up after that and be made much
simpler.
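For anyone who wants to poke at the encoding-declaration case without
going through imp at all, here is a rough sketch using the pure-Python
tokenize module. To be clear about the assumptions: this is not the C
tokenizer.c or parser.c path discussed above, it only approximates the
same cookie check, and the in-memory source below is a stand-in for the
Pylint test file.

```python
import io
import tokenize

# Stand-in for func_unknown_encoding.py: a coding cookie naming a codec
# that does not exist.
source = (
    b"# -*- coding: IBO-8859-1 -*-\n"
    b'""" check correct unknown encoding declaration\n'
    b'"""\n'
)

try:
    tokenize.detect_encoding(io.BytesIO(source).readline)
except SyntaxError as exc:
    # The unknown codec is rejected while reading the cookie, before any
    # real parsing of the file takes place.
    message = str(exc)
else:
    message = None

# A file with no cookie and no BOM falls back to UTF-8.
default_encoding = tokenize.detect_encoding(io.BytesIO(b"x = 1\n").readline)[0]

print(message)
print(default_encoding)
```

That at least separates "the declared encoding is unknown" from "the
bytes don't match the encoding", which seem to be the two cases getting
different treatment here.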
Ron Adam

From merwok at netwok.org Thu Nov 25 18:53:54 2010
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Thu, 25 Nov 2010 18:53:54 +0100
Subject: [Python-Dev] [Python-checkins] r86748 - in python/branches/py3k-urllib/Lib: http/client.py urllib/request.py
In-Reply-To:
References: <20101125081820.7FA2EEEA97@mail.python.org> <4CEE4CB4.6010107@netwok.org>
Message-ID: <4CEEA2B2.1030306@netwok.org>

>>> Modified:
>>> python/branches/py3k-urllib/Lib/http/client.py
>>> python/branches/py3k-urllib/Lib/urllib/request.py
>> No tests? Misc/NEWS? :)
>
> Note that this is work in a separate branch.

Ah, didn't notice that! Senthil replied as much in private email:
> That was in a different branch.
Once stable shall definitely include
> the tests and news.

unconsciously-ignoring-svn-branches-to-preserve-sanity-ly yours,

Éric

From victor.stinner at haypocalc.com Thu Nov 25 22:39:00 2010
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 25 Nov 2010 22:39:00 +0100
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <4CE6F93F.9010109@egenix.com>
References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com>
Message-ID: <201011252239.00288.victor.stinner@haypocalc.com>

On Friday 19 November 2010 23:25:03 you wrote:
> > Python is unclear about non-BMP characters: narrow build was called
> > "ucs2" for long time, even if it is UTF-16 (each character is encoded to
> > one or two UTF-16 words).
>
> No, no, no :-)
>
> UCS2 and UCS4 are more appropriate than "narrow" and "wide" or even
> "UTF-16" and "UTF-32".

Ok for Python 2:

$ ./python
Python 2.7.0+ (release27-maint:84618M, Sep 8 2010, 12:43:49)
>>> import sys; sys.maxunicode
65535
>>> x=u'\U0010ffff'; len(x)
2
>>> ord(x)
...
TypeError: ord() expected a character, but string of length 2 found

But Python 3 does use UTF-16 for narrow build:

$ ./python
Python 3.2a3+ (py3k:86396:86399M, Nov 10 2010, 15:24:09)
>>> import sys; sys.maxunicode
65535
>>> c=chr(0x10ffff); len(c)
2
>>> ord(c)
1114111

Victor

From merwok at netwok.org Fri Nov 26 02:32:43 2010
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Fri, 26 Nov 2010 02:32:43 +0100
Subject: [Python-Dev] [Python-checkins] r86750 - python/branches/py3k/Demo/curses/life.py
In-Reply-To: <20101125145644.D98FAEEA26@mail.python.org>
References: <20101125145644.D98FAEEA26@mail.python.org>
Message-ID: <4CEF0E3B.2070608@netwok.org>

Hello,

> Author: senthil.kumaran
> Log:
> Mouse support and colour to Demo/curses/life.py by Dafydd Crosby
>
> Modified:
> python/branches/py3k/Demo/curses/life.py

Okay, this time I'm reacting to the right branch

> Modified: python/branches/py3k/Demo/curses/life.py
> ==============================================================================
> --- python/branches/py3k/Demo/curses/life.py (original)
> +++ python/branches/py3k/Demo/curses/life.py Thu Nov 25 15:56:44 2010
> @@ -1,6 +1,7 @@
> #!/usr/bin/env python3
> # life.py -- A curses-based version of Conway's Game of Life.
> # Contributed by AMK
> +# Mouse support and colour by Dafydd Crosby

Shouldn't his name rather be in Misc/ACKS too? Modules typically
(warning: non-scientific data) include the name of the author or first
contributors but not the name of every contributor.

I think these cool features deserve a note in Misc/NEWS too :)

Re: "colour": the rest of the file uses US English, as do the function
names (see for example curses.has_color). It's good to use one dialect
consistently in one file.
going-back-to-stare-at-shiny-colors-ly yours,

Éric

From orsenthil at gmail.com Fri Nov 26 03:15:24 2010
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Fri, 26 Nov 2010 10:15:24 +0800
Subject: [Python-Dev] [Python-checkins] r86750 - python/branches/py3k/Demo/curses/life.py
In-Reply-To: <4CEF0E3B.2070608@netwok.org>
References: <20101125145644.D98FAEEA26@mail.python.org> <4CEF0E3B.2070608@netwok.org>
Message-ID: <20101126021524.GA1450@rubuntu>

On Fri, Nov 26, 2010 at 02:32:43AM +0100, Éric Araujo wrote:
> Shouldn't his name rather be in Misc/ACKS too? Modules typically
> (warning: non-scientific data) include the name of the author or first
> contributors but not the name of every contributor.
>
> I think these cool features deserve a note in Misc/NEWS too :)

I don't think it is required. Demo scripts are usually fun
demonstrations. The contributor had added his name to the patch in the
header, and I just left it like that. It's fine. For features and
important patches (subjective), Misc/{ACKS,NEWS} are both updated.

> Re: "colour": the rest of the file use US English, as do the function
> names (see for example curses.has_color). It's good to use one dialect
> consistently in one file.

Good catch. I did not notice it because we write it as colour too.
Changing it.

Thanks,
Senthil

From stephen at xemacs.org Fri Nov 26 03:42:33 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 26 Nov 2010 11:42:33 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <4CEE318D.5000705@egenix.com> References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CED5E91.9070705@egenix.com> <4CEE318D.5000705@egenix.com> Message-ID: <87fwuo7qli.fsf@uwakimon.sk.tsukuba.ac.jp> M.-A. Lemburg writes: > That would be a possibility as well... but I doubt that many users > are going to bother, since slicing surrogates is just as bad as > slicing combining code points and the latter are much more common in > real life and they do happen to mostly live in the BMP. That's only if you require 100% fidelity in the data, which may not be true in some use cases. Where 99.99% fidelity is good enough, an unexpected sliced surrogate pair is a show-stopper, while a sliced combining character sequence not only doesn't stop the show (at least in Python, and I doubt any correct Unicode process can signal a fatal error there either, I can put a tilde on a Cyrillic character if I want to, no?), it's probably readable enough that readers will assume a keypunch error. Personally, if available I would always use some such dodge in server software (I don't care enough about 24x7 availability to write it myself, though). And never in a script for interactive use; something needs fixing, may as well take the fatal error and fix it on the spot. (Again, "on the spot" for me can mean "tomorrow".) From stephen at xemacs.org Fri Nov 26 04:02:09 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 26 Nov 2010 12:02:09 +0900 Subject: [Python-Dev] len(chr(i)) = 2? 
In-Reply-To: <4CEE32FD.90507@egenix.com>
References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CED5E91.9070705@egenix.com> <87bp5eb0zb.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEE32FD.90507@egenix.com>
Message-ID: <87eia87pou.fsf@uwakimon.sk.tsukuba.ac.jp>

M.-A. Lemburg writes:

 > Please note that we can only provide one way of string indexing
 > in Python using the standard s[1] notation and since we do
 > want that operation to be fast and no more than O(1), using the
 > code units as items is the only reasonable way to implement it.

AFAICT, the "we" that wants "no more than O(1)" does not include Glyph
Lefkowitz, James Knight, and Greg Ewing. Greg even said that in
designing a UTF-8 string type he might not provide an indexing operation
at all. (Caution: That may not be what he meant; I'm just reporting the
way I interpreted it.)

Of course none of them are proposing to change Python, that's all in the
context of designing a new language. But it does suggest that a lot of
people can't think of use cases where O(1) string indexing is more
important than Unicode robustness.

 > It is by far more important to maintain round-trip safety for
 > Unicode data, than getting every bit of code work correctly
 > with surrogates (often, there won't be a single correct way).

But surely it's more important than that to ensure that surrogates can't
crash a Python process with unexpected UnicodeErrors?
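For reference, the arithmetic behind a surrogate pair (the reason
len(chr(i)) can be 2 on a narrow build) is small enough to sketch. The
helper names below are made up for illustration; the constants are the
standard UTF-16 ones, and the cross-check uses the stock utf-16-be codec.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary-plane code point into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary-plane code point")
    offset = code_point - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


pair = to_surrogate_pair(0x10FFFF)
# Cross-check against the codec: one code point, two 16-bit code units.
encoded = chr(0x10FFFF).encode("utf-16-be")
print([hex(unit) for unit in pair], len(encoded) // 2)
```

Slicing between the two halves is what produces a lone surrogate, which
is the case that later fails to encode.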
From jcea at jcea.es Fri Nov 26 05:11:56 2010
From: jcea at jcea.es (Jesus Cea)
Date: Fri, 26 Nov 2010 05:11:56 +0100
Subject: [Python-Dev] Question about GDB bindings and 32/64 bits
Message-ID: <4CEF338C.4070509@jcea.es>

I have installed a 32-bit GDB 7.2 and the 32-bit buildslaves are green.
Nevertheless the 64-bit buildslaves are failing test_gdb.

Is there any expectation that a 32-bit GDB can debug a 64-bit python?
If not, the gdb test should compare "platform.architecture()" (for
python and gdb in the system) and run only when they are the same.

If this should work, I would open a bug and maybe spend some time with
it. But before thinking about investing time, I would like to know if
this mix is actually expected to work or not.

If not, I would consider installing a 64-bit GDB too and doing some
tricks (like using an "/usr/local/bin/gdb" script wrapper to choose the
32/64 "real" gdb version) to actually execute "test_gdb" in both
buildslaves (they are running on the same physical machine).

Any advice?

PS: I am talking about the AMD64 OpenIndiana buildbots. Haven't checked
others.

--
Jesus Cea Avion
jcea at jcea.es - http://www.jcea.es/
jabber / xmpp:jcea at jabber.org
"Things are not so easy"
"My name is Dump, Core Dump"
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz

From glyph at twistedmatrix.com Fri Nov 26 08:21:26 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Fri, 26 Nov 2010 02:21:26 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <87mxozayam.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> <87sjyrbftz.fsf@uwakimon.sk.tsukuba.ac.jp> <635C265A-90A8-4B92-A65C-59EF3E8EFD68@twistedmatrix.com> <87oc9fb97b.fsf@uwakimon.sk.tsukuba.ac.jp> <3C1ADB64-63F3-4165-926D-EDE9846E0DBD@fuhm.net> <87mxozayam.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

On Nov 24, 2010, at 4:03 AM, Stephen J. Turnbull wrote:

> You end up proliferating types that all do the same kind of thing.
> Judicious use of inheritance helps, but getting the fundamental
> abstraction right is hard. Or at least, Emacs hasn't found it in 20
> years of trying.

Emacs hasn't even figured out how to do general purpose iteration in 20
years of trying either.
The easiest way I've found to loop across an arbitrary pile of 'stuff'
is the CL 'loop' macro, which you're not even supposed to use. Even
then, you still have to make the arcane and pointless distinction of
using 'across' or 'in' or 'on'. Python, on the other hand, has iteration
pretty well tied up nicely in a bow.

I don't know how to respond to the rest of your argument. Nothing you've
said has in any way indicated to me why having code-point offsets is a
good idea, only that people who know C and elisp would rather sling
around piles of integers than have good abstract types. For example:

> I think it more likely that markers are very expensive to create and
> use compared to integers.

What? When you do 'for x in str' in python, you are already creating an
iterator object, which has to store the exact same amount of state that
our proposed 'marker' or 'character pointer' would have to store. The
proposed UTF-8 marker would have to do a tiny bit more work when
iterating because it would have to combine multibyte characters, but in
exchange for that you get to skip a whole ton of copying when encoding
and decoding. How is this expensive to create and use? For every
application I have ever designed, encountered, or can even conjecture
about, this would be cheaper. (Assuming not just a UTF-8 string type,
but one for UTF-16 as well, where native data is in that format
already.)

For what it's worth, not wanting to use abstract types in Emacs makes
sense to me: I've written my share of elisp code, and it is hard to
create reasonable abstractions in Emacs, because the facilities for
defining types and creating polymorphic logic are so crude. It's a lot
easier to just assume your underlying storage is an array, because at
the end of the day you're going to need to call some functions on it
which care whether it's an array or an alist or a list or a vector
anyway, so you might as well just say so up front.
But in Python we could just call 'mystring.by_character()' or 'mystring.by_codepoint()' and get an iterator object back and forget about all that junk.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From glyph at twistedmatrix.com Fri Nov 26 08:51:35 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Fri, 26 Nov 2010 02:51:35 -0500
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: <87ipzm6oqr.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <201011192123.14169.victor.stinner@haypocalc.com>
 <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de>
 <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de>
 <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp>
 <20101121173825.B1BFB235977@kimball.webabinitio.net>
 <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com>
 <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp>
 <4CEC5316.4010608@canterbury.ac.nz>
 <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net>
 <87sjyrbftz.fsf@uwakimon.sk.tsukuba.ac.jp>
 <635C265A-90A8-4B92-A65C-59EF3E8EFD68@twistedmatrix.com>
 <87oc9fb97b.fsf@uwakimon.sk.tsukuba.ac.jp>
 <3C1ADB64-63F3-4165-926D-EDE9846E0DBD@fuhm.net>
 <87mxozayam.fsf@uwakimon.sk.tsukuba.ac.jp>
 <4CEDCB86.9030506@canterbury.ac.nz>
 <87ipzm6oqr.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: 

On Nov 24, 2010, at 10:55 PM, Stephen J. Turnbull wrote:

> Greg Ewing writes:
>> On 24/11/10 22:03, Stephen J. Turnbull wrote:
>>> But
>>> if you actually need to remember positions, or regions, to jump to
>>> later or to communicate to other code that manipulates them, doing
>>> this stuff the straightforward way (just copying the whole iterator
>>> object to hang on to its state) becomes expensive.
>>
>> If the internal representation of a text pointer (I won't call it
>> an iterator because that means something else in Python) is a byte
>> offset or something similar, it shouldn't take up any more space
>> than a Python int, which is what you'd be using anyway if you
>> represented text positions by grapheme indexes or whatever.
>
> That's not necessarily true. Eg, in Emacs ("there you go again"),
> Lisp integers are not only immediate (saving one pointer), but the
> type is encoded in the lower bits, so that there is no need for a type
> pointer -- the representation is smaller than the opaque marker type.
> Altogether, up to 8 of 12 bytes saved on a 32-bit platform, or 16 of
> 24 bytes on a 64-bit platform.

Yes, yes, Lisp is very clever. Maybe some other runtime, like PyPy, could make this optimization. But I don't think that anyone is filling up main memory with gigantic piles of character indexes and needs to squeeze out that extra couple of bytes of memory on such a tiny object. Plus, this would allow such a user to stop copying the character data itself just to decode it, and on mostly-ASCII UTF-8 text (a common use-case) this is a 2x savings right off the bat.

> In Python it's true that markers can use the same data structure as
> integers and simply provide different methods, and it's arguable that
> Python's design is better. But if you use bytes internally, then you
> have problems.

No, you just have design questions.

> Do you expose that byte value to the user?

Yes, but only if they ask for it. It's useful for computing things like quota and the like.

> Can users (programmers using the language and end users) specify positions in terms of byte values?

Sure, why not?

> If so, what do you do if the user specifies a byte value that points into a multibyte character?

Go to the beginning of the multibyte character.
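The "go to the beginning of the multibyte character" rule can be sketched in a few lines. This is only an illustration under stated assumptions: `snap_to_char_start` is a hypothetical helper name, not part of any existing or proposed API, and the data is assumed to be well-formed UTF-8.

```python
def snap_to_char_start(data: bytes, offset: int) -> int:
    """Return the nearest character boundary at or before `offset`.

    Hypothetical helper; assumes `data` is well-formed UTF-8.
    """
    if not 0 <= offset <= len(data):
        raise IndexError("byte offset out of range")
    # UTF-8 continuation bytes look like 10xxxxxx, so step back over them
    # until we reach a lead byte (or the start of the data).
    while 0 < offset < len(data) and (data[offset] & 0xC0) == 0x80:
        offset -= 1
    return offset
```

Because a UTF-8 character is at most four bytes long, the loop steps back at most three times on well-formed input, so the adjustment is cheap no matter how large the text is.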
Report that position; if the user then asks the requested marker object for its position, it will report that byte offset, not the originally-requested one. (Obviously, do the same thing for surrogate pair code points.)

> What if the user wants to specify position by number of characters?

Part of the point that we are trying to make here is that nobody really cares about that use-case. In order to know anything useful about a position in a text, you have to have traversed to that location in the text. You can remember interesting things like the offsets of starts of lines, or the x/y positions of characters.

> Can you translate efficiently?

No, because there's no point :). But you _could_ implement an overlay that cached things like the beginning of lines, or the x/y positions of interesting characters.

> As I say elsewhere, it's possible that there really never is a need to efficiently specify an absolute position in a large text as a character (grapheme, whatever) count.

> But I think it would be hard to implement an efficient text-processing *language*, eg, a Python module
> for *full conformance* in handling Unicode, on top of UTF-8.

Still: why? I guess if I have some free time I'll try my hand at it, and maybe I'll run into a wall and realize you're right :).

> Any time you have an algorithm that requires efficient access to arbitrary text positions, you'll spend all your skull sweat fighting the representation. At least, that's been my experience with Emacsen.

What sort of algorithm would that be, though? The main thing that I could think of is a text editor trying to efficiently allow the user to scroll to the middle of a large file without reading the whole thing into memory. But, in that case, you could use byte-positions to estimate, and display a heuristic number while calculating the real line numbers. (This is what 'less' does, and it seems to work well.)

>> So I don't really see what you're arguing for here.
>> How do
>> *you* think positions in unicode strings should be represented?
>
> I think what users should see is character positions, and they should
> be able to specify them numerically as well as via an opaque marker
> object. I don't care whether that position is represented as bytes or
> characters internally, except that the experience of Emacsen is that
> representation as byte positions is both inefficient and fragile. The
> representation as character positions is more robust but slightly more
> inefficient.

Is it really the representation as byte positions which is fragile (i.e. the internal implementation detail), or the exposure of that position to calling code, and the idiomatic usage of that number as an integer?

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From facundobatista at gmail.com Fri Nov 26 16:05:09 2010
From: facundobatista at gmail.com (Facundo Batista)
Date: Fri, 26 Nov 2010 12:05:09 -0300
Subject: [Python-Dev] [Preview] Comments and change proposals on documentation
In-Reply-To: 
References: 
Message-ID: 

On Wed, Nov 24, 2010 at 5:24 PM, Georg Brandl wrote:

> at , you can look at a version of the 3.2
> docs that has the upcoming commenting feature.  JavaScript is mandatory.

This is awesome!! Thanks for this work, remind me to buy you a beer next PyCon!

> Credits go to Jacob Mason, whose GSOC project is responsible for almost all
> of what you see there.  [1]

Ok, two beers.

--
.
Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From ocean-city at m2.ccsnet.ne.jp Fri Nov 26 17:33:50 2010
From: ocean-city at m2.ccsnet.ne.jp (Hirokazu Yamamoto)
Date: Sat, 27 Nov 2010 01:33:50 +0900
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <201011140106.55153.victor.stinner@haypocalc.com>
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp>
 <201011121308.30368.victor.stinner@haypocalc.com>
 <4CDEBB11.5050209@m2.ccsnet.ne.jp>
 <201011140106.55153.victor.stinner@haypocalc.com>
Message-ID: <4CEFE16E.6040801@m2.ccsnet.ne.jp>

On 2010/11/14 9:06, Victor Stinner wrote:
> Yes, but how do you check if the input argument is a bytes or a str object
> with your PyArg_Parse converter? You should use "O" format and manually
> convert it to unicode, and then convert the result back to bytes (if the input
> was bytes). I don't think that it makes the code shorter.
>
> The code is currently working. The question is if we have to drop the ANSI API
> now, later or never. It looks like the decision moves to "later" (deprecate in
> 3.2, remove in 3.3). I still think that dropping it now doesn't really hurt.
>
> Victor

Humble thoughts... Is it possible for a conversion from bytes (ANSI) to unicode to fail on Windows? If not, is it allowed to convert to unicode with PyUnicode_FSDecoder if the function doesn't return str? For example, os.stat() takes a str argument but doesn't return str.

# I noticed win_readlink() in Modules/posixmodule.c is already unicode
# only. Maybe not so much problem?
;-)

From ocean-city at m2.ccsnet.ne.jp Fri Nov 26 18:06:06 2010
From: ocean-city at m2.ccsnet.ne.jp (Hirokazu Yamamoto)
Date: Sat, 27 Nov 2010 02:06:06 +0900
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <201011111718.08207.eckhardt@satorlaser.com>
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp>
 <201011111718.08207.eckhardt@satorlaser.com>
Message-ID: <4CEFE8FE.8060201@m2.ccsnet.ne.jp>

On 2010/11/12 1:18, Ulrich Eckhardt wrote:
>> # I recently did it for winsound.PlaySound with MvL's approval
>
> Interesting, is there a ticket associated with this? Also, was that on Python 3
> or 2? Which commits?

Sorry for the late posting. Rev 86300 and Issue 6317.

From status at bugs.python.org Fri Nov 26 18:07:01 2010
From: status at bugs.python.org (Python tracker)
Date: Fri, 26 Nov 2010 18:07:01 +0100 (CET)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20101126170701.EDA80104026@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2010-11-19 - 2010-11-26)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.
Issues counts and deltas: open 2533 (-16) closed 19792 (+98) total 22325 (+82) Open issues with patches: 1083 Issues opened (66) ================== #1178: IDLE - add "paste code" functionality http://bugs.python.org/issue1178 reopened by ned.deily #3709: BaseHTTPRequestHandler innefficient when sending HTTP header http://bugs.python.org/issue3709 reopened by r.david.murray #5150: IDLE to support reindent.py http://bugs.python.org/issue5150 reopened by rhettinger #8879: Implement os.link on Windows http://bugs.python.org/issue8879 reopened by amaury.forgeotdarc #9769: PyUnicode_FromFormatV() doesn't handle non-ascii text correctl http://bugs.python.org/issue9769 reopened by belopolsky #10220: Make generator state easier to introspect http://bugs.python.org/issue10220 reopened by ncoghlan #10268: Add --enable-loadable-sqlite-extensions option to `configure` http://bugs.python.org/issue10268 reopened by ned.deily #10441: some stdlib modules need to be updated to handle SSL certifica http://bugs.python.org/issue10441 reopened by pitrou #10453: Add -h/--help option to compileall http://bugs.python.org/issue10453 reopened by eric.araujo #10464: netrc module not parsing passwords containing #s. 
http://bugs.python.org/issue10464 opened by the_isz #10466: locale.py resetlocale throws exception on Windows (getdefaultl http://bugs.python.org/issue10466 opened by skoczian #10469: test_socket fails using Visual Studio 2010 http://bugs.python.org/issue10469 opened by Kotan #10475: hardcoded compilers for LDSHARED/LDCXXSHARED on NetBSD http://bugs.python.org/issue10475 opened by njoly #10478: Ctrl-C locks up the interpreter http://bugs.python.org/issue10478 opened by isandler #10479: cgitb.py should assume a binary stream for output http://bugs.python.org/issue10479 opened by v+python #10480: cgi.py should document the need for binary stdin/stdout http://bugs.python.org/issue10480 opened by v+python #10481: subprocess PIPEs are byte streams http://bugs.python.org/issue10481 opened by v+python #10482: subprocess and deadlock avoidance http://bugs.python.org/issue10482 opened by v+python #10483: http.server - what is executable on Windows http://bugs.python.org/issue10483 opened by v+python #10484: http.server.is_cgi fails to handle CGI URLs containing PATH_IN http://bugs.python.org/issue10484 opened by v+python #10485: http.server fails when query string contains addition '?' char http://bugs.python.org/issue10485 opened by v+python #10486: http.server doesn't set all CGI environment variables http://bugs.python.org/issue10486 opened by v+python #10487: http.server - doesn't process Status: header from CGI scripts http://bugs.python.org/issue10487 opened by v+python #10492: test_doctest fails with iso-8859-15 locale http://bugs.python.org/issue10492 opened by pitrou #10494: Demo/comparisons/regextest.py needs some usage information. http://bugs.python.org/issue10494 opened by ramiroluz #10495: Demo/comparisons/sortingtest.py needs some usage information. 
http://bugs.python.org/issue10495 opened by ramiroluz #10496: "import site failed" when Python can't find home directory http://bugs.python.org/issue10496 opened by bbi5291 #10497: Incorrect use of gettext in argparse http://bugs.python.org/issue10497 opened by eric.araujo #10498: calendar.LocaleHTMLCalendar.formatyearpage() results in traceb http://bugs.python.org/issue10498 opened by r.david.murray #10499: Modular interpolation in configparser http://bugs.python.org/issue10499 opened by lukasz.langa #10500: Palevo.DZ worm msix86 installer 3.x installer http://bugs.python.org/issue10500 opened by VilIgnoble #10502: Add unittestguirunner to Tools/ http://bugs.python.org/issue10502 opened by michael.foord #10503: os.getuid() documentation should be clear on what kind of uid http://bugs.python.org/issue10503 opened by giampaolo.rodola #10504: Trivial mingw compile fixes http://bugs.python.org/issue10504 opened by jonny #10507: Check well-formedness of reST markup within "make patchcheck" http://bugs.python.org/issue10507 opened by dmalcolm #10509: PyTokenizer_FindEncoding can lead to a segfault if bad charact http://bugs.python.org/issue10509 opened by Trundle #10510: distutils upload/register should use CRLF in HTTP requests http://bugs.python.org/issue10510 opened by Brian.Jones #10512: regrtest ResourceWarning - unclosed sockets and files http://bugs.python.org/issue10512 opened by nvawda #10513: sqlite3.InterfaceError after commit http://bugs.python.org/issue10513 opened by anders.blomdell at control.lth.se #10514: configure does not create accurate Makefile http://bugs.python.org/issue10514 opened by daelious #10515: csv sniffer does not recognize quotes at the end of line http://bugs.python.org/issue10515 opened by Martin.Budaj #10516: Add list.clear() and list.copy() http://bugs.python.org/issue10516 opened by terry.reedy #10517: test_concurrent_futures crashes with "Fatal Python error: Inva http://bugs.python.org/issue10517 opened by lukasz.langa #10518: 
Bring back callable() http://bugs.python.org/issue10518 opened by pitrou #10519: setobject.c no-op typo http://bugs.python.org/issue10519 opened by arigo #10521: str methods don't accept non-BMP fillchar on a narrow Unicode http://bugs.python.org/issue10521 opened by belopolsky #10522: test_telnet exception http://bugs.python.org/issue10522 opened by pitrou #10523: argparse has problem parsing option files containing empty row http://bugs.python.org/issue10523 opened by Michal.Pomorski #10524: Patch to add Pardus to supported dists in platform http://bugs.python.org/issue10524 opened by zaburt #10527: multiprocessing.Pipe problem: "handle out of range in select() http://bugs.python.org/issue10527 opened by synapse #10528: argparse uses %s in gettext calls http://bugs.python.org/issue10528 opened by eric.araujo #10529: Write argparse i18n howto http://bugs.python.org/issue10529 opened by eric.araujo #10530: distutils2 should allow the installing of python files with in http://bugs.python.org/issue10530 opened by michael.foord #10531: write tilted text in turtle http://bugs.python.org/issue10531 opened by lanyjie #10532: A bug related to matching the empty string http://bugs.python.org/issue10532 opened by lanyjie #10533: Need example of using __missing__ http://bugs.python.org/issue10533 opened by lukasz.langa #10534: difflib.SequenceMatcher: expose junk sets, deprecate undocumen http://bugs.python.org/issue10534 opened by terry.reedy #10535: Enable warnings by default in unittest http://bugs.python.org/issue10535 opened by ezio.melotti #10536: Enhancements to gettext docs http://bugs.python.org/issue10536 opened by eric.araujo #10537: IDLE crashes when you paste something. 
http://bugs.python.org/issue10537 opened by 5ragar5 #10538: PyArg_ParseTuple("s*") does not always incref object http://bugs.python.org/issue10538 opened by krisvale #10539: Regular expression not checking 'range' element on 1st char in http://bugs.python.org/issue10539 opened by TxRxFx #10540: test_shutil fails on Windows after r86733 http://bugs.python.org/issue10540 opened by brian.curtin #10541: regrtest.py -T broken http://bugs.python.org/issue10541 opened by doerwalter #10542: Py_UNICODE_NEXT and other macros for surrogates http://bugs.python.org/issue10542 opened by belopolsky #10543: Test discovery (unittest) does not work with jython http://bugs.python.org/issue10543 opened by michael.foord Most recent 15 issues with no replies (15) ========================================== #10543: Test discovery (unittest) does not work with jython http://bugs.python.org/issue10543 #10542: Py_UNICODE_NEXT and other macros for surrogates http://bugs.python.org/issue10542 #10541: regrtest.py -T broken http://bugs.python.org/issue10541 #10539: Regular expression not checking 'range' element on 1st char in http://bugs.python.org/issue10539 #10538: PyArg_ParseTuple("s*") does not always incref object http://bugs.python.org/issue10538 #10537: IDLE crashes when you paste something. 
http://bugs.python.org/issue10537 #10536: Enhancements to gettext docs http://bugs.python.org/issue10536 #10534: difflib.SequenceMatcher: expose junk sets, deprecate undocumen http://bugs.python.org/issue10534 #10531: write tilted text in turtle http://bugs.python.org/issue10531 #10530: distutils2 should allow the installing of python files with in http://bugs.python.org/issue10530 #10523: argparse has problem parsing option files containing empty row http://bugs.python.org/issue10523 #10522: test_telnet exception http://bugs.python.org/issue10522 #10514: configure does not create accurate Makefile http://bugs.python.org/issue10514 #10507: Check well-formedness of reST markup within "make patchcheck" http://bugs.python.org/issue10507 #10499: Modular interpolation in configparser http://bugs.python.org/issue10499 Most recent 15 issues waiting for review (15) ============================================= #10542: Py_UNICODE_NEXT and other macros for surrogates http://bugs.python.org/issue10542 #10540: test_shutil fails on Windows after r86733 http://bugs.python.org/issue10540 #10536: Enhancements to gettext docs http://bugs.python.org/issue10536 #10535: Enable warnings by default in unittest http://bugs.python.org/issue10535 #10527: multiprocessing.Pipe problem: "handle out of range in select() http://bugs.python.org/issue10527 #10524: Patch to add Pardus to supported dists in platform http://bugs.python.org/issue10524 #10521: str methods don't accept non-BMP fillchar on a narrow Unicode http://bugs.python.org/issue10521 #10518: Bring back callable() http://bugs.python.org/issue10518 #10515: csv sniffer does not recognize quotes at the end of line http://bugs.python.org/issue10515 #10512: regrtest ResourceWarning - unclosed sockets and files http://bugs.python.org/issue10512 #10509: PyTokenizer_FindEncoding can lead to a segfault if bad charact http://bugs.python.org/issue10509 #10504: Trivial mingw compile fixes http://bugs.python.org/issue10504 #10499: Modular 
interpolation in configparser http://bugs.python.org/issue10499 #10498: calendar.LocaleHTMLCalendar.formatyearpage() results in traceb http://bugs.python.org/issue10498 #10497: Incorrect use of gettext in argparse http://bugs.python.org/issue10497 Top 10 most discussed issues (10) ================================= #10461: Use with statement throughout the docs http://bugs.python.org/issue10461 27 msgs #7995: On Mac / BSD sockets returned by accept inherit the parent's F http://bugs.python.org/issue7995 24 msgs #10453: Add -h/--help option to compileall http://bugs.python.org/issue10453 24 msgs #9915: speeding up sorting with a key http://bugs.python.org/issue9915 14 msgs #9742: Python 2.7: math module fails to build on Solaris 9 http://bugs.python.org/issue9742 13 msgs #10533: Need example of using __missing__ http://bugs.python.org/issue10533 13 msgs #9509: argparse FileType raises ugly exception for missing file http://bugs.python.org/issue9509 12 msgs #10469: test_socket fails using Visual Studio 2010 http://bugs.python.org/issue10469 12 msgs #10504: Trivial mingw compile fixes http://bugs.python.org/issue10504 12 msgs #10518: Bring back callable() http://bugs.python.org/issue10518 12 msgs Issues closed (92) ================== #2244: urllib and urllib2 decode userinfo multiple times http://bugs.python.org/issue2244 closed by orsenthil #2986: difflib.SequenceMatcher not matching long sequences http://bugs.python.org/issue2986 closed by terry.reedy #3292: Position index limit; s.insert(i,x) not same as s[i:i]=[x] http://bugs.python.org/issue3292 closed by rhettinger #4493: urllib2 doesn't always supply / where URI path component is em http://bugs.python.org/issue4493 closed by orsenthil #4925: Improve error message of subprocess when cannot open http://bugs.python.org/issue4925 closed by benjamin.peterson #5353: Improve IndexError messages with actual values http://bugs.python.org/issue5353 closed by rhettinger #5412: extend configparser to support mapping 
access(__*item__) http://bugs.python.org/issue5412 closed by lukasz.langa #5616: Distutils 2to3 support doesn't have the doctest_only flag. http://bugs.python.org/issue5616 closed by eric.araujo #6166: encoding error for 'setup.py --author' when read via subproces http://bugs.python.org/issue6166 closed by eric.araujo #6378: Patch to make 'idle.bat' run idle.pyw using appropriate Python http://bugs.python.org/issue6378 closed by brian.curtin #6466: duplicate get_version() code between cygwinccompiler and emxcc http://bugs.python.org/issue6466 closed by eric.araujo #6722: collections.namedtuple: confusing example http://bugs.python.org/issue6722 closed by rhettinger #6799: mimetypes does not give canonical extension for guess_extensio http://bugs.python.org/issue6799 closed by eric.araujo #6878: changed return type from tkinter.Canvas.coords http://bugs.python.org/issue6878 closed by belopolsky #7212: Retrieve an arbitrary element from a set without removing it http://bugs.python.org/issue7212 closed by rhettinger #7226: IDLE right-clicks don't work on Mac OS 10.5 http://bugs.python.org/issue7226 closed by ned.deily #7257: Improve documentation of list.sort and sorted() http://bugs.python.org/issue7257 closed by rhettinger #7645: test_distutils fails on Windows XP http://bugs.python.org/issue7645 closed by brian.curtin #7770: sin/cos function in decimal-docs http://bugs.python.org/issue7770 closed by rhettinger #7804: test_readline failure http://bugs.python.org/issue7804 closed by pitrou #8078: add more baud constants to termios http://bugs.python.org/issue8078 closed by pitrou #8340: bytearray undocumented on trunk http://bugs.python.org/issue8340 closed by pitrou #8381: IDLE 2.6 freezes on OS X 10.6 http://bugs.python.org/issue8381 closed by ned.deily #8569: Upgrade OpenSSL in Windows builds http://bugs.python.org/issue8569 closed by brian.curtin #8590: test_httpservers.CGIHTTPServerTestCase failure on 3.1-maint Ma http://bugs.python.org/issue8590 closed by 
michael.foord #8631: subprocess.Popen.communicate(...) hangs on Windows http://bugs.python.org/issue8631 closed by brian.curtin #8645: PyUnicode_AsEncodedObject is undocumented http://bugs.python.org/issue8645 closed by belopolsky #8646: PyUnicode_EncodeDecimal is undocumented http://bugs.python.org/issue8646 closed by belopolsky #8647: PyUnicode_GetMax is undocumented http://bugs.python.org/issue8647 closed by eric.araujo #8705: shutil.rmtree with empty filepath http://bugs.python.org/issue8705 closed by brian.curtin #8938: Mac OS dialogs(Save As..., Load) translation http://bugs.python.org/issue8938 closed by ned.deily #9222: IDLE: Fix open/saveas 'Files of type' choices http://bugs.python.org/issue9222 closed by terry.reedy #9500: urllib2: Content-Encoding http://bugs.python.org/issue9500 closed by r.david.murray #9732: Addition of getattr_static for inspect module http://bugs.python.org/issue9732 closed by michael.foord #9746: All sequence types support .index and .count http://bugs.python.org/issue9746 closed by eric.araujo #9802: Document 'stability' of builtin min() and max() http://bugs.python.org/issue9802 closed by rhettinger #9807: deriving configuration information for different builds with t http://bugs.python.org/issue9807 closed by barry #9846: ZipExtFile provides no mechanism for closing the underlying fi http://bugs.python.org/issue9846 closed by lukasz.langa #9852: test_ctypes fail with clang http://bugs.python.org/issue9852 closed by ned.deily #9876: ConfigParser can't interpolate values from other sections http://bugs.python.org/issue9876 closed by lukasz.langa #9965: Loading malicious pickle may cause excessive memory usage http://bugs.python.org/issue9965 closed by georg.brandl #10134: test_email failures on Windows: end of line issue? 
http://bugs.python.org/issue10134 closed by r.david.murray #10138: calendar module does not support years outside [1, 9999] range http://bugs.python.org/issue10138 closed by belopolsky #10164: Add an assertBytesEqual to unittest and use it for bytes asser http://bugs.python.org/issue10164 closed by rhettinger #10172: code block has no syntax coloring http://bugs.python.org/issue10172 closed by georg.brandl #10183: test_concurrent_futures failure on Windows http://bugs.python.org/issue10183 closed by bquinlan #10255: refleak in initstdio http://bugs.python.org/issue10255 closed by pitrou #10299: Add index with links section for built-in functions http://bugs.python.org/issue10299 closed by ezio.melotti #10319: SocketServer.TCPServer truncates responses on close (in some s http://bugs.python.org/issue10319 closed by orsenthil #10325: PY_LLONG_MAX & co - preprocessor constants or not? http://bugs.python.org/issue10325 closed by mark.dickinson #10366: Remove unneeded '(object)' from 3.x class examples http://bugs.python.org/issue10366 closed by eric.araujo #10371: Deprecate trace module undocumented API http://bugs.python.org/issue10371 closed by belopolsky #10377: cProfile incorrectly labels its output http://bugs.python.org/issue10377 closed by orsenthil #10391: obj2ast's error handling can lead to python crashing with a C- http://bugs.python.org/issue10391 closed by benjamin.peterson #10420: Document of Bdb.effective is wrong. http://bugs.python.org/issue10420 closed by georg.brandl #10430: _sha.sha().digest() method is endian-sensitive. 
and hexdigest( http://bugs.python.org/issue10430 closed by krisvale #10437: ThreadPoolExecutor should accept max_workers=None http://bugs.python.org/issue10437 closed by stutzbach #10439: PyCodec C API is not documented in reST http://bugs.python.org/issue10439 closed by georg.brandl #10448: Add Mako template benchmark to Python Benchmark Suite http://bugs.python.org/issue10448 closed by pitrou #10450: Fix markup in Misc/NEWS http://bugs.python.org/issue10450 closed by eric.araujo #10458: 2.7 += re.ASCII http://bugs.python.org/issue10458 closed by terry.reedy #10459: missing character names in unicodedata (CJK...) http://bugs.python.org/issue10459 closed by loewis #10460: Misc/indent.pro does not reflect PEP 7 http://bugs.python.org/issue10460 closed by georg.brandl #10462: Handler.close is not called in subclass while Logger.removeHan http://bugs.python.org/issue10462 closed by vinay.sajip #10463: Wrong return type for xml.etree.ElementTree.parse() http://bugs.python.org/issue10463 closed by tiwoc #10465: gzip module calls getattr incorrectly http://bugs.python.org/issue10465 closed by georg.brandl #10467: io.BytesIO.readinto() segfaults when used on BytesIO object se http://bugs.python.org/issue10467 closed by benjamin.peterson #10468: Document UnicodeError access functions http://bugs.python.org/issue10468 closed by georg.brandl #10470: python -m unittest ought to default to discovery http://bugs.python.org/issue10470 closed by michael.foord #10471: include documentation in python docs and under python -h for o http://bugs.python.org/issue10471 closed by georg.brandl #10472: Strange tab key behaviour in interactive python 2.7 OSX 10.6.2 http://bugs.python.org/issue10472 closed by ned.deily #10473: Strange behavior for socket.timeout http://bugs.python.org/issue10473 closed by ned.deily #10474: range.count returns boolean http://bugs.python.org/issue10474 closed by benjamin.peterson #10476: __iter__ on a byte file object using a method to return an ite 
http://bugs.python.org/issue10476 closed by benjamin.peterson #10477: AttributeError: 'NoneType' object has no attribute 'name' (bo http://bugs.python.org/issue10477 closed by eric.araujo #10488: Improve documentation for 'float' built-in. http://bugs.python.org/issue10488 closed by mark.dickinson #10489: configparser: remove broken `__name__` support http://bugs.python.org/issue10489 closed by lukasz.langa #10490: mimetypes read_windows_registry fails for non-ASCII keys http://bugs.python.org/issue10490 closed by r.david.murray #10491: Insecure Windows python directory permissions http://bugs.python.org/issue10491 closed by loewis #10493: test_strptime failures under OpenIndiana http://bugs.python.org/issue10493 closed by jcea #10501: make_buildinfo regression with unquoted path http://bugs.python.org/issue10501 closed by krisvale #10505: test_compileall: failure on Windows http://bugs.python.org/issue10505 closed by eric.araujo #10506: argparse execute system exit in python prompt http://bugs.python.org/issue10506 closed by r.david.murray #10508: compiler warnings about formatting pid_t as an int http://bugs.python.org/issue10508 closed by georg.brandl #10511: heapq docs clarification http://bugs.python.org/issue10511 closed by georg.brandl #10520: Build with --enable-shared fails http://bugs.python.org/issue10520 closed by barry #10525: Added mouse and colour support to Game of Life curses demo http://bugs.python.org/issue10525 closed by orsenthil #10526: Minor typo in What's New in Python 2.7 http://bugs.python.org/issue10526 closed by georg.brandl #10345: fcntl.ioctl always fails claiming an invalid fd http://bugs.python.org/issue10345 closed by ned.deily #1059244: distutil bdist hardcodes the python location http://bugs.python.org/issue1059244 closed by eric.araujo #1574217: isinstance swallows exceptions http://bugs.python.org/issue1574217 closed by r.david.murray #1699853: locale.getlocale() output fails as setlocale() input 
http://bugs.python.org/issue1699853 closed by r.david.murray

From fijall at gmail.com Fri Nov 26 19:23:45 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 26 Nov 2010 20:23:45 +0200
Subject: [Python-Dev] PyPy 1.4 released
Message-ID: 

===============================
PyPy 1.4: Ouroboros in practice
===============================

We're pleased to announce the 1.4 release of PyPy. This is a major breakthrough
in our long journey, as PyPy 1.4 is the first PyPy release that can translate
itself faster than CPython.  Starting today, we are using PyPy more for
our every-day development.  So may you :) You can download it here:

    http://pypy.org/download.html

What is PyPy
============

PyPy is a very compliant Python interpreter, almost a drop-in replacement
for CPython. It's fast (`pypy 1.4 and cpython 2.6`_ comparison)

Among its new features, this release includes numerous performance improvements
(which made fast self-hosting possible), a 64-bit JIT backend, as well
as serious stabilization. As of now, we can consider the 32-bit and 64-bit
linux versions of PyPy stable enough to run `in production`_.

Numerous speed achievements are described on `our blog`_. Normalized speed
charts comparing `pypy 1.4 and pypy 1.3`_ as well as `pypy 1.4 and cpython 2.6`_
are available on benchmark website. For the impatient: yes, we got a lot faster!

More highlights
===============

* PyPy's built-in Just-in-Time compiler is fully transparent and
  automatically generated; it now also has very reasonable memory
  requirements.  The total memory used by a very complex and
  long-running process (translating PyPy itself) is within 1.5x to
  at most 2x the memory needed by CPython, for a speed-up of 2x.

* More compact instances.  All instances are as compact as if
  they had ``__slots__``.  This can give programs a big gain in
  memory.  (In the example of translation above, we already have
  carefully placed ``__slots__``, so there is no extra win.)
* `Virtualenv support`_: now PyPy is fully compatible with virtualenv_: note that to use it, you need a recent version of virtualenv (>= 1.5). * Faster (and JITted) regular expressions - huge boost in speeding up the `re` module. * Other speed improvements, like JITted calls to functions like map(). .. _virtualenv: http://pypi.python.org/pypi/virtualenv .. _`Virtualenv support`: http://morepypy.blogspot.com/2010/08/using-virtualenv-with-pypy.html .. _`in production`: http://morepypy.blogspot.com/2010/11/running-large-radio-telescope-software.html .. _`our blog`: http://morepypy.blogspot.com .. _`pypy 1.4 and pypy 1.3`: http://speed.pypy.org/comparison/?exe=1%2B41,1%2B172&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20&env=1&hor=false&bas=1%2B41&chart=normal+bars .. _`pypy 1.4 and cpython 2.6`: http://speed.pypy.org/comparison/?exe=2%2B35,1%2B172&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20&env=1&hor=false&bas=2%2B35&chart=normal+bars Cheers, Carl Friedrich Bolz, Antonio Cuni, Maciej Fijalkowski, Amaury Forgeot d'Arc, Armin Rigo and the PyPy team From reid.kleckner at gmail.com Fri Nov 26 19:33:54 2010 From: reid.kleckner at gmail.com (Reid Kleckner) Date: Fri, 26 Nov 2010 13:33:54 -0500 Subject: [Python-Dev] PyPy 1.4 released In-Reply-To: References: Message-ID: Congratulations! Excellent work. Reid On Fri, Nov 26, 2010 at 1:23 PM, Maciej Fijalkowski wrote: > =============================== > PyPy 1.4: Ouroboros in practice > =============================== > > We're pleased to announce the 1.4 release of PyPy. This is a major breakthrough > in our long journey, as PyPy 1.4 is the first PyPy release that can translate > itself faster than CPython. ?Starting today, we are using PyPy more for > our every-day development. ?So may you :) You can download it here: > > ? ?http://pypy.org/download.html > > What is PyPy > ============ > > PyPy is a very compliant Python interpreter, almost a drop-in replacement > for CPython. 
It's fast (`pypy 1.4 and cpython 2.6`_ comparison) > > Among its new features, this release includes numerous performance improvements > (which made fast self-hosting possible), a 64-bit JIT backend, as well > as serious stabilization. As of now, we can consider the 32-bit and 64-bit > linux versions of PyPy stable enough to run `in production`_. > > Numerous speed achievements are described on `our blog`_. Normalized speed > charts comparing `pypy 1.4 and pypy 1.3`_ as well as `pypy 1.4 and cpython 2.6`_ > are available on benchmark website. For the impatient: yes, we got a lot faster! > > More highlights > =============== > > * PyPy's built-in Just-in-Time compiler is fully transparent and > ?automatically generated; it now also has very reasonable memory > ?requirements. ?The total memory used by a very complex and > ?long-running process (translating PyPy itself) is within 1.5x to > ?at most 2x the memory needed by CPython, for a speed-up of 2x. > > * More compact instances. ?All instances are as compact as if > ?they had ``__slots__``. ?This can give programs a big gain in > ?memory. ?(In the example of translation above, we already have > ?carefully placed ``__slots__``, so there is no extra win.) > > * `Virtualenv support`_: now PyPy is fully compatible with > virtualenv_: note that > ?to use it, you need a recent version of virtualenv (>= 1.5). > > * Faster (and JITted) regular expressions - huge boost in speeding up > ?the `re` module. > > * Other speed improvements, like JITted calls to functions like map(). > > .. _virtualenv: http://pypi.python.org/pypi/virtualenv > .. _`Virtualenv support`: > http://morepypy.blogspot.com/2010/08/using-virtualenv-with-pypy.html > .. _`in production`: > http://morepypy.blogspot.com/2010/11/running-large-radio-telescope-software.html > .. _`our blog`: http://morepypy.blogspot.com > .. 
_`pypy 1.4 and pypy 1.3`: > http://speed.pypy.org/comparison/?exe=1%2B41,1%2B172&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20&env=1&hor=false&bas=1%2B41&chart=normal+bars > .. _`pypy 1.4 and cpython 2.6`: > http://speed.pypy.org/comparison/?exe=2%2B35,1%2B172&ben=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20&env=1&hor=false&bas=2%2B35&chart=normal+bars > > Cheers, > > Carl Friedrich Bolz, Antonio Cuni, Maciej Fijalkowski, > Amaury Forgeot d'Arc, Armin Rigo and the PyPy team > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/reid.kleckner%40gmail.com > From brian.curtin at gmail.com Fri Nov 26 19:52:22 2010 From: brian.curtin at gmail.com (Brian Curtin) Date: Fri, 26 Nov 2010 12:52:22 -0600 Subject: [Python-Dev] [Python-checkins] r86817 - python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py In-Reply-To: <20101126184428.E04A0EE984@mail.python.org> References: <20101126184428.E04A0EE984@mail.python.org> Message-ID: On Fri, Nov 26, 2010 at 12:44, hirokazu.yamamoto wrote: > Author: hirokazu.yamamoto > Date: Fri Nov 26 19:44:28 2010 > New Revision: 86817 > > Log: > Now can reproduce the error on AMD64 Windows Server 2008 > even where os.symlink is not supported. 
> > > Modified: > python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py > > Modified: python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py > > ============================================================================== > --- python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py > (original) > +++ python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py Fri > Nov 26 19:44:28 2010 > @@ -271,24 +271,32 @@ > shutil.rmtree(src_dir) > shutil.rmtree(os.path.dirname(dst_dir)) > > - @support.skip_unless_symlink > + @unittest.skipUnless(hasattr(os, 'link'), 'requires os.link') > def test_dont_copy_file_onto_link_to_itself(self): > # bug 851123. > os.mkdir(TESTFN) > src = os.path.join(TESTFN, 'cheese') > dst = os.path.join(TESTFN, 'shop') > try: > - f = open(src, 'w') > - f.write('cheddar') > - f.close() > - > - if hasattr(os, "link"): > - os.link(src, dst) > - self.assertRaises(shutil.Error, shutil.copyfile, src, dst) > - with open(src, 'r') as f: > - self.assertEqual(f.read(), 'cheddar') > - os.remove(dst) > + with open(src, 'w') as f: > + f.write('cheddar') > + os.link(src, dst) > + self.assertRaises(shutil.Error, shutil.copyfile, src, dst) > + with open(src, 'r') as f: > + self.assertEqual(f.read(), 'cheddar') > + os.remove(dst) > + finally: > + shutil.rmtree(TESTFN, ignore_errors=True) > > + @support.skip_unless_symlink > + def test_dont_copy_file_onto_symlink_to_itself(self): > + # bug 851123. > + os.mkdir(TESTFN) > + src = os.path.join(TESTFN, 'cheese') > + dst = os.path.join(TESTFN, 'shop') > + try: > + with open(src, 'w') as f: > + f.write('cheddar') > # Using `src` here would mean we end up with a symlink pointing > # to TESTFN/TESTFN/cheese, while it should point at > # TESTFN/cheese. 
> @@ -298,10 +306,7 @@ > self.assertEqual(f.read(), 'cheddar') > os.remove(dst) > finally: > - try: > - shutil.rmtree(TESTFN) > - except OSError: > - pass > + shutil.rmtree(TESTFN, ignore_errors=True) > > @support.skip_unless_symlink > def test_rmtree_on_symlink(self): You might be working on something slightly different, but I have an issue created for the failure of that test: http://bugs.python.org/issue10540 It slipped past me because I was only running the test suite as a regular user without the required symlink privilege, so the test was skipped. That Server 2008 build slave runs the test suite as administrator, so it was running that test and going into the os.link block, which it didn't do until r86733. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ocean-city at m2.ccsnet.ne.jp Fri Nov 26 20:45:18 2010 From: ocean-city at m2.ccsnet.ne.jp (Hirokazu Yamamoto) Date: Sat, 27 Nov 2010 04:45:18 +0900 Subject: [Python-Dev] [Python-checkins] r86817 - python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py In-Reply-To: References: <20101126184428.E04A0EE984@mail.python.org> Message-ID: <4CF00E4E.6030507@m2.ccsnet.ne.jp> On 2010/11/27 3:52, Brian Curtin wrote: > On Fri, Nov 26, 2010 at 12:44, hirokazu.yamamoto> wrote: > >> Author: hirokazu.yamamoto >> Date: Fri Nov 26 19:44:28 2010 >> New Revision: 86817 >> >> Log: >> Now can reproduce the error on AMD64 Windows Server 2008 >> even where os.symlink is not supported. 
>> >> >> Modified: >> python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py >> >> Modified: python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py >> >> ============================================================================== >> --- python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py >> (original) >> +++ python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py Fri >> Nov 26 19:44:28 2010 >> @@ -271,24 +271,32 @@ >> shutil.rmtree(src_dir) >> shutil.rmtree(os.path.dirname(dst_dir)) >> >> - @support.skip_unless_symlink >> + @unittest.skipUnless(hasattr(os, 'link'), 'requires os.link') >> def test_dont_copy_file_onto_link_to_itself(self): >> # bug 851123. >> os.mkdir(TESTFN) >> src = os.path.join(TESTFN, 'cheese') >> dst = os.path.join(TESTFN, 'shop') >> try: >> - f = open(src, 'w') >> - f.write('cheddar') >> - f.close() >> - >> - if hasattr(os, "link"): >> - os.link(src, dst) >> - self.assertRaises(shutil.Error, shutil.copyfile, src, dst) >> - with open(src, 'r') as f: >> - self.assertEqual(f.read(), 'cheddar') >> - os.remove(dst) >> + with open(src, 'w') as f: >> + f.write('cheddar') >> + os.link(src, dst) >> + self.assertRaises(shutil.Error, shutil.copyfile, src, dst) >> + with open(src, 'r') as f: >> + self.assertEqual(f.read(), 'cheddar') >> + os.remove(dst) >> + finally: >> + shutil.rmtree(TESTFN, ignore_errors=True) >> >> + @support.skip_unless_symlink >> + def test_dont_copy_file_onto_symlink_to_itself(self): >> + # bug 851123. >> + os.mkdir(TESTFN) >> + src = os.path.join(TESTFN, 'cheese') >> + dst = os.path.join(TESTFN, 'shop') >> + try: >> + with open(src, 'w') as f: >> + f.write('cheddar') >> # Using `src` here would mean we end up with a symlink pointing >> # to TESTFN/TESTFN/cheese, while it should point at >> # TESTFN/cheese. 
>> @@ -298,10 +306,7 @@ >> self.assertEqual(f.read(), 'cheddar') >> os.remove(dst) >> finally: >> - try: >> - shutil.rmtree(TESTFN) >> - except OSError: >> - pass >> + shutil.rmtree(TESTFN, ignore_errors=True) >> >> @support.skip_unless_symlink >> def test_rmtree_on_symlink(self): > > > You might be working on something slightly different, but I have an issue > created for the failure of that test: http://bugs.python.org/issue10540 > > It slipped past me because I was only running the test suite as a regular > user without the required symlink privilege, so the test was skipped. That > Server 2008 build slave runs the test suite as administrator, so it was > running that test and going into the os.link block, which it didn't do until > r86733. I'm not sure, but why does os.path.samefile return False for hard link on windows? MSDN says, > A hard link is the file system representation of a file by which more > than one path references a single file in the same volume. (http://msdn.microsoft.com/en-us/library/aa365006%28VS.85%29.aspx) I know st_ino on windows is a bit different from POSIX, so, just I'm not sure. ;-) From brian.curtin at gmail.com Fri Nov 26 21:02:29 2010 From: brian.curtin at gmail.com (Brian Curtin) Date: Fri, 26 Nov 2010 14:02:29 -0600 Subject: [Python-Dev] [Python-checkins] r86817 - python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py In-Reply-To: <4CF00E4E.6030507@m2.ccsnet.ne.jp> References: <20101126184428.E04A0EE984@mail.python.org> <4CF00E4E.6030507@m2.ccsnet.ne.jp> Message-ID: On Fri, Nov 26, 2010 at 13:45, Hirokazu Yamamoto wrote: > On 2010/11/27 3:52, Brian Curtin wrote: > >> On Fri, Nov 26, 2010 at 12:44, hirokazu.yamamoto< >> python-checkins at python.org >> >>> wrote: >>> >> >> Author: hirokazu.yamamoto >>> Date: Fri Nov 26 19:44:28 2010 >>> New Revision: 86817 >>> >>> Log: >>> Now can reproduce the error on AMD64 Windows Server 2008 >>> even where os.symlink is not supported. 
>>> >>> >>> Modified: >>> python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py >>> >>> Modified: python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py >>> >>> >>> ============================================================================== >>> --- python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py >>> (original) >>> +++ python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py >>> Fri >>> Nov 26 19:44:28 2010 >>> @@ -271,24 +271,32 @@ >>> shutil.rmtree(src_dir) >>> shutil.rmtree(os.path.dirname(dst_dir)) >>> >>> - @support.skip_unless_symlink >>> + @unittest.skipUnless(hasattr(os, 'link'), 'requires os.link') >>> def test_dont_copy_file_onto_link_to_itself(self): >>> # bug 851123. >>> os.mkdir(TESTFN) >>> src = os.path.join(TESTFN, 'cheese') >>> dst = os.path.join(TESTFN, 'shop') >>> try: >>> - f = open(src, 'w') >>> - f.write('cheddar') >>> - f.close() >>> - >>> - if hasattr(os, "link"): >>> - os.link(src, dst) >>> - self.assertRaises(shutil.Error, shutil.copyfile, src, >>> dst) >>> - with open(src, 'r') as f: >>> - self.assertEqual(f.read(), 'cheddar') >>> - os.remove(dst) >>> + with open(src, 'w') as f: >>> + f.write('cheddar') >>> + os.link(src, dst) >>> + self.assertRaises(shutil.Error, shutil.copyfile, src, dst) >>> + with open(src, 'r') as f: >>> + self.assertEqual(f.read(), 'cheddar') >>> + os.remove(dst) >>> + finally: >>> + shutil.rmtree(TESTFN, ignore_errors=True) >>> >>> + @support.skip_unless_symlink >>> + def test_dont_copy_file_onto_symlink_to_itself(self): >>> + # bug 851123. >>> + os.mkdir(TESTFN) >>> + src = os.path.join(TESTFN, 'cheese') >>> + dst = os.path.join(TESTFN, 'shop') >>> + try: >>> + with open(src, 'w') as f: >>> + f.write('cheddar') >>> # Using `src` here would mean we end up with a symlink >>> pointing >>> # to TESTFN/TESTFN/cheese, while it should point at >>> # TESTFN/cheese. 
>>> @@ -298,10 +306,7 @@ >>> self.assertEqual(f.read(), 'cheddar') >>> os.remove(dst) >>> finally: >>> - try: >>> - shutil.rmtree(TESTFN) >>> - except OSError: >>> - pass >>> + shutil.rmtree(TESTFN, ignore_errors=True) >>> >>> @support.skip_unless_symlink >>> def test_rmtree_on_symlink(self): >>> >> >> >> You might be working on something slightly different, but I have an issue >> created for the failure of that test: http://bugs.python.org/issue10540 >> >> It slipped past me because I was only running the test suite as a regular >> user without the required symlink privilege, so the test was skipped. That >> Server 2008 build slave runs the test suite as administrator, so it was >> running that test and going into the os.link block, which it didn't do >> until >> r86733. >> > > I'm not sure, but why does os.path.samefile return False for hard link > on windows? MSDN says, > > > A hard link is the file system representation of a file by which more > > than one path references a single file in the same volume. > (http://msdn.microsoft.com/en-us/library/aa365006%28VS.85%29.aspx) > > I know st_ino on windows is a bit different from POSIX, so, just I'm not > sure. ;-) The samefile thing, I don't know either. GetFinalPathNameByHandle does not appear to work with hard links, at least how it's being used right now. It has no problem with symlinks. We briefly chatted about this on the os.link feature issue, but I never found a way around it. I'll look into it this weekend. -------------- next part -------------- An HTML attachment was scrubbed... 
From ocean-city at m2.ccsnet.ne.jp Fri Nov 26 21:18:58 2010
From: ocean-city at m2.ccsnet.ne.jp (Hirokazu Yamamoto)
Date: Sat, 27 Nov 2010 05:18:58 +0900
Subject: [Python-Dev] [Python-checkins] r86817 - python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py
In-Reply-To:
References: <20101126184428.E04A0EE984@mail.python.org> <4CF00E4E.6030507@m2.ccsnet.ne.jp>
Message-ID: <4CF01632.8070504@m2.ccsnet.ne.jp>

On 2010/11/27 5:02, Brian Curtin wrote:
> We briefly chatted about this on the os.link
> feature issue, but I never found a way around it.

How about implementing os.path.samefile in
Modules/posixmodule.c like this?

http://bugs.python.org/file19262/py3k_fix_kill_python_for_short_path.patch

# I hope this works.

From brian.curtin at gmail.com Fri Nov 26 21:31:49 2010
From: brian.curtin at gmail.com (Brian Curtin)
Date: Fri, 26 Nov 2010 14:31:49 -0600
Subject: [Python-Dev] [Python-checkins] r86817 - python/branches/py3k-stat-on-windows/Lib/test/test_shutil.py
In-Reply-To: <4CF01632.8070504@m2.ccsnet.ne.jp>
References: <20101126184428.E04A0EE984@mail.python.org> <4CF00E4E.6030507@m2.ccsnet.ne.jp> <4CF01632.8070504@m2.ccsnet.ne.jp>
Message-ID:

On Fri, Nov 26, 2010 at 14:18, Hirokazu Yamamoto wrote:

> On 2010/11/27 5:02, Brian Curtin wrote:
>
>> We briefly chatted about this on the os.link
>> feature issue, but I never found a way around it.
>>
>
> How about implementing os.path.samefile in
> Modules/posixmodule.c like this?
>
> http://bugs.python.org/file19262/py3k_fix_kill_python_for_short_path.patch
>
> # I hope this works.
>

That's almost identical to what the current os.path.sameopenfile is.

Lib/ntpath.py opens both files, then compares them via
_getfileinformation. That function is implemented to take in a file
descriptor, call GetFileInformationByHandle with it, then return a
tuple of dwVolumeSerialNumber, nFileIndexHigh, and nFileIndexLow.
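The comparison described above (volume serial number plus file index) is the Windows counterpart of comparing device and inode numbers. A minimal sketch of that idea follows, using only os.stat. The names file_identity, same_file and demo are illustrative and do not come from ntpath.py or posixmodule.c, and the sketch assumes a platform where os.link is available and st_dev/st_ino are meaningful.

```python
import os
import tempfile


def file_identity(path):
    """Return a (device, inode) pair that identifies the underlying file."""
    st = os.stat(path)  # follows symlinks, as os.path.samefile does
    return (st.st_dev, st.st_ino)


def same_file(path_a, path_b):
    """True if both paths name the same file on the same volume."""
    return file_identity(path_a) == file_identity(path_b)


def demo():
    """Compare a file with a hard link to it and with an unrelated file."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "cheese")
        dst = os.path.join(tmp, "shop")
        other = os.path.join(tmp, "spam")
        for name in (src, other):
            with open(name, "w") as f:
                f.write("cheddar")
        os.link(src, dst)  # hard link: a second name for the same file
        return same_file(src, dst), same_file(src, other)
```

Calling demo() is expected to report the hard link as the same file and the unrelated file as different, which is the property test_dont_copy_file_onto_link_to_itself relies on.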
From martin at v.loewis.de Fri Nov 26 21:39:36 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Fri, 26 Nov 2010 21:39:36 +0100
Subject: [Python-Dev] Removal of Win32 ANSI API
In-Reply-To: <4CEFE16E.6040801@m2.ccsnet.ne.jp>
References: <4CDC14C0.6070300@m2.ccsnet.ne.jp> <201011121308.30368.victor.stinner@haypocalc.com> <4CDEBB11.5050209@m2.ccsnet.ne.jp> <201011140106.55153.victor.stinner@haypocalc.com> <4CEFE16E.6040801@m2.ccsnet.ne.jp>
Message-ID: <4CF01B08.9000409@v.loewis.de>

> Is it possible a conversion from bytes (ANSI) to unicode fails on
> windows? It should fail sometimes, right?

Not for windows-1252, but certainly for shift-jis (you know better than
me). It seems that whether MultiByteToWideChar will fail depends on
whether MB_ERR_INVALID_CHARS is given or not. I don't know what it will
do if this flag is not given; my guess is that it fills in REPLACEMENT
CHARACTER.

> If not, is it allowed to convert to unicode with
> PyUnicode_FSDecoder if function doesn't return str? For example,
> os.stat() takes str as arguments but doesn't return str.

This I don't understand. os.stat doesn't return text at all - so what
do you want to convert?

> # I noticed win_readlink() in Modules/posixmodule.c already unicode
> # only. Maybe not so much problem? ;-)

Well, readlink is new on Windows, and symlinks are not widespread. So
there is no backwards compatibility concern here.

Regards,
Martin

From ncoghlan at gmail.com Sat Nov 27 08:35:52 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 27 Nov 2010 17:35:52 +1000
Subject: [Python-Dev] [Python-checkins] r86720 - python/branches/py3k/Misc/ACKS
In-Reply-To:
References: <20101123203252.39BE7EE9CF@mail.python.org> <4CEC43A4.80907@netwok.org> <4CEC4917.2070508@udel.edu>
Message-ID:

On Thu, Nov 25, 2010 at 5:25 AM, Terry Reedy wrote:
> I know now that I could always edit with IDLE's editor, but it is a lot
> easier to right click and select edit than it is to run thru the directory
> tree in an open dialog.

If you want a decent free text editor on Windows, the open source
Notepad++ does a very nice job. It also adds an "Edit with Notepad++"
to the explorer context menu :)

> And of course, since the pseudo-BOM addition is
> undocumented within notepad itself, and probably other editors, it is easy
> to not know.

As far as the implicit BOM addition itself goes, reindent.py and
reindent-rst.py could probably be updated to check for it, but the
miscellaneous files (like ACKS) are likely to continue to need manual
checks.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From stephen at xemacs.org Sat Nov 27 09:48:52 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 27 Nov 2010 17:48:52 +0900
Subject: [Python-Dev] len(chr(i)) = 2?
In-Reply-To: References: <201011192123.14169.victor.stinner@haypocalc.com> <4CE6F93F.9010109@egenix.com> <4CE6FE30.1050903@v.loewis.de> <87hbfc1vnf.fsf@uwakimon.sk.tsukuba.ac.jp> <4CE78F62.7060707@v.loewis.de> <8739qukf9r.fsf@uwakimon.sk.tsukuba.ac.jp> <20101121173825.B1BFB235977@kimball.webabinitio.net> <60F8726F-C1C2-4803-8B8E-688EF0443FA0@gmail.com> <87eiadd46t.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEC5316.4010608@canterbury.ac.nz> <77AAC178-F868-4F05-8509-4A9FB66F61EC@fuhm.net> <87sjyrbftz.fsf@uwakimon.sk.tsukuba.ac.jp> <635C265A-90A8-4B92-A65C-59EF3E8EFD68@twistedmatrix.com> <87oc9fb97b.fsf@uwakimon.sk.tsukuba.ac.jp> <3C1ADB64-63F3-4165-926D-EDE9846E0DBD@fuhm.net> <87mxozayam.fsf@uwakimon.sk.tsukuba.ac.jp> <4CEDCB86.9030506@canterbury.ac.nz> <87ipzm6oqr.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87y68f5eyz.fsf@uwakimon.sk.tsukuba.ac.jp> Glyph Lefkowitz writes: > But I don't think that anyone is filling up main memory with > gigantic piles of character indexes and need to squeeze out that > extra couple of bytes of memory on such a tiny object. How do you think editors and browsers represent the regions that they highlight, then? How do you think that structure-oriented editors represent the structures that they work with, then? In a detailed analysis of a C or Java file, it's easy to end up with almost 1:2 positions to characters ratio. Note that *buffer* characters are typically smaller than a platform word, so saving one word in the representation of a position mean a 100% or more increase in the character count of the buffer. Even in the case of UCS-4 on a 32-bit platform, that's a 50% increase in the maximum usable size of a buffer before a parser starts raising OOM errors. There are two plausible ways to represent these structures that I can think of offhand. The first is to do it the way Emacs does, by reading the text into a buffer and using position offsets to map to display or structure attributes. 
The second is to use a hierarchical document model, and render the display by traversing the document hierarchy. It's not obvious to me that forcing use of the second representation is a good idea for performance in an editor, and I would think that they have similar memory requirements. > Plus, this would allow such a user to stop copying the character > data itself just to decode it, and on mostly-ascii UTF-8 text (a > common use-case) this is a 2x savings right off the bat. Which only matters if you're a server in the business of shoveling octets really fast but are CPU bound (seems unlikely to me, but I'm no expert; WDYT?), and even then is only that big a savings if you can push off the issue of validating the purported UTF-8 text on others. If you're not validating, you may as well acknowledge that you're processing binary data, not text.[1] But we're talking about text. And of course, if you copy mostly-Han UTF-8 text (a common use-case) to UCS-2, this is a 1.5x memory savings right off the bat, and a 3x time savings when iterating in most architectures (one increment operation per character instead of three). As I've already said, I don't think this is an argument in favor of either representation. Sometimes one wins, sometimes the other. I don't think supplying both is a great idea, although I've proposed it myself for XEmacs (but made as opaque as possible). > > In Python it's true that markers can use the same data structure as > > integers and simply provide different methods, and it's arguable that > > Python's design is better. But if you use bytes internally, then you > > have problems. > > No, you just have design questions. Call them what you like, they're as yet unanswered. In any given editing scenario, I'd concede that it's a "SMOD". But if you're designing a language for text processing, it's a restriction that I believe to be a hindrance to applications. 
Many applications may prefer to use a straightforward array implementation of text and focus their design efforts on the real problems of their use cases. > > Do you expose that byte value to the user? If so, what do you do > > if the user specifies a byte value that points into a multibyte > > character? > > Go to the beginning of the multibyte character. Report that > position; if the user then asks the requested marker object for its > position, it will report that byte offset, not the > originally-requested one. (Obviously, do the same thing for > surrogate pair code points.) I will guarantee that some use cases will prefer that you go to the beginning of the *next* character. For an obvious example, your algorithm will infloop if you iterate "pos += 1". (And the opposite problem appears for "beginning of next character" combined with "pos -= 1".) Of course this trivial example is easily addressed by saying "the user should be using the character iterator API here", but I expect the issue can arise where that is not an easy answer. Either the API becomes complex, or the user/developers will have to do complex bookkeeping that should be done by the text implementation. Nor is it obvious that surrogate pairs will be present in a UCS-2 representation. Specifically, they can be encoded to single private space characters in almost all applications, at a very small cost in performance. > > What if the user wants to specify position by number of > > characters? > > Part of the point that we are trying to make here is that nobody > really cares about that use-case. In order to know anything useful > about a position in a text, you have to have traversed to that > location in the text. Binary search of an ordered text is useful. Granted, this particular example can be addressed usefully in terms of byte positions (viz. your example of less), but your basic premise is falsified. 
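The "go to the beginning of the multibyte character" rule quoted above, and the stalled "pos += 1" loop it leads to, can be sketched as follows. This is only an illustration on UTF-8 bytes, not code from Emacs, XEmacs or CPython; all function names are made up for the example, and the input is assumed to be valid UTF-8.

```python
def snap_to_char_start(data, pos):
    """Move pos back to the first byte of the UTF-8 character containing it."""
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    while 0 < pos < len(data) and (data[pos] & 0xC0) == 0x80:
        pos -= 1
    return pos


def next_char_start(data, pos):
    """Return the byte offset of the character following the one at pos."""
    pos += 1
    while pos < len(data) and (data[pos] & 0xC0) == 0x80:
        pos += 1
    return pos


def naive_walk(data, max_steps=1000):
    """Iterate with "pos += 1" plus snapping; True if the end is reached."""
    pos = 0
    for _ in range(max_steps):
        if pos >= len(data):
            return True
        # Snapping backwards undoes the increment inside a multibyte character.
        pos = snap_to_char_start(data, pos + 1)
    return False


def char_starts(data):
    """List the byte offsets of every character, using the iterator-style API."""
    starts = []
    pos = 0
    while pos < len(data):
        starts.append(pos)
        pos = next_char_start(data, pos)
    return starts
```

With pure ASCII input naive_walk reaches the end, but as soon as the data contains a two-byte character the position is snapped back to the same offset on every step, which is the infinite loop described in the message. char_starts shows the "use the character iterator API" alternative.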
> You can remember interesting things like the offsets of starts of > lines, or the x/y positions of characters. > > > Can you translate efficiently? > > No, because there's no point :). But you _could_ implement an > overlay that cached things like the beginning of lines, or the x/y > positions of interesting characters. Emacs does, and a lot of effort has gone into it, and it still sucks compared to an array representation. Maybe _you_ _could_ do better, but as yet we haven't managed to pull it off. :-( > > But I think it would be hard to implement an efficient > > text-processing *language*, eg, a Python module for *full > > conformance* in handling Unicode, on top of UTF-8. > > Still: why? I guess if I have some free time I'll try my hand at > it, and maybe I'll run into a wall and realize you're right :). I'd rather have you make it plausible to me that there's no point in having efficient access to arbitrary character positions. Then maybe you can delegate that implementation to me. :-) But my Emacs experience says otherwise, and IIUC the intuition and/or experience of MAL and Guido says this is not a YAGNI. > > Any time you have an algorithm that requires efficient access to > > arbitrary text positions, you'll spend all your skull sweat > > fighting the representation. At least, that's been my experience > > with Emacsen. > > What sort of algorithm would that be, though? The main thing that > I could think of is a text editor trying to efficiently allow the > user to scroll to the middle of a large file without reading the > whole thing into memory. Reading into memory or not is a red herring, I think. For many legacy encodings you have to pretty much read the whole thing because they are stateful, and it's just not very expensive compared to the text processing itself (unless your application is shoveling octets as fast as possible, in which case character positions are indeed a YAGNI). The question is whether opaque markers are always sufficient. 
For example, XEmacs does use byte positions internally for markers and extents (objects representing regions of text that can carry arbitrary properties but are tuned for display properties). Obviously, we have the marker objects you propose as sufficient, and indeed the representation is as efficient as you claim. However, these positions are not exposed as integers to end users, Lisp, or even most of the C code. If a client (end user or code) requests a position, they get a character position. Such requests are frequent enough that they constitute a major drag on many practical applications. It may be that this is unnecessary, as less shows for its application. But less is not an editor, let alone a language for writing editors. Do you know of an editor language of power comparable to Emacs Lisp that is not based on an array representation of text? > Is it really the representation as byte positions which is fragile > (i.e. the internal implementation detail), or the exposure of that > position to calling code, and the idiomatic usage of that number as > an integer? It's the latter. Sufficient effort can make it safe to use byte positions, and the effort is not all that great as long as you don't demand efficiency. XEmacs vs. Emacs implementation of Mule demonstrates that. We at XEmacs never did expose byte positions to even the C code (other than to buffer and string methods), and that implementation has not had to change much, if at all, in 15 years. The caching mechanism to make character position access reasonably efficient, however, has been buggy and not so efficient, and so complex that RMS said "I was going to implement your [position cache] in Emacs but it was too hard for me to understand". (OTOH, the alternative Emacs had implemented turned out to be O(n**2) or worse, so he had to replace it. Translating byte positions to character positions seems to be a real loser.) 
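To make the cost of translating between byte positions and character positions concrete, here is a small checkpoint-based sketch. It is not the XEmacs position cache, and it ignores the hard part (keeping the cache valid while the text is edited). The class name CharIndex, the stride parameter and the method names are assumptions made for this example, and the input is assumed to be valid, non-empty UTF-8 for char_index.

```python
import bisect


class CharIndex:
    """Checkpointed map between character indexes and UTF-8 byte offsets."""

    def __init__(self, data, stride=64):
        self.data = data
        self.stride = stride
        self.checkpoints = []  # byte offset of every stride-th character
        count = 0
        for offset, byte in enumerate(data):
            if (byte & 0xC0) != 0x80:  # first byte of a character
                if count % stride == 0:
                    self.checkpoints.append(offset)
                count += 1
        self.length = count  # number of characters in data

    def byte_offset(self, char_index):
        """Byte offset of the character with the given index."""
        if not 0 <= char_index <= self.length:
            raise IndexError(char_index)
        if char_index == self.length:
            return len(self.data)
        offset = self.checkpoints[char_index // self.stride]
        remaining = char_index % self.stride
        while remaining:  # scan forward at most stride - 1 characters
            offset += 1
            if (self.data[offset] & 0xC0) != 0x80:
                remaining -= 1
        return offset

    def char_index(self, byte_offset):
        """Index of the character that contains the given byte offset."""
        if not 0 <= byte_offset < len(self.data):
            raise IndexError(byte_offset)
        slot = bisect.bisect_right(self.checkpoints, byte_offset) - 1
        index = slot * self.stride
        for pos in range(self.checkpoints[slot] + 1, byte_offset + 1):
            if (self.data[pos] & 0xC0) != 0x80:
                index += 1
        return index
```

Each lookup is a table access plus a bounded scan, so the stride trades memory for speed. An array representation needs neither the table nor the scan, which is the trade-off under discussion.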
Emacs did expose byte positions for efficiency reasons, and has had at least four regressions of the "\201 bug". "\201" prefixes a Latin-1 character in internal code, and code that treated byte positions would often result in this being duplicated because all trailing bytes in Mule code are also Latin-1 code points. (Don't ask me about the exact mechanism, XEmacs's implementation is quite different and never suffered from this bug.) Note that a \201-like bug is very unlikely to occur in Python's UCS-2 representation because the semantics of surrogate values in Unicode is unambiguous. However, I believe similar bugs would be possible in a UTF-8 representation -- if code is allowed to choose whether to view UTF-8 in binary or text mode -- because trailing byte values are Latin-1 code points. Maybe I'm just an old granny, scared of my shadow. Footnotes: [1] I have no objection to providing "text" algorithms (such as regexps) for use on "binary" data. But then they don't provide any guarantees that transformations of purported text remains text. From ncoghlan at gmail.com Sat Nov 27 11:51:38 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 27 Nov 2010 20:51:38 +1000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CED4E34.5060400@voidspace.org.uk> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> Message-ID: On Thu, Nov 25, 2010 at 3:41 AM, Michael Foord wrote: > Can you explain what you see as the difference? 
>
> I'm not particularly interested in type validation but I like the fact that
> typical enum APIs allow you to group constants: the generated constant class
> acts as a namespace for all the defined constants.

The problem with blessing one particular "enum API" is that people have
so many different ideas as to what an enum API should look like.
However, the one thing they all have in common is the ability to take
a value and give it a name, then present *both* of those in debugging
information.

> Are you just suggesting something along the lines of:
>
> class NamedConstant(int):
>     def __new__(cls, name, val):
>         return int.__new__(cls, val)
>
>     def __init__(self, name, val):
>         self._name = name
>
>     def __repr__(self):
>         return '' % self._name
>
> FOO = NamedConstant('FOO', 3)
>
> In general the less features the better, but I'd like a few more features
> than that. :-)

Not quite. I'm suggesting a factory function that works for any value,
and derives the parent class from the type of the supplied value.
However, what you wrote is still the essence of the idea - we would be
primarily providing a building block that makes it easier for people
to *create* enum APIs if they want to, but for simple use cases (where
all they really wanted was the enhanced debugging information) they
wouldn't need to bother.

In the standard library, wherever we do "enum-like things" we would
switch to using named values where it makes sense to do so. Doing so
may actually make sense for more than just constants - it may make
sense for significant mutable globals as well.

==========================================================================
# Implementation (more than just a sketch, since it handles some interesting corner cases)
import functools

@functools.lru_cache()
def _make_named_value_type(base_type):
    class _NamedValueType(base_type):
        def __new__(cls, name, value):
            return base_type.__new__(cls, value)
        def __init__(self, name, value):
            self.__name = name
            super().__init__(value)
        @property
        def _name(self):
            return self.__name
        def _raw(self):
            return base_type(self)
        def __repr__(self):
            return "{}={}".format(self._name, super().__repr__())
        if base_type.__str__ is object.__str__:
            __str__ = base_type.__repr__
    _NamedValueType.__name__ = "Named<{}>".format(base_type.__name__)
    return _NamedValueType

def named_value(name, value):
    return _make_named_value_type(type(value))(name, value)

def set_named_values(namespace, **kwds):
    for k, v in kwds.items():
        namespace[k] = named_value(k, v)

x = named_value("FOO", 1)
y = named_value("BAR", "Hello World!")
z = named_value("BAZ", dict(a=1, b=2, c=3))

print(x, y, z, sep="\n")
print("\n".join(map(repr, (x, y, z))))
print("\n".join(map(str, map(type, (x, y, z)))))

set_named_values(globals(), foo=x._raw(), bar=y._raw(), baz=z._raw())
print("\n".join(map(repr, (foo, bar, baz))))
print(type(x) is type(foo), type(y) is type(bar), type(z) is type(baz))

==========================================================================

# Session output for the last 6 lines
>>> print(x, y, z, sep="\n")
1
Hello World!
{'a': 1, 'c': 3, 'b': 2}

>>> print("\n".join(map(repr, (x, y, z))))
FOO=1
BAR='Hello World!'
BAZ={'a': 1, 'c': 3, 'b': 2}

>>> print("\n".join(map(str, map(type, (x, y, z)))))
<class '__main__.Named<int>'>
<class '__main__.Named<str>'>
<class '__main__.Named<dict>'>

>>> set_named_values(globals(), foo=x._raw(), bar=y._raw(), baz=z._raw())
>>> print("\n".join(map(repr, (foo, bar, baz))))
foo=1
bar='Hello World!'
baz={'a': 1, 'c': 3, 'b': 2}

>>> print(type(x) is type(foo), type(y) is type(bar), type(z) is type(baz))
True True True

For "normal" use, such objects would look like ordinary instances of
their class. They would only behave differently when their
representation is printed (prepending their name), or when their type
is interrogated (being an instance of the named subclass rather than
the ordinary type).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sat Nov 27 13:05:32 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 27 Nov 2010 22:05:32 +1000
Subject: [Python-Dev] [Preview] Comments and change proposals on documentation
In-Reply-To:
References:
Message-ID:

On Thu, Nov 25, 2010 at 6:24 AM, Georg Brandl wrote:
> Hi,
>
> at , you can look at a version of the 3.2
> docs that has the upcoming commenting feature. JavaScript is mandatory.

Very nice!

I'm not sure what to do about the discoverability of the comment
bubbles at the end of each paragraph. I initially thought commenting
wasn't available on What's New or the Using Python docs until seeing
where the blue comment bubbles appeared in the math module docs.

A discreet notice at the bottom of the sidebar and/or an explanation
at the "Report a Bug" page may cover it, I guess.

> Please test on a smaller page, such as ,
> there is currently a speed issue with larger pages. (Helpful tips from
> JS experts are welcome.)

I gave the JS a fair few comments on the first paragraph to digest. I
also put my detailed UI comments there as well (I needed something to
write about while testing, so I figured I may as well make it useful
to you!)

> Other things I have to do before this can go live:
>
> * reuse existing logins from either wiki or tracker?

Tracker sounds like the best bet to me.

> Any feedback is appreciated (I'd suggest mailing it to doc-SIG only, to avoid
> cluttering up python-dev).
My comments on the math module may give you a chance to see how easy
it is to get text out of comments into a form suitable for sending to
a mailing list or posting to a tracker issue for further discussion :)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sat Nov 27 13:17:31 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 27 Nov 2010 22:17:31 +1000
Subject: [Python-Dev] [Python-checkins] r86745 - in python/branches/py3k: Doc/library/difflib.rst Lib/difflib.py Lib/test/test_difflib.py Misc/NEWS
In-Reply-To: <20101125061234.F1CC3EEA23@mail.python.org>
References: <20101125061234.F1CC3EEA23@mail.python.org>
Message-ID:

On Thu, Nov 25, 2010 at 4:12 PM, terry.reedy wrote:
>  The :class:`SequenceMatcher` class has this constructor:
>
>
> -.. class:: SequenceMatcher(isjunk=None, a='', b='')
> +.. class:: SequenceMatcher(isjunk=None, a='', b='', autojunk=True)
>
>    Optional argument *isjunk* must be ``None`` (the default) or a one-argument
>    function that takes a sequence element and returns true if and only if the
> @@ -340,6 +349,9 @@
>    The optional arguments *a* and *b* are sequences to be compared; both default to
>    empty strings.  The elements of both sequences must be :term:`hashable`.
>
> +   The optional argument *autojunk* can be used to disable the automatic junk
> +   heuristic.
> +

Catching up on checkins traffic, so a later checkin may already fix
this, but there should be a versionchanged tag in the docs to note
when the autojunk parameter was added.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From ncoghlan at gmail.com Sat Nov 27 13:22:50 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 27 Nov 2010 22:22:50 +1000
Subject: [Python-Dev] [Python-checkins] r86750 - python/branches/py3k/Demo/curses/life.py
In-Reply-To: <20101126021524.GA1450@rubuntu>
References: <20101125145644.D98FAEEA26@mail.python.org> <4CEF0E3B.2070608@netwok.org> <20101126021524.GA1450@rubuntu>
Message-ID:

On Fri, Nov 26, 2010 at 12:15 PM, Senthil Kumaran wrote:
>> Re: "colour": the rest of the file use US English, as do the function
>> names (see for example curses.has_color).  It's good to use one dialect
>> consistently in one file.
>
> Good catch. Did not realize it because, we write it as colour too.
> Changing it.

I just resign myself to having to spell words like colour and
serialise wrong when I'm working on Python. Compared to the
adjustments the non-native English speakers have to make, I figure I'm
getting off lightly ;)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From fuzzyman at voidspace.org.uk Sat Nov 27 13:52:40 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sat, 27 Nov 2010 12:52:40 +0000
Subject: [Python-Dev] [Python-checkins] r86750 - python/branches/py3k/Demo/curses/life.py
In-Reply-To:
References: <20101125145644.D98FAEEA26@mail.python.org> <4CEF0E3B.2070608@netwok.org> <20101126021524.GA1450@rubuntu>
Message-ID: <4CF0FF18.4030408@voidspace.org.uk>

On 27/11/2010 12:22, Nick Coghlan wrote:
> On Fri, Nov 26, 2010 at 12:15 PM, Senthil Kumaran wrote:
>>> Re: "colour": the rest of the file use US English, as do the function
>>> names (see for example curses.has_color). It's good to use one dialect
>>> consistently in one file.
>> Good catch. Did not realize it because, we write it as colour too.
>> Changing it.
> I just resign myself to having to spell words like colour and
> serialise wrong when I'm working on Python.
Compared to the > adjustments the non-native English speakers have to make, I figure I'm > getting off lightly ;) > I *thought* that the Python policy was that English speakers wrote documentation in English and American speakers wrote documentation in American and that we *don't* insist on US spellings in the Python documentation? Michael > Cheers, > Nick. > -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From eliben at gmail.com Sat Nov 27 14:00:27 2010 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 27 Nov 2010 15:00:27 +0200 Subject: [Python-Dev] [Python-checkins] r86745 - in python/branches/py3k: Doc/library/difflib.rst Lib/difflib.py Lib/test/test_difflib.py Misc/NEWS In-Reply-To: References: <20101125061234.F1CC3EEA23@mail.python.org> Message-ID: On Sat, Nov 27, 2010 at 14:17, Nick Coghlan wrote: > On Thu, Nov 25, 2010 at 4:12 PM, terry.reedy > wrote: > > The :class:`SequenceMatcher` class has this constructor: > > > > > > -.. class:: SequenceMatcher(isjunk=None, a='', b='') > > +.. class:: SequenceMatcher(isjunk=None, a='', b='', autojunk=True) > > > > Optional argument *isjunk* must be ``None`` (the default) or a > one-argument > > function that takes a sequence element and returns true if and only if > the > > @@ -340,6 +349,9 @@ > > The optional arguments *a* and *b* are sequences to be compared; both > default to > > empty strings. 
The elements of both sequences must be > :term:`hashable`. > > > > + The optional argument *autojunk* can be used to disable the automatic > junk > > + heuristic. > > + > > Catching up on checkins traffic, so a later checkin may already fix > this, but there should be a versionchanged tag in the docs to note > when the autojunk parameter was added. > Hi Nick, Since autojunk was added in 2.7.1 (the docs of which do indicate this is the versionchanged tag), I think Terry may have left the tag in 3.2 out on purpose. That said, personally I don't know what the policy is regarding features added just in 3.2 and 2.7 (and didn't exist in 3.1) in this respect. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From fuzzyman at voidspace.org.uk Sat Nov 27 14:02:36 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sat, 27 Nov 2010 13:02:36 +0000 Subject: [Python-Dev] [Python-checkins] r86745 - in python/branches/py3k: Doc/library/difflib.rst Lib/difflib.py Lib/test/test_difflib.py Misc/NEWS In-Reply-To: References: <20101125061234.F1CC3EEA23@mail.python.org> Message-ID: <4CF1016C.8050902@voidspace.org.uk> On 27/11/2010 13:00, Eli Bendersky wrote: > On Sat, Nov 27, 2010 at 14:17, Nick Coghlan > wrote: > > On Thu, Nov 25, 2010 at 4:12 PM, terry.reedy > > > wrote: > > The :class:`SequenceMatcher` class has this constructor: > > > > > > -.. class:: SequenceMatcher(isjunk=None, a='', b='') > > +.. class:: SequenceMatcher(isjunk=None, a='', b='', autojunk=True) > > > > Optional argument *isjunk* must be ``None`` (the default) or > a one-argument > > function that takes a sequence element and returns true if > and only if the > > @@ -340,6 +349,9 @@ > > The optional arguments *a* and *b* are sequences to be > compared; both default to > > empty strings. The elements of both sequences must be > :term:`hashable`. > > > > + The optional argument *autojunk* can be used to disable the > automatic junk > > + heuristic. 
> > + > > Catching up on checkins traffic, so a later checkin may already fix > this, but there should be a versionchanged tag in the docs to note > when the autojunk parameter was added. > > > Hi Nick, > > Since autojunk was added in 2.7.1 (the docs of which do indicate this > is the versionchanged tag), I think Terry may have left the tag in 3.2 > out on purpose. That said, personally I don't know what the policy is > regarding features added just in 3.2 and 2.7 (and didn't exist in 3.1) > in this respect. Features new in Python 3.2 that didn't exist in 3.1 should have a versionadded:: 3.2 tag. Michael > > Eli > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fuzzyman at voidspace.org.uk Sat Nov 27 15:01:22 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sat, 27 Nov 2010 14:01:22 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> Message-ID: <4CF10F32.9020805@voidspace.org.uk> On 27/11/2010 10:51, Nick Coghlan wrote: > On Thu, Nov 25, 2010 at 3:41 AM, Michael Foord > wrote: >> Can you explain what you see as the difference? >> >> I'm not particularly interested in type validation but I like the fact that >> typical enum APIs allow you to group constants: the generated constant class >> acts as a namespace for all the defined constants. > The problem with blessing one particular "enum API" is that people > have so many different ideas as to what an enum API should look like. > There actually seemed to be quite a bit of agreement around basic functionality though. > However, the one thing they all have in common is the ability to take > a value and give it a name, then present *both* of those in debugging > information. And this is the most important functionality. I would say that the grouping (namespacing) of constants is also useful, provided by *most* Python enum APIs and easy to implement without over complexifying the API. (Note that there is no *particular* hurry to get this into 3.2 - the beta is due imminently. 
I wouldn't object to it ) >> Are you just suggesting something along the lines of: >> >> class NamedConstant(int): >> def __new__(cls, name, val): >> return int.__new__(cls, val) >> >> def __init__(self, name, val): >> self._name = name >> >> def __repr__(self): >> return '' % self._name >> >> FOO = NamedConstant('FOO', 3) >> >> In general the less features the better, but I'd like a few more features >> than that. :-) > Not quite. I'm suggesting a factory function that works for any value, > and derives the parent class from the type of the supplied value. > However, what you wrote is still the essence of the idea - we would be > primarily providing a building block that makes it easier for people > to *create* enum APIs if they want to, but for simple use cases (where > all they really wanted was the enhanced debugging information) they > wouldn't need to bother. In the standard library, wherever we do > "enum-like things" we would switch to using named values where it > makes sense to do so. > > Doing so may actually make sense for more than just constants - it may > make sense for significant mutable globals as well. Very interesting proposal (typed named values rather than just named constants). It doesn't handle flag values, which I would still like, but that only really makes sense for integers (sets can be OR'd but their representation is already understandable). Perhaps the integer named type could be special cased for that. Without the grouping functionality (associating a bunch of names together) you lose the 'from_name' functionality. Guido was in favour of this, and it is an obvious feature where you have grouping: http://mail.python.org/pipermail/python-dev/2010-November/105912.html """I expect that the API to convert between enums and bare ints should be i = int(e) and e = (i). It would be nice if s = str(e) and e = (s) would work too.""" This wouldn't work with your suggested implementation (as it is). 
Grouping and mutable "named values" could be inefficient and have issues around identity / equality. Maybe restrict the API to the immutable primitives. All the best, Michael > ========================================================================== > # Implementation (more than just a sketch, since it handles some > interesting corner cases) > import functools > @functools.lru_cache() > def _make_named_value_type(base_type): > class _NamedValueType(base_type): > def __new__(cls, name, value): > return base_type.__new__(cls, value) > def __init__(self, name, value): > self.__name = name > super().__init__(value) > @property > def _name(self): > return self.__name > def _raw(self): > return base_type(self) > def __repr__(self): > return "{}={}".format(self._name, super().__repr__()) > if base_type.__str__ is object.__str__: > __str__ = base_type.__repr__ > _NamedValueType.__name__ = "Named<{}>".format(base_type.__name__) > return _NamedValueType > > def named_value(name, value): > return _make_named_value_type(type(value))(name, value) > > def set_named_values(namespace, **kwds): > for k, v in kwds.items(): > namespace[k] = named_value(k, v) > > x = named_value("FOO", 1) > y = named_value("BAR", "Hello World!") > z = named_value("BAZ", dict(a=1, b=2, c=3)) > > print(x, y, z, sep="\n") > print("\n".join(map(repr, (x, y, z)))) > print("\n".join(map(str, map(type, (x, y, z))))) > > set_named_values(globals(), foo=x._raw(), bar=y._raw(), baz=z._raw()) > print("\n".join(map(repr, (foo, bar, baz)))) > print(type(x) is type(foo), type(y) is type(bar), type(z) is type(baz)) > > ========================================================================== > > # Session output for the last 6 lines >>>> print(x, y, z, sep="\n") > 1 > Hello World! > {'a': 1, 'c': 3, 'b': 2} > >>>> print("\n".join(map(repr, (x, y, z)))) > FOO=1 > BAR='Hello World!' 
> BAZ={'a': 1, 'c': 3, 'b': 2} > >>>> print("\n".join(map(str, map(type, (x, y, z))))) > '> > '> > '> > >>>> set_named_values(globals(), foo=x._raw(), bar=y._raw(), baz=z._raw()) >>>> print("\n".join(map(repr, (foo, bar, baz)))) > foo=1 > bar='Hello World!' > baz={'a': 1, 'c': 3, 'b': 2} > >>>> print(type(x) is type(foo), type(y) is type(bar), type(z) is type(baz)) > True True True > > For "normal" use, such objects would look like ordinary instances of > their class. They would only behave differently when their > representation is printed (prepending their name), or when their type > is interrogated (being an instance of the named subclass rather than > the ordinary type). > > Cheers, > Nick. > -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. 
From ncoghlan at gmail.com Sat Nov 27 15:58:08 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 28 Nov 2010 00:58:08 +1000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <4CF10F32.9020805@voidspace.org.uk>
References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> <4CF10F32.9020805@voidspace.org.uk>
Message-ID:

On Sun, Nov 28, 2010 at 12:01 AM, Michael Foord wrote:
> Very interesting proposal (typed named values rather than just named
> constants). It doesn't handle flag values, which I would still like, but
> that only really makes sense for integers (sets can be OR'd but their
> representation is already understandable). Perhaps the integer named type
> could be special cased for that.
>
> Without the grouping functionality (associating a bunch of names together)
> you lose the 'from_name' functionality. Guido was in favour of this, and it
> is an obvious feature where you have grouping:
> http://mail.python.org/pipermail/python-dev/2010-November/105912.html
>
> """I expect that the API to convert between enums and bare ints should be
> i = int(e) and e = (i). It would be nice if s = str(e) and
> e = (s) would work too."""

Note that the "i = int(e)" and "s = str(e)" parts of Guido's
expectation do work (they are, in fact, the underlying implementation
of the _raw() method), so an enum class would only be needed to
provide the other half of the equation.

The named values have no opinion on equivalence at all (they just
defer to the parent class), but change the rules for identity (which
are always murky anyway, since caching is optional even for immutable
types).
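[Editorial sketch] The equivalence point can be illustrated roughly as follows. This is a simplified stand-in for the named_value factory posted earlier in the thread, not Nick's code: the helper name _named_type and the choice to store the name in __new__ are assumptions made for illustration, and the variant is only meant for immutable bases such as int and str.

```python
import functools


@functools.lru_cache(maxsize=None)
def _named_type(base_type):
    # Hypothetical, simplified stand-in for the factory from the thread;
    # only intended for immutable bases such as int and str.
    class _Named(base_type):
        def __new__(cls, name, value):
            # The value is built by the parent type; the name is attached
            # afterwards as an ordinary instance attribute.
            self = base_type.__new__(cls, value)
            self._name = name
            return self

        def __repr__(self):
            return "{}={}".format(self._name, base_type.__repr__(self))

    _Named.__name__ = "Named<{}>".format(base_type.__name__)
    return _Named


def named_value(name, value):
    return _named_type(type(value))(name, value)


foo = named_value("FOO", 3)

print(repr(foo))                               # the name appears in the repr
print(foo == 3, hash(foo) == hash(3))          # equality and hashing come from int
print(type(foo) is int, isinstance(foo, int))  # exact type differs, isinstance holds
print(type(int(foo)) is int)                   # int(e) gives back a plain int
```

Because equality and hashing are inherited unchanged, a named value can stand in for the plain value as a dict key or in comparisons; only checks on the exact type (or on object identity) can tell the two apart.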
> This wouldn't work with your suggested implementation (as it is). Grouping > and mutable "named values" could be inefficient and have issues around > identity / equality. Maybe restrict the API to the immutable primitives. My proposal doesn't say anything about grouping at all - it's just an idea for "here's a standard way to associate a canonical name with a particular object, independent of the namespaces that happen to reference that object". Now, a particular *grouping* API may want to restrict itself in various ways, but that's my point. We should be looking at a standard solution for the ground level problem (i.e. the idea named_value attempts to solve) and then let various 3rd party enum/name grouping implementations flourish on top of that, rather than trying to create an all-singing all-dancing "value grouping" API (which is going to be far more intrusive than a simple API for "here's a way to give your constants and important data structures names that show up in their representations"). 
For example, using named_value as a primitive, you can fairly easily do: class Namegroup: # Missing lots of niceties of a real enum class, but shows the idea # as to how a real implementation could leverage named_value def __init__(self, _groupname, **kwds): self._groupname = _groupname pattern = _groupname + ".{}" self._value_map = {} for k, v in kwds.items(): attr = named_value(pattern.format(k), v) setattr(self, k, attr) self._value_map[v] = attr @classmethod def from_names(cls, groupname, *args): kwds = dict(zip(args, range(len(args)))) return cls(groupname, **kwds) def __call__(self, arg): return self._value_map[arg] silly = Namegroup.from_names("Silly", "FOO", "BAR", "BAZ") >>> silly.FOO Silly.FOO=0 >>> int(silly.FOO) 0 >>> silly(0) Silly.FOO=0 named_value deals with all the stuff to do with pretending to be the original type of object (only with an associated name), leaving the grouping API to deal with issues of creating groups of names and mapping between them and the original values in various ways. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Sat Nov 27 16:04:17 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 28 Nov 2010 01:04:17 +1000 Subject: [Python-Dev] [Python-checkins] r86750 - python/branches/py3k/Demo/curses/life.py In-Reply-To: <4CF0FF18.4030408@voidspace.org.uk> References: <20101125145644.D98FAEEA26@mail.python.org> <4CEF0E3B.2070608@netwok.org> <20101126021524.GA1450@rubuntu> <4CF0FF18.4030408@voidspace.org.uk> Message-ID: On Sat, Nov 27, 2010 at 10:52 PM, Michael Foord wrote: >> I just resign myself to having to spell words like colour and >> serialise wrong when I'm working on Python. 
Compared to the
>> adjustments the non-native English speakers have to make, I figure I'm
>> getting off lightly ;)
>>
>
> I *thought* that the Python policy was that English speakers wrote
> documentation in English and American speakers wrote documentation in
> American and that we *don't* insist on US spellings in the Python
> documentation?

If we're just talking about those things in general, then that's a
reasonable rule. But when in close proximity to an actual API that
uses the American spelling, or modifying a file that uses the relevant
word a lot, following the prevailing style is a definite courtesy to
the reader.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From fuzzyman at voidspace.org.uk Sat Nov 27 16:07:18 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sat, 27 Nov 2010 15:07:18 +0000
Subject: [Python-Dev] [Python-checkins] r86750 - python/branches/py3k/Demo/curses/life.py
In-Reply-To:
References: <20101125145644.D98FAEEA26@mail.python.org> <4CEF0E3B.2070608@netwok.org> <20101126021524.GA1450@rubuntu> <4CF0FF18.4030408@voidspace.org.uk>
Message-ID: <4CF11EA6.8050409@voidspace.org.uk>

On 27/11/2010 15:04, Nick Coghlan wrote:
> On Sat, Nov 27, 2010 at 10:52 PM, Michael Foord
> wrote:
>>> I just resign myself to having to spell words like colour and
>>> serialise wrong when I'm working on Python. Compared to the
>>> adjustments the non-native English speakers have to make, I figure I'm
>>> getting off lightly ;)
>>>
>> I *thought* that the Python policy was that English speakers wrote
>> documentation in English and American speakers wrote documentation in
>> American and that we *don't* insist on US spellings in the Python
>> documentation?
> If we're just talking about those things in general, then that's a
> reasonable rule.
But when in close proximity to an actual API that > uses the American spelling, or modifying a file that uses the relevant > word a lot, following the prevailing style is a definite courtesy to > the reader. > Ok, thanks. Sounds like a good guideline. Michael > Cheers, > Nick. > -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From ncoghlan at gmail.com Sat Nov 27 16:07:35 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 28 Nov 2010 01:07:35 +1000 Subject: [Python-Dev] [Python-checkins] r86745 - in python/branches/py3k: Doc/library/difflib.rst Lib/difflib.py Lib/test/test_difflib.py Misc/NEWS In-Reply-To: <4CF1016C.8050902@voidspace.org.uk> References: <20101125061234.F1CC3EEA23@mail.python.org> <4CF1016C.8050902@voidspace.org.uk> Message-ID: On Sat, Nov 27, 2010 at 11:02 PM, Michael Foord wrote: > Features new in Python 3.2 that didn't exist in 3.1 should have a > versionadded:: 3.2 tag. As Michael said, from a docs point of view, the version flow is independent: "2.6 -> 2.7" and "3.1 -> 3.2". The issue has really only come up with this release, since there was no intervening 2.x release between 3.0 and 3.1. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From barry at python.org Sat Nov 27 19:22:16 2010 From: barry at python.org (Barry Warsaw) Date: Sat, 27 Nov 2010 13:22:16 -0500 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CF10F32.9020805@voidspace.org.uk> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> <4CF10F32.9020805@voidspace.org.uk> Message-ID: <20101127132216.533f7332@mission> On Nov 27, 2010, at 02:01 PM, Michael Foord wrote: >(Note that there is no *particular* hurry to get this into 3.2 - the beta is >due imminently. I wouldn't object to it ) Indeed. I don't think the time is right to try to get this into 3.2. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From anurag.chourasia at gmail.com Sat Nov 27 19:45:44 2010 From: anurag.chourasia at gmail.com (Anurag Chourasia) Date: Sun, 28 Nov 2010 00:15:44 +0530 Subject: [Python-Dev] Python make fails with error "Fatal Python error: Interpreter not initialized (version mismatch?)" Message-ID: Hi All, During the make step of python, I am encountering a weird error. This is on AIX 5.3 using gcc as the compiler. My configuration options are as follows ./configure --enable-shared --disable-ipv6 --with-gcc=gcc CPPFLAGS="-I /opt/freeware/include -I /opt/freeware/include/readline -I /opt/freeware/include/ncurses" LDFLAGS="-L. -L/usr/local/lib" Below is the transcript from the make step. ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ running build running build_ext ldd: /lib/libreadline.a: File is an archive. 
INFO: Can't locate Tcl/Tk libs and/or headers building '_struct' extension gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/u01/home/apli/wm/GDD/Python-2.6.6/./Include -I. -IInclude -I./Include -I/opt/freeware/include -I/opt/freeware/include/readline -I/opt/freeware/include/ncurses -I/usr/local/include -I/u01/home/apli/wm/GDD/Python-2.6.6/Include -I/u01/home/apli/wm/GDD/Python-2.6.6 -c /u01/home/apli/wm/GDD/Python-2.6.6/Modules/_struct.c -o build/temp.aix-5.3-2.6/u01/home/apli/wm/GDD/Python-2.6.6/Modules/_struct.o ./Modules/ld_so_aix gcc -pthread -bI:Modules/python.exp -L. -L/usr/local/lib build/temp.aix-5.3-2.6/u01/home/apli/wm/GDD/Python-2.6.6/Modules/_struct.o -L. -L/usr/local/lib -lpython2.6 -o build/lib.aix-5.3-2.6/_struct.so *Fatal Python error: Interpreter not initialized (version mismatch?)* *make: 1254-059 The signal code from the last command is 6.* ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ The last command that i see above (ld_so_aix) seems to have completed as the file _struct.so exists after this command and hence I am not sure which step is failing. There is no other Python version on my machine. Please guide. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sat Nov 27 21:50:11 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 27 Nov 2010 15:50:11 -0500 Subject: [Python-Dev] [Python-checkins] r86745 - in python/branches/py3k: Doc/library/difflib.rst Lib/difflib.py Lib/test/test_difflib.py Misc/NEWS In-Reply-To: References: <20101125061234.F1CC3EEA23@mail.python.org> Message-ID: <4CF16F03.9060407@udel.edu> On 11/27/2010 7:17 AM, Nick Coghlan wrote: > On Thu, Nov 25, 2010 at 4:12 PM, terry.reedy wrote: >> The :class:`SequenceMatcher` class has this constructor: >> >> >> -.. class:: SequenceMatcher(isjunk=None, a='', b='') >> +.. 
class:: SequenceMatcher(isjunk=None, a='', b='', autojunk=True) >> >> Optional argument *isjunk* must be ``None`` (the default) or a one-argument >> function that takes a sequence element and returns true if and only if the >> @@ -340,6 +349,9 @@ >> The optional arguments *a* and *b* are sequences to be compared; both default to >> empty strings. The elements of both sequences must be :term:`hashable`. >> >> + The optional argument *autojunk* can be used to disable the automatic junk >> + heuristic. >> + > > Catching up on checkins traffic, so a later checkin may already fix > this, but there should be a versionchanged tag in the docs to note > when the autojunk parameter was added. Right. When S.C. forward-ported the 2.7 patch. he must have thought it not needed and I missed the difference between the diffs. Will add note in both places needed immediately. Terry From v+python at g.nevcal.com Sat Nov 27 21:56:14 2010 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sat, 27 Nov 2010 12:56:14 -0800 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> Message-ID: <4CF1706E.5030503@g.nevcal.com> On 11/27/2010 2:51 AM, Nick Coghlan wrote: > Not quite. I'm suggesting a factory function that works for any value, > and derives the parent class from the type of the supplied value. Nick, thanks for the much better implementation than I achieved; you seem to have the same goals as my implementation. I learned a bit making mine, and more understanding yours to some degree. 
What I still don't understand about your implementation, is that when adding one additional line to your file, it fails: w = named_value("ABC", z ) Now I can understand why it might not be a good thing to make a named value of a named value (confusing, at least), but I was surprised, and still do not understand, that it failed reporting the __new__() takes exactly 3 arguments (2 given). -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sat Nov 27 23:11:44 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 28 Nov 2010 09:11:44 +1100 Subject: [Python-Dev] [Preview] Comments and change proposals on documentation In-Reply-To: References: Message-ID: <4CF18220.7000202@pearwood.info> Nick Coghlan wrote: > On Thu, Nov 25, 2010 at 6:24 AM, Georg Brandl wrote: >> Hi, >> >> at , you can look at a version of the 3.2 >> docs that has the upcoming commenting feature. JavaScript is mandatory. > > Very nice! > > I'm not sure what to do about the discoverability of the comment > bubbles as the end of each paragraph. I initially thought commenting > wasn't available on What's New or the Using Python docs until seeing > where the blue comment bubbles appeared in the math module docs. I wonder what the point of the comment bubbles is? This isn't a graphical UI where (contrary to popular opinion) a picture is *not* worth a thousand words, but may require a help-bubble to explain. This is text. If you want to make a comment on some text, the usual practice is to add more text :) I wasn't able to find a comment bubble that contained anything, so I don't know what sort of information you expect them to contain -- every one I tried said "0 comments". But it seems to me that comments are superfluous, if not actively harmful: (1) Anything important enough to tell the reader should be included in the text, where it can be easily seen, read and printed. 
(2) Discovery is lousy -- not only do you need to be running Javascript, which many people do not for performance, privacy and convenience[*], but you have to carefully mouse-over the paragraph just to see the blue bubble, and THEN you have to *precisely* mouse-over the bubble itself. (3) This will be a horrible and possibly even literally painful experience for anyone with a physical disability that makes precise positioning of the mouse difficult. (4) Accessibility for the blind and those using screen readers will probably be non-existent. (5) If the information in the comment bubbles is trivial enough that we're happy to say that the blind, the disabled and those who avoid Javascript don't need it, then perhaps *nobody* needs it. [*] In my experience, websites tend to fall into two basic categories: those that don't work at all without Javascript, and those that run better, faster, and with fewer anti-features and inconveniences without Javascript. -- Steven From g.brandl at gmx.net Sat Nov 27 23:37:29 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 27 Nov 2010 23:37:29 +0100 Subject: [Python-Dev] [Preview] Comments and change proposals on documentation In-Reply-To: <4CF18220.7000202@pearwood.info> References: <4CF18220.7000202@pearwood.info> Message-ID: Am 27.11.2010 23:11, schrieb Steven D'Aprano: > Nick Coghlan wrote: >> On Thu, Nov 25, 2010 at 6:24 AM, Georg Brandl wrote: >>> Hi, >>> >>> at , you can look at a version of the 3.2 >>> docs that has the upcoming commenting feature. JavaScript is mandatory. >> >> Very nice! >> >> I'm not sure what to do about the discoverability of the comment >> bubbles as the end of each paragraph. I initially thought commenting >> wasn't available on What's New or the Using Python docs until seeing >> where the blue comment bubbles appeared in the math module docs. > > I wonder what the point of the comment bubbles is? 
This isn't a > graphical UI where (contrary to popular opinion) a picture is *not* > worth a thousand words, but may require a help-bubble to explain. This > is text. If you want to make a comment on some text, the usual practice > is to add more text :) Yes, I already mentioned that the bubbles could be replaced by text links if they prove too confusing. > I wasn't able to find a comment bubble that contained anything, so I > don't know what sort of information you expect them to contain -- every > one I tried said "0 comments". Maybe you should have tried the page I recommended as a demo, and where Nick made his comments? :) > But it seems to me that comments are superfluous, if not actively harmful: (I've not read anything about harmful below. Was that just FUD?) > (1) Anything important enough to tell the reader should be included in > the text, where it can be easily seen, read and printed. Yes. There need to be ways for the reader to feed back to the author what they want to have included. Currently, this is I'm all for removing comments with suggestions once they have been integrated in the main text. > (2) Discovery is lousy -- not only do you need to be running Javascript, > which many people do not for performance, privacy and convenience[*], That is not an argument nowadays, seeing how many sites/web applications require JS. (Most people who deactivate JS globally maintain a whitelist anyway, and can easily add docs.python.org to that.) These comments are an optional feature and therefore do not need to be accessible for 100% of users. > but you have to carefully mouse-over the paragraph just to see the blue > bubble, and THEN you have to *precisely* mouse-over the bubble itself. Bubbles are always shown for paragraphs *with* comments. > (3) This will be a horrible and possibly even literally painful > experience for anyone with a physical disability that makes precise > positioning of the mouse difficult. 
You're making this point just because of the size of the bubbles? Well, these users can register on the site and there can be a user preference to display larger links instead (if we choose to keep the bubbles, anyway.) > (4) Accessibility for the blind and those using screen readers will > probably be non-existent. It will be the same as for other web apps using JavaScript. Since I'm not a professional user interface designer, I don't know what screen readers can and cannot do. > (5) If the information in the comment bubbles is trivial enough that > we're happy to say that the blind, the disabled and those who avoid > Javascript don't need it, then perhaps *nobody* needs it. Sorry, but that is a nonsensical argument. Apart from the questionable notion that anything must be available to everyone to be worth anything, it also doesn't consider that the comments are not only for fellow users: as I said above, the comments are designed to be a very quick way to give feedback to *us* developers. (This is the reason for the "propose a change" feature, for example.) So even if only 30% of all users had access to the comments and could use that to help us improve the documentation by submitting suggestions and corrections they never would have bothered registering in the tracker for, that would be a net gain. 
cheers, Georg From raymond.hettinger at gmail.com Sun Nov 28 00:26:13 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sat, 27 Nov 2010 15:26:13 -0800 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CF1706E.5030503@g.nevcal.com> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> <4CF1706E.5030503@g.nevcal.com> Message-ID: <1D372F35-B455-4982-997B-2C54A7D56741@gmail.com> On Nov 27, 2010, at 12:56 PM, Glenn Linderman wrote: > On 11/27/2010 2:51 AM, Nick Coghlan wrote: >> >> Not quite. I'm suggesting a factory function that works for any value, >> and derives the parent class from the type of the supplied value. > > Nick, thanks for the much better implementation than I achieved; you seem to have the same goals as my implementation. I learned a bit making mine, and more understanding yours to some degree. What I still don't understand about your implementation, is that when adding one additional line to your file, it fails: > > w = named_value("ABC", z ) > > Now I can understand why it might not be a good thing to make a named value of a named value (confusing, at least), but I was surprised, and still do not understand, that it failed reporting the __new__() takes exactly 3 arguments (2 given). Can I suggest that an enum-maker be offered as a third-party module rather than prematurely adding it into the standard library. 
Raymond From steve at pearwood.info Sun Nov 28 00:58:52 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 28 Nov 2010 10:58:52 +1100 Subject: [Python-Dev] [Preview] Comments and change proposals on documentation In-Reply-To: References: <4CF18220.7000202@pearwood.info> Message-ID: <4CF19B3C.2000308@pearwood.info> Georg Brandl wrote: > Am 27.11.2010 23:11, schrieb Steven D'Aprano: >> I wasn't able to find a comment bubble that contained anything, so I >> don't know what sort of information you expect them to contain -- every >> one I tried said "0 comments". > > Maybe you should have tried the page I recommended as a demo, and where Nick > made his comments? :) Aha! I never would have guessed that the bubbles are clickable -- I thought you just moused-over them and they showed static comments put there by the developers, part of the documentation itself. I didn't realise that it was for users to add spam^W comments to the page. With that perspective, I need to rethink. Yes, I failed to fully read the instructions you sent, or understand them. That's what users do -- they don't read your instructions, and they misunderstand them. If your UI isn't easily discoverable, users will not be able to use it, and will be frustrated and annoyed. The user is always right, even when they're doing it wrong *wink* >> But it seems to me that comments are superfluous, if not actively harmful: > > (I've not read anything about harmful below. Was that just FUD?) Lowering accessibility to parts of the documentation is what I was talking about when I said "actively harmful". But now that I have better understanding of what the comment system is actually for, I have to rethink. 
-- Steven From glenn at nevcal.com Sun Nov 28 02:04:49 2010 From: glenn at nevcal.com (Glenn Linderman) Date: Sat, 27 Nov 2010 17:04:49 -0800 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CF1706E.5030503@g.nevcal.com> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> <4CF1706E.5030503@g.nevcal.com> Message-ID: <4CF1AAB1.4010808@nevcal.com> On 11/27/2010 12:56 PM, Glenn Linderman wrote: > On 11/27/2010 2:51 AM, Nick Coghlan wrote: >> Not quite. I'm suggesting a factory function that works for any value, >> and derives the parent class from the type of the supplied value. > > Nick, thanks for the much better implementation than I achieved; you > seem to have the same goals as my implementation. I learned a bit > making mine, and more understanding yours to some degree. What I > still don't understand about your implementation, is that when adding > one additional line to your file, it fails: > > w = named_value("ABC", z ) > > Now I can understand why it might not be a good thing to make a named > value of a named value (confusing, at least), but I was surprised, and > still do not understand, that it failed reporting the __new__() takes > exactly 3 arguments (2 given). OK, I puzzled out the error, and here is a "cure" of sorts. 
    def __new__(cls, name, value):
        try:
            return base_type.__new__(cls, value)
        except TypeError:
            return base_type.__new__(cls, name, value)

    def __init__(self, name, value):
        self.__name = name
        try:
            super().__init__(value)
        except TypeError:
            super().__init__(name, value)

Probably it would be better for the except clause to raise a different type of error ( Can't recursively create named value ) or to cleverly bypass the intermediate named value, and simply apply a new name to the original value. Hmm... For this, only __new__ need be changed:

    def __new__(cls, name, value):
        try:
            return base_type.__new__(cls, value)
        except TypeError:
            return _make_named_value_type( type( value._raw() ))( name, value._raw() )

    def __init__(self, name, value):
        self.__name = name
        super().__init__(value)

Thanks for not responding too quickly, I figured out more, and learned more.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com  Sun Nov 28 03:38:27 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 28 Nov 2010 12:38:27 +1000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1D372F35-B455-4982-997B-2C54A7D56741@gmail.com>
References: <20101121034404.52924F20A@mail.python.org>
	<4CE9BF4A.1020302@netwok.org>
	<4CEA89E8.5090107@voidspace.org.uk>
	<20101122163722.7e96d123@pitrou.net>
	<4CEA9584.7040301@avl.com>
	<20101122172440.77d27ed5@pitrou.net>
	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>
	<4CEBC6BD.9060402@voidspace.org.uk>
	<4CED0557.9090101@voidspace.org.uk>
	<4CED4E34.5060400@voidspace.org.uk>
	<4CF1706E.5030503@g.nevcal.com>
	<1D372F35-B455-4982-997B-2C54A7D56741@gmail.com>
Message-ID:

On Sun, Nov 28, 2010 at 9:26 AM, Raymond Hettinger wrote:
>
> On Nov 27, 2010, at 12:56 PM, Glenn Linderman wrote:
>
>> On 11/27/2010 2:51 AM, Nick Coghlan wrote:
>>>
>>> Not quite. I'm suggesting a factory function that works for any value,
>>> and derives the parent class from the type of the supplied value.
>>
>> Nick, thanks for the much better implementation than I achieved; you seem to have the same goals as my implementation. I learned a bit making mine, and more understanding yours to some degree. What I still don't understand about your implementation, is that when adding one additional line to your file, it fails:
>>
>> w = named_value("ABC", z )
>>
>> Now I can understand why it might not be a good thing to make a named value of a named value (confusing, at least), but I was surprised, and still do not understand, that it failed reporting the __new__() takes exactly 3 arguments (2 given).
>
> Can I suggest that an enum-maker be offered as a third-party module rather than prematurely adding it into the standard library.

Indeed. Glenn's failing example suggests to me that using a new metaclass is probably going to be a cleaner option than trying to dance around type's default behaviour within an ordinary class definition (if nothing else, a separate metaclass makes it much easier to detect when you're dealing with an instance of a named type).

Regardless, I still see value in approaching this whole discussion as a two-level design problem, with "named values" as the more fundamental concept, and then higher level grouping APIs to get enum-style behaviour. Eventually attaining "One Obvious Way" for the former seems achievable to me, while the diversity of use cases for grouping APIs suggests to me that "one-size-fits-all" isn't going to work unless that "one size" is a Frankenstein API with more options than anyone could reasonably hope to keep in their head at once.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From tjreedy at udel.edu  Sun Nov 28 04:20:50 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 27 Nov 2010 22:20:50 -0500
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <1D372F35-B455-4982-997B-2C54A7D56741@gmail.com>
References: <20101121034404.52924F20A@mail.python.org>
	<4CE9BF4A.1020302@netwok.org>
	<4CEA89E8.5090107@voidspace.org.uk>
	<20101122163722.7e96d123@pitrou.net>
	<4CEA9584.7040301@avl.com>
	<20101122172440.77d27ed5@pitrou.net>
	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>
	<4CEBC6BD.9060402@voidspace.org.uk>
	<4CED0557.9090101@voidspace.org.uk>
	<4CED4E34.5060400@voidspace.org.uk>
	<4CF1706E.5030503@g.nevcal.com>
	<1D372F35-B455-4982-997B-2C54A7D56741@gmail.com>
Message-ID:

On 11/27/2010 6:26 PM, Raymond Hettinger wrote:

> Can I suggest that an enum-maker be offered as a third-party module

Possibly with competing versions for trial and testing ;-)

> rather than prematurely adding it into the standard library.

I had same thought.

-- 
Terry Jan Reedy

From donjohnston at selfaware.com  Sun Nov 28 05:17:11 2010
From: donjohnston at selfaware.com (Don Johnston)
Date: Sun, 28 Nov 2010 04:17:11 +0000 (UTC)
Subject: [Python-Dev] [Preview] Comments and change proposals on documentation
References: <4CF18220.7000202@pearwood.info>
	<4CF19B3C.2000308@pearwood.info>
Message-ID:

Steven D'Aprano pearwood.info> writes:
> Aha! I never would have guessed that the bubbles are clickable -- I
> thought you just moused-over them and they showed static comments put
> there by the developers, part of the documentation itself. I didn't
> realise that it was for users to add spam^W comments to the page. With
> that perspective, I need to rethink.
>
> Yes, I failed to fully read the instructions you sent, or understand
> them. That's what users do -- they don't read your instructions, and
> they misunderstand them.
If your UI isn't easily discoverable, users
> will not be able to use it, and will be frustrated and annoyed. The user
> is always right, even when they're doing it wrong *wink*
>
>
> >> But it seems to me that comments are superfluous, if not actively harmful:
> >
> > (I've not read anything about harmful below. Was that just FUD?)
>
> Lowering accessibility to parts of the documentation is what I was
> talking about when I said "actively harmful". But now that I have better
> understanding of what the comment system is actually for, I have to rethink.
>

As an end-user, I, too, share concerns about the accessibility of the pending (proposed?) commenting functionality. A read-only JSON API would be great.

Up until now, Sphinx has been an incredibly helpful tool for generating beautiful documentation from ReStructuredText, which is great for limiting the risk of malformed input.

The new commenting feature ("dynamic application functionality") requires persistence for user-submitted content. Database persistence is currently implemented with the -excellent- SQLAlchemy ORM.

So, this is a transition from Sphinx being an excellent publishing tool to being a dynamic publishing platform for user-submitted content ("comments"). I am sure this was not without due consideration, and FUD.

The Python Web Framework communities (favorite framework *here*) will be the first to reiterate the challenges that all web application developers (and commenting API providers) face on a daily basis:

- SQL Injection
- XSS (Cross Site Scripting)
- CSRF (Cross Site Request Forgery)

Here are a few scenarios to consider:

(1) Freeloading jackass decides that each paragraph of our documentation would look better with 200 "comments" for Viagra. Freeloading jackass is aware of how HTTP GETs work.

- What markup features are supported?
- How does the application sanitize user-supplied input?
- Is html5lib good enough?
- On docs.python.org, how are 1000 inappropriate (freeloading) comments from 1000 different IPs deleted?
- What's the roadmap for {..., Akismet, ReCaptcha, ...} support?

(2) Freeloading jackass buys a block of javascript adspace on . The block of javascript surreptitiously posts helpful comments on behalf of unwitting users.

- How does the application ensure that comments are submitted from the site hosting the documentation?
- Which frameworks have existing, reviewed CSRF protections?

Trying to read through the new source here [1], but there aren't many docstrings and BB doesn't yet support inline commenting. AFAIK, there are not yet any issues filed for these concerns. [2]

1. In the event that that kind of bug is discovered, how should the community report the issues?
2. If we have an alternate method of encouraging documentation feedback, how can this feature be turned off?

Thanks again for a great publishing tool,
Don

[1] http://bitbucket.org/birkenfeld/sphinx
[2] http://bitbucket.org/birkenfeld/sphinx/issues/new

From benjamin at python.org  Sun Nov 28 05:33:43 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Sat, 27 Nov 2010 22:33:43 -0600
Subject: [Python-Dev] [RELEASED] Python 2.7.1
Message-ID:

On behalf of the Python development team, I'm happy as a clam to announce the immediate availability of Python 2.7.1.

2.7 includes many features that were first released in Python 3.1. The faster io module, the new nested with statement syntax, improved float repr, set literals, dictionary views, and the memoryview object have been backported from 3.1. Other features include an ordered dictionary implementation, unittest improvements, a new sysconfig module, auto-numbering of fields in the str/unicode format method, and support for ttk Tile in Tkinter. For a more extensive list of changes in 2.7, see http://doc.python.org/dev/whatsnew/2.7.html or Misc/NEWS in the Python distribution.
To download Python 2.7.1 visit: http://www.python.org/download/releases/2.7.1/ The 2.7.1 changelog is at: http://svn.python.org/projects/python/tags/r271/Misc/NEWS 2.7 documentation can be found at: http://docs.python.org/2.7/ This is a production release. Please report any bugs you find to the bug tracker: http://bugs.python.org/ Enjoy! -- Benjamin Peterson Release Manager benjamin at python.org (on behalf of the entire python-dev team and 2.7.1's contributors) From benjamin at python.org Sun Nov 28 05:34:42 2010 From: benjamin at python.org (Benjamin Peterson) Date: Sat, 27 Nov 2010 22:34:42 -0600 Subject: [Python-Dev] [RELEASED] Python 3.1.3 Message-ID: On behalf of the Python development team, I'm happy as a lark to announce the third bugfix release for the Python 3.1 series, Python 3.1.3. This bug fix release features numerous bug fixes and documentation improvements over 3.1.2. The Python 3.1 version series focuses on the stabilization and optimization of the features and changes that Python 3.0 introduced. For example, the new I/O system has been rewritten in C for speed. File system APIs that use unicode strings now handle paths with undecodable bytes in them. Other features include an ordered dictionary implementation, a condensed syntax for nested with statements, and support for ttk Tile in Tkinter. For a more extensive list of changes in 3.1, see http://doc.python.org/3.1/whatsnew/3.1.html or Misc/NEWS in the Python distribution. This is a production release. To download Python 3.1.3 visit: http://www.python.org/download/releases/3.1.3/ A list of changes in 3.1.3 can be found here: http://svn.python.org/projects/python/tags/r313/Misc/NEWS The 3.1 documentation can be found at: http://docs.python.org/3.1 Bugs can always be reported to: http://bugs.python.org Enjoy! 
-- 
Benjamin Peterson
Release Manager
benjamin at python.org
(on behalf of the entire python-dev team and 3.1.3's contributors)

From martin at v.loewis.de  Sun Nov 28 09:09:53 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 28 Nov 2010 09:09:53 +0100
Subject: [Python-Dev] Virus on python-3.1.2.msi?
Message-ID: <4CF20E51.3050004@v.loewis.de>

Issue 10500 claims that the 3.1.2 installer has the virus Palevo.DZ. Can somebody with a virus scanner please confirm or contest that claim?

Thanks,
Martin

http://bugs.python.org/issue10500

From fuzzyman at voidspace.org.uk  Sun Nov 28 14:48:08 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sun, 28 Nov 2010 13:48:08 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To:
References: <20101121034404.52924F20A@mail.python.org>
	<4CE9BF4A.1020302@netwok.org>
	<4CEA89E8.5090107@voidspace.org.uk>
	<20101122163722.7e96d123@pitrou.net>
	<4CEA9584.7040301@avl.com>
	<20101122172440.77d27ed5@pitrou.net>
	<20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>
	<4CEBC6BD.9060402@voidspace.org.uk>
	<4CED0557.9090101@voidspace.org.uk>
	<4CED4E34.5060400@voidspace.org.uk>
	<4CF1706E.5030503@g.nevcal.com>
	<1D372F35-B455-4982-997B-2C54A7D56741@gmail.com>
Message-ID: <4CF25D98.10105@voidspace.org.uk>

On 28/11/2010 03:20, Terry Reedy wrote:
> On 11/27/2010 6:26 PM, Raymond Hettinger wrote:
>
>> Can I suggest that an enum-maker be offered as a third-party module
>
> Possibly with competing versions for trial and testing ;-)
>
>> rather than prematurely adding it into the standard library.
>
> I had same thought.
>

There are already *several* enum packages for Python available. The implementation by Ben Finney, associated with the previous PEP, is on PyPI and the most recent release has over 4000 downloads making it reasonably popular:

http://pypi.python.org/pypi/enum/

Other contenders include flufl.enum and lazr.enum.
The Twisted guys would like a named constant type, and have a ticket for it, and PyQt has its own implementation (subclassing int) providing this functionality. In terms of assessing *general* usefulness in the wider community that step has already been done.

This discussion came out of yet-another-set-of-integer-constants being added to the Python standard library (since changed to strings). We have integer constants, with the associated inscrutability when used from the interactive interpreter or debugging, in *many* standard library modules.

The particular features and use cases being discussed have use *within* the standard library in mind. Releasing yet-another-enum-library-that-the-standard-library-can't-use would be a particularly pointless outcome of this discussion.

The decision is whether or not to use named constants in the standard library, otherwise we can just point people at one of the several existing packages.

All the best,

Michael Foord

-- 
http://www.voidspace.org.uk/

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.
From doko at ubuntu.com Sun Nov 28 16:46:09 2010 From: doko at ubuntu.com (Matthias Klose) Date: Sun, 28 Nov 2010 16:46:09 +0100 Subject: [Python-Dev] Question about GDB bindings and 32/64 bits In-Reply-To: <4CEF338C.4070509@jcea.es> References: <4CEF338C.4070509@jcea.es> Message-ID: <4CF27941.1020200@ubuntu.com> On 26.11.2010 05:11, Jesus Cea wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > I have installed GDB 7.2 32 bits and 32 bits buildslaves are green. > Nevertheless 64 bits buildslaves are failing test_gdb. > > Is there any expectation that a 32 bits GDB be able to debug a 64 bits > python?. If not, gdb test should compare "platform.architecture()" (for > python and gdb in the system) and run only when they are the same. that would be too restrictive, as an 64bit gdb is able to handle 32bit binaries too. > If > this should work, I would open a bug and maybe spend some time with it. > > But before thinking about investing time, I would like to know if this > mix is actually expected or not to work. > > If not, I would consider to install a 64 bits GDB too and do some tricks > (like using an "/usr/local/bin/gdb" script wrapper to choose 32/64 > "real" gdb version) to actually execute "test_gdb" in both buildslaves > (they are running in the same physical machine). yes, and then you should be able to use this gdb for both 32 and 64bit builds. No need for a wrapper (Such a gdb is available in the gdb64 package on Debian/Ubuntu). 
Matthias From fuzzyman at voidspace.org.uk Sun Nov 28 17:28:00 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sun, 28 Nov 2010 16:28:00 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> <4CF1706E.5030503@g.nevcal.com> <1D372F35-B455-4982-997B-2C54A7D56741@gmail.com> Message-ID: <4CF28310.7070304@voidspace.org.uk> On 28/11/2010 02:38, Nick Coghlan wrote: > On Sun, Nov 28, 2010 at 9:26 AM, Raymond Hettinger > wrote: >> On Nov 27, 2010, at 12:56 PM, Glenn Linderman wrote: >> >>> On 11/27/2010 2:51 AM, Nick Coghlan wrote: >>>> Not quite. I'm suggesting a factory function that works for any value, >>>> and derives the parent class from the type of the supplied value. >>> Nick, thanks for the much better implementation than I achieved; you seem to have the same goals as my implementation. I learned a bit making mine, and more understanding yours to some degree. What I still don't understand about your implementation, is that when adding one additional line to your file, it fails: >>> >>> w = named_value("ABC", z ) >>> >>> Now I can understand why it might not be a good thing to make a named value of a named value (confusing, at least), but I was surprised, and still do not understand, that it failed reporting the __new__() takes exactly 3 arguments (2 given). >> Can I suggest that an enum-maker be offered as a third-party module rather than prematurely adding it into the standard library. > Indeed. 
Glenn's failing example suggests to me that using a new > metaclass is probably going to be a cleaner option than trying to > dance around type's default behaviour within an ordinary class > definition (if nothing else, a separate metaclass makes it much easier > to detect when you're dealing with an instance of a named type). > Yep, for representing a group of names a single class with a metaclass seems like a reasonable approach. See my note below about agreeing minimal feature-set and minimal-api before we discuss implementation though. > Regardless, I still see value in approaching this whole discussion as > a two-level design problem, with "named values" as the more > fundamental concept, and then higher level grouping APIs to get > enum-style behaviour. It seems like using the term "enum" provokes a strong negative reaction in some of the core-devs who are basically in favour named constants and not actively against grouping. I'm happy with NamedConstant and GroupedNames (or similar) and dropping the use of the term enum. There are also valid concerns about over-engineering (and not so valid concerns...). Simplicity in creating them and no additional burden in using them are fundamental, but in the APIs / implementations suggested so far I think we are keeping that in mind. > Eventually attaining "One Obvious Way" for the > former seems achievable to me, while the diversity of use cases for > grouping APIs suggests to me that "one-size-fits-all" isn't going to > work unless that "one size" is a Frankenstein API with more options > than anyone could reasonably hope to keep in their head at once. > Well... yes - treating it as a two level design problem is fine. I don't think there are *many* competing features, in fact as far as feature requests on python-dev go I think this is a relatively straightforward one with a lot of *agreement* on the basic functionality. 
We have had various discussions about what the API should look like, or
what the implementation should look like, but I don't think there is a
lot of disagreement about basic features. There are some 'optional
features'. Many of these can be added later without backwards
compatibility issues, so those can profitably be omitted from an initial
implementation.

Features as I see them:

Named constant
--------------

* Nice repr
* Subclass of the type it represents
* Trivially easy to convert to either a string (name) or the value it
  represents
* If an integer type, can be OR'd with other named constants and retains
  a useful repr

Grouped constants
-----------------

* Easy to create a group of named constants, accessible as attributes on
  group object
* Capability to go from name or value to corresponding constants

Optional Features
-----------------

* Ability to dynamically add new named values to a group. (Suggested by
  Guido)
* Ability to test if a name or value is in a group
* Ability to list all names in a group
* ANDing as well as ORing
* Constants are unique
* OR'ing with an integer will look up the name (or calculate it if the
  int itself represents flags that have already been OR'd) and return a
  named value (with useful repr) instead of just an integer
* Named constants can be named values that wrap *any* type and not just
  immutable values. (Note that wrapping mutable types makes providing
  "from_value" functionality harder *unless* we guarantee that named
  values are unique. If they aren't unique, named values for a mutable
  type can have different values and there is no single definition of
  what the named value actually is.)
* Requiring that values only have one name - or alternatively that
  values on a group could have multiple names (obviously incompatible
  features).
* Requiring all names in a group to be of the same type
* Allow names to be set automatically in a namespace, for example in a
  class namespace or on a module
* Allow subclassing and adding of new values only present in subclass

I'd rather we agree a suitable (minimal) API and feature set and go to
implementation from that.

For wrapping mutable types I'm tempted to say YAGNI. For the standard
library wrapping integers meets almost all our use-cases except for one
float. (At work we have a decimal constant as it happens.) Perhaps we
could require immutable types for groups but allow arbitrary values for
individual named values?

For the named values api:

    name = NamedValue('name', value)

For the grouping (tentatively accepted as reasonable by Antoine):

    Group = make_constants('Group', name1=value1, name2=value2)
    name1, name2 = Group.name1, Group.name2
    flag = name1 | name2

    value = int(Group.name1)
    name = Group('name1')
    # alternatively: value = Group.from_name('name1')
    name = Group.from_value(value1)
    # Group(value1) could work only if values aren't strings
    # perhaps: name = Group(value=value1)

    Group.new_name = value3 # create new value on the group
    names = Group.all_names()
    # further bikeshedding on spelling of all_names required
    # correspondingly 'all_values' I guess, returning the constants themselves

Some of the optional features couldn't later be added without backwards
compatibility concerns (I think the type checking features and requiring
unique values for example). We should at least consider these if we are
to make adding them later difficult. I would be fine with not having
these features.

All the best,

Michael

> Cheers,
> Nick.
>

--
http://www.voidspace.org.uk/

READ CAREFULLY.
By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From fuzzyman at voidspace.org.uk Sun Nov 28 18:05:12 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sun, 28 Nov 2010 17:05:12 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CF28310.7070304@voidspace.org.uk> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> <4CF1706E.5030503@g.nevcal.com> <1D372F35-B455-4982-997B-2C54A7D56741@gmail.com> <4CF28310.7070304@voidspace.org.uk> Message-ID: <4CF28BC8.1080508@voidspace.org.uk> On 28/11/2010 16:28, Michael Foord wrote: > [snip...] > I don't think there are *many* competing features, in fact as far as > feature requests on python-dev go I think this is a relatively > straightforward one with a lot of *agreement* on the basic functionality. > > We have had various discussions about what the API should look like, > or what the implementation should look like, but I don't think there > is a lot of disagreement about basic features. There are some > 'optional features'. 
Many of these can be added later without > backwards compatibility issues, so those can profitably be omitted > from an initial implementation. > > Features as I see them: > > Named constant > -------------- > > * Nice repr > * Subclass of the type it represents > * Trivially easy to convert either to a string (name) and the value it > represents > * If an integer type, can be OR'd with other named constants and > retains a useful repr > Note that having an OR repr is meaningless *unless* the constants are intended to be flags, OR'ing should be specified. name = NamedValue('name', value, flags=True) Where flags defaults to False. Typically you will use this through the grouping API anyway - where it can either be a keyword argument (slightly annoying because the suggestion is to create the named values through keyword arguments) or we can have two group-factory functions: Group = make_constants('Group', name1=value1, name2=value2) Flags = make_flags('Flags', name1=value1, name2=value2) It is sensible if flag values are only powers of 2; we could enforce that or not... (Another one for the optional feature list.) I forgot auto-enumeration (specifying names only and having values autogenerated) from the optional feature set by the way. I think Antoine strongly disapproves of this feature because it reminds him of C enums. Mark Dickinson thinks that the flags feature could be an optional feature too. If we have ORing it makes sense to have ANDing, so I guess they belong together. I think there is value in it though. I realise that the optional feature list is now not small, and implementing all of it would create the "franken-api" Nick is worried about. The minimal feature list is nicely small though and provides useful functionality. 
All the best, Michael > > Grouped constants > ---------------- > * Easy to create a group of named constants, accessible as attributes > on group object > * Capability to go from name or value to corresponding constants > > > Optional Features > --------------- > > * Ability to dynamically add new named values to a group. (Suggested > by Guido) > * Ability to test if a name or value is in a group > * Ability to list all names in a group > * ANDing as well as ORing > * Constants are unique > * OR'ing with an integer will look up the name (or calculate it if the > int itself represents flags that have already been OR'd) and return a > named value (with useful repr) instead of just an integer > * Named constants be named values that can wrap *any* type and not > just immutable values. (Note that wrapping mutable types makes > providing "from_value" functionality harder *unless* we guarantee that > named values are unique. If they aren't unique named values for a > mutable type can have different values and there is no single > definition of what the named value actually is.) > Requiring that values only have one name - or alternatively that > values on a group could have multiple names (obviously incompatible > features). > * Requiring all names in a group to be of the same type > * Allow names to be set automatically in a namespace, for example in a > class namespace or on a module > * Allow subclassing and adding of new values only present in subclass > > > I'd rather we agree a suitable (minimal) API and feature set and go to > implementation from that. > > For wrapping mutable types I'm tempted to say YAGNI. For the standard > library wrapping integers meets almost all our use-cases except for > one float. (At work we have a decimal constant as it happens.) Perhaps > we could require immutable types for groups but allow arbitrary values > for individual named values? 
> > For the named values api: > > name = NamedValue('name', value) > > For the grouping (tentatively accepted as reasonable by Antoine): > > Group = make_constants('Group', name1=value1, name2=value2) > name1, name2 = Group.name1, Group.name1 > flag = name1 | name2 > > value = int(Group.name1) > name = Group('name1') > # alternatively: value = Group.from_name('name1') > name = Group.from_value(value1) > # Group(value1) could work only if values aren't strings > # perhaps: name = Group(value=value1) > > Group.new_name = value3 # create new value on the group > names = Group.all_names() > # further bikeshedding on spelling of all_names required > # correspondingly 'all_values' I guess, returning the constants > themselves > > Some of the optional features couldn't later be added without > backwards compatibility concerns (I think the type checking features > and requiring unique values for example). We should at least consider > these if we are to make adding them later difficult. I would be fine > with not having these features. > > All the best, > > Michael > >> Cheers, >> Nick. >> > > -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. 
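A minimal sketch of the named-value and grouping API discussed in this message, restricted to integers. The spellings `make_constants`, `make_flags`, `from_name`, `from_value` and the `flags` keyword follow the proposal in the thread; `NamedInt` and `_Group` are hypothetical names, and the repr format and the rule that only flag constants keep a name when OR'd are assumptions made for illustration, not an agreed design.

```python
class NamedInt(int):
    """Hypothetical int subclass that remembers the name it was given."""

    def __new__(cls, name, value, flags=False):
        self = super().__new__(cls, value)
        self.name = name
        self.flags = flags
        return self

    def __repr__(self):
        # Assumed repr format; the thread did not settle on one.
        return '<{0}: {1}>'.format(self.name, int(self))

    def __or__(self, other):
        result = int.__or__(self, other)
        if result is NotImplemented:
            return result
        # Only flag constants keep a name when OR'd together;
        # everything else degrades to a plain int.
        if self.flags and isinstance(other, NamedInt):
            combined = '{0}|{1}'.format(self.name, other.name)
            return NamedInt(combined, result, flags=True)
        return result


class _Group:
    """Hypothetical container exposing constants as attributes."""

    def __init__(self, group_name, constants):
        self._group_name = group_name
        self._constants = dict(constants)
        for name, constant in self._constants.items():
            setattr(self, name, constant)

    def from_name(self, name):
        return self._constants[name]

    def from_value(self, value):
        # Returns the first constant equal to value; uniqueness is not enforced.
        for constant in self._constants.values():
            if constant == value:
                return constant
        raise ValueError('no constant with value {0!r}'.format(value))


def make_constants(group_name, **names):
    """Build a group of plain named integers."""
    return _Group(group_name,
                  {n: NamedInt(n, v) for n, v in names.items()})


def make_flags(group_name, **names):
    """Build a group whose members keep a combined name when OR'd."""
    return _Group(group_name,
                  {n: NamedInt(n, v, flags=True) for n, v in names.items()})
```

With this sketch, `make_flags('Flags', read=1, write=2)` gives a group where `Flags.read | Flags.write` is still an `int` with value 3 but carries the name `read|write`, whereas constants built with `make_constants` fall back to an ordinary integer when OR'd.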
From fuzzyman at voidspace.org.uk Sun Nov 28 18:16:21 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sun, 28 Nov 2010 17:16:21 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CF28BC8.1080508@voidspace.org.uk> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> <4CF1706E.5030503@g.nevcal.com> <1D372F35-B455-4982-997B-2C54A7D56741@gmail.com> <4CF28310.7070304@voidspace.org.uk> <4CF28BC8.1080508@voidspace.org.uk> Message-ID: <4CF28E65.2060405@voidspace.org.uk> On 28/11/2010 17:05, Michael Foord wrote: > [snip...] > It is sensible if flag values are only powers of 2; we could enforce > that or not... (Another one for the optional feature list.) > Another 'optional' feature I omitted was Phillip J. Eby's suggestion / requirement that named values be pickleable. Email is clunky for handling this, is there enough support (there is still some objection that is sure) to revive the PEP or create a new one? I also didn't include Nick's suggested API, which is slightly different from the one I suggested: silly = Namegroup.from_names("Silly", "FOO", "BAR", "BAZ") >>> silly.FOO Silly.FOO=0 >>> int(silly.FOO) 0 >>> silly(0) Silly.FOO=0 x = named_value("FOO", 1) y = named_value("BAR", "Hello World!") z = named_value("BAZ", dict(a=1, b=2, c=3)) set_named_values(globals(), foo=x._raw(), bar=y._raw(), baz=z._raw()) Where a named value created from an integer is an int subclass, from a dict a dict subclass and so on. Michael > I forgot auto-enumeration (specifying names only and having values > autogenerated) from the optional feature set by the way. 
I think > Antoine strongly disapproves of this feature because it reminds him of > C enums. > > Mark Dickinson thinks that the flags feature could be an optional > feature too. If we have ORing it makes sense to have ANDing, so I > guess they belong together. I think there is value in it though. > > I realise that the optional feature list is now not small, and > implementing all of it would create the "franken-api" Nick is worried > about. The minimal feature list is nicely small though and provides > useful functionality. > > All the best, > > Michael > >> >> Grouped constants >> ---------------- >> * Easy to create a group of named constants, accessible as attributes >> on group object >> * Capability to go from name or value to corresponding constants >> >> >> Optional Features >> --------------- >> >> * Ability to dynamically add new named values to a group. (Suggested >> by Guido) >> * Ability to test if a name or value is in a group >> * Ability to list all names in a group >> * ANDing as well as ORing >> * Constants are unique >> * OR'ing with an integer will look up the name (or calculate it if >> the int itself represents flags that have already been OR'd) and >> return a named value (with useful repr) instead of just an integer >> * Named constants be named values that can wrap *any* type and not >> just immutable values. (Note that wrapping mutable types makes >> providing "from_value" functionality harder *unless* we guarantee >> that named values are unique. If they aren't unique named values for >> a mutable type can have different values and there is no single >> definition of what the named value actually is.) >> Requiring that values only have one name - or alternatively that >> values on a group could have multiple names (obviously incompatible >> features). 
>> * Requiring all names in a group to be of the same type >> * Allow names to be set automatically in a namespace, for example in >> a class namespace or on a module >> * Allow subclassing and adding of new values only present in subclass >> >> >> I'd rather we agree a suitable (minimal) API and feature set and go >> to implementation from that. >> >> For wrapping mutable types I'm tempted to say YAGNI. For the standard >> library wrapping integers meets almost all our use-cases except for >> one float. (At work we have a decimal constant as it happens.) >> Perhaps we could require immutable types for groups but allow >> arbitrary values for individual named values? >> >> For the named values api: >> >> name = NamedValue('name', value) >> >> For the grouping (tentatively accepted as reasonable by Antoine): >> >> Group = make_constants('Group', name1=value1, name2=value2) >> name1, name2 = Group.name1, Group.name1 >> flag = name1 | name2 >> >> value = int(Group.name1) >> name = Group('name1') >> # alternatively: value = Group.from_name('name1') >> name = Group.from_value(value1) >> # Group(value1) could work only if values aren't strings >> # perhaps: name = Group(value=value1) >> >> Group.new_name = value3 # create new value on the group >> names = Group.all_names() >> # further bikeshedding on spelling of all_names required >> # correspondingly 'all_values' I guess, returning the constants >> themselves >> >> Some of the optional features couldn't later be added without >> backwards compatibility concerns (I think the type checking features >> and requiring unique values for example). We should at least consider >> these if we are to make adding them later difficult. I would be fine >> with not having these features. >> >> All the best, >> >> Michael >> >>> Cheers, >>> Nick. >>> >> >> > > -- http://www.voidspace.org.uk/ READ CAREFULLY. 
By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From steve at pearwood.info Sun Nov 28 19:05:55 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 29 Nov 2010 05:05:55 +1100 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CF28E65.2060405@voidspace.org.uk> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> <4CF1706E.5030503@g.nevcal.com> <1D372F35-B455-4982-997B-2C54A7D56741@gmail.com> <4CF28310.7070304@voidspace.org.uk> <4CF28BC8.1080508@voidspace.org.uk> <4CF28E65.2060405@voidspace.org.uk> Message-ID: <4CF29A03.3060900@pearwood.info> Michael Foord wrote: > Another 'optional' feature I omitted was Phillip J. Eby's suggestion / > requirement that named values be pickleable. Email is clunky for > handling this, is there enough support (there is still some objection > that is sure) to revive the PEP or create a new one? I think it definitely needs a PEP. I don't care whether you revive the old PEP or write a new one. 
--
Steven

From fuzzyman at voidspace.org.uk Sun Nov 28 19:49:30 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sun, 28 Nov 2010 18:49:30 +0000
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <4CF29A03.3060900@pearwood.info>
References: <20101121034404.52924F20A@mail.python.org>
 <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net>
 <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net>
 <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>
 <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk>
 <4CED4E34.5060400@voidspace.org.uk> <4CF1706E.5030503@g.nevcal.com>
 <1D372F35-B455-4982-997B-2C54A7D56741@gmail.com>
 <4CF28310.7070304@voidspace.org.uk> <4CF28BC8.1080508@voidspace.org.uk>
 <4CF28E65.2060405@voidspace.org.uk> <4CF29A03.3060900@pearwood.info>
Message-ID: <4CF2A43A.5040009@voidspace.org.uk>

On 28/11/2010 18:05, Steven D'Aprano wrote:
> Michael Foord wrote:
>
>> Another 'optional' feature I omitted was Phillip J. Eby's suggestion
>> / requirement that named values be pickleable. Email is clunky for
>> handling this, is there enough support (there is still some objection
>> that is sure) to revive the PEP or create a new one?
>
> I think it definitely needs a PEP. I don't care whether you revive the
> old PEP or write a new one.
>

Well, "if it were to be accepted it would need a PEP" and "the next step
should be a PEP" are slightly different statements. :-)

As I agree with the former *anyway* at the worst starting a PEP will
waste time, so I guess I'll get that underway when I get a chance...

Thanks

Michael

--
http://www.voidspace.org.uk/

From alexander.belopolsky at gmail.com Sun Nov 28 21:24:37 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 15:24:37 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
Message-ID:

Two recently reported issues brought to light the fact that the Python
language definition is closely tied to character properties maintained
by the Unicode Consortium. [1,2] For example, when Python switches to
Unicode 6.0.0 (planned for the upcoming 3.2 release), we will gain two
additional characters that Python can use in identifiers. [3]

With Python 3.1:

>>> exec('\u0CF1 = 1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1
    ೱ = 1
    ^
SyntaxError: invalid character in identifier

but with Python 3.2a4:

>>> exec('\u0CF1 = 1')
>>> eval('\u0CF1')
1

Of course, the likelihood is low that this change will affect any user,
but the change in str.isspace() reported in [1] is likely to cause some
trouble:

Python 2.6.5:
>>> u'A\u200bB'.split()
[u'A', u'B']

Python 2.7:
>>> u'A\u200bB'.split()
[u'A\u200bB']

While we have little choice but to follow UCD in defining
str.isidentifier(), I think Python can promise users more stability in
what it treats as space or as a digit in its builtins.
For example, I don't think that supporting >>> float('????.??') 1234.56 is more important than to assure users that once their program accepted some text as a number, they can assume that the text is ASCII. [1] http://bugs.python.org/issue10567 [2] http://bugs.python.org/issue10557 [3] http://www.unicode.org/versions/Unicode6.0.0/#Database_Changes From solipsis at pitrou.net Sun Nov 28 21:43:11 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 28 Nov 2010 21:43:11 +0100 Subject: [Python-Dev] Python and the Unicode Character Database References: Message-ID: <20101128214311.092abd35@pitrou.net> On Sun, 28 Nov 2010 15:24:37 -0500 Alexander Belopolsky wrote: > While we have little choice but to follow UCD in defining > str.isidentifier(), I think Python can promise users more stability in > what it treats as space or as a digit in its builtins. Well, if "unicode support" means "support the latest version of the Unicode standard", I'm not sure we have a choice. We can make exceptions, but that would only confuse users even more, wouldn't it? > For example, > I don't think that supporting > > >>> float('????.??') > 1234.56 > > is more important than to assure users that once their program > accepted some text as a number, they can assume that the text is > ASCII. Why would they assume the text is ASCII? Regards Antoine. From alexander.belopolsky at gmail.com Sun Nov 28 21:58:33 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 28 Nov 2010 15:58:33 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <20101128214311.092abd35@pitrou.net> References: <20101128214311.092abd35@pitrou.net> Message-ID: On Sun, Nov 28, 2010 at 3:43 PM, Antoine Pitrou wrote: .. >> For example, >> I don't think that supporting >> >> >>> float('????.??') >> 1234.56 >> >> is more important than to assure users that once their program >> accepted some text as a number, they can assume that the text is >> ASCII. 
> > Why would they assume the text is ASCII? def deposit(self, amountstr): self.balance += float(amountstr) audit_log("Deposited: " + amountstr) Auditor: $ cat numbered-account.log Deposited: ?????.?? ... From solipsis at pitrou.net Sun Nov 28 22:04:15 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 28 Nov 2010 22:04:15 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> Message-ID: <20101128220415.28b77508@pitrou.net> On Sun, 28 Nov 2010 15:58:33 -0500 Alexander Belopolsky wrote: > On Sun, Nov 28, 2010 at 3:43 PM, Antoine Pitrou wrote: > .. > >> For example, > >> I don't think that supporting > >> > >> >>> float('????.??') > >> 1234.56 > >> > >> is more important than to assure users that once their program > >> accepted some text as a number, they can assume that the text is > >> ASCII. > > > > Why would they assume the text is ASCII? > > def deposit(self, amountstr): > self.balance += float(amountstr) > audit_log("Deposited: " + amountstr) > > Auditor: > > $ cat numbered-account.log > Deposited: ?????.?? I'm not sure that's how banking applications are written :) Antoine. From jsbueno at python.org.br Sun Nov 28 22:12:09 2010 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Sun, 28 Nov 2010 19:12:09 -0200 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <20101128220415.28b77508@pitrou.net> References: <20101128214311.092abd35@pitrou.net> <20101128220415.28b77508@pitrou.net> Message-ID: On Sun, Nov 28, 2010 at 7:04 PM, Antoine Pitrou wrote: > On Sun, 28 Nov 2010 15:58:33 -0500 > Alexander Belopolsky wrote: > >> On Sun, Nov 28, 2010 at 3:43 PM, Antoine Pitrou wrote: >> .. >> >> For example, >> >> I don't think that supporting >> >> >> >> >>> float('????.??') >> >> 1234.56 >> >> >> >> is more important than to assure users that once their program >> >> accepted some text as a number, they can assume that the text is >> >> ASCII. 
>> > >> > Why would they assume the text is ASCII? >> >> def deposit(self, amountstr): >> ? ? ? self.balance += float(amountstr) >> ? ? ? audit_log("Deposited: " + amountstr) >> >> Auditor: >> >> $ cat numbered-account.log >> Deposited: ?????.?? > > > I'm not sure that's how banking applications are written :) > +1 for this being bogus - I see no correlation whatsoever in numbers inside unicode having to be "ASCII" if we have surpassed all technical barriers for needing to behave like that. ASCII is an oversimplification of human communication needed for computing devices not complex enough to represent it fully. Let novice C programmers in English speaking countries deal with the fact that 1 character is not 1 byte anymore. We are past this point. js -><- > Antoine. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/jsbueno%40python.org.br > From alexander.belopolsky at gmail.com Sun Nov 28 22:18:06 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 28 Nov 2010 16:18:06 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <20101128220415.28b77508@pitrou.net> Message-ID: On Sun, Nov 28, 2010 at 4:12 PM, Joao S. O. Bueno wrote: .. > Let novice C programmers in English speaking countries deal with the > fact that 1 character is not 1 byte anymore. We are past this point. 
If you are, please contribute your expertise here:

http://bugs.python.org/issue2382

From greg.ewing at canterbury.ac.nz Sun Nov 28 22:23:56 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 29 Nov 2010 10:23:56 +1300
Subject: [Python-Dev] constant/enum type in stdlib
In-Reply-To: <4CEE5C1C.9000905@btinternet.com>
References: <20101121034404.52924F20A@mail.python.org>
 <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk>
 <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com>
 <20101122172440.77d27ed5@pitrou.net>
 <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain>
 <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk>
 <4CEDDC2D.204@canterbury.ac.nz> <4CEE5C1C.9000905@btinternet.com>
Message-ID: <4CF2C86C.9030505@canterbury.ac.nz>

Rob Cliffe wrote:

> But couldn't they be presented to the Python programmer as a single
> type, with the implementation details hidden "under the hood"?

Not in CPython, because tuple items are kept in the same block of memory
as the object header. Because CPython can't move objects, this means
that the size of the tuple must be known when the object is created.

--
Greg

From martin at v.loewis.de Sun Nov 28 23:17:13 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Sun, 28 Nov 2010 23:17:13 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <20101128214311.092abd35@pitrou.net>
References: <20101128214311.092abd35@pitrou.net>
Message-ID: <4CF2D4E9.3060607@v.loewis.de>

>> >>> float('????.??')
>> 1234.56

I think it's a bug that this works. The definition of the float builtin
says

    Convert a string or a number to floating point. If the argument is a
    string, it must contain a possibly signed decimal or floating point
    number, possibly embedded in whitespace. The argument may also be
    '[+|-]nan' or '[+|-]inf'.

Now, one may wonder what precisely a "possibly signed floating point
number" is, but most likely, this refers to

    floatnumber   ::=  pointfloat | exponentfloat
    pointfloat    ::=  [intpart] fraction | intpart "."
    exponentfloat ::=  (intpart | pointfloat) exponent
    intpart       ::=  digit+
    fraction      ::=  "." digit+
    exponent      ::=  ("e" | "E") ["+" | "-"] digit+
    digit         ::=  "0"..."9"

Regards,
Martin

From alexander.belopolsky at gmail.com Sun Nov 28 23:31:51 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sun, 28 Nov 2010 17:31:51 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <4CF2D4E9.3060607@v.loewis.de>
References: <20101128214311.092abd35@pitrou.net>
 <4CF2D4E9.3060607@v.loewis.de>
Message-ID:

On Sun, Nov 28, 2010 at 5:17 PM, "Martin v. Löwis" wrote:
>>> >>> float('????.??')
>>> 1234.56
>
> I think it's a bug that this works. The definition of the float builtin says
>
> Convert a string or a number to floating point. If the argument is a
> string, it must contain a possibly signed decimal or floating point
> number, possibly embedded in whitespace. The argument may also be
> '[+|-]nan' or '[+|-]inf'.
>

This definition fails long before we get beyond 127-th code point:

>>> float('infinity')
inf

From mal at egenix.com Sun Nov 28 23:42:31 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun, 28 Nov 2010 23:42:31 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <4CF2D4E9.3060607@v.loewis.de>
References: <20101128214311.092abd35@pitrou.net>
 <4CF2D4E9.3060607@v.loewis.de>
Message-ID: <4CF2DAD7.2000408@egenix.com>

"Martin v. Löwis" wrote:
>>> >>> float('????.??')
>>> 1234.56
>
> I think it's a bug that this works. The definition of the float builtin says
>
> Convert a string or a number to floating point. If the argument is a
> string, it must contain a possibly signed decimal or floating point
> number, possibly embedded in whitespace. The argument may also be
> '[+|-]nan' or '[+|-]inf'.
> > Now, one may wonder what precisely a "possibly signed floating point > number" is, but most likely, this refers to > > floatnumber ::= pointfloat | exponentfloat > pointfloat ::= [intpart] fraction | intpart "." > exponentfloat ::= (intpart | pointfloat) exponent > intpart ::= digit+ > fraction ::= "." digit+ > exponent ::= ("e" | "E") ["+" | "-"] digit+ > digit ::= "0"..."9" I don't see why the language spec should limit the wealth of number formats supported by float(). It is not uncommon for Asians and other non-Latin script users to use their own native script symbols for numbers. Just because these digits may look strange to someone doesn't mean that they are meaningless or should be discarded. Please also remember that Python3 now allows Unicode names for identifiers for much the same reasons. Note that the support in float() (and the other numeric constructors) to work with Unicode code points was explicitly added when Unicode support was added to Python and has been available since Python 1.6. It is not a bug by any definition of "bug", even though the feature may bug someone occasionally to go read up a bit on what else the world has to offer other than Arabic numerals :-) http://en.wikipedia.org/wiki/Numeral_system -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 28 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From mal at egenix.com Sun Nov 28 23:48:59 2010 From: mal at egenix.com (M.-A. 
Lemburg) Date: Sun, 28 Nov 2010 23:48:59 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: Message-ID: <4CF2DC5B.4020702@egenix.com> Alexander Belopolsky wrote: > Two recently reported issues brought into light the fact that Python > language definition is closely tied to character properties maintained > by the Unicode Consortium. [1,2] For example, when Python switches to > Unicode 6.0.0 (planned for the upcoming 3.2 release), we will gain two > additional characters that Python can use in identifiers. [3] > > With Python 3.1: > >>>> exec('\u0CF1 = 1') > Traceback (most recent call last): > File "", line 1, in > File "", line 1 > ? = 1 > ^ > SyntaxError: invalid character in identifier > > but with Python 3.2a4: > >>>> exec('\u0CF1 = 1') >>>> eval('\u0CF1') > 1 Such changes are not new, but I agree that they should probably be highlighted in the "What's new in Python x.x". > Of course, the likelihood is low that this change will affect any > user, but the change in str.isspace() reported in [1] is likely to > cause some trouble: > > Python 2.6.5: >>>> u'A\u200bB'.split() > [u'A', u'B'] > > Python 2.7: >>>> u'A\u200bB'.split() > [u'A\u200bB'] That's a classical bug fix. > While we have little choice but to follow UCD in defining > str.isidentifier(), I think Python can promise users more stability in > what it treats as space or as a digit in its builtins. Why should we divert from the work done by the Unicode Consortium ? After all, most of their changes are in fact bug fixes as well. > For example, > I don't think that supporting > >>>> float('????.??') > 1234.56 > > is more important than to assure users that once their program > accepted some text as a number, they can assume that the text is > ASCII. Sorry, but I don't agree. If ASCII numerals are an important aspect of an application, the application should make sure that only those numerals are used (e.g. by using a regular expression for checking). 
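The regular-expression check suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper name ascii_float and the exact pattern are hypothetical, not something proposed in this thread or provided by the standard library, and the pattern deliberately rejects 'inf' and 'nan' as well.

```python
import re

# Hypothetical helper: accept only ASCII digits before calling float().
# With re.ASCII, \d matches [0-9] only and \s matches ASCII whitespace only.
_ASCII_FLOAT = re.compile(
    r"\s*[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?\s*",
    re.ASCII,
)


def ascii_float(text):
    """Convert text to a float, rejecting non-ASCII digits (and inf/nan)."""
    if _ASCII_FLOAT.fullmatch(text) is None:
        raise ValueError("not an ASCII float literal: %r" % (text,))
    return float(text)


print(ascii_float(" 182.33 "))  # surrounding ASCII whitespace is allowed

try:
    # Arabic-Indic digits one to six around an ASCII full stop.
    ascii_float("\u0661\u0662\u0663\u0664.\u0665\u0666")
except ValueError as exc:
    print("rejected:", exc)
```

An application that cares about ASCII-only input would call such a helper instead of float() at its input boundary; code that wants the permissive behaviour keeps calling float() directly.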
In a Unicode world, not accepting non-Arabic numerals would be a limitation, not a feature. Besides Python has had this support since Python 1.6. > [1] http://bugs.python.org/issue10567 > [2] http://bugs.python.org/issue10557 > [3] http://www.unicode.org/versions/Unicode6.0.0/#Database_Changes -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 28 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From alexander.belopolsky at gmail.com Sun Nov 28 23:51:00 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 28 Nov 2010 17:51:00 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2DAD7.2000408@egenix.com> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> Message-ID: On Sun, Nov 28, 2010 at 5:42 PM, M.-A. Lemburg wrote: .. > I don't see why the language spec should limit the wealth of number > formats supported by float(). > The Language Spec (whatever it is) should not, but hopefully the Library Reference should. 
If you follow http://docs.python.org/dev/py3k/library/functions.html#float link and the references therein, you'll end up with digit ::= "0"..."9" http://docs.python.org/dev/py3k/reference/lexical_analysis.html#grammar-token-digit From martin at v.loewis.de Sun Nov 28 23:56:47 2010 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Sun, 28 Nov 2010 23:56:47 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> Message-ID: <4CF2DE2F.5040405@v.loewis.de> Am 28.11.2010 23:31, schrieb Alexander Belopolsky: > On Sun, Nov 28, 2010 at 5:17 PM, "Martin v. L?wis" wrote: >>>>>>> float('????.??') >>>> 1234.56 >> >> I think it's a bug that this works. The definition of the float builtin says >> >> Convert a string or a number to floating point. If the argument is a >> string, it must contain a possibly signed decimal or floating point >> number, possibly embedded in whitespace. The argument may also be >> '[+|-]nan' or '[+|-]inf'. >> > > This definition fails long before we get beyond 127-th code point: > >>>> float('infinity') > inf What do infer from that? That the definition is wrong, or the code is wrong? Regards, Martin From tjreedy at udel.edu Mon Nov 29 00:00:25 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 28 Nov 2010 18:00:25 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> Message-ID: On 11/28/2010 3:58 PM, Alexander Belopolsky wrote: > On Sun, Nov 28, 2010 at 3:43 PM, Antoine Pitrou wrote: > .. >>> For example, >>> I don't think that supporting >>> >>>>>> float('????.??') >>> 1234.56 Even if this is somehow an accident or something that someone snuck in, I think it a good idea that *users* be able to input amounts with their native digits. 
That is different from requiring *programmers* to write literals with euro-ascii-digits. >>> is more important than to assure users that once their program >>> accepted some text as a number, they can assume that the text is >>> ASCII. >> >> Why would they assume the text is ASCII? >
> def deposit(self, amountstr):
>     self.balance += float(amountstr)
>     audit_log("Deposited: " + amountstr)

If the programmer wants to ensure ASCII, he can produce a string, possibly formatted, from the amount:

depform = "Deposited: ${:14.2f}".format

def deposit(self, amountstr):
    amount = float(amountstr)
    self.balance += amount
    # audit_log("Deposited: " + str(amount))  # simple version
    audit_log(depform(amount))

Given that amountstr could be something like ' 182.33 ', I think the programmer should plan to format it. -- Terry Jan Reedy From alexander.belopolsky at gmail.com Mon Nov 29 00:01:10 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 28 Nov 2010 18:01:10 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2DE2F.5040405@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DE2F.5040405@v.loewis.de> Message-ID: On Sun, Nov 28, 2010 at 5:56 PM, "Martin v. Löwis" wrote: .. >> This definition fails long before we get beyond 127-th code point: >> >>>>> float('infinity') >> inf > > What do infer from that? That the definition is wrong, or the code is wrong? The development version of the reference manual is more detailed, but as far as I can tell, it still defines digit as 0-9.
http://docs.python.org/dev/py3k/library/functions.html#float From martin at v.loewis.de Mon Nov 29 00:03:45 2010 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Mon, 29 Nov 2010 00:03:45 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2DAD7.2000408@egenix.com> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> Message-ID: <4CF2DFD1.10901@v.loewis.de> >> Now, one may wonder what precisely a "possibly signed floating point >> number" is, but most likely, this refers to >> >> floatnumber ::= pointfloat | exponentfloat >> pointfloat ::= [intpart] fraction | intpart "." >> exponentfloat ::= (intpart | pointfloat) exponent >> intpart ::= digit+ >> fraction ::= "." digit+ >> exponent ::= ("e" | "E") ["+" | "-"] digit+ >> digit ::= "0"..."9" > > I don't see why the language spec should limit the wealth of number > formats supported by float(). If it doesn't, there should be some other specification of what is correct and what is not. It must not be unspecified. > It is not uncommon for Asians and other non-Latin script users to > use their own native script symbols for numbers. Just because these > digits may look strange to someone doesn't mean that they are > meaningless or should be discarded. Then these users should speak up and indicate their need, or somebody should speak up and confirm that there are users who actually want '????.??' to denote 1234.56. To my knowledge, there is no writing system in which '????.??e4' means 12345600.0. > Please also remember that Python3 now allows Unicode names for > identifiers for much the same reasons. No no no. Addition of Unicode identifiers has a well-designed, deliberate specification, with a PEP and all. The support for non-ASCII digits in float appears to be ad-hoc, and not founded on actual needs of actual users. 
> Note that the support in float() (and the other numeric constructors) > to work with Unicode code points was explicitly added when Unicode > support was added to Python and has been available since Python 1.6. That doesn't necessarily make it useful. Alexander's complaint is that it makes Python unstable (i.e. changing as the UCD changes). > It is not a bug by any definition of "bug" Most certainly it is: the documentation is either underspecified, or deviates from the implementation (when taking the most plausible interpretation). This is the very definition of "bug". Regards, Martin From tjreedy at udel.edu Mon Nov 29 00:03:30 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 28 Nov 2010 18:03:30 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> Message-ID: On 11/28/2010 5:51 PM, Alexander Belopolsky wrote: > The Language Spec (whatever it is) should not, but hopefully the > Library Reference should. If you follow > http://docs.python.org/dev/py3k/library/functions.html#float link and > the references therein, you'll end up with > > digit ::= "0"..."9" > > http://docs.python.org/dev/py3k/reference/lexical_analysis.html#grammar-token-digit So fix the doc for builtin float() and perhaps int(). -- Terry Jan Reedy From alexander.belopolsky at gmail.com Mon Nov 29 00:05:56 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 28 Nov 2010 18:05:56 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2DFD1.10901@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <4CF2DFD1.10901@v.loewis.de> Message-ID: +1 on all point below. On Sun, Nov 28, 2010 at 6:03 PM, "Martin v. 
L?wis" wrote: >>> Now, one may wonder what precisely a "possibly signed floating point >>> number" is, but most likely, this refers to >>> >>> floatnumber ? ::= ?pointfloat | exponentfloat >>> pointfloat ? ?::= ?[intpart] fraction | intpart "." >>> exponentfloat ::= ?(intpart | pointfloat) exponent >>> intpart ? ? ? ::= ?digit+ >>> fraction ? ? ?::= ?"." digit+ >>> exponent ? ? ?::= ?("e" | "E") ["+" | "-"] digit+ >>> digit ? ? ? ? ?::= ?"0"..."9" >> >> I don't see why the language spec should limit the wealth of number >> formats supported by float(). > > If it doesn't, there should be some other specification of what > is correct and what is not. It must not be unspecified. > >> It is not uncommon for Asians and other non-Latin script users to >> use their own native script symbols for numbers. Just because these >> digits may look strange to someone doesn't mean that they are >> meaningless or should be discarded. > > Then these users should speak up and indicate their need, or somebody > should speak up and confirm that there are users who actually want > '????.??' to denote 1234.56. To my knowledge, there is no writing > system in which '????.??e4' means 12345600.0. > >> Please also remember that Python3 now allows Unicode names for >> identifiers for much the same reasons. > > No no no. Addition of Unicode identifiers has a well-designed, > deliberate specification, with a PEP and all. The support for > non-ASCII digits in float appears to be ad-hoc, and not founded > on actual needs of actual users. > >> Note that the support in float() (and the other numeric constructors) >> to work with Unicode code points was explicitly added when Unicode >> support was added to Python and has been available since Python 1.6. > > That doesn't necessarily make it useful. Alexander's complaint is that > it makes Python unstable (i.e. changing as the UCD changes). 
> >> It is not a bug by any definition of "bug" > > Most certainly it is: the documentation is either underspecified, > or deviates from the implementation (when taking the most plausible > interpretation). This is the very definition of "bug". > > Regards, > Martin > From martin at v.loewis.de Mon Nov 29 00:08:29 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 29 Nov 2010 00:08:29 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DE2F.5040405@v.loewis.de> Message-ID: <4CF2E0ED.1080807@v.loewis.de> Am 29.11.2010 00:01, schrieb Alexander Belopolsky: > On Sun, Nov 28, 2010 at 5:56 PM, "Martin v. L?wis" wrote: > .. >>> This definition fails long before we get beyond 127-th code point: >>> >>>>>> float('infinity') >>> inf >> >> What do infer from that? That the definition is wrong, or the code is wrong? > > The development version of the reference manual is more detailed, but > as far as I can tell, it still defines digit as 0-9. > > http://docs.python.org/dev/py3k/library/functions.html#float > I wasn't asking about 0..9, but about "infinity". According to the spec, it shouldn't accept that (and neither should it accept 'infinitY'). However, whether that's a spec bug or an implementation bug - it seems like a minor issue to me (i.e. easily fixed). Regards, Martin From alexander.belopolsky at gmail.com Mon Nov 29 00:12:44 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 28 Nov 2010 18:12:44 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2DFD1.10901@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <4CF2DFD1.10901@v.loewis.de> Message-ID: On Sun, Nov 28, 2010 at 6:03 PM, "Martin v. L?wis" wrote: .. 
>> Note that the support in float() (and the other numeric constructors) >> to work with Unicode code points was explicitly added when Unicode >> support was added to Python and has been available since Python 1.6. > > That doesn't necessarily make it useful. Alexander's complaint is that > it makes Python unstable (i.e. changing as the UCD changes). > What makes it worse, is that while superficially, Unicode versions follow the same X.Y.Z format as Python versions, the stability promises are completely different. For example, it appears that the general category for the ZERO WIDTH SPACE was changed in Unicode 4.0.1. I don't think a change affecting str.split(), int(), float() and probably numerous other library functions would be acceptable in a Python micro release. From alexander.belopolsky at gmail.com Mon Nov 29 00:16:24 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 28 Nov 2010 18:16:24 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2E0ED.1080807@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DE2F.5040405@v.loewis.de> <4CF2E0ED.1080807@v.loewis.de> Message-ID: On Sun, Nov 28, 2010 at 6:08 PM, "Martin v. L?wis" wrote: > Am 29.11.2010 00:01, schrieb Alexander Belopolsky: >> On Sun, Nov 28, 2010 at 5:56 PM, "Martin v. L?wis" wrote: >> .. >>>> This definition fails long before we get beyond 127-th code point: >>>> >>>>>>> float('infinity') >>>> inf >>> >>> What do infer from that? That the definition is wrong, or the code is wrong? >> >> The development version of the reference manual is more detailed, but >> as far as I can tell, it still defines digit as 0-9. >> >> http://docs.python.org/dev/py3k/library/functions.html#float >> > > I wasn't asking about 0..9, but about "infinity". According to the > spec, it shouldn't accept that (and neither should it accept > 'infinitY'). 
According to the link that I mentioned, infinity ::= "Infinity" | "inf" and "Case is not significant, so, for example, “inf”, “Inf”, “INFINITY” and “iNfINity” are all acceptable spellings for positive infinity." I completely agree with your arguments and the reference manual has been improved a lot in recent years. From martin at v.loewis.de Mon Nov 29 00:19:54 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 29 Nov 2010 00:19:54 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <4CF2DFD1.10901@v.loewis.de> Message-ID: <4CF2E39A.6060605@v.loewis.de> > What makes it worse, is that while superficially, Unicode versions > follow the same X.Y.Z format as Python versions, the stability > promises are completely different. For example, it appears that the > general category for the ZERO WIDTH SPACE was changed in Unicode > 4.0.1. I don't think a change affecting str.split(), int(), float() > and probably numerous other library functions would be acceptable in a > Python micro release. Well, we managed to completely break Unicode normalization between 2.6.5 and 2.6.6, due to a bug. You can see the Unicode Consortium's stability policy at http://unicode.org/policies/stability_policy.html In a sense, this is stronger than Python's backwards compatibility promises (which allow for certain incompatible changes to occur over time, whereas Unicode makes promises about all future versions). Regards, Martin From benjamin at python.org Mon Nov 29 00:23:01 2010 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 28 Nov 2010 17:23:01 -0600 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2DAD7.2000408@egenix.com> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> Message-ID: 2010/11/28 M.-A.
Lemburg : > > > "Martin v. L?wis" wrote: >>>>>>> float('????.??') >>>> 1234.56 >> >> I think it's a bug that this works. The definition of the float builtin says >> >> Convert a string or a number to floating point. If the argument is a >> string, it must contain a possibly signed decimal or floating point >> number, possibly embedded in whitespace. The argument may also be >> '[+|-]nan' or '[+|-]inf'. >> >> Now, one may wonder what precisely a "possibly signed floating point >> number" is, but most likely, this refers to >> >> floatnumber ? ::= ?pointfloat | exponentfloat >> pointfloat ? ?::= ?[intpart] fraction | intpart "." >> exponentfloat ::= ?(intpart | pointfloat) exponent >> intpart ? ? ? ::= ?digit+ >> fraction ? ? ?::= ?"." digit+ >> exponent ? ? ?::= ?("e" | "E") ["+" | "-"] digit+ >> digit ? ? ? ? ?::= ?"0"..."9" > > I don't see why the language spec should limit the wealth of number > formats supported by float(). > > It is not uncommon for Asians and other non-Latin script users to > use their own native script symbols for numbers. Just because these > digits may look strange to someone doesn't mean that they are > meaningless or should be discarded. That's different. Python doesn't assign any semantic meaning to the characters in identifiers. The non-latin support for numerals, though, could change the meaning of a program dramatically and needs to be well-specified. Whether int() should do this is debatable. I, for one, think this kind of support belongs in the locale module. 
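One way to make explicit which characters are being treated as digits is to do the mapping as a separate, visible step before conversion. The sketch below is only an illustration under stated assumptions: the helper name to_ascii_digits is hypothetical, and it relies on unicodedata.decimal(), whose answers follow whichever UCD version is bundled with the interpreter in use.

```python
import unicodedata


def to_ascii_digits(text):
    """Replace every Unicode decimal digit in text with its ASCII form."""
    out = []
    for ch in text:
        # unicodedata.decimal() returns the default (None here) for
        # characters that have no decimal digit value.
        value = unicodedata.decimal(ch, None)
        out.append(ch if value is None else str(value))
    return "".join(out)


# Arabic-Indic digits one to six around an ASCII full stop.
sample = "\u0661\u0662\u0663\u0664.\u0665\u0666"
print(to_ascii_digits(sample))
```

Separating the mapping from the conversion would let an application log or reject the original text before it ever reaches int() or float().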
-- Regards, Benjamin From alexander.belopolsky at gmail.com Mon Nov 29 00:29:47 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 28 Nov 2010 18:29:47 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2E39A.6060605@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <4CF2DFD1.10901@v.loewis.de> <4CF2E39A.6060605@v.loewis.de> Message-ID: On Sun, Nov 28, 2010 at 6:19 PM, "Martin v. Löwis" wrote: .. > You can see the Unicode Consortium's stability policy at > > http://unicode.org/policies/stability_policy.html > From the link above: """ As more experience is gathered in implementing the characters, adjustments in the properties may become necessary. Examples of such properties include, but are not limited to, the following: General_Category ... """ > In a sense, this is stronger than Python's backwards compatibility > promises (which allow for certain incompatible changes to occur > over time, whereas Unicode makes promises about all future versions). I would say it is *different* and should be taken into account when tying language features to Unicode specifications. This was done in PEP 3131. Note that one of the stated objections was "Unicode is young; its problems are not yet well understood and solved;" (It is still true.) From martin at v.loewis.de Mon Nov 29 00:33:23 2010 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Mon, 29 Nov 2010 00:33:23 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> Message-ID: <4CF2E6C3.3010009@v.loewis.de> >>>>>>> float('١٢٣٤.٥٦') >>>> 1234.56 > > Even if this is somehow an accident or something that someone snuck in, > I think it a good idea that *users* be able to input amounts with their > native digits.
That is different from requiring *programmers* to write > literals with euro-ascii-digits So one question is what kind of data float() is aimed at. I claim that it is about "programmer" data, not "user" data. If it supported "user" data, it probably would have to support "1,000" to denote 1e3 in the U.S., and denote 1e0 in Germany. Our users are generally confused on whether they should use the full stop or the comma as the decimal separator. As not even the locale-dependent issues are considered in float(), it is clear to me that entering local numbers cannot possibly be the objective of the function. Instead, following a wide-spread Python convention, it is meant to be the reverse of repr(). Can you name a single person who actually wants to write '١٢٣٤.٥٦' as a number? I'm fairly skeptical that users of Arabic-Indic digits would. Instead, http://en.wikipedia.org/wiki/Decimal_separator suggests that they would rather use U+066B, i.e. '١٢٣٤٫٥٦', which isn't supported by Python. Regards, Martin From martin at v.loewis.de Mon Nov 29 00:40:31 2010 From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 29 Nov 2010 00:40:31 +0100 Subject: [Python-Dev] PEP 384 final review Message-ID: <4CF2E86F.5000606@v.loewis.de> I have now completed http://www.python.org/dev/peps/pep-0384/ Benjamin has volunteered to rule on this PEP. Please comment with any changes you want to see, or speak in favor of or against this PEP. Regards, Martin From fuzzyman at voidspace.org.uk Mon Nov 29 00:44:50 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Sun, 28 Nov 2010 23:44:50 +0000 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2E6C3.3010009@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2E6C3.3010009@v.loewis.de> Message-ID: <4CF2E972.2040209@voidspace.org.uk> On 28/11/2010 23:33, "Martin v.
L?wis" wrote: >>>>>>>> float('????.??') >>>>> 1234.56 >> Even if this is somehow an accident or something that someone snuck in, >> I think it a good idea that *users* be able to input amounts with their >> native digits. That is different from requiring *programmers* to write >> literals with euro-ascii-digits > So one question is what kind of data float() is aimed at. I claim that > it is about "programmer" data, not "user" data. If it supported "user" > data, it probably would have to support "1,000" to denote 1e3 in the > U.S., and denote 1e0 in Germany. Our users are generally confused > on whether they should use th full stop or the comma as the decimal > separator. > FWIW the C# equivalent is locale aware *unless* you pass in a specific culture. (System.Double.Parse): http://msdn.microsoft.com/en-us/library/fd84bdyt.aspx If you're not aware that your code may be run on non-US computers this is a trap for the unwary. If you *are* aware then it is very useful. An alternative overload allows you to specify the culture used to do the conversion: http://msdn.microsoft.com/en-us/library/t9ebt447.aspx Michael > As not even the locale-dependent issues are considered in float(), > it is clear to me that entering local numbers cannot possibly be > the objective of the function. > > Instead, following a wide-spread Python convention, it is meant to be > the reverse of repr(). > > Can you name a single person who actually wants to write '????.??' > as a number? I'm fairly skeptical that users of arabic-indic digits. > Instead, > > http://en.wikipedia.org/wiki/Decimal_separator > > suggests that they would rather U+066B, i.e. '???????', which isn't > supported by Python. 
> > Regards, > Martin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From alexander.belopolsky at gmail.com Mon Nov 29 00:56:00 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 28 Nov 2010 18:56:00 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2DFD1.10901@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <4CF2DFD1.10901@v.loewis.de> Message-ID: On Sun, Nov 28, 2010 at 6:03 PM, "Martin v. L?wis" wrote: .. > No no no. Addition of Unicode identifiers has a well-designed, > deliberate specification, with a PEP and all. The support for > non-ASCII digits in float appears to be ad-hoc, and not founded > on actual needs of actual users. > I wonder how carefully right-to-left scripts were considered when PEP 3131 was discussed. Try the following on the python prompt: >>> ?= int('???') >>> ? 123 In my OSX Terminal window, entering ? flips the >>> prompt and the session looks like this: ('???')int = ? 
<<< From martin at v.loewis.de Mon Nov 29 00:59:12 2010 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Mon, 29 Nov 2010 00:59:12 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2E972.2040209@voidspace.org.uk> References: <20101128214311.092abd35@pitrou.net> <4CF2E6C3.3010009@v.loewis.de> <4CF2E972.2040209@voidspace.org.uk> Message-ID: <4CF2ECD0.4000003@v.loewis.de> > FWIW the C# equivalent is locale aware *unless* you pass in a specific > culture. > (System.Double.Parse): That's not quite the equivalent of float(), I would say: this one apparently is locale-aware, so it is more the equivalent of locale.atof. The next question then is if it supports indo-arabic digits in any locale (or more specifically in an arabic locale). Regards, Martin From solipsis at pitrou.net Mon Nov 29 01:01:12 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 29 Nov 2010 01:01:12 +0100 Subject: [Python-Dev] Python and the Unicode Character Database References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> Message-ID: <20101129010112.343eaf64@pitrou.net> On Sun, 28 Nov 2010 17:23:01 -0600 Benjamin Peterson wrote: > 2010/11/28 M.-A. Lemburg : > > > > > > "Martin v. L?wis" wrote: > >>>>>>> float('????.??') > >>>> 1234.56 > >> > >> I think it's a bug that this works. The definition of the float builtin says > >> > >> Convert a string or a number to floating point. If the argument is a > >> string, it must contain a possibly signed decimal or floating point > >> number, possibly embedded in whitespace. The argument may also be > >> '[+|-]nan' or '[+|-]inf'. > >> > >> Now, one may wonder what precisely a "possibly signed floating point > >> number" is, but most likely, this refers to > >> > >> floatnumber ? ::= ?pointfloat | exponentfloat > >> pointfloat ? ?::= ?[intpart] fraction | intpart "." > >> exponentfloat ::= ?(intpart | pointfloat) exponent > >> intpart ? ? ? 
::= ?digit+ > >> fraction ? ? ?::= ?"." digit+ > >> exponent ? ? ?::= ?("e" | "E") ["+" | "-"] digit+ > >> digit ? ? ? ? ?::= ?"0"..."9" > > > > I don't see why the language spec should limit the wealth of number > > formats supported by float(). > > > > It is not uncommon for Asians and other non-Latin script users to > > use their own native script symbols for numbers. Just because these > > digits may look strange to someone doesn't mean that they are > > meaningless or should be discarded. > > That's different. Python doesn't assign any semantic meaning to the > characters in identifiers. The non-latin support for numerals, though, > could change the meaning of a program dramatically and needs to be > well-specified. Whether int() should do this is debatable. Perhaps int(), float(), Decimal() and friends could take an optional parameter indicating whether non-ascii digits are considered. It would then satisfy all parties. Antoine. From martin at v.loewis.de Mon Nov 29 01:02:18 2010 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Mon, 29 Nov 2010 01:02:18 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <4CF2DFD1.10901@v.loewis.de> Message-ID: <4CF2ED8A.2010503@v.loewis.de> Am 29.11.2010 00:56, schrieb Alexander Belopolsky: > On Sun, Nov 28, 2010 at 6:03 PM, "Martin v. L?wis" wrote: > .. >> No no no. Addition of Unicode identifiers has a well-designed, >> deliberate specification, with a PEP and all. The support for >> non-ASCII digits in float appears to be ad-hoc, and not founded >> on actual needs of actual users. >> > > I wonder how carefully right-to-left scripts were considered when PEP > 3131 was discussed. IIRC, some Hebrew users have spoken in favor of the PEP, despite the obvious difficulties it would create. 
I may misremember, but I think someone pointed out that they had these difficulties all the time, and that it wasn't really a burden. Unicode specifies that one should always use "logical order" in memory, and that's what the PEP does. Rendering is then a tool issue. Regards, Martin From alexander.belopolsky at gmail.com Mon Nov 29 01:04:53 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 28 Nov 2010 19:04:53 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2ECD0.4000003@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2E6C3.3010009@v.loewis.de> <4CF2E972.2040209@voidspace.org.uk> <4CF2ECD0.4000003@v.loewis.de> Message-ID: On Sun, Nov 28, 2010 at 6:59 PM, "Martin v. Löwis" wrote: .. > The next question then is if it supports indo-arabic digits in any > locale (or more specifically in an arabic locale). And once you answered that question, does it support Devanagari or Bengali digits? And if so, an arbitrary mix of those and indo-arabic digits? From alexander.belopolsky at gmail.com Mon Nov 29 01:25:37 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 28 Nov 2010 19:25:37 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <20101129010112.343eaf64@pitrou.net> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <20101129010112.343eaf64@pitrou.net> Message-ID: On Sun, Nov 28, 2010 at 7:01 PM, Antoine Pitrou wrote: .. >> That's different. Python doesn't assign any semantic meaning to the >> characters in identifiers. The non-latin support for numerals, though, >> could change the meaning of a program dramatically and needs to be >> well-specified. Whether int() should do this is debatable. > > Perhaps int(), float(), Decimal() and friends could take an optional > parameter indicating whether non-ascii digits are considered. It would > then satisfy all parties.
What parties? I don't think anyone has claimed to actually have used non-ASCII digits with float(). Of course it is fun that Python can process Bengali numerals, but so would be allowing Roman numerals. There is a reason why after careful consideration, PEP 313 was ultimately rejected. BTW, it is common in Russia to specify months using roman numerals. Maybe we should consider allowing datetime.date() to accept '1.IV.2011'. From fuzzyman at voidspace.org.uk Mon Nov 29 01:25:40 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 29 Nov 2010 00:25:40 +0000 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2ECD0.4000003@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2E6C3.3010009@v.loewis.de> <4CF2E972.2040209@voidspace.org.uk> <4CF2ECD0.4000003@v.loewis.de> Message-ID: <4CF2F304.60905@voidspace.org.uk> On 28/11/2010 23:59, "Martin v. Löwis" wrote: >> FWIW the C# equivalent is locale aware *unless* you pass in a specific >> culture. >> (System.Double.Parse): > That's not quite the equivalent of float(), I would say: this one > apparently is locale-aware, so it is more the equivalent of locale.atof. Right. It is *the* standard way of getting a float from a string though, whereas in Python we have two depending on whether or not you want to be locale aware. The standard way in C# is locale aware. To be non-locale aware you pass in a specific culture or number format. > The next question then is if it supports indo-arabic digits in any > locale (or more specifically in an arabic locale). I don't think so actually. The float parse formatting rules are defined like this: [ws][$][sign][integral-digits[,]]integral-digits[.[fractional-digits]][E[sign]exponential-digits][ws] (From http://msdn.microsoft.com/en-us/library/7yd1h1be.aspx ) integral-digits, fractional-digits and exponential-digits are all defined as "A series of digits ranging from 0 to 9". Arguably this is not conclusive.
In fact the NumberFormatInfo class seems to hint that it may be otherwise: http://msdn.microsoft.com/en-us/library/system.globalization.numberformatinfo.aspx See DigitSubstitution on that page. I would have to try it to be sure and I don't have a Windows VM in convenient reach right now. All the best, Michael > Regards, > Martin -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (“BOGUS AGREEMENTS”) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From fuzzyman at voidspace.org.uk Mon Nov 29 01:28:59 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 29 Nov 2010 00:28:59 +0000 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2E6C3.3010009@v.loewis.de> <4CF2E972.2040209@voidspace.org.uk> <4CF2ECD0.4000003@v.loewis.de> Message-ID: <4CF2F3CB.6090808@voidspace.org.uk> On 29/11/2010 00:04, Alexander Belopolsky wrote: > On Sun, Nov 28, 2010 at 6:59 PM, "Martin v. Löwis" wrote: > .. >> The next question then is if it supports indo-arabic digits in any >> locale (or more specifically in an arabic locale). > And once you answered that question, does it support Devanagari or > Bengali digits? And if so, an arbitrary mix of those and indo-arabic > digits? Haha. Go and try it yourself. :-) Michael -- http://www.voidspace.org.uk/
From solipsis at pitrou.net Mon Nov 29 01:29:40 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 29 Nov 2010 01:29:40 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <20101129010112.343eaf64@pitrou.net> Message-ID: <1290990580.8242.2.camel@localhost.localdomain> > > Perhaps int(), float(), Decimal() and friends could take an optional > > parameter indicating whether non-ascii digits are considered. It would > > then satisfy all parties. > > What parties? I don't think anyone has claimed to actually have used > non-ASCII digits with float(). Have you done a poll of all Python 3 users? > Of course it is fun that Python can > process Bengali numerals, but so would be allowing Roman numerals. > There is a reason why after careful consideration, PEP 313 was > ultimately rejected. That's mostly irrelevant. This feature exists and someone, somewhere, may be using it. We normally don't remove stuff without deprecation. Antoine.
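The deprecation path mentioned above could look roughly like the following. This is only a sketch under stated assumptions: float_with_deprecation is a hypothetical wrapper invented for illustration, not an existing or proposed Python API, and the warning text is made up.

```python
import warnings


def float_with_deprecation(text):
    """Behave like float(), but warn when non-ASCII decimal digits are used.

    Hypothetical wrapper for illustration only; not part of Python.
    """
    # str.isdecimal() is true for ASCII digits and for other Unicode
    # decimal digits, so the ord() check singles out the non-ASCII ones.
    if any(ch.isdecimal() and ord(ch) > 127 for ch in text):
        warnings.warn(
            "non-ASCII digits in float() input are deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
    # The conversion itself is unchanged, so existing callers keep working.
    return float(text)
```

Existing callers would keep getting the same values during the deprecation period and would only see the warning if they enable DeprecationWarning.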
From ncoghlan at gmail.com Mon Nov 29 01:48:51 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 29 Nov 2010 10:48:51 +1000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CF28310.7070304@voidspace.org.uk> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> <4CF1706E.5030503@g.nevcal.com> <1D372F35-B455-4982-997B-2C54A7D56741@gmail.com> <4CF28310.7070304@voidspace.org.uk> Message-ID: On Mon, Nov 29, 2010 at 2:28 AM, Michael Foord wrote: > For wrapping mutable types I'm tempted to say YAGNI. For the standard > library wrapping integers meets almost all our use-cases except for one > float. (At work we have a decimal constant as it happens.) Perhaps we could > require immutable types for groups but allow arbitrary values for individual > named values? Whereas my opinion is that "immutable vs mutable" is such a blurry distinction that we shouldn't try to make it at the lowest level. Would it be possible to name frozenset instances? Tuples? How about objects that are conceptually immutable, but don't close all the loopholes allowing you to mutate them? (e.g. Decimal, Fraction) Better to design a named value API that doesn't care about mutability, and then leave questions of reverse mappings from values back to names to the grouping API level. At that level, it would be trivial (and natural) to limit names to referencing Hashable values so that a reverse lookup table would be easy to implement. 
For standard library purposes, we could even reasonably provide an int-only grouping API, since the main use case is almost certainly to be in managing translation of OS-level integer constants to named values. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ben+python at benfinney.id.au Mon Nov 29 01:55:33 2010 From: ben+python at benfinney.id.au (Ben Finney) Date: Mon, 29 Nov 2010 11:55:33 +1100 Subject: [Python-Dev] Python and the Unicode Character Database References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <20101129010112.343eaf64@pitrou.net> Message-ID: <8739ql54oq.fsf@benfinney.id.au> Alexander Belopolsky writes: > On Sun, Nov 28, 2010 at 7:01 PM, Antoine Pitrou wrote: > > Perhaps int(), float(), Decimal() and friends could take an optional > > parameter indicating whether non-ascii digits are considered. It > > would then satisfy all parties. > > What parties? I don't think anyone has claimed to actually have used > non-ASCII digits with float(). Rather, it has been pointed out that there is an unknown amount of existing code which does that. You're not going to know how much or how little from this forum. > Of course it is fun that Python can process Bengali numerals, but so > would be allowing Roman numerals. There is a reason why after careful > consideration, PEP 313 was ultimately rejected. Rejecting a proposed *new* capability is a different matter from disabling an *existing* capability which works in existing Python releases. -- “Following fashion and the status quo is easy. Thinking about your users' lives and creating something practical is much harder.” (Ryan Singer, 2008-07-09) Ben Finney
From fuzzyman at voidspace.org.uk Mon Nov 29 01:57:27 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 29 Nov 2010 00:57:27 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> <4CF1706E.5030503@g.nevcal.com> <1D372F35-B455-4982-997B-2C54A7D56741@gmail.com> <4CF28310.7070304@voidspace.org.uk> Message-ID: <4CF2FA77.3000604@voidspace.org.uk> On 29/11/2010 00:48, Nick Coghlan wrote: > On Mon, Nov 29, 2010 at 2:28 AM, Michael Foord > wrote: >> For wrapping mutable types I'm tempted to say YAGNI. For the standard >> library wrapping integers meets almost all our use-cases except for one >> float. (At work we have a decimal constant as it happens.) Perhaps we could >> require immutable types for groups but allow arbitrary values for individual >> named values? > Whereas my opinion is that "immutable vs mutable" is such a blurry > distinction that we shouldn't try to make it at the lowest level. > Would it be possible to name frozenset instances? Tuples? How about > objects that are conceptually immutable, but don't close all the > loopholes allowing you to mutate them? (e.g. Decimal, Fraction) > > Better to design a named value API that doesn't care about mutability, > and then leave questions of reverse mappings from values back to names > to the grouping API level. At that level, it would be trivial (and > natural) to limit names to referencing Hashable values so that a > reverse lookup table would be easy to implement.
For standard library > purposes, we could even reasonably provide an int-only grouping API, > since the main use case is almost certainly to be in managing > translation of OS-level integer constants to named values. Sounds reasonable to me. Michael > Cheers, > Nick. > -- http://www.voidspace.org.uk/ From tjreedy at udel.edu Mon Nov 29 02:00:56 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 28 Nov 2010 20:00:56 -0500 Subject: [Python-Dev] PEP 384 final review In-Reply-To: <4CF2E86F.5000606@v.loewis.de> References: <4CF2E86F.5000606@v.loewis.de> Message-ID: On 11/28/2010 6:40 PM, "Martin v. Löwis" wrote: > I have now completed > > http://www.python.org/dev/peps/pep-0384/ The current text contains several error messages like: "System Message: WARNING/2 (pep-0384.txt, line 194) Bullet list ends without a blank line; unexpected unindent." Terry Jan Reedy From steve at pearwood.info Mon Nov 29 01:14:31 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 29 Nov 2010 11:14:31 +1100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2D4E9.3060607@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> Message-ID: <4CF2F067.5020705@pearwood.info> Martin v. Löwis wrote: >>>>>> float('١٢٣٤.٥٦') >>> 1234.56 > > I think it's a bug that this works.
The definition of the float builtin says [...] I think that's a documentation bug rather than a coding bug. If Python wishes to limit the digits allowed in numeric *literals* to ASCII 0...9, that's one thing, but I think that the digits allowed in numeric *strings* should allow the full range of digits supported by the Unicode standard. The former ensures that literals in code are always readable; the latter allows users to enter numbers in their own number system. How could that be a bad thing? -- Steven From rob.cliffe at btinternet.com Sun Nov 28 02:07:08 2010 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Sun, 28 Nov 2010 01:07:08 +0000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CF2C86C.9030505@canterbury.ac.nz> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CEDDC2D.204@canterbury.ac.nz> <4CEE5C1C.9000905@btinternet.com> <4CF2C86C.9030505@canterbury.ac.nz> Message-ID: <4CF1AB3C.3060408@btinternet.com> On 28/11/2010 21:23, Greg Ewing wrote: > Rob Cliffe wrote: > >> But couldn't they be presented to the Python programmer as a single >> type, with the implementation details hidden "under the hood"? > > Not in CPython, because tuple items are kept in the same block > of memory as the object header. Because CPython can't move > objects, this means that the size of the tuple must be known > when the object is created. > But when a frozen list a.k.a. tuple would be created - either directly, or by setting a list's mutable flag to False which would really turn it into a tuple - the size *would* be known. And since the object would now be immutable, there would be no requirement for its size to change.
(My idea doesn't require additional functionality, just a different API.) Rob Cliffe From alexander.belopolsky at gmail.com Mon Nov 29 02:24:24 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 28 Nov 2010 20:24:24 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <8739ql54oq.fsf@benfinney.id.au> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <20101129010112.343eaf64@pitrou.net> <8739ql54oq.fsf@benfinney.id.au> Message-ID: On Sun, Nov 28, 2010 at 7:55 PM, Ben Finney wrote: .. >> Of course it is fun that Python can process Bengali numerals, but so >> would be allowing Roman numerals. There is a reason why after careful >> consideration, PEP 313 was ultimately rejected. > > Rejecting a proposed *new* capability is a different matter from > disabling an *existing* capability which works in existing Python > releases. Was this capability ever documented? It does not feel like a deliberate feature. If it was, '\N{ARABIC DECIMAL SEPARATOR}' would be accepted in arabic-indic notation. It feels more like a CPython implementation detail similar to say: >>> int('10') is 10 True >>> int('10000') is 10000 False Note that the underlying PyUnicode_EncodeDecimal() function is described in the unicodeobject.h header file as follows: /* --- Decimal Encoder ---------------------------------------------------- */ /* Takes a Unicode string holding a decimal value and writes it into an output buffer using standard ASCII digit codes. .. The encoder converts whitespace to ' ', decimal characters to their corresponding ASCII digit and all other Latin-1 characters except \0 as-is. Characters outside this range (Unicode ordinals 1-256) are treated as errors. This includes embedded NULL bytes. */ So the support for non-ASCII digits is accidental and should be treated as a bug.
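The asymmetry described above (Unicode decimal digits are accepted, the Arabic decimal separator is not) can be checked directly. A minimal sketch, assuming a CPython 3 interpreter in which float() maps Unicode decimal digits to ASCII before parsing; the escaped literals and variable names are illustrative and do not come from the thread.

```python
import unicodedata

# ARABIC-INDIC DIGITS ONE..FOUR, an ASCII full stop, then FIVE and SIX.
arabic_indic = "\u0661\u0662\u0663\u0664.\u0665\u0666"

# Each digit carries a decimal value in the Unicode Character Database.
digit_values = [unicodedata.decimal(ch) for ch in arabic_indic if ch != "."]

# float() accepts the digits as long as the separator is the ASCII '.'.
parsed = float(arabic_indic)

# The same digits with U+066B ARABIC DECIMAL SEPARATOR are rejected.
with_arabic_separator = arabic_indic.replace(".", "\N{ARABIC DECIMAL SEPARATOR}")
try:
    float(with_arabic_separator)
    separator_accepted = True
except ValueError:
    separator_accepted = False

print(digit_values, parsed, separator_accepted)
```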
From ben+python at benfinney.id.au Mon Nov 29 02:25:56 2010 From: ben+python at benfinney.id.au (Ben Finney) Date: Mon, 29 Nov 2010 12:25:56 +1100 Subject: [Python-Dev] Python and the Unicode Character Database References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> Message-ID: <87y68c53a3.fsf@benfinney.id.au> Steven D'Aprano writes: > If Python wishes to limit the digits allowed in numeric *literals* to > ASCII 0...9, that's one thing, but I think that the digits allowed in > numeric *strings* should allow the full range of digits supported by > the Unicode standard. I assume you specifically mean that the numeric class constructors, like “int” and “float”, should parse their input string such that any character Unicode defines as a numeric digit is mapped to the corresponding digit. That sounds attractive, but it raises questions about mixed notations, mixing digits from different writing systems, and probably other questions I haven't thought of. It's not something to make a simple yes-or-no-decision on now, IMO. This sounds best suited to a PEP, which someone who cares enough can champion in “python-ideas”. -- “The manager has personally passed all the water served here.” (hotel, Acapulco) Ben Finney From steve at pearwood.info Mon Nov 29 00:43:59 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 29 Nov 2010 10:43:59 +1100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: Message-ID: <4CF2E93F.70208@pearwood.info> Alexander Belopolsky wrote: > Two recently reported issues brought into light the fact that Python > language definition is closely tied to character properties maintained > by the Unicode Consortium. [1,2] For example, when Python switches to > Unicode 6.0.0 (planned for the upcoming 3.2 release), we will gain two > additional characters that Python can use in identifiers. [3] [...] Why do you consider this a problem?
It would be a problem if previously valid identifiers *stopped* being valid, but not the other way around. > Of course, the likelihood is low that this change will affect any > user, but the change in str.isspace() reported in [1] is likely to > cause some trouble: Looking at the thread here: http://bugs.python.org/issue10567 I interpret it as indicating that Python's isspace() has been buggy for many years, and is only now being fixed. It's always unfortunate when people rely on bugs, but I'm not sure we should be promising to support bug-for-bug compatibility from one version to the next :) > While we have little choice but to follow UCD in defining > str.isidentifier(), I think Python can promise users more stability in > what it treats as space or as a digit in its builtins. For example, > I don't think that supporting > >>>> float('١٢٣٤.٥٦') > 1234.56 > > is more important than to assure users that once their program > accepted some text as a number, they can assume that the text is > ASCII. Seems like a pretty foolish assumption, if you ask me, pretty much akin to assuming that if string.isalpha() returns true that string is ASCII. Support for non-Arabic numerals in number strings goes back to at least Python 2.4: [steve at sylar ~]$ python2.4 Python 2.4.6 (#1, Mar 30 2009, 10:08:01) [GCC 4.1.2 20070925 (Red Hat 4.1.2-27)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> float(u'١٢٣٤.٥٦') 1234.5599999999999 The fact that this is (apparently) only being raised now means that it isn't actually a problem in real life. I'd even say that it's a feature, and that if Python didn't support non-Arabic numerals, it should.
-- Steven From alexander.belopolsky at gmail.com Mon Nov 29 03:32:15 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 28 Nov 2010 21:32:15 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2E93F.70208@pearwood.info> References: <4CF2E93F.70208@pearwood.info> Message-ID: On Sun, Nov 28, 2010 at 6:43 PM, Steven D'Aprano wrote: .. >> is more important than to assure users that once their program >> accepted some text as a number, they can assume that the text is >> ASCII. > > Seems like a pretty foolish assumption, if you ask me, pretty much akin to > assuming that if string.isalpha() returns true that string is ASCII. > It is not to 99.9% of Python users whose code is written for 2.x. Their strings are byte strings and string.isdigit() does imply ASCII even if string.isalpha() does not in many locales. .. > The fact that this is (apparently) only being raised now means that it isn't > actually a problem in real life. I'd even say that it's a feature, and that > if Python didn't support non-Arabic numerals, it should. > I raised this problem because I found a bug that is related to this feature. The bug is also a regression from 2.x. In 2.7: >>> float(u'1234\xa1') .. ValueError: invalid literal for float(): 1234? The last character is lost, but the error message is still meaningful. In 3.x, however: >>> float('1234\xa1') .. ValueError See http://bugs.python.org/issue10557 While investigating this issue I found that by the time the string gets to the number parser (_Py_dg_strtod), all non-ascii characters are dropped by PyUnicode_EncodeDecimal() so it cannot produce meaningful diagnostic. Of course, PyUnicode_EncodeDecimal(), can be fixed by making it pass non-ascii chars through as UTF-8 bytes, but I was wondering if preserving the ability to parse exotic numerals was worth the effort. 
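For code that wants the ASCII-only guarantee together with a meaningful diagnostic, one possible approach is to check the input before handing it to float(). The sketch below assumes Python 3; ascii_float is a hypothetical helper written for illustration, not an existing function, and it is not the fix discussed in issue10557.

```python
def ascii_float(text):
    """Parse text as a float, accepting ASCII characters only.

    Hypothetical helper for illustration; it is not part of Python.
    """
    for index, ch in enumerate(text):
        if ord(ch) > 127:
            # Name the offending character instead of silently dropping it.
            raise ValueError(
                "non-ASCII character %r at position %d in %r" % (ch, index, text)
            )
    # Only ASCII remains, so float() applies its usual rules from here.
    return float(text)
```

The check runs before any digit translation happens, so the error message can still show the original character.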
From rrr at ronadam.com Mon Nov 29 04:03:39 2010 From: rrr at ronadam.com (Ron Adam) Date: Sun, 28 Nov 2010 21:03:39 -0600 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> Message-ID: <4CF3180B.1060306@ronadam.com> On 11/27/2010 04:51 AM, Nick Coghlan wrote: > x = named_value("FOO", 1) > y = named_value("BAR", "Hello World!") > z = named_value("BAZ", dict(a=1, b=2, c=3)) > > print(x, y, z, sep="\n") > print("\n".join(map(repr, (x, y, z)))) > print("\n".join(map(str, map(type, (x, y, z))))) > > set_named_values(globals(), foo=x._raw(), bar=y._raw(), baz=z._raw()) > print("\n".join(map(repr, (foo, bar, baz)))) > print(type(x) is type(foo), type(y) is type(bar), type(z) is type(baz)) > > ========================================================================== > > # Session output for the last 6 lines >>>> >>> print(x, y, z, sep="\n") > 1 > Hello World! > {'a': 1, 'c': 3, 'b': 2} > >>>> >>> print("\n".join(map(repr, (x, y, z)))) > FOO=1 > BAR='Hello World!' > BAZ={'a': 1, 'c': 3, 'b': 2} This reminds me of python annotations. Which seem like an already forgotten new feature. Maybe they can help with this? It does associate additional info to names and creates a nice dictionary to reference. >>> def name_values( FOO: 1, BAR: "Hello World!", BAZ: dict(a=1, b=2, c=3) ): ... return FOO, BAR, BAZ ... 
>>> name_values(1,2,3) (1, 2, 3) >>> name_values.__annotations__ {'BAR': 'Hello World!', 'FOO': 1, 'BAZ': {'a': 1, 'c': 3, 'b': 2}} Cheers, Ron From stephen at xemacs.org Mon Nov 29 04:39:32 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 29 Nov 2010 12:39:32 +0900 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2DAD7.2000408@egenix.com> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> Message-ID: <87pqto6bnv.fsf@uwakimon.sk.tsukuba.ac.jp> M.-A. Lemburg writes: > It is not uncommon for Asians and other non-Latin script users to > use their own native script symbols for numbers. Japanese don't, in computational or scientific work where float() would be used. Japanese numerals are used for dates and for certain felicitous ages (and even there so-called "Arabic" numerals are perfectly acceptable). Otherwise, it's all ASCII (although it might be "full-width" compatibility variants). > Please also remember that Python3 now allows Unicode names for > identifiers for much the same reasons. I don't think it's the same reason, not for Japanese, anyway. I agree that Python should make it easy for the programmer to get numerical values of native numeric strings, but it's not at all clear to me that there is any point to having float() recognize them by default. From ncoghlan at gmail.com Mon Nov 29 04:58:05 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 29 Nov 2010 13:58:05 +1000 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <87pqto6bnv.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <87pqto6bnv.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Mon, Nov 29, 2010 at 1:39 PM, Stephen J.
Turnbull wrote: > I agree that Python should make it easy for the programmer to get > numerical values of native numeric strings, but it's not at all clear > to me that there is any point to having float() recognize them by > default. Indeed, as someone else suggested earlier in the thread, supporting non-ASCII digits sounds more like a job for the locale module than for the builtin types. Deprecating non-ASCII support in the latter, while ensuring it is properly supported in the former sounds like a better way forward than maintaining the status quo (starting in 3.3 though, with the first beta just around the corner, we don't want to be monkeying with this in 3.2) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From martin at v.loewis.de Mon Nov 29 08:18:59 2010 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 29 Nov 2010 08:18:59 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <20101129010112.343eaf64@pitrou.net> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <20101129010112.343eaf64@pitrou.net> Message-ID: <4CF353E3.4020706@v.loewis.de> > Perhaps int(), float(), Decimal() and friends could take an optional > parameter indicating whether non-ascii digits are considered. It would > then satisfy all parties. Not really. I still would want to see what the actual requirement is: i.e. do any users actually have the desire to have these digits accepted, yet the alternative decimal points rejected?
Regards, Martin From martin at v.loewis.de Mon Nov 29 08:22:46 2010 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 29 Nov 2010 08:22:46 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF2F067.5020705@pearwood.info> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> Message-ID: <4CF354C6.9020302@v.loewis.de> > The former ensures that literals in code are always readable; the latter > allows users to enter numbers in their own number system. How could that > be a bad thing? It's YAGNI, feature bloat. It gives the illusion of supporting something that actually isn't supported very well (namely, parsing local number strings). I claim that there is no meaningful application of this feature. Regards, Martin From martin at v.loewis.de Mon Nov 29 08:25:19 2010 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 29 Nov 2010 08:25:19 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <1290990580.8242.2.camel@localhost.localdomain> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <20101129010112.343eaf64@pitrou.net> <1290990580.8242.2.camel@localhost.localdomain> Message-ID: <4CF3555F.9040106@v.loewis.de> > That's mostly irrelevant. This feature exists and someone, somewhere, > may be using it. We normally don't remove stuff without deprecation. Sure: it should be deprecated before being removed. Regards, Martin From amauryfa at gmail.com Mon Nov 29 08:55:13 2010 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Mon, 29 Nov 2010 08:55:13 +0100 Subject: [Python-Dev] PEP 384 final review In-Reply-To: <4CF2E86F.5000606@v.loewis.de> References: <4CF2E86F.5000606@v.loewis.de> Message-ID: 2010/11/29 "Martin v. Löwis" > I have now completed > > http://www.python.org/dev/peps/pep-0384/ was structseq.h considered?
IMO it could be made PEP384-compliant with two additions that would
replace two non-compliant functions:

- A new function to create types, since PyStructSequence_InitType
  is supposed to work on an uninitialized static variable:
      PyTypeObject *PyStructSequence_NewType(PyStructSequence_Desc *desc);
- PyStructSequence_SetItem(), similar to the
  macro PyStructSequence_SET_ITEM; the PyStructSequence structure should
  be hidden.

--
Amaury Forgeot d'Arc

From martin at v.loewis.de Mon Nov 29 09:09:14 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Mon, 29 Nov 2010 09:09:14 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To:
References: <4CF2E86F.5000606@v.loewis.de>
Message-ID: <4CF35FAA.50600@v.loewis.de>

> I have now completed
>
> http://www.python.org/dev/peps/pep-0384/
>
>
> was structseq.h considered?

No, it wasn't - unfortunately, it still doesn't get included when
including Python.h. I'll add it.

> IMO it could be made PEP384-compliant with two additions that would
> replace two non-compliant functions:
>
> - A new function to create types, since PyStructSequence_InitType
> is supposed to work on an uninitialized static variable:
> PyTypeObject *PyStructSequence_NewType(PyStructSequence_Desc *desc);
> - PyStructSequence_SetItem(), similar to the
> macro PyStructSequence_SET_ITEM; the PyStructSequence structure should
> be hidden.

Sounds good.

Regards,
Martin

From mal at egenix.com Mon Nov 29 09:35:05 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 29 Nov 2010 09:35:05 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To:
References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com>
Message-ID: <4CF365B9.5040303@egenix.com>

Alexander Belopolsky wrote:
> On Sun, Nov 28, 2010 at 5:42 PM, M.-A. Lemburg wrote:
> ..
>> I don't see why the language spec should limit the wealth of number >> formats supported by float(). >> > > The Language Spec (whatever it is) should not, but hopefully the > Library Reference should. If you follow > http://docs.python.org/dev/py3k/library/functions.html#float link and > the references therein, you'll end up with ... the language spec again :-) > digit ::= "0"..."9" > > http://docs.python.org/dev/py3k/reference/lexical_analysis.html#grammar-token-digit That's obviously a bug in the documentation, since the Python 2.7 docs don't mention any such relationship to the language spec: http://docs.python.org/library/functions.html#float -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 29 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From g.brandl at gmx.net Mon Nov 29 09:36:56 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 29 Nov 2010 09:36:56 +0100 Subject: [Python-Dev] PEP 384 final review In-Reply-To: <4CF35FAA.50600@v.loewis.de> References: <4CF2E86F.5000606@v.loewis.de> <4CF35FAA.50600@v.loewis.de> Message-ID: Am 29.11.2010 09:09, schrieb "Martin v. L?wis": >> I have now completed >> >> http://www.python.org/dev/peps/pep-0384/ >> >> >> was structseq.h considered? > > No, it wasn't - unfortunately, it still doesn't get included when > including Python.h. I'll add it. Would 3.2 be a good time to finally include it? 
All of its macros and declarations are named PyStructSequence*, so there shouldn't be a name clash concern. Georg From g.brandl at gmx.net Mon Nov 29 09:52:19 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 29 Nov 2010 09:52:19 +0100 Subject: [Python-Dev] [Preview] Comments and change proposals on documentation In-Reply-To: <4CF19B3C.2000308@pearwood.info> References: <4CF18220.7000202@pearwood.info> <4CF19B3C.2000308@pearwood.info> Message-ID: Am 28.11.2010 00:58, schrieb Steven D'Aprano: > Georg Brandl wrote: >> Am 27.11.2010 23:11, schrieb Steven D'Aprano: > >>> I wasn't able to find a comment bubble that contained anything, so I >>> don't know what sort of information you expect them to contain -- every >>> one I tried said "0 comments". >> >> Maybe you should have tried the page I recommended as a demo, and where Nick >> made his comments? :) > > Aha! I never would have guessed that the bubbles are clickable -- I > thought you just moused-over them and they showed static comments put > there by the developers, part of the documentation itself. I didn't > realise that it was for users to add spam^W comments to the page. With > that perspective, I need to rethink. > > Yes, I failed to fully read the instructions you sent, or understand > them. That's what users do -- they don't read your instructions, and > they misunderstand them. If your UI isn't easily discoverable, users > will not be able to use it, and will be frustrated and annoyed. The user > is always right, even when they're doing it wrong *wink* That's right, of course. I really come to the conclusion that having a text link that "looks like" a link, i.e. is underlined, will have a better UI experience (since we cannot put notes "click bubble to comment" everywhere). >>> But it seems to me that comments are superfluous, if not actively harmful: >> >> (I've not read anything about harmful below. Was that just FUD?) 
> > Lowering accessibility to parts of the documentation is what I was > talking about when I said "actively harmful". But now that I have better > understanding of what the comment system is actually for, I have to rethink. Thanks! Georg From doko at ubuntu.com Mon Nov 29 11:24:22 2010 From: doko at ubuntu.com (Matthias Klose) Date: Mon, 29 Nov 2010 11:24:22 +0100 Subject: [Python-Dev] PEP 384 final review In-Reply-To: <4CF2E86F.5000606@v.loewis.de> References: <4CF2E86F.5000606@v.loewis.de> Message-ID: <4CF37F56.9030808@ubuntu.com> On 29.11.2010 00:40, "Martin v. L?wis" wrote: > I have now completed > > http://www.python.org/dev/peps/pep-0384/ > > Benjamin has volunteered to rule on this PEP. > > Please comment with any changes you want to see, or speak in > favor or against this PEP. I looked at a diff with r84330 from the py3k branch. Extensions built with Py_LIMITED_API have the python version encoded in it's name. Which abi name should be used for these extensions? - The m and u modifiers in the abi name are complimentary (?) - debug builds and Py_LIMITED_API are incompatible (?) and therefore the current name should be used? - For posix systems the implementation is currently part of the abi name, are Py_LIMITED_API extensions supposed to be compatible with e.g. PyPy? Should the LIMITED_API abi name include the implementation string? - Should the distutils support for LIMITED_API be part of the pep, or be implemented later? In favour of the pep. Matthias From mal at egenix.com Mon Nov 29 12:02:57 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 29 Nov 2010 12:02:57 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <87pqto6bnv.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CF38861.5090309@egenix.com> Nick Coghlan wrote: > On Mon, Nov 29, 2010 at 1:39 PM, Stephen J. 
Turnbull wrote: >> I agree that Python should make it easy for the programmer to get >> numerical values of native numeric strings, but it's not at all clear >> to me that there is any point to having float() recognize them by >> default. > > Indeed, as someone else suggested earlier in the thread, supporting > non-ASCII digits sounds more like a job for the locale module than for > the builtin types. > > Deprecating non-ASCII support in the latter, while ensuring it is > properly supported in the former sounds like a better way forward than > maintaining the status quo (starting in 3.3 though, with the first > beta just around the corner, we don't want to be monkeying with this > in 3.2) Since when do we only support certain Unicode features in specific locales ? If we would go down that road, we would also have to disable other Unicode features based on locale, e.g. whether to apply non-ASCII case mappings, what to consider whitespace, etc. We don't do that for a good reason: Unicode is supposed to be universal and not limited to a single locale. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 29 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From sylvain.thenault at logilab.fr Mon Nov 29 12:53:11 2010 From: sylvain.thenault at logilab.fr (Sylvain =?utf-8?B?VGjDqW5hdWx0?=) Date: Mon, 29 Nov 2010 12:53:11 +0100 Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError In-Reply-To: <4CEE9B72.1070002@ronadam.com> References: <201011251530.23947.emile.anclin@logilab> <4CEE9B72.1070002@ronadam.com> Message-ID: <20101129115311.GD18888@lupus.logilab.fr> On 25 novembre 11:22, Ron Adam wrote: > On 11/25/2010 08:30 AM, Emile Anclin wrote: > > > >hello, > > > >working on Pylint, we have a lot of voluntary corrupted files to test > >Pylint behavior; for instance > > > >$ cat /home/emile/var/pylint/test/input/func_unknown_encoding.py > ># -*- coding: IBO-8859-1 -*- > >""" check correct unknown encoding declaration > >""" > > > >__revision__ = '????' > > > > > >and we try to find that module : > >find_module('func_unknown_encoding', None). But python3 raises SyntaxError > >in that case ; it didn't raise SyntaxError on python2 nor does so on our > >func_nonascii_noencoding and func_wrong_encoding modules (with obvious > >names) > > > >Python 3.2a2 (r32a2:84522, Sep 14 2010, 15:22:36) > >[GCC 4.3.4] on linux2 > >Type "help", "copyright", "credits" or "license" for more information. > >>>>from imp import find_module > >>>>find_module('func_unknown_encoding', None) > >Traceback (most recent call last): > > File "", line 1, in > >SyntaxError: encoding problem: with BOM > > I don't think there is a clear reason by design. Also try importing > the same modules directly and noting the differences in the errors > you get. IMO the point is that we can consider as a bug the fact that find_module tries to somewhat read the content of the file, no? Though it seems to only doing this for encoding detection or like since find_module doesn't choke on a module containing another kind of syntax error. 
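For illustration only, here is a minimal sketch of the suspected cause. It assumes the SyntaxError comes from source-encoding detection on the first lines of the file, and it uses `tokenize.detect_encoding` (the standard library's pure-Python encoding detector) as a stand-in; that is an assumption, not necessarily the code path `imp.find_module` takes. The sample bytes mirror the Pylint test file quoted above, and the `detect` helper is hypothetical.

```python
import io
import tokenize

# Same declaration as the quoted test file: "IBO-8859-1" is not a real codec.
BAD_SOURCE = (
    b"# -*- coding: IBO-8859-1 -*-\n"
    b'""" check correct unknown encoding declaration\n'
    b'"""\n'
)


def detect(src):
    """Return the detected encoding, or the SyntaxError message on failure."""
    try:
        encoding, _lines = tokenize.detect_encoding(io.BytesIO(src).readline)
        return encoding
    except SyntaxError as exc:
        return "SyntaxError: %s" % exc.msg


# An unknown codec name is reported as a SyntaxError during detection alone.
bad = detect(BAD_SOURCE)
# A valid declaration is simply normalised and returned.
good = detect(b"# -*- coding: iso-8859-1 -*-\nx = 1\n")
# Any other kind of syntax error is invisible at this stage, because only
# the coding declaration is inspected, not the rest of the module.
other_error = detect(b"x = = 1\n")

print(bad)
print(good)
print(other_error)
```

This would be consistent with the observation that only the bad encoding declaration trips the lookup, while modules with other syntax errors are found without complaint.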
So the question is, should we deal with this in pylint/astng, or can we expect
this to be fixed at some point?

--
Sylvain Thénault                               LOGILAB, Paris (France)
Formations Python, Debian, Méth. Agiles: http://www.logilab.fr/formations
Développement logiciel sur mesure:       http://www.logilab.fr/services
CubicWeb, the semantic web framework:    http://www.cubicweb.org

From ncoghlan at gmail.com Mon Nov 29 13:43:26 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 29 Nov 2010 22:43:26 +1000
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <4CF38861.5090309@egenix.com>
References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <87pqto6bnv.fsf@uwakimon.sk.tsukuba.ac.jp> <4CF38861.5090309@egenix.com>
Message-ID:

On Mon, Nov 29, 2010 at 9:02 PM, M.-A. Lemburg wrote:
> If we would go down that road, we would also have to disable other
> Unicode features based on locale, e.g. whether to apply non-ASCII
> case mappings, what to consider whitespace, etc.
>
> We don't do that for a good reason: Unicode is supposed to be
> universal and not limited to a single locale.

Because parsing numbers is about more than just the characters used
for the individual digits. There are additional semantics associated
with digit ordering (for any number) and decimal separators and
exponential notation (for floating point numbers) and those vary by
locale. We deliberately chose to make the builtin numeric parsers
unaware of all of those things, and assuming that we can simply parse
other digits as if they were their ASCII equivalents and otherwise
assume a C locale seems questionable.

If the existing semantics can be adequately defined, documented and
defended, then retaining them would be fine. However, the language
reference needs to define the behaviour properly so that other
implementations know what they need to support and what can be chalked
up as being just an implementation accident of CPython.
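As an illustrative aside, the behaviour being debated can be sketched as follows. The sample string is hypothetical (Arabic-Indic digits chosen for the example, not necessarily the digits used elsewhere in this thread), and the results are what current CPython 3 releases are expected to give; other implementations may differ, which is exactly the point about needing a documented definition.

```python
from decimal import Decimal
from fractions import Fraction

# Hypothetical sample: ARABIC-INDIC DIGITS ONE..SIX with an ASCII decimal point.
TEXT = "\u0661\u0662\u0663\u0664.\u0665\u0666"
# Same digits, but with U+066B ARABIC DECIMAL SEPARATOR instead of ".".
TEXT_LOCAL_SEPARATOR = "\u0661\u0662\u0663\u0664\u066b\u0665\u0666"


def parses(constructor, text):
    """Return True if constructor(text) succeeds, False on ValueError."""
    try:
        constructor(text)
        return True
    except ValueError:
        return False


# The digits themselves are mapped to their ASCII equivalents...
as_int = int(TEXT.split(".")[0])
as_float = float(TEXT)
as_decimal = Decimal(TEXT)
as_fraction = Fraction(TEXT)

# ...but nothing else about the local number format is understood:
# the script's own decimal separator is rejected.
local_separator_ok = parses(float, TEXT_LOCAL_SEPARATOR)

print(as_int, as_float, as_decimal, as_fraction, local_separator_ok)
```

The last result shows the asymmetry raised earlier in the thread: the digits are accepted, yet the alternative decimal point is not.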
(As a point in the plus column, both decimal.Decimal and
fractions.Fraction were able to handle the '????.??' example in a
manner consistent with the int and float handling)

Regards,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From merwok at netwok.org Mon Nov 29 14:14:30 2010
From: merwok at netwok.org (Éric Araujo)
Date: Mon, 29 Nov 2010 14:14:30 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <4CF2E86F.5000606@v.loewis.de>
References: <4CF2E86F.5000606@v.loewis.de>
Message-ID: <4CF3A736.4050003@netwok.org>

Hello,

> Please comment with any changes you want to see, or speak in
> favor or against this PEP.

How to get a diff between py3k and this branch?

Regards

From doko at ubuntu.com Mon Nov 29 14:37:33 2010
From: doko at ubuntu.com (Matthias Klose)
Date: Mon, 29 Nov 2010 14:37:33 +0100
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <4CF3A736.4050003@netwok.org>
References: <4CF2E86F.5000606@v.loewis.de> <4CF3A736.4050003@netwok.org>
Message-ID: <4CF3AC9D.20309@ubuntu.com>

On 29.11.2010 14:14, Éric Araujo wrote:
> Hello,
>
>> Please comment with any changes you want to see, or speak in
>> favor or against this PEP.
>
> How to get a diff between py3k and this branch?

I used
svn diff svn://svn.python.org/python/branches/py3k@84330 svn://svn.python.org/python/branches/pep-0384

From ncoghlan at gmail.com Mon Nov 29 14:58:50 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 29 Nov 2010 23:58:50 +1000
Subject: [Python-Dev] PEP 384 final review
In-Reply-To: <4CF3AC9D.20309@ubuntu.com>
References: <4CF2E86F.5000606@v.loewis.de> <4CF3A736.4050003@netwok.org> <4CF3AC9D.20309@ubuntu.com>
Message-ID:

On Mon, Nov 29, 2010 at 11:37 PM, Matthias Klose wrote:
> On 29.11.2010 14:14, Éric Araujo wrote:
>>
>> Hello,
>>
>>> Please comment with any changes you want to see, or speak in
>>> favor or against this PEP.
>>
>> How to get a diff between py3k and this branch?
> > I used > svn diff svn://svn.python.org/python/branches/py3k at 84330 > svn://svn.python.org/python/branches/pep-0384 I had to use the full read/write svn+ssh:pythondev at svn.python.org repository URLs to get it to give me a diff. The http read only URLs didn't work (no diff returned, just "svn: OPTIONS of 'http://svn.python.org/python/branches/pep-0384': 200 OK (http://svn.python.org)"), and the bare svn protocol isn't enabled on svn.python.org. Since directory diffs don't appear to be enabled on the svn.python.org ViewVC instance, it would probably be a good idea to put this up on Reitveld so people can more easily see the details of what has been changed on the branch to date. If nobody beats me to it, I'll put one up in the morning. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Mon Nov 29 15:07:32 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 30 Nov 2010 00:07:32 +1000 Subject: [Python-Dev] PEP 384 final review In-Reply-To: <4CF2E86F.5000606@v.loewis.de> References: <4CF2E86F.5000606@v.loewis.de> Message-ID: On Mon, Nov 29, 2010 at 9:40 AM, "Martin v. L?wis" wrote: > I have now completed > > http://www.python.org/dev/peps/pep-0384/ > > Benjamin has volunteered to rule on this PEP. > > Please comment with any changes you want to see, or speak in > favor or against this PEP. This is probably an issue independent of the PEP, but there appear to be a *lot* of exposed typedefs for various type slots and other function signatures that don't start with the Py prefix (i.e. getter, setter, unaryfunc and friends). Python.h shouldn't be leaking unprefixed names like that. We certainly shouldn't be enshrining them in the stable ABI without adding prefixes first. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From solipsis at pitrou.net Mon Nov 29 15:19:07 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 29 Nov 2010 15:19:07 +0100 Subject: [Python-Dev] Python and the Unicode Character Database References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <87pqto6bnv.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20101129151907.64e3f6ae@pitrou.net> On Mon, 29 Nov 2010 13:58:05 +1000 Nick Coghlan wrote: > On Mon, Nov 29, 2010 at 1:39 PM, Stephen J. Turnbull wrote: > > I agree that Python should make it easy for the programmer to get > > numerical values of native numeric strings, but it's not at all clear > > to me that there is any point to having float() recognize them by > > default. > > Indeed, as someone else suggested earlier in the thread, supporting > non-ASCII digits sounds more like a job for the locale module than for > the builtin types. Not sure, really. For example, "\d" in a regular expression will match all Unicode digits, unless you pass the re.ASCII flag. The C locale mechanism generally does a poor job of supporting what MS seems to call "culture-specific" characteristics. Regards Antoine. From solipsis at pitrou.net Mon Nov 29 15:22:24 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 29 Nov 2010 15:22:24 +0100 Subject: [Python-Dev] Python and the Unicode Character Database References: <4CF2E93F.70208@pearwood.info> Message-ID: <20101129152224.7c253a8c@pitrou.net> On Sun, 28 Nov 2010 21:32:15 -0500 Alexander Belopolsky wrote: > On Sun, Nov 28, 2010 at 6:43 PM, Steven D'Aprano wrote: > .. > >> is more important than to assure users that once their program > >> accepted some text as a number, they can assume that the text is > >> ASCII. > > > > Seems like a pretty foolish assumption, if you ask me, pretty much akin to > > assuming that if string.isalpha() returns true that string is ASCII. > > > > It is not to 99.9% of Python users whose code is written for 2.x. 
> Their strings are byte strings and string.isdigit() does imply ASCII > even if string.isalpha() does not in many locales. We are not talking about string.isdigit(), we are talking about the float() constructor when given an unicode string. Constructing a float from an unicode string is certainly a common thing, even in 2.x. Regards Antoine. From foom at fuhm.net Mon Nov 29 15:15:12 2010 From: foom at fuhm.net (James Y Knight) Date: Mon, 29 Nov 2010 09:15:12 -0500 Subject: [Python-Dev] PEP 384 final review In-Reply-To: References: <4CF2E86F.5000606@v.loewis.de> <4CF3A736.4050003@netwok.org> <4CF3AC9D.20309@ubuntu.com> Message-ID: <28693E2E-A60E-4F83-BF55-DBD6EAD88353@fuhm.net> On Nov 29, 2010, at 8:58 AM, Nick Coghlan wrote: > The http read only URLs > didn't work (no diff returned, just "svn: OPTIONS of > 'http://svn.python.org/python/branches/pep-0384': 200 OK > (http://svn.python.org)"), That was the wrong url: you should've used http://svn.python.org/projects/python/branches/pep-0384 James -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Mon Nov 29 16:19:19 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Mon, 29 Nov 2010 16:19:19 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <87pqto6bnv.fsf@uwakimon.sk.tsukuba.ac.jp> <4CF38861.5090309@egenix.com> Message-ID: <4CF3C477.1020007@egenix.com> Nick Coghlan wrote: > On Mon, Nov 29, 2010 at 9:02 PM, M.-A. Lemburg wrote: >> If we would go down that road, we would also have to disable other >> Unicode features based on locale, e.g. whether to apply non-ASCII >> case mappings, what to consider whitespace, etc. >> >> We don't do that for a good reason: Unicode is supposed to be >> universal and not limited to a single locale. 
> > Because parsing numbers is about more than just the characters used > for the individual digits. There are additional semantics associated > with digit ordering (for any number) and decimal separators and > exponential notation (for floating point numbers) and those vary by > locale. We deliberately chose to make the builtin numeric parsers > unaware of all of those things, and assuming that we can simply parse > other digits as if they were their ASCII equivalents and otherwise > assume a C locale seems questionable. Sure, and those additional semantics are locale dependent, even between ASCII-only locales. However, that does not apply to the basic building blocks, the decimal digits themselves. > If the existing semantics can be adequately defined, documented and > defended, then retaining them would be fine. However, the language > reference needs to define the behaviour properly so that other > implementations know what they need to support and what can be chalked > up as being just an implementation accident of CPython. (As a point in > the plus column, both decimal.Decimal and fractions.Fraction were able > to handle the '????.??' example in a manner consistent with the int > and float handling) The support is built into the C API, so there's not really much surprise there. Regarding documentation, we'd just have to add that numbers may be made up of an Unicode code point in the category "Nd". See http://www.unicode.org/versions/Unicode5.2.0/ch04.pdf, section 4.6 for details.... """ Decimal digits form a large subcategory of numbers consisting of those digits that can be used to form decimal-radix numbers. They include script-specific digits, but exclude char- acters such as Roman numerals and Greek acrophonic numerals. (Note that <1, 5> = 15 = fifteen, but = IV = four.) Decimal digits also exclude the compatibility subscript or superscript digits to prevent simplistic parsers from misinterpreting their values in context. 
""" int(), float() and long() (in Python2) are such simplistic parsers. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 29 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ziade.tarek at gmail.com Mon Nov 29 16:59:42 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Mon, 29 Nov 2010 16:59:42 +0100 Subject: [Python-Dev] PEP 384 final review In-Reply-To: <4CF37F56.9030808@ubuntu.com> References: <4CF2E86F.5000606@v.loewis.de> <4CF37F56.9030808@ubuntu.com> Message-ID: On Mon, Nov 29, 2010 at 11:24 AM, Matthias Klose wrote: > On 29.11.2010 00:40, "Martin v. L?wis" wrote: >> >> I have now completed >> >> http://www.python.org/dev/peps/pep-0384/ >> >> Benjamin has volunteered to rule on this PEP. >> >> Please comment with any changes you want to see, or speak in >> favor or against this PEP. > > I looked at a diff with r84330 from the py3k branch. > > Extensions built with Py_LIMITED_API have the python version encoded in it's > name. ?Which abi name should be used for these extensions? >.. > ?- Should the distutils support for LIMITED_API be part of the pep, or > ? be implemented later? In any case, it has to be implemented in Distutils2, not in Distutils. Distutils is frozen and just in maintenance mode. Once Distutils2 final is released (it's currently in alpha), it will be installable from 2.4 to 3.x and can provide this feature. 
For Python itself we can backport the feature in its setup.py, until
Distutils2 is back in the stdlib.

> In favour of the pep.

+1

>
> Matthias
>

--
Tarek Ziadé | http://ziade.org

From alexander.belopolsky at gmail.com Mon Nov 29 17:07:03 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 29 Nov 2010 11:07:03 -0500
Subject: [Python-Dev] [Preview] Comments and change proposals on documentation
In-Reply-To:
References: <4CF18220.7000202@pearwood.info> <4CF19B3C.2000308@pearwood.info>
Message-ID:

On Mon, Nov 29, 2010 at 3:52 AM, Georg Brandl wrote:
..
>> Yes, I failed to fully read the instructions you sent, or understand
>> them. That's what users do -- they don't read your instructions, and
>> they misunderstand them. If your UI isn't easily discoverable, users
>> will not be able to use it, and will be frustrated and annoyed. The user
>> is always right, even when they're doing it wrong *wink*
>
> That's right, of course. I really come to the conclusion that having a text
> link that "looks like" a link, i.e. is underlined, will have a better UI
> experience (since we cannot put notes "click bubble to comment" everywhere).
>

Please don't make comment bubbles more visible. Doing so will only
decrease the signal-to-noise ratio. I think a little bit of a learning
barrier is a good thing: it will keep down the number of "Bart was here"
comments.

From alexander.belopolsky at gmail.com Mon Nov 29 19:09:58 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 29 Nov 2010 13:09:58 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <4CF354C6.9020302@v.loewis.de>
References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de>
Message-ID:

On Mon, Nov 29, 2010 at 2:22 AM, "Martin v.
L?wis" wrote: >> The former ensures that literals in code are always readable; the later >> allows users to enter numbers in their own number system. How could that >> be a bad thing? > > It's YAGNI, feature bloat. It gives the illusion of supporting something > that actually isn't supported very well (namely, parsing local number > strings). I claim that there is no meaningful application > of this feature. > Speaking of YAGNI, does anyone want to defend >>> complex('????.??j') 1234.56j ? Especially given that we reject complex('1234.56i'): http://bugs.python.org/issue10562 From solipsis at pitrou.net Mon Nov 29 19:33:02 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 29 Nov 2010 19:33:02 +0100 Subject: [Python-Dev] Python and the Unicode Character Database References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> Message-ID: <20101129193302.115dbcd5@pitrou.net> On Mon, 29 Nov 2010 08:22:46 +0100 "Martin v. L?wis" wrote: > > The former ensures that literals in code are always readable; the later > > allows users to enter numbers in their own number system. How could that > > be a bad thing? > > It's YAGNI, feature bloat. It gives the illusion of supporting something > that actually isn't supported very well (namely, parsing local number > strings). I claim that there is no meaningful application > of this feature. Still, if it's not detrimental and it it's not difficult to support, then why do you care? You aren't even maintaining that part of the code. I don't think "remove feature bloat" is part of our development goals or practices. Given the diversity of our user base, such removal should be done carefully and only for serious reasons. Regards Antoine. From mal at egenix.com Mon Nov 29 19:59:57 2010 From: mal at egenix.com (M.-A. 
Lemburg) Date: Mon, 29 Nov 2010 19:59:57 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> Message-ID: <4CF3F82D.2040000@egenix.com> Alexander Belopolsky wrote: > On Mon, Nov 29, 2010 at 2:22 AM, "Martin v. L?wis" wrote: >>> The former ensures that literals in code are always readable; the later >>> allows users to enter numbers in their own number system. How could that >>> be a bad thing? >> >> It's YAGNI, feature bloat. It gives the illusion of supporting something >> that actually isn't supported very well (namely, parsing local number >> strings). I claim that there is no meaningful application >> of this feature. This is not about parsing local number strings, it's about parsing number strings represented using different scripts - besides en-US is a locale as well, ye know :-) > Speaking of YAGNI, does anyone want to defend > >>>> complex('????.??j') > 1234.56j > > ? Yes. The same arguments apply. Just because ASCII-proponents may have a hard time reading such literals, doesn't mean that script users have the same trouble. > Especially given that we reject complex('1234.56i'): > > http://bugs.python.org/issue10562 We've had that discussion long before we had Unicode in Python. The main reason was that 'i' looked to similar to 1 in a number of fonts which is why it was rejected for Python source code. However, I don't any reason why we shouldn't accept both i and j for complex(), though, since the input to that constructor doesn't have to originate in Python source code. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 29 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... 
http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From brett at python.org Mon Nov 29 20:22:22 2010 From: brett at python.org (Brett Cannon) Date: Mon, 29 Nov 2010 11:22:22 -0800 Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError In-Reply-To: <20101129115311.GD18888@lupus.logilab.fr> References: <201011251530.23947.emile.anclin@logilab> <4CEE9B72.1070002@ronadam.com> <20101129115311.GD18888@lupus.logilab.fr> Message-ID: On Mon, Nov 29, 2010 at 03:53, Sylvain Th?nault wrote: > On 25 novembre 11:22, Ron Adam wrote: >> On 11/25/2010 08:30 AM, Emile Anclin wrote: >> > >> >hello, >> > >> >working on Pylint, we have a lot of voluntary corrupted files to test >> >Pylint behavior; for instance >> > >> >$ cat /home/emile/var/pylint/test/input/func_unknown_encoding.py >> ># -*- coding: IBO-8859-1 -*- >> >""" check correct unknown encoding declaration >> >""" >> > >> >__revision__ = '????' >> > >> > >> >and we try to find that module : >> >find_module('func_unknown_encoding', None). But python3 raises SyntaxError >> >in that case ; it didn't raise SyntaxError on python2 nor does so on our >> >func_nonascii_noencoding and func_wrong_encoding modules (with obvious >> >names) >> > >> >Python 3.2a2 (r32a2:84522, Sep 14 2010, 15:22:36) >> >[GCC 4.3.4] on linux2 >> >Type "help", "copyright", "credits" or "license" for more information. >> >>>>from imp import find_module >> >>>>find_module('func_unknown_encoding', None) >> >Traceback (most recent call last): >> > ? File "", line 1, in >> >SyntaxError: encoding problem: with BOM >> >> I don't think there is a clear reason by design. 
?Also try importing >> the same modules directly and noting the differences in the errors >> you get. > > IMO the point is that we can consider as a bug the fact that find_module > tries to somewhat read the content of the file, no? Though it seems to only > doing this for encoding detection or like since find_module doesn't choke on > a module containing another kind of syntax error. > > So the question is, should we deal with this in pylint/astng, or can we expect > this to be fixed at some point? Considering these semantics changed between Python 2 and 3 w/o a discernable benefit (I would consider it a negative as finding a module should not be impacted by syntactic correctness; the full act of importing should be the only thing that cares about that), I would consider it a bug that should be filed. From tjreedy at udel.edu Mon Nov 29 20:23:28 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 29 Nov 2010 14:23:28 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF3C477.1020007@egenix.com> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <87pqto6bnv.fsf@uwakimon.sk.tsukuba.ac.jp> <4CF38861.5090309@egenix.com> <4CF3C477.1020007@egenix.com> Message-ID: On 11/29/2010 10:19 AM, M.-A. Lemburg wrote: > Nick Coghlan wrote: >> On Mon, Nov 29, 2010 at 9:02 PM, M.-A. Lemburg wrote: >>> If we would go down that road, we would also have to disable other >>> Unicode features based on locale, e.g. whether to apply non-ASCII >>> case mappings, what to consider whitespace, etc. >>> >>> We don't do that for a good reason: Unicode is supposed to be >>> universal and not limited to a single locale. >> >> Because parsing numbers is about more than just the characters used >> for the individual digits. 
There are additional semantics associated >> with digit ordering (for any number) and decimal separators and >> exponential notation (for floating point numbers) and those vary by >> locale. We deliberately chose to make the builtin numeric parsers >> unaware of all of those things, and assuming that we can simply parse >> other digits as if they were their ASCII equivalents and otherwise >> assume a C locale seems questionable. > > Sure, and those additional semantics are locale dependent, even > between ASCII-only locales. However, that does not apply to the > basic building blocks, the decimal digits themselves. > >> If the existing semantics can be adequately defined, documented and >> defended, then retaining them would be fine. However, the language >> reference needs to define the behaviour properly so that other >> implementations know what they need to support and what can be chalked >> up as being just an implementation accident of CPython. (As a point in >> the plus column, both decimal.Decimal and fractions.Fraction were able >> to handle the '????.??' example in a manner consistent with the int >> and float handling) > > The support is built into the C API, so there's not really much > surprise there. > > Regarding documentation, we'd just have to add that numbers may > be made up of an Unicode code point in the category "Nd". > > See http://www.unicode.org/versions/Unicode5.2.0/ch04.pdf, section > 4.6 for details.... > > """ > Decimal digits form a large subcategory of numbers consisting of those digits that can be > used to form decimal-radix numbers. They include script-specific digits, but exclude char- > acters such as Roman numerals and Greek acrophonic numerals. (Note that<1, 5> = 15 = > fifteen, but = IV = four.) Decimal digits also exclude the compatibility subscript or > superscript digits to prevent simplistic parsers from misinterpreting their values in context. > """ > > int(), float() and long() (in Python2) are such simplistic > parsers. 
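To make the "Nd" rule quoted above concrete, here is a minimal sketch of what a digit-by-digit parser restricted to that category could look like. The helper name `decimal_value` is hypothetical and not part of CPython; only `unicodedata.category` and `unicodedata.decimal` are real library calls, and the sample strings are illustrative.

```python
import unicodedata


def decimal_value(text):
    """Hypothetical helper: parse a string made only of category-Nd characters."""
    if not text:
        raise ValueError("empty string")
    value = 0
    for ch in text:
        # Only characters in general category "Nd" count as decimal digits.
        if unicodedata.category(ch) != "Nd":
            raise ValueError("not a decimal digit: %r" % ch)
        value = value * 10 + unicodedata.decimal(ch)
    return value


arabic_indic = "\u0661\u0662\u0663"  # ARABIC-INDIC DIGIT ONE, TWO, THREE
print(decimal_value(arabic_indic))   # 123
print(int(arabic_indic))             # 123, the builtin accepts the same digits
```

Superscript digits such as U+00B2 are category "No", so the sketch rejects them, which matches the exclusion described in the quoted Unicode text.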
Since you are the knowledgeable advocate of the current behavior, perhaps you could open an issue and propose a doc patch, even if not .rst formatted. -- Terry Jan Reedy From alexander.belopolsky at gmail.com Mon Nov 29 20:38:46 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 29 Nov 2010 14:38:46 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <20101129193302.115dbcd5@pitrou.net> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> Message-ID: On Mon, Nov 29, 2010 at 1:33 PM, Antoine Pitrou wrote: > On Mon, 29 Nov 2010 08:22:46 +0100 > "Martin v. Löwis" wrote: >> > The former ensures that literals in code are always readable; the later >> > allows users to enter numbers in their own number system. How could that >> > be a bad thing? >> >> It's YAGNI, feature bloat. It gives the illusion of supporting something >> that actually isn't supported very well (namely, parsing local number >> strings). I claim that there is no meaningful application >> of this feature. > > Still, if it's not detrimental and it it's not difficult to support, > then why do you care? It is difficult to support. A fix for issue10557 would be much simpler if we did not support non-European digits. I have now added a patch that handles non-ASCII digits, so you can see what's involved. Note that when the Unicode Consortium inevitably adds more Nd characters to the non-BMP planes, we will have to add surrogate pairs' support to this code. In any case, there is little we can do about it in 3.2 other than fix bugs like issue10557 without breaking currently valid code, so I created a separate issue to continue this debate in the context of 3.3. [issue10581] Now, I would like to bring this thread back to its subject. 
Given that UCD is now affecting the language definition and the standard library behavior, how should changes to UCD be handled? - Should Python documentation refer to the specific version of Unicode that it supports? Current documentation refers to old versions. Should version be updated or removed to imply the latest? - How UCD updates should be handled during the language moratorium? During PEP 3003 discussion, it was suggested to handle it on a case by case basis, but I don't see discussion of the upgrade to 6.0.0 in PEP 3003. Should this upgrade be backported to 2.7? - How specific should library reference manual be in defining methods affected by UCD such as str.upper()? - What is an acceptable level of variation between Python implementations? For example, if '\UXXXXXXXX'.isalpha() returns true in one implementation, can it return false in another? Note that even CPython narrow and wide builds are presently not consistent in this respect. [issue10581] http://bugs.python.org/issue10581 From alexander.belopolsky at gmail.com Mon Nov 29 20:43:14 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 29 Nov 2010 14:43:14 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2DAD7.2000408@egenix.com> <87pqto6bnv.fsf@uwakimon.sk.tsukuba.ac.jp> <4CF38861.5090309@egenix.com> <4CF3C477.1020007@egenix.com> Message-ID: On Mon, Nov 29, 2010 at 2:23 PM, Terry Reedy wrote: .. > Since you are the knowledgable advocate of the current behavior, perhaps you > could open an issue and propose a doc patch, even if not .rst formatted. > I am not an advocate of the current behavior, but an issue for doc patches is at . 
From martin at v.loewis.de Mon Nov 29 20:38:59 2010 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 29 Nov 2010 20:38:59 +0100 Subject: [Python-Dev] PEP 384 final review In-Reply-To: References: <4CF2E86F.5000606@v.loewis.de> <4CF37F56.9030808@ubuntu.com> Message-ID: <4CF40153.8030100@v.loewis.de> >> - Should the distutils support for LIMITED_API be part of the pep, or >> be implemented later? > > In any case, it has to be implemented in Distutils2, not in Distutils. > Distutils is frozen and just in maintenance mode. I think it's too late for that. PEP 3149 is accepted, and it does specify a change to distutils (namely, the abi= parameter). ISTM that an approved PEP will override the distutils code freeze. > For Python itself we can backport the feature in its setup.py, until > Distutils2 is back to the sdtlib This won't be for python itself, but for extension modules. Regards, Martin From ziade.tarek at gmail.com Mon Nov 29 20:45:35 2010 From: ziade.tarek at gmail.com (Tarek Ziadé) Date: Mon, 29 Nov 2010 20:45:35 +0100 Subject: [Python-Dev] PEP 384 final review In-Reply-To: <4CF40153.8030100@v.loewis.de> References: <4CF2E86F.5000606@v.loewis.de> <4CF37F56.9030808@ubuntu.com> <4CF40153.8030100@v.loewis.de> Message-ID: 2010/11/29 "Martin v. Löwis" : >>> - Should the distutils support for LIMITED_API be part of the pep, or >>> be implemented later? >> >> In any case, it has to be implemented in Distutils2, not in Distutils. >> Distutils is frozen and just in maintenance mode. > > I think it's too late for that. PEP 3149 is accepted, and it does > specify a change to distutils (namely, the abi= parameter). ISTM that > an approved PEP will override the distutils code freeze. Having an accepted PEP does not imply that it should be implemented in the standard library. For instance PEP 345 and PEP 376 are accepted but implemented in Distutils2. 
It's also a: - good opportunity to boost Distutils2 adoption - way to get feedback from people for that abi= option and have the chance to correct any design issue before d2 is added in the stdlib > >> For Python itself we can backport the feature in its setup.py, until >> Distutils2 is back to the sdtlib > > This won't be for python itself, but for extension modules. ok. > > Regards, > Martin > -- Tarek Ziadé | http://ziade.org From rrr at ronadam.com Mon Nov 29 21:21:07 2010 From: rrr at ronadam.com (Ron Adam) Date: Mon, 29 Nov 2010 14:21:07 -0600 Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError In-Reply-To: References: <201011251530.23947.emile.anclin@logilab> <4CEE9B72.1070002@ronadam.com> <20101129115311.GD18888@lupus.logilab.fr> Message-ID: On 11/29/2010 01:22 PM, Brett Cannon wrote: > On Mon, Nov 29, 2010 at 03:53, Sylvain Thénault > wrote: >> On 25 novembre 11:22, Ron Adam wrote: >>> On 11/25/2010 08:30 AM, Emile Anclin wrote: >>>> >>>> hello, >>>> >>>> working on Pylint, we have a lot of voluntary corrupted files to test >>>> Pylint behavior; for instance >>>> >>>> $ cat /home/emile/var/pylint/test/input/func_unknown_encoding.py >>>> # -*- coding: IBO-8859-1 -*- >>>> """ check correct unknown encoding declaration >>>> """ >>>> >>>> __revision__ = '????' >>>> >>>> >>>> and we try to find that module : >>>> find_module('func_unknown_encoding', None). But python3 raises SyntaxError >>>> in that case ; it didn't raise SyntaxError on python2 nor does so on our >>>> func_nonascii_noencoding and func_wrong_encoding modules (with obvious >>>> names) >>>> >>>> Python 3.2a2 (r32a2:84522, Sep 14 2010, 15:22:36) >>>> [GCC 4.3.4] on linux2 >>>> Type "help", "copyright", "credits" or "license" for more information. 
>>>>>> >from imp import find_module >>>>>>> find_module('func_unknown_encoding', None) >>>> Traceback (most recent call last): >>>> File "", line 1, in >>>> SyntaxError: encoding problem: with BOM >>> >>> I don't think there is a clear reason by design. Also try importing >>> the same modules directly and noting the differences in the errors >>> you get. >> >> IMO the point is that we can consider as a bug the fact that find_module >> tries to somewhat read the content of the file, no? Though it seems to only >> doing this for encoding detection or like since find_module doesn't choke on >> a module containing another kind of syntax error. >> >> So the question is, should we deal with this in pylint/astng, or can we expect >> this to be fixed at some point? > > Considering these semantics changed between Python 2 and 3 w/o a > discernable benefit (I would consider it a negative as finding a > module should not be impacted by syntactic correctness; the full act > of importing should be the only thing that cares about that), I would > consider it a bug that should be filed. The output of imp.find_module() returns an open file io object, and it's output feeds directly into to imp.load_module(). >>> imp.find_module('pydoc') (<_io.TextIOWrapper name=4 encoding='utf-8'>, '/usr/local/lib/python3.2/pydoc.py', ('.py', 'U', 1)) So I think the imp.find_module() is suppose to be used when you *do* want to do the full act of importing and not for just finding out if or where module xyz exists. Ron From martin at v.loewis.de Mon Nov 29 21:22:02 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 29 Nov 2010 21:22:02 +0100 Subject: [Python-Dev] PEP 384 final review In-Reply-To: <4CF37F56.9030808@ubuntu.com> References: <4CF2E86F.5000606@v.loewis.de> <4CF37F56.9030808@ubuntu.com> Message-ID: <4CF40B6A.6080407@v.loewis.de> > Extensions built with Py_LIMITED_API have the python version encoded in > it's name. 
Which abi name should be used for these extensions? PEP 3149, IIUC, says it should be "abi3". I don't understand what that means, though (with respect to, say, distutils) > - The m and u modifiers in the abi name are complimentary (?) See above: none of these will be used. Of course, it is possible to name an ABI-conforming extensions with the regular ABI name of the Python release. > - For posix systems the implementation is currently part of the abi name, > are Py_LIMITED_API extensions supposed to be compatible with e.g. PyPy? That's a choice that PyPy needs to make, of course, but Amaury has indicated that they are interested in doing so. > Should the LIMITED_API abi name include the implementation string? > - Should the distutils support for LIMITED_API be part of the pep, or > be implemented later? Depends on what support you want. Currently, all you need to do is to define Py_LIMITED_API to the preprocessor - this is something that is already supported in distutils. If you want the support suggested in PEP 3149 (specifying abi=3), it should certainly be implemented in Python 3.2, despite the distutils freeze. Regards, Martin From martin at v.loewis.de Mon Nov 29 21:36:46 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 29 Nov 2010 21:36:46 +0100 Subject: [Python-Dev] PEP 384 final review In-Reply-To: References: <4CF2E86F.5000606@v.loewis.de> Message-ID: <4CF40EDE.10004@v.loewis.de> > This is probably an issue independent of the PEP but there appear to > be a *lot* of exposed typedefs for various type slots and other > function signatures that don't start with the Py prefix (i.e. getter, > setter, unaryfunc and friends). It's indeed independent: the names don't actually affect the ABI, but the API. Changing them is possible later without risking binary compatibility. > Python.h shouldn't be leaking > unprefixed names like that. We certainly shouldn't be enshrining them > in the stable ABI without adding prefixes first. 
The stable ABI isn't actually enshrining them - what gets enshrined is the value of the typedefs, not their names. I don't mind renaming them, though. I see a number of different cases: - struct names. I don't see a problem to have "typedef struct PyFoo PyFoo" I vaguely recall that there had been compiler problems with that construct at some point, but to my knowledge, they are past, and this is actually both well-formed C and well-formed C++. - function pointer type names - "various" other types For the struct types, in particular for the ones which already have a typedef, I think renaming them should be possible right away. Applications that break should be able to use the typedef instead, and continue to work with older releases. For the function pointer type names, caution is necessary. We cannot remove them, since it would break a lot of code. I also think that some smart naming scheme would be desirable that makes the names all sound right, yet allows easy mapping from the existing types. Once such a scheme is added, we should have a graceful deprecation procedure, such as: - release A: add typedefs in addition to existing pointer types, deprecate pointer types in documentation - release B>A: make the old names somehow conditional (e.g. put them all into a header file rename3.h, or some such) - release C>B: remove rename3.h For the other rest, I think many of them are considered internal (of course, they shouldn't appear in the ABI then at all). Renaming them right away might be fine. 
Regards, Martin From martin at v.loewis.de Mon Nov 29 21:41:09 2010 From: martin at v.loewis.de ("Martin v. Löwis") Date: Mon, 29 Nov 2010 21:41:09 +0100 Subject: [Python-Dev] PEP 384 final review In-Reply-To: <4CF3A736.4050003@netwok.org> References: <4CF2E86F.5000606@v.loewis.de> <4CF3A736.4050003@netwok.org> Message-ID: <4CF40FE5.8080800@v.loewis.de> On 29.11.2010 14:14, Éric Araujo wrote: > Hello, > >> Please comment with any changes you want to see, or speak in >> favor or against this PEP. > > How to get a diff between py3k and this branch? As others have already explained: svn diff http://svn.python.org/projects/python/branches/py3k@84329 http://svn.python.org/projects/python/branches/pep-0384 (84329 is the value of svnmerge-integrated). In any case, I posted it to Rietveld as http://codereview.appspot.com/3262043/ Regards, Martin From greg.ewing at canterbury.ac.nz Mon Nov 29 21:47:23 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 30 Nov 2010 09:47:23 +1300 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CF1AB3C.3060408@btinternet.com> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CEDDC2D.204@canterbury.ac.nz> <4CEE5C1C.9000905@btinternet.com> <4CF2C86C.9030505@canterbury.ac.nz> <4CF1AB3C.3060408@btinternet.com> Message-ID: <4CF4115B.7080200@canterbury.ac.nz> Rob Cliffe wrote: > But when a frozen list a.k.a. tuple would be created - either directly, > or by setting a list's mutable flag to False which would really turn it > into a tuple - the size *would* be known. 
But at that point the object consists of two memory blocks -- one containing just the object header and a pointer to the items, and the other containing the items. To turn that into a true tuple structure would require resizing the main object block to be big enough to hold the items and copying them into it. The main object can't be moved (because there are PyObject *s all over the place pointing to it), so if there's not enough room at its current location, you're out of luck. So lists frozen after creation would have to remain as two blocks, making them second-class citizens compared to those that were created frozen. Either that or store all lists/tuples as two blocks, and give up some of the performance advantages of the current tuple structure. -- Greg From martin at v.loewis.de Mon Nov 29 22:04:03 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 29 Nov 2010 22:04:03 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <20101129193302.115dbcd5@pitrou.net> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> Message-ID: <4CF41543.1030800@v.loewis.de> Am 29.11.2010 19:33, schrieb Antoine Pitrou: > On Mon, 29 Nov 2010 08:22:46 +0100 > "Martin v. L?wis" wrote: >>> The former ensures that literals in code are always readable; the later >>> allows users to enter numbers in their own number system. How could that >>> be a bad thing? >> >> It's YAGNI, feature bloat. It gives the illusion of supporting something >> that actually isn't supported very well (namely, parsing local number >> strings). I claim that there is no meaningful application >> of this feature. > > Still, if it's not detrimental and it it's not difficult to support, > then why do you care? You aren't even maintaining that part of the code. 
I sure do maintain the Unicode database implementation in Python - the one that is being used (IMO incorrectly) to implement the conversion in question (and also the one that triggered this thread). > I don't think "remove feature bloat" is part of our development goals > or practices. Given the diversity of our user base, such removal should > be done carefully and only for serious reasons. I think it's a serious reason that the intuitive expectation of many people (including committers) deviates from the actual implementation - so much that they clarify the documentation in a way that makes the difference explicit. Having a mismatch between the expected behavior and the actual behavior is a serious problem because it could lead to security issues, e.g. when someone relies on float() to perform certain syntactic checking, making it then possible to sneak in values that cause corruption later on (speaking theoretically, of course - I'm not aware of an application that is vulnerable in this manner). Regards, Martin From martin at v.loewis.de Mon Nov 29 22:13:41 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 29 Nov 2010 22:13:41 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> Message-ID: <4CF41785.5020807@v.loewis.de> > - Should Python documentation refer to the specific version of Unicode > that it supports? You mean, mention it somewhere? Sure (although it would be nice if the documentation generator would automatically extract it from the source, just as it extracts the Python version number). Of course, such mentioning should explain that this is specific to CPython, and not an aspect of Python-the-language. > Current documentation refers to old versions. 
Should version be > updated or removed to imply the latest? What specific reference are you referring to? > - How UCD updates should be handled during the language moratorium? It's clearly not affected. > During PEP 3003 discussion, it was suggested to handle it on a case by > case basis, but I don't see discussion of the upgrade to 6.0.0 in PEP > 3003. It's covered by "As the standard library is not directly tied to the language definition it is not covered by this moratorium." > Should this upgrade be backported to 2.7? No, it's a new feature. > - How specific should library reference manual be in defining methods > affected by UCD such as str.upper()? It should specify what this actually does in Unicode terminology (probably in addition to a layman's rephrase of that) > - What is an acceptable level of variation between Python > implementations? For example, if '\UXXXXXXXX'.isalpha() returns true > in one implementation, can it return false in another? Implementations are free to use any version of the UCD. Regards, Martin From martin at v.loewis.de Mon Nov 29 22:14:07 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Mon, 29 Nov 2010 22:14:07 +0100 Subject: [Python-Dev] PEP 384 final review In-Reply-To: References: <4CF2E86F.5000606@v.loewis.de> <4CF35FAA.50600@v.loewis.de> Message-ID: <4CF4179F.9080700@v.loewis.de> Am 29.11.2010 09:36, schrieb Georg Brandl: > Am 29.11.2010 09:09, schrieb "Martin v. L?wis": >>> I have now completed >>> >>> http://www.python.org/dev/peps/pep-0384/ >>> >>> >>> was structseq.h considered? >> >> No, it wasn't - unfortunately, it still doesn't get included when >> including Python.h. I'll add it. > > Would 3.2 be a good time to finally include it? All of its macros and > declarations are named PyStructSequence*, so there shouldn't be a > name clash concern. Sure, I see no problem with that. 
Regards, Martin From greg.ewing at canterbury.ac.nz Mon Nov 29 22:36:51 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 30 Nov 2010 10:36:51 +1300 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> <4CF1706E.5030503@g.nevcal.com> <1D372F35-B455-4982-997B-2C54A7D56741@gmail.com> <4CF28310.7070304@voidspace.org.uk> Message-ID: <4CF41CF3.7040001@canterbury.ac.nz> I don't see how the grouping can be completely separated from the value-naming. If the named values are to be subclassed from the base values, then you want all the members of a group to belong to the *same* subclass. You can't get that by treating each named value on its own and then trying to group them together afterwards. -- Greg From steve at pearwood.info Mon Nov 29 23:09:15 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 30 Nov 2010 09:09:15 +1100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> Message-ID: <4CF4248B.1060409@pearwood.info> Alexander Belopolsky wrote: > Speaking of YAGNI, does anyone want to defend > >>>> complex('????.??j') > 1234.56j *If* we allow float('????.??') (as we currently do, but is being disputed by some), then we should allow complex('????.??j'). It would be silly for complex to be more restrictive than float. > Especially given that we reject complex('1234.56i'): I don't understand why you use 'i' when Python uses 'j' as the symbol for imaginary numbers. 
>>> complex('1234.56j') 1234.56j works fine. I have no problem with Python choosing one of i/j as the symbol for imaginary-1 and rejecting the other. I prefer i rather than j, but that's because my background is in maths rather than electrical engineering, but I can live with either. But in any case, please don't conflate the question of whether Python should accept j and/or i for complex numbers with the question of supporting non-arabic numerals. The two issues are unrelated. -- Steven From rrr at ronadam.com Tue Nov 30 00:38:26 2010 From: rrr at ronadam.com (Ron Adam) Date: Mon, 29 Nov 2010 17:38:26 -0600 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CF3180B.1060306@ronadam.com> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> <4CF3180B.1060306@ronadam.com> Message-ID: On 11/28/2010 09:03 PM, Ron Adam wrote: > It does associate additional info to names and creates a nice dictionary to > reference. > > > >>> def name_values( FOO: 1, > BAR: "Hello World!", > BAZ: dict(a=1, b=2, c=3) ): > ... return FOO, BAR, BAZ > ... > >>> foo(1,2,3) > (1, 2, 3) > >>> foo.__annotations__ > {'BAR': 'Hello World!', 'FOO': 1, 'BAZ': {'a': 1, 'c': 3, 'b': 2}} sigh... I havn't been very focused lately. That should have been: >>> def named_values(FOO:1, BAR:"Hello World!", BAZ:dict(a=1, b=2, c=3)): ... return FOO, BAR, BAZ ... 
>>> named_values.__annotations__ {'BAR': 'Hello World!', 'FOO': 1, 'BAZ': {'a': 1, 'c': 3, 'b': 2}} >>> named_values(1, 2, 3) (1, 2, 3) Cheers, Ron From ncoghlan at gmail.com Tue Nov 30 03:04:28 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 30 Nov 2010 12:04:28 +1000 Subject: [Python-Dev] PEP 384 final review In-Reply-To: <28693E2E-A60E-4F83-BF55-DBD6EAD88353@fuhm.net> References: <4CF2E86F.5000606@v.loewis.de> <4CF3A736.4050003@netwok.org> <4CF3AC9D.20309@ubuntu.com> <28693E2E-A60E-4F83-BF55-DBD6EAD88353@fuhm.net> Message-ID: On Tue, Nov 30, 2010 at 12:15 AM, James Y Knight wrote: > > On Nov 29, 2010, at 8:58 AM, Nick Coghlan wrote: > > The http read only URLs > didn't work (no diff returned, just "svn: OPTIONS of > 'http://svn.python.org/python/branches/pep-0384': 200 OK > (http://svn.python.org)"), > > That was the wrong url: you should've > used http://svn.python.org/projects/python/branches/pep-0384 > James Ah, thanks, I always forget that part (since it isn't there in the read/write URLs). The SVN output may qualify as one of the least helpful error messages I have ever seen, though :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | 
Brisbane, Australia From ncoghlan at gmail.com Tue Nov 30 03:23:04 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 30 Nov 2010 12:23:04 +1000 Subject: [Python-Dev] constant/enum type in stdlib In-Reply-To: <4CF41CF3.7040001@canterbury.ac.nz> References: <20101121034404.52924F20A@mail.python.org> <4CE9BF4A.1020302@netwok.org> <4CEA89E8.5090107@voidspace.org.uk> <20101122163722.7e96d123@pitrou.net> <4CEA9584.7040301@avl.com> <20101122172440.77d27ed5@pitrou.net> <20101122164654.2109.588145158.divmod.xquotient.165@localhost.localdomain> <4CEBC6BD.9060402@voidspace.org.uk> <4CED0557.9090101@voidspace.org.uk> <4CED4E34.5060400@voidspace.org.uk> <4CF1706E.5030503@g.nevcal.com> <1D372F35-B455-4982-997B-2C54A7D56741@gmail.com> <4CF28310.7070304@voidspace.org.uk> <4CF41CF3.7040001@canterbury.ac.nz> Message-ID: On Tue, Nov 30, 2010 at 7:36 AM, Greg Ewing wrote: > I don't see how the grouping can be completely separated > from the value-naming. If the named values are to be > subclassed from the base values, then you want all the > members of a group to belong to the *same* subclass. > You can't get that by treating each named value on its > own and then trying to group them together afterwards. Note that my sample implementation cached the created types, so that (for example) there was only ever one "Named" type (my implementation wasn't quite kosher in that respect, since functools.lru_cache has a non-optional size limit - setting maxsize to float('inf') deals with that). A grouping API would use either single or multiple inheritance to create members that supported both the naming aspects as well as the grouping aspects. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | 
Brisbane, Australia From alexander.belopolsky at gmail.com Tue Nov 30 04:46:33 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 29 Nov 2010 22:46:33 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF4248B.1060409@pearwood.info> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <4CF4248B.1060409@pearwood.info> Message-ID: On Mon, Nov 29, 2010 at 5:09 PM, Steven D'Aprano wrote: .. > But in any case, please don't conflate the question of whether Python should > accept j and/or i for complex numbers with the question of supporting > non-arabic numerals. The two issues are unrelated. The two issues are related because they are both about how strict numerical constructors should be. If we want to accept wide variations in how numbers can be spelled, then surely using i for the imaginary unit is much more common than using ? for the digit 7. I see two problems with supporting non-ascii spellings: 1. Support costs. 2. User confusion. The two are related because when users are confused, they will report invalid bugs when Python does not meet their expectations. For example, why >>> int('???', 10) 123 works, but >>> int('??????', 16) Traceback (most recent call last): .. UnicodeEncodeError: 'decimal' codec can't encode character '\uff21' in position 3: invalid decimal Unicode string does not? And if 'decimal' is a codec, why >>> '123'.encode('decimal') Traceback (most recent call last): ... LookupError: unknown encoding: decimal Before anyone suggests that int(.., 16) should consult the new Hex_Digit property in the UCD, let me remind that int() supports bases from 2 through 36. I thought Python design was primarily driven by practicality. Here the only plausible argument that one can make is that if Unicode says it is a digit, we should treat it as a digit. Purity over practicality. 
In practical terms, UCD comes at a price. The unicodedata module size is over 700K on my machine. This is almost half the size of the python executable and by far the largest extension module. (only CJK encodings come close.) Making builtins depend on the largest extension module for operation does not strike me as sound design. From stephen at xemacs.org Tue Nov 30 05:20:11 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 30 Nov 2010 13:20:11 +0900 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF3F82D.2040000@egenix.com> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <4CF3F82D.2040000@egenix.com> Message-ID: <87d3pn5tok.fsf@uwakimon.sk.tsukuba.ac.jp> M.-A. Lemburg writes: > Just because ASCII-proponents may have a hard time reading such > literals, That's not the point. > doesn't mean that script users have the same trouble. The script users may have no trouble reading them, but that doesn't mean it's not a YAGNI. In Japanese, it's a YAGNI except in addresses on New Year cards and in dates, which could be handled by specialized modules, or by a generic module for extracting numeric information from general (as opposed to program) text. Neither of those is likely to appear in program text in context where they would be used as a numeric literal. In fact, Python *does* consider it a YAGNI for Han! Although my apartment number would be written "???" on a New Year card, Python won't parse it as 704: unicodedata considers those digits to be Lo, except for "?" which fails anyway because it's Nl, not Nd. (To add insult to injury, it doesn't even return numeric values for those characters, even though any Han-user would consider them numeric when used in isolation, except that Japanese would be likely to consider "?" to be the non-numeric "maru" symbol, ie, circle, meaning "OK"!) 
The whole concept of numeric in Unicode is a mess; why import that mess into Python? Can you give any examples where people do computation, keep books, or do nuclear physics in non-Arabic numerals? I suppose Arabic users might, but even there I suspect not. From stephen at xemacs.org Tue Nov 30 05:39:21 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 30 Nov 2010 13:39:21 +0900 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF4248B.1060409@pearwood.info> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <4CF4248B.1060409@pearwood.info> Message-ID: <87bp575ssm.fsf@uwakimon.sk.tsukuba.ac.jp> Steven D'Aprano writes: > But in any case, please don't conflate the question of whether Python > should accept j and/or i for complex numbers with the question of > supporting non-arabic numerals. The two issues are unrelated. Different, yes, unrelated, no. They're both about whether variant forms of universally used literals should be allowed in a programming language, or whether only the canonical form is allowed. Note that *nobody* is saying that Python should have no facility for parsing these numbers, only that by default literal decimal numerals should be encoded as ASCII digits. For example, I would not object to int() getting a Boolean flag meaning "consult unicodedata for non-ASCII digits", just as it has an optional parameter meaning "decode in base other than 10".[1] OTOH, until somebody says "Yes, in Mecca the bazaar traders keep books on their Lenovos using ISO-8859-6 numerals, and it would be painful for them to switch to what we call 'Arabic' numerals", I'm going to consider it a YAGNI. Just as even though mathematicians clearly prefer "i" as the imaginary unit, there's not enough pain involved in them switching to "j" to make it worth supporting both. 
(BTW, my first reaction to the "j" notation was "cool, Python supports quaternions out of the box!" It took only a second or so to return to reality, but that was my first reaction.)

Footnotes:
[1] That might not be a good idea on other grounds, but in principle I would be OK with such built-ins accepting non-ASCII digits on request.

From merwok at netwok.org Tue Nov 30 07:33:51 2010
From: merwok at netwok.org (Éric Araujo)
Date: Tue, 30 Nov 2010 07:33:51 +0100
Subject: [Python-Dev] PEP 291 versus Python 3
Message-ID: <4CF49ACF.6070904@netwok.org>

Good morning python-dev,

PEP 291 (Backward Compatibility for Standard Library) does not seem to take Python 3 into account. Is this PEP only relevant for the 2.7 branch?* If it's supposed to apply to 3.x too, despite the view that 3.0 was a clean break, what does it mean to have a module that is developed in the py3k branch and should retain compatibility with 2.3 or 1.5.2?

* Tarek's interpretation: "The 2.x needs to stay 2.3 compatible so we should keep the 3.x as similar as possible for bugfixes."

In the particular case of distutils (should be compatible with 2.3), we (including me) have been lax. Our tests for example use modern unittest features like skips, which makes them not runnable on old Pythons. I am very uncomfortable with code that seems to run fine but whose tests (however few) cannot be run, so I think I'll have to trade the skips for old-style "return" statements. The other way of solving that is to change the compat policy. If I remember correctly, the rationale for code compat in distutils is that people may copy distutils from Python x.y to their install of x.y-n; I don't know if this is still an active practice, and if it is, I don't know if it should be supported, considering that distutils2 (compatible with 2.4+ and available from PyPI) is coming.
Regards

From regebro at gmail.com Tue Nov 30 09:10:37 2010
From: regebro at gmail.com (Lennart Regebro)
Date: Tue, 30 Nov 2010 09:10:37 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To:
References:
Message-ID:

On Sun, Nov 28, 2010 at 21:24, Alexander Belopolsky wrote:
> While we have little choice but to follow UCD in defining
> str.isidentifier(), I think Python can promise users more stability in
> what it treats as space or as a digit in its builtins.

Why? I can see this is a problem if one character that earlier was allowed no longer is. That breaks backwards compatibility. This doesn't.

>>>> float('????.??')
> 1234.56
>
> is more important than to assure users that once their program
> accepted some text as a number, they can assume that the text is
> ASCII.

*I* think it is more important. In Python 3, you can never ever assume anything is ASCII any more. ASCII is practically dead and buried as far as Python goes, unless you explicitly encode to it.

> def deposit(self, amountstr):
>     self.balance += float(amountstr)
>     audit_log("Deposited: " + amountstr)
>
> Auditor:
>
> $ cat numbered-account.log
> Deposited: ?????.??

That log reasonably should be in UTF-8 or something else, in which case this is not a problem. And that's ignoring that it makes way more sense to log the numerical amount.
--
Lennart Regebro: http://regebro.wordpress.com/
Python 3 Porting: http://python3porting.com/
+33 661 58 14 64

From hagen at zhuliguan.net Tue Nov 30 09:15:54 2010
From: hagen at zhuliguan.net (Hagen Fürstenau)
Date: Tue, 30 Nov 2010 09:15:54 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <4CF41785.5020807@v.loewis.de>
References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> <4CF41785.5020807@v.loewis.de>
Message-ID:

>> During PEP 3003 discussion, it was suggested to handle it on a case by
>> case basis, but I don't see discussion of the upgrade to 6.0.0 in PEP
>> 3003.
>
> It's covered by "As the standard library is not directly tied to the
> language definition it is not covered by this moratorium."

How is this restricted to the stdlib if it defines the set of valid identifiers?

- Hagen

From stephen at xemacs.org Tue Nov 30 09:23:10 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 30 Nov 2010 17:23:10 +0900
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To:
References:
Message-ID: <87wrnv43v5.fsf@uwakimon.sk.tsukuba.ac.jp>

Lennart Regebro writes:

 > *I* think it is more important. In python 3, you can never ever assume
 > anything is ASCII any more.

Sure you can. In Python program text, all keywords will be ASCII (English, even, though it may be en_NL.UTF-8) for the foreseeable future.

I see no reason not to make a similar promise for numeric literals. I see no good reason to allow compatibility full-width Japanese "ASCII" numerals or Arabic cursive numerals in "for i in range(...)" for example.

As soon as somebody gives an example of a culture, however minor, that uses computers but actively prefers to use non-ASCII numerals to express numbers in an IT context, I'll review my thinking. But at the moment it's 101% YAGNI.
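
The difference between digits in program text (literals) and digits in a string handed to a constructor, which the messages above keep returning to, can be sketched roughly as follows. This is only an illustration, not something posted in the thread: it assumes a current CPython 3, the Arabic-Indic digits are an arbitrary choice, and the names `digits`, `value` and `literal_accepted` are made up for the example.

```python
# Arabic-Indic digits for 1, 2, 3 (Unicode category Nd), written as escapes
# so that the example source itself stays pure ASCII.
digits = "\u0661\u0662\u0663"

# Passed to the constructor as a *string*, the digits are accepted.
value = int(digits)

# Used as a *literal* in program text, they are rejected by the compiler.
try:
    compile("x = " + digits, "<example>", "exec")
except SyntaxError:
    literal_accepted = False
else:
    literal_accepted = True

print(value, literal_accepted)
```

So the question debated in the thread concerns only what int(), float() and complex() accept at run time, not what the tokenizer accepts in source code.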
From sylvain.thenault at logilab.fr Tue Nov 30 09:34:18 2010 From: sylvain.thenault at logilab.fr (Sylvain =?utf-8?B?VGjDqW5hdWx0?=) Date: Tue, 30 Nov 2010 09:34:18 +0100 Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError In-Reply-To: References: <201011251530.23947.emile.anclin@logilab> <4CEE9B72.1070002@ronadam.com> <20101129115311.GD18888@lupus.logilab.fr> Message-ID: <20101130083418.GB4157@lupus.logilab.fr> On 29 novembre 14:21, Ron Adam wrote: > On 11/29/2010 01:22 PM, Brett Cannon wrote: > >Considering these semantics changed between Python 2 and 3 w/o a > >discernable benefit (I would consider it a negative as finding a > >module should not be impacted by syntactic correctness; the full act > >of importing should be the only thing that cares about that), I would > >consider it a bug that should be filed. > > The output of imp.find_module() returns an open file io object, and > it's output feeds directly into to imp.load_module(). > > >>> imp.find_module('pydoc') > (<_io.TextIOWrapper name=4 encoding='utf-8'>, > '/usr/local/lib/python3.2/pydoc.py', ('.py', 'U', 1)) > > So I think the imp.find_module() is suppose to be used when you *do* > want to do the full act of importing and not for just finding out if > or where module xyz exists. in python 2, find_module was usable for such usage, and this is a needed api for a tool like pylint. Is there another way to do so with python 3? -- Sylvain Th?nault LOGILAB, Paris (France) Formations Python, Debian, M?th. 
Agiles: http://www.logilab.fr/formations D?veloppement logiciel sur mesure: http://www.logilab.fr/services CubicWeb, the semantic web framework: http://www.cubicweb.org From cornsea at gmail.com Tue Nov 30 09:41:19 2010 From: cornsea at gmail.com (haiyang kang) Date: Tue, 30 Nov 2010 16:41:19 +0800 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <87wrnv43v5.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87wrnv43v5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: hi, I agree with this. I never seen any man in China using chinese number literals (at least two kinds:?, ?, same meaning with 1) in Python program, except UI output. They can do some mappings when want to output these non-ascii numbers. Example: if 1: print "?" I think it is a little ugly to have code like this: num = float("?.?"), expected result is: num = 1.1 br, khy On Tue, Nov 30, 2010 at 4:23 PM, Stephen J. Turnbull wrote: > Lennart Regebro writes: > > ?> *I* think it is more important. In python 3, you can never ever assume > ?> anything is ASCII any more. > > Sure you can. ?In Python program text, all keywords will be ASCII > (English, even, though it may be en_NL.UTF-8) for the forseeable > future. > > I see no reason not to make a similar promise for numeric literals. ?I > see no good reason to allow compatibility full-width Japanese "ASCII" > numerals or Arabic cursive numerals in "for i in range(...)" for > example. > > As soon as somebody gives an example of a culture, however minor, that > uses computers but actively prefers to use non-ASCII numerals to > express numbers in an IT context, I'll review my thinking. ?But at the > moment it's 101% YAGNI. 
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/cornsea%40gmail.com > From ziade.tarek at gmail.com Tue Nov 30 10:14:20 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Tue, 30 Nov 2010 10:14:20 +0100 Subject: [Python-Dev] PEP 291 versus Python 3 In-Reply-To: <4CF49ACF.6070904@netwok.org> References: <4CF49ACF.6070904@netwok.org> Message-ID: On Tue, Nov 30, 2010 at 7:33 AM, ?ric Araujo wrote: > Good morning python-dev, > > PEP 291 (Backward Compatibility for Standard Library) does not seem to > take Python 3 into account. ?Is this PEP only relevant for the 2.7 > branch?* ?If it?s supposed to apply to 3.x too, despite the view that > 3.0 was a clean break, what does it mean to have a module that is > developed in the py3k branch and should retain compatibility with 2.3 or > 1.5.2? > > * Tarek?s interpretation: ?The 2.x needs to stay 2.3 compatible > ?so we should keep the 3.x as similar as possible for bugfixes.? > > In the particular case of distutils (should be compatible with 2.3), we > (including I) have been lax. ?Our tests for example use modern unittest > features like skips, which makes them not runnable on old Pythons. ?I am > very uncomfortable with code that seems to run fine but which tests > (however few) cannot be run, so I think I?ll have to trade the skips for > old-style ?return? statements. You shouldn't be uncomfortable with the current state of distutils and try to improve its tests (or improve any other nasty stuff you'll find in that code) Distutils is dead code. All we have to do is the bare minimum maintenance. Everything else is a waste of time. >?The other way of solving that is to > change the compat policy. 
> If I remember correctly, the rationale for
> code compat in distutils is that people may copy distutils from Python
> x.y to their install of x.y-n; I don't know if this is still an active
> practice, and if it is, I don't know if it should be supported,
> considering that distutils2 (compatible with 2.4+ and available from
> PyPI) is coming.

Again, don't worry about these rules in Distutils now. The only rule that now applies to Distutils is that we do only bug fixing, and we should not waste our precious time doing other stuff in there. Plain Python tests are fine for what we want to do and simplify our forward ports and backports.

One thing we should do, though, is fix those bugs in Distutils2 first when they exist there too.

I really appreciate all the hard work you are doing in triaging the issues and bug fixing, by the way!

Tarek

From emile.anclin at logilab.fr Tue Nov 30 10:39:29 2010
From: emile.anclin at logilab.fr (Emile Anclin)
Date: Tue, 30 Nov 2010 10:39:29 +0100
Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError
In-Reply-To:
References: <201011251530.23947.emile.anclin@logilab> <20101129115311.GD18888@lupus.logilab.fr>
Message-ID: <201011301039.30033.emile.anclin@logilab>

On Monday 29 November 2010 20:22:22 Brett Cannon wrote:
>
> Considering these semantics changed between Python 2 and 3 w/o a
> discernable benefit (I would consider it a negative as finding a
> module should not be impacted by syntactic correctness; the full act
> of importing should be the only thing that cares about that), I would
> consider it a bug that should be filed.

OK, here it is: http://bugs.python.org/issue10588

Since I did not understand all of it, I just quoted Brett Cannon in the ticket.
-- Emile Anclin http://www.logilab.fr/ http://www.logilab.org/ Informatique scientifique & et gestion de connaissances From steve at pearwood.info Tue Nov 30 13:59:49 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 30 Nov 2010 23:59:49 +1100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <87wrnv43v5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CF4F545.5030902@pearwood.info> haiyang kang wrote: > hi, > > I agree with this. > > I never seen any man in China using chinese number literals (at > least two kinds:?, ?, same meaning with 1) > in Python program, except UI output. > > They can do some mappings when want to output these non-ascii numbers. > Example: if 1: print "?" > > I think it is a little ugly to have code like this: num = > float("?.?"), expected result is: num = 1.1 I don't expect that anyone would sensibly write code like that, except for testing. You wouldn't write num = float("1.1") instead of just num = 1.1 either. But you should be able to write: text = input("Enter a number using your preferred digits: ") num = float(text) without caring whether the user enters ?.? or 1.1 or something else. -- Steven From fuzzyman at voidspace.org.uk Tue Nov 30 14:09:16 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 30 Nov 2010 13:09:16 +0000 Subject: [Python-Dev] PEP 291 versus Python 3 In-Reply-To: <4CF49ACF.6070904@netwok.org> References: <4CF49ACF.6070904@netwok.org> Message-ID: <4CF4F77C.4000308@voidspace.org.uk> On 30/11/2010 06:33, ?ric Araujo wrote: > Good morning python-dev, > > PEP 291 (Backward Compatibility for Standard Library) does not seem to > take Python 3 into account. Is this PEP only relevant for the 2.7 > branch?* If it?s supposed to apply to 3.x too, despite the view that > 3.0 was a clean break, what does it mean to have a module that is > developed in the py3k branch and should retain compatibility with 2.3 or > 1.5.2? 
PEP 291 is very old and should probably be retired. I don't think anyone is maintaining standard libraries in py3k that are also compatible with Python 2.anything. (At least not in a single codebase.) For Python 2.7 that may not be true, but for Python 3 I think we can start with a clean slate on compatibility. > * Tarek?s interpretation: ?The 2.x needs to stay 2.3 compatible > so we should keep the 3.x as similar as possible for bugfixes.? > > In the particular case of distutils (should be compatible with 2.3), we > (including I) have been lax. Our tests for example use modern unittest > features like skips, which makes them not runnable on old Pythons. They can be run on old Pythons with unittest2. This is what distutils2 is doing. > I am > very uncomfortable with code that seems to run fine but which tests > (however few) cannot be run, so I think I?ll have to trade the skips for > old-style ?return? statements. The other way of solving that is to > change the compat policy. This is only an issue for distutils in Python 2.7 right? Maintaining the compat policy for that will be a short-lived pain, and distutils itself is getting only infrequent bugfixes *anyway*, right? I defer to Tarek on that particular decision. All the best, Michael > If I remember correctly, the rationale for > code compat in distutils is that people may copy distutils from Python > x.y to their install of x.y-n; I don?t know if this is still an active > practice, and if it is, I don?t know if it should be supported, > considering that distutils2 (compatible with 2.4+ and available from > PyPI) is coming. > > Regards > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. 
By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.

From steve at pearwood.info Tue Nov 30 14:23:22 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 01 Dec 2010 00:23:22 +1100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <87wrnv43v5.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <87wrnv43v5.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4CF4FACA.8040900@pearwood.info>

Stephen J. Turnbull wrote:
> Lennart Regebro writes:
>
>  > *I* think it is more important. In python 3, you can never ever assume
>  > anything is ASCII any more.
>
> Sure you can. In Python program text, all keywords will be ASCII
> (English, even, though it may be en_NL.UTF-8) for the foreseeable
> future.
>
> I see no reason not to make a similar promise for numeric literals. I
> see no good reason to allow compatibility full-width Japanese "ASCII"
> numerals or Arabic cursive numerals in "for i in range(...)" for
> example.

I agree with you that numeric *literals* should be restricted to the ASCII digits. I don't think anyone here is arguing differently -- if they are, they should speak up and try to make the case for allowing numeric literals in arbitrary scripts.

Python doesn't currently allow non-ASCII numeric literals, and even if such a change were desirable, it would run up against the moratorium. So let's just forget the specter of code like:

x = math.sqrt(????.?? ** ?.?)
It ain't gonna happen :) But I think there is a good case for allowing the constructors int, float and complex to continue to accept numeric *strings* with non-ASCII digits. The code already exists, there's probably people out there who rely on it, and in the absence of any convincing demonstration that the existing behaviour is causing widespread difficulty, we should leave well-enough alone. Various people have suggested that there should be a function in the locale module that handles numeric string input in non-ASCII digits. This is a de facto admission that there are use-cases for taking user input like the string '?' and turning it into the int 3. Python can already do this, and has been able to for many years: [steve at sylar ~]$ python2.4 Python 2.4.6 (#1, Mar 30 2009, 10:08:01) [GCC 4.1.2 20070925 (Red Hat 4.1.2-27)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> int(u'?') 3 It seems to me that there's no need to move this functionality into locale. -- Steven From solipsis at pitrou.net Tue Nov 30 14:32:54 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 30 Nov 2010 14:32:54 +0100 Subject: [Python-Dev] Python and the Unicode Character Database References: <87wrnv43v5.fsf@uwakimon.sk.tsukuba.ac.jp> <4CF4FACA.8040900@pearwood.info> Message-ID: <20101130143254.1964e4a8@pitrou.net> On Wed, 01 Dec 2010 00:23:22 +1100 Steven D'Aprano wrote: > > But I think there is a good case for allowing the constructors int, > float and complex to continue to accept numeric *strings* with non-ASCII > digits. The code already exists, there's probably people out there who > rely on it, and in the absence of any convincing demonstration that the > existing behaviour is causing widespread difficulty, we should leave > well-enough alone. +1 > It seems to me that there's no need to move this functionality into locale. Not only, but moving it into locale won't make it easier to maintain anyway. Regards Antoine. 
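
As a rough illustration of the two positions argued above (not code from the thread), the following sketch contrasts a helper that normalises any Unicode decimal digit to ASCII, which is approximately what the int() constructor is described as doing, with a strict parser that accepts ASCII digits only. The function names `to_ascii_digits` and `parse_ascii_int` are hypothetical, and the full-width digits are chosen only as an example; the original message's digit was lost in the archive.

```python
import unicodedata


def to_ascii_digits(text):
    """Replace every Unicode decimal digit (category Nd) by its ASCII form."""
    out = []
    for ch in text:
        # unicodedata.decimal() returns the default (None here) for
        # characters that are not decimal digits.
        value = unicodedata.decimal(ch, None)
        out.append(ch if value is None else str(value))
    return "".join(out)


def parse_ascii_int(text):
    """Strict variant: accept only ASCII digits with an optional sign."""
    body = text.strip()
    if body[:1] in "+-":
        body = body[1:]
    if not body or any(ch not in "0123456789" for ch in body):
        raise ValueError("not an ASCII integer: %r" % (text,))
    return int(text)


fullwidth = "\uff11\uff12\uff13"  # full-width 1, 2, 3
print(to_ascii_digits(fullwidth), int(fullwidth), parse_ascii_int("-42"))
```

An application that wants the stricter behaviour can therefore get it with a small wrapper like the second function, without any change to the built-in constructors.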
From solipsis at pitrou.net Tue Nov 30 14:38:22 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 30 Nov 2010 14:38:22 +0100
Subject: [Python-Dev] Module size
References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <4CF4248B.1060409@pearwood.info>
Message-ID: <20101130143822.40a827de@pitrou.net>

On Mon, 29 Nov 2010 22:46:33 -0500 Alexander Belopolsky wrote:
>
> In practical terms, UCD comes at a price. The unicodedata module size
> is over 700K on my machine. This is almost half the size of the
> python executable and by far the largest extension module. (only CJK
> encodings come close.) Making builtins depend on the largest
> extension module for operation does not strike me as sound design.

Well, do they depend on it? _PyUnicode_EncodeDecimal seems to depend only on Objects/unicodectype.c.

$ size Objects/unicode*.o
   text    data     bss     dec     hex filename
  60398       0       0   60398    ebee Objects/unicodectype.o
 130440   13559    2208  146207   23b1f Objects/unicodeobject.o

Antoine.

From alexander.belopolsky at gmail.com Tue Nov 30 15:18:13 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 30 Nov 2010 09:18:13 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <4CF4F545.5030902@pearwood.info>
References: <87wrnv43v5.fsf@uwakimon.sk.tsukuba.ac.jp> <4CF4F545.5030902@pearwood.info>
Message-ID:

On Tue, Nov 30, 2010 at 7:59 AM, Steven D'Aprano wrote:
..
> But you should be able to write:
>
> text = input("Enter a number using your preferred digits: ")
> num = float(text)
>
> without caring whether the user enters 一.一 or 1.1 or something else.
>

I find it ironic that people who argue for preservation of the current behavior do it without checking what it actually is:

>>> float('一.一')
..
UnicodeEncodeError: 'decimal' codec can't encode character '\u4e00' ..

This is one of the biggest problems with this feature.
It does not fit user's expectations. Even the original author of the decimal "codec" expected the above to work. [1] > Python can already do this, and has been able to for many years: > >>> int(u'?') > 3 but you can do this without support from int() as well: >>> import unicodedata >>> unicodedata.digit('?') 3 and for Unihan numbers, you can do >>> unicodedata.numeric('?') 1.0 and >>> unicodedata.numeric('?') 8.0 and if you are so inclined, >>> [unicodedata.numeric(c) for c in "? ? ? ? ?".split()] [10000.0, 5000.0, 0.6, 0.875, 90000.0] Do you want to see all these supported by float()? [1] "makeunicodedata.py does not support Unihan digit data" http://bugs.python.org/issue10575 From alexander.belopolsky at gmail.com Tue Nov 30 15:32:38 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 30 Nov 2010 09:32:38 -0500 Subject: [Python-Dev] Module size In-Reply-To: <20101130143822.40a827de@pitrou.net> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <4CF4248B.1060409@pearwood.info> <20101130143822.40a827de@pitrou.net> Message-ID: On Tue, Nov 30, 2010 at 8:38 AM, Antoine Pitrou wrote: > On Mon, 29 Nov 2010 22:46:33 -0500 > Alexander Belopolsky wrote: >> >> In practical terms, UCD comes at a price. ?The unicodedata module size >> is over 700K on my machine. ?This is almost half the size of the >> python executable and by far the largest extension module. (only CJK >> encodings come close.) ?Making builtins depend on the largest >> extension module for operation does not strike me as sound design. > > Well, do they depend on it? _PyUnicode_EncodeDecimal seems to depend > only on Objects/unicodectype.c. My mistake. That was a late night post. I wonder why unicodedata.so is so big then. 
It must be character names: $ python -v >>> '\N{DIGIT ONE}' dlopen("/.../unicodedata.so", 2); import unicodedata # dynamically loaded from /.../unicodedata.so '1' From solipsis at pitrou.net Tue Nov 30 15:41:48 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 30 Nov 2010 15:41:48 +0100 Subject: [Python-Dev] Module size In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <4CF4248B.1060409@pearwood.info> <20101130143822.40a827de@pitrou.net> Message-ID: <1291128108.3538.10.camel@localhost.localdomain> Le mardi 30 novembre 2010 ? 09:32 -0500, Alexander Belopolsky a ?crit : > On Tue, Nov 30, 2010 at 8:38 AM, Antoine Pitrou wrote: > > On Mon, 29 Nov 2010 22:46:33 -0500 > > Alexander Belopolsky wrote: > >> > >> In practical terms, UCD comes at a price. The unicodedata module size > >> is over 700K on my machine. This is almost half the size of the > >> python executable and by far the largest extension module. (only CJK > >> encodings come close.) Making builtins depend on the largest > >> extension module for operation does not strike me as sound design. > > > > Well, do they depend on it? _PyUnicode_EncodeDecimal seems to depend > > only on Objects/unicodectype.c. > > My mistake. That was a late night post. I wonder why unicodedata.so > is so big then. > > It must be character names: > > $ python -v > >>> '\N{DIGIT ONE}' > dlopen("/.../unicodedata.so", 2); > import unicodedata # dynamically loaded from /.../unicodedata.so > '1' From a quick peek using hexdump, character names seem to only account for 1/4 of the module size. That said, I don't think the size is very important. For any non-trivial Python application, the size of unicodedata will be negligible compared to the size of Python objects. Regards Antoine. 
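
For readers unfamiliar with the name lookup mentioned in this sub-thread, here is a small sketch (again an illustration, not part of the archived messages) of the name-related part of the unicodedata API. It assumes a standard CPython 3 build that includes the module; the variable names are invented for the example. It says nothing about how much of the module's size the names table actually accounts for, which the thread leaves open.

```python
import unicodedata

# The \N{...} escape and unicodedata.lookup() both resolve a character name.
one = "\N{DIGIT ONE}"
same = unicodedata.lookup("DIGIT ONE")

# The reverse direction: from a character to its name.
name = unicodedata.name("1")

# Version of the Unicode Character Database this interpreter was built with.
ucd_version = unicodedata.unidata_version

print(one == same, name, ucd_version)
```

A build that drops unicodedata, as described for embedded use in the following message, would lose these lookups.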
From tlesher at gmail.com Tue Nov 30 15:48:32 2010
From: tlesher at gmail.com (Tim Lesher)
Date: Tue, 30 Nov 2010 09:48:32 -0500
Subject: [Python-Dev] Module size
In-Reply-To: <1291128108.3538.10.camel@localhost.localdomain>
References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <4CF4248B.1060409@pearwood.info> <20101130143822.40a827de@pitrou.net> <1291128108.3538.10.camel@localhost.localdomain>
Message-ID:

On Tue, Nov 30, 2010 at 09:41, Antoine Pitrou wrote:
> That said, I don't think the size is very important. For any non-trivial
> Python application, the size of unicodedata will be negligible compared
> to the size of Python objects.

That depends very much on the platform and the application. For our embedded use of Python, static data size (like the text segment of a shared object) is far dearer than the heap space used by Python objects, which is why we've had to excise both the UCD and the CJK codecs in our builds.

--
Tim Lesher

From cornsea at gmail.com Tue Nov 30 15:56:33 2010
From: cornsea at gmail.com (haiyang kang)
Date: Tue, 30 Nov 2010 22:56:33 +0800
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <4CF4F545.5030902@pearwood.info>
References: <87wrnv43v5.fsf@uwakimon.sk.tsukuba.ac.jp> <4CF4F545.5030902@pearwood.info>
Message-ID:

> But you should be able to write:
>
> text = input("Enter a number using your preferred digits: ")
> num = float(text)
>
> without caring whether the user enters 一.一 or 1.1 or something else.

Yes, from a logical point of view this can happen.

But I really doubt that there are users who would want to enter a number like that. It would mean that they first use the Google Pinyin input method to type 一, then switch to the English input method to type ".", then switch back to Google Pinyin for the other 一; or maybe you mean they type the whole string 一.一 with the Google Pinyin input method.
To input 1, users only need to type one time keyboard, but to input ?, they need to type three times (yi SPACE). Of course, users can also input something accidentally, but we just need to give them some kind reminders. At least coders in my around will restrain their system users to input numbers with ASCII, and seems that users are still happy with the ASCII type numbers :). br, khy From alexander.belopolsky at gmail.com Tue Nov 30 16:05:42 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 30 Nov 2010 10:05:42 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF41785.5020807@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> <4CF41785.5020807@v.loewis.de> Message-ID: On Mon, Nov 29, 2010 at 4:13 PM, "Martin v. L?wis" wrote: >> - Should Python documentation refer to the specific version of Unicode >> that it supports? > > You mean, mention it somewhere? Sure (although it would be nice if the > documentation generator would automatically extract it from the source, > just as it extracts the Python version number). > > Of course, such mentioning should explain that this is specific to > CPython, and not an aspect of Python-the-language. > >> Current documentation refers to old versions. ?Should version be >> updated or removed to imply the latest? > > What specific reference are you referring to? > I found two places: A reference to Unicode 3.0 (!) in the Data Model section and a reference to 5.2.0 in unicodedata docs. See http://mail.python.org/pipermail/docs/2010-November/002074.html >> - How UCD updates should be handled during the language moratorium? > > It's clearly not affected. 
> This is not what Guido said last year: """ > One question: > > There are currently number of patch waiting on the tracker for > additional Unicode feature support and it's also likely that we'll > want to upgrade to a more recent Unicode version within the > next few years. > > How would such indirect changes be seen under the moratorium ? That would fall under the Case-by-Case Exemptions section. "Within the next few years" sounds like it might well wait until the moratorium is ended though. :-) """ http://mail.python.org/pipermail/python-dev/2009-November/093666.html I don't see it as a big deal, but technically speaking, with Unicode 6.0 changing properties of two characters to become identifiers Python language definition is affected. For example, an alternative implementation based on 5.2.0 will not accept a valid CPython program that uses one of these characters. >> During PEP 3003 discussion, it was suggested to handle it on a case by >> case basis, but I don't see discussion of the upgrade to 6.0.0 in PEP >> 3003. > > It's covered by "As the standard library is not directly tied to the > language definition it is not covered by this moratorium." > See above. Also, it has been suggested that semantics of built-ins cannot change. (If that was so, it would put int('????') debate to rest at least for the time being.:-) >> ?Should this upgrade be backported to 2.7? > > No, it's a new feature. > Given that 2.7 will be maintained for 5 years and arguably Unicode Consortium takes backward compatibility very seriously, wouldn't it make sense to consider a backport at some point? I am sure we will soon see a bug report that the following does not work in 2.7: :-) >>> ord('\N{CAT FACE WITH WRY SMILE}') 128572 >> - How specific should library reference manual be in defining methods >> affected by UCD such as str.upper()? 
> > It should specify what this actually does in Unicode terminology > (probably in addition to a layman's rephrase of that) > I opened an issue for this: http://bugs.python.org/issue10587 >> .. For example, if '\UXXXXXXXX'.isalpha() returns true >> in one implementation, can it return false in another? > > Implementations are free to use any version of the UCD. I was more concerned about wide an narrow unicode CPython builds. Is it a bug that '\UXXXXXXXX'.isalpha() may disagree even when the two implementations are based on the same version of UCD? Thanks for your answers. From alexander.belopolsky at gmail.com Tue Nov 30 16:11:24 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 30 Nov 2010 10:11:24 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <87wrnv43v5.fsf@uwakimon.sk.tsukuba.ac.jp> <4CF4F545.5030902@pearwood.info> Message-ID: On Tue, Nov 30, 2010 at 9:56 AM, haiyang kang wrote: >> But you should be able to write: >> >> text = input("Enter a number using your preferred digits: ") >> num = float(text) >> >> without caring whether the user enters ?.? or 1.1 or something else. > > yes. from logical point of view, this can happen. ... Please stop discussing a non-feature. Python's float *does not* accept ' ?.?'. This was reported as a bug and closed as invalid. See "makeunicodedata.py does not support Unihan digit data" http://bugs.python.org/issue10575 From barry at python.org Tue Nov 30 16:35:31 2010 From: barry at python.org (Barry Warsaw) Date: Tue, 30 Nov 2010 10:35:31 -0500 Subject: [Python-Dev] PEP 291 versus Python 3 In-Reply-To: <4CF4F77C.4000308@voidspace.org.uk> References: <4CF49ACF.6070904@netwok.org> <4CF4F77C.4000308@voidspace.org.uk> Message-ID: <20101130103531.54d79465@mission> On Nov 30, 2010, at 01:09 PM, Michael Foord wrote: >PEP 291 is very old and should probably be retired. 
I don't think anyone is >maintaining standard libraries in py3k that are also compatible with Python >2.anything. (At least not in a single codebase.) I agree. I think we should change the status of PEP 291 to Final, and add a few words to make it clear it applies only to Python 2. Since Neal owns the PEP, he should get first crack at doing the update, but I volunteer to make those changes if he declines (or does not respond). We may eventually need a similar document for Python 3, but it should be a new PEP. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From stefan-usenet at bytereef.org Tue Nov 30 16:55:19 2010 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Tue, 30 Nov 2010 16:55:19 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <87wrnv43v5.fsf@uwakimon.sk.tsukuba.ac.jp> <4CF4F545.5030902@pearwood.info> Message-ID: <20101130155519.GA23354@yoda.bytereef.org> Alexander Belopolsky wrote: > On Tue, Nov 30, 2010 at 9:56 AM, haiyang kang wrote: > >> But you should be able to write: > >> > >> text = input("Enter a number using your preferred digits: ") > >> num = float(text) > >> > >> without caring whether the user enters ?.? or 1.1 or something else. > > > > yes. from logical point of view, this can happen. ... > > Please stop discussing a non-feature. Python's float *does not* > accept ' ?.?'. This was reported as a bug and closed as invalid. That seems irrelevant to me. One of the main topics of this thread is whether actual native speakers would be happy with ascii-only input for float(). haiyang kang confirmed that this is the case. I hope that more local speakers will contribute their views. 
Stefan Krah From alexander.belopolsky at gmail.com Tue Nov 30 17:40:19 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 30 Nov 2010 11:40:19 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> Message-ID: On Mon, Nov 29, 2010 at 2:38 PM, Alexander Belopolsky wrote: .. >> Still, if it's not detrimental and it it's not difficult to support, >> then why do you care? > > It is difficult to support. ?A fix for issue10557 would be much > simpler if we did not support non-European digits. ?I now added a > patch that handles non-ascii digits, so you can see what's involved. > Note that when Unicode Consortium inevitably adds more Nd characters > to the non-BMP planes, we will have to add surrogate pairs' support to > this code. > It turns out that this did in fact happen: # Newly assigned in Unicode 3.1.0 (March, 2001) .. 1D7CE..1D7FF ; 3.1 # [50] MATHEMATICAL BOLD DIGIT ZERO..MATHEMATICAL MONOSPACE DIGIT NINE See http://unicode.org/Public/UNIDATA/DerivedAge.txt And of course, >>> unicodedata.digit('\U0001D7CE') 0 but >>> int('\U0001D7CE') .. UnicodeEncodeError: 'decimal' codec can't encode character '\ud835' .. on a narrow Unicode build. (Note the character reported in the error message!) 
If you think non-ASCII digits are not difficult to support, please contribute to the following tracker issues: http://bugs.python.org/issue10581 (Review and document string format accepted in numeric data type constructors) http://bugs.python.org/issue10557 (Malformed error message from float()) http://bugs.python.org/issue10435 (Document unicode C-API in reST - Specifically, PyUnicode_EncodeDecimal) http://bugs.python.org/issue8646 (PyUnicode_EncodeDecimal is undocumented) http://bugs.python.org/issue6632 (Include more fullwidth chars in the decimal codec) and back to the issue of user confusion http://bugs.python.org/issue652104 [closed/invalid] (int(u"\u1234") raises UnicodeEncodeError by Guido van Rossum) From fuzzyman at voidspace.org.uk Tue Nov 30 18:40:52 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 30 Nov 2010 17:40:52 +0000 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> Message-ID: <4CF53724.8090000@voidspace.org.uk> On 30/11/2010 16:40, Alexander Belopolsky wrote: > [snip...] > And of course, > >>>> unicodedata.digit('\U0001D7CE') > 0 > > but > >>>> int('\U0001D7CE') > .. > UnicodeEncodeError: 'decimal' codec can't encode character '\ud835' .. > > on a narrow Unicode build. (Note the character reported in the error message!) > > > If you think non-ASCII digits are not difficult to support, please > contribute to the following tracker issues: > Would moving this functionality to the locale module make the issues any easier to fix? 
Michael > http://bugs.python.org/issue10581 > (Review and document string format accepted in numeric data type constructors) > > http://bugs.python.org/issue10557 > (Malformed error message from float()) > > http://bugs.python.org/issue10435 > (Document unicode C-API in reST - Specifically, PyUnicode_EncodeDecimal) > > http://bugs.python.org/issue8646 > (PyUnicode_EncodeDecimal is undocumented) > > http://bugs.python.org/issue6632 > (Include more fullwidth chars in the decimal codec) > > and back to the issue of user confusion > > http://bugs.python.org/issue652104 [closed/invalid] > (int(u"\u1234") raises UnicodeEncodeError by Guido van Rossum) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/fuzzyman%40voidspace.org.uk -- http://www.voidspace.org.uk/ READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. 
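The odd character in the quoted error message ('\ud835' rather than the digit itself) is what one would expect if a narrow build stores a non-BMP character as a UTF-16 surrogate pair. The following is only a rough sketch of that arithmetic; the helper name utf16_surrogates is made up for illustration and is not part of CPython or of any patch mentioned in this thread.

```python
def utf16_surrogates(ch):
    """Return the (high, low) UTF-16 surrogate code units for a non-BMP character.

    Hypothetical helper, written only to illustrate the arithmetic.
    """
    cp = ord(ch)
    if cp < 0x10000:
        raise ValueError("character is in the BMP and needs no surrogates")
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


# U+1D7CE is MATHEMATICAL BOLD DIGIT ZERO, the character from the thread.
high, low = utf16_surrogates("\U0001D7CE")
print(hex(high), hex(low))  # 0xd835 0xdfce
```

The high surrogate, 0xd835, is the code unit named in the UnicodeEncodeError above, which suggests the decimal codec was looking at one half of the pair rather than the whole character.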
From alexander.belopolsky at gmail.com Tue Nov 30 19:21:30 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 30 Nov 2010 13:21:30 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <4CF53724.8090000@voidspace.org.uk>
References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> <4CF53724.8090000@voidspace.org.uk>
Message-ID:

On Tue, Nov 30, 2010 at 12:40 PM, Michael Foord wrote:
..
>> If you think non-ASCII digits are not difficult to support, please
>> contribute to the following tracker issues:
>>
>
> Would moving this functionality to the locale module make the issues any
> easier to fix?
>

Sure, if we code it in Python, supporting it will be much easier:

def normalize_digits(s):
    digits = {m.group(1) for m in re.finditer('(\d)', s)}
    trtab = {ord(d): str(unicodedata.digit(d)) for d in digits}
    return s.translate(trtab)

>>> normalize_digits('????.??')
'1234.56'

I am not sure this belongs to the locale module, however. It seems to
me, something like 'unicodealgo' for unicode algorithms would be more
appropriate.
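For readers who want to try the normalize_digits idea from the message above, here is a self-contained variant. It is a sketch rather than the author's exact code: the imports, the raw-string pattern and the comments are additions, and the Arabic-Indic sample string is an assumption chosen for illustration, since the digits in the archived message did not survive encoding.

```python
import re
import unicodedata


def normalize_digits(s):
    """Return s with each Unicode decimal digit replaced by its ASCII digit."""
    # For str patterns in Python 3, \d matches any character in category Nd.
    digits = {m.group(0) for m in re.finditer(r"\d", s)}
    # Map each digit's code point to the ASCII digit with the same value.
    trtab = {ord(d): str(unicodedata.digit(d)) for d in digits}
    return s.translate(trtab)


if __name__ == "__main__":
    # Arabic-Indic digits one to six around an ASCII full stop
    # (illustrative input, not the string from the original message).
    sample = "\u0661\u0662\u0663\u0664.\u0665\u0666"
    print(normalize_digits(sample))         # 1234.56
    print(float(normalize_digits(sample)))  # 1234.56
```

Only the digits are translated; the decimal separator is left alone, so locale-specific separators would still need separate handling.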
From solipsis at pitrou.net Tue Nov 30 19:29:52 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 30 Nov 2010 19:29:52 +0100
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To:
References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> <4CF53724.8090000@voidspace.org.uk>
Message-ID: <1291141792.8628.0.camel@localhost.localdomain>

> Sure, if we code it in Python, supporting it will be much easier:
>
> def normalize_digits(s):
>     digits = {m.group(1) for m in re.finditer('(\d)', s)}
>     trtab = {ord(d): str(unicodedata.digit(d)) for d in digits}
>     return s.translate(trtab)
>
> >>> normalize_digits('????.??')
> '1234.56'
>
> I am not sure this belongs to the locale module, however. It seems to
> me, something like 'unicodealgo' for unicode algorithms would be more
> appropriate.

It could simply be in unicodedata if you split the implementation into a
core C part and some Python bits.

Regards

Antoine.

From alexander.belopolsky at gmail.com Tue Nov 30 19:59:29 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 30 Nov 2010 13:59:29 -0500
Subject: [Python-Dev] Python and the Unicode Character Database
In-Reply-To: <1291141792.8628.0.camel@localhost.localdomain>
References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> <4CF53724.8090000@voidspace.org.uk> <1291141792.8628.0.camel@localhost.localdomain>
Message-ID:

On Tue, Nov 30, 2010 at 1:29 PM, Antoine Pitrou wrote:
..
>> I am not sure this belongs to the locale module, however. It seems to
>> me, something like 'unicodealgo' for unicode algorithms would be more
>> appropriate.
>
> It could simply be in unicodedata if you split the implementation into a
> core C part and some Python bits.
> Splitting unicodedata may not be a bad idea. There are many more pieces in UCD than covered by unicodedata. [1] Hardcoding them all into unicodedata module is hard to justify, but some are quite useful. For example, PropertyValueAliases.txt is quite useful for those like myself who cannot remember what Pd or Zl category names stand for. SpecialCasing.txt is required for proper casing, but is not currently included in Python. I would not want to change str.upper or str.title because of this, but providing the raw info to someone who wants to implement proper case mappings may not be a bad idea. Blocks.txt is certainly useful for any language-dependent processing. On the other hand, I think we should keep Unicode data and Unicode algorithms separate. And the latter may not even belong to the Python stdlib. [1] http://unicode.org/Public/UNIDATA/ From martin at v.loewis.de Tue Nov 30 20:13:01 2010 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 30 Nov 2010 20:13:01 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> <4CF41785.5020807@v.loewis.de> Message-ID: <4CF54CBD.9030703@v.loewis.de> Am 30.11.2010 09:15, schrieb Hagen F?rstenau: >>> During PEP 3003 discussion, it was suggested to handle it on a case by >>> case basis, but I don't see discussion of the upgrade to 6.0.0 in PEP >>> 3003. >> >> It's covered by "As the standard library is not directly tied to the >> language definition it is not covered by this moratorium." > > How is this restricted to the stdlib if it defines the set of valid > identifiers? The language does not change. The language specification says Python 3.0 introduces additional characters from outside the ASCII range (see PEP 3131). 
For these characters, the classification uses the version of the Unicode Character Database as included in the unicodedata module. That remains unchanged. It was a deliberate design decision of PEP 3131 to not codify a fixed set of characters that can be used in identifiers. Regards, Martin From martin at v.loewis.de Tue Nov 30 20:16:49 2010 From: martin at v.loewis.de (=?windows-1252?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Tue, 30 Nov 2010 20:16:49 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF53724.8090000@voidspace.org.uk> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> <4CF53724.8090000@voidspace.org.uk> Message-ID: <4CF54DA1.5080900@v.loewis.de> > Would moving this functionality to the locale module make the issues any > easier to fix? You could delegate it to the C library, so: yes. Regards, Martin From solipsis at pitrou.net Tue Nov 30 20:23:13 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 30 Nov 2010 20:23:13 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF54DA1.5080900@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> <4CF53724.8090000@voidspace.org.uk> <4CF54DA1.5080900@v.loewis.de> Message-ID: <1291144993.8628.1.camel@localhost.localdomain> Le mardi 30 novembre 2010 ? 20:16 +0100, "Martin v. L?wis" a ?crit : > > Would moving this functionality to the locale module make the issues any > > easier to fix? > > You could delegate it to the C library, so: yes. I hope you don't suggest delegating it to the C locale functions. Do you? 
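Martin's point earlier in this exchange, that identifier classification follows whatever UCD version the unicodedata module ships, can be checked from Python itself. This is a minimal sketch, assuming a Python 3 interpreter; the sample strings are illustrative and do not come from the thread.

```python
import unicodedata

# Version of the Unicode Character Database compiled into this interpreter.
# The value differs between Python releases, so no output is shown for it.
print("UCD version:", unicodedata.unidata_version)

# Non-ASCII letters are accepted in identifiers (PEP 3131) ...
print("caf\u00e9".isidentifier())  # True
# ... while a leading ASCII digit is rejected.
print("1abc".isidentifier())       # False
```

Which characters beyond these uncontroversial cases count as identifier characters depends on the UCD version reported on the first line, which is the behaviour Martin describes.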
From martin at v.loewis.de Tue Nov 30 20:40:54 2010 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Tue, 30 Nov 2010 20:40:54 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <1291144993.8628.1.camel@localhost.localdomain> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> <4CF53724.8090000@voidspace.org.uk> <4CF54DA1.5080900@v.loewis.de> <1291144993.8628.1.camel@localhost.localdomain> Message-ID: <4CF55346.1040108@v.loewis.de> Am 30.11.2010 20:23, schrieb Antoine Pitrou: > Le mardi 30 novembre 2010 ? 20:16 +0100, "Martin v. L?wis" a ?crit : >>> Would moving this functionality to the locale module make the issues any >>> easier to fix? >> >> You could delegate it to the C library, so: yes. > > I hope you don't suggest delegating it to the C locale functions. > Do you? Yes, I do. Why do you hope I don't? Regards, Martin From brett at python.org Tue Nov 30 20:41:47 2010 From: brett at python.org (Brett Cannon) Date: Tue, 30 Nov 2010 11:41:47 -0800 Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError In-Reply-To: References: <201011251530.23947.emile.anclin@logilab> <4CEE9B72.1070002@ronadam.com> <20101129115311.GD18888@lupus.logilab.fr> Message-ID: On Mon, Nov 29, 2010 at 12:21, Ron Adam wrote: > > > On 11/29/2010 01:22 PM, Brett Cannon wrote: >> >> On Mon, Nov 29, 2010 at 03:53, Sylvain Th?nault >> ?wrote: >>> >>> On 25 novembre 11:22, Ron Adam wrote: >>>> >>>> On 11/25/2010 08:30 AM, Emile Anclin wrote: >>>>> >>>>> hello, >>>>> >>>>> working on Pylint, we have a lot of voluntary corrupted files to test >>>>> Pylint behavior; for instance >>>>> >>>>> $ cat /home/emile/var/pylint/test/input/func_unknown_encoding.py >>>>> # -*- coding: IBO-8859-1 -*- >>>>> """ check correct unknown encoding declaration >>>>> """ >>>>> >>>>> __revision__ = '????' 
>>>>> >>>>> >>>>> and we try to find that module : >>>>> find_module('func_unknown_encoding', None). But python3 raises >>>>> SyntaxError >>>>> in that case ; it didn't raise SyntaxError on python2 nor does so on >>>>> our >>>>> func_nonascii_noencoding and func_wrong_encoding modules (with obvious >>>>> names) >>>>> >>>>> Python 3.2a2 (r32a2:84522, Sep 14 2010, 15:22:36) >>>>> [GCC 4.3.4] on linux2 >>>>> Type "help", "copyright", "credits" or "license" for more information. >>>>>>> >>>>>>> >from imp import find_module >>>>>>>> >>>>>>>> find_module('func_unknown_encoding', None) >>>>> >>>>> Traceback (most recent call last): >>>>> ? File "", line 1, in >>>>> SyntaxError: encoding problem: with BOM >>>> >>>> I don't think there is a clear reason by design. ?Also try importing >>>> the same modules directly and noting the differences in the errors >>>> you get. >>> >>> IMO the point is that we can consider as a bug the fact that find_module >>> tries to somewhat read the content of the file, no? Though it seems to >>> only >>> doing this for encoding detection or like since find_module doesn't choke >>> on >>> a module containing another kind of syntax error. >>> >>> So the question is, should we deal with this in pylint/astng, or can we >>> expect >>> this to be fixed at some point? >> >> Considering these semantics changed between Python 2 and 3 w/o a >> discernable benefit (I would consider it a negative as finding a >> module should not be impacted by syntactic correctness; the full act >> of importing should be the only thing that cares about that), I would >> consider it a bug that should be filed. > > The output of imp.find_module() returns an open file io object, and it's > output feeds directly into to imp.load_module(). 
> >>>> imp.find_module('pydoc') > (<_io.TextIOWrapper name=4 encoding='utf-8'>, > '/usr/local/lib/python3.2/pydoc.py', ('.py', 'U', 1)) > > So I think the imp.find_module() is suppose to be used when you *do* want to > do the full act of importing and not for just finding out if or where module > xyz exists. Going with your line of argument, why can't imp.load_module be the call that figures out there is a syntax error? If you look at this from the perspective of PEP 302, finding a module has absolutely nothing to do with the validity of the found source, just that something was found somewhere which (hopefully) contains code that represents the module. From solipsis at pitrou.net Tue Nov 30 20:44:14 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 30 Nov 2010 20:44:14 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF55346.1040108@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> <4CF53724.8090000@voidspace.org.uk> <4CF54DA1.5080900@v.loewis.de> <1291144993.8628.1.camel@localhost.localdomain> <4CF55346.1040108@v.loewis.de> Message-ID: <1291146254.8628.4.camel@localhost.localdomain> Le mardi 30 novembre 2010 ? 20:40 +0100, "Martin v. L?wis" a ?crit : > Am 30.11.2010 20:23, schrieb Antoine Pitrou: > > Le mardi 30 novembre 2010 ? 20:16 +0100, "Martin v. L?wis" a ?crit : > >>> Would moving this functionality to the locale module make the issues any > >>> easier to fix? > >> > >> You could delegate it to the C library, so: yes. > > > > I hope you don't suggest delegating it to the C locale functions. > > Do you? > > Yes, I do. Why do you hope I don't? Because we all know how locale is a pile of cr*p, both in specification and in implementations. Our unit tests for it are a clear proof of that. 
Actually, I remember you saying that locale should ideally be replaced with a wrapper around the ICU library. Regards Antoine. From brett at python.org Tue Nov 30 20:46:07 2010 From: brett at python.org (Brett Cannon) Date: Tue, 30 Nov 2010 11:46:07 -0800 Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError In-Reply-To: <20101130083418.GB4157@lupus.logilab.fr> References: <201011251530.23947.emile.anclin@logilab> <4CEE9B72.1070002@ronadam.com> <20101129115311.GD18888@lupus.logilab.fr> <20101130083418.GB4157@lupus.logilab.fr> Message-ID: On Tue, Nov 30, 2010 at 00:34, Sylvain Th?nault wrote: > On 29 novembre 14:21, Ron Adam wrote: >> On 11/29/2010 01:22 PM, Brett Cannon wrote: >> >Considering these semantics changed between Python 2 and 3 w/o a >> >discernable benefit (I would consider it a negative as finding a >> >module should not be impacted by syntactic correctness; the full act >> >of importing should be the only thing that cares about that), I would >> >consider it a bug that should be filed. >> >> The output of imp.find_module() returns an open file io object, and >> it's output feeds directly into to imp.load_module(). >> >> >>> imp.find_module('pydoc') >> (<_io.TextIOWrapper name=4 encoding='utf-8'>, >> '/usr/local/lib/python3.2/pydoc.py', ('.py', 'U', 1)) >> >> So I think the imp.find_module() is suppose to be used when you *do* >> want to do the full act of importing and not for just finding out if >> or where module xyz exists. > > in python 2, find_module was usable for such usage, and this is a needed api > for a tool like pylint. Is there another way to do so with python 3? At the moment, no. Best option would be to create an importlib.find_module function which returns a loader if the module is found, else returns None. The loader can have its get_source method called to read the source code (w/o verification). I have this planned for Python 3.3 but not 3.2 with us so close to 3.2b1. > -- > Sylvain Th?nault ? ? ? ? ? ? ? ? ? ? ? ? ? ? 
? LOGILAB, Paris (France) > Formations Python, Debian, M?th. Agiles: http://www.logilab.fr/formations > D?veloppement logiciel sur mesure: ? ? ? http://www.logilab.fr/services > CubicWeb, the semantic web framework: ? ?http://www.cubicweb.org > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org > From martin at v.loewis.de Tue Nov 30 20:55:52 2010 From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=) Date: Tue, 30 Nov 2010 20:55:52 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <1291146254.8628.4.camel@localhost.localdomain> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> <4CF53724.8090000@voidspace.org.uk> <4CF54DA1.5080900@v.loewis.de> <1291144993.8628.1.camel@localhost.localdomain> <4CF55346.1040108@v.loewis.de> <1291146254.8628.4.camel@localhost.localdomain> Message-ID: <4CF556C8.9010704@v.loewis.de> > Because we all know how locale is a pile of cr*p, both in specification > and in implementations. Our unit tests for it are a clear proof of that. I wouldn't use expletives, but rather claim that the locale module is highly platform-dependent. > Actually, I remember you saying that locale should ideally be replaced > with a wrapper around the ICU library. By that, I stand - however, I have given up the hope that this will happen anytime soon. Wrt. to local number parsing, I think that the locale module would be way better than the nonsense that Python currently does. In the locale module, somebody at least has thought about what specifically constitutes a number. The current not-ASCII-but-not-local-either approach is just useless. 
Maintaining a reasonable implementation is a burden, so deferring to the C library is more attractive than having to maintain an unreasonable implementation. Regards, Martin From solipsis at pitrou.net Tue Nov 30 21:11:59 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 30 Nov 2010 21:11:59 +0100 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <4CF556C8.9010704@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> <4CF53724.8090000@voidspace.org.uk> <4CF54DA1.5080900@v.loewis.de> <1291144993.8628.1.camel@localhost.localdomain> <4CF55346.1040108@v.loewis.de> <1291146254.8628.4.camel@localhost.localdomain> <4CF556C8.9010704@v.loewis.de> Message-ID: <1291147919.8628.12.camel@localhost.localdomain> Le mardi 30 novembre 2010 ? 20:55 +0100, "Martin v. L?wis" a ?crit : > Wrt. to local number parsing, I think that the locale module would be > way better than the nonsense that Python currently does. In the locale > module, somebody at least has thought about what specifically > constitutes a number. The current not-ASCII-but-not-local-either > approach is just useless. It depends what you need. If you parse integers it's probably good enough. And it's better to have a trustable standard (unicode) than a myriad of ad-hoc, possibly buggy or incomplete, often unavailable, cultural specifications drafted by OS vendors who have no business (and no expertise) in drafting them. At least you can build more sophisticated routines on the simple information given to you by the unicode database. You cannot build anything solid on the C locale functions (and even then you are limited by various issues inherent in the locale semantics, such as the fact that it relies on process-wide state, which would only be ok, at best, for single-user applications). There's a reason that e.g. 
Babel (*) reimplements locale-like functionality from scratch. (*) http://pypi.python.org/pypi/Babel/ Regards Antoine. From brett at python.org Tue Nov 30 21:11:58 2010 From: brett at python.org (Brett Cannon) Date: Tue, 30 Nov 2010 12:11:58 -0800 Subject: [Python-Dev] PEP 291 versus Python 3 In-Reply-To: <20101130103531.54d79465@mission> References: <4CF49ACF.6070904@netwok.org> <4CF4F77C.4000308@voidspace.org.uk> <20101130103531.54d79465@mission> Message-ID: On Tue, Nov 30, 2010 at 07:35, Barry Warsaw wrote: > On Nov 30, 2010, at 01:09 PM, Michael Foord wrote: > >>PEP 291 is very old and should probably be retired. I don't think anyone is >>maintaining standard libraries in py3k that are also compatible with Python >>2.anything. (At least not in a single codebase.) > > I agree. Same here; I have purposefully ignored compatibility requirements because I always found those promises to be extremely annoying and somewhat painful to enforce. > ?I think we should change the status of PEP 291 to Final, and add a > few words to make it clear it applies only to Python 2. ?Since Neal owns the > PEP, he should get first crack at doing the update, but I volunteer to make > those changes if he declines (or does not respond). > I will channel Neal: "I decline and/or do not want to respond". =) > We may eventually need a similar document for Python 3, but it should be a new > PEP. I hope not. 
From solipsis at pitrou.net Tue Nov 30 21:13:07 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 30 Nov 2010 21:13:07 +0100 Subject: [Python-Dev] ICU In-Reply-To: <4CF556C8.9010704@v.loewis.de> References: <20101128214311.092abd35@pitrou.net> <4CF2D4E9.3060607@v.loewis.de> <4CF2F067.5020705@pearwood.info> <4CF354C6.9020302@v.loewis.de> <20101129193302.115dbcd5@pitrou.net> <4CF53724.8090000@voidspace.org.uk> <4CF54DA1.5080900@v.loewis.de> <1291144993.8628.1.camel@localhost.localdomain> <4CF55346.1040108@v.loewis.de> <1291146254.8628.4.camel@localhost.localdomain> <4CF556C8.9010704@v.loewis.de> Message-ID: <1291147987.8628.13.camel@localhost.localdomain> Oh, about ICU: > > Actually, I remember you saying that locale should ideally be replaced > > with a wrapper around the ICU library. > > By that, I stand - however, I have given up the hope that this will > happen anytime soon. Perhaps this could be made a GSOC topic. Regards Antoine. From ben+python at benfinney.id.au Tue Nov 30 21:24:08 2010 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 01 Dec 2010 07:24:08 +1100 Subject: [Python-Dev] Python and the Unicode Character Database References: <87wrnv43v5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <87r5e236hj.fsf@benfinney.id.au> haiyang kang writes: > I think it is a little ugly to have code like this: num = > float("?.?"), expected result is: num = 1.1 That's a straw man, though. The string need not be a literal in the program; it can be input to the program. num = float(input_from_the_external_world) Does that change your assessment of whether non-ASCII digits are used? -- \ ?The greatest tragedy in mankind's entire history may be the | `\ hijacking of morality by religion.? ?Arthur C. 
Clarke, 1991 | _o__) | Ben Finney From barry at python.org Tue Nov 30 22:05:43 2010 From: barry at python.org (Barry Warsaw) Date: Tue, 30 Nov 2010 16:05:43 -0500 Subject: [Python-Dev] PEP 291 versus Python 3 In-Reply-To: References: <4CF49ACF.6070904@netwok.org> <4CF4F77C.4000308@voidspace.org.uk> <20101130103531.54d79465@mission> Message-ID: <20101130160543.3b478311@mission> On Nov 30, 2010, at 12:11 PM, Brett Cannon wrote: >I will channel Neal: "I decline and/or do not want to respond". =) PEP 291 updated. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From tjreedy at udel.edu Tue Nov 30 23:43:22 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 30 Nov 2010 17:43:22 -0500 Subject: [Python-Dev] Python and the Unicode Character Database In-Reply-To: <87wrnv43v5.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87wrnv43v5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 11/30/2010 3:23 AM, Stephen J. Turnbull wrote: > I see no reason not to make a similar promise for numeric literals. I > see no good reason to allow compatibility full-width Japanese "ASCII" > numerals or Arabic cursive numerals in "for i in range(...)" for > example. I do not think that anyone, at least not me, has argued for anything other than 0-9 digits (or 0-f for hex) in literals in program code. The only issue is whether non-programmer *users* should be able to use their native digits in applications in response to input prompts. 
--
Terry Jan Reedy

From rrr at ronadam.com Tue Nov 30 23:48:56 2010
From: rrr at ronadam.com (Ron Adam)
Date: Tue, 30 Nov 2010 16:48:56 -0600
Subject: [Python-Dev] python3k : imp.find_module raises SyntaxError
In-Reply-To:
References: <201011251530.23947.emile.anclin@logilab>
 <4CEE9B72.1070002@ronadam.com>
 <20101129115311.GD18888@lupus.logilab.fr>
Message-ID:

On 11/30/2010 01:41 PM, Brett Cannon wrote:
> On Mon, Nov 29, 2010 at 12:21, Ron Adam wrote:
>>
>>
>> On 11/29/2010 01:22 PM, Brett Cannon wrote:
>>>
>>> On Mon, Nov 29, 2010 at 03:53, Sylvain Thénault
>>> wrote:
>>>>
>>>> On 25 novembre 11:22, Ron Adam wrote:
>>>>>
>>>>> On 11/25/2010 08:30 AM, Emile Anclin wrote:
>>>>>>
>>>>>> hello,
>>>>>>
>>>>>> working on Pylint, we have a lot of voluntarily corrupted files to test
>>>>>> Pylint behavior; for instance
>>>>>>
>>>>>> $ cat /home/emile/var/pylint/test/input/func_unknown_encoding.py
>>>>>> # -*- coding: IBO-8859-1 -*-
>>>>>> """ check correct unknown encoding declaration
>>>>>> """
>>>>>>
>>>>>> __revision__ = '????'
>>>>>>
>>>>>>
>>>>>> and we try to find that module:
>>>>>> find_module('func_unknown_encoding', None). But python3 raises
>>>>>> SyntaxError in that case; it didn't raise SyntaxError on python2,
>>>>>> nor does it do so on our func_nonascii_noencoding and
>>>>>> func_wrong_encoding modules (with obvious names)
>>>>>>
>>>>>> Python 3.2a2 (r32a2:84522, Sep 14 2010, 15:22:36)
>>>>>> [GCC 4.3.4] on linux2
>>>>>> Type "help", "copyright", "credits" or "license" for more information.
>>>>>> >>> from imp import find_module
>>>>>> >>> find_module('func_unknown_encoding', None)
>>>>>>
>>>>>> Traceback (most recent call last):
>>>>>>   File "<stdin>", line 1, in <module>
>>>>>> SyntaxError: encoding problem: with BOM
>>>>>
>>>>> I don't think there is a clear reason by design. Also try importing
>>>>> the same modules directly and noting the differences in the errors
>>>>> you get.
>>>>
>>>> IMO the point is that we can consider it a bug that find_module
>>>> tries to read some of the content of the file, no? Though it seems to
>>>> be doing this only for encoding detection or the like, since
>>>> find_module doesn't choke on a module containing another kind of
>>>> syntax error.
>>>>
>>>> So the question is, should we deal with this in pylint/astng, or can we
>>>> expect this to be fixed at some point?
>>>
>>> Considering these semantics changed between Python 2 and 3 w/o a
>>> discernable benefit (I would consider it a negative as finding a
>>> module should not be impacted by syntactic correctness; the full act
>>> of importing should be the only thing that cares about that), I would
>>> consider it a bug that should be filed.
>>
>> imp.find_module() returns an open file io object, and its
>> output feeds directly into imp.load_module().
>>
>> >>> imp.find_module('pydoc')
>> (<_io.TextIOWrapper name=4 encoding='utf-8'>,
>> '/usr/local/lib/python3.2/pydoc.py', ('.py', 'U', 1))
>>
>> So I think imp.find_module() is supposed to be used when you *do* want to
>> do the full act of importing and not for just finding out if or where module
>> xyz exists.
>
> Going with your line of argument, why can't imp.load_module be the
> call that figures out there is a syntax error? If you look at this
> from the perspective of PEP 302, finding a module has absolutely
> nothing to do with the validity of the found source, just that
> something was found somewhere which (hopefully) contains code that
> represents the module.

The part that I'm looking at is what find_module would return if the
encoding is bad or the named encoding is not found:

<_io.TextIOWrapper name=4 encoding='bad_encoding'>

Maybe we could have some library introspection function in the inspect
module for just looking in the library rather than loading modules.
But I think those would have the same issues, as packages need to be
loaded in order to find sub modules.*

* It almost seems like the concept of a sub-module (in a package) is
flawed. I'm not sure I can explain what causes me to feel that way at the
moment though.

Ron
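
The split discussed in this thread, locating a module versus validating its
source, can be sketched with importlib.util.find_spec(), which was added in
Python 3.4 and so was not available when this thread was written (the imp
module itself was later removed in Python 3.12). The helper
locate_then_import and the body of the sample file are assumptions made for
illustration; only the coding line is taken from the report above.

```python
# Sketch only: locate_then_import and BAD_SOURCE are illustrative
# assumptions, not code from the thread.
import importlib
import importlib.util
import os
import sys
import tempfile


def locate_then_import(module_name, source_bytes):
    """Write a module to a temporary directory, locate it, then import it.

    Returns (origin, error): the path reported by find_spec() and the
    SyntaxError raised by the import step (None if the import worked).
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, module_name + ".py")
        with open(path, "wb") as handle:
            handle.write(source_bytes)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            # Locating the module does not read or decode its source.
            spec = importlib.util.find_spec(module_name)
            origin = spec.origin
            try:
                # Importing compiles the source, so the bad coding
                # declaration is only detected at this step.
                importlib.import_module(module_name)
                error = None
            except SyntaxError as exc:
                error = exc
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
    return origin, error


# Same coding line as the Pylint test file; the assignment is a placeholder.
BAD_SOURCE = b"# -*- coding: IBO-8859-1 -*-\n__revision__ = 'x'\n"

origin, error = locate_then_import("func_unknown_encoding", BAD_SOURCE)
print(os.path.basename(origin))
print(type(error).__name__)
```

With this split, a tool such as Pylint can find out where a module lives
without tripping over its encoding declaration, and the error only appears
when the source is actually compiled.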