From v+python at g.nevcal.com Wed May 1 00:24:14 2013
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Tue, 30 Apr 2013 15:24:14 -0700
Subject: [Python-Dev] PEP-435 reference implementation
In-Reply-To: <51802595.2040305@stoneleaf.us>
References: <51802595.2040305@stoneleaf.us>
Message-ID: <5180448E.2030301@g.nevcal.com>

On 4/30/2013 1:12 PM, Ethan Furman wrote:
> Greetings,
>
> Eli asked me to put the reference implementation here for review.
>
> It is available at https://bitbucket.org/stoneleaf/aenum in ref435.py
> and test_ref435.py

Thanks for the code reference.

Tests ran fine here on Python 3.3

If I alter test_ref435.py at the end, as follows, I get an error: nothing matches 'BDFL'
Can someone explain why?

if __name__ == '__main__':
    class AnotherName( Name ):
        'just uses prior names'
    print(AnotherName['BDFL'])
    unittest.main()

From pjenvey at underboss.org Wed May 1 00:51:44 2013
From: pjenvey at underboss.org (Philip Jenvey)
Date: Tue, 30 Apr 2013 15:51:44 -0700
Subject: [Python-Dev] PEP 435 -- Adding an Enum type to the Python standard library
In-Reply-To:
References: <5179F0B8.8010601@g.nevcal.com> <5179F30D.3050508@canterbury.ac.nz> <5179F6EC.5040301@g.nevcal.com> <517ABA70.8020909@g.nevcal.com> <517B286D.2020601@canterbury.ac.nz> <517B3818.4010405@g.nevcal.com> <517B3D41.6060109@stoneleaf.us> <517C1CCC.9090203@pearwood.info> <517C2B45.9010806@stoneleaf.us> <517C2E7D.4040407@stoneleaf.us> <517C5C9A.7020406@stoneleaf.us> <517C68F0.1010201@stoneleaf.us>
Message-ID: <3F3F81DA-AE3B-4B7E-AB66-7DD3925D3897@underboss.org>

On Apr 27, 2013, at 6:09 PM, Guido van Rossum wrote:
> On Sat, Apr 27, 2013 at 5:10 PM, Ethan Furman wrote:
>> class Planet(
>>         Enum,
>>         names='''
>>               MERCURY
>>               VENUS
>>               EARTH
>>               MARS
>>               SATURN
>>               JUPITER
>>               URANUS
>>               PLUTO
>>               ''',
>>         ):
>>     '''Planets of the Solar System'''
>>
>> Not sure I like that. Ah well.
>
> The problem with this and similar proposals is that it puts things
> inside string quotes that belong outside them.

So does the convenience API outlined in the PEP, so this is just an alternative to that.

--
Philip Jenvey

From ethan at stoneleaf.us Wed May 1 00:34:36 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 30 Apr 2013 15:34:36 -0700
Subject: [Python-Dev] PEP-435 reference implementation
In-Reply-To: <5180448E.2030301@g.nevcal.com>
References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com>
Message-ID: <518046FC.1000100@stoneleaf.us>

On 04/30/2013 03:24 PM, Glenn Linderman wrote:
> On 4/30/2013 1:12 PM, Ethan Furman wrote:
>> Greetings,
>>
>> Eli asked me to put the reference implementation here for review.
>>
>> It is available at https://bitbucket.org/stoneleaf/aenum in ref435.py and test_ref435.py
>
> Thanks for the code reference.
>
> Tests ran fine here on Python 3.3
>
> If I alter test_ref435.py at the end, as follows, I get an error: nothing matches 'BDFL'
> Can someone explain why?
>
>
> if __name__ == '__main__':
>     class AnotherName( Name ):
>         'just uses prior names'
>     print(AnotherName['BDFL'])

Because Guido said no subclassing.

At this point, if you try to subclass all your getting is the same type. So AnotherName is a string Enumeration.

--
~Ethan~

From ethan at stoneleaf.us Wed May 1 01:49:52 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 30 Apr 2013 16:49:52 -0700
Subject: [Python-Dev] PEP-435 reference implementation
In-Reply-To: <518046FC.1000100@stoneleaf.us>
References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com> <518046FC.1000100@stoneleaf.us>
Message-ID: <518058A0.9040501@stoneleaf.us>

On 04/30/2013 03:34 PM, Ethan Furman wrote:
> On 04/30/2013 03:24 PM, Glenn Linderman wrote:
>> On 4/30/2013 1:12 PM, Ethan Furman wrote:
>>> Greetings,
>>>
>>> Eli asked me to put the reference implementation here for review.
>>>
>>> It is available at https://bitbucket.org/stoneleaf/aenum in ref435.py and test_ref435.py
>>
>> Thanks for the code reference.
>>
>> Tests ran fine here on Python 3.3
>>
>> If I alter test_ref435.py at the end, as follows, I get an error: nothing matches 'BDFL'
>> Can someone explain why?
>>
>>
>> if __name__ == '__main__':
>>     class AnotherName( Name ):
>>         'just uses prior names'
>>     print(AnotherName['BDFL'])
>
> Because Guido said no subclassing.
>
> At this point, if you try to subclass all your getting is the same type. So AnotherName is a string Enumeration.

It wouldn't be hard to check for instances of the Enum in question, and if there are some to raise an error instead. That way:

--> class StrEnum(str, Enum):
...     'str-based enumerations'

--> class Names(StrEnum):    # this works, as StrEnum has no instances
...     BDFL = 'GvR'

--> class MoreNames(Names):  # this fails, as Names has instances

Thoughts?

--
~Ethan~

From timothy.c.delaney at gmail.com Wed May 1 02:06:17 2013
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Wed, 1 May 2013 10:06:17 +1000
Subject: [Python-Dev] Enumeration items: mixed types?
In-Reply-To:
References: <517EF92E.5050906@stoneleaf.us> <517EFF60.6000902@stoneleaf.us> <517F1276.8070201@canterbury.ac.nz>
Message-ID:

On 1 May 2013 02:27, Eli Bendersky wrote:
>
> On Mon, Apr 29, 2013 at 5:38 PM, Greg Ewing wrote:
>
>> Ethan Furman wrote:
>>
>>> I suppose the other option is to have `.value` be whatever was assigned
>>> (1, 'really big country', and (8273.199, 517) ),
>>>
>>
>> I thought that was the intention all along, and that we'd
>> given up on the idea of auto-assigning integer values
>> (because it would require either new syntax or extremely
>> dark magic).
>>
>
> Yes, Guido rejected the auto-numbering syntax a while back. The only case
> in which auto-numbering occurs (per PEP 435) is the "convenience syntax":
>
> Animal = Enum('Animal', 'fox dog cat')
>

Actually, since Guido has pronounced that definition order will be the default, there's no reason each Enum instance couldn't have an "ordinal" attribute.

Tim Delaney

From Nikolaus at rath.org Wed May 1 04:05:04 2013
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Tue, 30 Apr 2013 19:05:04 -0700
Subject: [Python-Dev] enum instances
References: <517F1CFA.3070905@googlemail.com> <87ppxcagdg.fsf@vostro.rath.org> <517F3E70.2010108@hastings.org>
Message-ID: <87bo8vtpyn.fsf@vostro.rath.org>

Larry Hastings writes:
> On 04/29/2013 07:42 PM, Nikolaus Rath wrote:
>> State is a class, it just inherits from enum. Thus:
>>
>> type(State) == type(enum) == type(EnumMetaclass)
>> issubclass(State, enum) == True
>>
>>
>> HTH,
>>
>> -Nikolaus
>
> If you'd tried it, you'd have found that that isn't true. enum has a
> metaclass, EnumMetaclass. Thus type(enum) == EnumMetaClass.

That is exactly what I wrote above.

> That didn't help,

Aeh, yes.

Best,

-Nikolaus

--
"Time flies like an arrow, fruit flies like a Banana."
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

From larry at hastings.org Wed May 1 04:12:59 2013
From: larry at hastings.org (Larry Hastings)
Date: Tue, 30 Apr 2013 19:12:59 -0700
Subject: [Python-Dev] enum instances
In-Reply-To: <87bo8vtpyn.fsf@vostro.rath.org>
References: <517F1CFA.3070905@googlemail.com> <87ppxcagdg.fsf@vostro.rath.org> <517F3E70.2010108@hastings.org> <87bo8vtpyn.fsf@vostro.rath.org>
Message-ID: <51807A2B.6000804@hastings.org>

On 04/30/2013 07:05 PM, Nikolaus Rath wrote:
> Larry Hastings writes:
>> On 04/29/2013 07:42 PM, Nikolaus Rath wrote:
>>> State is a class, it just inherits from enum. Thus:
>>>
>>> type(State) == type(enum) == type(EnumMetaclass)
>>> issubclass(State, enum) == True
>>>
>>>
>>> HTH,
>>>
>>> -Nikolaus
>> If you'd tried it, you'd have found that that isn't true. enum has a
>> metaclass, EnumMetaclass. Thus type(enum) == EnumMetaClass.
> That is exactly what I wrote above.

    type(EnumMetaClass) == type
    type(enum) == EnumMetaClass
    type(EnumMetaClass) != type(enum)

//arry/

From ethan at stoneleaf.us Wed May 1 04:12:54 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 30 Apr 2013 19:12:54 -0700
Subject: [Python-Dev] enum instances
In-Reply-To: <87bo8vtpyn.fsf@vostro.rath.org>
References: <517F1CFA.3070905@googlemail.com> <87ppxcagdg.fsf@vostro.rath.org> <517F3E70.2010108@hastings.org> <87bo8vtpyn.fsf@vostro.rath.org>
Message-ID: <51807A26.5080400@stoneleaf.us>

On 04/30/2013 07:05 PM, Nikolaus Rath wrote:
> Larry Hastings writes:
>> On 04/29/2013 07:42 PM, Nikolaus Rath wrote:
>>> State is a class, it just inherits from enum. Thus:
>>>
>>> type(State) == type(enum) == type(EnumMetaclass)
>>> issubclass(State, enum) == True
>>>
>>>
>>> HTH,
>>>
>>> -Nikolaus
>>
>> If you'd tried it, you'd have found that that isn't true. enum has a
>> metaclass, EnumMetaclass. Thus type(enum) == EnumMetaClass.
>
> That is exactly what I wrote above.

Not really. You wrote

    type(enum) == type(EnumMetaClass)

not

    type(enum) == EnumMetaClass

--
~Ethan~

From v+python at g.nevcal.com Wed May 1 04:39:57 2013
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Tue, 30 Apr 2013 19:39:57 -0700
Subject: [Python-Dev] PEP-435 reference implementation
In-Reply-To: <518058A0.9040501@stoneleaf.us>
References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com> <518046FC.1000100@stoneleaf.us> <518058A0.9040501@stoneleaf.us>
Message-ID: <5180807D.9090707@g.nevcal.com>

On 4/30/2013 4:49 PM, Ethan Furman wrote:
> On 04/30/2013 03:34 PM, Ethan Furman wrote:
>> On 04/30/2013 03:24 PM, Glenn Linderman wrote:
>>> On 4/30/2013 1:12 PM, Ethan Furman wrote:
>>>> Greetings,
>>>>
>>>> Eli asked me to put the reference implementation here for review.
>>>>
>>>> It is available at https://bitbucket.org/stoneleaf/aenum in
>>>> ref435.py and test_ref435.py
>>>
>>> Thanks for the code reference.
>>>
>>> Tests ran fine here on Python 3.3
>>>
>>> If I alter test_ref435.py at the end, as follows, I get an error:
>>> nothing matches 'BDFL'
>>> Can someone explain why?
>>>
>>>
>>> if __name__ == '__main__':
>>>     class AnotherName( Name ):
>>>         'just uses prior names'
>>>     print(AnotherName['BDFL'])
>>
>> Because Guido said no subclassing.

Indeed, I heard him. But what I heard was that subclasses shouldn't be allowed to define new enumeration values, and that was the point of all his justification and the justifications in the Stack Overflow discussion he linked to. I don't want to disagree, or argue that point, there are reasons for it, although some have raised counter-arguments to it. This is not intended to be a counter-argument to the point that there should be no new enumeration values created in subclasses.

>>
>> At this point, if you try to subclass all your getting is the same
>> type. So AnotherName is a string Enumeration.
So if I get the same type, it'd be kind of nice if it worked the same too... even if the instances are of type Name, it seems that one should be able to look them up, the same as one can look them up using Name.

> It wouldn't be hard to check for instances of the Enum in question,
> and if there are some to raise an error instead. That way:
>
> --> class StrEnum(str, Enum):
> ...     'str-based enumerations'
>
> --> class Names(StrEnum):    # this works, as StrEnum has no instances
> ...     BDFL = 'GvR'
>
> --> class MoreNames(Names):  # this fails, as Names has instances
>
> Thoughts?

So to me, it would seem quite reasonable to allow only one class in the hierarchy to define instances. If no instances have been defined before, then defining an enumeration instance should occur for each attribute for which it is appropriate. But if a superclass has defined enumeration instances, then things defined in subclasses should not be taken as enumeration instances... and the choice should be between an error, and simply allowing it to be defined as a normal class attribute of the subclass.

From Nikolaus at rath.org Wed May 1 04:25:10 2013
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Tue, 30 Apr 2013 19:25:10 -0700
Subject: [Python-Dev] enum instances
In-Reply-To: <87bo8vtpyn.fsf@vostro.rath.org>
References: <517F1CFA.3070905@googlemail.com> <87ppxcagdg.fsf@vostro.rath.org> <517F3E70.2010108@hastings.org> <87bo8vtpyn.fsf@vostro.rath.org>
Message-ID: <51807D06.2030607@rath.org>

On 04/30/2013 07:05 PM, Nikolaus Rath wrote:
> Larry Hastings writes:
>> On 04/29/2013 07:42 PM, Nikolaus Rath wrote:
>>> State is a class, it just inherits from enum. Thus:
>>>
>>> type(State) == type(enum) == type(EnumMetaclass)
>>> issubclass(State, enum) == True
>>>
>>
>> If you'd tried it, you'd have found that that isn't true. enum has a
>> metaclass, EnumMetaclass. Thus type(enum) == EnumMetaClass.
>
> That is exactly what I wrote above.

Sorry, I must have read what I thought rather than what I wrote. You're right, what I wrote was wrong.

Best,

-Nikolaus

--
"Time flies like an arrow, fruit flies like a Banana."

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

From Nikolaus at rath.org Wed May 1 04:09:46 2013
From: Nikolaus at rath.org (Nikolaus Rath)
Date: Tue, 30 Apr 2013 19:09:46 -0700
Subject: [Python-Dev] Destructors and Closing of File Objects
References: <87a9p41gr6.fsf@vostro.rath.org> <877gk35aqx.fsf@vostro.rath.org> <87y5c4r91p.fsf@vostro.rath.org> <517EED18.5000707@farowl.co.uk>
Message-ID: <878v3ztpqt.fsf@vostro.rath.org>

Armin Rigo writes:
> Hi Jeff,
>
> On Mon, Apr 29, 2013 at 11:58 PM, Jeff Allen <"ja...py"@farowl.co.uk> wrote:
>> In Jython, (...)
>
> Thanks Jeff for pointing this out. Jython thus uses a custom
> mechanism similar to PyPy's, which is also similar to atexit's. It
> should not be too hard to implement it in CPython 3 as well, if this
> ends up classified as a bug. This is what my bug report was about
> (sorry if I failed to be clear enough about it).

Personally, I think it should just be mentioned in the documentation for the buffered writers. Otherwise it's hard to justify what deserves such a special mechanism and what doesn't (what about e.g. tempfile.NamedTemporaryFile).

> Nikolaus: the bug report contains a failing test, is that what you're
> looking for?

That's what I was trying to write as well, yes. Now I know how to do it :-)

Best,

-Nikolaus

--
"Time flies like an arrow, fruit flies like a Banana."
PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

From ethan at stoneleaf.us Wed May 1 06:19:49 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 30 Apr 2013 21:19:49 -0700
Subject: [Python-Dev] PEP-435 reference implementation
In-Reply-To: <51802595.2040305@stoneleaf.us>
References: <51802595.2040305@stoneleaf.us>
Message-ID: <518097E5.70306@stoneleaf.us>

Latest code available at https://bitbucket.org/stoneleaf/aenum.

--> class Color(Enum):
...     red = 1
...     green = 2
...     blue = 3

Enum items are virtual attributes looked up by EnumType's __getattr__. The win here is that

--> Color.red.green.blue

no longer works. ;)

Subclassing an implemented Enum class now raises an error (is there a better word than 'implemented'?)

--> class MoreColor(Color):
...     cyan = 4
...     magenta = 5
...     yellow = 6

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "./ref435.py", line 83, in __new__
    raise EnumError("cannot subclass an implemented Enum class")
ref435.EnumError: cannot subclass an implemented Enum class

From barry at python.org Wed May 1 06:47:51 2013
From: barry at python.org (Barry Warsaw)
Date: Tue, 30 Apr 2013 21:47:51 -0700
Subject: [Python-Dev] PEP-435 reference implementation
In-Reply-To: <5180807D.9090707@g.nevcal.com>
References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com> <518046FC.1000100@stoneleaf.us> <518058A0.9040501@stoneleaf.us> <5180807D.9090707@g.nevcal.com>
Message-ID: <20130430214751.023ef767@anarchist>

On Apr 30, 2013, at 07:39 PM, Glenn Linderman wrote:

>>> Because Guido said no subclassing.
>
>Indeed, I heard him. But what I heard was that subclasses shouldn't be
>allowed to define new enumeration values, and that was the point of all his
>justification and the justifications in the Stack Overflow discussion he
>linked to. I don't want to disagree, or argue that point, there are reasons
>for it, although some have raised counter-arguments to it. This is not
>intended to be a counter-argument to the point that there should be no new
>enumeration values created in subclasses.

That's a shame, because disallowing subclassing to extend an enum will break my existing use cases. Maybe I won't be able to adopt stdlib.enums after all. :(

-Barry

From barry at python.org Wed May 1 07:41:02 2013
From: barry at python.org (Barry Warsaw)
Date: Tue, 30 Apr 2013 22:41:02 -0700
Subject: [Python-Dev] PEP-435 reference implementation
In-Reply-To: <518097E5.70306@stoneleaf.us>
References: <51802595.2040305@stoneleaf.us> <518097E5.70306@stoneleaf.us>
Message-ID: <20130430224102.14c52460@anarchist>

On Apr 30, 2013, at 09:19 PM, Ethan Furman wrote:

>Subclassing an implemented Enum class now raises an error (is there a better
>word than 'implemented'?)
>
>--> class MoreColor(Color):
>...     cyan = 4
>...     magenta = 5
>...     yellow = 6
>
>Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "./ref435.py", line 83, in __new__
>    raise EnumError("cannot subclass an implemented Enum class")
>ref435.EnumError: cannot subclass an implemented Enum class

What does it break if you remove the `if base._enum` check? I mean, can we be consenting adults here or not?

-Barry

From barry at python.org Wed May 1 07:50:14 2013
From: barry at python.org (Barry Warsaw)
Date: Tue, 30 Apr 2013 22:50:14 -0700
Subject: [Python-Dev] Enumeration items: `type(EnumClass.item) is EnumClass` ?
In-Reply-To: <517ED179.2080100@pearwood.info>
References: <517E768A.9090602@stoneleaf.us> <517E9C09.3010806@stoneleaf.us> <517EBAFB.3060109@pearwood.info> <517EC3DA.9090105@mrabarnett.plus.com> <517ED179.2080100@pearwood.info>
Message-ID: <20130430225014.13488604@anarchist>

On Apr 30, 2013, at 06:00 AM, Steven D'Aprano wrote:

>flufl.enum has been in use for Mailman for many years, and I would like to
>hear Barry's opinion on this.
I'm not sure it matters now (I'm about a billion messages behind with little hope of catching up), but FWIW, I have use cases for extending through inheritance and haven't had any kind of semantic confusion. But then again, I haven't particularly needed to do type checks or isinstance checks, and I haven't needed to put methods on enums either.

-Barry

From barry at python.org Wed May 1 08:01:08 2013
From: barry at python.org (Barry Warsaw)
Date: Tue, 30 Apr 2013 23:01:08 -0700
Subject: [Python-Dev] enum discussion: can someone please summarize open issues?
In-Reply-To: <87a9oiw0hv.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <87a9oiw0hv.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20130430230108.363ece53@anarchist>

On Apr 29, 2013, at 11:10 AM, Stephen J. Turnbull wrote:

>Ethan thinks that "Seasons(3)" is a typecast, not an access into a
>mapping (which would be better expressed by "Seasons[3]"). Ie, the
>inverse of "int(AUTUMN)".
>
>This is consistent with the "AUTUMN is-a Seasons" position that Ethan
>and Guido take. It's inconsistent with the "AUTUMN is-a
>Seasons_VALUE" implementation of Flufl.Enum.

I think this sums it up perfectly. I get that using class definition syntax is what sends people down Ethan's path.

-Barry

From barry at python.org Wed May 1 08:08:27 2013
From: barry at python.org (Barry Warsaw)
Date: Tue, 30 Apr 2013 23:08:27 -0700
Subject: [Python-Dev] enum discussion: can someone please summarize open issues?
In-Reply-To: <517DDF23.3020101@stoneleaf.us>
References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us>
Message-ID: <20130430230827.062515ff@anarchist>

On Apr 28, 2013, at 07:46 PM, Ethan Furman wrote:

>and similarly, Enum behavior /should be/ (in my opinion ;)
>
>Season.AUTUMN is Season('AUTUMN') is Season(3)

I think you'll have a problem with this. flufl.enum did this, but it has an inherent conflict, which is why we removed the getattr-like behavior.

class Things(Enum):
    foo = 'bar'
    bar = 'foo'

What does Things('foo') return?

Note that it doesn't matter if that's spelled Things['foo'].

Whether it's defined as lookup or instance "creation", you should only map values to items, and not attribute names to items, and definitely not both. Let getattr() do attribute name lookup just like normal.

-Barry

From barry at python.org Wed May 1 08:15:03 2013
From: barry at python.org (Barry Warsaw)
Date: Tue, 30 Apr 2013 23:15:03 -0700
Subject: [Python-Dev] enum discussion: can someone please summarize open issues?
In-Reply-To: <20130429053623.GA3804@ando>
References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <517DF0C7.7080905@stoneleaf.us> <20130429053623.GA3804@ando>
Message-ID: <20130430231503.1706662b@anarchist>

On Apr 29, 2013, at 03:36 PM, Steven D'Aprano wrote:

>That's not how I understand it. I expected that the correct way to use
>enums is with identity checks:
>
>if arg is Season.SUMMER:
>    handle_summer()

It's certainly the way I've recommended to use them. I think `is` reads better in context, and identity checks are usually preferred for singletons, which enum items are. You can use equality checks, but think about this:

    if thing == None:

vs.

    if thing is None:

-Barry

From barry at python.org Wed May 1 08:18:43 2013
From: barry at python.org (Barry Warsaw)
Date: Tue, 30 Apr 2013 23:18:43 -0700
Subject: [Python-Dev] enum discussion: can someone please summarize open issues?
In-Reply-To: <517E1828.1080802@stoneleaf.us>
References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <517DF0C7.7080905@stoneleaf.us> <517E1828.1080802@stoneleaf.us>
Message-ID: <20130430231843.341c7659@anarchist>

On Apr 28, 2013, at 11:50 PM, Ethan Furman wrote:

>But as soon as:
>
>  type(Color.red) is Color          # True
>  type(MoreColor.red) is MoreColor  # True
>
>then:
>
>   Color.red is MoreColor.red  # must be False, no?
>
>
>If that last statement can still be True, I'd love it if someone showed me
>how.

class Foo:
    a = object()
    b = object()

class Bar(Foo):
    c = object()

>>> Foo.a is Bar.a
True

-Barry

From ethan at stoneleaf.us Wed May 1 08:29:09 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 30 Apr 2013 23:29:09 -0700
Subject: [Python-Dev] enum discussion: can someone please summarize open issues?
In-Reply-To: <20130430231843.341c7659@anarchist>
References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <517DF0C7.7080905@stoneleaf.us> <517E1828.1080802@stoneleaf.us> <20130430231843.341c7659@anarchist>
Message-ID: <5180B635.7000904@stoneleaf.us>

On 04/30/2013 11:18 PM, Barry Warsaw wrote:
> On Apr 28, 2013, at 11:50 PM, Ethan Furman wrote:
>
>> But as soon as:
>>
>>   type(Color.red) is Color          # True
>>   type(MoreColor.red) is MoreColor  # True
>>
>> then:
>>
>>    Color.red is MoreColor.red  # must be False, no?
>>
>>
>> If that last statement can still be True, I'd love it if someone showed me
>> how.
>
> class Foo:
>     a = object()
>     b = object()
>
> class Bar(Foo):
>     c = object()
>
> >>> Foo.a is Bar.a
> True

Wow. I think I'm blushing from embarrassment.

Thank you for answering my question, Barry.
--
~Ethan~

From ethan at stoneleaf.us Wed May 1 07:50:31 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 30 Apr 2013 22:50:31 -0700
Subject: [Python-Dev] PEP-435 reference implementation
In-Reply-To: <20130430214751.023ef767@anarchist>
References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com> <518046FC.1000100@stoneleaf.us> <518058A0.9040501@stoneleaf.us> <5180807D.9090707@g.nevcal.com> <20130430214751.023ef767@anarchist>
Message-ID: <5180AD27.3090306@stoneleaf.us>

On 04/30/2013 09:47 PM, Barry Warsaw wrote:
> On Apr 30, 2013, at 07:39 PM, Glenn Linderman wrote:
>
>>>> Because Guido said no subclassing.
>>
>> Indeed, I heard him. But what I heard was that subclasses shouldn't be
>> allowed to define new enumeration values, and that was the point of all his
>> justification and the justifications in the Stack Overflow discussion he
>> linked to. I don't want to disagree, or argue that point, there are reasons
>> for it, although some have raised counter-arguments to it. This is not
>> intended to be a counter-argument to the point that there should be no new
>> enumeration values created in subclasses.
>
> That's a shame, because disallowing subclassing to extend an enum will break
> my existing use cases. Maybe I won't be able to adopt stdlib.enums after
> all. :(

What is your existing use case?

The way I had subclassing working originally was for the subclass to create its own versions of the superclass's enum items -- they weren't the same object, but they were equal:

--> class Color(Enum):
...     red = 1
...     green = 2
...     blue = 3

--> class MoreColor(Color):
...     cyan = 4
...     magenta = 5
...     yellow = 6

--> Color.red is MoreColor.red
False

--> Color.red == MoreColor.red
True

If you switched from `is` to `==` would this work for you?
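
A minimal sketch of the equal-but-not-identical behaviour described above. The `Item` class and the two module-level objects are hypothetical stand-ins for enum items (this is not the ref435 code); the point is only that two distinct objects can compare equal under `==` while an `is` check fails.

```python
class Item:
    """Hypothetical stand-in for an enum item; not the ref435 implementation."""

    def __init__(self, owner, name, value):
        self.owner = owner    # name of the class that holds this copy
        self.name = name
        self.value = value

    def __eq__(self, other):
        # Equality ignores the owner: the same name and value compare equal.
        if isinstance(other, Item):
            return (self.name, self.value) == (other.name, other.value)
        return NotImplemented

    def __hash__(self):
        # Keep hashing consistent with __eq__.
        return hash((self.name, self.value))

    def __repr__(self):
        return "<%s.%s: %r>" % (self.owner, self.name, self.value)


color_red = Item("Color", "red", 1)
more_color_red = Item("MoreColor", "red", 1)   # the subclass's own copy

print(color_red is more_color_red)   # False: two distinct objects
print(color_red == more_color_red)   # True: same name and value
```

Code that compares items with `is` would therefore treat the two copies as different, while code that uses `==` would not notice the difference.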
--
~Ethan~

From ethan at stoneleaf.us Wed May 1 08:40:22 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 30 Apr 2013 23:40:22 -0700
Subject: [Python-Dev] PEP-435 reference implementation
In-Reply-To: <20130430224102.14c52460@anarchist>
References: <51802595.2040305@stoneleaf.us> <518097E5.70306@stoneleaf.us> <20130430224102.14c52460@anarchist>
Message-ID: <5180B8D6.8010606@stoneleaf.us>

On 04/30/2013 10:41 PM, Barry Warsaw wrote:
>
> What does it break if you remove the `if base._enum` check? I mean, can we be
> consenting adults here or not?

I removed the error and added a couple lines to EnumType.__getattr__, and now subclassing works as I think you are used to it working.

Very unsure on this change being permanent (I have no objections to it).

--
~Ethan~

From steve at pearwood.info Wed May 1 09:05:36 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 1 May 2013 17:05:36 +1000
Subject: [Python-Dev] PEP-435 reference implementation
In-Reply-To: <518097E5.70306@stoneleaf.us>
References: <51802595.2040305@stoneleaf.us> <518097E5.70306@stoneleaf.us>
Message-ID: <20130501070536.GA10347@ando>

On Tue, Apr 30, 2013 at 09:19:49PM -0700, Ethan Furman wrote:
> Latest code available at https://bitbucket.org/stoneleaf/aenum.
>
> --> class Color(Enum):
> ...     red = 1
> ...     green = 2
> ...     blue = 3

Ethan, you seem to be writing a completely new PEP in opposition to Barry's PEP 435. But your implementation doesn't seem to match what your proto-PEP claims.

Your proto-PEP (file enum.txt) says:

    ``Enum` - a valueless, unordered type. It's related integer value is merely to allow for database storage and selection from the enumerated class. An ``Enum`` will not compare equal with its integer value, but can compare equal to other enums of which it is a subclass.

but:

py> import aenum
py> class Color(aenum.Enum):
...     red = 1
...
py> Color.red == 1
True
py> type(Color.red) is int
True

So the implemented behaviour is completely different from the documented behaviour. What gives?

Given the vast number of words written about enum values being instances of the enum class, I'm surprised that your proto-PEP doesn't seem to mention one word about that. All it says is that enum values are singletons. (Unless I've missed something.)

--
Steven

From v+python at g.nevcal.com Wed May 1 09:09:49 2013
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Wed, 01 May 2013 00:09:49 -0700
Subject: [Python-Dev] PEP-435 reference implementation
In-Reply-To: <518097E5.70306@stoneleaf.us>
References: <51802595.2040305@stoneleaf.us> <518097E5.70306@stoneleaf.us>
Message-ID: <5180BFBD.9040900@g.nevcal.com>

On 4/30/2013 9:19 PM, Ethan Furman wrote:
> Latest code available at https://bitbucket.org/stoneleaf/aenum.
>
> --> class Color(Enum):
> ...     red = 1
> ...     green = 2
> ...     blue = 3
>
> Enum items are virtual attributes looked up by EnumType's __getattr__.
> The win here is that
>
> --> Color.red.green.blue
>
> no longer works. ;)

Color.red.green.blue not working seems like a win.

Still seems like it should be possible to look them up from a subclass, though.

--> class FancyColor( Color ):
...     'strikes my fancy'

--> FancyColor['red']
Color.red

>
> Subclassing an implemented Enum class now raises an error (is there a
> better word than 'implemented'?)
>
> --> class MoreColor(Color):
> ...     cyan = 4
> ...     magenta = 5
> ...     yellow = 6
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "./ref435.py", line 83, in __new__
>     raise EnumError("cannot subclass an implemented Enum class")
> ref435.EnumError: cannot subclass an implemented Enum class

Yes, I think raising an error is appropriate, if implementing Guido's "no subclass" comments, rather than treating what otherwise would look like enumeration settings as subclass attributes.
My suggested error wording would be "Cannot define additional enumeration items in a subclass". That allows for "original enumeration items" to be defined in a subclass, of course. And it isn't the subclassing that is disallowed, but the definition of more enumeration items that is disallowed. At least, I hope that is the case. Then, should consenting adults lift the restriction, there wouldn't be surprises in other code. From v+python at g.nevcal.com Wed May 1 09:19:05 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 01 May 2013 00:19:05 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: <20130430230827.062515ff@anarchist> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> Message-ID: <5180C1E9.4030100@g.nevcal.com> On 4/30/2013 11:08 PM, Barry Warsaw wrote: > On Apr 28, 2013, at 07:46 PM, Ethan Furman wrote: > >> and similarly, Enum behavior /should be/ (in my opinion ;) >> >> Season.AUTUMN is Season('AUTUMN') is Season(3) > I think you'll have a problem with this. flufl.enum did this, but it has an > inherent conflict, which is why we removed the getattr-like behavior. > > class Things(Enum): > foo = 'bar' > bar = 'foo' > > What does Things('foo') return? > > Note that it doesn't matter if that's spelled Things['foo']. > > Whether it's defined as lookup or instance "creation", you should only map > values to items, and not attribute names to items, and definitely not both. > Let getattr() do attribute name lookup just like normal. I agree that it is confusing to be able to index by either the name of the enum or its value, in the same method.
I discovered by experimentation, after reading the tests, that the current implementation prefers the names but will check values if the name is not found. But when there are conflicts (which would be confusing at best), the inability to look up some enumerations by value, because the one with that name is found first, would be even more confusing. Can Things('foo') look up by name and Things['foo'] look up by value? Or does that confuse things too? From pieter at nagel.co.za Wed May 1 09:32:28 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Wed, 01 May 2013 09:32:28 +0200 Subject: [Python-Dev] PEP 428: stat caching undesirable? Message-ID: <1367393548.2868.262.camel@basilisk> Hi all, I write as a Python lover for over 13 years who's always wanted something like PEP 428 in Python. I am concerned about the caching of stat() results as currently defined in the PEP. This means that all behaviour built on top of stat(), such as p.is_dir(), p.is_file(), p.st_size and the like can indefinitely hold on to stale data until restat() is called, and I consider this confusing. Perhaps in recognition of this, p.exists() is implemented differently, and it does restat() internally (although the PEP does not document this). If this behaviour is maintained, then at the very least this makes the API more complicated to document: some calls cache as a side effect, others update the cache as a side effect, and others, such as lstat(), don't cache at all. This also introduces a divergence of behaviour between os.path.isfile() and p.is_file() that is confusing and will also need to be documented. I'm concerned about scenarios like users of the library polling, for example, for some file to appear, and being confused about why the arguably more sloppy poll for p.exists() works while a poll for p.is_file(), which expresses intent better, never terminates.
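The polling scenario described above can be sketched with os.path.isfile(), which issues a fresh stat() call on every invocation, so the loop notices the file as soon as it appears. This is only a minimal illustration, not code from the PEP or its reference implementation; the helper name wait_for_file and its timeout/interval parameters are assumptions made up for the example.

```python
import os
import time


def wait_for_file(path, timeout=5.0, interval=0.05):
    """Poll until `path` is a regular file or `timeout` seconds elapse.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # os.path.isfile() calls os.stat() each time, so no stale
        # cached result can keep this loop from terminating.
        if os.path.isfile(path):
            return True
        time.sleep(interval)
    return False
```

A poll written against a path object that caches its first stat() result would need an explicit refresh inside the loop to behave the same way.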
In theory the caching mechanism could be further refined to only hold onto cached results for a limited amount of time, but I would argue this is unnecessary complexity, and caching should just be removed, along with restat(). Isn't the whole notion that stat() needs to be cached for performance reasons somewhat of a historical relic of older OS's and filesystem performance? AFAIK Linux already has stat() caching as a side-effect of the filesystem layer's metadata caching. How do Windows and Mac OS fare here? Are there benchmarks proving that this is serious enough to complicate the API? If the ability to cache stat() calls is deemed important enough, how about a different API where is_file(), is_dir() and the like are added as methods on the result object that stat() returns? Then one can hold onto a stat() result as a temporary object and ask it multiple questions without doing another OS call, and is_file() etc. on the Path object can be documented as being forwarders to the stat() result just as p.st_size is currently - except that I believe they should forward to a fresh, uncached stat() call every time. I write directly to this list instead of raising it with Antoine Pitrou in private just because I don't want to make extra work for him to first receive my feedback and then re-raise it on this list. If this is wrong or disrespectful, I apologize. -- Pieter Nagel From ncoghlan at gmail.com Wed May 1 11:15:58 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 1 May 2013 19:15:58 +1000 Subject: [Python-Dev] PEP 428: stat caching undesirable? In-Reply-To: <1367393548.2868.262.camel@basilisk> References: <1367393548.2868.262.camel@basilisk> Message-ID: On Wed, May 1, 2013 at 5:32 PM, Pieter Nagel wrote: > Isn't the whole notion that stat() need to be cached for performance > issues somewhat of a historical relic of older OS's and filesystem > performance? AFAIK linux already has stat() caching as a side-effect of > the filesystem layer's metadata caching. 
How does Windows and Mac OS > fare here? Are there benchmarks proving that this is serious enough to > complicate the API? System calls typically release the GIL in threaded code (due to the possibility the underlying filesystem may be a network mount), which ends up being painfully expensive. The repeated stat calls also appear to be one of the main reasons walkdir is so much slower than performing the same operations in a loop rather than using a generator pipeline as walkdir does (see http://walkdir.readthedocs.org), although I admit it was a year or two ago I made those comparisons, and it wasn't the most scientific of benchmarking efforts. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Wed May 1 12:18:21 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 1 May 2013 12:18:21 +0200 Subject: [Python-Dev] PEP 428: stat caching undesirable? References: <1367393548.2868.262.camel@basilisk> Message-ID: <20130501121821.093fa030@fsol> Hi, On Wed, 01 May 2013 09:32:28 +0200 Pieter Nagel wrote: > Hi all, > > I write as a python lover for over 13 years who's always wanted > something like PEP 428 in Python. > > I am concerned about the caching of stat() results as currently defined > in the PEP. This means that all behaviour built on top of stat(), such > as p.is_dir(), p.is_file(), p.st_size and the like can indefinitely hold > on to stale data until restat() is called, and I consider this > confusing. I understand it might be confusing. On the other hand, if you call is_dir() then is_file() (then perhaps another metadata-reading operation), and there's a race condition where the file is modified in-between, you could have pretty much nonsensical results, if not for the caching. > Isn't the whole notion that stat() need to be cached for performance > issues somewhat of a historical relic of older OS's and filesystem > performance? 
AFAIK linux already has stat() caching as a side-effect of > the filesystem layer's metadata caching. How does Windows and Mac OS > fare here? Are there benchmarks proving that this is serious enough to > complicate the API? Surprisingly enough, some network filesystems have rather bad stat() performance. This has been reported for years as an issue with Python's import machinery, until 3.3 added a caching scheme where stat() calls are no more issued for each and every path directory and each and every imported module. But as written above caching is also a matter of functionality. I'll let other people chime in. > If the ability to cache stat() calls is deemed important enough, how > about a different API where is_file(), is_dir() and the like are added > as methods on the result object that stat() returns? That's a good idea too. It isn't straightforward since os.stat() is implemented in C. Regards Antoine. From pieter at nagel.co.za Wed May 1 13:22:20 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Wed, 01 May 2013 13:22:20 +0200 Subject: [Python-Dev] PEP 428: stat caching undesirable? In-Reply-To: <20130501121821.093fa030@fsol> References: <1367393548.2868.262.camel@basilisk> <20130501121821.093fa030@fsol> Message-ID: <1367407340.2868.294.camel@basilisk> Antoine and Nick have convinced me that stat() calls can be a performance issue. I am still concerned about the best way to balance that against the need to keep the API simple, though. I'm still worried about the current behaviour that some path can answer True to is_file() in a long-running process just because it had been a file last week. In my experience there are use cases where most stat() calls one makes (including indirectly via is_file() and friends) want up-to-date data. There is also the risk of obtaining a Path object that already had its stat() value cached some time ago without your knowledge (i.e. 
if the Path was created for you by a walkdir-type function that in its turn also called is_file() before returning the result). And needing to precede each is_file() etc. call with a restat() call whose return value is not even used introduces undesirable temporal coupling between the restat() and is_file() calls. I see a few alternative solutions, not mutually exclusive: 1) Change the signature of stat(), and everything that indirectly uses stat(), to take an optional 'fresh' keyword argument (or some synonym). Then stat(fresh=True) becomes synonymous with the current restat(), and the latter can be removed. Queries like is_file(fresh=True) will be implemented by forwarding fresh to the underlying stat() call they are implemented on. What the default for 'fresh' should be can be debated, but I'd argue for the sake of naive code that fresh should default to True, and then code that is aware of stat() caching can use fresh=False as required. 2) The root of the issue is keeping the cached stat() value indefinitely. Therefore, limit the duration for which the cached value is valid. The challenge is to find a way to express how long the value should be cached, without needing to call time.monotonic() or the like, which presumably are also OS calls that will release the GIL. One way would be to compute the number of virtual machine instructions executed since the stat() call was cached, and set the limit there. Is that still possible, now that sys.setcheckinterval() has been gutted? 3) Leave it up to performance-critical code, such as the import machinery, or walkdirs that Nick mentioned, to do their own caching, and simplify the filepath API for the simple case. But one can still make life easier for code like that, by adding is_file() and friends on the stat result object as I suggested. But this almost sounds like a PEP of its own, because although pathlib will benefit from it, it is actually an orthogonal issue.
It raises all kinds of issues: should the signature be statresult.isfile() to match os.path, or statresult.is_file() to match PEP 428? -- Pieter Nagel From cf.natali at gmail.com Wed May 1 14:50:07 2013 From: cf.natali at gmail.com (Charles-François Natali) Date: Wed, 1 May 2013 14:50:07 +0200 Subject: [Python-Dev] PEP 428: stat caching undesirable? In-Reply-To: <1367407340.2868.294.camel@basilisk> References: <1367393548.2868.262.camel@basilisk> <20130501121821.093fa030@fsol> <1367407340.2868.294.camel@basilisk> Message-ID: > 3) Leave it up to performance critical code, such as the import > machinery, or walkdirs that Nick mentioned, to do their own caching, and > simplify the filepath API for the simple case. > > But one can still make life easier for code like that, by adding > is_file() and friends on the stat result object as I suggested. +1 from me. PEP 428 goes in the right direction with a distinction between "pure" paths and "concrete" paths. Pure paths support syntactic operations, whereas I would expect concrete paths to actually access the file system. Having a method like restat() is a hint that something's wrong; I'm convinced this will bite some people. I'd also be in favor of having a wrapper class around the os.stat() result which would export utility methods such as is_file()/is_directory() and owner/group etc. attributes. That way, the default behavior would be correct, and this helper class would make it easier for users like walkdir() to implement their own caching. As an added benefit, this would make path objects actually immutable, which is always a good thing (simpler, and you get thread-safety for free). 
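The wrapper idea in the message above might look roughly like the following. This is a sketch under stated assumptions, not anything from PEP 428 or the standard library: the class name StatSnapshot, its method names and the size property are hypothetical and chosen only for illustration. It relies solely on os.stat() and the predicates in the stat module.

```python
import os
import stat


class StatSnapshot:
    """Hypothetical wrapper around the result of a single os.stat() call."""

    def __init__(self, path):
        # Exactly one OS call; every query below reuses this snapshot,
        # so the answers are mutually consistent but may go stale.
        self._st = os.stat(path)

    def is_file(self):
        return stat.S_ISREG(self._st.st_mode)

    def is_dir(self):
        return stat.S_ISDIR(self._st.st_mode)

    @property
    def size(self):
        return self._st.st_size


# Usage: the caller decides how long to keep the snapshot around.
snap = StatSnapshot(os.curdir)
print(snap.is_dir(), snap.is_file())
```

With this shape the caching is explicit: holding on to the snapshot object is the cache, and calling StatSnapshot(path) again is the refresh, so the path object itself carries no mutable state.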
From ethan at stoneleaf.us Wed May 1 15:16:47 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 01 May 2013 06:16:47 -0700 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: <20130501070536.GA10347@ando> References: <51802595.2040305@stoneleaf.us> <518097E5.70306@stoneleaf.us> <20130501070536.GA10347@ando> Message-ID: <518115BF.8010703@stoneleaf.us> On 05/01/2013 12:05 AM, Steven D'Aprano wrote: > On Tue, Apr 30, 2013 at 09:19:49PM -0700, Ethan Furman wrote: >> Latest code available at https://bitbucket.org/stoneleaf/aenum. > > Ethan, you seem to be writing a completely new PEP in opposition to > Barry's PEP 435. But your implementation doesn't seem to match what your > proto-PEP claims. aenum.py was my original competitor for enums; the files you should be paying attention to are the *ref435* files, as stated in the original post. (The stars are for pattern matching, not bolding.) Apologies for any confusion. -- ~Ethan~ From ncoghlan at gmail.com Wed May 1 15:54:10 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 1 May 2013 23:54:10 +1000 Subject: [Python-Dev] PEP 428: stat caching undesirable? In-Reply-To: References: <1367393548.2868.262.camel@basilisk> <20130501121821.093fa030@fsol> <1367407340.2868.294.camel@basilisk> Message-ID: On 1 May 2013 22:58, "Charles-François Natali" wrote: > > > 3) Leave it up to performance critical code, such as the import > > machinery, or walkdirs that Nick mentioned, to do their own caching, and > > simplify the filepath API for the simple case. > > > > But one can still make life easier for code like that, by adding > > is_file() and friends on the stat result object as I suggested. > > +1 from me. > PEP 428 goes in the right direction with a distinction between "pure" > path and "concrete" path. Pure path support syntactic operations, > whereas I would expect concrete paths to actually access the file > system.
Having a method like restat() is a hint that something's > wrong, I'm convinced this will bite some people. > > I'm also be in favor of having a wrapper class around os.stat() result > which would export utility methods such as is_file()/is_directory() > and owner/group, etc attributes. > > That way, the default behavior would be correct, and this helper class > would make it easier for users like walkdir() to implement their own > caching. Walkdir is deliberately built as a decoupled pipeline modelled on os.walk - the only way it can really benefit from caching without destroying the API is if the caching is built into the underlying path objects that are then passed through the pipeline stages. However, I like the idea of a rich "stat" object, with "path.stat()" and "path.cached_stat()" accessors on the path objects. Up to date data by default, easy caching for use cases that need it without needing to pass the stat data around separately. Cheers, Nick. > > As an added benefit, this would make path objects actually immutable, > which is always a good thing (simpler, and you get thread-safety for > free). > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com From guido at python.org Wed May 1 16:39:58 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 1 May 2013 07:39:58 -0700 Subject: [Python-Dev] PEP 428: stat caching undesirable? In-Reply-To: References: <1367393548.2868.262.camel@basilisk> <20130501121821.093fa030@fsol> <1367407340.2868.294.camel@basilisk> Message-ID: I've not got the full context, but I would like to make it *very* clear in the API (e.g.
through naming of the methods) when you are getting a possibly cached result from stat(), and I would be very concerned if existing APIs were going to get caching behavior. For every use cases that benefits from caching there's a complementary use case that caching breaks. Since both use cases are important we must offer both APIs, in a way that makes it clear to even the casual reader of the code what's going on. -- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Wed May 1 17:09:20 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 01 May 2013 08:09:20 -0700 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: <51802595.2040305@stoneleaf.us> References: <51802595.2040305@stoneleaf.us> Message-ID: <51813020.2040508@stoneleaf.us> New repo to avoid confusion: https://bitbucket.org/stoneleaf/ref435 which has the latest updates from the feedback. Subclassing is again disabled. Let's get the rest of it done, then we can come back to that issue if necessary. -- ~Ethan~ From steve at pearwood.info Wed May 1 17:35:32 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 02 May 2013 01:35:32 +1000 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: <51813020.2040508@stoneleaf.us> References: <51802595.2040305@stoneleaf.us> <51813020.2040508@stoneleaf.us> Message-ID: <51813644.8040707@pearwood.info> On 02/05/13 01:09, Ethan Furman wrote: > New repo to avoid confusion: > > https://bitbucket.org/stoneleaf/ref435 Apparently I have to log in before I can even see the repo. Not going to happen. 
-- Steven From guido at python.org Wed May 1 17:39:42 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 1 May 2013 08:39:42 -0700 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: <51813644.8040707@pearwood.info> References: <51802595.2040305@stoneleaf.us> <51813020.2040508@stoneleaf.us> <51813644.8040707@pearwood.info> Message-ID: I can see it just fine without logging in, even in an Incognito Chrome window. On Wed, May 1, 2013 at 8:35 AM, Steven D'Aprano wrote: > On 02/05/13 01:09, Ethan Furman wrote: >> >> New repo to avoid confusion: >> >> https://bitbucket.org/stoneleaf/ref435 > > > Apparently I have to log in before I can even see the repo. > > Not going to happen. > > > > -- > Steven -- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Wed May 1 17:40:46 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 01 May 2013 08:40:46 -0700 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: <51813644.8040707@pearwood.info> References: <51802595.2040305@stoneleaf.us> <51813020.2040508@stoneleaf.us> <51813644.8040707@pearwood.info> Message-ID: <5181377E.50304@stoneleaf.us> On 05/01/2013 08:35 AM, Steven D'Aprano wrote: > On 02/05/13 01:09, Ethan Furman wrote: >> New repo to avoid confusion: >> >> https://bitbucket.org/stoneleaf/ref435 > > Apparently I have to log in before I can even see the repo. > > Not going to happen. Sorry, just made it public. Try again?
-- ~Ethan~ From barry at python.org Wed May 1 17:44:32 2013 From: barry at python.org (Barry Warsaw) Date: Wed, 1 May 2013 08:44:32 -0700 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: <5180AD27.3090306@stoneleaf.us> References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com> <518046FC.1000100@stoneleaf.us> <518058A0.9040501@stoneleaf.us> <5180807D.9090707@g.nevcal.com> <20130430214751.023ef767@anarchist> <5180AD27.3090306@stoneleaf.us> Message-ID: <20130501084432.246b4dbd@anarchist> On Apr 30, 2013, at 10:50 PM, Ethan Furman wrote: >The way I had subclassing working originally was for the subclass to create >it's own versions of the superclass' enum items -- they weren't the same >object, but they were equal: > >--> class Color(Enum): >... red = 1 >... green = 2 >... blue = 3 > >--> class MoreColor(Color): >... cyan = 4 >... magenta = 5 >... yellow = 6 > >--> Color.red is MoreColor.red >False > >--> Color.red == MoreColor.red >True > >If you switched from `is` to `==` would this work for you? Not really, because in practice you don't compare one enum against another explicitly. You have a value in a variable and you're comparing against a literal enum. So `is` is still the more natural spelling. My point is, if you want enums to behave more class-like because you're using the class syntax, then you shouldn't explicitly break this one class-like behavior just to protect some users from themselves. There doesn't even seem to be an easy way to override the default behavior if you really wanted to do it. -Barry From barry at python.org Wed May 1 17:47:55 2013 From: barry at python.org (Barry Warsaw) Date: Wed, 1 May 2013 08:47:55 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? 
In-Reply-To: <5180C1E9.4030100@g.nevcal.com> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> Message-ID: <20130501084755.04f44a4f@anarchist> On May 01, 2013, at 12:19 AM, Glenn Linderman wrote: >Can Things('foo') lookup by name and Things['foo'] lookup by value? Or does >that confuse things too? I think it confuses things too much. Why isn't getattr() for lookup by name good enough? It is for regular classes. -Barry From guido at python.org Wed May 1 18:14:12 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 1 May 2013 09:14:12 -0700 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: <20130501084432.246b4dbd@anarchist> References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com> <518046FC.1000100@stoneleaf.us> <518058A0.9040501@stoneleaf.us> <5180807D.9090707@g.nevcal.com> <20130430214751.023ef767@anarchist> <5180AD27.3090306@stoneleaf.us> <20130501084432.246b4dbd@anarchist> Message-ID: Personally I would probably compare enums using ==, but I agree that 'is' should also work -- since the instances are predefined there's no reason to ever have multiple equivalent instances, so we might as well guarantee it. I'm sorry that my requirements for the relationship between the enum class and its values ends up forcing the decision not to allow subclasses (and I really mean *no* subclasses, not just no subclasses that add new values), but after thinking it all over I still think this is the right way forward. Something has got to give, and I think that disallowing subclasses is better than having the isinstance relationships be inverted or having to test for enum-ness using something other than isinstance. 
I guess the only way to change my mind at this point would be to come up with overwhelming evidence that subclassing enums is a very useful feature without which enums are pretty much useless. But we'd probably have to give up something else, e.g. adding methods to enums, or any hope that the instance/class/subclass relationships make any sense. Contravariance sucks. On Wed, May 1, 2013 at 8:44 AM, Barry Warsaw wrote: > On Apr 30, 2013, at 10:50 PM, Ethan Furman wrote: > >>The way I had subclassing working originally was for the subclass to create >>it's own versions of the superclass' enum items -- they weren't the same >>object, but they were equal: >> >>--> class Color(Enum): >>... red = 1 >>... green = 2 >>... blue = 3 >> >>--> class MoreColor(Color): >>... cyan = 4 >>... magenta = 5 >>... yellow = 6 >> >>--> Color.red is MoreColor.red >>False >> >>--> Color.red == MoreColor.red >>True >> >>If you switched from `is` to `==` would this work for you? > > Not really, because in practice you don't compare one enum against another > explicitly. You have a value in a variable and you're comparing against a > literal enum. So `is` is still the more natural spelling. > > My point is, if you want enums to behave more class-like because you're using > the class syntax, then you shouldn't explicitly break this one class-like > behavior just to protect some users from themselves. There doesn't even seem > to be an easy way to override the default behavior if you really wanted to do > it. 
> > -Barry -- --Guido van Rossum (python.org/~guido) From g.brandl at gmx.net Wed May 1 18:16:29 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 01 May 2013 18:16:29 +0200 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: <51813644.8040707@pearwood.info> References: <51802595.2040305@stoneleaf.us> <51813020.2040508@stoneleaf.us> <51813644.8040707@pearwood.info> Message-ID: On 01.05.2013 17:35, Steven D'Aprano wrote: > On 02/05/13 01:09, Ethan Furman wrote: >> New repo to avoid confusion: >> >> https://bitbucket.org/stoneleaf/ref435 > > Apparently I have to log in before I can even see the repo. > > Not going to happen. I'm sure he made the repo private by accident just to keep you out. Georg From christian at python.org Wed May 1 18:22:34 2013 From: christian at python.org (Christian Heimes) Date: Wed, 01 May 2013 18:22:34 +0200 Subject: [Python-Dev] PEP 428: stat caching undesirable? In-Reply-To: References: <1367393548.2868.262.camel@basilisk> <20130501121821.093fa030@fsol> <1367407340.2868.294.camel@basilisk> Message-ID: <5181414A.7000609@python.org> On 01.05.2013 16:39, Guido van Rossum wrote: > I've not got the full context, but I would like to make it *very* > clear in the API (e.g. through naming of the methods) when you are > getting a possibly cached result from stat(), and I would be very > concerned if existing APIs were going to get caching behavior. For > every use cases that benefits from caching there's a complementary use > case that caching breaks. Since both use cases are important we must > offer both APIs, in a way that makes it clear to even the casual > reader of the code what's going on. I consider caching of stat calls problematic.
The correct and contemporary result of a stat() call has security implications, too. For example stat() is used to prevent TOCTOU race conditions such as [1]. Caching is useful but I would prefer explicit caching rather than implicit and automatic caching of stat() results. We can get a greater speed up for walkdir() without resorting to caching, too. Some operating systems and file system report the file type in the dirent struct that is returned by readdir(). This reduces the number of stat calls to zero. Christian [1] https://www.securecoding.cert.org/confluence/display/seccode/POS01-C.+Check+for+the+existence+of+links+when+dealing+with+files From tseaver at palladion.com Wed May 1 18:18:11 2013 From: tseaver at palladion.com (Tres Seaver) Date: Wed, 01 May 2013 12:18:11 -0400 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com> <518046FC.1000100@stoneleaf.us> <518058A0.9040501@stoneleaf.us> <5180807D.9090707@g.nevcal.com> <20130430214751.023ef767@anarchist> <5180AD27.3090306@stoneleaf.us> <20130501084432.246b4dbd@anarchist> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/01/2013 12:14 PM, Guido van Rossum wrote: > But we'd probably have to give up something else, e.g. adding methods > to enums, or any hope that the instance/class/subclass relationships > make any sense. I'd be glad to drop both of those in favor of subclassing: I think the emphasis on "class-ness" makes no sense, given the driving usecases for adopting enums into the stdlib in the first place. IOW, I would vote that real-world usecases trump hypothetical purity. Tres. 
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iEYEARECAAYFAlGBQEMACgkQ+gerLs4ltQ6myQCZAZqKCR/6H6I8bogHtSwhTM9I ok8AnjBKfFyuse6caMF085wBHvlrf0uA =nJ5C -----END PGP SIGNATURE----- From g.brandl at gmx.net Wed May 1 18:29:32 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 01 May 2013 18:29:32 +0200 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: <51813020.2040508@stoneleaf.us> References: <51802595.2040305@stoneleaf.us> <51813020.2040508@stoneleaf.us> Message-ID: Am 01.05.2013 17:09, schrieb Ethan Furman: > New repo to avoid confusion: > > https://bitbucket.org/stoneleaf/ref435 > > which has the latest updates from the feedback. > > Subclassing is again disabled. Let's get the rest of it done, then we can > come back to that issue if necessary. Thanks. I'm reviewing the code and adding comments to https://bitbucket.org/stoneleaf/ref435/commits/4d2c4b94cdd35022a8a3e50554794f4a1c956e46 Georg From guido at python.org Wed May 1 18:43:33 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 1 May 2013 09:43:33 -0700 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com> <518046FC.1000100@stoneleaf.us> <518058A0.9040501@stoneleaf.us> <5180807D.9090707@g.nevcal.com> <20130430214751.023ef767@anarchist> <5180AD27.3090306@stoneleaf.us> <20130501084432.246b4dbd@anarchist> Message-ID: On Wed, May 1, 2013 at 9:18 AM, Tres Seaver wrote: > I'd be glad to drop both of those in favor of subclassing: I think the > emphasis on "class-ness" makes no sense, given the driving usecases for > adopting enums into the stdlib in the first place. 
IOW, I would vote > that real-world usecases trump hypothetical purity. Yeah, this is the dilemma. But what *are* the real-world use cases? Please provide some. Here's how I would implement "extending" an enum if subclassing were not allowed: class Color(Enum): red = 1 white = 2 blue = 3 class ExtraColor(Enum): orange = 4 yellow = 5 green = 6 flag_colors = set(Color) | set(ExtraColor) Now I can test "c in flag_colors" to check whether c is a flag color. I can also loop over flag_colors. If I want the colors in definition order I could use a list instead: ordered_flag_colors = list(Color) + list(ExtraColor) But this would be less or more acceptable depending on whether it is a common or esoteric use case. -- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Wed May 1 19:21:30 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 01 May 2013 10:21:30 -0700 Subject: [Python-Dev] Enum: subclassing? Message-ID: <51814F1A.1030104@stoneleaf.us> We may not want to /completely/ disallow subclassing. Consider: --> class StrEnum(str, Enum): ... '''string enums for Business Basic variable names''' ... --> class Vendors(StrEnum): EnumError: subclassing not allowed My point is that IntEnum, StrEnum, ListEnum, FloatEnum are all "subclasses" of Enum. To then have a subclass of that, such as Season(StrEnum), is subclassing a subclass. Now, if we do want to completely disallow it, we can ditch IntEnum and force the user to always specify the mixin type: --> class Season(str, Enum): . . . --> class Names(str, Enum): . . . But that's not very user friendly... although it's not too bad, either. One consequence of the way it is now (IntEnum, StrEnum, etc., are allowed) is that one can put methods and other non-Enum item in a base class and then inherit from that for actual implemented Enum classes. --> class StrEnum(str, Enum): ... def describe(self): ... print("Hi! I'm a %s widget!" % self.value) ... --> class Season(StrEnum): ... spring = 'green' ... 
summer = 'brown' ... autumn = 'red' ... winter = 'white' ... --> class Planet(StrEnum): ... mars = 'red' ... earth = 'blue' ... --> Season.summer.describe() Hi! I'm a brown widget! --> Planet.earth.describe() Hi! I'm a blue widget! -- ~Ethan~ From guido at python.org Wed May 1 19:48:45 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 1 May 2013 10:48:45 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: <51814F1A.1030104@stoneleaf.us> References: <51814F1A.1030104@stoneleaf.us> Message-ID: On Wed, May 1, 2013 at 10:21 AM, Ethan Furman wrote: > We may not want to /completely/ disallow subclassing. Consider: > > --> class StrEnum(str, Enum): > ... '''string enums for Business Basic variable names''' > ... > --> class Vendors(StrEnum): > EnumError: subclassing not allowed > > > My point is that IntEnum, StrEnum, ListEnum, FloatEnum are all "subclasses" > of Enum. To then have a subclass of > that, such as Season(StrEnum), is subclassing a subclass. True, and Enum itself also falls in this category. Maybe there could be a special marker that you have to set in the class body (or a keyword arg in the class statement) to flag that a class is meant as a "category of enums" rather than a specific enum type. Such categorical classes should not define any instances. (And maybe "defines no instances" is enough to flag an Enum class as subclassable.) > Now, if we do want to completely disallow it, we can ditch IntEnum and force > the user to always specify the mixin > type: > > --> class Season(str, Enum): > . > . > . > > --> class Names(str, Enum): > . > . > . > > But that's not very user friendly... although it's not too bad, either. Indeed, given that we mostly want IntEnum as a last-resort backward compatibility thing for os and socket, it may not be so bad.
> One consequence of the way it is now (IntEnum, StrEnum, etc., are allowed) > is that one can put methods and other non-Enum items in a base class and then > inherit from that for actual implemented Enum classes. > > --> class StrEnum(str, Enum): > ... def describe(self): > ... print("Hi! I'm a %s widget!" % self.value) > ... > > --> class Season(StrEnum): > ... spring = 'green' > ... summer = 'brown' > ... autumn = 'red' > ... winter = 'white' > ... > > --> class Planet(StrEnum): > ... mars = 'red' > ... earth = 'blue' > ... > > --> Season.summer.describe() > Hi! I'm a brown widget! > > --> Planet.earth.describe() > Hi! I'm a blue widget! If the base class doesn't define any instances (and perhaps is marked specifically for this purpose) I'm fine with that. -- --Guido van Rossum (python.org/~guido) From eliben at gmail.com Wed May 1 20:04:32 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 1 May 2013 11:04:32 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> Message-ID: On Wed, May 1, 2013 at 10:48 AM, Guido van Rossum wrote: > On Wed, May 1, 2013 at 10:21 AM, Ethan Furman wrote: > > We may not want to /completely/ disallow subclassing. Consider: > > > > --> class StrEnum(str, Enum): > > ... '''string enums for Business Basic variable names''' > > ... > > --> class Vendors(StrEnum): > > EnumError: subclassing not allowed > > > > > > My point is that IntEnum, StrEnum, ListEnum, FloatEnum are all > "subclasses" > > of Enum. To then have a subclass of > > that, such as Season(StrEnum), is subclassing a subclass. > > True, and Enum itself also falls in this category. Maybe there could > be a special marker that you have to set in the class body (or a > keyword arg in the class statement) to flag that a class is meant as a > "category of enums" rather than a specific enum type. Such categorical > classes should not define any instances.
(And maybe "defines no > instances" is enough to flag an Enum class as subclassable.) > > > Now, if we do want to completely disallow it, we can ditch IntEnum and > force > > the user to always specify the mixin > > type: > > > > --> class Season(str, Enum): > > . > > . > > . > > > > --> class Names(str, Enum): > > . > > . > > . > > > > But that's not very user friendly... although it's not too bad, either. > > Indeed, given that we mostly want IntEnum as a last-resort backward > compatibility thing for os and socket, it may not be so bad. > > Actually, in flufl.enum, IntEnum had to define a magic __value_factory__ attribute, but in the current ref435 implementation this isn't needed, so IntEnum is just: class IntEnum(int, Enum): ''' Class where every instance is a subclass of int. ''' So why don't we just drop IntEnum from the API and tell users they should do the above explicitly, i.e.: class SocketFamily(int, Enum): AF_UNIX = 1 AF_INET = 2 As opposed to having an IntEnum explicitly, this just saves 2 characters (comma+space), but is more explicit (zen!) and helps us avoid the special-casing the subclass restriction implementation. Eli From guido at python.org Wed May 1 20:14:36 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 1 May 2013 11:14:36 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> Message-ID: On Wed, May 1, 2013 at 11:04 AM, Eli Bendersky wrote: > Actually, in flufl.enum, IntEnum had to define a magic __value_factory__ > attribute, but in the current ref435 implementation this isn't needed, so > IntEnum is just: > > class IntEnum(int, Enum): > ''' > Class where every instance is a subclass of int.
> ''' > > So why don't we just drop IntEnum from the API and tell users they should do > the above explicitly, i.e.: > > class SocketFamily(int, Enum): > AF_UNIX = 1 > AF_INET = 2 > > As opposed to having an IntEnum explicitly, this just saves 2 characters > (comma+space), but is more explicit (zen!) and helps us avoid the > special-casing the subclass restriction implementation. Sounds good to me. -- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Wed May 1 20:44:39 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 1 May 2013 20:44:39 +0200 Subject: [Python-Dev] Enum: subclassing? References: <51814F1A.1030104@stoneleaf.us> Message-ID: <20130501204439.01cfd6fd@fsol> On Wed, 01 May 2013 10:21:30 -0700 Ethan Furman wrote: > We may not want to /completely/ disallow subclassing. Consider: > > --> class StrEnum(str, Enum): > ... '''string enums for Business Basic variable names''' > ... > --> class Vendors(StrEnum): > EnumError: subclassing not allowed I don't see the point of disallowing subclassing. It sounds like a pointless restriction. However, perhaps the constructor should forbid the returning of a base type, e.g.: class Season(Enum): spring = 1 class MySeason(Season): """I look nicer than Season""" MySeason('spring') ... ValueError: Season.spring is not a MySeason instance (what this means is perhaps the subclassing of non-empty enum classes should be forbidden) Regards Antoine. > > > My point is that IntEnum, StrEnum, ListEnum, FloatEnum are all "subclasses" of Enum. To then have a subclass of > that, such as Season(StrEnum), is subclassing a subclass. > > Now, if we do want to completely disallow it, we can ditch IntEnum and force the user to always specify the mixin > type: > > --> class Season(str, Enum): > . > . > . > > --> class Names(str, Enum): > . > . > . > > But that's not very user friendly... although it's not too bad, either. 
> > One consequence of the way it is now (IntEnum, StrEnum, etc., are allowed) is that one can put methods and other > non-Enum items in a base class and then inherit from that for actual implemented Enum classes. > > --> class StrEnum(str, Enum): > ... def describe(self): > ... print("Hi! I'm a %s widget!" % self.value) > ... > > --> class Season(StrEnum): > ... spring = 'green' > ... summer = 'brown' > ... autumn = 'red' > ... winter = 'white' > ... > > --> class Planet(StrEnum): > ... mars = 'red' > ... earth = 'blue' > ... > > --> Season.summer.describe() > Hi! I'm a brown widget! > > --> Planet.earth.describe() > Hi! I'm a blue widget! > > -- > ~Ethan~ From g.brandl at gmx.net Wed May 1 20:47:19 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 01 May 2013 20:47:19 +0200 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> Message-ID: On 01.05.2013 20:04, Eli Bendersky wrote: > Actually, in flufl.enum, IntEnum had to define a magic __value_factory__ > attribute, but in the current ref435 implementation this isn't needed, so > IntEnum is just: > > class IntEnum(int, Enum): > ''' > Class where every instance is a subclass of int. > ''' > > So why don't we just drop IntEnum from the API and tell users they should do the > above explicitly, i.e.: > > class SocketFamily(int, Enum): > AF_UNIX = 1 > AF_INET = 2 > > As opposed to having an IntEnum explicitly, this just saves 2 characters > (comma+space), but is more explicit (zen!) and helps us avoid the special-casing > the subclass restriction implementation. Wait a moment... it might not be immediately useful for IntEnums (however, that's because base Enum currently defines __int__ which I find questionable), but with current ref435 you *can* create your own enum base classes with your own methods, and derive concrete enums from that. It also lets you have a base class for enums and use it in isinstance().
If you forbid subclassing completely that will be impossible. Georg From larry at hastings.org Wed May 1 20:54:00 2013 From: larry at hastings.org (Larry Hastings) Date: Wed, 01 May 2013 11:54:00 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: <5180B635.7000904@stoneleaf.us> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <517DF0C7.7080905@stoneleaf.us> <517E1828.1080802@stoneleaf.us> <20130430231843.341c7659@anarchist> <5180B635.7000904@stoneleaf.us> Message-ID: <518164C8.2090204@hastings.org> On 04/30/2013 11:29 PM, Ethan Furman wrote: > On 04/30/2013 11:18 PM, Barry Warsaw wrote: >> On Apr 28, 2013, at 11:50 PM, Ethan Furman wrote: >> >>> But as soon as: >>> >>> type(Color.red) is Color # True >>> type(MoreColor.red) is MoreColor # True >>> >>> then: >>> >>> Color.red is MoreColor.red # must be False, no? >>> >>> >>> If that last statement can still be True, I'd love it if someone >>> showed me >>> how. >> >> class Foo: >> a = object() >> b = object() >> >> class Bar(Foo): >> c = object() >> >>>>> Foo.a is Bar.a >> True > > Wow. I think I'm blushing from embarrassment. > > Thank you for answering my question, Barry. Wait, what? I don't see how Barry's code answers your question. In his example, type(a) == type(b) == type(c) == object. You were asking "how can Color.red and MoreColor.red be the same object if they are of different types?" p.s. They can't. //arry/ From g.brandl at gmx.net Wed May 1 20:59:15 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 01 May 2013 20:59:15 +0200 Subject: [Python-Dev] Enum: subclassing?
In-Reply-To: <20130501204439.01cfd6fd@fsol> References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> Message-ID: Am 01.05.2013 20:44, schrieb Antoine Pitrou: > On Wed, 01 May 2013 10:21:30 -0700 > Ethan Furman wrote: >> We may not want to /completely/ disallow subclassing. Consider: >> >> --> class StrEnum(str, Enum): >> ... '''string enums for Business Basic variable names''' >> ... >> --> class Vendors(StrEnum): >> EnumError: subclassing not allowed > > I don't see the point of disallowing subclassing. It sounds like > a pointless restriction. > > However, perhaps the constructor should forbid the returning of a base > type, e.g.: > > class Season(Enum): > spring = 1 > > class MySeason(Season): > """I look nicer than Season""" > > MySeason('spring') > ... > ValueError: Season.spring is not a MySeason instance > > (what this means is perhaps the subclassing of non-empty enum classes > should be forbidden) That's exactly what's implemented in the ref435 code at the moment. Georg From eliben at gmail.com Wed May 1 22:05:53 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 1 May 2013 13:05:53 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> Message-ID: On Wed, May 1, 2013 at 11:59 AM, Georg Brandl wrote: > Am 01.05.2013 20:44, schrieb Antoine Pitrou: > > On Wed, 01 May 2013 10:21:30 -0700 > > Ethan Furman wrote: > >> We may not want to /completely/ disallow subclassing. Consider: > >> > >> --> class StrEnum(str, Enum): > >> ... '''string enums for Business Basic variable names''' > >> ... > >> --> class Vendors(StrEnum): > >> EnumError: subclassing not allowed > > > > I don't see the point of disallowing subclassing. It sounds like > > a pointless restriction. 
> > > > However, perhaps the constructor should forbid the returning of a base > > type, e.g.: > > > > class Season(Enum): > > spring = 1 > > > > class MySeason(Season): > > """I look nicer than Season""" > > > > MySeason('spring') > > ... > > ValueError: Season.spring is not a MySeason instance > > > > (what this means is perhaps the subclassing of non-empty enum classes > > should be forbidden) > > That's exactly what's implemented in the ref435 code at the moment. > > It can't be because __call__ is by-value lookup, not by-name lookup. By-name lookup is Season.spring or getattr(Season, 'spring') Eli From eliben at gmail.com Wed May 1 22:09:53 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 1 May 2013 13:09:53 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> Message-ID: On Wed, May 1, 2013 at 11:47 AM, Georg Brandl wrote: > On 01.05.2013 20:04, Eli Bendersky wrote: > > > Actually, in flufl.enum, IntEnum had to define a magic __value_factory__ > > attribute, but in the current ref435 implementation this isn't needed, so > > IntEnum is just: > > > > class IntEnum(int, Enum): > > ''' > > Class where every instance is a subclass of int. > > ''' > > > > So why don't we just drop IntEnum from the API and tell users they > should do the > > above explicitly, i.e.: > > > > class SocketFamily(int, Enum): > > AF_UNIX = 1 > > AF_INET = 2 > > > > As opposed to having an IntEnum explicitly, this just saves 2 characters > > (comma+space), but is more explicit (zen!) and helps us avoid the > special-casing > > the subclass restriction implementation. > > Wait a moment...
it might not be immediately useful for IntEnums (however, > that's because base Enum currently defines __int__ which I find > questionable), > but with current ref435 you *can* create your own enum base classes with > your > own methods, and derive concrete enums from that. It also lets you have a > base class for enums and use it in isinstance(). > > If you forbid subclassing completely that will be impossible. > I'm not sure what you mean, Georg, could you clarify? This works: >>> from ref435 import Enum >>> class SocketFamily(int, Enum): ... AF_UNIX = 1 ... AF_INET = 2 ... >>> SocketFamily.AF_INET SocketFamily.AF_INET [value=2] >>> SocketFamily.AF_INET == 2 True >>> type(SocketFamily.AF_INET) >>> isinstance(SocketFamily.AF_INET, SocketFamily) True Now, with the way things are currently implemented, class IntEnum is just syntactic sugar for the above. Guido decided against allowing any kind of subclassing, but as an implementation need we should keep some restricted form to implement IntEnum. But is IntEnum really needed if the above explicit multiple-inheritance of int and Enum is possible? Eli From solipsis at pitrou.net Wed May 1 22:33:44 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 1 May 2013 22:33:44 +0200 Subject: [Python-Dev] Enum: subclassing? References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> Message-ID: <20130501223344.6899e9a6@fsol> On Wed, 1 May 2013 13:05:53 -0700 Eli Bendersky wrote: > On Wed, May 1, 2013 at 11:59 AM, Georg Brandl wrote: > > > On 01.05.2013 20:44, Antoine Pitrou wrote: > > > On Wed, 01 May 2013 10:21:30 -0700 > > > Ethan Furman wrote: > > >> We may not want to /completely/ disallow subclassing. Consider: > > >> > > >> --> class StrEnum(str, Enum): > > >> ... '''string enums for Business Basic variable names''' > > >> ...
> > >> --> class Vendors(StrEnum): > > >> EnumError: subclassing not allowed > > > > > > I don't see the point of disallowing subclassing. It sounds like > > > a pointless restriction. > > > > > > However, perhaps the constructor should forbid the returning of a base > > > type, e.g.: > > > > > > class Season(Enum): > > > spring = 1 > > > > > > class MySeason(Season): > > > """I look nicer than Season""" > > > > > > MySeason('spring') > > > ... > > > ValueError: Season.spring is not a MySeason instance > > > > > > (what this means is perhaps the subclassing of non-empty enum classes > > > should be forbidden) > > > > That's exactly what's implemented in the ref435 code at the moment. > > > > > It can't be because __call__ is by-value lookup, not by-name lookup. Ok, I've mixed up the example. But, still, since Season(1) should return the Season.spring singleton, I don't see any reasonable thing for MySeason(1) to return. Hence the request to raise an exception. Regards Antoine. From eliben at gmail.com Wed May 1 22:43:22 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 1 May 2013 13:43:22 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: <20130501223344.6899e9a6@fsol> References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> <20130501223344.6899e9a6@fsol> Message-ID: On Wed, May 1, 2013 at 1:33 PM, Antoine Pitrou wrote: > On Wed, 1 May 2013 13:05:53 -0700 > Eli Bendersky wrote: > > On Wed, May 1, 2013 at 11:59 AM, Georg Brandl wrote: > > > > > Am 01.05.2013 20:44, schrieb Antoine Pitrou: > > > > On Wed, 01 May 2013 10:21:30 -0700 > > > > Ethan Furman wrote: > > > >> We may not want to /completely/ disallow subclassing. Consider: > > > >> > > > >> --> class StrEnum(str, Enum): > > > >> ... '''string enums for Business Basic variable names''' > > > >> ... > > > >> --> class Vendors(StrEnum): > > > >> EnumError: subclassing not allowed > > > > > > > > I don't see the point of disallowing subclassing. 
It sounds like > > > > a pointless restriction. > > > > > > > > However, perhaps the constructor should forbid the returning of a > base > > > > type, e.g.: > > > > > > > > class Season(Enum): > > > > spring = 1 > > > > > > > > class MySeason(Season): > > > > """I look nicer than Season""" > > > > > > > > MySeason('spring') > > > > ... > > > > ValueError: Season.spring is not a MySeason instance > > > > > > > > (what this means is perhaps the subclassing of non-empty enum classes > > > > should be forbidden) > > > > > > That's exactly what's implemented in the ref435 code at the moment. > > > > > > > > It can't be because __call__ is by-value lookup, not by-name lookup. > > Ok, I've mixed up the example. But, still, since Season(1) should > return the Season.spring singleton, I don't see any reasonable thing > for MySeason(1) to return. Hence the request to raise an exception. > What do you need MySeason for, though? IIUC, you don't ask to allow adding enum values in it, so it only leaves adding extra functionality (methods)? What are the use cases? Eli From solipsis at pitrou.net Wed May 1 22:45:53 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 1 May 2013 22:45:53 +0200 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> <20130501223344.6899e9a6@fsol> Message-ID: <20130501224553.740bab91@fsol> On Wed, 1 May 2013 13:43:22 -0700 Eli Bendersky wrote: > On Wed, May 1, 2013 at 1:33 PM, Antoine Pitrou wrote: > > > On Wed, 1 May 2013 13:05:53 -0700 > > Eli Bendersky wrote: > > > On Wed, May 1, 2013 at 11:59 AM, Georg Brandl wrote: > > > > > > > On 01.05.2013 20:44, Antoine Pitrou wrote: > > > > > On Wed, 01 May 2013 10:21:30 -0700 > > > > > Ethan Furman wrote: > > > > >> We may not want to /completely/ disallow subclassing. Consider: > > > > >> > > > > >> --> class StrEnum(str, Enum): > > > > >> ...
'''string enums for Business Basic variable names''' > > > > >> ... > > > > >> --> class Vendors(StrEnum): > > > > >> EnumError: subclassing not allowed > > > > > > > > > > I don't see the point of disallowing subclassing. It sounds like > > > > > a pointless restriction. > > > > > > > > > > However, perhaps the constructor should forbid the returning of a > > base > > > > > type, e.g.: > > > > > > > > > > class Season(Enum): > > > > > spring = 1 > > > > > > > > > > class MySeason(Season): > > > > > """I look nicer than Season""" > > > > > > > > > > MySeason('spring') > > > > > ... > > > > > ValueError: Season.spring is not a MySeason instance > > > > > > > > > > (what this means is perhaps the subclassing of non-empty enum classes > > > > > should be forbidden) > > > > > > > > That's exactly what's implemented in the ref435 code at the moment. > > > > > > > > > > > It can't be because __call__ is by-value lookup, not by-name lookup. > > > > Ok, I've mixed up the example. But, still, since Season(1) should > > return the Season.spring singleton, I don't see any reasonable thing > > for MySeason(1) to return. Hence the request to raise an exception. > > > > What do you need MySeason for, though? IIUC, you don't ask to allow adding > enum values in it, so it only leaves adding extra functionality (methods)? > What are the use cases? I was talking in the context where subclassing is allowed. I don't think there's a use-case for subclassing of non-empty enums. On the other hand, empty enums should probably allow subclassing (they are "abstract base enums", in a way). Regards Antoine. From eliben at gmail.com Wed May 1 22:57:11 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 1 May 2013 13:57:11 -0700 Subject: [Python-Dev] Enum: subclassing? 
In-Reply-To: <20130501224553.740bab91@fsol> References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> <20130501223344.6899e9a6@fsol> <20130501224553.740bab91@fsol> Message-ID: On Wed, May 1, 2013 at 1:45 PM, Antoine Pitrou wrote: > On Wed, 1 May 2013 13:43:22 -0700 > Eli Bendersky wrote: > > > On Wed, May 1, 2013 at 1:33 PM, Antoine Pitrou > wrote: > > > > > On Wed, 1 May 2013 13:05:53 -0700 > > > Eli Bendersky wrote: > > > > On Wed, May 1, 2013 at 11:59 AM, Georg Brandl > wrote: > > > > > > > > > Am 01.05.2013 20:44, schrieb Antoine Pitrou: > > > > > > On Wed, 01 May 2013 10:21:30 -0700 > > > > > > Ethan Furman wrote: > > > > > >> We may not want to /completely/ disallow subclassing. Consider: > > > > > >> > > > > > >> --> class StrEnum(str, Enum): > > > > > >> ... '''string enums for Business Basic variable names''' > > > > > >> ... > > > > > >> --> class Vendors(StrEnum): > > > > > >> EnumError: subclassing not allowed > > > > > > > > > > > > I don't see the point of disallowing subclassing. It sounds like > > > > > > a pointless restriction. > > > > > > > > > > > > However, perhaps the constructor should forbid the returning of a > > > base > > > > > > type, e.g.: > > > > > > > > > > > > class Season(Enum): > > > > > > spring = 1 > > > > > > > > > > > > class MySeason(Season): > > > > > > """I look nicer than Season""" > > > > > > > > > > > > MySeason('spring') > > > > > > ... > > > > > > ValueError: Season.spring is not a MySeason instance > > > > > > > > > > > > (what this means is perhaps the subclassing of non-empty enum > classes > > > > > > should be forbidden) > > > > > > > > > > That's exactly what's implemented in the ref435 code at the moment. > > > > > > > > > > > > > > It can't be because __call__ is by-value lookup, not by-name lookup. > > > > > > Ok, I've mixed up the example. But, still, since Season(1) should > > > return the Season.spring singleton, I don't see any reasonable thing > > > for MySeason(1) to return. 
Hence the request to raise an exception. > > > > > > > What do you need MySeason for, though? IIUC, you don't ask to allow > adding > > enum values in it, so it only leaves adding extra functionality > (methods)? > > What are the use cases? > > I was talking in the context where subclassing is allowed. I don't > think there's a use-case for subclassing of non-empty enums. On the > other hand, empty enums should probably allow subclassing (they are > "abstract base enums", in a way). > I still don't understand what you mean, sorry. Like, this: class MyEmptyEnum(Enum): pass Why would you want to subclass MyEmptyEnum? Or do you mean this: class IntEnum(int, Enum): pass Now I can have: class SocketFamily(IntEnum): ?? If it's the latter, then why allow subclassing explicitly just for this reason? I think the explicit approach of: class SocketFamily(int, Enum): is cleaner anyway, and absolves us of providing yet another enum class to export from the stdlib. Eli From solipsis at pitrou.net Wed May 1 23:00:01 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 1 May 2013 23:00:01 +0200 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> <20130501223344.6899e9a6@fsol> <20130501224553.740bab91@fsol> Message-ID: <20130501230001.4034b46a@fsol> On Wed, 1 May 2013 13:57:11 -0700 Eli Bendersky wrote: > > I still don't understand what you mean, sorry. Like, this: > > class MyEmptyEnum(Enum): > pass > > Why would you want to subclass MyEmptyEnum? > > Or do you mean this: > > class IntEnum(int, Enum): > pass > > Now I can have: > > class SocketFamily(IntEnum): > ?? > > If it's the latter, then why allow subclassing explicitly just for this > reason? Because I may want to share methods across all concrete subclasses of IntEnum (or WhateverEnum). Regards Antoine.
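A minimal sketch of the "empty enums may be subclassed, non-empty ones may not" rule discussed in this exchange. It assumes the `enum` module that later shipped in the Python standard library, not the ref435 code under review, so the behaviour shown is that of the stdlib module only. The names `DescribedEnum`, `describe`, and `try_extend_season` are hypothetical and exist purely for illustration.

```python
from enum import Enum


class DescribedEnum(Enum):
    """Memberless base enum (hypothetical name): carries shared behaviour only."""

    def describe(self):
        # Every Enum member exposes .name and .value.
        return "%s=%r" % (self.name, self.value)


class Season(DescribedEnum):
    # Subclassing works because DescribedEnum defines no members.
    spring = 1
    summer = 2


class Planet(DescribedEnum):
    mars = 4


def try_extend_season():
    """Return True if extending an enum that already has members is rejected."""
    try:
        class MoreSeason(Season):  # Season already defines members
            winter = 4
    except TypeError:
        # Assumption: the stdlib enum module raises TypeError here.
        return True
    return False
```

With these definitions, `Season.summer.describe()` and `Planet.mars.describe()` share one implementation, and `isinstance(Season.spring, DescribedEnum)` holds, which is the isinstance() use mentioned earlier in the thread.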
From eliben at gmail.com Wed May 1 23:04:11 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 1 May 2013 14:04:11 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: <20130501230001.4034b46a@fsol> References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> <20130501223344.6899e9a6@fsol> <20130501224553.740bab91@fsol> <20130501230001.4034b46a@fsol> Message-ID: On Wed, May 1, 2013 at 2:00 PM, Antoine Pitrou wrote: > On Wed, 1 May 2013 13:57:11 -0700 > Eli Bendersky wrote: > > > > I still don't understand what you mean, sorry. Like, this: > > > > class MyEmptyEnum(Enum): > > pass > > > > Why would you want to subclass MyEmptyEnum? > > > > Or do you mean this: > > > > class IntEnum(int, Enum): > > pass > > > > Now I can have: > > > > class SocketFamily(IntEnum): > > ?? > > > > If it's the latter, then why allow subclassing explicitly just for this > > reason? > > Because I may want to share methods across all concrete subclasses of > IntEnum (or WhateverEnum). > You mean this? class BehaviorMixin: # bla bla class MyBehavingIntEnum(int, BehaviorMixin, Enum): foo = 1 bar = 2 Eli From solipsis at pitrou.net Wed May 1 23:11:16 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 1 May 2013 23:11:16 +0200 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> <20130501223344.6899e9a6@fsol> <20130501224553.740bab91@fsol> <20130501230001.4034b46a@fsol> Message-ID: <20130501231116.2e9c9441@fsol> On Wed, 1 May 2013 14:04:11 -0700 Eli Bendersky wrote: > > You mean this?
> > class BehaviorMixin: > # bla bla > > class MyBehavingIntEnum(int, BehaviorMixin, Enum): > foo = 1 > bar = 2 Yes, but without the need for multiple inheritance and separate mixins ;-) Especially if the behaviour is enum-specific, e.g.: class IETFStatusCode(IntEnum): @classmethod def from_statusline(cls, line): return cls(int(line.split()[0])) class HTTPStatusCode(IETFStatusCode): NOT_FOUND = 404 class SIPStatusCode(IETFStatusCode): RINGING = 180 Regards Antoine. From guido at python.org Wed May 1 23:07:51 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 1 May 2013 14:07:51 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> <20130501223344.6899e9a6@fsol> <20130501224553.740bab91@fsol> <20130501230001.4034b46a@fsol> Message-ID: On Wed, May 1, 2013 at 2:04 PM, Eli Bendersky wrote: > > > > On Wed, May 1, 2013 at 2:00 PM, Antoine Pitrou wrote: >> >> On Wed, 1 May 2013 13:57:11 -0700 >> Eli Bendersky wrote: >> > >> > I still don't understand what you mean, sorry. Like, this: >> > >> > class MyEmptyEnum(Enum): >> > pass >> > >> > Why would you want to subclass MyEmptyEnum? >> > >> > Or do you mean this: >> > >> > class IntEnum(int, Enum): >> > pass >> > >> > Now I can have: >> > >> > class SocketFamily(IntEnum): >> > ?? >> > >> > If it's the latter, then why allow subclassing explicitly just for this >> > reason? >> >> Because I may want to share methods across all concrete subclasses of >> IntEnum (or WhateverEnum). > > > You mean this? > > class BehaviorMixin: > # bla bla > > class MyBehavingIntEnum(int, BehaviorMixin, Enum): > foo = 1 > bar = 2 It's a common pattern to do this with a base class rather than a mixin, though, and I think the rule "only allow subclassing empty enums" makes a lot of sense.
-- --Guido van Rossum (python.org/~guido) From eliben at gmail.com Wed May 1 23:19:00 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 1 May 2013 14:19:00 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> <20130501223344.6899e9a6@fsol> <20130501224553.740bab91@fsol> <20130501230001.4034b46a@fsol> Message-ID: On Wed, May 1, 2013 at 2:07 PM, Guido van Rossum wrote: > On Wed, May 1, 2013 at 2:04 PM, Eli Bendersky wrote: > > > > > > > > On Wed, May 1, 2013 at 2:00 PM, Antoine Pitrou > wrote: > >> > >> On Wed, 1 May 2013 13:57:11 -0700 > >> Eli Bendersky wrote: > >> > > >> > I still don't understand what you mean, sorry. Like, this: > >> > > >> > class MyEmptyEnum(Enum): > >> > pass > >> > > >> > Why would you want to subclass MyEmptyEnum? > >> > > >> > Or do you mean this: > >> > > >> > class IntEnum(int, Enum): > >> > pass > >> > > >> > Now I can have: > >> > > >> > class SocketFamily(IntEnum): > >> > ?? > >> > > >> > If it's the latter, then why allow subclassing explicitly just for > this > >> > reason? > >> > >> Because I may want to share methods across all concrete subclasses of > >> IntEnum (or WhateverEnum). > > > > > > You mean this? > > > > class BehaviorMixin: > > # bla bla > > > > class MyBehavingIntEnum(int, BehaviorMixin, Enum): > > foo = 1 > > bar = 2 > > It's a common pattern to do this with a base class rather than a > mixin, though, and I think the rule "only allow subclassing empty > enums" makes a lot of sense. > I see your point (and Antoine's example in the next email is good), but my concern is that this is a TIMTOWTDI thing, since the same can be achieved with mixins.
Specifically, Antoine's example becomes: class IETFStatusCode: @classmethod def from_statusline(cls, line): return cls(int(line.split()[0])) class HTTPStatusCode(int, IETFStatusCode, Enum): NOT_FOUND = 404 class SIPStatusCode(int, IETFStatusCode, Enum): RINGING = 180 Same thing, while keeping the stdlib API cleaner and more minimal. Cleaner because "no subclassing" is a simpler, more explicit, and easier to understand rule than "no subclassing unless base class is devoid of enumeration values". And because otherwise we could no longer say "Enum classes are final", which is a relatively familiar and understood semantic. That said, I don't feel strongly about this so if the above does not convert you, I'm fine with allowing subclassing enum classes that don't define any enums =) Eli From g.brandl at gmx.net Wed May 1 23:35:17 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 01 May 2013 23:35:17 +0200 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> Message-ID: On 01.05.2013 22:05, Eli Bendersky wrote: > > > > On Wed, May 1, 2013 at 11:59 AM, Georg Brandl > wrote: > > On 01.05.2013 20:44, Antoine Pitrou wrote: > > On Wed, 01 May 2013 10:21:30 -0700 > > Ethan Furman > wrote: > >> We may not want to /completely/ disallow subclassing. Consider: > >> > >> --> class StrEnum(str, Enum): > >> ... '''string enums for Business Basic variable names''' > >> ... > >> --> class Vendors(StrEnum): > >> EnumError: subclassing not allowed > > > > I don't see the point of disallowing subclassing. It sounds like > > a pointless restriction. > > > > However, perhaps the constructor should forbid the returning of a base > > type, e.g.: > > > > class Season(Enum): > > spring = 1 > > > > class MySeason(Season): > > """I look nicer than Season""" > > > > MySeason('spring') > > ...
> > ValueError: Season.spring is not a MySeason instance > > > > (what this means is perhaps the subclassing of non-empty enum classes > > should be forbidden) > > That's exactly what's implemented in the ref435 code at the moment. > > > It can't be because __call__ is by-value lookup, not by-name lookup. By-name > lookup is Season.spring or getattr(Season, 'spring') Right, I was just referring to the parenthetical remark. Georg From timothy.c.delaney at gmail.com Wed May 1 23:36:08 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Thu, 2 May 2013 07:36:08 +1000 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com> <518046FC.1000100@stoneleaf.us> <518058A0.9040501@stoneleaf.us> <5180807D.9090707@g.nevcal.com> <20130430214751.023ef767@anarchist> <5180AD27.3090306@stoneleaf.us> <20130501084432.246b4dbd@anarchist> Message-ID: On 2 May 2013 02:18, Tres Seaver wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 05/01/2013 12:14 PM, Guido van Rossum wrote: > > But we'd probably have to give up something else, e.g. adding methods > > to enums, or any hope that the instance/class/subclass relationships > > make any sense. > > I'd be glad to drop both of those in favor of subclassing: I think the > emphasis on "class-ness" makes no sense, given the driving usecases for > adopting enums into the stdlib in the first place. IOW, I would vote > that real-world usecases trump hypothetical purity. > I have real-world use cases of enums (in java) that are essentially classes and happen to use the enum portion purely to obtain a unique name without explicitly supplying an ID. In the particular use case I'm thinking of, the flow is basically like this: 1. An Enum where each instance describes the shape of a database query. 2. Wire protocol where the Enum instance name is passed. 3. At one end, the data for performing the DB query is populated. 4. 
At the other end, the data is extracted and the appropriate enum is used to perform the query. Why use an enum? By using the name in the wire protocol I'm guaranteed a unique ID that won't change across versions (there is a requirement to only add to the enum) but does not rely on people setting it manually - the compiler will complain if there is a conflict, as opposed to setting values. And having the behaviour be part of the class simplifies things immensely. Yes, I could do all of this without an enum (have class check that each supplied ID is unique, etc) but the code is much clearer using the enum. I am happy to give up subclassing of enums in order to have behaviour on enum instances. I've always seen enums more as a container for their instances. I do want to be able to find out what enum class a particular enum belongs to (I've used this property in the past) and it's nice that the enum instance is an instance of the defining class (although IMO not required). I see advantages to enums being subclassable, but also significant disadvantages. For example, given the following: class Color(Enum): red = 1 class MoreColor(Color): blue = 2 class DifferentMoreColor(Color): green = 2 then the only reasonable way for it to work IMO is that MoreColor contains both (red, blue) and DifferentMoreColor contains both (red, green) and that red is not an instance of either MoreColor or DifferentMoreColor. If you allow subclassing, at some point either something is going to be intuitively backwards to some people (in the above that Color.red is not an instance of MoreColor), or is going to result in a contravariance violation. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Wed May 1 23:38:35 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 01 May 2013 23:38:35 +0200 Subject: [Python-Dev] Enum: subclassing? 
In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> Message-ID: Am 01.05.2013 22:09, schrieb Eli Bendersky: > > > > On Wed, May 1, 2013 at 11:47 AM, Georg Brandl > wrote: > > Am 01.05.2013 20:04, schrieb Eli Bendersky: > > > Actually, in flufl.enum, IntEnum had to define a magic __value_factory__ > > attribute, but in the current ref435 implementation this isn't needed, so > > IntEnum is just: > > > > class IntEnum(int, Enum): > > ''' > > Class where every instance is a subclass of int. > > ''' > > > > So why don't we just drop IntEnum from the API and tell users they should > do the > > above explicitly, i.e.: > > > > class SocketFamily(int, Enum): > > AF_UNIX = 1 > > AF_INET = 2 > > > > As opposed to having an IntEnum explicitly, this just saves 2 characters > > (comma+space), but is more explicit (zen!) and helps us avoid the > special-casing > > the subclass restriction implementation. > > Wait a moment... it might not be immediately useful for IntEnums (however, > that's because base Enum currently defines __int__ which I find questionable), > but with current ref435 you *can* create your own enum base classes with your > own methods, and derive concrete enums from that. It also lets you have a > base class for enums and use it in isinstance(). > > If you forbid subclassing completely that will be impossible. > > > I'm not sure what you mean, Georg, could you clarify? > This works: > >>>> from ref435 import Enum >>>> class SocketFamily(int, Enum): > ... AF_UNIX = 1 > ... AF_INET = 2 > ... >>>> SocketFamily.AF_INET > SocketFamily.AF_INET [value=2] >>>> SocketFamily.AF_INET == 2 > True >>>> type(SocketFamily.AF_INET) > >>>> isinstance(SocketFamily.AF_INET, SocketFamily) > True > > Now, with the way things are currently implemented, class IntEnum is just > syntactic sugar for above. Guido decided against allowing any kind of > subclassing, but as an implementation need we should keep some restricted form > to implement IntEnum. 
But is IntEnum really needed if the above explicit > multiple-inheritance of int and Enum is possible? Well, my point is that you currently don't have to inherit from int (or IntEnum) to get an __int__ method on your Enum, which is what I find questionable. IMO conversion to integers should only be defined for IntEnums. (But I haven't followed all of the discussion and this may already have been decided.) If __int__ stays where it is, a separate IntEnum is not necessary, but that doesn't mean that enum baseclasses aren't useful for other use cases (and they aren't hard to support, as ref435 shows.) Georg From g.brandl at gmx.net Wed May 1 23:44:03 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 01 May 2013 23:44:03 +0200 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> <20130501223344.6899e9a6@fsol> <20130501224553.740bab91@fsol> <20130501230001.4034b46a@fsol> Message-ID: Am 01.05.2013 23:19, schrieb Eli Bendersky: > It's a common pattern to do this with a base class rather than a > mixin, though, and I think the rule "only allow subclassing empty > enums" makes a lot of sense. > > > I see your point (and Antoine's example in the next email is good), but my > concern is that this is a TIMTOWTDI thing, since the same can be achieved with > mixins. 
Specifically, Antoine's example > becomes: > > class IETFStatusCode: > @classmethod > def from_statusline(cls, line): > return cls(int(line.split()[0])) > > class HTTPStatusCode(int, IETFStatusCode, Enum): > NOT_FOUND = 404 > > class SIPStatusCode(int, IETFStatusCode, Enum): > RINGING = 180 Now try it like this: class SIPStatusCode(IETFStatusCode, int, Enum): RINGING = 180 and you'll get Traceback (most recent call last): File "/home/gbr/devel/ref435/ref435.py", line 84, in __new__ enum_item = obj_type.__new__(result, value) TypeError: object.__new__(SIPStatusCode) is not safe, use int.__new__() During handling of the above exception, another exception occurred: Traceback (most recent call last): File "ex.py", line 11, in class SIPStatusCode(IETFStatusCode, int, Enum): File "/home/gbr/devel/ref435/ref435.py", line 86, in __new__ raise EnumError(*exc.args) from None TypeError: exception causes must derive from BaseException > Same thing, while keeping the stdlib API cleaner and more minimal. Cleaner > because "no subclassing" is a simpler, more explicit, and easier to understand > rule than "no subclassing unless base class is devoid of enumeration values". > And because we can no longer say "Enum classes are final", which is a relatively > familiar and understood semantic. I fear the "you can use mixins provided you put them in the right spot in the base classes list" rule is not much simpler than the "no subclassing of enums with values" rule. Georg From eliben at gmail.com Wed May 1 23:48:08 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 1 May 2013 14:48:08 -0700 Subject: [Python-Dev] Enum: subclassing? 
In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> Message-ID: > > Am 01.05.2013 20:04, schrieb Eli Bendersky: > > > > > Actually, in flufl.enum, IntEnum had to define a magic > __value_factory__ > > > attribute, but in the current ref435 implementation this isn't > needed, so > > > IntEnum is just: > > > > > > class IntEnum(int, Enum): > > > ''' > > > Class where every instance is a subclass of int. > > > ''' > > > > > > So why don't we just drop IntEnum from the API and tell users they > should > > do the > > > above explicitly, i.e.: > > > > > > class SocketFamily(int, Enum): > > > AF_UNIX = 1 > > > AF_INET = 2 > > > > > > As opposed to having an IntEnum explicitly, this just saves 2 > characters > > > (comma+space), but is more explicit (zen!) and helps us avoid the > > special-casing > > > the subclass restriction implementation. > > > > Wait a moment... it might not be immediately useful for IntEnums > (however, > > that's because base Enum currently defines __int__ which I find > questionable), > > but with current ref435 you *can* create your own enum base classes > with your > > own methods, and derive concrete enums from that. It also lets you > have a > > base class for enums and use it in isinstance(). > > > > If you forbid subclassing completely that will be impossible. > > > > > > I'm not sure what you mean, Georg, could you clarify? > > This works: > > > >>>> from ref435 import Enum > >>>> class SocketFamily(int, Enum): > > ... AF_UNIX = 1 > > ... AF_INET = 2 > > ... > >>>> SocketFamily.AF_INET > > SocketFamily.AF_INET [value=2] > >>>> SocketFamily.AF_INET == 2 > > True > >>>> type(SocketFamily.AF_INET) > > > >>>> isinstance(SocketFamily.AF_INET, SocketFamily) > > True > > > > Now, with the way things are currently implemented, class IntEnum is just > > syntactic sugar for above. Guido decided against allowing any kind of > > subclassing, but as an implementation need we should keep some > restricted form > > to implement IntEnum. 
But is IntEnum really needed if the above explicit > > multiple-inheritance of int and Enum is possible? > > Well, my point is that you currently don't have to inherit from int (or > IntEnum) > to get an __int__ method on your Enum, which is what I find questionable. > IMO > conversion to integers should only be defined for IntEnums. (But I haven't > followed all of the discussion and this may already have been decided.) > Good point. I think this may be just an artifact of the implementation - PEP 435 prohibits implicit conversion to integers for non-IntEnum enums. Since IntEnum came into existence, there's no real need for int-operability of other enums, and their values can be arbitrary anyway. Ethan - unless I'm missing something, __int__ should probably be removed. Eli From benhoyt at gmail.com Wed May 1 23:51:01 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Thu, 2 May 2013 09:51:01 +1200 Subject: [Python-Dev] PEP 428: stat caching undesirable? In-Reply-To: <5181414A.7000609@python.org> References: <1367393548.2868.262.camel@basilisk> <20130501121821.093fa030@fsol> <1367407340.2868.294.camel@basilisk> <5181414A.7000609@python.org> Message-ID: > We can get a greater speed up for walkdir() without resorting to > caching, too. Some operating systems and file system report the file > type in the dirent struct that is returned by readdir(). This reduces > the number of stat calls to zero. > Yes, definitely. This is exactly what my os.walk() replacement, "Betterwalk", does: https://github.com/benhoyt/betterwalk#readme On Windows you get *all* stat information from iterating the directory entries (FindFirstFile etc). And on Linux most of the time you get enough for os.walk() not to need an extra stat (though it does depend on the file system). I still hope to clean up Betterwalk and make a C version so we can use it in the standard library.
In many cases it speeds up os.walk() by several times, even an order of magnitude in some cases. I intend for it to be a drop-in replacement for os.walk(), just faster. -Ben -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Wed May 1 23:52:33 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 01 May 2013 23:52:33 +0200 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> Message-ID: Am 01.05.2013 23:48, schrieb Eli Bendersky: > Well, my point is that you currently don't have to inherit from int (or IntEnum) > to get an __int__ method on your Enum, which is what I find questionable. IMO > conversion to integers should only be defined for IntEnums. (But I haven't > followed all of the discussion and this may already have been decided.) > > > Good point. I think this may be just an artifact of the implementation - PEP 435 > prohibits implicit conversion to integers for non-IntEnum enums. Since IntEnum > came into existence, there's no real need for int-opearbility of other enums, > and their values can be arbitrary anyway. OK, I'm stupid -- I was thinking about moving the __int__ method to IntEnum (that's why I brought it up in this part of the thread), but as a subclass of int itself that obviously isn't needed :) Georg From g.brandl at gmx.net Wed May 1 23:53:30 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 01 May 2013 23:53:30 +0200 Subject: [Python-Dev] Enum: subclassing? 
In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> <20130501223344.6899e9a6@fsol> <20130501224553.740bab91@fsol> <20130501230001.4034b46a@fsol> Message-ID: Am 01.05.2013 23:44, schrieb Georg Brandl: > Traceback (most recent call last): > File "/home/gbr/devel/ref435/ref435.py", line 84, in __new__ > enum_item = obj_type.__new__(result, value) > TypeError: object.__new__(SIPStatusCode) is not safe, use int.__new__() > > During handling of the above exception, another exception occurred: > > Traceback (most recent call last): > File "ex.py", line 11, in > class SIPStatusCode(IETFStatusCode, int, Enum): > File "/home/gbr/devel/ref435/ref435.py", line 86, in __new__ > raise EnumError(*exc.args) from None > TypeError: exception causes must derive from BaseException To be fair the secondary exception is an artifact of me trying the example with Python 3.2, which doesn't have "from None". Georg From eliben at gmail.com Wed May 1 23:57:33 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 1 May 2013 14:57:33 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> Message-ID: On Wed, May 1, 2013 at 2:52 PM, Georg Brandl wrote: > Am 01.05.2013 23:48, schrieb Eli Bendersky: > > > Well, my point is that you currently don't have to inherit from int > (or IntEnum) > > to get an __int__ method on your Enum, which is what I find > questionable. IMO > > conversion to integers should only be defined for IntEnums. (But I > haven't > > followed all of the discussion and this may already have been > decided.) > > > > > > Good point. I think this may be just an artifact of the implementation - > PEP 435 > > prohibits implicit conversion to integers for non-IntEnum enums. Since > IntEnum > > came into existence, there's no real need for int-opearbility of other > enums, > > and their values can be arbitrary anyway. 
> > OK, I'm stupid -- I was thinking about moving the __int__ method to IntEnum > (that's why I brought it up in this part of the thread), but as a subclass > of > int itself that obviously isn't needed :) You did bring up a good point, though - __int__ should not be part of vanilla Enum. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu May 2 00:02:53 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 2 May 2013 08:02:53 +1000 Subject: [Python-Dev] PEP 428: stat caching undesirable? In-Reply-To: <5181414A.7000609@python.org> References: <1367393548.2868.262.camel@basilisk> <20130501121821.093fa030@fsol> <1367407340.2868.294.camel@basilisk> <5181414A.7000609@python.org> Message-ID: On 2 May 2013 02:22, "Christian Heimes" wrote: > > Am 01.05.2013 16:39, schrieb Guido van Rossum: > > I've not got the full context, but I would like to make it *very* > > clear in the API (e.g. through naming of the methods) when you are > > getting a possibly cached result from stat(), and I would be very > > concerned if existing APIs were going to get caching behavior. For > > every use cases that benefits from caching there's a complementary use > > case that caching breaks. Since both use cases are important we must > > offer both APIs, in a way that makes it clear to even the casual > > reader of the code what's going on. > > I deem caching of stat calls as problematic. The correct and > contemporary result of a stat() call has security implications, too. For > example stat() is used to prevent TOCTOU race conditions such as [1]. > Caching is useful but I would prefer explicit caching rather than > implicit and automatic caching of stat() results. > > We can get a greater speed up for walkdir() without resorting to > caching, too. Some operating systems and file system report the file > type in the dirent struct that is returned by readdir(). This reduces > the number of stat calls to zero. 
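The dirent point quoted just above can be sketched as follows. This is only an illustration, and it assumes a Python that provides `os.scandir` (added to the standard library in 3.5, well after this thread; the `with` form needs 3.6). `count_entries` is a hypothetical helper name, not anything from walkdir or Betterwalk.

```python
import os
import tempfile


def count_entries(top):
    """Count the directories and regular files directly under `top`.

    Hypothetical helper for illustration only.
    """
    dirs = files = 0
    with os.scandir(top) as entries:
        for entry in entries:
            # With follow_symlinks=False, is_dir()/is_file() can usually be
            # answered from the type recorded in the directory entry itself,
            # so no extra stat() call is needed on most platforms.
            if entry.is_dir(follow_symlinks=False):
                dirs += 1
            elif entry.is_file(follow_symlinks=False):
                files += 1
    return dirs, files


with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    open(os.path.join(top, "a.txt"), "w").close()
    print(count_entries(top))  # prints (1, 1)
```

Whether the stat call is really avoided still depends on the operating system and file system, as noted in the quoted message.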
While I agree exposing dirent in some manner is desirable, note that I'm not talking about os.walk itself, but the generator pipeline library I built around it in an attempt to break up monolithic directory walking loops into reusable components. Once you get out of the innermost generator, the only state passed through each stage is the path information (and the directory descriptor if using os.fwalk). Upgrading walkdir from simple strings to path objects would be relatively straightforward, but you can't change the API too much before it isn't similar to os.walk any more. The security issues only come into play in the outer loop which actually tries to *do* something with the pipeline output. However, even that case should involve at most two stat calls: one inside the pipeline (cached per iteration) and then a more timely one in the outer loop (assuming using os.fwalk as the base loop instead of os.walk doesn't already cover it). Cheers, Nick. > > Christian > > [1] > https://www.securecoding.cert.org/confluence/display/seccode/POS01-C.+Check+for+the+existence+of+links+when+dealing+with+files > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Thu May 2 00:11:20 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 01 May 2013 15:11:20 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> Message-ID: <51819308.3090704@stoneleaf.us> On 05/01/2013 02:48 PM, Eli Bendersky wrote: > > > Am 01.05.2013 20:04, schrieb Eli Bendersky: > > > > > Actually, in flufl.enum, IntEnum had to define a magic __value_factory__ > > > attribute, but in the current ref435 implementation this isn't needed, so > > > IntEnum is just: > > > > > > class IntEnum(int, Enum): > > > ''' > > > Class where every instance is a subclass of int. 
> > > ''' > > > > > > So why don't we just drop IntEnum from the API and tell users they should > > do the > > > above explicitly, i.e.: > > > > > > class SocketFamily(int, Enum): > > > AF_UNIX = 1 > > > AF_INET = 2 > > > > > > As opposed to having an IntEnum explicitly, this just saves 2 characters > > > (comma+space), but is more explicit (zen!) and helps us avoid the > > special-casing > > > the subclass restriction implementation. > > > > Wait a moment... it might not be immediately useful for IntEnums (however, > > that's because base Enum currently defines __int__ which I find questionable), > > but with current ref435 you *can* create your own enum base classes with your > > own methods, and derive concrete enums from that. It also lets you have a > > base class for enums and use it in isinstance(). > > > > If you forbid subclassing completely that will be impossible. > > > > > > I'm not sure what you mean, Georg, could you clarify? > > This works: > > > >>>> from ref435 import Enum > >>>> class SocketFamily(int, Enum): > > ... AF_UNIX = 1 > > ... AF_INET = 2 > > ... > >>>> SocketFamily.AF_INET > > SocketFamily.AF_INET [value=2] > >>>> SocketFamily.AF_INET == 2 > > True > >>>> type(SocketFamily.AF_INET) > > > >>>> isinstance(SocketFamily.AF_INET, SocketFamily) > > True > > > > Now, with the way things are currently implemented, class IntEnum is just > > syntactic sugar for above. Guido decided against allowing any kind of > > subclassing, but as an implementation need we should keep some restricted form > > to implement IntEnum. But is IntEnum really needed if the above explicit > > multiple-inheritance of int and Enum is possible? > > Well, my point is that you currently don't have to inherit from int (or IntEnum) > to get an __int__ method on your Enum, which is what I find questionable. IMO > conversion to integers should only be defined for IntEnums. (But I haven't > followed all of the discussion and this may already have been decided.) 
> > > Good point. I think this may be just an artifact of the implementation - PEP 435 prohibits implicit conversion to > integers for non-IntEnum enums. Since IntEnum came into existence, there's no real need for int-operability of other > enums, and their values can be arbitrary anyway. > > Ethan - unless I'm missing something, __int__ should probably be removed. The reason __int__ is there is because pure Enums should be using plain ints as their value 95% or more of the time, and being able to easily convert to a real int for either database storage, wire transmission, or C functions is a Good Thing. IntEnum is for when the enum item *must* be a real, bona fide int in its own right, and the use case here is backwards compatibility with APIs that are already using real ints (and this is really the *only* time IntEnum should be used). The downside to IntEnum is you lose all Enum type protection; so if you don't need a real int, use a fake int, er, I mean, Enum, which can easily be int'ified on demand due to its handy dandy __int__ method. From guido at python.org Thu May 2 00:22:55 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 1 May 2013 15:22:55 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: <51819308.3090704@stoneleaf.us> References: <51814F1A.1030104@stoneleaf.us> <51819308.3090704@stoneleaf.us> Message-ID: On Wed, May 1, 2013 at 3:11 PM, Ethan Furman wrote: > The reason __int__ is there is because pure Enums should be using plain ints > as their value 95% or more of the time, and being able to easily convert to > a real int for either database storage, wire transmission, or C functions is > a Good Thing. What would int(x) return if x is an enum whose value is not a plain int? Why can't you use x.value for this use case? -- --Guido van Rossum (python.org/~guido) From eliben at gmail.com Thu May 2 00:25:59 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 1 May 2013 15:25:59 -0700 Subject: [Python-Dev] Enum: subclassing?
In-Reply-To: <51819308.3090704@stoneleaf.us> References: <51814F1A.1030104@stoneleaf.us> <51819308.3090704@stoneleaf.us> Message-ID: > >> Good point. I think this may be just an artifact of the implementation - >> PEP 435 prohibits implicit conversion to >> integers for non-IntEnum enums. Since IntEnum came into existence, >> there's no real need for int-operability of other >> enums, and their values can be arbitrary anyway. >> >> Ethan - unless I'm missing something, __int__ should probably be removed. >> > > The reason __int__ is there is because pure Enums should be using plain > ints as their value 95% or more of the time, and being able to easily > convert to a real int for either database storage, wire transmission, or C > functions is a Good Thing. > Yes, but the .value attribute makes it "easy enough". If you have foo which is of type SomeEnum, all you need (if you know for sure it has int values) is to pass foo.value instead of just foo to places that do int conversion. Relying on __int__ is *unsafe* in the general case because enums can have non-int values, or mixed values, or whatever. You only want "true int conversion" in rare cases for which IntEnum exists. Eli -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ethan at stoneleaf.us Thu May 2 00:02:00 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 01 May 2013 15:02:00 -0700 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com> <518046FC.1000100@stoneleaf.us> <518058A0.9040501@stoneleaf.us> <5180807D.9090707@g.nevcal.com> <20130430214751.023ef767@anarchist> <5180AD27.3090306@stoneleaf.us> <20130501084432.246b4dbd@anarchist> Message-ID: <518190D8.9030809@stoneleaf.us> On 05/01/2013 02:36 PM, Tim Delaney wrote: > On 2 May 2013 02:18, Tres Seaver wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 05/01/2013 12:14 PM, Guido van Rossum wrote: > > But we'd probably have to give up something else, e.g. adding methods > > to enums, or any hope that the instance/class/subclass relationships > > make any sense. > > I'd be glad to drop both of those in favor of subclassing: I think the > emphasis on "class-ness" makes no sense, given the driving usecases for > adopting enums into the stdlib in the first place. IOW, I would vote > that real-world usecases trump hypothetical purity. > > > I have real-world use cases of enums (in java) that are essentially classes and happen to use the enum portion purely to > obtain a unique name without explicitly supplying an ID. > > In the particular use case I'm thinking of, the flow is basically like this: > > 1. An Enum where each instance describes the shape of a database query. > 2. Wire protocol where the Enum instance name is passed. > 3. At one end, the data for performing the DB query is populated. > 4. At the other end, the data is extracted and the appropriate enum is used to perform the query. > > Why use an enum? 
By using the name in the wire protocol I'm guaranteed a unique ID that won't change across versions > (there is a requirement to only add to the enum) but does not rely on people setting it manually - the compiler will > complain if there is a conflict, as opposed to setting values. And having the behaviour be part of the class simplifies > things immensely. > > Yes, I could do all of this without an enum (have class check that each supplied ID is unique, etc) but the code is much > clearer using the enum. > > I am happy to give up subclassing of enums in order to have behaviour on enum instances. I've always seen enums more as > a container for their instances. I do want to be able to find out what enum class a particular enum belongs to (I've > used this property in the past) and it's nice that the enum instance is an instance of the defining class (although IMO > not required). > > I see advantages to enums being subclassable, but also significant disadvantages. For example, given the following: > > class Color(Enum): > red = 1 > > class MoreColor(Color): > blue = 2 > > class DifferentMoreColor(Color): > green = 2 > > then the only reasonable way for it to work IMO is that MoreColor contains both (red, blue) and DifferentMoreColor > contains both (red, green) and that red is not an instance of either MoreColor or DifferentMoreColor. If you allow > subclassing, at some point either something is going to be intuitively backwards to some people (in the above that > Color.red is not an instance of MoreColor), or is going to result in a contravariance violation. Nice example, thank you. As far as subclassing goes, you can have behavior either way. The sticky issue is whether the enum items from an inherited enumeration should be available in the child enumeration, whether such an inheritance should raise an error, or whether all subclassing should raise an error.
It sounds like we're leaning towards allowing subclassing as long as the enumeration being subclassed doesn't define any enum items itself. -- ~Ethan~ From ncoghlan at gmail.com Thu May 2 00:54:10 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 2 May 2013 08:54:10 +1000 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com> <518046FC.1000100@stoneleaf.us> <518058A0.9040501@stoneleaf.us> <5180807D.9090707@g.nevcal.com> <20130430214751.023ef767@anarchist> <5180AD27.3090306@stoneleaf.us> <20130501084432.246b4dbd@anarchist> Message-ID: On 2 May 2013 02:46, "Guido van Rossum" wrote: > > On Wed, May 1, 2013 at 9:18 AM, Tres Seaver wrote: > > I'd be glad to drop both of those in favor of subclassing: I think the > > emphasis on "class-ness" makes no sense, given the driving usecases for > > adopting enums into the stdlib in the first place. IOW, I would vote > > that real-world usecases trump hypothetical purity. > > Yeah, this is the dilemma. But what *are* the real-world use cases? > Please provide some. > > Here's how I would implement "extending" an enum if subclassing were > not allowed: > > class Color(Enum): > red = 1 > white = 2 > blue = 3 > > class ExtraColor(Enum): > orange = 4 > yellow = 5 > green = 6 > > flag_colors = set(Color) | set(ExtraColor) > > Now I can test "c in flag_colors" to check whether c is a flag color. > I can also loop over flag_colors. If I want the colors in definition > order I could use a list instead: > > ordered_flag_colors = list(Color) + list(ExtraColor) > > But this would be less or more acceptable depending on whether it is a > common or esoteric use case. If enums had an "as_dict" method that returned an ordered dictionary, you could do: class MoreColors(Enum): locals().update(Colors.as_dict()) orange = 4 ... 
Using a similar API to PEP 422's class initialisation hook, you could even simplify that to: class MoreColors(Enum, namespace=Colors.as_dict()): orange = 4 ... Cheers, Nick. > > -- > --Guido van Rossum (python.org/~guido) From greg.ewing at canterbury.ac.nz Thu May 2 01:44:03 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 02 May 2013 11:44:03 +1200 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: <20130501084755.04f44a4f@anarchist> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> Message-ID: <5181A8C3.9070702@canterbury.ac.nz> Barry Warsaw wrote: > Why isn't getattr() for lookup by name > good enough? Because it will find things that are not enum items, e.g. '__str__'. -- Greg From ethan at stoneleaf.us Thu May 2 00:51:44 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 01 May 2013 15:51:44 -0700 Subject: [Python-Dev] Enum: subclassing?
In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> <20130501223344.6899e9a6@fsol> <20130501224553.740bab91@fsol> <20130501230001.4034b46a@fsol> Message-ID: <51819C80.9040109@stoneleaf.us> On 05/01/2013 02:07 PM, Guido van Rossum wrote: > On Wed, May 1, 2013 at 2:04 PM, Eli Bendersky wrote: >> >> class BehaviorMixin: >> # bla bla >> >> class MyBehavingIntEnum(int, BehaviorMixin, Enum): >> foo = 1 >> bar = 2 > > It's a common pattern to do this with a base class rather than a > mixin, though, and I think the rule "only allow subclassing empty > enums" makes a lot of sense. So is this a pronouncement? I'm going to get whiplash if I change that bit of code many more times. ;) -- ~Ethan~ From guido at python.org Thu May 2 02:11:32 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 1 May 2013 17:11:32 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: <51819C80.9040109@stoneleaf.us> References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> <20130501223344.6899e9a6@fsol> <20130501224553.740bab91@fsol> <20130501230001.4034b46a@fsol> <51819C80.9040109@stoneleaf.us> Message-ID: Yes. On Wed, May 1, 2013 at 3:51 PM, Ethan Furman wrote: > On 05/01/2013 02:07 PM, Guido van Rossum wrote: >> >> On Wed, May 1, 2013 at 2:04 PM, Eli Bendersky wrote: >>> >>> >>> class BehaviorMixin: >>> # bla bla >>> >>> class MyBehavingIntEnum(int, BehaviorMixin, Enum): >>> foo = 1 >>> bar = 2 >> >> >> It's a common pattern to do this with a base class rather than a >> mixin, though, and I think the rule "only allow subclassing empty >> enums" makes a lot of sense. > > > So is this a pronouncement? I'm going to get whiplash if I change that bit > of code many more times. 
;) > > -- > ~Ethan~ > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From steve at pearwood.info Thu May 2 03:37:48 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 02 May 2013 11:37:48 +1000 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com> <518046FC.1000100@stoneleaf.us> <518058A0.9040501@stoneleaf.us> <5180807D.9090707@g.nevcal.com> <20130430214751.023ef767@anarchist> <5180AD27.3090306@stoneleaf.us> <20130501084432.246b4dbd@anarchist> Message-ID: <5181C36C.8090007@pearwood.info> On 02/05/13 08:54, Nick Coghlan wrote: > If enums had an "as_dict" method that returned an ordered dictionary, you > could do: > > class MoreColors(Enum): > locals().update(Colors.as_dict()) Surely that is an implementation-specific piece of code? Writing to locals() is not guaranteed to work, and the documentation warns against it. http://docs.python.org/3/library/functions.html#locals -- Steven From steve at pearwood.info Thu May 2 03:47:25 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 02 May 2013 11:47:25 +1000 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: <20130501224553.740bab91@fsol> References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> <20130501223344.6899e9a6@fsol> <20130501224553.740bab91@fsol> Message-ID: <5181C5AD.9070607@pearwood.info> On 02/05/13 06:45, Antoine Pitrou wrote: > I was talking in the context where subclassing is allowed. I don't > think there's a use-case for subclassing of non-empty enums. On the > other hand, empty enums should probably allow subclassing (they are > "abstract base enums", in a way). 
If you google for "subclassing enums" you will find many people asking how to subclass enums. Apparently Apache's Java allows subclassing, if I'm reading this correctly: http://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/enums/Enum.html So do Scala and Kotlin. The most obvious use-case for subclassing enums is to extend them: class Directions(Enum): north = 1 east = 2 west = 3 south = 4 class Directions3D(Directions): up = 5 down = 6 If you allow enums to have methods, then the most obvious use-case is to add or extend methods, no different to any other class. -- Steven From steve at pearwood.info Thu May 2 04:33:06 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 02 May 2013 12:33:06 +1000 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com> <518046FC.1000100@stoneleaf.us> <518058A0.9040501@stoneleaf.us> <5180807D.9090707@g.nevcal.com> <20130430214751.023ef767@anarchist> <5180AD27.3090306@stoneleaf.us> <20130501084432.246b4dbd@anarchist> Message-ID: <5181D062.6020307@pearwood.info> On 02/05/13 02:43, Guido van Rossum wrote: > Here's how I would implement "extending" an enum if subclassing were > not allowed: > > class Color(Enum): > red = 1 > white = 2 > blue = 3 > > class ExtraColor(Enum): > orange = 4 > yellow = 5 > green = 6 > > flag_colors = set(Color) | set(ExtraColor) > > Now I can test "c in flag_colors" to check whether c is a flag color. Earlier you argued that testing for enums should be done with isinstance, not "in". Or did I misunderstood? So I would have thought that isinstance(c, (Color, ExtraColor)) would be the way to check c. I would prefer to write "c in ExtraColor", assuming c extends Color. Lookups by value also become more complex. 
Instead of c = ExtraColor[value], this leads to two choices, both of which are equally ugly in my opinion: c = [c for c in flag_colors if c.value == value][0] try: c = ExtraColor[value] except: # I'm not sure what exception you get here c = Color[value] There is a further problem if the two enum classes have duplicate values, by accident or design. Accident being more likely, since now you have no warning when ExtraColor defines a value that duplicates something in Color. flag_colors will now contain both duplicates, since enum values from different enums never compare equal, but that's probably not what you want. -- Steven From ether.joe at gmail.com Thu May 2 04:51:36 2013 From: ether.joe at gmail.com (Sean Felipe Wolfe) Date: Wed, 1 May 2013 19:51:36 -0700 Subject: [Python-Dev] noob contributions to unit tests In-Reply-To: References: <20130327022422.D7ACA250BCA@webabinitio.net> Message-ID: On Thu, Mar 28, 2013 at 11:36 AM, Walter Dörwald wrote: > > On 27.03.2013 at 03:24, R. David Murray wrote: > >> On Tue, 26 Mar 2013 16:59:06 -0700, Maciej Fijalkowski wrote: >>> On Tue, Mar 26, 2013 at 4:49 PM, Sean Felipe Wolfe wrote: >>>> Hey everybody how are you all :) >>>> >>>> I am an intermediate-level python coder looking to get help out. I've >>>> been reading over the dev guide about helping increase test coverage >>>> --> >>>> http://docs.python.org/devguide/coverage.html >>>> >>>> And also the third-party code coverage referenced in the devguide page: >>>> http://coverage.livinglogic.de/ >>>> >>>> I'm seeing that according to the coverage tool, two of my favorite >>>> libraries, urllib/urllib2, have no unit tests? Is that correct or am I >>>> reading it wrong? >>>> >>>> If that's correct it seems like a great place perhaps for me to cut my >>>> teeth and I would be excited to learn and help out here. >>>> >>>> And of course any thoughts or advice for an aspiring Python >>>> contributor would be appreciated.
Of course the dev guide gives me >>>> plenty of good info. >>>> >>>> Thanks! >>> >>> That looks like an error in the coverage report, there are certainly >>> urllib and urllib2 tests in test/test_urllib* >> >> The devguide contains instructions for running coverage yourself, >> and if I recall correctly the 'fullcoverage' recipe does a better >> job than what runs at coverage.livinglogic.de. > > The job that produces that output has been broken for some time now, and I haven't found the time to look into it. If someone wants to try, here's the code: > > https://pypi.python.org/pypi/pycoco/0.7.2 > >> [?] > > Servus, > Walter > Hello Walter and everybody, after a bit of family time and other stuffs, I'm getting back to this today and looking at what's involved in fixing the livinglogic code coverage tool. I was able to get the depencies and a few minor issues, and now the script is running on a first attempt. I'll report back with progress or problems. Thanks y'all :) -- A musician must make music, an artist must paint, a poet must write, if he is to be ultimately at peace with himself. - Abraham Maslow From greg.ewing at canterbury.ac.nz Thu May 2 05:07:18 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 02 May 2013 15:07:18 +1200 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: <5181C5AD.9070607@pearwood.info> References: <51814F1A.1030104@stoneleaf.us> <20130501204439.01cfd6fd@fsol> <20130501223344.6899e9a6@fsol> <20130501224553.740bab91@fsol> <5181C5AD.9070607@pearwood.info> Message-ID: <5181D866.90407@canterbury.ac.nz> On 02/05/13 13:47, Steven D'Aprano wrote: > The most obvious use-case for subclassing enums is to extend them: > > class Directions(Enum): > north = 1 > east = 2 > west = 3 > south = 4 > > class Directions3D(Directions): > up = 5 > down = 6 It doesn't necessarily follow that subclassing is the right mechanism for extending enums, though. If anything, you really want to "superclass" them. 
Maybe class Directions3D(Enum, extends = Directions): up = 5 down = 6 Then we could have issubclass(Directions, Directions3D) rather than the reverse. -- Greg From cf.natali at gmail.com Thu May 2 08:56:33 2013 From: cf.natali at gmail.com (Charles-François Natali) Date: Thu, 2 May 2013 08:56:33 +0200 Subject: [Python-Dev] PEP 428: stat caching undesirable? In-Reply-To: References: <1367393548.2868.262.camel@basilisk> <20130501121821.093fa030@fsol> <1367407340.2868.294.camel@basilisk> <5181414A.7000609@python.org> Message-ID: > Yes, definitely. This is exactly what my os.walk() replacement, > "Betterwalk", does: > https://github.com/benhoyt/betterwalk#readme > > On Windows you get *all* stat information from iterating the directory > entries (FindFirstFile etc). And on Linux most of the time you get enough > for os.walk() not to need an extra stat (though it does depend on the file > system). > > I still hope to clean up Betterwalk and make a C version so we can use it in > the standard library. In many cases it speeds up os.walk() by several times, > even an order of magnitude in some cases. I intend for it to be a drop-in > replacement for os.walk(), just faster. Actually, there's Gregory's scandir() implementation (returning a generator to be able to cope with large directories) on its way: http://bugs.python.org/issue11406 It's already been suggested to make it return a tuple (with d_type). I'm sure a review of the code (especially the Windows implementation) will be welcome. From barry at python.org Thu May 2 16:57:30 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 2 May 2013 07:57:30 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues?
In-Reply-To: <518164C8.2090204@hastings.org> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <517DF0C7.7080905@stoneleaf.us> <517E1828.1080802@stoneleaf.us> <20130430231843.341c7659@anarchist> <5180B635.7000904@stoneleaf.us> <518164C8.2090204@hastings.org> Message-ID: <20130502075730.0768263c@anarchist> On May 01, 2013, at 11:54 AM, Larry Hastings wrote: >On 04/30/2013 11:29 PM, Ethan Furman wrote: >> On 04/30/2013 11:18 PM, Barry Warsaw wrote: >>> On Apr 28, 2013, at 11:50 PM, Ethan Furman wrote: >>> >>>> But as soon as: >>>> >>>> type(Color.red) is Color # True >>>> type(MoreColor.red) is MoreColor # True >>>> >>>> then: >>>> >>>> Color.red is MoreColor.red # must be False, no? >>>> >>>> >>>> If that last statement can still be True, I'd love it if someone >>> showed me >>>> how. >>> >>> class Foo: >>> a = object() >>> b = object() >>> >>> class Bar(Foo): >>> c = object() >>> >>>>>> Foo.a is Bar.a >>> True >> >> Wow. I think I'm blushing from embarrassment. >> >> Thank you for answering my question, Barry. > >Wait, what? I don't see how Barry's code answers your question. In his >example, type(a) == type(b) == type(c) == object. You were asking "how can >Color.red and MoreColor.red be the same object if they are of different >types?" > >p.s. They can't. Sure, why not? In "normal" Python, Bar inherits a from Foo, it doesn't define it so it's exactly the same object. Thus if you access that object through the superclass, you get the same object as when you access it through the subclass. So Foo.a plays the role of Color.red and Bar.a plays the role of MoreColor.red. Same object, thus `Foo.a is Bar.a` is equivalent to `Color.red is MoreColor.red`. 
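A minimal sketch of the lookup described above, reusing only the Foo and Bar classes from the quoted example (the variable names same_object, defined_on_bar and type_via_bar are illustrative and not from the thread). It checks the identity claim and also records the type of the shared object, which is the part of the original question that the plain-class example leaves open:

```python
class Foo:
    a = object()
    b = object()


class Bar(Foo):
    c = object()


# Bar does not define `a` itself, so attribute lookup falls through
# the MRO and returns the very object stored on Foo.
same_object = Foo.a is Bar.a

# `a` lives only in Foo's namespace, not in Bar's.
defined_on_bar = "a" in vars(Bar)

# The shared object has a single type regardless of which class it
# was reached through; here that type is plain `object`, not Foo or Bar.
type_via_foo = type(Foo.a)
type_via_bar = type(Bar.a)
```

With ordinary classes the object reached through either class is one and the same, and its type does not depend on the access path.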
-Barry From barry at python.org Thu May 2 16:58:13 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 2 May 2013 07:58:13 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: <5181A8C3.9070702@canterbury.ac.nz> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> Message-ID: <20130502075813.579f24c0@anarchist> On May 02, 2013, at 11:44 AM, Greg Ewing wrote: >Barry Warsaw wrote: >> Why isn't getattr() for lookup by name >> good enough? > >Because it will find things that are not enum items, >e.g. '__str__'. Why does that matter? -Barry From barry at python.org Thu May 2 17:20:03 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 2 May 2013 08:20:03 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> Message-ID: <20130502082003.1abb3b1b@anarchist> On May 01, 2013, at 11:04 AM, Eli Bendersky wrote: >Actually, in flufl.enum, IntEnum had to define a magic __value_factory__ >attribute, but in the current ref435 implementation this isn't needed, so >IntEnum is just: > >class IntEnum(int, Enum): > ''' > Class where every instance is a subclass of int. > ''' > >So why don't we just drop IntEnum from the API and tell users they should >do the above explicitly, i.e.: +1 -Barry From barry at python.org Thu May 2 17:23:56 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 2 May 2013 08:23:56 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: References: <51814F1A.1030104@stoneleaf.us> Message-ID: <20130502082356.3babbfc9@anarchist> On May 01, 2013, at 08:47 PM, Georg Brandl wrote: >Wait a moment... 
it might not be immediately useful for IntEnums (however, >that's because base Enum currently defines __int__ which I find questionable), And broken. And unnecessary. :) >>> class Foo(Enum): ... a = 'a' ... b = 'b' ... >>> int(Foo.a) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: __int__ returned non-int (type str) ...remove Enum.__int__()... >>> class Bar(int, Enum): ... a = 1 ... b = 2 ... >>> int(Bar.a) 1 So yes, Enum.__int__() should be removed. -Barry From ethan at stoneleaf.us Thu May 2 17:15:02 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 02 May 2013 08:15:02 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: <20130502075730.0768263c@anarchist> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <517DF0C7.7080905@stoneleaf.us> <517E1828.1080802@stoneleaf.us> <20130430231843.341c7659@anarchist> <5180B635.7000904@stoneleaf.us> <518164C8.2090204@hastings.org> <20130502075730.0768263c@anarchist> Message-ID: <518282F6.5000401@stoneleaf.us> On 05/02/2013 07:57 AM, Barry Warsaw wrote: > On May 01, 2013, at 11:54 AM, Larry Hastings wrote: > >> On 04/30/2013 11:29 PM, Ethan Furman wrote: >>> On 04/30/2013 11:18 PM, Barry Warsaw wrote: >>>> On Apr 28, 2013, at 11:50 PM, Ethan Furman wrote: >>>> >>>>> But as soon as: >>>>> >>>>> type(Color.red) is Color # True >>>>> type(MoreColor.red) is MoreColor # True >>>>> >>>>> then: >>>>> >>>>> Color.red is MoreColor.red # must be False, no? >>>>> >>>>> >>>>> If that last statement can still be True, I'd love it if someone >>> showed me >>>>> how. >>>> >>>> class Foo: >>>> a = object() >>>> b = object() >>>> >>>> class Bar(Foo): >>>> c = object() >>>> >>>>>>> Foo.a is Bar.a >>>> True >>> >>> Wow. I think I'm blushing from embarrassment. >>> >>> Thank you for answering my question, Barry. >> >> Wait, what?
I don't see how Barry's code answers your question. In his >> example, type(a) == type(b) == type(c) == object. You were asking "how can >> Color.red and MoreColor.red be the same object if they are of different >> types?" >> >> p.s. They can't. > > Sure, why not? In "normal" Python, Bar inherits a from Foo, it doesn't define > it so it's exactly the same object. Thus if you access that object through > the superclass, you get the same object as when you access it through the > subclass. > > So Foo.a plays the role of Color.red and Bar.a plays the role of > MoreColor.red. Same object, thus `Foo.a is Bar.a` is equivalent to `Color.red > is MoreColor.red`. Same object, true, but my question was if `type(Bar.a) is Bar`, and in your reply `type(Bar.a) is object`. -- ~Ethan~ From larry at hastings.org Thu May 2 17:42:14 2013 From: larry at hastings.org (Larry Hastings) Date: Thu, 02 May 2013 08:42:14 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: <20130502075730.0768263c@anarchist> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <517DF0C7.7080905@stoneleaf.us> <517E1828.1080802@stoneleaf.us> <20130430231843.341c7659@anarchist> <5180B635.7000904@stoneleaf.us> <518164C8.2090204@hastings.org> <20130502075730.0768263c@anarchist> Message-ID: <51828956.7080100@hastings.org> On 05/02/2013 07:57 AM, Barry Warsaw wrote: > On May 01, 2013, at 11:54 AM, Larry Hastings wrote: > >> On 04/30/2013 11:29 PM, Ethan Furman wrote: >>> On 04/30/2013 11:18 PM, Barry Warsaw wrote: >>>> On Apr 28, 2013, at 11:50 PM, Ethan Furman wrote: >>>> >>>>> But as soon as: >>>>> >>>>> type(Color.red) is Color # True >>>>> type(MoreColor.red) is MoreColor # True >>>>> >>>>> then: >>>>> >>>>> Color.red is MoreColor.red # must be False, no? 
>>>>> >>>>> >>>>> If that last statement can still be True, I'd love it if someone >>> showed me >>>>> how. >>>> class Foo: >>>> a = object() >>>> b = object() >>>> >>>> class Bar(Foo): >>>> c = object() >>>> >>>>>>> Foo.a is Bar.a >>>> True >>> Wow. I think I'm blushing from embarrassment. >>> >>> Thank you for answering my question, Barry. >> Wait, what? I don't see how Barry's code answers your question. In his >> example, type(a) == type(b) == type(c) == object. You were asking "how can >> Color.red and MoreColor.red be the same object if they are of different >> types?" >> >> p.s. They can't. > Sure, why not? In "normal" Python, Bar inherits a from Foo, it doesn't define > it so it's exactly the same object. Thus if you access that object through > the superclass, you get the same object as when you access it through the > subclass. > > So Foo.a plays the role of Color.red and Bar.a plays the role of > MoreColor.red. Same object, thus `Foo.a is Bar.a` is equivalent to `Color.red > is MoreColor.red`. So you're saying Color.red and MoreColor.red are the same object. Which means they have the same type. But in Ethan's original example above, type(Color.red) == Color, and type(MoreColor.red) == MoreColor. Those are different types. So, for the second time: How can Color.red and MoreColor.red be the same object when they are of different types? p.s. They can't. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Thu May 2 17:47:47 2013 From: guido at python.org (Guido van Rossum) Date: Thu, 2 May 2013 08:47:47 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? 
In-Reply-To: <20130502075813.579f24c0@anarchist> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> Message-ID: On Thu, May 2, 2013 at 7:58 AM, Barry Warsaw wrote: > On May 02, 2013, at 11:44 AM, Greg Ewing wrote: > >>Barry Warsaw wrote: >>> Why isn't getattr() for lookup by name >>> good enough? >> >>Because it will find things that are not enum items, >>e.g. '__str__'. > > Why does that matter? I claim it doesn't. The name lookup is only relevant if you already know that you have a valid name of an enum in the class, e.g. if you know that a Color name was written earlier. If you don't, you should do some other check, e.g. "if x in Color:". (Note that from this you cannot derive that Color[x] should work.) -- --Guido van Rossum (python.org/~guido) From barry at python.org Thu May 2 17:54:00 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 2 May 2013 08:54:00 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: <51819308.3090704@stoneleaf.us> References: <51814F1A.1030104@stoneleaf.us> <51819308.3090704@stoneleaf.us> Message-ID: <20130502085400.6fbf5547@anarchist> On May 01, 2013, at 03:11 PM, Ethan Furman wrote: >The reason __int__ is there is because pure Enums should be using plain ints >as their value 95% or more of the time, and being able to easily convert to a >real int for either database storage, wire transmission, or C functions is a >Good Thing. But then, Foo.a.value is good enough. -Barry From barry at python.org Thu May 2 17:57:52 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 2 May 2013 08:57:52 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? 
In-Reply-To: <51828956.7080100@hastings.org> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <517DF0C7.7080905@stoneleaf.us> <517E1828.1080802@stoneleaf.us> <20130430231843.341c7659@anarchist> <5180B635.7000904@stoneleaf.us> <518164C8.2090204@hastings.org> <20130502075730.0768263c@anarchist> <51828956.7080100@hastings.org> Message-ID: <20130502085752.5b528960@anarchist> On May 02, 2013, at 08:42 AM, Larry Hastings wrote: >So, for the second time: How can Color.red and MoreColor.red be the same >object when they are of different types? It's a moot point now given Guido's pronouncement. -Barry From eliben at gmail.com Thu May 2 17:58:07 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 2 May 2013 08:58:07 -0700 Subject: [Python-Dev] Enum: subclassing? In-Reply-To: <20130502085400.6fbf5547@anarchist> References: <51814F1A.1030104@stoneleaf.us> <51819308.3090704@stoneleaf.us> <20130502085400.6fbf5547@anarchist> Message-ID: On Thu, May 2, 2013 at 8:54 AM, Barry Warsaw wrote: > On May 01, 2013, at 03:11 PM, Ethan Furman wrote: > > >The reason __int__ is there is because pure Enums should be using plain > ints > >as their value 95% or more of the time, and being able to easily convert > to a > >real int for either database storage, wire transmission, or C functions > is a > >Good Thing. > > But then, Foo.a.value is good enough. > __int__ is out of Enum, Barry. You're preaching to the choir now ;-) Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Thu May 2 18:00:35 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 2 May 2013 09:00:35 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? 
In-Reply-To: <20130502085752.5b528960@anarchist> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <517DF0C7.7080905@stoneleaf.us> <517E1828.1080802@stoneleaf.us> <20130430231843.341c7659@anarchist> <5180B635.7000904@stoneleaf.us> <518164C8.2090204@hastings.org> <20130502075730.0768263c@anarchist> <51828956.7080100@hastings.org> <20130502085752.5b528960@anarchist> Message-ID: On Thu, May 2, 2013 at 8:57 AM, Barry Warsaw wrote: > On May 02, 2013, at 08:42 AM, Larry Hastings wrote: > > >So, for the second time: How can Color.red and MoreColor.red be the same > >object when they are of different types? > > It's a moot point now given Guido's pronouncement. > Correct. There's no Color.red and MoreColor.red. Subclassing is allowed only of enums that define no members. So this is forbidden: >>> class MoreColor(Color): ... pink = 17 ... TypeError: Cannot subclass enumerations But this is allowed: >>> class Foo(Enum): ... def some_behavior(self): ... pass ... >>> class Bar(Foo): ... happy = 1 ... sad = 2 ... Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From pieter at nagel.co.za Thu May 2 19:54:02 2013 From: pieter at nagel.co.za (Pieter Nagel) Date: Thu, 02 May 2013 19:54:02 +0200 Subject: [Python-Dev] PEP 428: stat caching undesirable? In-Reply-To: References: <1367393548.2868.262.camel@basilisk> <20130501121821.093fa030@fsol> <1367407340.2868.294.camel@basilisk> Message-ID: <1367517242.2868.387.camel@basilisk> On Wed, 2013-05-01 at 23:54 +1000, Nick Coghlan wrote: > > However, I like the idea of a rich "stat" object, with "path.stat()" > and "path.cached_stat()" accessors on the path objects. > Since it seems there is some support for my proposal, I just posted to python-ideas to get an idea how much support there would be for such a PEP. 
-- Pieter Nagel From ethan at stoneleaf.us Thu May 2 21:07:15 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 02 May 2013 12:07:15 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity Message-ID: <5182B963.9030304@stoneleaf.us> In order for the Enum convenience function to be pickleable, we have this line of code in the metaclass: enum_class.__module__ = sys._getframe(1).f_globals['__name__'] This works fine for Cpython, but what about the others? -- ~Ethan~ From benjamin at python.org Thu May 2 21:48:14 2013 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 2 May 2013 15:48:14 -0400 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <5182B963.9030304@stoneleaf.us> References: <5182B963.9030304@stoneleaf.us> Message-ID: 2013/5/2 Ethan Furman : > In order for the Enum convenience function to be pickleable, we have this > line of code in the metaclass: > > enum_class.__module__ = sys._getframe(1).f_globals['__name__'] > > This works fine for Cpython, but what about the others? Regardless of that, perhaps we should come up with better ways to do this. -- Regards, Benjamin From solipsis at pitrou.net Thu May 2 22:10:03 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 2 May 2013 22:10:03 +0200 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity References: <5182B963.9030304@stoneleaf.us> Message-ID: <20130502221003.6180b90c@fsol> On Thu, 2 May 2013 15:48:14 -0400 Benjamin Peterson wrote: > 2013/5/2 Ethan Furman : > > In order for the Enum convenience function to be pickleable, we have this > > line of code in the metaclass: > > > > enum_class.__module__ = sys._getframe(1).f_globals['__name__'] > > > > This works fine for Cpython, but what about the others? > > Regardless of that, perhaps we should come up with better ways to do this. 
Two things that were suggested in private: 1) ask users to pass the module name to the convenience function explicitly (i.e. pass "seasonmodule.Season" instead of "Season" as the class "name"). Guido doesn't like it :-) 2) ditch the "convenience function" and replace it with a regular class-based syntax. Ethan doesn't like it :-) Regards Antoine. From eliben at gmail.com Thu May 2 22:15:00 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 2 May 2013 13:15:00 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <20130502221003.6180b90c@fsol> References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> Message-ID: On Thu, May 2, 2013 at 1:10 PM, Antoine Pitrou wrote: > On Thu, 2 May 2013 15:48:14 -0400 > Benjamin Peterson wrote: > > 2013/5/2 Ethan Furman : > > > In order for the Enum convenience function to be pickleable, we have > this > > > line of code in the metaclass: > > > > > > enum_class.__module__ = sys._getframe(1).f_globals['__name__'] > > > > > > This works fine for Cpython, but what about the others? > > > > Regardless of that, perhaps we should come up with better ways to do > this. > > Two things that were suggested in private: > > 1) ask users to pass the module name to the convenience function > explicitly (i.e. pass "seasonmodule.Season" instead of "Season" as the > class "name"). Guido doesn't like it :-) > > 2) ditch the "convenience function" and replace it with a regular > class-based syntax. Ethan doesn't like it :-) > Re (2), we already have the hack in stdlib in namedtuple, so not allowing it for an enum is a step backwards. If sys._getframe(1).f_globals['__name__'] feels hackish, maybe it can be shortened to a convenience function the stdlib provides? Are there conditions where it doesn't produce what we expect from it? The point at which the enumeration is defined resides in *some* module, no?
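A rough sketch of how the frame lookup behaves, assuming CPython (sys._getframe is an implementation detail and may be missing or restricted elsewhere). The function make_class and the module names user_module and wrapper_module are hypothetical stand-ins, not part of ref435 or the PEP; separate modules are simulated by running code with distinct globals dictionaries:

```python
import sys


def make_class(name):
    """Hypothetical stand-in for a convenience API that guesses its caller's module."""
    cls = type(name, (), {})
    # Same expression as quoted in the thread: look one frame up and
    # read the module name out of that frame's globals.
    cls.__module__ = sys._getframe(1).f_globals['__name__']
    return cls


# Case 1: "user_module" calls make_class() directly.
user_globals = {'__name__': 'user_module', 'make_class': make_class}
exec("Season = make_class('Season')", user_globals)
direct_module = user_globals['Season'].__module__

# Case 2: a helper defined in "wrapper_module" calls make_class()
# on behalf of "user_module".
wrapper_globals = {'__name__': 'wrapper_module', 'make_class': make_class}
exec("def wrap(name):\n    return make_class(name)\n", wrapper_globals)

user_globals_2 = {'__name__': 'user_module', 'wrap': wrapper_globals['wrap']}
exec("Season = wrap('Season')", user_globals_2)
wrapped_module = user_globals_2['Season'].__module__
```

In the direct case the recorded module is the caller's, which is what pickling by qualified name needs. In the wrapped case the frame one level up belongs to the helper, so the class is attributed to the helper's module instead of the module that ends up holding the name.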
Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.brandl at gmx.net Thu May 2 22:18:55 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 02 May 2013 22:18:55 +0200 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <20130502221003.6180b90c@fsol> References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> Message-ID: Am 02.05.2013 22:10, schrieb Antoine Pitrou: > On Thu, 2 May 2013 15:48:14 -0400 > Benjamin Peterson wrote: >> 2013/5/2 Ethan Furman : >> > In order for the Enum convenience function to be pickleable, we have this >> > line of code in the metaclass: >> > >> > enum_class.__module__ = sys._getframe(1).f_globals['__name__'] >> > >> > This works fine for Cpython, but what about the others? >> >> Regardless of that, perhaps we should come up with better ways to do this. > > Two things that were suggested in private: > > 1) ask users to pass the module name to the convenience function > explicitly (i.e. pass "seasonmodule.Season" instead of "Season" as the > class "name"). Guido doesn't like it :-) > > 2) dicth the "convenience function" and replace it with a regular > class-based syntax. Ethan doesn't like it :-) 5) accept that convenience-created enums have restrictions such as no picklability and point them out in the docs? Georg From solipsis at pitrou.net Thu May 2 22:22:10 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 2 May 2013 22:22:10 +0200 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> Message-ID: <20130502222210.1cc86c16@fsol> On Thu, 2 May 2013 13:15:00 -0700 Eli Bendersky wrote: > > Two things that were suggested in private: > > > > 1) ask users to pass the module name to the convenience function > > explicitly (i.e. 
pass "seasonmodule.Season" instead of "Season" as the > > class "name"). Guido doesn't like it :-) > > > > 2) dicth the "convenience function" and replace it with a regular > > class-based syntax. Ethan doesn't like it :-) > > > > Re (2), we already have the hack in stdlib in namedtuple, so not allowing > it for an enum is a step backwards. That's a fallacy. There is no step backwards if you adopt a class-based syntax, which is just as convenient as the proposed "convenience function". I have a hard time understanding that calling a function to declare a class is suddenly considered "convenient". > If > sys._getframe(1).f_globals['__name__'] feels hackish, maybe it can be > shortened to a convenience function the stdlib provides? It's not the notation which is hackish, it's the fact that you are inspecting the frame stack in the hope of getting the right information. What if someone wants to write another convenience function that wraps your convenience function? What if your code is executing from some kind of step-by-step debugger which inserts an additional frame in the call stack? What if someone wants the enum to be nested inside another class (rather than reside at the module top-level)? Regards Antoine. From fwierzbicki at gmail.com Thu May 2 22:18:18 2013 From: fwierzbicki at gmail.com (fwierzbicki at gmail.com) Date: Thu, 2 May 2013 13:18:18 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <5182B963.9030304@stoneleaf.us> References: <5182B963.9030304@stoneleaf.us> Message-ID: On Thu, May 2, 2013 at 12:07 PM, Ethan Furman wrote: > In order for the Enum convenience function to be pickleable, we have this > line of code in the metaclass: > > enum_class.__module__ = sys._getframe(1).f_globals['__name__'] > > This works fine for Cpython, but what about the others? This should work for Jython, but I can't say I like it. 
I believe IronPython has a sort of speedup mode that disallows the use of _getframe, and I'd like to add this to Jython someday. -Frank From eliben at gmail.com Thu May 2 22:33:21 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 2 May 2013 13:33:21 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <20130502222210.1cc86c16@fsol> References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> Message-ID: On Thu, May 2, 2013 at 1:22 PM, Antoine Pitrou wrote: > On Thu, 2 May 2013 13:15:00 -0700 > Eli Bendersky wrote: > > > Two things that were suggested in private: > > > > > > 1) ask users to pass the module name to the convenience function > > > explicitly (i.e. pass "seasonmodule.Season" instead of "Season" as the > > > class "name"). Guido doesn't like it :-) > > > > > > 2) dicth the "convenience function" and replace it with a regular > > > class-based syntax. Ethan doesn't like it :-) > > > > > > > Re (2), we already have the hack in stdlib in namedtuple, so not allowing > > it for an enum is a step backwards. > > That's a fallacy. There is no step backwards if you adopt a class-based > syntax, which is just as convenient as the proposed "convenience > function". I have a hard time understanding that calling a function to > declare a class is suddenly considered "convenient". > > > If > > sys._getframe(1).f_globals['__name__'] feels hackish, maybe it can be > > shortened to a convenience function the stdlib provides? > > It's not the notation which is hackish, it's the fact that you are > inspecting the frame stack in the hope of getting the right information. > > What if someone wants to write another convenience function that wraps > your convenience function? What if your code is executing from some > kind of step-by-step debugger which inserts an additional frame in the > call stack? 
What if someone wants the enum to be nested inside another > class (rather than reside at the module top-level)? > Would nesting the non-convenience Enum in a function or a class allow one to pickle it? I think programmers who want their libraries to be pickle-able already have to be aware of some restrictions about what can and cannot be pickled. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Thu May 2 22:39:05 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 2 May 2013 13:39:05 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> Message-ID: <20130502133905.7b37bae5@anarchist> On May 02, 2013, at 10:18 PM, Georg Brandl wrote: >5) accept that convenience-created enums have restrictions such as no >picklability and point them out in the docs? That would work fine for me, but ultimately I'm with Guido. I just don't want to have to pass the module name in. -Barry From guido at python.org Thu May 2 22:39:21 2013 From: guido at python.org (Guido van Rossum) Date: Thu, 2 May 2013 13:39:21 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> Message-ID: On Thu, May 2, 2013 at 1:18 PM, fwierzbicki at gmail.com wrote: > On Thu, May 2, 2013 at 12:07 PM, Ethan Furman wrote: >> In order for the Enum convenience function to be pickleable, we have this >> line of code in the metaclass: >> >> enum_class.__module__ = sys._getframe(1).f_globals['__name__'] >> >> This works fine for Cpython, but what about the others? > This should work for Jython, but I can't say I like it. I believe > IronPython has a sort of speedup mode that disallows the use of > _getframe, and I'd like to add this to Jython someday. 
This particular function is typically only called at module load time, so speeding it up isn't worth it. FWIW, as Eli pointed out, namedtuple() does the same thing (since Python 2.6), so we'll just copy that code (refactoring it doesn't have to hold up the PEP). The only other alternative I find acceptable is not to have the convenience API at all. That's Eli's call. [Eli] > Would nesting the non-convenience Enum in a function or a class allow one to > pickle it? I think programmers who want their libraries to be pickle-able > already have to be aware of some restrictions about what can and cannot be > pickled. Apparently it hasn't been a problem for namedtuple. Calling namedtuple() or Enum() in another function is similar to a class statement inside a function -- the resulting class isn't picklable. (But from this, don't conclude that it's not important for namedtuple() or Enum() to return a picklable class. It is important. It is just not important to try to make it work when they are called through some other wrapper -- there's just not much use for such a pattern.) -- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Thu May 2 22:42:53 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 2 May 2013 22:42:53 +0200 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> Message-ID: <20130502224253.2cd18384@fsol> On Thu, 2 May 2013 13:33:21 -0700 Eli Bendersky wrote: > On Thu, May 2, 2013 at 1:22 PM, Antoine Pitrou wrote: > > > On Thu, 2 May 2013 13:15:00 -0700 > > Eli Bendersky wrote: > > > > Two things that were suggested in private: > > > > > > > > 1) ask users to pass the module name to the convenience function > > > > explicitly (i.e. pass "seasonmodule.Season" instead of "Season" as the > > > > class "name"). 
Guido doesn't like it :-) > > > > > > > > 2) dicth the "convenience function" and replace it with a regular > > > > class-based syntax. Ethan doesn't like it :-) > > > > > > > > > > Re (2), we already have the hack in stdlib in namedtuple, so not allowing > > > it for an enum is a step backwards. > > > > That's a fallacy. There is no step backwards if you adopt a class-based > > syntax, which is just as convenient as the proposed "convenience > > function". I have a hard time understanding that calling a function to > > declare a class is suddenly considered "convenient". > > > > > If > > > sys._getframe(1).f_globals['__name__'] feels hackish, maybe it can be > > > shortened to a convenience function the stdlib provides? > > > > It's not the notation which is hackish, it's the fact that you are > > inspecting the frame stack in the hope of getting the right information. > > > > What if someone wants to write another convenience function that wraps > > your convenience function? What if your code is executing from some > > kind of step-by-step debugger which inserts an additional frame in the > > call stack? What if someone wants the enum to be nested inside another > > class (rather than reside at the module top-level)? > > > > Would nesting the non-convenience Enum in a function or a class allow one > to pickle it? Once PEP 3154 is implemented (Alexandre is on it :-)), nested classes should be picklable. As for classes inside functions, it sounds quite impossible (how do you instantiate the function namespace without calling the function?). Regards Antoine. 
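[Editorial sketch, not part of the original thread.] The point above about classes created inside functions can be checked directly. The following is a minimal sketch, assuming Python 3.3 or later (for __qualname__); the names make_local_class, can_pickle and Season are hypothetical and chosen only for illustration. A class defined inside a function carries '<locals>' in its qualified name, and pickle, which stores classes by reference, cannot look it up again.

```python
import pickle


def make_local_class():
    """Define a class inside a function (hypothetical example)."""
    class Season:
        pass
    return Season


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # The exact exception type differs between Python versions.
        return False


LocalSeason = make_local_class()

# The qualified name records that the class lives in a function namespace.
local_qualname = LocalSeason.__qualname__

# Neither the class nor its instances can be pickled by reference.
local_class_picklable = can_pickle(LocalSeason)
local_instance_picklable = can_pickle(LocalSeason())
```

The same limitation applies whether the class comes from a class statement or from a function call made inside another function, which is the comparison drawn in the thread.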
From fijall at gmail.com Thu May 2 22:45:18 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Thu, 2 May 2013 22:45:18 +0200 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <5182B963.9030304@stoneleaf.us> References: <5182B963.9030304@stoneleaf.us> Message-ID: On Thu, May 2, 2013 at 9:07 PM, Ethan Furman wrote: > In order for the Enum convenience function to be pickleable, we have this > line of code in the metaclass: > > enum_class.__module__ = sys._getframe(1).f_globals['__name__'] > > This works fine for Cpython, but what about the others? It's ugly as hell, but it's not a performance problem for PyPy, since this is executed at module load time (you probably won't jit that code anyway) Cheers, fijal From eliben at gmail.com Thu May 2 22:48:24 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 2 May 2013 13:48:24 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <20130502133905.7b37bae5@anarchist> References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502133905.7b37bae5@anarchist> Message-ID: On Thu, May 2, 2013 at 1:39 PM, Barry Warsaw wrote: > On May 02, 2013, at 10:18 PM, Georg Brandl wrote: > > >5) accept that convenience-created enums have restrictions such as no > >picklability and point them out in the docs? > > That would work fine for me, but ultimately I'm with Guido. I just don't > want > to have to pass the module name in. > The problem with (5) is this: you use some library that exports an enumeration, and you want to use pickling. Now you depend on how the library implemented it: if it used the convenience API, you can't pickle. If it used the class API, you can. Eli -------------- next part -------------- An HTML attachment was scrubbed...
URL: From benhoyt at gmail.com Thu May 2 22:49:13 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Fri, 3 May 2013 08:49:13 +1200 Subject: [Python-Dev] PEP 428: stat caching undesirable? In-Reply-To: References: <1367393548.2868.262.camel@basilisk> <20130501121821.093fa030@fsol> <1367407340.2868.294.camel@basilisk> <5181414A.7000609@python.org> Message-ID: > Actually, there's Gregory's scandir() implementation (returning a > generator to be able to cope with large directories) on it's way: > > http://bugs.python.org/issue11406 > > It's already been suggested to make it return a tuple (with d_type). > I'm sure a review of the code (especially the Windows implementation) > will be welcome. > Ah, thanks for the pointer, I hadn't seen that. Definitely looks like I should "merge" Betterwalk with it. I'll see if I can spend some time on it again soon. I'd love to see scandir/iterdir go into Python 3.4, and I'd be very chuffed if iterdir_stat got in, because that's the one that can really start speeding up operations like os.walk(). -Ben -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Thu May 2 22:50:21 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 2 May 2013 13:50:21 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> Message-ID: On Thu, May 2, 2013 at 1:39 PM, Guido van Rossum wrote: > On Thu, May 2, 2013 at 1:18 PM, fwierzbicki at gmail.com > wrote: > > On Thu, May 2, 2013 at 12:07 PM, Ethan Furman > wrote: > >> In order for the Enum convenience function to be pickleable, we have > this > >> line of code in the metaclass: > >> > >> enum_class.__module__ = sys._getframe(1).f_globals['__name__'] > >> > >> This works fine for Cpython, but what about the others? > > This should work for Jython, but I can't say I like it. 
I believe > > IronPython has a sort of speedup mode that disallows the use of > > _getframe, and I'd like to add this to Jython someday. > > This particular function is typically only called at module load time, > so speeding it up isn't worth it. > > FWIW, as Eli pointed out, namedtuple() does the same thing (since > Python 2.6), so we'll just copy that code (refactoring it doesn't have > to hold up the PEP). The only other alternative I find acceptable is > not to have the convenience API at all. That's Eli's call. > I really prefer having the convenience API and acknowledging that it has some limitations (i.e. pickling enums that were created with the convenience API and are nested in classes). > > [Eli] > > Would nesting the non-convenience Enum in a function or a class allow > one to > > pickle it? I think programmers who want their libraries to be pickle-able > > already have to be aware of some restrictions about what can and cannot > be > > pickled. > > Apparently it hasn't been a problem for namedtuple. Calling > namedtuple() or Enum() in another function is similar to a class > statement inside a function -- the resulting class isn't picklable. > > (But from this, don't conclude that it's not important for > namedtuple() or Enum() to return a picklable class. It is important. > It is just not important to try to make it work when they are called > through some other wrapper -- there's just not much use for such a > pattern.) > I agree. Eli -------------- next part -------------- An HTML attachment was scrubbed...
URL: From solipsis at pitrou.net Thu May 2 22:57:16 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 2 May 2013 22:57:16 +0200 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502133905.7b37bae5@anarchist> Message-ID: <20130502225716.65f5b872@fsol> On Thu, 2 May 2013 13:48:24 -0700 Eli Bendersky wrote: > On Thu, May 2, 2013 at 1:39 PM, Barry Warsaw wrote: > > > On May 02, 2013, at 10:18 PM, Georg Brandl wrote: > > > > >5) accept that convenience-created enums have restrictions such as no > > >picklability and point them out in the docs? > > > > That would work fine for me, but ultimately I'm with Guido. I just don't > > want > > to have to pass the module name in. > > > > The problem with (5) is this: you use some library that exports an > enumeration, and you want to use pickling. Now you depend on the way the > library implemented - if it used the convenience API, you can't pickle. If > it used the class API, you can. A good reason to ditch the function-based syntax. Regards Antoine. From eliben at gmail.com Thu May 2 22:52:29 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 2 May 2013 13:52:29 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <20130502224253.2cd18384@fsol> References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> Message-ID: > > > On Thu, 2 May 2013 13:15:00 -0700 > > > Eli Bendersky wrote: > > > > > Two things that were suggested in private: > > > > > > > > > > 1) ask users to pass the module name to the convenience function > > > > > explicitly (i.e. pass "seasonmodule.Season" instead of "Season" as > the > > > > > class "name"). 
Guido doesn't like it :-) > > > > > > > > > > 2) dicth the "convenience function" and replace it with a regular > > > > > class-based syntax. Ethan doesn't like it :-) > > > > > > > > > > > > > Re (2), we already have the hack in stdlib in namedtuple, so not > allowing > > > > it for an enum is a step backwards. > > > > > > That's a fallacy. There is no step backwards if you adopt a class-based > > > syntax, which is just as convenient as the proposed "convenience > > > function". I have a hard time understanding that calling a function to > > > declare a class is suddenly considered "convenient". > > > > > > > If > > > > sys._getframe(1).f_globals['__name__'] feels hackish, maybe it can be > > > > shortened to a convenience function the stdlib provides? > > > > > > It's not the notation which is hackish, it's the fact that you are > > > inspecting the frame stack in the hope of getting the right > information. > > > > > > What if someone wants to write another convenience function that wraps > > > your convenience function? What if your code is executing from some > > > kind of step-by-step debugger which inserts an additional frame in the > > > call stack? What if someone wants the enum to be nested inside another > > > class (rather than reside at the module top-level)? > > > > > > > Would nesting the non-convenience Enum in a function or a class allow one > > to pickle it? > > Once PEP 3154 is implemented (Alexandre is on it :-)), nested classes > should be picklable. Interesting, I did not know that. > As for classes inside functions, it sounds quite > impossible (how do you instantiate the function namespace without > calling the function?). > True. Back to my question from before, though - do we have a real technical limitation of having something like inspect.what_module_am_i_now_in() that's supposed to work for all Python code? Eli -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Thu May 2 23:05:25 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 02 May 2013 14:05:25 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> Message-ID: <5182D515.1040405@stoneleaf.us> On 05/02/2013 01:52 PM, Eli Bendersky wrote: > > Back to my question from before, though - do we have a real technical limitation of having something like > inspect.what_module_am_i_now_in() that's supposed to work for all Python code? By which you really mean inspect.what_module_was_I_called_from() ? -- ~Ethan~ From solipsis at pitrou.net Thu May 2 23:10:28 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 2 May 2013 23:10:28 +0200 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> Message-ID: <20130502231028.7994527a@fsol> On Thu, 2 May 2013 13:52:29 -0700 Eli Bendersky wrote: > > Back to my question from before, though - do we have a real technical > limitation of having something like inspect.what_module_am_i_now_in() > that's supposed to work for all Python code? I already gave an answer (e.g. the debugger case), but you are free to consider it not reasonable :) In any case, I just find the argument for a function-based syntax non-existent compared to a similarly compact class-based syntax. Regards Antoine. 
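[Editorial sketch, not part of the original thread.] The question of which module the frame lookup actually reports can be illustrated without real modules. The following is a minimal sketch, assuming CPython (or another implementation that provides sys._getframe); make_enum_like and wrapper are hypothetical stand-ins for the convenience function and a third-party helper, and the two "modules" are simulated with plain dictionaries passed to exec().

```python
import sys


def make_enum_like(name):
    """Hypothetical stand-in for the convenience function.

    Creates an empty class and guesses its module from the caller's
    frame, using the same expression quoted in the thread.
    """
    cls = type(name, (), {})
    cls.__module__ = sys._getframe(1).f_globals.get('__name__', '__main__')
    return cls


# Simulated module 'library': defines a wrapper around the function.
lib_ns = {'__name__': 'library', 'make_enum_like': make_enum_like}
exec("def wrapper(name):\n    return make_enum_like(name)\n", lib_ns)

# Simulated module 'application': calls the function directly and
# through the wrapper.
app_ns = {
    '__name__': 'application',
    'make_enum_like': make_enum_like,
    'wrapper': lib_ns['wrapper'],
}
exec("Direct = make_enum_like('Direct')\nWrapped = wrapper('Wrapped')\n", app_ns)

# The direct call records the module where the class will live.
direct_module = app_ns['Direct'].__module__

# The wrapped call records the wrapper's module instead.
wrapped_module = app_ns['Wrapped'].__module__
```

In this sketch the direct call yields 'application', while the wrapped call yields 'library', because frame 1 is then the wrapper rather than the code that ends up holding the class. That is the wrapper scenario raised earlier in the thread.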
From eliben at gmail.com Thu May 2 23:11:53 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 2 May 2013 14:11:53 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <5182D515.1040405@stoneleaf.us> References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <5182D515.1040405@stoneleaf.us> Message-ID: On Thu, May 2, 2013 at 2:05 PM, Ethan Furman wrote: > On 05/02/2013 01:52 PM, Eli Bendersky wrote: > >> >> Back to my question from before, though - do we have a real technical >> limitation of having something like >> inspect.what_module_am_i_now_in() that's supposed to work for all >> Python code? >> > > By which you really mean inspect.what_module_was_I_called_from() ? > > Yes, I guess this is what I meant by the "now_in" part. Let's be precise: Animal = Enum('Animal', '...........') The call to Enum is the interesting part here. It happens in some library and Animal members can then be passed around. But we really want the module where Enum() was invoked to create Animal in the first place. Eli From eliben at gmail.com Thu May 2 23:15:40 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 2 May 2013 14:15:40 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <20130502231028.7994527a@fsol> References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> Message-ID: On Thu, May 2, 2013 at 2:10 PM, Antoine Pitrou wrote: > On Thu, 2 May 2013 13:52:29 -0700 > Eli Bendersky wrote: > > > > Back to my question from before, though - do we have a real technical > > limitation of having something like inspect.what_module_am_i_now_in() > > that's supposed to work for all Python code?
> > I already gave an answer (e.g. the debugger case), but you are free to > consider it not reasonable :) Sorry, but I do find the argument "let's not have a convenience syntax because enums created with such syntax won't pickle properly from within a debugger" not convincing enough :-) It may be just me though, and I'm open to other opinions. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Thu May 2 23:16:34 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 2 May 2013 14:16:34 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <20130502225716.65f5b872@fsol> References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502133905.7b37bae5@anarchist> <20130502225716.65f5b872@fsol> Message-ID: <20130502141634.2df18866@anarchist> On May 02, 2013, at 10:57 PM, Antoine Pitrou wrote: >On Thu, 2 May 2013 13:48:24 -0700 >> The problem with (5) is this: you use some library that exports an >> enumeration, and you want to use pickling. Now you depend on the way the >> library implemented - if it used the convenience API, you can't pickle. If >> it used the class API, you can. > >A good reason to ditch the function-based syntax. Why? Not everything is picklable. Oh well. 
-Barry From benjamin at python.org Thu May 2 23:19:30 2013 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 2 May 2013 17:19:30 -0400 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> Message-ID: 2013/5/2 Eli Bendersky : > > > > On Thu, May 2, 2013 at 1:10 PM, Antoine Pitrou wrote: >> >> On Thu, 2 May 2013 15:48:14 -0400 >> Benjamin Peterson wrote: >> > 2013/5/2 Ethan Furman : >> > > In order for the Enum convenience function to be pickleable, we have >> > > this >> > > line of code in the metaclass: >> > > >> > > enum_class.__module__ = sys._getframe(1).f_globals['__name__'] >> > > >> > > This works fine for Cpython, but what about the others? >> > >> > Regardless of that, perhaps we should come up with better ways to do >> > this. >> >> Two things that were suggested in private: >> >> 1) ask users to pass the module name to the convenience function >> explicitly (i.e. pass "seasonmodule.Season" instead of "Season" as the >> class "name"). Guido doesn't like it :-) >> >> 2) dicth the "convenience function" and replace it with a regular >> class-based syntax. Ethan doesn't like it :-) > > > Re (2), we already have the hack in stdlib in namedtuple, so not allowing it > for an enum is a step backwards. If sys._getframe(1).f_globals['__name__'] > feels hackish, maybe it can be shortened to a convenience function the > stdlib provides? Are there conditions where it doesn't produce what we > expect from it? The point at which the enumeration is defined resides in > *some* module, no? I disagree that not allowing code smell to spread is a step backwards. Rather, we should realize that this is a common problem and find a proper solution instead of further propagating this hack.
-- Regards, Benjamin From solipsis at pitrou.net Thu May 2 23:26:30 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 2 May 2013 23:26:30 +0200 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502133905.7b37bae5@anarchist> <20130502225716.65f5b872@fsol> <20130502141634.2df18866@anarchist> Message-ID: <20130502232630.43c05d77@fsol> On Thu, 2 May 2013 14:16:34 -0700 Barry Warsaw wrote: > On May 02, 2013, at 10:57 PM, Antoine Pitrou wrote: > > >On Thu, 2 May 2013 13:48:24 -0700 > >> The problem with (5) is this: you use some library that exports an > >> enumeration, and you want to use pickling. Now you depend on the way the > >> library implemented - if it used the convenience API, you can't pickle. If > >> it used the class API, you can. > > > >A good reason to ditch the function-based syntax. > > Why? Not everything is picklable. Oh well. Then why insist on the _getframe hack? You are losing me: are you bothered by picklability or not? ;-) If you are not, then fine, let's just make the function-based version *documentedly* unpicklable, and move along. Regards Antoine. From solipsis at pitrou.net Thu May 2 23:28:21 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 2 May 2013 23:28:21 +0200 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> Message-ID: <20130502232821.3e31112c@fsol> On Thu, 2 May 2013 14:15:40 -0700 Eli Bendersky wrote: > > Sorry, but I do find the argument "let's not have a convenience syntax > because enums created with such syntax won't pickle properly from within a > debugger" not convincing enough :-) Eli, it would be nice if you stopped with this claim. 
I'm not advocating "not having a convenience syntax", I'm advocating having a convenience syntax which is *class-based* rather than function-based. Debuggers are beside the point: there are two kinds of "convenience syntax" on the table; one allows pickling by construction, one requires an ugly hack which may not solve all cases (and which may apparently make Jython / IronPython mildly unhappy). Why you insist on ignoring the former and imposing the latter is beyond me. Regards Antoine. From amauryfa at gmail.com Thu May 2 23:54:43 2013 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Thu, 2 May 2013 23:54:43 +0200 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> Message-ID: 2013/5/2 Guido van Rossum > On Thu, May 2, 2013 at 1:18 PM, fwierzbicki at gmail.com > wrote: > > On Thu, May 2, 2013 at 12:07 PM, Ethan Furman > wrote: > >> In order for the Enum convenience function to be pickleable, we have > this > >> line of code in the metaclass: > >> > >> enum_class.__module__ = sys._getframe(1).f_globals['__name__'] > >> > >> This works fine for Cpython, but what about the others? > > This should work for Jython, but I can't say I like it. I believe > > IronPython has a sort of speedup mode that disallows the use of > > _getframe, and I'd like to add this to Jython someday. > > This particular function is typically only called at module load time, > so speeding it up isn't worth it. It works fine on PyPy as well. It probably also kills any JIT optimization, but it's not an issue since classes are not usually created in tight loops. -- Amaury Forgeot d'Arc -------------- next part -------------- An HTML attachment was scrubbed...
URL: From eliben at gmail.com Thu May 2 23:57:35 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 2 May 2013 14:57:35 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <20130502232821.3e31112c@fsol> References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol> Message-ID: > Eli, it would be nice if you stopped with this claim. > > I'm not advocating "not having a convenience syntax", I'm advocating > having a convenience syntax which is *class-based* rather than > function-based. > > Debuggers are beside the point: there are two kinds of "convenience > syntax" on the table; one allows pickling by construction, one > requires an ugly hack which may not solve all cases (and which may > apparently make Jython / IronPython mildly unhappy). Why you insist > on ignoring the former and imposing the latter is beyond me. > I'm not trying to belittle your class-based suggestion. I just think there are two separate issues here, and I was focusing on just one of them for now. The one I've been focusing on is how to make the function-based convenience syntax work with pickling in the vast majority of interesting cases. This appears to be possible by using the same pattern used by namedtuple, and even better by encapsulating this pattern formally in stdlib so it stops being a hack (and may actually be useful for other code too). The other issue is your proposal to have a class-based convenience syntax akin to (correct me if I got this wrong): class Animal(Enum): __values__ = 'cat dog' This is obviously a matter of preference (and hence bikeshedding), but this still looks better to me: Animal = Enum('Animal', 'cat dog') It has two advantages: 1. Shorter 2.
Parallels namedtuple, which is by now a well known and widely used construct On the other hand, your proposal has the advantage that it allows pickles without hacks in the implementation. Did I sum up the issues fairly? I don't know what to decide here. There's no clear technical merit to decide on one against the other (IMHO!), it's a matter of preference. Hopefully Guido will step in and save us from our misery ;-) Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri May 3 00:57:43 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 3 May 2013 08:57:43 +1000 Subject: [Python-Dev] [Python-checkins] peps: Add time(), call_at(). Remove call_repeatedly(). Get rid of add_*_handler() In-Reply-To: <3b1rqn1Pt1z7Lkl@mail.python.org> References: <3b1rqn1Pt1z7Lkl@mail.python.org> Message-ID: On 3 May 2013 08:34, "guido.van.rossum" wrote: > > http://hg.python.org/peps/rev/26947623fc5d > changeset: 4870:26947623fc5d > user: Guido van Rossum > date: Thu May 02 14:11:08 2013 -0700 > summary: > Add time(), call_at(). Remove call_repeatedly(). Get rid of add_*_handler() return value. > > files: > pep-3156.txt | 80 +++++++++++++++++++++------------------ > 1 files changed, 43 insertions(+), 37 deletions(-) > > > diff --git a/pep-3156.txt b/pep-3156.txt > --- a/pep-3156.txt > +++ b/pep-3156.txt > @@ -252,13 +252,12 @@ > implementation may choose not to implement the internet/socket > methods, and still conform to the other methods.) > > -- Resource management: ``close()``. > +- Miscellaneous: ``close()``, ``time()``. > > - Starting and stopping: ``run_forever()``, ``run_until_complete()``, > ``stop()``, ``is_running()``. > > -- Basic callbacks: ``call_soon()``, ``call_later()``, > - ``call_repeatedly()``. > +- Basic callbacks: ``call_soon()``, ``call_later()``, ``call_at()``. 
> > - Thread interaction: ``call_soon_threadsafe()``, > ``wrap_future()``, ``run_in_executor()``, > @@ -303,8 +302,8 @@ > Required Event Loop Methods > --------------------------- > > -Resource Management > -''''''''''''''''''' > +Miscellaneous > +''''''''''''' > > - ``close()``. Closes the event loop, releasing any resources it may > hold, such as the file descriptor used by ``epoll()`` or > @@ -313,6 +312,12 @@ > again. It may be called multiple times; subsequent calls are > no-ops. > > +- ``time()``. Returns the current time according to the event loop's > + clock. This may be ``time.time()`` or ``time.monotonic()`` or some > + other system-specific clock, but it must return a float expressing > + the time in units of approximately one second since some epoch. > + (No clock is perfect -- see PEP 418.) Should the PEP allow event loops that use decimal.Decimal? > + > Starting and Stopping > ''''''''''''''''''''' > > @@ -362,17 +367,27 @@ > ``callback(*args)`` to be called approximately ``delay`` seconds in > the future, once, unless cancelled. Returns a Handle representing > the callback, whose ``cancel()`` method can be used to cancel the > - callback. If ``delay`` is <= 0, this acts like ``call_soon()`` > - instead. Otherwise, callbacks scheduled for exactly the same time > - will be called in an undefined order. > + callback. Callbacks scheduled in the past or at exactly the same > + time will be called in an undefined order. > > -- ``call_repeatedly(interval, callback, **args)``. Like > - ``call_later()`` but calls the callback repeatedly, every (approximately) > - ``interval`` seconds, until the Handle returned is cancelled or > - the callback raises an exception. The first call is in > - approximately ``interval`` seconds. If for whatever reason the > - callback happens later than scheduled, subsequent callbacks will be > - delayed for (at least) the same amount. The ``interval`` must be > 0. > +- ``call_at(when, callback, *args)``. 
This is like ``call_later()``, > + but the time is expressed as an absolute time. There is a simple > + equivalency: ``loop.call_later(delay, callback, *args)`` is the same > + as ``loop.call_at(loop.time() + delay, callback, *args)``. It may be worth explicitly noting the time scales where floating point's dynamic range starts to significantly limit granularity. Cheers, Nick. > + > +Note: A previous version of this PEP defined a method named > +``call_repeatedly()``, which promised to call a callback at regular > +intervals. This has been withdrawn because the design of such a > +function is overspecified. On the one hand, a simple timer loop can > +easily be emulated using a callback that reschedules itself using > +``call_later()``; it is also easy to write coroutine containing a loop > +and a ``sleep()`` call (a toplevel function in the module, see below). > +On the other hand, due to the complexities of accurate timekeeping > +there are many traps and pitfalls here for the unaware (see PEP 418), > +and different use cases require different behavior in edge cases. It > +is impossible to offer an API for this purpose that is bullet-proof in > +all cases, so it is deemed better to let application designers decide > +for themselves what kind of timer loop to implement. > > Thread interaction > '''''''''''''''''' > @@ -656,12 +671,9 @@ > > - ``add_reader(fd, callback, *args)``. Arrange for > ``callback(*args)`` to be called whenever file descriptor ``fd`` is > - deemed ready for reading. Returns a Handle object which can be used > - to cancel the callback. (However, it is strongly preferred to use > - ``remove_reader()`` instead.) Calling ``add_reader()`` again for > - the same file descriptor implies a call to ``remove_reader()`` for > - the same file descriptor. (TBD: Since cancelling the Handle is not > - recommended, perhaps we should return None instead?) > + deemed ready for reading. 
Calling ``add_reader()`` again for the > + same file descriptor implies a call to ``remove_reader()`` for the > + same file descriptor. > > - ``add_writer(fd, callback, *args)``. Like ``add_reader()``, > but registers the callback for writing instead of for reading. > @@ -669,8 +681,7 @@ > - ``remove_reader(fd)``. Cancels the current read callback for file > descriptor ``fd``, if one is set. If no callback is currently set > for the file descriptor, this is a no-op and returns ``False``. > - Otherwise, it removes the callback arrangement, cancels the > - corresponding Handle, and returns ``True``. > + Otherwise, it removes the callback arrangement and returns ``True``. > > - ``remove_writer(fd)``. This is to ``add_writer()`` as > ``remove_reader()`` is to ``add_reader()``. > @@ -704,11 +715,7 @@ > '''''''''''''''' > > - ``add_signal_handler(sig, callback, *args). Whenever signal ``sig`` > - is received, arrange for ``callback(*args)`` to be called. Returns > - a Handle which can be used to cancel the signal callback. > - (Cancelling the handle causes ``remove_signal_handler()`` to be > - called the next time the signal arrives. Explicitly calling > - ``remove_signal_handler()`` is preferred.) > + is received, arrange for ``callback(*args)`` to be called. > Specifying another callback for the same signal replaces the > previous handler (only one handler can be active per signal). The > ``sig`` must be a valid sigal number defined in the ``signal`` > @@ -777,11 +784,12 @@ > Handles > ------- > > -The various methods for registering callbacks (e.g. ``call_soon()`` > -and ``add_reader()``) all return an object representing the > -registration that can be used to cancel the callback. This object is > -called a Handle (although its class name is not necessarily > -``Handle``). 
Handles are opaque and have only one public method: > +The various methods for registering one-off callbacks > +(``call_soon()``, ``call_later()`` and ``call_at()``) all return an > +object representing the registration that can be used to cancel the > +callback. This object is called a Handle (although its class name is > +not necessarily ``Handle``). Handles are opaque and have only one > +public method: > > - ``cancel()``. Cancel the callback. > > @@ -1354,10 +1362,6 @@ > Open Issues > =========== > > -- A ``time()`` method that returns the time according to the function > - used by the scheduler (e.g. ``time.monotonic()`` in Tulip's case)? > - What's the use case? > - > - A fuller public API for Handle? What's the use case? > > - Should we require all event loops to implement ``sock_recv()`` and > @@ -1410,6 +1414,8 @@ > - PEP 3153, while rejected, has a good write-up explaining the need > to separate transports and protocols. > > +- PEP 418 discusses the issues of timekeeping. > + > - Tulip repo: http://code.google.com/p/tulip/ > > - Nick Coghlan wrote a nice blog post with some background, thoughts > > -- > Repository URL: http://hg.python.org/peps > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > -------------- next part -------------- An HTML attachment was scrubbed... 
From jdhardy at gmail.com Fri May 3 01:08:01 2013 From: jdhardy at gmail.com (Jeff Hardy) Date: Thu, 2 May 2013 16:08:01 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> Message-ID: On Thu, May 2, 2013 at 1:18 PM, fwierzbicki at gmail.com wrote: > On Thu, May 2, 2013 at 12:07 PM, Ethan Furman wrote: >> In order for the Enum convenience function to be pickleable, we have this >> line of code in the metaclass: >> >> enum_class.__module__ = sys._getframe(1).f_globals['__name__'] >> >> This works fine for Cpython, but what about the others? > This should work for Jython, but I can't say I like it. I believe > IronPython has a sort of speedup mode that disallows the use of > _getframe, and I'd like to add this to Jython someday. It's not just a "speedup mode", it's the default. IronPython requires frames to be explicitly enabled because tracking them is about a 10% performance hit (or so Dino told me once upon a time). If you must use it, please copy the code block from namedtuple that ignores it on IronPython. - Jeff From ncoghlan at gmail.com Fri May 3 01:14:22 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 3 May 2013 09:14:22 +1000 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol> Message-ID: On 3 May 2013 08:00, "Eli Bendersky" wrote: > > > Eli, it would be nice if you stopped with this claim. >> >> >> I'm not advocating "not having a convenience syntax", I'm advocating >> having a convenience syntax which is *class-based* rather than >> function-based.
>> >> Debuggers are beside the point: there are two kinds of "convenience >> syntax" on the table; one allows pickling by construction, one >> requires an ugly hack which may not solve all cases (and which may >> apparently make Jython / IronPython mildly unhappy). Why you insist >> on ignoring the former and imposing the latter is beyond me. > > > I'm not trying to belittle our class-based suggestion. I just think there are two separate issues here, and I was focusing on just one of them for now. The one I've been focusing on is how to make the function-based convenience syntax work with pickling in the vast majority of interesting cases. This appears to be possible by using the same pattern used by namedtuple, and even better by encapsulating this pattern formally in stdlib so it stops being a hack (and may actually be useful for other code too). > > The other issue is your proposal to have a class-based convenience syntax akin to (correct me if I got this wrong): > > class Animal(Enum): > __values__ = 'cat dog' I would suggest moving the field names into the class header for a class based convenience API: class Animal(Enum, members='cat dog'): pass Cheers, Nick. > > This is obviously a matter of preference (and hence bikeshedding), but this still looks better to me: > > Animal = Enum('Animal', 'cat dog') > > It has two advantages: > > 1. Shorter > 2. Parallels namedtuple, which is by now a well known and widely used construct > > On the other hand, your proposal has the advantage that it allows pickles without hacks in the implementation. > > Did I sum up the issues fairly? > > I don't know what to decide here. There's no clear technical merit to decide on one against the other (IMHO!), it's a matter of preference. 
Hopefully Guido will step in and save us from our misery ;-) > > Eli From guido at python.org Fri May 3 01:18:42 2013 From: guido at python.org (Guido van Rossum) Date: Thu, 2 May 2013 16:18:42 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol> Message-ID: On Thu, May 2, 2013 at 4:14 PM, Nick Coghlan wrote: > I would suggest moving the field names into the class header for a class > based convenience API: > > class Animal(Enum, members='cat dog'): pass Would you propose the same for namedtuple? -- --Guido van Rossum (python.org/~guido) From greg.ewing at canterbury.ac.nz Fri May 3 01:43:06 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 03 May 2013 11:43:06 +1200 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> Message-ID: <5182FA0A.1040802@canterbury.ac.nz> Guido van Rossum wrote: > you should do some other check, > e.g. "if x in Color:".
So you don't think it's important to have an easy way to take user input that's supposed to be a Color name and either return a Color or raise a ValueError? -- Greg From greg.ewing at canterbury.ac.nz Fri May 3 01:45:22 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 03 May 2013 11:45:22 +1200 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <517DF0C7.7080905@stoneleaf.us> <517E1828.1080802@stoneleaf.us> <20130430231843.341c7659@anarchist> <5180B635.7000904@stoneleaf.us> <518164C8.2090204@hastings.org> <20130502075730.0768263c@anarchist> <51828956.7080100@hastings.org> <20130502085752.5b528960@anarchist> Message-ID: <5182FA92.70405@canterbury.ac.nz> Eli Bendersky wrote: > TypeError: Cannot subclass enumerations This message might be better phrased as "cannot extend enumerations", since we're still allowing subclassing prior to defining members. -- Greg From ethan at stoneleaf.us Fri May 3 01:50:49 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 02 May 2013 16:50:49 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? 
In-Reply-To: <5182FA92.70405@canterbury.ac.nz> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <517DF0C7.7080905@stoneleaf.us> <517E1828.1080802@stoneleaf.us> <20130430231843.341c7659@anarchist> <5180B635.7000904@stoneleaf.us> <518164C8.2090204@hastings.org> <20130502075730.0768263c@anarchist> <51828956.7080100@hastings.org> <20130502085752.5b528960@anarchist> <5182FA92.70405@canterbury.ac.nz> Message-ID: <5182FBD9.8010608@stoneleaf.us> On 05/02/2013 04:45 PM, Greg Ewing wrote: > Eli Bendersky wrote: > >> TypeError: Cannot subclass enumerations > > This message might be better phrased as "cannot extend > enumerations", since we're still allowing subclassing > prior to defining members. I like it, thanks! -- ~Ethan~ From eliben at gmail.com Fri May 3 01:57:46 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 2 May 2013 16:57:46 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: <5182FBD9.8010608@stoneleaf.us> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <517DF0C7.7080905@stoneleaf.us> <517E1828.1080802@stoneleaf.us> <20130430231843.341c7659@anarchist> <5180B635.7000904@stoneleaf.us> <518164C8.2090204@hastings.org> <20130502075730.0768263c@anarchist> <51828956.7080100@hastings.org> <20130502085752.5b528960@anarchist> <5182FA92.70405@canterbury.ac.nz> <5182FBD9.8010608@stoneleaf.us> Message-ID: On Thu, May 2, 2013 at 4:50 PM, Ethan Furman wrote: > On 05/02/2013 04:45 PM, Greg Ewing wrote: > >> Eli Bendersky wrote: >> >> TypeError: Cannot subclass enumerations >>> >> >> This message might be better phrased as "cannot extend >> enumerations", since we're still allowing subclassing >> prior to defining members. >> > > I like it, thanks! 
> +1 From barry at python.org Fri May 3 02:01:54 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 2 May 2013 17:01:54 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol> Message-ID: <20130502170154.0b3ab8d1@anarchist> On May 03, 2013, at 09:14 AM, Nick Coghlan wrote: >> The other issue is your proposal to have a class-based convenience syntax >akin to (correct me if I got this wrong): >> >> class Animal(Enum): >> __values__ = 'cat dog' > >I would suggest moving the field names into the class header for a class >based convenience API: > >class Animal(Enum, members='cat dog'): pass Wait, what is this trying to solve? "Convenience API" is really a shorthand for "functional API". Two very different use cases that the above suggestion doesn't address. IMHO, it's not worth giving up the functional API for picklability if the technical problems cannot be resolved, especially given we already have the same problem for namedtuples. -Barry From ethan at stoneleaf.us Fri May 3 01:57:39 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 02 May 2013 16:57:39 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues?
In-Reply-To: <5182FA0A.1040802@canterbury.ac.nz> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> Message-ID: <5182FD73.9090906@stoneleaf.us> On 05/02/2013 04:43 PM, Greg Ewing wrote: > Guido van Rossum wrote: >> you should do some other check, >> e.g. "if x in Color:". > > So you don't think it's important to have an easy > way to take user input that's supposed to be a > Color name and either return a Color or raise > a ValueError? I don't believe that's what he said: > The name lookup is only relevant if you already know that you have a > valid name of an enum in the class [...] User input should qualify, and using getattr(EnumClass, user_input) will get you an AttributeError instead of a ValueError if user_input is not valid, but surely you don't mind that small difference. ;) -- ~Ethan~ From ncoghlan at gmail.com Fri May 3 03:06:40 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 3 May 2013 11:06:40 +1000 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? 
In-Reply-To: <5182FD73.9090906@stoneleaf.us> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> Message-ID: On Fri, May 3, 2013 at 9:57 AM, Ethan Furman wrote: > On 05/02/2013 04:43 PM, Greg Ewing wrote: >> >> Guido van Rossum wrote: >>> >>> you should do some other check, >>> e.g. "if x in Color:". >> >> >> So you don't think it's important to have an easy >> way to take user input that's supposed to be a >> Color name and either return a Color or raise >> a ValueError? > > > I don't believe that's what he said: > >> The name lookup is only relevant if you already know that you have a >> valid name of an enum in the class [...] > > > User input should qualify, and using getattr(EnumClass, user_input) will get > you an AttributeError instead of a ValueError if user_input is not valid, > but surely you don't mind that small difference. ;) >>> int(getattr(C(), "__str__")) Traceback (most recent call last): File "", line 1, in TypeError: int() argument must be a string or a number, not 'method-wrapper' That's the problem Greg is complaining about: when you use getattr to do the name->enum member conversion, you have to do your own checking to exclude method names. This is part of why I think enums should offer an "as_dict()" method that returns an ordered dictionary. Cheers, Nick. 
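The name-lookup problem described above can be sketched as follows. This is only an illustration, assuming the `enum` module as it later shipped in the standard library (Python 3.4+), which may differ from the reference implementation under discussion in this thread; `Color` and `parse_color` are hypothetical names.

```python
from enum import Enum


class Color(Enum):  # hypothetical example enum
    red = 1
    green = 2

    def describe(self):
        return "a colour"


def parse_color(name):
    """Map user input to a Color member, or raise ValueError."""
    # __members__ maps member names to members and holds no methods
    # or dunder attributes, unlike getattr() on the class itself.
    try:
        return Color.__members__[name]
    except KeyError:
        raise ValueError("not a Color name: %r" % (name,)) from None
```

With plain `getattr(Color, user_input)`, input such as `"describe"` or `"__str__"` returns a method rather than failing, which is the extra checking the message above refers to; a name-to-member mapping avoids that.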
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri May 3 03:10:04 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 3 May 2013 11:10:04 +1000 Subject: [Python-Dev] PEP-435 reference implementation In-Reply-To: <5181C36C.8090007@pearwood.info> References: <51802595.2040305@stoneleaf.us> <5180448E.2030301@g.nevcal.com> <518046FC.1000100@stoneleaf.us> <518058A0.9040501@stoneleaf.us> <5180807D.9090707@g.nevcal.com> <20130430214751.023ef767@anarchist> <5180AD27.3090306@stoneleaf.us> <20130501084432.246b4dbd@anarchist> <5181C36C.8090007@pearwood.info> Message-ID: On Thu, May 2, 2013 at 11:37 AM, Steven D'Aprano wrote: > On 02/05/13 08:54, Nick Coghlan wrote: > >> If enums had an "as_dict" method that returned an ordered dictionary, you >> could do: >> >> class MoreColors(Enum): >> locals().update(Colors.as_dict()) > > > > Surely that is an implementation-specific piece of code? Writing to locals() > is not guaranteed to work, and the documentation warns against it. > > http://docs.python.org/3/library/functions.html#locals I've long thought we should stop being wishy-washy about modification of locals(), and make the current CPython behaviour part of the language spec: - at module scope, locals() must return the same thing as globals(), which must be the actual module namespace - at class scope, it must return the namespace returned by __prepare__() - at function scope, it returns a snapshot of the current locals and free variables, and thus does not support modifications (and may not see subsequent changes) I'll start a separate thread about that. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri May 3 03:29:56 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 3 May 2013 11:29:56 +1000 Subject: [Python-Dev] Tightening up the specification for locals() Message-ID: An exchange in one of the enum threads prompted me to write down something I've occasionally thought about regarding locals(): it is currently severely underspecified, and I'd like to make the current CPython behaviour part of the language/library specification. (We recently found a bug in the interaction between the __prepare__ method and lexical closures that was indirectly related to this underspecification) Specifically, rather than the current vague "post-modification of locals may not work", I would like to explicitly document the expected behaviour at module, class and function scope (as well as clearly documenting the connection between modules, classes and the single- and dual-namespace variants of exec() and eval()): * at module scope, as well as when using exec() or eval() with a single namespace, locals() must return the same thing as globals(), which must be the actual execution namespace. Subsequent execution may change the contents of the returned mapping, and changes to the returned mapping must change the execution environment. * at class scope, as well as when using exec() or eval() with separate global and local namespaces, locals() must return the specified local namespace (which may be supplied by the metaclass __prepare__ method in the case of classes). Subsequent execution may change the contents of the returned mapping, and changes to the returned mapping must change the execution environment. For classes, this mapping will not be used as the actual class namespace underlying the defined class (the class creation process will copy the contents to a fresh dictionary that is only accessible by going through the class machinery). 
* at function scope, locals() must return a *snapshot* of the current locals and free variables. Subsequent execution must not change the contents of the returned mapping and changes to the returned mapping must not change the execution environment. Rather than adding this low level detail to the library reference docs, I would suggest adding it to the data model section of the language reference, with a link to the appropriate section from the docs for the locals() builtin. The warning in the locals() docs would be softened to indicate that modifications won't work at function scope, but are supported at module and class scope. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From barry at python.org Fri May 3 03:52:19 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 2 May 2013 18:52:19 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> Message-ID: <20130502185219.0d5d0b92@anarchist> On May 03, 2013, at 11:06 AM, Nick Coghlan wrote: >> User input should qualify, and using getattr(EnumClass, user_input) will get >> you an AttributeError instead of a ValueError if user_input is not valid, >> but surely you don't mind that small difference. 
;) > >>>> int(getattr(C(), "__str__")) >Traceback (most recent call last): > File "", line 1, in >TypeError: int() argument must be a string or a number, not 'method-wrapper' > >That's the problem Greg is complaining about: when you use getattr to >do the name->enum member conversion, you have to do your own checking >to exclude method names. > >This is part of why I think enums should offer an "as_dict()" method >that returns an ordered dictionary. Should this be allowed then? class Transformations(Enum): as_int = 1 as_dict = 2 as_tuple = 3 ? I still don't get it why this is an issue though, or at least why this is different than any other getattr on any other class, or even Enums. I mean, you could do a getattr on any other class or instance with any random user input and there's no guarantee you could pass it straight to int() or any other conversion type. So you pretty much have to be prepared to capture exceptions anyway. -Barry From benjamin at python.org Fri May 3 04:43:09 2013 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 2 May 2013 22:43:09 -0400 Subject: [Python-Dev] Tightening up the specification for locals() In-Reply-To: References: Message-ID: 2013/5/2 Nick Coghlan : > An exchange in one of the enum threads prompted me to write down > something I've occasionally thought about regarding locals(): it is > currently severely underspecified, and I'd like to make the current > CPython behaviour part of the language/library specification. 
(We > recently found a bug in the interaction between the __prepare__ method > and lexical closures that was indirectly related to this > underspecification) > > Specifically, rather than the current vague "post-modification of > locals may not work", I would like to explicitly document the expected > behaviour at module, class and function scope (as well as clearly > documenting the connection between modules, classes and the single- > and dual-namespace variants of exec() and eval()): > > * at module scope, as well as when using exec() or eval() with a > single namespace, locals() must return the same thing as globals(), > which must be the actual execution namespace. Subsequent execution may > change the contents of the returned mapping, and changes to the > returned mapping must change the execution environment. > * at class scope, as well as when using exec() or eval() with separate > global and local namespaces, locals() must return the specified local > namespace (which may be supplied by the metaclass __prepare__ method > in the case of classes). Subsequent execution may change the contents > of the returned mapping, and changes to the returned mapping must > change the execution environment. For classes, this mapping will not > be used as the actual class namespace underlying the defined class > (the class creation process will copy the contents to a fresh > dictionary that is only accessible by going through the class > machinery). > * at function scope, locals() must return a *snapshot* of the current > locals and free variables. Subsequent execution must not change the > contents of the returned mapping and changes to the returned mapping > must not change the execution environment. > > Rather than adding this low level detail to the library reference > docs, I would suggest adding it to the data model section of the > language reference, with a link to the appropriate section from the > docs for the locals() builtin. 
The warning in the locals() docs would > be softened to indicate that modifications won't work at function > scope, but are supported at module and class scope. This sounds good to me. -- Regards, Benjamin From steve at pearwood.info Fri May 3 04:43:41 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 03 May 2013 12:43:41 +1000 Subject: [Python-Dev] Tightening up the specification for locals() In-Reply-To: References: Message-ID: <5183245D.2000009@pearwood.info> On 03/05/13 11:29, Nick Coghlan wrote: > An exchange in one of the enum threads prompted me to write down > something I've occasionally thought about regarding locals(): it is > currently severely underspecified, and I'd like to make the current > CPython behaviour part of the language/library specification. (We > recently found a bug in the interaction between the __prepare__ method > and lexical closures that was indirectly related to this > underspecification) Fixing the underspecification is good. Enshrining a limitation as the one correct way, not so good. > * at function scope, locals() must return a *snapshot* of the current > locals and free variables. Subsequent execution must not change the > contents of the returned mapping and changes to the returned mapping > must not change the execution environment. If we were designing the language from scratch, with no concern for optimizing function execution, would we want this as a language feature? I don't believe that there is anyone who would say: "I really want locals() to behave differently inside functions from how it behaves inside classes and the global scope, as a feature in and of itself." Obviously CPython introduces that limitation for good reason, and I don't wish to suggest that this is the wrong thing to do, but it is a trade-off, and some implementations may wish to make other trade-offs, or even find a way to avoid it altogether. E.g. IronPython and Jython both allow this: >>> def func(): ... x = 1; del x ... 
locals()['x'] = 2 ... print x ... >>> func() 2 And why not? In and of itself, writing to locals() inside a function is no worse a thing to do than writing to locals() inside a class or global scope. It's not something actively harmful that must be prohibited, so why prohibit it? I think that conforming Python implementations should be allowed a choice between two fully-specified behaviours, the choice between them being a "quality of implementation" issue: - locals() may return a read-only or frozen mapping containing a snapshot of the current locals and free variable, in which case subsequent execution must not change the contents of the returned mapping, and changing the returned mapping is not possible; - locals() may return an ordinary dict, in which case it must be the actual execution namespace, or a proxy to it. Subsequent execution will change the contents of the returned mapping, and changes to the mapping must change the execution environment. Code can determine at runtime which capability is provided by inspecting the type of the returned mapping: if isinstance(locals(), dict) then you have support for modifying the executable environment, if not, you don't. Obviously if you wish to write platform-agnostic code, you have to target the least behaviour, which would be read-only locals. But there's lots of code that runs only under Jython or IronPython, and if somebody really needs to write to locals(), they can target an implementation that provides that feature. 
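A minimal sketch of the three scopes under discussion, written against current CPython behaviour only (an assumption; as noted above, other implementations may behave differently at function scope). The names `function_scope`, `ClassScope` and `single_namespace_exec` are hypothetical.

```python
def function_scope():
    x = 1
    # In CPython this writes to a mapping that is not the live
    # namespace, so the local variable x is left untouched.
    locals()['x'] = 2
    return x


class ClassScope:
    # At class scope the mapping is the namespace being built,
    # so in CPython the write shows up as a class attribute.
    locals()['y'] = 42


def single_namespace_exec():
    # exec() with a single namespace: locals() and globals()
    # are the same mapping inside the executed code.
    ns = {}
    exec("same = locals() is globals()", ns)
    return ns['same']
```

On CPython, `function_scope()` returns 1, `ClassScope.y` is 42 and `single_namespace_exec()` returns `True`; only the first of these is the behaviour being debated as a language requirement versus an implementation choice.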
-- Steven From tjreedy at udel.edu Fri May 3 06:21:44 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Fri, 03 May 2013 00:21:44 -0400 Subject: [Python-Dev] Tightening up the specification for locals() In-Reply-To: References: Message-ID: On 5/2/2013 9:29 PM, Nick Coghlan wrote: > An exchange in one of the enum threads prompted me to write down > something I've occasionally thought about regarding locals(): it is > currently severely underspecified, and I'd like to make the current > CPython behaviour part of the language/library specification. (We > recently found a bug in the interaction between the __prepare__ method > and lexical closures that was indirectly related to this > underspecification) > > Specifically, rather than the current vague "post-modification of > locals may not work", I would like to explicitly document the expected > behaviour at module, class and function scope (as well as clearly > documenting the connection between modules, classes and the single- > and dual-namespace variants of exec() and eval()): > > * at module scope, as well as when using exec() or eval() with a > single namespace, locals() must return the same thing as globals(), > which must be the actual execution namespace. Subsequent execution may > change the contents of the returned mapping, and changes to the > returned mapping must change the execution environment. > * at class scope, as well as when using exec() or eval() with separate > global and local namespaces, locals() must return the specified local > namespace (which may be supplied by the metaclass __prepare__ method > in the case of classes). Subsequent execution may change the contents > of the returned mapping, and changes to the returned mapping must > change the execution environment. 
> For classes, this mapping will not
> be used as the actual class namespace underlying the defined class
> (the class creation process will copy the contents to a fresh
> dictionary that is only accessible by going through the class
> machinery).
> * at function scope, locals() must return a *snapshot* of the current
> locals and free variables. Subsequent execution must not change the
> contents of the returned mapping and changes to the returned mapping
> must not change the execution environment.

Except that, apparently, subsequent execution *does* change the returned mapping when tracing is on. Some of the loose specification is intentional.

http://bugs.python.org/issue7083
locals() behaviour differs when tracing is in effect

--
Terry Jan Reedy

From g.brandl at gmx.net Fri May 3 07:20:04 2013
From: g.brandl at gmx.net (Georg Brandl)
Date: Fri, 03 May 2013 07:20:04 +0200
Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity
In-Reply-To:
References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol>
Message-ID:

On 02.05.2013 23:57, Eli Bendersky wrote:
>> Eli, it would be nice if you stopped with this claim.
>
>
> I'm not advocating "not having a convenience syntax", I'm advocating
> having a convenience syntax which is *class-based* rather than
> function-based.
>
> Debuggers are beside the point: there are two kinds of "convenience
> syntax" on the table; one allows pickling by construction, one
> requires an ugly hack which may not solve all cases (and which may
> apparently make Jython / IronPython mildly unhappy). Why you insist
> on ignoring the former and imposing the latter is beyond me.
>
>
> I'm not trying to belittle your class-based suggestion. I just think there are
> two separate issues here, and I was focusing on just one of them for now.
> The one I've been focusing on is how to make the function-based convenience syntax
> work with pickling in the vast majority of interesting cases. This appears to be
> possible by using the same pattern used by namedtuple, and even better by
> encapsulating this pattern formally in stdlib so it stops being a hack (and may
> actually be useful for other code too).
>
> The other issue is your proposal to have a class-based convenience syntax akin
> to (correct me if I got this wrong):
>
>     class Animal(Enum):
>         __values__ = 'cat dog'
>
> This is obviously a matter of preference (and hence bikeshedding), but this
> still looks better to me:
>
>     Animal = Enum('Animal', 'cat dog')
>
> It has two advantages:
>
> 1. Shorter
> 2. Parallels namedtuple, which is by now a well known and widely used construct

Not to forget

3. Has to specify the class name twice for good measure ;)

Georg

From solipsis at pitrou.net Fri May 3 10:42:59 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 3 May 2013 10:42:59 +0200
Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity
References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol>
Message-ID: <20130503104259.49ca54f0@pitrou.net>

On Fri, 3 May 2013 09:14:22 +1000, Nick Coghlan wrote:
> >
> > The other issue is your proposal to have a class-based convenience
> > syntax akin to (correct me if I got this wrong):
> >
> >     class Animal(Enum):
> >         __values__ = 'cat dog'
>
> I would suggest moving the field names into the class header for a
> class based convenience API:
>
>     class Animal(Enum, members='cat dog'): pass

This looks good to me (assuming some people don't like the special attribute scheme).

Regards

Antoine.
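
A class header such as class Animal(Enum, members='cat dog') relies on Python 3 passing extra keyword arguments from the class statement to the metaclass. The following is only a minimal sketch of that mechanism, not the PEP 435 reference implementation: MembersMeta, SketchEnum and the _member_names_ attribute are hypothetical names, and the members here are plain integers rather than real enum instances.

```python
class MembersMeta(type):
    """Hypothetical metaclass that reads member names from the class header."""

    def __new__(mcls, name, bases, namespace, members="", **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace), **kwds)
        # Number the whitespace-separated names starting from 1.
        cls._member_names_ = tuple(members.split())
        for value, member in enumerate(cls._member_names_, start=1):
            setattr(cls, member, value)
        return cls

    def __init__(cls, name, bases, namespace, members="", **kwds):
        # Swallow the extra keyword so type.__init__ does not see it.
        super().__init__(name, bases, namespace)


class SketchEnum(metaclass=MembersMeta):
    """Stand-in base class for this sketch; it is not enum.Enum."""


class Animal(SketchEnum, members="cat dog"):
    pass
```

Because Animal is created by a class statement, its __module__ is filled in by the interpreter, so no frame inspection is needed for that part.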
From solipsis at pitrou.net Fri May 3 10:51:03 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 3 May 2013 10:51:03 +0200
Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity
References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol>
Message-ID: <20130503105103.7d04398c@pitrou.net>

On Thu, 2 May 2013 14:57:35 -0700, Eli Bendersky wrote:
>
>     class Animal(Enum):
>         __values__ = 'cat dog'
>
> This is obviously a matter of preference (and hence bikeshedding),
> but this still looks better to me:
>
>     Animal = Enum('Animal', 'cat dog')
>
> It has two advantages:
>
> 1. Shorter

You're gaining one line of code. I suppose it's significant if you write ten enums a day, otherwise... ;-)

> 2. Parallels namedtuple, which is by now a well known and widely used
> construct

namedtuple is the exception, not the rule. I don't know of another popular type which follows a similar scheme. On the other hand, well-known ORMs (SQLAlchemy, Django ORM) use a class-based syntax despite their declarative nature and the fact that they allow you to set "meta" options (e.g. the name of the reflected table).

As an egoistical data point, I always subclass namedtuples, because I minimally want to add a docstring, and sometimes I also want to add behaviour (e.g. alternate constructors, serialization). Which means namedtuple's declarative conciseness is generally lost for me :-)

Note that besides ORMs, the proposed __values__ has built-in precedent with __slots__.

Regards

Antoine.
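
The subclassing habit described above can be sketched as follows. Point and from_string are hypothetical names chosen for illustration; the pattern itself (a class statement wrapping the namedtuple() call) is ordinary standard-library usage.

```python
from collections import namedtuple


class Point(namedtuple("Point", "x y")):
    """A point in the plane (hypothetical example type)."""

    __slots__ = ()  # keep instances as lightweight as the plain namedtuple

    @classmethod
    def from_string(cls, text):
        """Alternate constructor: parse 'x,y' into a Point."""
        x, y = (float(part) for part in text.split(","))
        return cls(x, y)
```

The cost is visible here: the one-line namedtuple() call is now wrapped in a class statement anyway, which is the loss of conciseness mentioned above.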
From steve at pearwood.info Fri May 3 11:40:21 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 03 May 2013 19:40:21 +1000
Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity
In-Reply-To: <20130503104259.49ca54f0@pitrou.net>
References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol> <20130503104259.49ca54f0@pitrou.net>
Message-ID: <51838605.3010301@pearwood.info>

On 03/05/13 18:42, Antoine Pitrou wrote:
> On Fri, 3 May 2013 09:14:22 +1000,
> Nick Coghlan wrote:
>> I would suggest moving the field names into the class header for a
>> class based convenience API:
>>
>>     class Animal(Enum, members='cat dog'): pass
>
> This looks good to me (assuming some people don't like the
> special attribute scheme).

The problem is that this is not an expression, it is a statement. The advantage of the convenience function is not just that it is shorter, but that it is an expression.
--
Steven

From solipsis at pitrou.net Fri May 3 11:49:13 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 3 May 2013 11:49:13 +0200
Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity
References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol> <20130503104259.49ca54f0@pitrou.net> <51838605.3010301@pearwood.info>
Message-ID: <20130503114913.2e611e9b@pitrou.net>

On Fri, 03 May 2013 19:40:21 +1000, Steven D'Aprano wrote:
> On 03/05/13 18:42, Antoine Pitrou wrote:
> > On Fri, 3 May 2013 09:14:22 +1000,
> > Nick Coghlan wrote:
>
> >> I would suggest moving the field names into the class header for a
> >> class based convenience API:
> >>
> >>     class Animal(Enum, members='cat dog'): pass
> >
> > This looks good to me (assuming some people don't like the
> > special attribute scheme).
>
> The problem is that this is not an expression, it is a statement. The
> advantage of the convenience function is not just that it is shorter,
> but that it is an expression.

What does that change exactly?

Regards

Antoine.

From stefan_ml at behnel.de Fri May 3 12:11:47 2013
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 03 May 2013 12:11:47 +0200
Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity
In-Reply-To: <5182B963.9030304@stoneleaf.us>
References: <5182B963.9030304@stoneleaf.us>
Message-ID:

Ethan Furman, 02.05.2013 21:07:
> In order for the Enum convenience function to be pickleable, we have this
> line of code in the metaclass:
>
>     enum_class.__module__ = sys._getframe(1).f_globals['__name__']

What a hack. And fragile, too.

> This works fine for Cpython, but what about the others?

This doesn't work when used from Cython compiled code due to the lack of frames.
They are only created for exception tracebacks and not for normal code by default (just for profiling, coverage etc.).

My guess is that no-one noticed the problem for namedtuples so far because using them is still uncommon enough in general, let alone pickling them, and the module name hack only leads to an error when someone tries to pickle such an object. I think that this will be more of a problem for enums than for namedtuples, because enums are more likely to appear in data structures that people want to pickle.

The simplest work-around seems to be this, once you know about it:

"""
ttuple = namedtuple('ttuple', 'a b c')
ttuple.__module__ = __name__ # enable pickle support
"""

Not any worse than the hack above, IMHO, but at least guaranteed to work.

For enums, a regular class based declaration can easily avoid this hack, so my vote is for getting rid of the "convenience" API before it starts doing any harm. Or document it explicitly as generating unpicklable objects, as Antoine suggests.

Stefan

From p.f.moore at gmail.com Fri May 3 12:37:23 2013
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 3 May 2013 11:37:23 +0100
Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support"
In-Reply-To:
References:
Message-ID:

On 2 April 2013 01:47, Daniel Holth wrote:
> This PEP proposes to fix these problems by re-publicising the feature,
> defining the .pyz and .pyzw extensions as "Python ZIP Applications"
> and "Windowed Python ZIP Applications", and providing some simple
> tooling to manage the format.
>

There is a bug in Windows Powershell, which is apparently due to a bug in the underlying FindExecutable API, that can fail to recognise extensions which are longer than 3 characters properly.

Rather than risk obscure bugs, I would suggest restricting the extensions to 3 characters. For the "Windowed Python ZIP Applications" case, could we use .pzw as the extension instead of .pyzw?
Please don't shoot the messenger here - I'm not going to try to defend such a stupid Windows bug, but better to be safe in my view. Flames about Windows to /dev/null...

Paul.

From ncoghlan at gmail.com Fri May 3 12:45:03 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 3 May 2013 20:45:03 +1000
Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support"
In-Reply-To:
References:
Message-ID:

On 3 May 2013 20:40, "Paul Moore" wrote:
>
> On 2 April 2013 01:47, Daniel Holth wrote:
>>
>> This PEP proposes to fix these problems by re-publicising the feature,
>> defining the .pyz and .pyzw extensions as "Python ZIP Applications"
>> and "Windowed Python ZIP Applications", and providing some simple
>> tooling to manage the format.
>
>
> There is a bug in Windows Powershell, which is apparently due to a bug in the underlying FindExecutable API, that can fail to recognise extensions which are longer than 3 characters properly.
>
> Rather than risk obscure bugs, I would suggest restricting the extensions to 3 characters. For the "Windowed Python ZIP Applications" case, could we use .pzw as the extension instead of .pyzw?
>
> Please don't shoot the messenger here - I'm not going to try to defend such a stupid Windows bug, but better to be safe in my view. Flames about Windows to /dev/null...

I'm OK with the shortened extension.

Cheers,
Nick.

>
> Paul.
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From greg.ewing at canterbury.ac.nz Fri May 3 15:34:59 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 04 May 2013 01:34:59 +1200 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: <20130502185219.0d5d0b92@anarchist> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> Message-ID: <5183BD03.9040108@canterbury.ac.nz> Barry Warsaw wrote: > I still don't get it why this is an issue though, or at least why this is > different than any other getattr on any other class, It's not a problem that getattr() has this behaviour. What I'm questioning is the idea that getattr() should be the only provided way of doing a name->enum lookup, because that will require everyone to do extra checks to ensure safety. -- Greg From eliben at gmail.com Fri May 3 16:14:46 2013 From: eliben at gmail.com (Eli Bendersky) Date: Fri, 3 May 2013 07:14:46 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? 
In-Reply-To: <5183BD03.9040108@canterbury.ac.nz> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> Message-ID: On Fri, May 3, 2013 at 6:34 AM, Greg Ewing wrote: > Barry Warsaw wrote: > >> I still don't get it why this is an issue though, or at least why this is >> different than any other getattr on any other class, >> > > It's not a problem that getattr() has this behaviour. > What I'm questioning is the idea that getattr() should > be the only provided way of doing a name->enum lookup, > because that will require everyone to do extra checks > to ensure safety. > I'm just curious what it is about enums that sets everyone on a "let's make things safer" path. Python is about duck typing, it's absolutely "unsafe" in the static typing sense, in the most fundamental ways imaginable. When programmatically invoking a method on a class (say some sort of RPC), we don't check that the class is of the correct type. We invoke a method, and if it quacks, that's a good enough duck. If it was actually the wrong class, something will break later. EAFP Is a central Python tenet, whether we like it or not. If one looks for static guarantees, Python surely shouldn't be the preferred language, no? And concretely, how is this case different from any programmatic attribute access in Python objects? You can pass dunders to getattr() and it probably wasn't what you meant, but Python does not do this type checking for you. Why is an Enum different than any other class? Eli -------------- next part -------------- An HTML attachment was scrubbed... 
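
The name-to-member lookup under discussion can be sketched as below. This assumes the enum module roughly as it later shipped in the standard library (Python 3.4 and newer); Color and member_by_name are hypothetical names for this note. The isinstance() check is what filters out dunders and other non-member attributes that a bare getattr() would happily return.

```python
from enum import Enum


class Color(Enum):
    red = 1
    green = 2


def member_by_name(enum_cls, name):
    """Return the member called `name`, rejecting non-member attributes."""
    candidate = getattr(enum_cls, name, None)
    if not isinstance(candidate, enum_cls):
        raise KeyError(name)
    return candidate
```

For example, member_by_name(Color, "red") returns Color.red, while member_by_name(Color, "__doc__") raises KeyError instead of handing back the docstring.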
URL: From guido at python.org Fri May 3 16:46:04 2013 From: guido at python.org (Guido van Rossum) Date: Fri, 3 May 2013 07:46:04 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> Message-ID: On Fri, May 3, 2013 at 7:14 AM, Eli Bendersky wrote: > I'm just curious what it is about enums that sets everyone on a "let's make > things safer" path. Python is about duck typing, it's absolutely "unsafe" in > the static typing sense, in the most fundamental ways imaginable. When > programmatically invoking a method on a class (say some sort of RPC), we > don't check that the class is of the correct type. We invoke a method, and > if it quacks, that's a good enough duck. If it was actually the wrong class, > something will break later. EAFP Is a central Python tenet, whether we like > it or not. If one looks for static guarantees, Python surely shouldn't be > the preferred language, no? > > And concretely, how is this case different from any programmatic attribute > access in Python objects? You can pass dunders to getattr() and it probably > wasn't what you meant, but Python does not do this type checking for you. > Why is an Enum different than any other class? Let's make that a topic for a separate, more philosophical thread, python-ideas. 
Back to this particular issue, I haven't seen code in the style that Greg proposes in decades, and I don't think it is an important enough use case to support more directly than through getattr() + isinstance(). -- --Guido van Rossum (python.org/~guido) From status at bugs.python.org Fri May 3 18:07:23 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 3 May 2013 18:07:23 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20130503160723.E21D3560D1@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-04-26 - 2013-05-03) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 3953 ( +4) closed 25714 (+40) total 29667 (+44) Open issues with patches: 1773 Issues opened (27) ================== #17585: IDLE - regression with exit() and quit() http://bugs.python.org/issue17585 reopened by serhiy.storchaka #17825: Indentation.offset and SyntaxError.offset mismatch http://bugs.python.org/issue17825 reopened by flox #17852: Built-in module _io can loose data from buffered files at exit http://bugs.python.org/issue17852 opened by arigo #17854: symmetric difference operation applicable to more than two set http://bugs.python.org/issue17854 opened by Amit.Saha #17855: Implement introspection of logger hierarchy http://bugs.python.org/issue17855 opened by vinay.sajip #17857: sqlite modules doesn't build with 2.7.4 on Mac OS X 10.4 http://bugs.python.org/issue17857 opened by lemburg #17858: Different documentation for identical methods http://bugs.python.org/issue17858 opened by amysyk #17859: improve error message for saving ints to file http://bugs.python.org/issue17859 opened by techtonik #17860: subprocess docs lack info how to use output result http://bugs.python.org/issue17860 opened by techtonik #17861: put opcode information in one place http://bugs.python.org/issue17861 opened by benjamin.peterson #17862: 
itertools.chunks(iterable, size, fill=None) http://bugs.python.org/issue17862 opened by techtonik #17868: pprint long non-printable bytes as hexdump http://bugs.python.org/issue17868 opened by serhiy.storchaka #17870: Python does not provide PyLong_FromIntMax_t() or PyLong_FromUi http://bugs.python.org/issue17870 opened by Devin Jeanpierre #17871: Wrong signature of TextTestRunner's init function http://bugs.python.org/issue17871 opened by piotr.dobrogost #17872: Crash in marshal.load() with bad reader http://bugs.python.org/issue17872 opened by serhiy.storchaka #17873: _ctypes/libffi missing bits for aarch64 support http://bugs.python.org/issue17873 opened by schwab #17874: ProcessPoolExecutor in interactive shell doesn't work in Windo http://bugs.python.org/issue17874 opened by Decade #17877: Skip test_variable_tzname when the zoneinfo database is missin http://bugs.python.org/issue17877 opened by ezio.melotti #17878: There is no way to get a list of available codecs http://bugs.python.org/issue17878 opened by pmoore #17882: test_objecttypes fails for 3.2.4 on CentOS 6 http://bugs.python.org/issue17882 opened by bharper #17883: Fix buildbot testing of Tkinter http://bugs.python.org/issue17883 opened by zach.ware #17884: Try to reuse stdint.h types like int32_t http://bugs.python.org/issue17884 opened by haypo #17887: docs: summary page - generator vs iterator vs iterable http://bugs.python.org/issue17887 opened by techtonik #17888: docs: more information on documentation team http://bugs.python.org/issue17888 opened by techtonik #17890: argparse: mutually exclusive groups full of suppressed args ca http://bugs.python.org/issue17890 opened by gholms #17893: Refactor reduce protocol implementation http://bugs.python.org/issue17893 opened by alexandre.vassalotti #17894: Edits to descriptor howto http://bugs.python.org/issue17894 opened by nedbat Most recent 15 issues with no replies (15) ========================================== #17894: Edits to descriptor howto 
http://bugs.python.org/issue17894 #17893: Refactor reduce protocol implementation http://bugs.python.org/issue17893 #17887: docs: summary page - generator vs iterator vs iterable http://bugs.python.org/issue17887 #17883: Fix buildbot testing of Tkinter http://bugs.python.org/issue17883 #17882: test_objecttypes fails for 3.2.4 on CentOS 6 http://bugs.python.org/issue17882 #17877: Skip test_variable_tzname when the zoneinfo database is missin http://bugs.python.org/issue17877 #17873: _ctypes/libffi missing bits for aarch64 support http://bugs.python.org/issue17873 #17872: Crash in marshal.load() with bad reader http://bugs.python.org/issue17872 #17862: itertools.chunks(iterable, size, fill=None) http://bugs.python.org/issue17862 #17848: issue about compile with clang and build a shared lib http://bugs.python.org/issue17848 #17844: Add link to alternatives for bytes-to-bytes codecs http://bugs.python.org/issue17844 #17840: base64_codec uses assert for runtime validity checks http://bugs.python.org/issue17840 #17829: csv.Sniffer.snif doesn't set up the dialect properly for a csv http://bugs.python.org/issue17829 #17824: pty.spawn handles errors improperly http://bugs.python.org/issue17824 #17799: settrace docs are wrong about "c_call" events http://bugs.python.org/issue17799 Most recent 15 issues waiting for review (15) ============================================= #17894: Edits to descriptor howto http://bugs.python.org/issue17894 #17893: Refactor reduce protocol implementation http://bugs.python.org/issue17893 #17890: argparse: mutually exclusive groups full of suppressed args ca http://bugs.python.org/issue17890 #17884: Try to reuse stdint.h types like int32_t http://bugs.python.org/issue17884 #17883: Fix buildbot testing of Tkinter http://bugs.python.org/issue17883 #17877: Skip test_variable_tzname when the zoneinfo database is missin http://bugs.python.org/issue17877 #17873: _ctypes/libffi missing bits for aarch64 support http://bugs.python.org/issue17873 #17871: 
Wrong signature of TextTestRunner's init function http://bugs.python.org/issue17871 #17870: Python does not provide PyLong_FromIntMax_t() or PyLong_FromUi http://bugs.python.org/issue17870 #17868: pprint long non-printable bytes as hexdump http://bugs.python.org/issue17868 #17861: put opcode information in one place http://bugs.python.org/issue17861 #17858: Different documentation for identical methods http://bugs.python.org/issue17858 #17857: sqlite modules doesn't build with 2.7.4 on Mac OS X 10.4 http://bugs.python.org/issue17857 #17855: Implement introspection of logger hierarchy http://bugs.python.org/issue17855 #17844: Add link to alternatives for bytes-to-bytes codecs http://bugs.python.org/issue17844 Top 10 most discussed issues (10) ================================= #17810: Implement PEP 3154 (pickle protocol 4) http://bugs.python.org/issue17810 18 msgs #17857: sqlite modules doesn't build with 2.7.4 on Mac OS X 10.4 http://bugs.python.org/issue17857 15 msgs #17870: Python does not provide PyLong_FromIntMax_t() or PyLong_FromUi http://bugs.python.org/issue17870 13 msgs #17878: There is no way to get a list of available codecs http://bugs.python.org/issue17878 12 msgs #17884: Try to reuse stdint.h types like int32_t http://bugs.python.org/issue17884 10 msgs #12458: Tracebacks should contain the first line of continuation lines http://bugs.python.org/issue12458 9 msgs #17825: Indentation.offset and SyntaxError.offset mismatch http://bugs.python.org/issue17825 9 msgs #17852: Built-in module _io can loose data from buffered files at exit http://bugs.python.org/issue17852 9 msgs #17838: Can't assign a different value for sys.stdin in IDLE http://bugs.python.org/issue17838 8 msgs #17843: Lib/test/testbz2_bigmem.bz2 trigger virus warnings http://bugs.python.org/issue17843 8 msgs Issues closed (38) ================== #1722: Undocumented urllib functions http://bugs.python.org/issue1722 closed by orsenthil #7152: urllib2.build_opener() skips ProxyHandler 
http://bugs.python.org/issue7152 closed by r.david.murray #11078: Have test___all__ check for duplicates http://bugs.python.org/issue11078 closed by ezio.melotti #12596: cPickle - stored data differ for same dictionary http://bugs.python.org/issue12596 closed by alexandre.vassalotti #13721: ssl.wrap_socket on a connected but failed connection succeeds http://bugs.python.org/issue13721 closed by pitrou #14290: Importing script as module causes ImportError with pickle.load http://bugs.python.org/issue14290 closed by alexandre.vassalotti #14679: Define an __all__ for html.parser http://bugs.python.org/issue14679 closed by ezio.melotti #15535: Fix pickling efficiency of named tuples in 2.7.3 http://bugs.python.org/issue15535 closed by rhettinger #16141: Possible simplification for old-style exception handling code http://bugs.python.org/issue16141 closed by r.david.murray #17358: imp.load_module() leads to the improper caching of the 'file' http://bugs.python.org/issue17358 closed by brett.cannon #17529: fix os.sendfile() documentation regarding the type of file des http://bugs.python.org/issue17529 closed by neologix #17565: segfaults during serialization http://bugs.python.org/issue17565 closed by alexandre.vassalotti #17646: traceback.py has a lot of code duplication http://bugs.python.org/issue17646 closed by python-dev #17712: test_gdb failures http://bugs.python.org/issue17712 closed by pitrou #17802: html.HTMLParser raises UnboundLocalError: http://bugs.python.org/issue17802 closed by ezio.melotti #17804: streaming struct unpacking http://bugs.python.org/issue17804 closed by pitrou #17834: Add Heap (and DynamicHeap) classes to heapq module http://bugs.python.org/issue17834 closed by rhettinger #17842: Add base64 module tests for a bytearray argument http://bugs.python.org/issue17842 closed by serhiy.storchaka #17851: Grammar errors in threading.Lock documentation http://bugs.python.org/issue17851 closed by georg.brandl #17853: Conflict between lexical scoping 
and name injection in __prepa http://bugs.python.org/issue17853 closed by python-dev #17856: multiprocessing.Process.join does not block if timeout is lowe http://bugs.python.org/issue17856 closed by neologix #17863: Bad sys.stdin assignment hangs interpreter. http://bugs.python.org/issue17863 closed by python-dev #17864: IDLE won't run http://bugs.python.org/issue17864 closed by ned.deily #17865: PowerPC exponentiation and round() interaction http://bugs.python.org/issue17865 closed by mark.dickinson #17866: TestCase.assertItemsEqual exists in 2.7, not in 3.3 http://bugs.python.org/issue17866 closed by ezio.melotti #17867: Deleting __import__ from builtins can crash Python3 http://bugs.python.org/issue17867 closed by python-dev #17869: distutils - TypeError in command/build_ext.py http://bugs.python.org/issue17869 closed by giampaolo.rodola #17875: Set Intersection returns unexpected results http://bugs.python.org/issue17875 closed by mark.dickinson #17876: Doc issue with threading.Event http://bugs.python.org/issue17876 closed by r.david.murray #17879: corrupt download http://bugs.python.org/issue17879 closed by ezio.melotti #17880: `tmpnam_r' is dangerous, better use `mkstemp' http://bugs.python.org/issue17880 closed by christian.heimes #17881: plistlib.writePlist documentation clarification for file objec http://bugs.python.org/issue17881 closed by ezio.melotti #17885: multiprocessing.Process child process imports package instead http://bugs.python.org/issue17885 closed by r.david.murray #17886: spam http://bugs.python.org/issue17886 closed by benjamin.peterson #17889: argparse subparsers break without arguments http://bugs.python.org/issue17889 closed by r.david.murray #17891: Wrong MD5 calculation on really long strings and the Hashlib http://bugs.python.org/issue17891 closed by neologix #17892: Fix the name of _PyObject_CallMethodObjIdArgs http://bugs.python.org/issue17892 closed by python-dev #1727418: xmlrpclib waits indefinately 
http://bugs.python.org/issue1727418 closed by r.david.murray From barry at python.org Fri May 3 18:08:16 2013 From: barry at python.org (Barry Warsaw) Date: Fri, 3 May 2013 09:08:16 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <51838605.3010301@pearwood.info> References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol> <20130503104259.49ca54f0@pitrou.net> <51838605.3010301@pearwood.info> Message-ID: <20130503090816.5c8ddd66@anarchist> On May 03, 2013, at 07:40 PM, Steven D'Aprano wrote: >The problem is that this is not an expression, it is a statement. The >advantage of the convenience function is not just that it is shorter, but >that it is an expression. Exactly right, but let's stop calling it the "convenience API" and instead call it the "functional API". I probably started the perpetuation of this problem; let's update the PEP. BTW, I made a suggestion elsewhere that the first argument could accept, but not require dotted names in the first argument. If provided, rsplit the string and use the prefix as __module__. If not given, fallback to the _getframe() hack for those implementations where it's available. The same could probably be done to namedtuples. 
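
The dotted-name suggestion above could look roughly like this. It is a sketch under stated assumptions, not the reference implementation: split_qualified_name and make_named_class are hypothetical helpers, a plain type() call stands in for the real enum machinery, and str.rpartition is used in place of the rsplit mentioned above (the two give the same split here).

```python
def split_qualified_name(qualified_name, fallback_module=None):
    """Split 'pkg.mod.Name' into ('pkg.mod', 'Name')."""
    module, _, short_name = qualified_name.rpartition(".")
    return (module or fallback_module), short_name


def make_named_class(qualified_name, attributes, fallback_module=None):
    """Build a class functionally and record its defining module explicitly."""
    module, short_name = split_qualified_name(qualified_name, fallback_module)
    cls = type(short_name, (), dict(attributes))
    if module is not None:
        cls.__module__ = module  # lets pickle locate the class by module and name
    return cls
```

With an explicit prefix no frame inspection is needed; a caller that omits the prefix would pass whatever module name the _getframe() fallback produced as fallback_module.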
-Barry From guido at python.org Fri May 3 18:23:43 2013 From: guido at python.org (Guido van Rossum) Date: Fri, 3 May 2013 09:23:43 -0700 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <20130503090816.5c8ddd66@anarchist> References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol> <20130503104259.49ca54f0@pitrou.net> <51838605.3010301@pearwood.info> <20130503090816.5c8ddd66@anarchist> Message-ID: On Fri, May 3, 2013 at 9:08 AM, Barry Warsaw wrote: > On May 03, 2013, at 07:40 PM, Steven D'Aprano wrote: > >>The problem is that this is not an expression, it is a statement. The >>advantage of the convenience function is not just that it is shorter, but >>that it is an expression. > > Exactly right, but let's stop calling it the "convenience API" and instead > call it the "functional API". I probably started the perpetuation of this > problem; let's update the PEP. > > BTW, I made a suggestion elsewhere that the first argument could accept, but > not require dotted names in the first argument. If provided, rsplit the > string and use the prefix as __module__. If not given, fallback to the > _getframe() hack for those implementations where it's available. > > The same could probably be done to namedtuples. All sounds good to me. 
-- --Guido van Rossum (python.org/~guido) From duda.piotr at gmail.com Fri May 3 19:20:26 2013 From: duda.piotr at gmail.com (Piotr Duda) Date: Fri, 3 May 2013 19:20:26 +0200 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <20130503090816.5c8ddd66@anarchist> References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol> <20130503104259.49ca54f0@pitrou.net> <51838605.3010301@pearwood.info> <20130503090816.5c8ddd66@anarchist> Message-ID: 2013/5/3 Barry Warsaw : > On May 03, 2013, at 07:40 PM, Steven D'Aprano wrote: > >>The problem is that this is not an expression, it is a statement. The >>advantage of the convenience function is not just that it is shorter, but >>that it is an expression. > > Exactly right, but let's stop calling it the "convenience API" and instead > call it the "functional API". I probably started the perpetuation of this > problem; let's update the PEP. > > BTW, I made a suggestion elsewhere that the first argument could accept, but > not require dotted names in the first argument. If provided, rsplit the > string and use the prefix as __module__. If not given, fallback to the > _getframe() hack for those implementations where it's available. What about adding a simple syntax that gets rid of those ugly hacks, something like: def name = expression which would be roughly equivalent to: name = expression name.__name__ = 'name' name.__module__ = __name__
From g.brandl at gmx.net Fri May 3 21:14:42 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 03 May 2013 21:14:42 +0200 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <51838605.3010301@pearwood.info> References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol> <20130503104259.49ca54f0@pitrou.net> <51838605.3010301@pearwood.info> Message-ID: Am 03.05.2013 11:40, schrieb Steven D'Aprano: > On 03/05/13 18:42, Antoine Pitrou wrote: >> Le Fri, 3 May 2013 09:14:22 +1000, Nick Coghlan a >> écrit : > >>> I would suggest moving the field names into the class header for a class >>> based convenience API: >>> >>> class Animal(Enum, members='cat dog'): pass >> >> This looks good to me (assuming some people don't like the special >> attribute scheme). > > The problem is that this is not an expression, it is a statement. The > advantage of the convenience function is not just that it is shorter, but > that it is an expression. But using that expression in any form other than NAME = Enum('NAME', ...) will again result in an unpicklable enum, which was the point of this thread.
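To illustrate the pickling point, here is a sketch that assumes the standard library `enum` and `pickle` modules as eventually shipped, not the reference implementation under discussion. The module name `demo_enums` and the enum names are made up; a throwaway module is registered in `sys.modules` so that the example does not depend on where it is run.

```python
import pickle
import sys
import types
from enum import Enum

# Throwaway module standing in for the module that defines the enums.
demo = types.ModuleType('demo_enums')
sys.modules['demo_enums'] = demo

# Bound name matches the name given to Enum(): pickle can find the class again.
demo.Animal = Enum('Animal', 'cat dog', module='demo_enums')
# Bound name differs from the name given to Enum(): lookup by name fails.
demo.Pet = Enum('Creature', 'cat dog', module='demo_enums')


def can_pickle(obj):
    """Return True if obj comes back as the same object after a pickle round trip."""
    try:
        return pickle.loads(pickle.dumps(obj)) is obj
    except (pickle.PicklingError, AttributeError, ImportError):
        return False


print(can_pickle(demo.Animal.cat))
print(can_pickle(demo.Pet.cat))
```

Pickle stores the class by reference (module plus name), so the member only survives when the class can be found under the name it was created with.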
Georg From tjreedy at udel.edu Fri May 3 21:51:02 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Fri, 03 May 2013 15:51:02 -0400 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: <20130503090816.5c8ddd66@anarchist> References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol> <20130503104259.49ca54f0@pitrou.net> <51838605.3010301@pearwood.info> <20130503090816.5c8ddd66@anarchist> Message-ID: On 5/3/2013 12:08 PM, Barry Warsaw wrote: > Exactly right, but let's stop calling it the "convenience API" and instead > call it the "functional API". I probably started the perpetuation of this > problem; let's update the PEP. Please do. To me, a 'convenience function' is something like the timeit functions or subprocess.call that create a class instance, call a method (or two) on the instance, and then discard the instance while returning the result of calling methods. For the common case handled by the function, the implementation via a class with methods is a detail that the user hardly need know about. Using a function interface to create and return a class is something else. -- Terry Jan Reedy From p.f.moore at gmail.com Fri May 3 22:23:07 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Fri, 3 May 2013 21:23:07 +0100 Subject: [Python-Dev] PEP 379 Python launcher for Windows - behaviour for #!/usr/bin/env python line is wrong Message-ID: While reviewing the behaviour of Vinay's "distil" installer tool (see distutils-sig for details, but it's not relevant here) I have found what I think is a flaw in the behaviour of the py.exe launcher for Windows. To recap for people unfamiliar with the launcher, it emulates #! 
line interpretation on Windows, interpreting commonly used forms from Unix and launching the appropriate installed Python interpreter (finding its location from the registry, as python.exe may not be on PATH). The problem is with the interpretation of #!/usr/bin/env python. The launcher treats this the same as #!/usr/bin/python, launching the "default" Python. But that is *not* what the equivalent line does on Unix, where it launches the *currently active* Python (a crucial difference when there is an active virtualenv). The result is that a script written to run with the active Python works on Unix as expected, but can use an unexpected version of Python on Windows. This is particularly unpleasant when the program in question is an (un)installer like distil! I would propose that the behaviour of the launcher on Windows should be changed when it encounters specifically the hashbang line #!/usr/bin/env python. In that case, it should search PATH for a copy of python.exe, and if it finds one, use that. If there is no python.exe on PATH, it should fall back to the same version of Python as would have been used if the line were #!/usr/bin/python. This will mean that scripts written with #!/usr/bin/env python will behave the same on Unix and Windows in the presence of activated virtualenvs. Would people be happy with this change? If so I will open an issue on bugs.python.org. I can look at producing a patch, as well. Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Sat May 4 01:08:42 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 04 May 2013 11:08:42 +1200 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? 
In-Reply-To: References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> Message-ID: <5184437A.7090400@canterbury.ac.nz> Guido van Rossum wrote: > I haven't seen code in the style that > Greg proposes in decades, What style are you talking about here? -- Greg From greg.ewing at canterbury.ac.nz Sat May 4 01:15:17 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 04 May 2013 11:15:17 +1200 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> Message-ID: <51844505.80807@canterbury.ac.nz> Eli Bendersky wrote: > I'm just curious what it is about enums that sets everyone on a "let's > make things safer" path. Python is about duck typing, it's absolutely > "unsafe" in the static typing sense, in the most fundamental ways > imaginable. This isn't about catching bugs in the program, it's about validating user input. That's a common enough task that it deserves to have a convenient way to do it correctly. 
Imagine if int() had the property that, as well as accepting strings of decimal digits, it also accepted the string "guido" and returned his birthday as a DateTime object. When people complain, they're told it's okay, you only need to write if s != "guido": x = int(s) else: raise ValueError What would you think of that situation? > Why is an Enum different than any other class? It's not, that's the whole point. IMO it deserves to have a convenient way of mapping a valid string representation -- and nothing else -- to a valid value, just as much as any other type does. -- Greg From solipsis at pitrou.net Sat May 4 01:22:18 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 4 May 2013 01:22:18 +0200 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <51844505.80807@canterbury.ac.nz> Message-ID: <20130504012218.37f8470e@fsol> On Sat, 04 May 2013 11:15:17 +1200 Greg Ewing wrote: > Eli Bendersky wrote: > > I'm just curious what it is about enums that sets everyone on a "let's > > make things safer" path. Python is about duck typing, it's absolutely > > "unsafe" in the static typing sense, in the most fundamental ways > > imaginable. > > This isn't about catching bugs in the program, it's > about validating user input. That's a common enough > task that it deserves to have a convenient way to > do it correctly. +1. 
An enum is basically a bidirectional mapping between some raw values and some "nice" instances, so it deserves a well-defined lookup operation in each direction. Regards Antoine. From guido at python.org Sat May 4 01:31:59 2013 From: guido at python.org (Guido van Rossum) Date: Fri, 3 May 2013 16:31:59 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: <5184437A.7090400@canterbury.ac.nz> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <5184437A.7090400@canterbury.ac.nz> Message-ID: On Fri, May 3, 2013 at 4:08 PM, Greg Ewing wrote: > Guido van Rossum wrote: >> >> I haven't seen code in the style that >> Greg proposes in decades, > What style are you talking about here? Code that wants to validate a string the user typed as input. Web forms just don't work that way. (Command-line flags are a special case, and there are a slew of specialized parsers for that case.) -- --Guido van Rossum (python.org/~guido) From steve at pearwood.info Sat May 4 03:41:27 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 04 May 2013 11:41:27 +1000 Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support" In-Reply-To: References: Message-ID: <51846747.9060507@pearwood.info> On 03/05/13 20:37, Paul Moore wrote: > On 2 April 2013 01:47, Daniel Holth wrote: > >> This PEP proposes to fix these problems by re-publicising the feature, >> defining the .pyz and .pyzw extensions as "Python ZIP Applications"
>> and "Windowed Python ZIP Applications", and providing some simple >> tooling to manage the format. >> > > There is a bug in Windows Powershell, which is apparently due to a bug in > the underlying FindExecutable API, that can fail to recognise extensions > which are longer than 3 characters properly. Are you referring to this one? https://groups.google.com/group/microsoft.public.vb.general.discussion/browse_thread/thread/109aaa1c7d6a31a7/76f9a67c39002178?hl=en That's pretty old, is it still a problem? Besides, if I'm reading this properly: http://msdn.microsoft.com/en-us/library/bb776419(VS.85).aspx the issue is that they should be using AssocQueryString, not FindExecutable. > Rather than risk obscure bugs, I would suggest restricting the extensions > to 3 characters. For the "Windowed Python ZIP Applications" case, could we > use .pzw as the extension instead of .pyzw? I've had Linux systems which associated OpenOffice docs with Archive Manager rather than OpenOffice. It's likely that at least some Linux systems will likewise decide that .pyz files are archives, not Python files, and open them in Archive Manager. I don't believe that it is Python's responsibility to work around bugs in desktop environments' handling of file associations. Many official Microsoft file extensions are four or more letters, e.g. docx. I don't see any value in making long-lasting decisions on file extensions based on (transient?) bugs that aren't our responsibility.
-- Steven From ncoghlan at gmail.com Sat May 4 03:59:13 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 4 May 2013 11:59:13 +1000 Subject: [Python-Dev] PyPy, Jython, & IronPython: Enum convenience function and pickleablity In-Reply-To: References: <5182B963.9030304@stoneleaf.us> <20130502221003.6180b90c@fsol> <20130502222210.1cc86c16@fsol> <20130502224253.2cd18384@fsol> <20130502231028.7994527a@fsol> <20130502232821.3e31112c@fsol> <20130503104259.49ca54f0@pitrou.net> <51838605.3010301@pearwood.info> Message-ID: On 4 May 2013 05:17, "Georg Brandl" wrote: > > Am 03.05.2013 11:40, schrieb Steven D'Aprano: > > On 03/05/13 18:42, Antoine Pitrou wrote: > >> Le Fri, 3 May 2013 09:14:22 +1000, Nick Coghlan a > >> écrit : > > > >>> I would suggest moving the field names into the class header for a class > >>> based convenience API: > >>> > >>> class Animal(Enum, members='cat dog'): pass > >> > >> This looks good to me (assuming some people don't like the special > >> attribute scheme). > > > > The problem is that this is not an expression, it is a statement. The > > advantage of the convenience function is not just that it is shorter, but > > that it is an expression. > > But using that expression in any form other than > > NAME = Enum('NAME', ...) > > will again result in an unpicklable enum, which was the point of this thread. Right, if all we want is a functional API that doesn't support pickling of the resulting class, that's trivial. What I'm after is a convenience API that supports *autonumbering*, as a trivial replacement for code that currently uses "range(n)". A class statement is perfectly acceptable to me for that purpose. Independently of that, I do like the notion of a "types.set_name(cls, dotted_name)" API that alters __name__ and __module__, while leaving __qualname__ alone. Cheers, Nick.
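For illustration, a sketch of what an autonumbered replacement for "range(n)" might look like. It assumes the functional API of the standard library `enum` as later shipped, including its `start` keyword (Python 3.5 and later); it is not the API being proposed in this message, and the names are made up.

```python
from enum import Enum

# Old style: bare integers, with no names in reprs or error messages.
CAT, DOG, BIRD = range(3)

# Functional API: members are numbered automatically. The default start
# is 1; start=0 is used here only to mirror range(3).
Animal = Enum('Animal', 'CAT DOG BIRD', start=0)

print([(m.name, m.value) for m in Animal])
```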
> > Georg > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat May 4 04:08:13 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 4 May 2013 12:08:13 +1000 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> Message-ID: On 4 May 2013 00:17, "Eli Bendersky" wrote: > > > > > On Fri, May 3, 2013 at 6:34 AM, Greg Ewing wrote: >> >> Barry Warsaw wrote: >>> >>> I still don't get it why this is an issue though, or at least why this is >>> different than any other getattr on any other class, >> >> >> It's not a problem that getattr() has this behaviour. >> What I'm questioning is the idea that getattr() should >> be the only provided way of doing a name->enum lookup, >> because that will require everyone to do extra checks >> to ensure safety. > > > I'm just curious what it is about enums that sets everyone on a "let's make things safer" path. Python is about duck typing, it's absolutely "unsafe" in the static typing sense, in the most fundamental ways imaginable. When programmatically invoking a method on a class (say some sort of RPC), we don't check that the class is of the correct type. 
We invoke a method, and if it quacks, that's a good enough duck. If it was actually the wrong class, something will break later. EAFP Is a central Python tenet, whether we like it or not. If one looks for static guarantees, Python surely shouldn't be the preferred language, no? > > And concretely, how is this case different from any programmatic attribute access in Python objects? You can pass dunders to getattr() and it probably wasn't what you meant, but Python does not do this type checking for you. Why is an Enum different than any other class? The only reason to use enums at all is to improve logging and error messages. Thus, designing the API and behaviour of an enum type is mostly a matter of asking "What mistakes are developers likely to make?" and "How can the enum design help guide them towards a suitable solution?". The answers are a combination of API design and providing appropriate details in error messages. If a developer doesn't care about those two questions then they would just use the raw underlying values. Cheers, Nick. > > Eli > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat May 4 04:11:33 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 4 May 2013 12:11:33 +1000 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? 
In-Reply-To: References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <5184437A.7090400@canterbury.ac.nz> Message-ID: On 4 May 2013 09:34, "Guido van Rossum" wrote: > > On Fri, May 3, 2013 at 4:08 PM, Greg Ewing wrote: > > Guido van Rossum wrote: > >> > >> I haven't seen code in the style that > >> Greg proposes in decades, > > > What style are you talking about here? > > Code that wants to validate a string the user typed as input. Web > forms just don't work that way. (Command-line flags are a special > case, and there are a slew of specialized parsers for that case.) And for code that really needs it, it is straightforward to use dir(MyEnum) and isinstance(obj, MyEnum) to get an exact mapping of names to values that also accounts for aliases. Cheers, Nick. > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brian at python.org Sat May 4 04:18:40 2013 From: brian at python.org (Brian Curtin) Date: Fri, 3 May 2013 21:18:40 -0500 Subject: [Python-Dev] PEP 379 Python launcher for Windows - behaviour for #!/usr/bin/env python line is wrong In-Reply-To: References: Message-ID: On Fri, May 3, 2013 at 3:23 PM, Paul Moore wrote: > I would propose that the behaviour of the launcher on Windows should be > changed when it encounters specifically the hashbang line #!/usr/bin/env > python. In that case, it should search PATH for a copy of python.exe, and if > it finds one, use that. If there is no python.exe on PATH, it should fall > back to the same version of Python as would have been used if the line were > #!/usr/bin/python. > > This will mean that scripts written with #!/usr/bin/env python will behave > the same on Unix and Windows in the presence of activated virtualenvs. > > Would people be happy with this change? If so I will open an issue on > bugs.python.org. I can look at producing a patch, as well. Sounds reasonable to me. From v+python at g.nevcal.com Sat May 4 04:20:37 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Fri, 03 May 2013 19:20:37 -0700 Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support" In-Reply-To: <51846747.9060507@pearwood.info> References: <51846747.9060507@pearwood.info> Message-ID: <51847075.6090806@g.nevcal.com> On 5/3/2013 6:41 PM, Steven D'Aprano wrote: > Many official Microsoft file extensions are four or more letters, e.g. > docx. I don't see any value in making long-lasting decisions on file > extensions based on (transient?) bugs that aren't our responsibility. +1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Sat May 4 07:13:01 2013 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Sat, 04 May 2013 14:13:01 +0900 Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support" In-Reply-To: <51846747.9060507@pearwood.info> References: <51846747.9060507@pearwood.info> Message-ID: <87vc6zcopu.fsf@uwakimon.sk.tsukuba.ac.jp> Steven D'Aprano writes: > > Rather than risk obscure bugs, I would suggest restricting the extensions > > to 3 characters. For the "Windowed Python ZIP Applications" case, could we > > use .pzw as the extension instead of .pyzw? +0 > Many official Microsoft file extensions are four or more letters, > e.g. docx. Give us a non-MS example, please. Nobody in their right mind would clash with a major MS product's naming conventions. Not even if their file format implements Digital-Ocular Coordination eXtensions. And a shell that borks the Borg's extensions won't make it in the market. > I don't see any value in making long-lasting decisions > on file extensions based on (transient?) bugs that aren't our > responsibility. Getting these associations right is worth *something* to Python. I'm not in a position to say more than "it's positive". But I don't see why we really care about what the file extensions are as long as they serve the purpose of making it easy to figure out which files are in what format in a names-only list. I have to admit that "Windowed Python ZIP Application" is probably something I personally will only ever consider as an hypothesis, though. From ncoghlan at gmail.com Sat May 4 07:50:09 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 4 May 2013 15:50:09 +1000 Subject: [Python-Dev] PEP 379 Python launcher for Windows - behaviour for #!/usr/bin/env python line is wrong In-Reply-To: References: Message-ID: On Sat, May 4, 2013 at 12:18 PM, Brian Curtin wrote: > On Fri, May 3, 2013 at 3:23 PM, Paul Moore wrote: >> I would propose that the behaviour of the launcher on Windows should be >> changed when it encounters specifically the hashbang line #!/usr/bin/env >> python.
In that case, it should search PATH for a copy of python.exe, and if >> it finds one, use that. If there is no python.exe on PATH, it should fall >> back to the same version of Python as would have been used if the line were >> #!/usr/bin/python. >> >> This will mean that scripts written with #!/usr/bin/env python will behave >> the same on Unix and Windows in the presence of activated virtualenvs. >> >> Would people be happy with this change? If so I will open an issue on >> bugs.python.org. I can look at producing a patch, as well. > > Sounds reasonable to me. Also to me. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From g.brandl at gmx.net Sat May 4 08:10:43 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 04 May 2013 08:10:43 +0200 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: <20130504012218.37f8470e@fsol> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <51844505.80807@canterbury.ac.nz> <20130504012218.37f8470e@fsol> Message-ID: Am 04.05.2013 01:22, schrieb Antoine Pitrou: > On Sat, 04 May 2013 11:15:17 +1200 > Greg Ewing wrote: >> Eli Bendersky wrote: >> > I'm just curious what it is about enums that sets everyone on a "let's >> > make things safer" path. Python is about duck typing, it's absolutely >> > "unsafe" in the static typing sense, in the most fundamental ways >> > imaginable. >> >> This isn't about catching bugs in the program, it's >> about validating user input. 
That's a common enough >> task that it deserves to have a convenient way to >> do it correctly. > > +1. An enum is basically a bidirectional mapping between some raw > values and some "nice" instances, so it deserves a well-defined lookup > operation in each direction. Agreed. Georg From steve at pearwood.info Sat May 4 08:15:21 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 04 May 2013 16:15:21 +1000 Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support" In-Reply-To: <87vc6zcopu.fsf@uwakimon.sk.tsukuba.ac.jp> References: <51846747.9060507@pearwood.info> <87vc6zcopu.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <5184A779.6010108@pearwood.info> On 04/05/13 15:13, Stephen J. Turnbull wrote: > Steven D'Aprano writes: > > > > Rather than risk obscure bugs, I would suggest restricting the extensions > > > to 3 characters. For the "Windowed Python ZIP Applications" case, could we > > > use .pzw as the extension instead of .pyzw? > > +0 > > > Many official Microsoft file extensions are four or more letters, > > e.g. docx. > > Give us a non-MS example, please. Nobody in their right mind would > clash with a major MS product's naming conventions. Not even if their > file format implements Digital-Ocular Coordination eXtensions. And a > shell that borks the Borg's extensions won't make it in the market. I'm afraid I don't understand your question. Are you suggesting that four letter extensions are restricted to Microsoft products? If so, that would be an excellent reason to avoid .pyzw, but I don't believe that is the case. Common 4+ letter extensions include .html, .tiff, .jpeg, .mpeg, .midi, .java and .torrent. -- Steven From ncoghlan at gmail.com Sat May 4 08:42:08 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 4 May 2013 16:42:08 +1000 Subject: [Python-Dev] enum discussion: can someone please summarize open issues?
In-Reply-To: References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <51844505.80807@canterbury.ac.nz> <20130504012218.37f8470e@fsol> Message-ID: On Sat, May 4, 2013 at 4:10 PM, Georg Brandl wrote: > Am 04.05.2013 01:22, schrieb Antoine Pitrou: >> On Sat, 04 May 2013 11:15:17 +1200 >> Greg Ewing wrote: >>> Eli Bendersky wrote: >>> > I'm just curious what it is about enums that sets everyone on a "let's >>> > make things safer" path. Python is about duck typing, it's absolutely >>> > "unsafe" in the static typing sense, in the most fundamental ways >>> > imaginable. >>> >>> This isn't about catching bugs in the program, it's >>> about validating user input. That's a common enough >>> task that it deserves to have a convenient way to >>> do it correctly. >> >> +1. An enum is basically a bidirectional mapping between some raw >> values and some "nice" instances, so it deserves a well-defined lookup >> operation in each direction. As I see it, there are 3 possible ways forward here: 1. The current PEP, offering only "getattr(MyEnum, name)". If code needs to ensure non-enum values are detected immediately (such as during translation of user input entered at a command prompt), then they can either create a separate mapping using:

    lookup = {m.name: m for m in (getattr(MyEnum, name) for name in dir(MyEnum)) if isinstance(m, MyEnum)}

or else create a lookup function:

    def getmember(enum, name):
        m = getattr(enum, name, None)
        if not isinstance(m, enum):
            raise KeyError("{!r} is not a member of {!r}".format(name, enum))
        return m

2.
We restore __getitem__ on EnumMetaclass *solely* for member lookup by name (the "getmember" functionality above). This would leave __call__ used for the reverse lookup (value to member and hence name) and __getitem__ for the forward lookup (name to member and hence value) (Note: given Ethan's comments about his current implementation, I believe this actually fits nicely with the way EnumMetaclass.__getattr__ is already implemented) 3. We offer my earlier suggestion of an "as_dict()" method on the metaclass, which implements the mapping calculation above. As others pointed out, this has the same name clash problem as offering additional non-special methods on namedtuple objects. I'm now -1 on my own as_dict() suggestion, due to the general name clash problem for arbitrary enums. Options 1 and 2 both sound reasonable to me, although I have a preference for 2 due to the ability to produce a more appropriate error message when the lookup fails. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From greg.ewing at canterbury.ac.nz Sat May 4 08:46:35 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 04 May 2013 18:46:35 +1200 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <5184437A.7090400@canterbury.ac.nz> Message-ID: <5184AECB.6040501@canterbury.ac.nz> Guido van Rossum wrote: > Code that wants to validate a string the user typed as input. Web > forms just don't work that way. 
Maybe "validation" was a misleading term to use. To be more precise, I'm talking about taking input to the program (it needn't come directly from a user, it could be read from a file or database) that is supposed to be the name of a Color, and turning it into a Color instance. For that purpose, it's convenient to have a function with only two possible outcomes: it either returns a Color instance, or raises a ValueError. The point is that you *shouldn't* have to perform a separate validation step. You should be able to use EAFP -- go ahead and perform the conversion, but be prepared to catch a ValueError at some level and report it to the user. -- Greg From greg.ewing at canterbury.ac.nz Sat May 4 08:53:44 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 04 May 2013 18:53:44 +1200 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <51844505.80807@canterbury.ac.nz> <20130504012218.37f8470e@fsol> Message-ID: <5184B078.9060105@canterbury.ac.nz> Nick Coghlan wrote: > 1. The current PEP, offering only "getattr(MyEnum, name)". > > 2. We restore __getitem__ on EnumMetaclass *solely* for member lookup > by name 3. Use keyword arguments to distinguish two different ways of calling the enum class: MyEnum(value = 1) --> lookup by value MyEnum(name = "foo") --> lookup by name MyEnum(1) could be made equivalent to MyEnum(value = 1) if it's thought that lookup by value will be the most common or natural case. 
Pros: Explicit is better than implicit. Cons: Not so convenient to get a type-conversion function to pass to other things. -- Greg From ncoghlan at gmail.com Sat May 4 08:48:20 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 4 May 2013 16:48:20 +1000 Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support" In-Reply-To: <5184A779.6010108@pearwood.info> References: <51846747.9060507@pearwood.info> <87vc6zcopu.fsf@uwakimon.sk.tsukuba.ac.jp> <5184A779.6010108@pearwood.info> Message-ID: On Sat, May 4, 2013 at 4:15 PM, Steven D'Aprano wrote: > On 04/05/13 15:13, Stephen J. Turnbull wrote: >> >> Steven D'Aprano writes: >> >> > > Rather than risk obscure bugs, I would suggest restricting the >> extensions >> > > to 3 characters. For the "Windowed Python ZIP Applications" case, >> could we >> > > use .pzw as the extension instead of .pyzw? >> >> +0 >> >> > Many official Microsoft file extensions are four or more letters, >> > e.g. docx. >> >> Give us a non-MS example, please. Nobody in their right mind would >> clash with a major MS product's naming conventions. Not even if their >> file format implements Digital-Ocular Coordination eXtensions. And a >> shell that borks the Borg's extensions won't make it in the market. > > > > I'm afraid I don't understand your question. Are you suggesting that four > letter extensions are restricted to Microsoft products? If so, that would be > an excellent reason to avoid .pyzw, but I don't believe that is the case. > > Common 4+ letter extensions include .html, .tiff, .jpeg, .mpeg, .midi, .java > and .torrent. We don't need examples of arbitrary data file extensions, we need examples of 4 letter extensions that are known to work correctly when placed on PATHEXT, including when called from PowerShell. In the absence of confirmation that 4-letter extensions work reliably in such cases, it seems wise to abbreviate the Windows GUI application extension as .pzw.
I've also cc'ed Steve Dower, since investigation of this kind of Windows behavioural question is one of the things he offered distutils-sig help with after PyCon US :) Cheers, Nick. P.S. Steve, FYI, here is Paul's original concern: http://mail.python.org/pipermail/python-dev/2013-May/125928.html -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Sat May 4 09:50:27 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 04 May 2013 16:50:27 +0900 Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support" In-Reply-To: <5184A779.6010108@pearwood.info> References: <51846747.9060507@pearwood.info> <87vc6zcopu.fsf@uwakimon.sk.tsukuba.ac.jp> <5184A779.6010108@pearwood.info> Message-ID: <87r4hngp4s.fsf@uwakimon.sk.tsukuba.ac.jp> Steven D'Aprano writes: > > Give us a non-MS example, please. > I'm afraid I don't understand your question. There were two problems mentioned. Paul worries about 4-letter extensions under PowerShell. You mentioned conflicts in Linux file managers. In both cases, a bug on Windows in detecting Microsoft products would kill (or at least seriously maim) a shell or file manager. I doubt many have ever existed, and surely they were detected *and* corrected pretty much immediately. My point is that such bug-awareness would not extend as strongly to extensions used by third-party free software. > Are you suggesting that four letter extensions are restricted to > Microsoft products? No, of course not. > Common 4+ letter extensions include .html, .tiff, .jpeg, .mpeg, > .midi, .java and .torrent. All of which (except perhaps .java and .torrent, which I bet are most commonly invoked not from shells but from IDEs and web browsers which have their own internal association databases) are commonly abbreviated to three letters on Windows, including in HTTP URLs which should have no such issues at all. That is consistent with my point (and Paul's, I believe).
It doesn't prove anything, but given the decreasing importance of extensions for file typing on all systems, I think there's little penalty to being shortsighted and following the 3-character convention for extensions, especially on Windows. From p.f.moore at gmail.com Sat May 4 11:25:15 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 4 May 2013 10:25:15 +0100 Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support" In-Reply-To: References: <51846747.9060507@pearwood.info> <87vc6zcopu.fsf@uwakimon.sk.tsukuba.ac.jp> <5184A779.6010108@pearwood.info> Message-ID: On 4 May 2013 07:48, Nick Coghlan wrote: > We don't need examples of arbitrary data file extensions, we need > examples of 4 letter extensions that are known to work correctly when > placed on PATHEXT, including when called from PowerShell. In the > absence of confirmation that 4-letter extensions work reliably in such > cases, it seems wise to abbreviate the Windows GUI application > extension as .pzw. > > I've also cc'ed Steve Dower, since investigation of this kind of > Windows behavioural question is one of the things he offered > distutils-sig help with after PyCon US :) Nick, thanks for passing this on. Your explanation of the issue is precisely correct. For information (I should have included this in the original message) here's the PowerShell bug report I found: https://connect.microsoft.com/PowerShell/feedback/details/238550/power-shell-trimming-extension-to-3-characters-when-resolving-file-associations Unfortunately the link to the referenced discussion in that report is inaccessible :-( Paul From pconnell at gmail.com Sat May 4 11:26:41 2013 From: pconnell at gmail.com (Phil Connell) Date: Sat, 4 May 2013 10:26:41 +0100 Subject: [Python-Dev] enum discussion: can someone please summarize open issues?
In-Reply-To: References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <51844505.80807@canterbury.ac.nz> <20130504012218.37f8470e@fsol> Message-ID: On 4 May 2013 07:42, "Nick Coghlan" wrote: > 2. We restore __getitem__ on EnumMetaclass *solely* for member lookup > by name (the "getmember" functionality above). This would leave > __call__ used for the reverse lookup (value to member and hence name) > and __getitem__ for the forward lookup (name to member and hence > value) (Note: given Ethan's comments about his current implementation, > I believe this actually fits nicely with the way > EnumMetaclass.__getattr__ is already implemented) This has the advantage of leaving one obvious way to do the 'reverse' lookup (namely __call__), rather than two redundant alternatives. Cheers, Phil -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From arigo at tunes.org Sat May 4 11:59:42 2013 From: arigo at tunes.org (Armin Rigo) Date: Sat, 4 May 2013 11:59:42 +0200 Subject: [Python-Dev] Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial) In-Reply-To: <20130307100840.GA24941@wycliff.ceplovi.cz> References: <1362575394.23949.2.camel@wycliff.ceplovi.cz> <20130307100840.GA24941@wycliff.ceplovi.cz> Message-ID: Hi Matej, On Thu, Mar 7, 2013 at 11:08 AM, Matej Cepl wrote: > if c is not ' ' and c is not ' ': > if c != ' ' and c != ' ': Sorry for the delay in answering, but I just noticed what is wrong in this "fix": it compares c with the same single-character ' ' twice, whereas the original compared it with ' ' and with the two-character ' '. A bientôt, Armin. From solipsis at pitrou.net Sat May 4 13:31:51 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 4 May 2013 13:31:51 +0200 Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support" References: <51846747.9060507@pearwood.info> Message-ID: <20130504133151.0c95c9a3@fsol> On Sat, 04 May 2013 11:41:27 +1000 Steven D'Aprano wrote: > > > Rather than risk obscure bugs, I would suggest restricting the extensions > > to 3 characters. For the "Windowed Python ZIP Applications" case, could we > > use .pzw as the extension instead of .pyzw? > > I've had Linux systems which associated OpenOffice docs with Archive Manager rather than OpenOffice. It's likely that at least some Linux systems will likewise decide that .pyz files are archives, not Python files, and open them in Archive Manager. What would that have to do with the file extension? If some Linux systems decide that .ods and .pyz files are archives, it's probably because they *are* archives in their own right (though specialized ones). Probably the libmagic (used e.g. by the `file` command) wasn't up-to-date enough to specifically recognize OpenOffice documents, so it simply recognized the ZIP file structure and detected the file as such. Regards Antoine.
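The point about content-based detection can be illustrated with a small sketch. It uses only the standard `zipfile` and `tempfile` modules; the file names `app.pyz` and `fake.pyz` and the helper `looks_like_zip` are made up for the example, and `zipfile.is_zipfile` merely stands in for what a libmagic-style detector does (it is not libmagic itself).

```python
import os
import tempfile
import zipfile


def looks_like_zip(path):
    """Return True if the file's contents have ZIP structure, whatever its name."""
    return zipfile.is_zipfile(path)


# Build a tiny hypothetical application archive with a .pyz extension.
tmpdir = tempfile.mkdtemp()
pyz_path = os.path.join(tmpdir, "app.pyz")
with zipfile.ZipFile(pyz_path, "w") as zf:
    zf.writestr("__main__.py", "print('hello')\n")

# A plain text file with the same extension, for contrast.
fake_path = os.path.join(tmpdir, "fake.pyz")
with open(fake_path, "w") as f:
    f.write("not an archive\n")

# Detection follows the contents, not the extension.
print(looks_like_zip(pyz_path))
print(looks_like_zip(fake_path))
```

Under these assumptions, a tool that inspects contents would treat the first file as an archive regardless of whether it is named .pyz, .pyzw or .pzw.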
From solipsis at pitrou.net Sat May 4 13:33:39 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 4 May 2013 13:33:39 +0200 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <51844505.80807@canterbury.ac.nz> <20130504012218.37f8470e@fsol> Message-ID: <20130504133339.1568eb0d@fsol> On Sat, 4 May 2013 16:42:08 +1000 Nick Coghlan wrote: > On Sat, May 4, 2013 at 4:10 PM, Georg Brandl wrote: > > Am 04.05.2013 01:22, schrieb Antoine Pitrou: > >> On Sat, 04 May 2013 11:15:17 +1200 > >> Greg Ewing wrote: > >>> Eli Bendersky wrote: > >>> > I'm just curious what it is about enums that sets everyone on a "let's > >>> > make things safer" path. Python is about duck typing, it's absolutely > >>> > "unsafe" in the static typing sense, in the most fundamental ways > >>> > imaginable. > >>> > >>> This isn't about catching bugs in the program, it's > >>> about validating user input. That's a common enough > >>> task that it deserves to have a convenient way to > >>> do it correctly. > >> > >> +1. An enum is basically a bidirectional mapping between some raw > >> values and some "nice" instances, so it deserves a well-defined lookup > >> operation in each direction. > > As I see it, there are 3 possible ways forward here: 4. Offer classmethods named Enum.by_name() and Enum.by_value(). Simple and explicit. Regards Antoine. 
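A rough sketch of what option 4 could look like, written against the `enum` module as it later appeared in the standard library (Python 3.4 and newer) rather than against the reference implementation under review. `LookupEnum` and `Color` are hypothetical names introduced only for illustration.

```python
from enum import Enum


class LookupEnum(Enum):
    """Hypothetical base class adding explicitly named lookups."""

    @classmethod
    def by_name(cls, name):
        # __members__ maps member names to members; unknown names raise KeyError.
        return cls.__members__[name]

    @classmethod
    def by_value(cls, value):
        # Calling the class looks a member up by value; unknown values raise ValueError.
        return cls(value)


class Color(LookupEnum):
    RED = 1
    GREEN = 2


print(Color.by_name("RED"))
print(Color.by_value(2))
```

The two directions raise different exceptions here (KeyError for names, ValueError for values), which is a property of this sketch and not something the proposal specifies.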
From ethan at stoneleaf.us Sat May 4 15:37:23 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 04 May 2013 06:37:23 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: <20130504133339.1568eb0d@fsol> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <51844505.80807@canterbury.ac.nz> <20130504012218.37f8470e@fsol> <20130504133339.1568eb0d@fsol> Message-ID: <51850F13.8090102@stoneleaf.us> On 05/04/2013 04:33 AM, Antoine Pitrou wrote: > On Sat, 4 May 2013 16:42:08 +1000 > Nick Coghlan wrote: >> On Sat, May 4, 2013 at 4:10 PM, Georg Brandl wrote: >>> Am 04.05.2013 01:22, schrieb Antoine Pitrou: >>>> On Sat, 04 May 2013 11:15:17 +1200 >>>> Greg Ewing wrote: >>>>> Eli Bendersky wrote: >>>>>> I'm just curious what it is about enums that sets everyone on a "let's >>>>>> make things safer" path. Python is about duck typing, it's absolutely >>>>>> "unsafe" in the static typing sense, in the most fundamental ways >>>>>> imaginable. >>>>> >>>>> This isn't about catching bugs in the program, it's >>>>> about validating user input. That's a common enough >>>>> task that it deserves to have a convenient way to >>>>> do it correctly. >>>> >>>> +1. An enum is basically a bidirectional mapping between some raw >>>> values and some "nice" instances, so it deserves a well-defined lookup >>>> operation in each direction. >> >> As I see it, there are 3 possible ways forward here: > > 4. Offer classmethods named Enum.by_name() and Enum.by_value(). > Simple and explicit. 
And then you can't have enum items named by_name and by_value. -- ~Ethan~ From eric at trueblade.com Sat May 4 16:01:12 2013 From: eric at trueblade.com (Eric V. Smith) Date: Sat, 04 May 2013 10:01:12 -0400 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <51844505.80807@canterbury.ac.nz> <20130504012218.37f8470e@fsol> Message-ID: <518514A8.5010604@trueblade.com> On 5/4/2013 2:42 AM, Nick Coghlan wrote: > On Sat, May 4, 2013 at 4:10 PM, Georg Brandl wrote: >> Am 04.05.2013 01:22, schrieb Antoine Pitrou: >>> On Sat, 04 May 2013 11:15:17 +1200 >>> Greg Ewing wrote: >>>> Eli Bendersky wrote: >>>>> I'm just curious what it is about enums that sets everyone on a "let's >>>>> make things safer" path. Python is about duck typing, it's absolutely >>>>> "unsafe" in the static typing sense, in the most fundamental ways >>>>> imaginable. >>>> >>>> This isn't about catching bugs in the program, it's >>>> about validating user input. That's a common enough >>>> task that it deserves to have a convenient way to >>>> do it correctly. >>> >>> +1. An enum is basically a bidirectional mapping between some raw >>> values and some "nice" instances, so it deserves a well-defined lookup >>> operation in each direction. > > As I see it, there are 3 possible ways forward here: > > 1. The current PEP, offering only "getattr(MyEnum, name)". 
> > If code needs to ensure non-enum values are detected immediately (such > as during translation of user input entered at a command prompt), then > they can either create a separate mapping using: > > lookup = {m.name, m for m in (getattr(MyEnum, name) for name in > dir(MyEnum)) if isinstance(m, MyEnum)} > > or else create a lookup function: > > def getmember(enum, name): > m = getattr(enum, name, None) > if not isinstance(m, enum): > raise KeyError("{!r} is not a member of {!r}".format(name, enum)) > return m > > 2. We restore __getitem__ on EnumMetaclass *solely* for member lookup > by name (the "getmember" functionality above). This would leave > __call__ used for the reverse lookup (value to member and hence name) > and __getitem__ for the forward lookup (name to member and hence > value) (Note: given Ethan's comments about his current implementation, > I believe this actually fits nicely with the way > EnumMetaclass.__getattr__ is already implemented) > > 3. We offer my earlier suggestion of an "as_dict()" method on the > metaclass, which implements the mapping calculation above. As others > pointed out, this has the same name clash problem as offering > additional non-special methods on namedtuple objects. > > I'm now -1 on my own as_dict() suggestion, due to the general name > clash problem for arbitrary enums. To avoid the name collision, namedtuple calls this _asdict(). -- Eric. From vinay_sajip at yahoo.co.uk Sat May 4 16:20:57 2013 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sat, 4 May 2013 14:20:57 +0000 (UTC) Subject: [Python-Dev] PEP 379 Python launcher for Windows - behaviour for #!/usr/bin/env python line is wrong References: Message-ID: Paul Moore gmail.com> writes: > This will mean that scripts written with #!/usr/bin/env python will > behave the same on Unix and Windows in the presence of activated > virtualenvs. Overall I think it's the right result. 
There's one other compatibility clarification: at the moment, as allowed by the PEP, you can have launcher flags in a line that starts with #!/usr/bin/env python, for example #!/usr/bin/env python3.2-32 -u In such a case the launcher would use the 3.2-32 suffix to indicate that 32-bit Python 3.2 is wanted, and pass the -u to the launched executable. I assume that this behaviour should continue, and the Posix-compatible behaviour being proposed should only apply for lines that contain "#!/usr/bin/env python" followed by whitespace. Also, since we're making a backwards incompatible change, do people feel that it needs to be switched on only in the presence of e.g. an environment variable such as PYLAUNCH_SEARCHPATH, or should we just change the default behaviour now and risk breaking user scripts which may rely on the current behaviour? Regards, Vinay Sajip From solipsis at pitrou.net Sat May 4 16:25:18 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 4 May 2013 16:25:18 +0200 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? References: <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <51844505.80807@canterbury.ac.nz> <20130504012218.37f8470e@fsol> <20130504133339.1568eb0d@fsol> <51850F13.8090102@stoneleaf.us> Message-ID: <20130504162518.70bf101c@fsol> On Sat, 04 May 2013 06:37:23 -0700 Ethan Furman wrote: > >>>> > >>>> +1. An enum is basically a bidirectional mapping between some raw > >>>> values and some "nice" instances, so it deserves a well-defined lookup > >>>> operation in each direction. 
> >> > >> As I see it, there are 3 possible ways forward here: > > > > 4. Offer classmethods named Enum.by_name() and Enum.by_value(). > > Simple and explicit. > > And then you can't have enum items named by_name and by_value. You can. Normal shadowing rules apply. By the same token, you can't have enum items named __str__ or __init__. How is that a problem? Attribute resolution rules imply some restrictions, which are well-known to all Python programmers. But, really, you can decide on another name if you like: __byname__ or _byname, etc. My point is simply that lookup doesn't *have* to invoke operators, and explicitly named classmethods are less confusing than repurposed operators. Regards Antoine. From guido at python.org Sat May 4 16:59:16 2013 From: guido at python.org (Guido van Rossum) Date: Sat, 4 May 2013 07:59:16 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <51844505.80807@canterbury.ac.nz> <20130504012218.37f8470e@fsol> Message-ID: Just to stop the bikeshedding, let's do #2. Put back __getitem__ solely for lookup by name. Keep __call__ (really __new__) for lookup by value or "pass-through" for members. 
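A minimal sketch of how #2 reads from the caller's side, assuming the `enum` module as it later landed in the standard library (Python 3.4 and newer); `Color` and `parse_color` are illustrative names, not part of the reference implementation.

```python
from enum import Enum


class Color(Enum):
    RED = 1
    GREEN = 2


def parse_color(name):
    """Turn an input string into a Color, or raise ValueError (EAFP style)."""
    try:
        # __getitem__ on the enum class: lookup by name.
        return Color[name]
    except KeyError:
        raise ValueError("{!r} is not a valid Color name".format(name)) from None


by_name = Color["RED"]         # lookup by name via __getitem__
by_value = Color(1)            # lookup by value via __call__
passthrough = Color(Color.RED) # "pass-through": a member maps to itself
print(by_name, by_value, passthrough, parse_color("GREEN"))
```

The `parse_color` wrapper is only one way to get the single "returns a member or raises ValueError" outcome mentioned earlier in the thread; plain `Color[name]` raises KeyError instead.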
--Guido On Fri, May 3, 2013 at 11:42 PM, Nick Coghlan wrote: > On Sat, May 4, 2013 at 4:10 PM, Georg Brandl wrote: >> Am 04.05.2013 01:22, schrieb Antoine Pitrou: >>> On Sat, 04 May 2013 11:15:17 +1200 >>> Greg Ewing wrote: >>>> Eli Bendersky wrote: >>>> > I'm just curious what it is about enums that sets everyone on a "let's >>>> > make things safer" path. Python is about duck typing, it's absolutely >>>> > "unsafe" in the static typing sense, in the most fundamental ways >>>> > imaginable. >>>> >>>> This isn't about catching bugs in the program, it's >>>> about validating user input. That's a common enough >>>> task that it deserves to have a convenient way to >>>> do it correctly. >>> >>> +1. An enum is basically a bidirectional mapping between some raw >>> values and some "nice" instances, so it deserves a well-defined lookup >>> operation in each direction. > > As I see it, there are 3 possible ways forward here: > > 1. The current PEP, offering only "getattr(MyEnum, name)". > > If code needs to ensure non-enum values are detected immediately (such > as during translation of user input entered at a command prompt), then > they can either create a separate mapping using: > > lookup = {m.name, m for m in (getattr(MyEnum, name) for name in > dir(MyEnum)) if isinstance(m, MyEnum)} > > or else create a lookup function: > > def getmember(enum, name): > m = getattr(enum, name, None) > if not isinstance(m, enum): > raise KeyError("{!r} is not a member of {!r}".format(name, enum)) > return m > > 2. We restore __getitem__ on EnumMetaclass *solely* for member lookup > by name (the "getmember" functionality above). This would leave > __call__ used for the reverse lookup (value to member and hence name) > and __getitem__ for the forward lookup (name to member and hence > value) (Note: given Ethan's comments about his current implementation, > I believe this actually fits nicely with the way > EnumMetaclass.__getattr__ is already implemented) > > 3. 
We offer my earlier suggestion of an "as_dict()" method on the > metaclass, which implements the mapping calculation above. As others > pointed out, this has the same name clash problem as offering > additional non-special methods on namedtuple objects. > > I'm now -1 on my own as_dict() suggestion, due to the general name > clash problem for arbitrary enums. > > Options 1 and 2 both sound reasonable to me, although I have a > preference for 2 due to the ability to produce a more appropriate > error message when the lookup fails. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From p.f.moore at gmail.com Sat May 4 17:15:54 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Sat, 4 May 2013 16:15:54 +0100 Subject: [Python-Dev] PEP 379 Python launcher for Windows - behaviour for #!/usr/bin/env python line is wrong In-Reply-To: References: Message-ID: On 4 May 2013 15:20, Vinay Sajip wrote: > Paul Moore gmail.com> writes: > > > This will mean that scripts written with #!/usr/bin/env python will > > behave the same on Unix and Windows in the presence of activated > > virtualenvs. > > Overall I think it's the right result. There's one other compatibility > clarification: at the moment, as allowed by the PEP, you can have launcher > flags in a line that starts with #!/usr/bin/env python, for example > > #!/usr/bin/env python3.2-32 -u > > In such a case the launcher would use the 3.2-32 suffix to indicate that > 32-bit Python 3.2 is wanted, and pass the -u to the launched executable. 
I > assume that this behaviour should continue, and the Posix-compatible > behaviour > being proposed should only apply for lines that contain > > "#!/usr/bin/env python" > > followed by whitespace. > That sounds reasonable - I've never used the ability to add flags, but I agree that as there's no equivalent in POSIX, there's no reason to change the current behaviour in that case. > Also, since we're making a backwards incompatible change, do people feel > that > it needs to be switched on only in the presence of e.g. an environment > variable > such as PYLAUNCH_SEARCHPATH, or should we just change the default behaviour > now and risk breaking user scripts which may rely on the current behaviour? > Personally, I'd say make it unconditional - the current behaviour is unlikely to ever be what people actually *want*. But if the consensus is to make it conditional, could we have a flag in py.ini rather than an environment variable? That would be more consistent with normal Windows practice. Paul. PS Vinay - from this post, I assume you're already looking at this code? I was considering trying to put together a patch, but I don't want to duplicate effort if you're working on it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vinay_sajip at yahoo.co.uk Sat May 4 17:42:28 2013 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Sat, 4 May 2013 15:42:28 +0000 (UTC) Subject: [Python-Dev] PEP 379 Python launcher for Windows - behaviour for #!/usr/bin/env python line is wrong References: Message-ID: Paul Moore gmail.com> writes: >
PS Vinay - from this post, I assume you're already looking at this > code? I was considering trying to put together a patch, but I don't want to > duplicate effort if you're working on it.
I've taken a quick look at it, but I probably won't be able to make any changes until near the end of the coming week. Feel free to have a go; the place to make changes will be near the call is_virt = parse_shebang(...) ... if (!is_virt) { ... } else { /* In here is where the new logic will probably go. */ } Also, the #define SEARCH_PATH needs to be uncommented to include the find_on_path function. It also enables searching the path for customised commands. Regards, Vinay From brett at python.org Sat May 4 20:49:16 2013 From: brett at python.org (Brett Cannon) Date: Sat, 4 May 2013 14:49:16 -0400 Subject: [Python-Dev] [Python-checkins] cpython: #17115, 17116: Have modules initialize the __package__ and __loader__ In-Reply-To: <3b2ybB24YVz7Ljl@mail.python.org> References: <3b2ybB24YVz7Ljl@mail.python.org> Message-ID: FYI, I'm aware this broke some buildbots and will have a look today to figure out why. On Sat, May 4, 2013 at 1:57 PM, brett.cannon wrote: > http://hg.python.org/cpython/rev/e39a8f8ceb9f > changeset: 83607:e39a8f8ceb9f > user: Brett Cannon > date: Sat May 04 13:56:58 2013 -0400 > summary: > #17115,17116: Have modules initialize the __package__ and __loader__ > attributes to None. > > The long-term goal is for people to be able to rely on these > attributes existing and checking for None to see if they have been > set. Since import itself sets these attributes when a loader does not > the only instances when the attributes are None are from someone > overloading __import__() and not using a loader or someone creating a > module from scratch. > > This patch also unifies module initialization. Before you could have > different attributes with default values depending on how the module > object was created. Now the only way to not get the same default set > of attributes is to circumvent initialization by calling > ModuleType.__new__() directly.
> > files: > Doc/c-api/module.rst | 11 +- > Doc/library/importlib.rst | 2 +- > Doc/reference/import.rst | 4 +- > Doc/whatsnew/3.4.rst | 5 + > Lib/ctypes/test/__init__.py | 2 +- > Lib/doctest.py | 2 +- > Lib/importlib/_bootstrap.py | 2 +- > Lib/inspect.py | 2 +- > Lib/test/test_descr.py | 4 +- > Lib/test/test_importlib/test_api.py | 14 +- > Lib/test/test_module.py | 28 +- > Misc/NEWS | 3 + > Objects/moduleobject.c | 39 +- > Python/importlib.h | 363 ++++++++------- > Python/pythonrun.c | 3 +- > 15 files changed, 264 insertions(+), 220 deletions(-) > > > diff --git a/Doc/c-api/module.rst b/Doc/c-api/module.rst > --- a/Doc/c-api/module.rst > +++ b/Doc/c-api/module.rst > @@ -35,13 +35,20 @@ > single: __name__ (module attribute) > single: __doc__ (module attribute) > single: __file__ (module attribute) > + single: __package__ (module attribute) > + single: __loader__ (module attribute) > > Return a new module object with the :attr:`__name__` attribute set to *name*. > - Only the module's :attr:`__doc__` and :attr:`__name__` attributes are filled in; > - the caller is responsible for providing a :attr:`__file__` attribute. > + The module's :attr:`__name__`, :attr:`__doc__`, :attr:`__package__`, and > + :attr:`__loader__` attributes are filled in (all but :attr:`__name__` are set > + to ``None``); the caller is responsible for providing a :attr:`__file__` > + attribute. > > .. versionadded:: 3.3 > > + .. versionchanged:: 3.4 > + :attr:`__package__` and :attr:`__loader__` are set to ``None``. > + > > .. c:function:: PyObject* PyModule_New(const char *name) > > diff --git a/Doc/library/importlib.rst b/Doc/library/importlib.rst > --- a/Doc/library/importlib.rst > +++ b/Doc/library/importlib.rst > @@ -827,7 +827,7 @@ > decorator as it subsumes this functionality. > > .. versionchanged:: 3.4 > - Set ``__loader__`` if set to ``None`` as well if the attribute does not > + Set ``__loader__`` if set to ``None``, as if the attribute does not > exist. 
> > > diff --git a/Doc/reference/import.rst b/Doc/reference/import.rst > --- a/Doc/reference/import.rst > +++ b/Doc/reference/import.rst > @@ -423,8 +423,8 @@ > * If the module has a ``__file__`` attribute, this is used as part of the > module's repr. > > - * If the module has no ``__file__`` but does have a ``__loader__``, then the > - loader's repr is used as part of the module's repr. > + * If the module has no ``__file__`` but does have a ``__loader__`` that is not > + ``None``, then the loader's repr is used as part of the module's repr. > > * Otherwise, just use the module's ``__name__`` in the repr. > > diff --git a/Doc/whatsnew/3.4.rst b/Doc/whatsnew/3.4.rst > --- a/Doc/whatsnew/3.4.rst > +++ b/Doc/whatsnew/3.4.rst > @@ -231,3 +231,8 @@ > :exc:`NotImplementedError` blindly. This will only affect code calling > :func:`super` and falling through all the way to the ABCs. For compatibility, > catch both :exc:`NotImplementedError` or the appropriate exception as needed. > + > +* The module type now initializes the :attr:`__package__` and :attr:`__loader__` > + attributes to ``None`` by default. To determine if these attributes were set > + in a backwards-compatible fashion, use e.g. > + ``getattr(module, '__loader__', None) is not None``. 
> \ No newline at end of file > diff --git a/Lib/ctypes/test/__init__.py b/Lib/ctypes/test/__init__.py > --- a/Lib/ctypes/test/__init__.py > +++ b/Lib/ctypes/test/__init__.py > @@ -37,7 +37,7 @@ > > def find_package_modules(package, mask): > import fnmatch > - if (hasattr(package, "__loader__") and > + if (package.__loader__ is not None and > hasattr(package.__loader__, '_files')): > path = package.__name__.replace(".", os.path.sep) > mask = os.path.join(path, mask) > diff --git a/Lib/doctest.py b/Lib/doctest.py > --- a/Lib/doctest.py > +++ b/Lib/doctest.py > @@ -215,7 +215,7 @@ > if module_relative: > package = _normalize_module(package, 3) > filename = _module_relative_path(package, filename) > - if hasattr(package, '__loader__'): > + if getattr(package, '__loader__', None) is not None: > if hasattr(package.__loader__, 'get_data'): > file_contents = package.__loader__.get_data(filename) > file_contents = file_contents.decode(encoding) > diff --git a/Lib/importlib/_bootstrap.py b/Lib/importlib/_bootstrap.py > --- a/Lib/importlib/_bootstrap.py > +++ b/Lib/importlib/_bootstrap.py > @@ -1726,7 +1726,7 @@ > module_type = type(sys) > for name, module in sys.modules.items(): > if isinstance(module, module_type): > - if not hasattr(module, '__loader__'): > + if getattr(module, '__loader__', None) is None: > if name in sys.builtin_module_names: > module.__loader__ = BuiltinImporter > elif _imp.is_frozen(name): > diff --git a/Lib/inspect.py b/Lib/inspect.py > --- a/Lib/inspect.py > +++ b/Lib/inspect.py > @@ -476,7 +476,7 @@ > if os.path.exists(filename): > return filename > # only return a non-existent filename if the module has a PEP 302 loader > - if hasattr(getmodule(object, filename), '__loader__'): > + if getattr(getmodule(object, filename), '__loader__', None) is not None: > return filename > # or it is in the linecache > if filename in linecache.cache: > diff --git a/Lib/test/test_descr.py b/Lib/test/test_descr.py > --- a/Lib/test/test_descr.py > +++ 
b/Lib/test/test_descr.py > @@ -2250,7 +2250,9 @@ > minstance = M("m") > minstance.b = 2 > minstance.a = 1 > - names = [x for x in dir(minstance) if x not in ["__name__", "__doc__"]] > + default_attributes = ['__name__', '__doc__', '__package__', > + '__loader__'] > + names = [x for x in dir(minstance) if x not in default_attributes] > self.assertEqual(names, ['a', 'b']) > > class M2(M): > diff --git a/Lib/test/test_importlib/test_api.py b/Lib/test/test_importlib/test_api.py > --- a/Lib/test/test_importlib/test_api.py > +++ b/Lib/test/test_importlib/test_api.py > @@ -197,14 +197,12 @@ > # Issue #17098: all modules should have __loader__ defined. > for name, module in sys.modules.items(): > if isinstance(module, types.ModuleType): > - if name in sys.builtin_module_names: > - self.assertIn(module.__loader__, > - (importlib.machinery.BuiltinImporter, > - importlib._bootstrap.BuiltinImporter)) > - elif imp.is_frozen(name): > - self.assertIn(module.__loader__, > - (importlib.machinery.FrozenImporter, > - importlib._bootstrap.FrozenImporter)) > + self.assertTrue(hasattr(module, '__loader__'), > + '{!r} lacks a __loader__ attribute'.format(name)) > + if importlib.machinery.BuiltinImporter.find_module(name): > + self.assertIsNot(module.__loader__, None) > + elif importlib.machinery.FrozenImporter.find_module(name): > + self.assertIsNot(module.__loader__, None) > > > if __name__ == '__main__': > diff --git a/Lib/test/test_module.py b/Lib/test/test_module.py > --- a/Lib/test/test_module.py > +++ b/Lib/test/test_module.py > @@ -33,7 +33,10 @@ > foo = ModuleType("foo") > self.assertEqual(foo.__name__, "foo") > self.assertEqual(foo.__doc__, None) > - self.assertEqual(foo.__dict__, {"__name__": "foo", "__doc__": None}) > + self.assertIs(foo.__loader__, None) > + self.assertIs(foo.__package__, None) > + self.assertEqual(foo.__dict__, {"__name__": "foo", "__doc__": None, > + "__loader__": None, "__package__": None}) > > def test_ascii_docstring(self): > # ASCII docstring > @@ -41,7 
+44,8 @@ > self.assertEqual(foo.__name__, "foo") > self.assertEqual(foo.__doc__, "foodoc") > self.assertEqual(foo.__dict__, > - {"__name__": "foo", "__doc__": "foodoc"}) > + {"__name__": "foo", "__doc__": "foodoc", > + "__loader__": None, "__package__": None}) > > def test_unicode_docstring(self): > # Unicode docstring > @@ -49,7 +53,8 @@ > self.assertEqual(foo.__name__, "foo") > self.assertEqual(foo.__doc__, "foodoc\u1234") > self.assertEqual(foo.__dict__, > - {"__name__": "foo", "__doc__": "foodoc\u1234"}) > + {"__name__": "foo", "__doc__": "foodoc\u1234", > + "__loader__": None, "__package__": None}) > > def test_reinit(self): > # Reinitialization should not replace the __dict__ > @@ -61,7 +66,8 @@ > self.assertEqual(foo.__doc__, "foodoc") > self.assertEqual(foo.bar, 42) > self.assertEqual(foo.__dict__, > - {"__name__": "foo", "__doc__": "foodoc", "bar": 42}) > + {"__name__": "foo", "__doc__": "foodoc", "bar": 42, > + "__loader__": None, "__package__": None}) > self.assertTrue(foo.__dict__ is d) > > @unittest.expectedFailure > @@ -110,13 +116,19 @@ > m.__file__ = '/tmp/foo.py' > self.assertEqual(repr(m), "") > > + def test_module_repr_with_loader_as_None(self): > + m = ModuleType('foo') > + assert m.__loader__ is None > + self.assertEqual(repr(m), "") > + > def test_module_repr_with_bare_loader_but_no_name(self): > m = ModuleType('foo') > del m.__name__ > # Yes, a class not an instance. > m.__loader__ = BareLoader > + loader_repr = repr(BareLoader) > self.assertEqual( > - repr(m), ")>") > + repr(m), "".format(loader_repr)) > > def test_module_repr_with_full_loader_but_no_name(self): > # m.__loader__.module_repr() will fail because the module has no > @@ -126,15 +138,17 @@ > del m.__name__ > # Yes, a class not an instance. 
> m.__loader__ = FullLoader > + loader_repr = repr(FullLoader) > self.assertEqual( > - repr(m), ")>") > + repr(m), "".format(loader_repr)) > > def test_module_repr_with_bare_loader(self): > m = ModuleType('foo') > # Yes, a class not an instance. > m.__loader__ = BareLoader > + module_repr = repr(BareLoader) > self.assertEqual( > - repr(m), ")>") > + repr(m), "".format(module_repr)) > > def test_module_repr_with_full_loader(self): > m = ModuleType('foo') > diff --git a/Misc/NEWS b/Misc/NEWS > --- a/Misc/NEWS > +++ b/Misc/NEWS > @@ -10,6 +10,9 @@ > Core and Builtins > ----------------- > > +- Issue #17115,17116: Module initialization now includes setting __package__ and > + __loader__ attributes to None. > + > - Issue #17853: Ensure locals of a class that shadow free variables always win > over the closures. > > diff --git a/Objects/moduleobject.c b/Objects/moduleobject.c > --- a/Objects/moduleobject.c > +++ b/Objects/moduleobject.c > @@ -26,6 +26,27 @@ > }; > > > +static int > +module_init_dict(PyObject *md_dict, PyObject *name, PyObject *doc) > +{ > + if (md_dict == NULL) > + return -1; > + if (doc == NULL) > + doc = Py_None; > + > + if (PyDict_SetItemString(md_dict, "__name__", name) != 0) > + return -1; > + if (PyDict_SetItemString(md_dict, "__doc__", doc) != 0) > + return -1; > + if (PyDict_SetItemString(md_dict, "__package__", Py_None) != 0) > + return -1; > + if (PyDict_SetItemString(md_dict, "__loader__", Py_None) != 0) > + return -1; > + > + return 0; > +} > + > + > PyObject * > PyModule_NewObject(PyObject *name) > { > @@ -36,13 +57,7 @@ > m->md_def = NULL; > m->md_state = NULL; > m->md_dict = PyDict_New(); > - if (m->md_dict == NULL) > - goto fail; > - if (PyDict_SetItemString(m->md_dict, "__name__", name) != 0) > - goto fail; > - if (PyDict_SetItemString(m->md_dict, "__doc__", Py_None) != 0) > - goto fail; > - if (PyDict_SetItemString(m->md_dict, "__package__", Py_None) != 0) > + if (module_init_dict(m->md_dict, name, NULL) != 0) > goto fail; > 
PyObject_GC_Track(m); > return (PyObject *)m; > @@ -347,9 +362,7 @@ > return -1; > m->md_dict = dict; > } > - if (PyDict_SetItemString(dict, "__name__", name) < 0) > - return -1; > - if (PyDict_SetItemString(dict, "__doc__", doc) < 0) > + if (module_init_dict(dict, name, doc) < 0) > return -1; > return 0; > } > @@ -380,7 +393,7 @@ > if (m->md_dict != NULL) { > loader = PyDict_GetItemString(m->md_dict, "__loader__"); > } > - if (loader != NULL) { > + if (loader != NULL && loader != Py_None) { > repr = PyObject_CallMethod(loader, "module_repr", "(O)", > (PyObject *)m, NULL); > if (repr == NULL) { > @@ -404,10 +417,10 @@ > filename = PyModule_GetFilenameObject((PyObject *)m); > if (filename == NULL) { > PyErr_Clear(); > - /* There's no m.__file__, so if there was an __loader__, use that in > + /* There's no m.__file__, so if there was a __loader__, use that in > * the repr, otherwise, the only thing you can use is m.__name__ > */ > - if (loader == NULL) { > + if (loader == NULL || loader == Py_None) { > repr = PyUnicode_FromFormat("", name); > } > else { > diff --git a/Python/importlib.h b/Python/importlib.h > --- a/Python/importlib.h > +++ b/Python/importlib.h > [stripped] > diff --git a/Python/pythonrun.c b/Python/pythonrun.c > --- a/Python/pythonrun.c > +++ b/Python/pythonrun.c > @@ -866,7 +866,8 @@ > * be set if __main__ gets further initialized later in the startup > * process. 
> */ > - if (PyDict_GetItemString(d, "__loader__") == NULL) { > + PyObject *loader = PyDict_GetItemString(d, "__loader__"); > + if (loader == NULL || loader == Py_None) { > PyObject *loader = PyObject_GetAttrString(interp->importlib, > "BuiltinImporter"); > if (loader == NULL) { > > -- > Repository URL: http://hg.python.org/cpython > > _______________________________________________ > Python-checkins mailing list > Python-checkins at python.org > http://mail.python.org/mailman/listinfo/python-checkins > From eliben at gmail.com Sun May 5 00:04:49 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 4 May 2013 15:04:49 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement Message-ID: Hello pydev, PEP 435 is ready for final review. A lot of the feedback from the last few weeks of discussions has been incorporated. Naturally, not everything could go in because some minor (mostly preference-based) issues did not reach a consensus. We do feel, however, that the end result is better than in the beginning and that Python can finally have a useful enumeration type in the standard library. I'm attaching the latest version of the PEP for convenience. If you've read previous versions, the easiest way to get acquainted with the recent changes is to go through the revision log at http://hg.python.org/peps A reference implementation for PEP 435 is available at https://bitbucket.org/stoneleaf/ref435 Kind regards and happy weekend. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- PEP: 435 Title: Adding an Enum type to the Python standard library Version: $Revision$ Last-Modified: $Date$ Author: Barry Warsaw , Eli Bendersky , Ethan Furman Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 2013-02-23 Python-Version: 3.4 Post-History: 2013-02-23, 2013-05-02 Abstract ======== This PEP proposes adding an enumeration type to the Python standard library. 
An enumeration is a set of symbolic names bound to unique, constant values. Within an enumeration, the values can be compared by identity, and the enumeration itself can be iterated over.

Decision
========

TODO: update decision here once pronouncement is made. [1]_

Status of discussions
=====================

The idea of adding an enum type to Python is not new - PEP 354 [2]_ is a previous attempt that was rejected in 2005. Recently a new set of discussions was initiated [3]_ on the ``python-ideas`` mailing list. Many new ideas were proposed in several threads; after a lengthy discussion Guido proposed adding ``flufl.enum`` to the standard library [4]_.

During the PyCon 2013 language summit the issue was discussed further. It became clear that many developers want to see an enum that subclasses ``int``, which can allow us to replace many integer constants in the standard library by enums with friendly string representations, without ceding backwards compatibility. An additional discussion among several interested core developers led to the proposal of having ``IntEnum`` as a special case of ``Enum``.

The key dividing issue between ``Enum`` and ``IntEnum`` is whether comparing to integers is semantically meaningful. For most uses of enumerations, it's a **feature** to reject comparison to integers; enums that compare to integers lead, through transitivity, to comparisons between enums of unrelated types, which isn't desirable in most cases. For some uses, however, greater interoperability with integers is desired. For instance, this is the case for replacing existing standard library constants (such as ``socket.AF_INET``) with enumerations.

Further discussion in late April 2013 led to the conclusion that enumeration members should belong to the type of their enum: ``type(Color.red) == Color``. Guido has pronounced a decision on this issue [5]_, as well as on the related issues of disallowing subclassing of enums [6]_, unless they define no enumeration members [7]_.
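The transitivity concern described above can be made concrete. The following is only a minimal sketch, assuming the ``enum`` module as it eventually shipped with Python 3.4 and later (which may differ in detail from the ref435 reference implementation); the ``Shape``, ``Request`` and ``Color`` classes are hypothetical illustrations:

```python
from enum import Enum, IntEnum


class Shape(IntEnum):      # hypothetical IntEnum
    circle = 1


class Request(IntEnum):    # unrelated hypothetical IntEnum
    post = 1


class Color(Enum):         # plain Enum with the same underlying value
    red = 1


# Each IntEnum member compares equal to the integer 1 ...
assert Shape.circle == 1
assert Request.post == 1
# ... so, by transitivity, members of unrelated IntEnums equal each other.
assert Shape.circle == Request.post

# A plain Enum member rejects the comparison: it is simply "not equal".
assert Color.red != 1
assert Color.red != Shape.circle

# Members are instances of their own enumeration.
assert type(Color.red) is Color
assert isinstance(Shape.circle, int)
```

The design trade-off is visible here: ``IntEnum`` buys drop-in compatibility with integer constants at the price of letting ``Shape.circle == Request.post`` succeed, which is why it is positioned as the special case.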
Motivation
==========

*[Based partly on the Motivation stated in PEP 354]*

The properties of an enumeration are useful for defining an immutable, related set of constant values that have a defined sequence but no inherent semantic meaning. Classic examples are days of the week (Sunday through Saturday) and school assessment grades ('A' through 'D', and 'F'). Other examples include error status values and states within a defined process.

It is possible to simply define a sequence of values of some other basic type, such as ``int`` or ``str``, to represent discrete arbitrary values. However, an enumeration ensures that such values are distinct from any others including, importantly, values within other enumerations, and that operations without meaning ("Wednesday times two") are not defined for these values. It also provides a convenient printable representation of enum values without requiring tedious repetition while defining them (i.e. no ``GREEN = 'green'``).

Module and type name
====================

We propose to add a module named ``enum`` to the standard library. The main type exposed by this module is ``Enum``. Hence, to import the ``Enum`` type user code will run::

    >>> from enum import Enum

Proposed semantics for the new enumeration type
===============================================

Creating an Enum
----------------

Enumerations are created using the class syntax, which makes them easy to read and write. An alternative creation method is described in `Functional API`_. To define an enumeration, subclass ``Enum`` as follows::

    >>> from enum import Enum
    >>> class Color(Enum):
    ...     red = 1
    ...     green = 2
    ...     blue = 3

**A note on nomenclature**: we call ``Color`` an *enumeration* (or *enum*) and ``Color.red``, ``Color.green`` are *enumeration members* (or *enum members*). Enumeration members also have *values* (the value of ``Color.red`` is ``1``, etc.)
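To tie this terminology back to the claims made in the Motivation, here is a small sketch, again assuming the ``enum`` module as later released with Python 3.4+ rather than ref435 specifically; ``Weekday`` is a hypothetical second enumeration introduced only for contrast:

```python
from enum import Enum


class Color(Enum):
    red = 1
    green = 2
    blue = 3


class Weekday(Enum):       # hypothetical, for contrast only
    monday = 1
    wednesday = 3


# Color is the enumeration, Color.red is a member, 1 is that member's value.
assert isinstance(Color.red, Color)
assert Color.red.name == "red"
assert Color.red.value == 1

# Equal underlying values do not make members of different enumerations equal.
assert Color.red.value == Weekday.monday.value
assert Color.red != Weekday.monday

# Operations without meaning ("Wednesday times two") are not defined.
try:
    Weekday.wednesday * 2
    multiplication_defined = True
except TypeError:
    multiplication_defined = False
assert not multiplication_defined
```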
Enumeration members have human readable string representations::

    >>> print(Color.red)
    Color.red

...while their ``repr`` has more information::

    >>> print(repr(Color.red))
    <Color.red: 1>

The *type* of an enumeration member is the enumeration it belongs to::

    >>> type(Color.red)
    <Enum 'Color'>
    >>> isinstance(Color.green, Color)
    True

Enums also have a property that contains just their item name::

    >>> print(Color.red.name)
    red

Enumerations support iteration, in definition order::

    >>> class Shake(Enum):
    ...     vanilla = 7
    ...     chocolate = 4
    ...     cookies = 9
    ...     mint = 3
    ...
    >>> for shake in Shake:
    ...     print(shake)
    ...
    Shake.vanilla
    Shake.chocolate
    Shake.cookies
    Shake.mint

Enumeration members are hashable, so they can be used in dictionaries and sets::

    >>> apples = {}
    >>> apples[Color.red] = 'red delicious'
    >>> apples[Color.green] = 'granny smith'
    >>> apples
    {<Color.red: 1>: 'red delicious', <Color.green: 2>: 'granny smith'}

Programmatic access to enumeration members
------------------------------------------

Sometimes it's useful to access members in enumerations programmatically (i.e. situations where ``Color.red`` won't do because the exact color is not known at program-writing time). ``Enum`` allows such access::

    >>> Color(1)
    <Color.red: 1>
    >>> Color(3)
    <Color.blue: 3>

If you want to access enum members by *name*, use item access::

    >>> Color['red']
    <Color.red: 1>
    >>> Color['green']
    <Color.green: 2>

Duplicating enum members and values
-----------------------------------

Having two enum members with the same name is invalid::

    >>> class Shape(Enum):
    ...     square = 2
    ...     square = 3
    ...
    Traceback (most recent call last):
    ...
    TypeError: Attempted to reuse key: square

However, two enum members are allowed to have the same value. Given two members A and B with the same value (and A defined first), B is an alias to A. By-value lookup of the value of A and B will return A::

    >>> class Shape(Enum):
    ...     square = 2
    ...     diamond = 1
    ...     circle = 3
    ...     alias_for_square = 2
    ...
    >>> Shape.square
    <Shape.square: 2>
    >>> Shape.alias_for_square
    <Shape.square: 2>
    >>> Shape(2)
    <Shape.square: 2>

Iterating over the members of an enum does not provide the aliases::

    >>> list(Shape)
    [<Shape.square: 2>, <Shape.diamond: 1>, <Shape.circle: 3>]

If access to aliases is required for some reason, use the special attribute ``__aliases__``::

    >>> Shape.__aliases__
    ['alias_for_square']

Comparisons
-----------

Enumeration members are compared by identity::

    >>> Color.red is Color.red
    True
    >>> Color.red is Color.blue
    False
    >>> Color.red is not Color.blue
    True

Ordered comparisons between enumeration values are *not* supported. Enums are not integers (but see `IntEnum`_ below)::

    >>> Color.red < Color.blue
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unorderable types: Color() < Color()

Equality comparisons are defined though::

    >>> Color.blue == Color.red
    False
    >>> Color.blue == Color.blue
    True

Comparisons against non-enumeration values will always compare not equal (again, ``IntEnum`` was explicitly designed to behave differently, see below)::

    >>> Color.blue == 2
    False

Allowed members and attributes of enumerations
----------------------------------------------

The examples above use integers for enumeration values. Using integers is short and handy (and provided by default by the `Functional API`_), but not strictly enforced. In the vast majority of use-cases, one doesn't care what the actual value of an enumeration is. But if the value *is* important, enumerations can have arbitrary values.

Enumerations are Python classes, and can have methods and special methods as usual. If we have this enumeration::

    class Mood(Enum):
        funky = 1
        happy = 3

        def describe(self):
            # self is the member here
            return self.name, self.value

        def __str__(self):
            return 'my custom str! {0}'.format(self.value)

        @classmethod
        def favorite_mood(cls):
            # cls here is the enumeration
            return cls.happy

Then::

    >>> Mood.favorite_mood()
    <Mood.happy: 3>
    >>> Mood.happy.describe()
    ('happy', 3)
    >>> str(Mood.funky)
    'my custom str! 1'

The rules for what is allowed are as follows: all attributes defined within an enumeration will become members of this enumeration, with the exception of *__dunder__* names and descriptors; methods are descriptors too.

Restricted subclassing of enumerations
--------------------------------------

Subclassing an enumeration is allowed only if the enumeration does not define any members. So this is forbidden::

    >>> class MoreColor(Color):
    ...     pink = 17
    ...
    TypeError: Cannot extend enumerations

But this is allowed::

    >>> class Foo(Enum):
    ...     def some_behavior(self):
    ...         pass
    ...
    >>> class Bar(Foo):
    ...     happy = 1
    ...     sad = 2
    ...

The rationale for this decision was given by Guido in [6]_. Allowing subclassing of enums that define members would lead to a violation of some important invariants of types and instances. On the other hand, it makes sense to allow sharing some common behavior between a group of enumerations, and subclassing empty enumerations is also used to implement ``IntEnum``.

IntEnum
-------

A variation of ``Enum`` is proposed which is also a subclass of ``int``. Members of an ``IntEnum`` can be compared to integers; by extension, integer enumerations of different types can also be compared to each other::

    >>> from enum import IntEnum
    >>> class Shape(IntEnum):
    ...     circle = 1
    ...     square = 2
    ...
    >>> class Request(IntEnum):
    ...     post = 1
    ...     get = 2
    ...
    >>> Shape == 1
    False
    >>> Shape.circle == 1
    True
    >>> Shape.circle == Request.post
    True

However, they still can't be compared to ``Enum``::

    >>> class Shape(IntEnum):
    ...     circle = 1
    ...     square = 2
    ...
    >>> class Color(Enum):
    ...     red = 1
    ...     green = 2
    ...
    >>> Shape.circle == Color.red
    False

``IntEnum`` values behave like integers in other ways you'd expect::

    >>> int(Shape.circle)
    1
    >>> ['a', 'b', 'c'][Shape.circle]
    'b'
    >>> [i for i in range(Shape.square)]
    [0, 1]

For the vast majority of code, ``Enum`` is strongly recommended, since ``IntEnum`` breaks some semantic promises of an enumeration (by being comparable to integers, and thus by transitivity to other unrelated enumerations). It should be used only in special cases where there's no other choice; for example, when integer constants are replaced with enumerations and backwards compatibility is required with code that still expects integers.

Other derived enumerations
--------------------------

``IntEnum`` will be part of the ``enum`` module. However, it would be very simple to implement independently::

    class IntEnum(int, Enum):
        pass

This demonstrates how similar derived enumerations can be defined, for example a ``StrEnum`` that mixes in ``str`` instead of ``int``.

Some rules:

1. When subclassing Enum, mix-in types must appear before Enum itself in the sequence of bases.
2. While Enum can have members of any type, once you mix in an additional type, all the members must have values of that type, e.g. ``int`` above. This restriction does not apply to behavior-only mixins.

Functional API
--------------

The ``Enum`` class is callable, providing the following functional API::

    >>> Animal = Enum('Animal', 'ant bee cat dog')
    >>> Animal
    <Enum 'Animal'>
    >>> Animal.ant
    <Animal.ant: 1>
    >>> Animal.ant.value
    1
    >>> list(Animal)
    [<Animal.ant: 1>, <Animal.bee: 2>, <Animal.cat: 3>, <Animal.dog: 4>]

The semantics of this API resemble ``namedtuple``. The first argument of the call to ``Enum`` is the name of the enumeration. The second argument is a source of enumeration member names. It can be a whitespace-separated string of names, a sequence of names or a sequence of 2-tuples with key/value pairs. The last option enables assigning arbitrary values to enumerations; the others auto-assign increasing integers starting with 1. A new class derived from ``Enum`` is returned.
In other words, the above assignment to ``Animal`` is equivalent to::

    >>> class Animal(Enum):
    ...     ant = 1
    ...     bee = 2
    ...     cat = 3
    ...     dog = 4

Pickling
--------

Enumerations can be pickled and unpickled::

    >>> from enum.tests.fruit import Fruit
    >>> from pickle import dumps, loads
    >>> Fruit.tomato is loads(dumps(Fruit.tomato))
    True

The usual restrictions for pickling apply: picklable enums must be defined in the top level of a module, to be importable from that module when unpickling occurs.

Proposed variations
===================

Some variations were proposed during the discussions in the mailing list. Here are some of the more popular ones.

flufl.enum
----------

``flufl.enum`` was the reference implementation upon which this PEP was originally based. Eventually, it was decided not to include ``flufl.enum`` because its design separated enumeration members from enumerations, so the former are not instances of the latter. Its design also explicitly permits subclassing enumerations for extending them with more members (due to the member/enum separation, the type invariants are not violated in ``flufl.enum`` with such a scheme).

Not having to specify values for enums
--------------------------------------

Michael Foord proposed (and Tim Delaney provided a proof-of-concept implementation) to use metaclass magic that makes this possible::

    class Color(Enum):
        red, green, blue

The values actually get assigned only when first looked up.

Pros: cleaner syntax that requires less typing for a very common task (just listing enumeration names without caring about the values).

Cons: involves much magic in the implementation, which makes even the definition of such enums baffling when first seen. Besides, explicit is better than implicit.

Using special names or forms to auto-assign enum values
-------------------------------------------------------

A different approach to avoid specifying enum values is to use a special name or form to auto-assign them.
For example::

    class Color(Enum):
        red = None          # auto-assigned to 0
        green = None        # auto-assigned to 1
        blue = None         # auto-assigned to 2

More flexibly::

    class Color(Enum):
        red = 7
        green = None        # auto-assigned to 8
        blue = 19
        purple = None       # auto-assigned to 20

Some variations on this theme:

#. A special name ``auto`` imported from the enum package.
#. Georg Brandl proposed ellipsis (``...``) instead of ``None`` to achieve the same effect.

Pros: no need to manually enter values. Makes it easier to change the enum and extend it, especially for large enumerations.

Cons: actually longer to type in many simple cases. The argument of explicit vs. implicit applies here as well.

Use-cases in the standard library
=================================

The Python standard library has many places where the usage of enums would be beneficial to replace other idioms currently used to represent them. Such usages can be divided into two categories: user-code facing constants, and internal constants.

User-code facing constants like ``os.SEEK_*``, ``socket`` module constants, decimal rounding modes and HTTP status codes could require backwards compatibility since user code may expect integers. ``IntEnum`` as described above provides the required semantics; being a subclass of ``int``, it does not affect user code that expects integers, while on the other hand allowing printable representations for enumeration values::

    >>> import socket
    >>> family = socket.AF_INET
    >>> family == 2
    True
    >>> print(family)
    SocketFamily.AF_INET

Internal constants are not seen by user code but are employed internally by stdlib modules. These can be implemented with ``Enum``. Some examples uncovered by a very partial skim through the stdlib: ``binhex``, ``imaplib``, ``http/client``, ``urllib/robotparser``, ``idlelib``, ``concurrent.futures``, ``turtledemo``.

In addition, looking at the code of the Twisted library, there are many use cases for replacing internal state constants with enums.
The same can be said about a lot of networking code (especially implementation of protocols) and can be seen in test protocols written with the Tulip library as well.

Acknowledgments
===============

This PEP initially proposed including the ``flufl.enum`` package [8]_ by Barry Warsaw in the stdlib, and is inspired in large part by it. Ben Finney is the author of the earlier enumeration PEP 354.

References
==========

.. [1] Placeholder for pronouncement
.. [2] http://www.python.org/dev/peps/pep-0354/
.. [3] http://mail.python.org/pipermail/python-ideas/2013-January/019003.html
.. [4] http://mail.python.org/pipermail/python-ideas/2013-February/019373.html
.. [5] http://mail.python.org/pipermail/python-dev/2013-April/125687.html
.. [6] http://mail.python.org/pipermail/python-dev/2013-April/125716.html
.. [7] http://mail.python.org/pipermail/python-dev/2013-May/125859.html
.. [8] http://pythonhosted.org/flufl.enum/

Copyright
=========

This document has been placed in the public domain.

Todo
====

* Mark PEP 354 "superseded by" this one, if accepted
* The last revision where flufl.enum was the approach is cb3c18a080a3

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

From victor.stinner at gmail.com Sun May 5 00:30:33 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Sun, 5 May 2013 00:30:33 +0200
Subject: [Python-Dev] PEP 435 - requesting pronouncement
In-Reply-To:
References:
Message-ID:

Great job guys.

Victor

On 5 May 2013 at 00:06, "Eli Bendersky" wrote:
> Hello pydev,
>
> PEP 435 is ready for final review. A lot of the feedback from the last few
> weeks of discussions has been incorporated. Naturally, not everything could
> go in because some minor (mostly preference-based) issues did not reach a
> consensus. We do feel, however, that the end result is better than in the
> beginning and that Python can finally have a useful enumeration type in the
> standard library.
> > I'm attaching the latest version of the PEP for convenience. If you've > read previous versions, the easiest way to get acquainted with the recent > changes is to go through the revision log at http://hg.python.org/peps > > A reference implementation for PEP 435 is available at > https://bitbucket.org/stoneleaf/ref435 > > Kind regards and happy weekend. > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/victor.stinner%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From larry at hastings.org Sun May 5 00:41:09 2013 From: larry at hastings.org (Larry Hastings) Date: Sat, 04 May 2013 15:41:09 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: <518514A8.5010604@trueblade.com> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <51844505.80807@canterbury.ac.nz> <20130504012218.37f8470e@fsol> <518514A8.5010604@trueblade.com> Message-ID: <51858E85.6080401@hastings.org> On 05/04/2013 07:01 AM, Eric V. Smith wrote: > On 5/4/2013 2:42 AM, Nick Coghlan wrote: >> I'm now -1 on my own as_dict() suggestion, due to the general name >> clash problem for arbitrary enums. > To avoid the name collision, namedtuple calls this _asdict(). Although I recall Raymond told me he should have called it asdict_(), and reserved all identifiers with trailing underscores. 
//arry/

From timothy.c.delaney at gmail.com Sun May 5 01:27:25 2013
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Sun, 5 May 2013 09:27:25 +1000
Subject: [Python-Dev] PEP 435 - requesting pronouncement
In-Reply-To:
References:
Message-ID:

Typo line 171:

One thing I'd like the PEP to be clear about is whether enum_type and _EnumDict._enum_names should be documented, or whether they're considered implementation details.

I'd like to make a subclass of Enum that accepts ... for auto-valued enums but that requires subclassing the metaclass and access to classdict._enum_names. I can get to enum_type via type(Enum), but _EnumDict._enum_names requires knowing the attribute. It would be sufficient for my purposes if it was just documented that the passed classdict had a _enum_names attribute.

In testing the below, I've also discovered a bug in the reference implementation - currently it will not handle an __mro__ like:

(, , , , )

Apply the following patch to make that work:

diff -r 758d43b9f732 ref435.py
--- a/ref435.py Fri May 03 18:59:32 2013 -0700
+++ b/ref435.py Sun May 05 09:23:25 2013 +1000
@@ -116,7 +116,11 @@
             if bases[-1] is Enum:
                 obj_type = bases[0]
             else:
-                obj_type = bases[-1].__mro__[1] # e.g. (IntEnum, int, Enum, object)
+                for base in bases[-1].__mro__:
+                    if not issubclass(base, Enum):
+                        obj_type = base
+                        break
+
         else:
             obj_type = object
         # save enum items into separate mapping so they don't get baked into

My auto-enum implementation (using the above patch; without it you can get essentially the same results with class AutoIntEnum(int, Enum, metaclass=auto_enum)):
class auto_enum(type(Enum)): def __new__(metacls, cls, bases, classdict): temp = type(classdict)() names = set(classdict._enum_names) i = 0 for k in classdict._enum_names: v = classdict[k] if v is Ellipsis: v = i else: i = v i += 1 temp[k] = v for k, v in classdict.items(): if k not in names: temp[k] = v return super(auto_enum, metacls).__new__(metacls, cls, bases, temp) class AutoNumberedEnum(Enum, metaclass=auto_enum): pass class AutoIntEnum(IntEnum, metaclass=auto_enum): pass class TestAutoNumber(AutoNumberedEnum): a = ... b = 3 c = ... class TestAutoInt(AutoIntEnum): a = ... b = 3 c = ... print(TestAutoNumber, list(TestAutoNumber)) print(TestAutoInt, list(TestAutoInt)) ---------- Run ---------- [, , ] [, , ] Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun May 5 01:52:12 2013 From: guido at python.org (Guido van Rossum) Date: Sat, 4 May 2013 16:52:12 -0700 Subject: [Python-Dev] enum discussion: can someone please summarize open issues? In-Reply-To: <51858E85.6080401@hastings.org> References: <517D7944.4050107@stoneleaf.us> <517DB2C8.9090002@pearwood.info> <517DBEEF.6060904@stoneleaf.us> <517DD273.3020008@pearwood.info> <517DDF23.3020101@stoneleaf.us> <20130430230827.062515ff@anarchist> <5180C1E9.4030100@g.nevcal.com> <20130501084755.04f44a4f@anarchist> <5181A8C3.9070702@canterbury.ac.nz> <20130502075813.579f24c0@anarchist> <5182FA0A.1040802@canterbury.ac.nz> <5182FD73.9090906@stoneleaf.us> <20130502185219.0d5d0b92@anarchist> <5183BD03.9040108@canterbury.ac.nz> <51844505.80807@canterbury.ac.nz> <20130504012218.37f8470e@fsol> <518514A8.5010604@trueblade.com> <51858E85.6080401@hastings.org> Message-ID: Hm. Trailing underscores look *really* weird to me. On Sat, May 4, 2013 at 3:41 PM, Larry Hastings wrote: > On 05/04/2013 07:01 AM, Eric V. 
Smith wrote: > > On 5/4/2013 2:42 AM, Nick Coghlan wrote: > > I'm now -1 on my own as_dict() suggestion, due to the general name > clash problem for arbitrary enums. > > To avoid the name collision, namedtuple calls this _asdict(). > > > Although I recall Raymond told me he should have called it asdict_(), and > reserved all identifiers with trailing underscores. > > > /arry > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) From eliben at gmail.com Sun May 5 02:49:56 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 4 May 2013 17:49:56 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: References: Message-ID: On Sat, May 4, 2013 at 4:27 PM, Tim Delaney wrote: > Typo line 171: > > Fixed, thanks. > One thing I'd like to be clear in the PEP about is whether enum_type and > _EnumDict._enum_names should be documented, or whether they're considered > implementation details. > > No, they should not. Not only are they implementation details, they are details of the *reference implementation*, not the actual stdlib module. The reference implementation will naturally serve as a basis for the stdlib module, but it still has to undergo a review in which implementation details can change. Note that usually we do not document implementation details of stdlib modules, but this doesn't prevent some people from using them if they really want to. > I'd like to make a subclass of Enum that accepts ... for auto-valued enums > but that requires subclassing the metaclass and access to > classdict._enum_names. I can get to enum_type via type(Enum), but > _EnumDict._enum_names requires knowing the attribute. 
It would sufficient > for my purposes if it was just documented that the passed classdict had a > _enum_names attribute. > > In testing the below, I've also discovered a bug in the reference > implementation - currently it will not handle an __mro__ like: > > Thanks! Tim - did you sign the contributor CLA for Python? Since the reference implementation is aimed for becoming the stdlib enum eventually, we'd probably need you to sign that before we can accept patches from you. Eli > (, , , , > ) > > Apply the following patch to make that work: > > diff -r 758d43b9f732 ref435.py > --- a/ref435.py Fri May 03 18:59:32 2013 -0700 > +++ b/ref435.py Sun May 05 09:23:25 2013 +1000 > @@ -116,7 +116,11 @@ > if bases[-1] is Enum: > obj_type = bases[0] > else: > - obj_type = bases[-1].__mro__[1] # e.g. (IntEnum, int, > Enum, object) > + for base in bases[-1].__mro__: > + if not issubclass(base, Enum): > + obj_type = base > + break > + > else: > obj_type = object > # save enum items into separate mapping so they don't get baked > into > > My auto-enum implementation (using the above patch - without it you can > get the essentially the same results with class AutoIntEnum(int, Enum, > metaclass=auto_enum). > > class auto_enum(type(Enum)): > def __new__(metacls, cls, bases, classdict): > temp = type(classdict)() > names = set(classdict._enum_names) > i = 0 > > for k in classdict._enum_names: > v = classdict[k] > > if v is Ellipsis: > v = i > else: > i = v > > i += 1 > temp[k] = v > > for k, v in classdict.items(): > if k not in names: > temp[k] = v > > return super(auto_enum, metacls).__new__(metacls, cls, bases, temp) > > class AutoNumberedEnum(Enum, metaclass=auto_enum): > pass > > class AutoIntEnum(IntEnum, metaclass=auto_enum): > pass > > class TestAutoNumber(AutoNumberedEnum): > a = ... > b = 3 > c = ... > > class TestAutoInt(AutoIntEnum): > a = ... > b = 3 > c = ... 
> > print(TestAutoNumber, list(TestAutoNumber)) > print(TestAutoInt, list(TestAutoInt)) > > ---------- Run ---------- > [, , > ] > [, , > ] > > Tim Delaney > -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.c.delaney at gmail.com Sun May 5 03:22:01 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Sun, 5 May 2013 11:22:01 +1000 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: References: Message-ID: On 5 May 2013 10:49, Eli Bendersky wrote: > > On Sat, May 4, 2013 at 4:27 PM, Tim Delaney wrote: > >> Typo line 171: >> >> > Fixed, thanks. > > > >> One thing I'd like to be clear in the PEP about is whether enum_type and >> _EnumDict._enum_names should be documented, or whether they're considered >> implementation details. >> >> > No, they should not. Not only are they implementation details, they are > details of the *reference implementation*, not the actual stdlib module. > The reference implementation will naturally serve as a basis for the stdlib > module, but it still has to undergo a review in which implementation > details can change. Note that usually we do not document implementation > details of stdlib modules, but this doesn't prevent some people from using > them if they really want to. > I think it would be useful to have some guaranteed method for a sub-metaclass to get the list of enum keys before calling the base class __new__. Not being able to do so removes a large number of possible extensions (like auto-numbering). > In testing the below, I've also discovered a bug in the reference >> implementation - currently it will not handle an __mro__ like: >> > > Thanks! Tim - did you sign the contributor CLA for Python? Since the > reference implementation is aimed for becoming the stdlib enum eventually, > we'd probably need you to sign that before we can accept patches from you. > I have now (just waiting on the confirmation email). 
Haven't submitted a patch since the CLAs were started ... Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.c.delaney at gmail.com Sun May 5 03:23:00 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Sun, 5 May 2013 11:23:00 +1000 Subject: [Python-Dev] CLA link from bugs.python.org Message-ID: It appears there's no obvious link from bugs.python.org to the contributor agreement - you need to go via the unintuitive link Foundation -> Contribution Forms (and from what I've read, you're prompted when you add a patch to the tracker). I'd suggest that if the "Contributor Form Received" field is "No" in user details, there be a link to http://www.python.org/psf/contrib/. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.c.delaney at gmail.com Sun May 5 05:11:39 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Sun, 5 May 2013 13:11:39 +1000 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: References: Message-ID: On 5 May 2013 11:22, Tim Delaney wrote: > On 5 May 2013 10:49, Eli Bendersky wrote: > >> >> On Sat, May 4, 2013 at 4:27 PM, Tim Delaney wrote: >> >>> Typo line 171: >>> >>> >> Fixed, thanks. >> >> >> >>> One thing I'd like to be clear in the PEP about is whether enum_type and >>> _EnumDict._enum_names should be documented, or whether they're considered >>> implementation details. >>> >>> >> No, they should not. Not only are they implementation details, they are >> details of the *reference implementation*, not the actual stdlib module. >> The reference implementation will naturally serve as a basis for the stdlib >> module, but it still has to undergo a review in which implementation >> details can change. Note that usually we do not document implementation >> details of stdlib modules, but this doesn't prevent some people from using >> them if they really want to. 
>> > > I think it would be useful to have some guaranteed method for a > sub-metaclass to get the list of enum keys before calling the base class > __new__. Not being able to do so removes a large number of possible > extensions (like auto-numbering). > I've been able to achieve the auto-numbering without relying on the internal implementation at all (with a limitation), with a single change to enum_type.__new__. My previous patch was slightly wrong - fix below as well. All existing tests pass. BTW, for mix-ins it's required that they have __slots__ = () - might want to mention that in the PEP. diff -r 758d43b9f732 ref435.py --- a/ref435.py Fri May 03 18:59:32 2013 -0700 +++ b/ref435.py Sun May 05 13:10:11 2013 +1000 @@ -116,7 +116,17 @@ if bases[-1] is Enum: obj_type = bases[0] else: - obj_type = bases[-1].__mro__[1] # e.g. (IntEnum, int, Enum, object) + obj_type = None + + for base in bases: + for c in base.__mro__: + if not issubclass(c, Enum): + obj_type = c + break + + if obj_type is not None: + break + else: obj_type = object # save enum items into separate mapping so they don't get baked into @@ -142,6 +152,7 @@ if obj_type in (object, Enum): enum_item = object.__new__(enum_class) else: + value = obj_type.__new__(obj_type, value) enum_item = obj_type.__new__(enum_class, value) enum_item._value = value enum_item._name = e Implementation: class AutoInt(int): __slots__ = () # Required def __new__(cls, value): if value is Ellipsis: try: i = cls._auto_number except AttributeError: i = cls._auto_number = 0 else: i = cls._auto_number = value cls._auto_number += 1 return int.__new__(cls, i) class AutoIntEnum(AutoInt, IntEnum): pass class TestAutoIntEnum(AutoIntEnum): a = ... b = 3 c = ... print(TestAutoIntEnum, list(TestAutoIntEnum)) ---------- Run ---------- [, , ] The implementation is not quite as useful - there's no immediately-obvious way to have an auto-numbered enum that is not also an int enum e.g. 
if you define class AutoNumberedEnum(AutoInt, Enum) it's still an int subclass. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.c.delaney at gmail.com Sun May 5 05:22:36 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Sun, 5 May 2013 13:22:36 +1000 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: References: Message-ID: On 5 May 2013 13:11, Tim Delaney wrote: > @@ -142,6 +152,7 @@ > if obj_type in (object, Enum): > enum_item = object.__new__(enum_class) > else: > + value = obj_type.__new__(obj_type, value) > enum_item = obj_type.__new__(enum_class, value) > enum_item._value = value > enum_item._name = e > Bugger - this is wrong (it didn't feel right to me) - I'm sure it's only working for me by accident. Need to think of something better. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sun May 5 05:32:52 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 04 May 2013 20:32:52 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: References: Message-ID: <5185D2E4.80004@stoneleaf.us> On 05/04/2013 08:11 PM, Tim Delaney wrote: > > I've been able to achieve the auto-numbering without relying on the internal implementation at all (with a > limitation), with a single change to enum_type.__new__. My previous patch was slightly wrong - fix below as well. All > existing tests pass. BTW, for mix-ins it's required that they have __slots__ = () - might want to mention that in the PEP. What happens without `__slots__ = ()` ? 
-- ~Ethan~ From timothy.c.delaney at gmail.com Sun May 5 05:34:48 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Sun, 5 May 2013 13:34:48 +1000 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: <5185D2E4.80004@stoneleaf.us> References: <5185D2E4.80004@stoneleaf.us> Message-ID: On 5 May 2013 13:32, Ethan Furman wrote: > On 05/04/2013 08:11 PM, Tim Delaney wrote: > >> >> I've been able to achieve the auto-numbering without relying on the >> internal implementation at all (with a >> limitation), with a single change to enum_type.__new__. My previous patch >> was slightly wrong - fix below as well. All >> existing tests pass. BTW, for mix-ins it's required that they have >> __slots__ = () - might want to mention that in the PEP. >> > > What happens without `__slots__ = ()` ? > Traceback (most recent call last): File "D:\Development\ref435\ref435.py", line 311, in class AutoIntEnum(AutoInt, IntEnum): File "D:\Development\ref435\ref435.py", line 138, in __new__ enum_class = type.__new__(metacls, cls, bases, classdict) TypeError: multiple bases have instance lay-out conflict Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Sun May 5 05:35:45 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 4 May 2013 20:35:45 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: References: Message-ID: On Sat, May 4, 2013 at 8:22 PM, Tim Delaney wrote: > On 5 May 2013 13:11, Tim Delaney wrote: > >> @@ -142,6 +152,7 @@ >> if obj_type in (object, Enum): >> enum_item = object.__new__(enum_class) >> else: >> + value = obj_type.__new__(obj_type, value) >> enum_item = obj_type.__new__(enum_class, value) >> enum_item._value = value >> enum_item._name = e >> > > Bugger - this is wrong (it didn't feel right to me) - I'm sure it's only > working for me by accident. Need to think of something better. 
>
> Tim Delaney
>
>

Could you please split this off to a separate thread? I'd like to keep this
one for raising issues with the actual contents of the PEP and discussing
whether this version is good enough for pronouncement.

Thanks,
Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From timothy.c.delaney at gmail.com Sun May 5 05:50:25 2013
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Sun, 5 May 2013 13:50:25 +1000
Subject: [Python-Dev] PEP 435 - reference implementation discussion
Message-ID:

Split off from the PEP 435 - requesting pronouncement thread.

Think I've come up with a system that works for my auto-numbering case
without knowing the internals of enum_type. Patch passes all existing test
cases. The patch does two things:

1. Finds the first non-Enum class on the MRO of the new class and uses that
as the enum type.

2. Instead of directly setting the _name and _value of the enum_item, it
lets the Enum class do it via Enum.__init__(). Subclasses can override
this. This gives Enums a 2-phase construction just like other classes.

diff -r 758d43b9f732 ref435.py
--- a/ref435.py Fri May 03 18:59:32 2013 -0700
+++ b/ref435.py Sun May 05 13:43:56 2013 +1000
@@ -116,7 +116,17 @@
             if bases[-1] is Enum:
                 obj_type = bases[0]
             else:
-                obj_type = bases[-1].__mro__[1] # e.g. (IntEnum, int, Enum, object)
+                obj_type = None
+
+                for base in bases:
+                    for c in base.__mro__:
+                        if not issubclass(c, Enum):
+                            obj_type = c
+                            break
+
+                    if obj_type is not None:
+                        break
+
         else:
             obj_type = object
         # save enum items into separate mapping so they don't get baked into
@@ -143,8 +153,7 @@
                 enum_item = object.__new__(enum_class)
             else:
                 enum_item = obj_type.__new__(enum_class, value)
-            enum_item._value = value
-            enum_item._name = e
+            enum_item.__init__(e, value)
             enum_map[e] = enum_item
         enum_class.__aliases__ = aliases # non-unique enums names
         enum_class._enum_names = enum_names # enum names in definition order
@@ -232,6 +241,10 @@
                 return enum
         raise ValueError("%s is not a valid %s" % (value, cls.__name__))

+    def __init__(self, name, value):
+        self._name = name
+        self._value = value
+
     def __repr__(self):
         return "<%s.%s: %r>" % (self.__class__.__name__, self._name, self._value)

Auto-int implementation:

class AutoInt(int):
    __slots__ = ()

    def __new__(cls, value):
        if value is Ellipsis:
            try:
                i = cls._auto_number
            except AttributeError:
                i = cls._auto_number = 0

        else:
            i = cls._auto_number = value

        cls._auto_number += 1
        return int.__new__(cls, i)

class AutoIntEnum(AutoInt, IntEnum):
    def __init__(self, name, value):
        super(AutoIntEnum, self).__init__(name, int(self))

class TestAutoIntEnum(AutoIntEnum):
    a = ...
    b = 3
    c = ...

class TestAutoIntEnum2(AutoIntEnum):
    a = ...
    b = ...
    c = ...

print(TestAutoIntEnum, list(TestAutoIntEnum))
print(TestAutoIntEnum2, list(TestAutoIntEnum2))

---------- Run ----------
[, , ]
[, , ]

Tim Delaney
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From ezio.melotti at gmail.com Sun May 5 06:45:07 2013 From: ezio.melotti at gmail.com (Ezio Melotti) Date: Sun, 5 May 2013 07:45:07 +0300 Subject: [Python-Dev] CLA link from bugs.python.org In-Reply-To: References: Message-ID: Hi, On Sun, May 5, 2013 at 4:23 AM, Tim Delaney wrote: > It appears there's no obvious link from bugs.python.org to the contributor > agreement - you need to go via the unintuitive link Foundation -> > Contribution Forms (and from what I've read, you're prompted when you add a > patch to the tracker). > > I'd suggest that if the "Contributor Form Received" field is "No" in user > details, there be a link to http://www.python.org/psf/contrib/. > See http://psf.upfronthosting.co.za/roundup/meta/issue461. Best Regards, Ezio Melotti > Tim Delaney > From ethan at stoneleaf.us Sun May 5 08:17:19 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 04 May 2013 23:17:19 -0700 Subject: [Python-Dev] PEP 435 - reference implementation discussion In-Reply-To: <5185F54A.6060909@stoneleaf.us> References: <5185F54A.6060909@stoneleaf.us> Message-ID: <5185F96F.1000303@stoneleaf.us> On 05/04/2013 10:59 PM, Ethan Furman wrote: > On 05/04/2013 08:50 PM, Tim Delaney wrote: >> >> Think I've come up with a system that works for my auto-numbering case without knowing the internals of enum_type. Patch >> passes all existing test cases. The patch does two things: >> [snip] >> 2. Instead of directly setting the _name and _value of the enum_item, it lets the Enum class do it via Enum.__init__(). >> Subclasses can override this. This gives Enums a 2-phase construction just like other classes. > > Not sure I care for this. Enums are, at least in theory, immutable objects, and immutable objects don't call __init__. Okay, still thinking about `value`, but as far as `name` goes, it should not be passed -- it must be the same as it was in the class definition or we could end up with something like: --> class AreYouKiddingMe(WierdEnum): ... who = 1 ... what = 2 ... 
when = 3 ... where = 4 ... why = 5 --> list(AreYouKiddingMe) [ , , , , , ] and that's assuming we made more changes to support such insane behavior; otherwise it would just break. So no passing of `name`, it gets set in the metaclass. -- ~Ethan~ From v+python at g.nevcal.com Sun May 5 08:31:13 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sat, 04 May 2013 23:31:13 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 Message-ID: <5185FCB1.6030702@g.nevcal.com> So I have a class based on Nick's Named Values, that has been extended to propagate names into expressions, so that if you have named values 'x' and 'y', when you x + y, the result is a named value whose name is '(x + y)'. Seems pretty awkward to integrate this with Enum. Maybe I'm missing something. Here's carved down code with just one operator defined for brevity. The third item from each print statement should be the same, as far as I understand... but isn't. class NamedInt( int ): _count = 0 def __new__( cls, *args, **kwds ): name, *args = args if len( args ) == 0: args = [ cls._count ] cls._count += 1 self = super().__new__( cls, *args, **kwds ) self._name = name return self def __init__( self, *args, **kwds ): name, *args = args super().__init__() @property def __name__( self ): return self._name def __repr__( self ): # repr() is updated to include the name and type info return "{}({!r}, {})".format(type(self).__name__, self.__name__, super().__repr__()) def __str__( self ): # str() is unchanged, even if it relies on the repr() fallback base = super() base_str = base.__str__ if base_str.__objclass__ is object: return base.__repr__() return base_str() # for simplicity, we only define one operator that propagates expressions def __add__(self, other): temp = int( self ) + int( other ) if isinstance( self, NamedInt ) and isinstance( other, NamedInt ): return NamedInt( '({0} + {1})'.format(self.__name__, other.__name__), temp ) else: return temp x = NamedInt('the-x', 1 ) y = NamedInt('the-y', 
2 ) # demonstrate that NamedInt propagates the names into an expression syntax print( repr( x ), repr( y ), repr( x+y )) from ref435 import Enum # requires redundant names, but loses names in the expression class NEI( NamedInt, Enum ): x = NamedInt('the-x', 1 ) y = NamedInt('the-y', 2 ) print( repr( NEI( 1 )), repr( NEI( 2 )), repr( NEI(1) + NEI(2))) # looks redundant, and still loses the names in the expression class NEI2( NamedInt, Enum ): x = x y = y print( repr( NEI2( x )), repr( NEI2( x )), repr( NEI2(x) + NEI2(y))) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sun May 5 07:59:38 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 04 May 2013 22:59:38 -0700 Subject: [Python-Dev] PEP 435 - reference implementation discussion In-Reply-To: References: Message-ID: <5185F54A.6060909@stoneleaf.us> On 05/04/2013 08:50 PM, Tim Delaney wrote: > > Think I've come up with a system that works for my auto-numbering case without knowing the internals of enum_type. Patch > passes all existing test cases. The patch does two things: > > 1. Finds the first non-Enum class on the MRO of the new class and uses that as the enum type. This is good. :) > 2. Instead of directly setting the _name and _value of the enum_item, it lets the Enum class do it via Enum.__init__(). > Subclasses can override this. This gives Enums a 2-phase construction just like other classes. Not sure I care for this. Enums are, at least in theory, immutable objects, and immutable objects don't call __init__. Of course, practicality beats purity... I'll have to think about this some more. Fortunately, none of this has any bearing on the PEP itself. 
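
The __init__ question can be checked in isolation. What follows is only a
minimal sketch, not code from ref435: `Tagged` is a hypothetical class made
up for illustration. It shows that calling a class runs __new__ and then
__init__, while calling __new__ directly (the way the metaclass builds items
with obj_type.__new__(enum_class, value)) skips __init__ entirely. That would
explain why Tim's patch has to call enum_item.__init__(e, value) by hand.

```python
class Tagged(int):
    """Hypothetical int subclass that records which construction hooks ran."""

    def __new__(cls, value):
        self = int.__new__(cls, value)
        self.log = ["new"]          # int subclasses without __slots__ have a __dict__
        return self

    def __init__(self, value):
        self.log.append("init")


# Calling the class goes through type.__call__: __new__, then __init__.
a = Tagged(5)

# Calling __new__ directly creates the object but never runs __init__.
b = Tagged.__new__(Tagged, 5)

print(a.log)
print(b.log)
```

So "2-phase construction" for enum items is something the metaclass would
have to opt into explicitly; it does not happen just because the item's type
defines __init__.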
-- ~Ethan~ From v+python at g.nevcal.com Sun May 5 08:46:53 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sat, 04 May 2013 23:46:53 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <5185FCB1.6030702@g.nevcal.com> References: <5185FCB1.6030702@g.nevcal.com> Message-ID: <5186005D.7060409@g.nevcal.com> On 5/4/2013 11:31 PM, Glenn Linderman wrote: > So I have a class based on Nick's Named Values, that has been extended > to propagate names into expressions, so that if you have named values > 'x' and 'y', when you x + y, the result is a named value whose name > is '(x + y)'. > > Seems pretty awkward to integrate this with Enum. Maybe I'm missing > something. Here's carved down code with just one operator defined for > brevity. The third item from each print statement should be the same, > as far as I understand... but isn't. > > class NamedInt( int ): > _count = 0 > def __new__( cls, *args, **kwds ): > name, *args = args > if len( args ) == 0: > args = [ cls._count ] > cls._count += 1 > self = super().__new__( cls, *args, **kwds ) > self._name = name > return self > def __init__( self, *args, **kwds ): > name, *args = args > super().__init__() > @property > def __name__( self ): > return self._name > def __repr__( self ): > # repr() is updated to include the name and type info > return "{}({!r}, {})".format(type(self).__name__, > self.__name__, > super().__repr__()) > def __str__( self ): > # str() is unchanged, even if it relies on the repr() fallback > base = super() > base_str = base.__str__ > if base_str.__objclass__ is object: > return base.__repr__() > return base_str() > > # for simplicity, we only define one operator that propagates > expressions > def __add__(self, other): > temp = int( self ) + int( other ) > if isinstance( self, NamedInt ) and isinstance( other, NamedInt ): > return NamedInt( > '({0} + {1})'.format(self.__name__, other.__name__), > temp ) > else: > return temp > > x = NamedInt('the-x', 1 ) > y = 
NamedInt('the-y', 2 ) > # demonstrate that NamedInt propagates the names into an expression syntax > print( repr( x ), repr( y ), repr( x+y )) > > from ref435 import Enum > > # requires redundant names, but loses names in the expression > class NEI( NamedInt, Enum ): > x = NamedInt('the-x', 1 ) > y = NamedInt('the-y', 2 ) > > print( repr( NEI( 1 )), repr( NEI( 2 )), repr( NEI(1) + NEI(2))) > > # looks redundant, and still loses the names in the expression > class NEI2( NamedInt, Enum ): > x = x > y = y > > print( repr( NEI2( x )), repr( NEI2( x )), repr( NEI2(x) + NEI2(y))) I've tried some more variations, without success: print( repr( NEI( x )), repr( NEI( y )), repr( NEI( x ) + NEI( y ))) print( repr( NEI.x ), repr( NEI.y ), repr( NEI.x + NEI.y)) print( repr( NEI2.x ), repr( NEI2.y ), repr( NEI2.x + NEI2.y )) Somehow, the overloading is not finding the __add__ operator in the NamedInt class, when the NamedInt's are wrapped in enumerations. -------------- next part -------------- An HTML attachment was scrubbed... URL: From v+python at g.nevcal.com Sun May 5 09:10:21 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sun, 05 May 2013 00:10:21 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <5186005D.7060409@g.nevcal.com> References: <5185FCB1.6030702@g.nevcal.com> <5186005D.7060409@g.nevcal.com> Message-ID: <518605DD.1010404@g.nevcal.com> On 5/4/2013 11:46 PM, Glenn Linderman wrote: > Somehow, the overloading is not finding the __add__ operator in the > NamedInt class, when the NamedInt's are wrapped in enumerations. And I guess I figured it out... NamedInt needs to test issubclass( type( self ), NamedInt ) rather than isinstance( self, NamedInt ) and likewise for other. Sorry for the noise, and I finally figured out what issubclass is for :) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From v+python at g.nevcal.com Sun May 5 09:16:56 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sun, 05 May 2013 00:16:56 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <518605DD.1010404@g.nevcal.com> References: <5185FCB1.6030702@g.nevcal.com> <5186005D.7060409@g.nevcal.com> <518605DD.1010404@g.nevcal.com> Message-ID: <51860768.1070402@g.nevcal.com> On 5/5/2013 12:10 AM, Glenn Linderman wrote: > On 5/4/2013 11:46 PM, Glenn Linderman wrote: >> Somehow, the overloading is not finding the __add__ operator in the >> NamedInt class, when the NamedInt's are wrapped in enumerations. > > And I guess I figured it out... NamedInt needs to test > > issubclass( type( self ), NamedInt ) > > rather than > > isinstance( self, NamedInt ) > > and likewise for other. Sorry for the noise, and I finally figured > out what issubclass is for :) Sorry, it is getting late here... issubclass was not the cure I thought it might be. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sun May 5 09:21:22 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 05 May 2013 00:21:22 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <5185FCB1.6030702@g.nevcal.com> References: <5185FCB1.6030702@g.nevcal.com> Message-ID: <51860872.8020800@stoneleaf.us> On 05/04/2013 11:31 PM, Glenn Linderman wrote: > > x = NamedInt('the-x', 1 ) > y = NamedInt('the-y', 2 ) > # demonstrate that NamedInt propagates the names into an expression syntax > print( repr( x ), repr( y ), repr( x+y )) > > from ref435 import Enum > > # requires redundant names, but loses names in the expression > class NEI( NamedInt, Enum ): > x = NamedInt('the-x', 1 ) > y = NamedInt('the-y', 2 ) > > print( repr( NEI( 1 )), repr( NEI( 2 )), repr( NEI(1) + NEI(2))) Well, my first question would be why are you using named anything in an enumeration, where it's going to get another name? 
But setting that aside, if you --> print(NEI.x.__name__) 'x' not 'the-x'. Now let's look for the clues: class Enum... ... @StealthProperty def name(self): return self._name class NamedInt... ... def __name__(self): return self._name # look familiar? When NamedInt goes looking for _name, it finds the one on `x`, not the one on `x.value`. -- ~Ethan~ From v+python at g.nevcal.com Sun May 5 10:01:12 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sun, 05 May 2013 01:01:12 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <51860872.8020800@stoneleaf.us> References: <5185FCB1.6030702@g.nevcal.com> <51860872.8020800@stoneleaf.us> Message-ID: <518611C8.8070901@g.nevcal.com> On 5/5/2013 12:21 AM, Ethan Furman wrote: > On 05/04/2013 11:31 PM, Glenn Linderman wrote: >> >> x = NamedInt('the-x', 1 ) >> y = NamedInt('the-y', 2 ) >> # demonstrate that NamedInt propagates the names into an expression >> syntax >> print( repr( x ), repr( y ), repr( x+y )) >> >> from ref435 import Enum >> >> # requires redundant names, but loses names in the expression >> class NEI( NamedInt, Enum ): >> x = NamedInt('the-x', 1 ) >> y = NamedInt('the-y', 2 ) >> >> print( repr( NEI( 1 )), repr( NEI( 2 )), repr( NEI(1) + NEI(2))) > > Well, my first question would be why are you using named anything in > an enumeration, where it's going to get another name? :) It is a stepping stone, but consider it a stupid test case for now. > But setting that aside, if you > > --> print(NEI.x.__name__) > 'x' > > not 'the-x'. > > Now let's look for the clues: > > class Enum... > ... > @StealthProperty > def name(self): > return self._name > > class NamedInt... > ... > def __name__(self): > return self._name # look familiar? > > > When NamedInt goes looking for _name, it finds the one on `x`, not the > one on `x.value`. Indeed. But that isn't the problem of biggest concern. I changed NamedInt to use _intname instead of _name, and it didn't cure the bigger problem. 
The bigger problem is that the arithmetic on enumeration items, which seems like it should be inherited from NamedInt (and seems to be, because the third value from each print is a NamedInt), doesn't pick up "x" or "y", nor does it pick up "the-x" or "the-y", but rather, it somehow picks up the str of the value. The third item from each print should be the same in all print statements, but the first, that deals with the NamedInt directly, works, and the others, that are wrapped in Enum, do not. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun May 5 10:08:36 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 5 May 2013 18:08:36 +1000 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <51860872.8020800@stoneleaf.us> References: <5185FCB1.6030702@g.nevcal.com> <51860872.8020800@stoneleaf.us> Message-ID: On Sun, May 5, 2013 at 5:21 PM, Ethan Furman wrote: > When NamedInt goes looking for _name, it finds the one on `x`, not the one > on `x.value`. There's also the code in enum_type.__call__ that ensures Enum.__repr__ and Enum.__str__ are used in preference to those from the value type. (Specifically, the code at https://bitbucket.org/stoneleaf/ref435/src/758d43b9f7327cd61dc2e45050539b6b5db1c4e3/ref435.py?at=default#cl-152 that ignores __repr__ and __str__ from non-Enum types) I think this needs to be documented more clearly - if you want to keep a custom __repr__ or __str__ when mixing Enum (or an Enum subclass) with another type, then you need to explicitly set them in your subclass. (e.g. in Glenn's case, setting "__repr__ = NamedValue.__repr__") I'm OK with magic to get the kind of enum behaviour we want, but I'm not OK with *black* magic that we don't explain. There should be an advanced section in the enum docs which explains these edge cases in the way the enum metaclass interacts with the normal class machinery. 
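
As a rough illustration of that advice, here is a sketch written against the
`enum` module that later shipped in the standard library. That is an
assumption on my part: ref435 as discussed in this thread may behave
differently. `NamedValue`, `Plain` and `Explicit` are hypothetical names. The
subclass restates __repr__ with a small delegating method rather than a bare
alias, since I am not certain every version keeps an alias that is the very
same function object as the mixed-in type's __repr__.

```python
from enum import Enum


class NamedValue(int):
    """Hypothetical value type with its own repr."""

    def __repr__(self):
        return "NamedValue({})".format(int(self))


class Plain(NamedValue, Enum):
    # No __repr__ here, so the Enum-style repr wins over NamedValue's.
    x = 1


class Explicit(NamedValue, Enum):
    x = 1

    # Restating __repr__ in the enum class body keeps the custom repr.
    def __repr__(self):
        return NamedValue.__repr__(self)


print(repr(Plain.x))
print(repr(Explicit.x))
```

The same pattern should apply to __str__ if the mixed-in type customises it.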
That said, I'm also fairly sure the current code is incorrect: I believe it does the wrong thing when an Enum subclass further customises __repr__, __str__ or __new__. The more reasonable logic to me seems to be to figure out the "first enum base" and the "first non-enum base" based on: enum_bases = [base for base in enum_class.mro() if issubclass(base, Enum)] non_enum_bases = [base for base in enum_class.mro() if not issubclass(base, Enum)] Then, if the current __new__, __str__ or __repr__ implementation is the same as that for the first non-enum base, we replace it with the impl from the first enum base. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sun May 5 12:05:30 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 5 May 2013 12:05:30 +0200 Subject: [Python-Dev] PEP 435 - requesting pronouncement References: Message-ID: <20130505120530.08b62855@fsol> On Sat, 4 May 2013 15:04:49 -0700 Eli Bendersky wrote: > Hello pydev, > > PEP 435 is ready for final review. A lot of the feedback from the last few > weeks of discussions has been incorporated. I still would like to see Nick's class-based API preferred over the functional API: class Season(Enum, members='spring summer autumn'): pass The PEP doesn't even mention it, even though you got significant pushback on the proposed _getframe() hack for pickling (including mentions that IronPython and Cython may not support it), and nobody seemed to be unhappy with the class-based proposal. Regards Antoine. From stefan_ml at behnel.de Sun May 5 12:19:31 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 05 May 2013 12:19:31 +0200 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: <20130505120530.08b62855@fsol> References: <20130505120530.08b62855@fsol> Message-ID: Antoine Pitrou, 05.05.2013 12:05: > On Sat, 4 May 2013 15:04:49 -0700 > Eli Bendersky wrote: >> PEP 435 is ready for final review. 
A lot of the feedback from the last few >> weeks of discussions has been incorporated. > > I still would like to see Nick's class-based API preferred over the > functional API: > > class Season(Enum, members='spring summer autumn'): > pass > > The PEP doesn't even mention it, even though you got significant > pushback on the proposed _getframe() hack for pickling (including > mentions that IronPython and Cython may not support it), and nobody > seemed to be unhappy with the class-based proposal. +1 Stefan From steve at pearwood.info Sun May 5 12:59:03 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 05 May 2013 20:59:03 +1000 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: <20130505120530.08b62855@fsol> References: <20130505120530.08b62855@fsol> Message-ID: <51863B77.3000500@pearwood.info> On 05/05/13 20:05, Antoine Pitrou wrote: > I still would like to see Nick's class-based API preferred over the > functional API: > > class Season(Enum, members='spring summer autumn'): > pass -1 As already mentioned, this is no substitute for the functional API as it is a statement, not an expression. As for pickling, the usual restrictions on pickling apply. It's not like the functional API creates new and unexpected restrictions. -- Steven From solipsis at pitrou.net Sun May 5 13:06:12 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 5 May 2013 13:06:12 +0200 Subject: [Python-Dev] PEP 435 - requesting pronouncement References: <20130505120530.08b62855@fsol> <51863B77.3000500@pearwood.info> Message-ID: <20130505130612.45841997@fsol> On Sun, 05 May 2013 20:59:03 +1000 Steven D'Aprano wrote: > On 05/05/13 20:05, Antoine Pitrou wrote: > > > I still would like to see Nick's class-based API preferred over the > > functional API: > > > > class Season(Enum, members='spring summer autumn'): > > pass > > -1 > > As already mentioned, this is no substitute for the functional API as it is a statement, not an expression. 
So, can you explain why it would make a difference? > As for pickling, the usual restrictions on pickling apply. No. I'm sure pickling classes normally works on Cython and IronPython, and with PEP 3154 pickling nested classes will also be supported. Regards Antoine. From timothy.c.delaney at gmail.com Sun May 5 13:58:55 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Sun, 5 May 2013 21:58:55 +1000 Subject: [Python-Dev] PEP 435 - reference implementation discussion In-Reply-To: <5185F96F.1000303@stoneleaf.us> References: <5185F54A.6060909@stoneleaf.us> <5185F96F.1000303@stoneleaf.us> Message-ID: On 5 May 2013 16:17, Ethan Furman wrote: > On 05/04/2013 10:59 PM, Ethan Furman wrote: > >> On 05/04/2013 08:50 PM, Tim Delaney wrote: >> >>> 2. Instead of directly setting the _name and _value of the enum_item, it >>> lets the Enum class do it via Enum.__init__(). >>> >> Subclasses can override this. This gives Enums a 2-phase construction >>> just like other classes. >>> >> >> Not sure I care for this. Enums are, at least in theory, immutable >> objects, and immutable objects don't call __init__. >> > > Okay, still thinking about `value`, but as far as `name` goes, it should > not be passed -- it must be the same as it was in the class definition > Agreed - name should not be passed. I would have preferred to use __new__, but Enum.__new__ doesn't get called at all from enum_type (and the implementation wouldn't be at all appropriate anyway). Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Sun May 5 15:07:55 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 05 May 2013 06:07:55 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <51860872.8020800@stoneleaf.us> References: <5185FCB1.6030702@g.nevcal.com> <51860872.8020800@stoneleaf.us> Message-ID: <518659AB.6090302@stoneleaf.us> class NEI( NamedInt, Enum ): x = NamedInt('the-x', 1 ) y = NamedInt('the-y', 2 ) @property def __name__(self): return self.value.__name__ From ethan at stoneleaf.us Sun May 5 15:44:03 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 05 May 2013 06:44:03 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: <20130505120530.08b62855@fsol> References: <20130505120530.08b62855@fsol> Message-ID: <51866223.6090003@stoneleaf.us> On 05/05/2013 03:05 AM, Antoine Pitrou wrote: > On Sat, 4 May 2013 15:04:49 -0700 > Eli Bendersky wrote: >> Hello pydev, >> >> PEP 435 is ready for final review. A lot of the feedback from the last few >> weeks of discussions has been incorporated. > > I still would like to see Nick's class-based API preferred over the > functional API: > > class Season(Enum, members='spring summer autumn'): > pass > > The PEP doesn't even mention it, even though you got significant > pushback on the proposed _getframe() hack for pickling (including > mentions that IronPython and Cython may not support it), and nobody > seemed to be unhappy with the class-based proposal. Agreed that the PEP should mention it. -1 on using it. We don't need two different ways to use class syntax. The functional interface is there for two reasons: - to easily create enums dynamically (fairly rare, I'm sure) - to easily create enums when prototyping or at the interactive prompt (I'll use it all the time -- it's convenient! 
;) -- ~Ethan~ From victor.stinner at gmail.com Sun May 5 15:51:10 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Sun, 5 May 2013 15:51:10 +0200 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: <20130505120530.08b62855@fsol> References: <20130505120530.08b62855@fsol> Message-ID: I'm unhappy with this API. I never used it. It is also more verbose than the functional API. Victor On Sunday, 5 May 2013, Antoine Pitrou wrote: > On Sat, 4 May 2013 15:04:49 -0700 > Eli Bendersky wrote: > > Hello pydev, > > > > PEP 435 is ready for final review. A lot of the feedback from the last few > > weeks of discussions has been incorporated. > > I still would like to see Nick's class-based API preferred over the > functional API: > > class Season(Enum, members='spring summer autumn'): > pass > > The PEP doesn't even mention it, even though you got significant > pushback on the proposed _getframe() hack for pickling (including > mentions that IronPython and Cython may not support it), and nobody > seemed to be unhappy with the class-based proposal. > > Regards > > Antoine. From eliben at gmail.com Sun May 5 16:09:14 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 5 May 2013 07:09:14 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: <20130505120530.08b62855@fsol> References: <20130505120530.08b62855@fsol> Message-ID: On Sun, May 5, 2013 at 3:05 AM, Antoine Pitrou wrote: > On Sat, 4 May 2013 15:04:49 -0700 > Eli Bendersky wrote: > > Hello pydev, > > > > PEP 435 is ready for final review. A lot of the feedback from the last few > > weeks of discussions has been incorporated.
> > I still would like to see Nick's class-based API preferred over the > functional API: > > class Season(Enum, members='spring summer autumn'): > pass > > The PEP doesn't even mention it, even though you got significant > pushback on the proposed _getframe() hack for pickling (including > mentions that IronPython and Cython may not support it), Plenty of points were raised against having this members= API. People argued ardently both ways. Guido publicly asked to decide in favor of the functional API, and we added an explicit warning about pickling (which was lifted from the docs of pickle itself). If you feel this has to be discussed further, please open a new thread. I don't want another 100 bikeshedding emails to go into this one. We'll mention this as a considered alternative in the PEP, though. > and nobody > seemed to be unhappy with the class-based proposal. > > Not true, as you see. Eli > Regards > > Antoine. From ethan at stoneleaf.us Sun May 5 16:49:44 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 05 May 2013 07:49:44 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: References: <5185FCB1.6030702@g.nevcal.com> <51860872.8020800@stoneleaf.us> Message-ID: <51867188.7070808@stoneleaf.us> On 05/05/2013 01:08 AM, Nick Coghlan wrote: > On Sun, May 5, 2013 at 5:21 PM, Ethan Furman wrote: > > There's also the code in enum_type.__call__ that ensures Enum.__repr__ > and Enum.__str__ are used in preference to those from the value type.
> (Specifically, the code at > https://bitbucket.org/stoneleaf/ref435/src/758d43b9f7327cd61dc2e45050539b6b5db1c4e3/ref435.py?at=default#cl-152 > that ignores __repr__ and __str__ from non-Enum types) > > I think this needs to be documented more clearly - if you want to keep > a custom __repr__ or __str__ when mixing Enum (or an Enum subclass) > with another type, then you need to explicitly set them in your > subclass. (e.g. in Glenn's case, setting "__repr__ = > NamedValue.__repr__") Certainly the docs need to be clear about this. I don't think the PEP needs to be. (Apologies if you were not referring to the PEP.) > The more reasonable logic to me seems to be to figure out the "first > enum base" and the "first non-enum base" based on: > > enum_bases = [base for base in enum_class.mro() if issubclass(base, Enum)] > non_enum_bases = [base for base in enum_class.mro() if not > issubclass(base, Enum)] > > Then, if the current __new__, __str__ or __repr__ implementation is > the same as that for the first non-enum base, we replace it with the > impl from the first enum base. Fair point -- working on it. 
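For readers following the thread, Nick's "first enum base / first non-enum base" idea quoted above could be sketched roughly as follows. This is only an illustration, not the ref435 code: the `Enum` here is a plain stand-in class rather than the real metaclass-driven one, `first_bases` and `pick_method` are hypothetical helper names, and skipping the class itself and `object` when walking the MRO is an assumption added so the example gives a useful answer.

```python
# Stand-in classes for illustration only: this "Enum" is a plain class,
# not the ref435 or stdlib Enum, so no metaclass is involved.
class Enum:
    def __repr__(self):
        return "<enum repr>"


class NamedInt(int):
    def __repr__(self):
        return "<NamedInt repr>"


class NEI(NamedInt, Enum):
    pass


class Custom(NamedInt, Enum):
    def __repr__(self):
        return "<custom repr>"


def first_bases(enum_class):
    """Return (first enum base, first non-enum base) found along the MRO."""
    mro = enum_class.mro()[1:]  # assumption: skip enum_class itself
    enum_bases = [b for b in mro if issubclass(b, Enum)]
    # assumption: ignore object so a real mixin type is preferred
    non_enum_bases = [b for b in mro
                      if not issubclass(b, Enum) and b is not object]
    return enum_bases[0], (non_enum_bases[0] if non_enum_bases else object)


def pick_method(enum_class, name):
    """If the method in effect is the non-enum base's, use the enum base's."""
    enum_base, other_base = first_bases(enum_class)
    current = getattr(enum_class, name)
    if current is getattr(other_base, name):
        return getattr(enum_base, name)
    return current  # the subclass customised it, so leave it alone
```

With these stand-ins, `pick_method(NEI, '__repr__')` selects the enum base's `__repr__`, while `pick_method(Custom, '__repr__')` keeps the subclass's own override, which is the case Nick believed the existing code handled incorrectly.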
-- ~Ethan~ From pjenvey at underboss.org Sun May 5 18:50:14 2013 From: pjenvey at underboss.org (Philip Jenvey) Date: Sun, 5 May 2013 09:50:14 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: <51866223.6090003@stoneleaf.us> References: <20130505120530.08b62855@fsol> <51866223.6090003@stoneleaf.us> Message-ID: <7BD3D41D-E858-4388-83FF-FC6825FADEF8@underboss.org> On May 5, 2013, at 6:44 AM, Ethan Furman wrote: > On 05/05/2013 03:05 AM, Antoine Pitrou wrote: >> I still would like to see Nick's class-based API preferred over the >> functional API: >> >> class Season(Enum, members='spring summer autumn'): >> pass >> >> The PEP doesn't even mention it, even though you got significant >> pushback on the proposed _getframe() hack for pickling (including >> mentions that IronPython and Cython may not support it), and nobody >> seemed to be unhappy with the class-based proposal. +1 > > Agreed that the PEP should mention it. > > -1 on using it. > > We don't need two different ways to use class syntax. > > The functional interface is there for two reasons: > > - to easily create enums dynamically (fairly rare, I'm sure) > > - to easily create enums when prototyping or at the interactive prompt (I'll use it all the time -- it's convenient! ;) I don't understand, the class based API is perfectly fine for prototyping in the repl. For dynamic creation, the class API always provides a functional API for free: import types types.new_class('Season', (Enum,), dict(values='spring summer autumn')) It's not convenient, but that doesn't matter because this usage is rare anyway. Certainly much rarer than declarations of auto-numbered enums. 
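As a rough illustration of Philip's point about `types.new_class`, here is a minimal sketch written against the `enum` module as it later shipped in the standard library, not against ref435. The `members=` / `values=` class keyword from the proposal does not exist there, so this sketch fills the class namespace through `exec_body` instead; `make_enum` is a hypothetical helper name, and numbering members from 1 is an assumption.

```python
import types
from enum import Enum  # stdlib enum, not the ref435 module


def make_enum(name, member_names):
    """Build an Enum subclass at runtime, numbering members from 1."""
    def fill(namespace):
        # Assign through item access so the enum's special namespace
        # object sees each member as it is defined.
        for value, member in enumerate(member_names.split(), start=1):
            namespace[member] = value
    return types.new_class(name, (Enum,), exec_body=fill)


Season = make_enum('Season', 'spring summer autumn')
print(list(Season))
```

As Philip says, this is not convenient to write, but it shows that class syntax and dynamic creation are not mutually exclusive.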
-- Philip Jenvey From cf.natali at gmail.com Sun May 5 19:07:55 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Sun, 5 May 2013 19:07:55 +0200 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: References: Message-ID: I'm chiming in late, but am I the only one who's really bothered by the syntax? class Color(Enum): red = 1 green = 2 blue = 3 I really don't see why one has to provide values, since an enum constant *is* the value. In many cases, there's no natural mapping between an enum constant and a value, e.g. there's no reason why Color.red should be mapped to 1 and Color.blue to 3. Furthermore, the PEP makes it to possible to do something like: class Color(Enum): red = 1 green = 2 blue = 3 red_alias = 1 which is IMO really confusing, since enum instances are supposed to be distinct. All the languages I can think of that support explicit values (Java being particular in the sense that it's really a full-fledge object which can have attributes, methods, etc) make it optional by default. Finally, I think 99% of users won't care about the assigned value (which is just an implementation detail), so explicit value will be just noise annoying users (well, me at least :-). cf 2013/5/5 Eli Bendersky : > Hello pydev, > > PEP 435 is ready for final review. A lot of the feedback from the last few > weeks of discussions has been incorporated. Naturally, not everything could > go in because some minor (mostly preference-based) issues did not reach a > consensus. We do feel, however, that the end result is better than in the > beginning and that Python can finally have a useful enumeration type in the > standard library. > > I'm attaching the latest version of the PEP for convenience. 
If you've read > previous versions, the easiest way to get acquainted with the recent changes > is to go through the revision log at http://hg.python.org/peps > > A reference implementation for PEP 435 is available at > https://bitbucket.org/stoneleaf/ref435 > > Kind regards and happy weekend. > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/cf.natali%40gmail.com > From p.f.moore at gmail.com Sun May 5 19:10:08 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 5 May 2013 18:10:08 +0100 Subject: [Python-Dev] PEP 379 Python launcher for Windows - behaviour for #!/usr/bin/env python line is wrong In-Reply-To: References: Message-ID: On 4 May 2013 16:42, Vinay Sajip wrote: > I've taken a quick look at it, but I probably won't be able to make any > changes until the near the end of the coming week. Feel free to have a go; > OK, I have a patch against the standalone pylauncher repo at https://bitbucket.org/pmoore/pylauncher. I'm not sure what the best approach is - I didn't want to patch the python core version directly (a) because I wouldn't be able to test it easily, and (b) because I'd want a standalone version anyway until 3.4 comes out. BTW, the tests for pylauncher fail for me on the unpatched version, so all I can say is that the patched version fails the same way and my manual tests worked as expected... I can rework it against cpython if needed. Paul. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun May 5 19:35:36 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 5 May 2013 10:35:36 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: References: Message-ID: This has all long been hashed out, and I've pronounced on this already. 
I'm sorry you weren't there for the bikeshedding, but nothing you say here is new and it was all considered carefully. On Sun, May 5, 2013 at 10:07 AM, Charles-François Natali wrote: > I'm chiming in late, but am I the only one who's really bothered by the syntax? > > class Color(Enum): > red = 1 > green = 2 > blue = 3 > > I really don't see why one has to provide values, since an enum > constant *is* the value. > In many cases, there's no natural mapping between an enum constant and > a value, e.g. there's no reason why Color.red should be mapped to 1 > and Color.blue to 3. > > Furthermore, the PEP makes it to possible to do something like: > > class Color(Enum): > red = 1 > green = 2 > blue = 3 > red_alias = 1 > > > which is IMO really confusing, since enum instances are supposed to be distinct. > > All the languages I can think of that support explicit values (Java > being particular in the sense that it's really a full-fledge object > which can have attributes, methods, etc) make it optional by default. > > Finally, I think 99% of users won't care about the assigned value > (which is just an implementation detail), so explicit value will be > just noise annoying users (well, me at least :-). > > cf > > > > 2013/5/5 Eli Bendersky : >> Hello pydev, >> >> PEP 435 is ready for final review. A lot of the feedback from the last few >> weeks of discussions has been incorporated. Naturally, not everything could >> go in because some minor (mostly preference-based) issues did not reach a >> consensus. We do feel, however, that the end result is better than in the >> beginning and that Python can finally have a useful enumeration type in the >> standard library. >> >> I'm attaching the latest version of the PEP for convenience.
If you've read >> previous versions, the easiest way to get acquainted with the recent changes >> is to go through the revision log at http://hg.python.org/peps >> >> A reference implementation for PEP 435 is available at >> https://bitbucket.org/stoneleaf/ref435 >> >> Kind regards and happy weekend. >> >> >> >> >> >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/cf.natali%40gmail.com >> > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From guido at python.org Sun May 5 19:40:40 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 5 May 2013 10:40:40 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: <20130505120530.08b62855@fsol> References: <20130505120530.08b62855@fsol> Message-ID: On Sun, May 5, 2013 at 3:05 AM, Antoine Pitrou wrote: > I still would like to see Nick's class-based API preferred over the > functional API: > > class Season(Enum, members='spring summer autumn'): > pass > > The PEP doesn't even mention it, even though you got significant > pushback on the proposed _getframe() hack for pickling (including > mentions that IronPython and Cython may not support it), and nobody > seemed to be unhappy with the class-based proposal. This particular bikeshed has sailed. I heard all the feedback, took into account my own thoughts, and have decided that we should go ahead with this syntax and the _getframe() hack. If the _getframe() hack doesn't work on a given platform, the __module__ attribute is not set correctly, so pickling will fail, but everything else will work. 
We can work on a PEP to replace the _getframe() hack separately; I think it's functionality that is useful beyond Enum() and namedtuple(), and can be implemented on all platforms with something a lot less general than _getframe(). Authors can also avoid the _getframe() hack in two ways: (a) use the full class definition; (b) specify the full dotted class name in the call. (We should modify namedtuple() to support this too BTW.) -- --Guido van Rossum (python.org/~guido) From p.f.moore at gmail.com Sun May 5 19:41:59 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 5 May 2013 18:41:59 +0100 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: References: Message-ID: OK, I thought I'd take a look. I have never particularly needed enums in real life, so I'm reading the PEP from the POV of a naive user who is just thinking "hey, neat, Python got enums, let's see how they work". I have been skimming the discussions and my head has been exploding with the complexity, so I admit I was very, very scared that the PEP might be equally daunting. First, the good news - from the POV described above, the PEP is both readable and intuitive. Nice job, guys! Now the problems I had: 1. Having to enter the values is annoying. Sorry, I read the rationale and all that, and I *still* want to write a C-Like enum { A, B, C }. I fully expect to edit and reorder enums (if I ever use them) and get irritated with having to update the value assignments. 2. Enums are not orderable by default. Yuk. I doubt I'll care about this often (iteration is more important) but when I do, I'll be annoyed. 3. This is just a thought, but I suspect that IntEnums iterating in definition order but ordering by value could trip people up and cause hard to diagnose bugs. 4. I'll either use the functional form all the time (because I don't have to specify values) or never (because it's ugly as sin). I can't work out which aspect will win yet. And one omission that struck me. 
There's no mention of the common case of bitmap enums. class Example(Enum): a = 1 b = 2 c = 4 Do I need to use an IntEnum (given the various warnings in the PEP about how "most people won't need it") if I want to be able to do things like flags = "Example.a | Example.c"? I think there should at least be an extended example in the PEP covering a bitmap enum case. (And certainly the final documentation should include a cookbook-style example of bitmap enums). Summary - good job, I like the PEP a lot. But Python's enums are very unlike those of other languages, and I suspect that's going to be more of an issue than you'd hope... Paul. On 4 May 2013 23:04, Eli Bendersky wrote: > Hello pydev, > > PEP 435 is ready for final review. A lot of the feedback from the last few > weeks of discussions has been incorporated. Naturally, not everything could > go in because some minor (mostly preference-based) issues did not reach a > consensus. We do feel, however, that the end result is better than in the > beginning and that Python can finally have a useful enumeration type in the > standard library. > > I'm attaching the latest version of the PEP for convenience. If you've > read previous versions, the easiest way to get acquainted with the recent > changes is to go through the revision log at http://hg.python.org/peps > > A reference implementation for PEP 435 is available at > https://bitbucket.org/stoneleaf/ref435 > > Kind regards and happy weekend. > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/p.f.moore%40gmail.com > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From steve at pearwood.info Sun May 5 19:43:09 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 06 May 2013 03:43:09 +1000 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: References: Message-ID: <51869A2D.7060801@pearwood.info> On 06/05/13 03:07, Charles-François Natali wrote: > I'm chiming in late, but am I the only one who's really bothered by the syntax? > > class Color(Enum): > red = 1 > green = 2 > blue = 3 > > I really don't see why one has to provide values, since an enum > constant *is* the value. > In many cases, there's no natural mapping between an enum constant and > a value, e.g. there's no reason why Color.red should be mapped to 1 > and Color.blue to 3. The functional API provides a way to conveniently create enums without caring what value they get. Other than that, the PEP explains that there was an early proposal to declare names without values: # rejected syntax class Color(Enum): red green blue but this was rejected for being too magical, and too confusing to those who aren't expecting it. > Furthermore, the PEP makes it to possible to do something like: > > class Color(Enum): > red = 1 > green = 2 > blue = 3 > red_alias = 1 > > > which is IMO really confusing, since enum instances are supposed to be distinct. Enums often have duplicate values, sometimes to provide aliases, sometimes to correct spelling errors, or to manage deprecated names, etc. class Color(Enum): red = 1 green = 2 # this is the preferred spelling blue = 3 gren = green # oops, do not remove, needed for backwards compatibility E.g.
I googled on "C enum" and the very first hit includes a duplicate value: http://msdn.microsoft.com/en-AU/library/whbyts4t%28v=vs.80%29.aspx And two examples from asm-generic/errno.h: #define EWOULDBLOCK EAGAIN /* Operation would block */ #define EDEADLOCK EDEADLK > All the languages I can think of that support explicit values (Java > being particular in the sense that it's really a full-fledge object > which can have attributes, methods, etc) make it optional by default. > > Finally, I think 99% of users won't care about the assigned value > (which is just an implementation detail), so explicit value will be > just noise annoying users (well, me at least :-). Compatibility with (e.g.) C enums is an important use-case for these, and in that case you likely will care about the actual value. -- Steven From solipsis at pitrou.net Sun May 5 19:46:03 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 5 May 2013 19:46:03 +0200 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: References: <20130505120530.08b62855@fsol> Message-ID: <20130505194603.142cbace@fsol> On Sun, 5 May 2013 07:09:14 -0700 Eli Bendersky wrote: > On Sun, May 5, 2013 at 3:05 AM, Antoine Pitrou wrote: > > > On Sat, 4 May 2013 15:04:49 -0700 > > Eli Bendersky wrote: > > > Hello pydev, > > > > > > PEP 435 is ready for final review. A lot of the feedback from the last > > few > > > weeks of discussions has been incorporated. > > > > I still would like to see Nick's class-based API preferred over the > > functional API: > > > > class Season(Enum, members='spring summer autumn'): > > pass > > > > The PEP doesn't even mention it, even though you got significant > > pushback on the proposed _getframe() hack for pickling (including > > mentions that IronPython and Cython may not support it), > > Plenty of points were raised against having this members= API. The main point seems to be "I don't like it". 
If you consider this a strong argument against the concrete issues with the functional API, then good for you. > Guido publicly asked to decide in favor of the > functional API, and we added an explicit warning about pickling (which was > lifted from the docs of pickle itself). This is not true. The pickling restrictions which have been raised are specifically caused by the functional syntax, something which your warning omits. > If you feel this has to be > discussed further, please open a new thread. I don't want another 100 > bikeshedding emails to go into this one. This is not bikeshedding since it addresses concrete functional issues. (but apparently you would very much like to sweep those issues under the carpet in the name of "bikeshedding") Regards Antoine. From guido at python.org Sun May 5 19:50:56 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 5 May 2013 10:50:56 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: <20130505194603.142cbace@fsol> References: <20130505120530.08b62855@fsol> <20130505194603.142cbace@fsol> Message-ID: I am fine with adding more information about this issue to the PEP. I am not fine with reopening the issue. I really, really, really have looked at it from all sides and the current design of the functional API has my full blessing. On Sun, May 5, 2013 at 10:46 AM, Antoine Pitrou wrote: > On Sun, 5 May 2013 07:09:14 -0700 > Eli Bendersky wrote: >> On Sun, May 5, 2013 at 3:05 AM, Antoine Pitrou wrote: >> >> > On Sat, 4 May 2013 15:04:49 -0700 >> > Eli Bendersky wrote: >> > > Hello pydev, >> > > >> > > PEP 435 is ready for final review. A lot of the feedback from the last >> > few >> > > weeks of discussions has been incorporated. 
>> > >> > I still would like to see Nick's class-based API preferred over the >> > functional API: >> > >> > class Season(Enum, members='spring summer autumn'): >> > pass >> > >> > The PEP doesn't even mention it, even though you got significant >> > pushback on the proposed _getframe() hack for pickling (including >> > mentions that IronPython and Cython may not support it), >> >> Plenty of points were raised against having this members= API. > > The main point seems to be "I don't like it". If you consider this a > strong argument against the concrete issues with the functional API, > then good for you. > >> Guido publicly asked to decide in favor of the >> functional API, and we added an explicit warning about pickling (which was >> lifted from the docs of pickle itself). > > This is not true. The pickling restrictions which have been raised are > specifically caused by the functional syntax, something which your > warning omits. > >> If you feel this has to be >> discussed further, please open a new thread. I don't want another 100 >> bikeshedding emails to go into this one. > > This is not bikeshedding since it addresses concrete functional issues. > (but apparently you would very much like to sweep those issues under > the carpet in the name of "bikeshedding") > > Regards > > Antoine. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From guido at python.org Sun May 5 19:49:22 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 5 May 2013 10:49:22 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: References: Message-ID: On Sun, May 5, 2013 at 10:41 AM, Paul Moore wrote: > OK, I thought I'd take a look. 
I have never particularly needed enums in > real life, so I'm reading the PEP from the POV of a naive user who is just > thinking "hey, neat, Python got enums, let's see how they work". I have been > skimming the discussions and my head has been exploding with the complexity, > so I admit I was very, very scared that the PEP might be equally daunting. > > First, the good news - from the POV described above, the PEP is both > readable and intuitive. Nice job, guys! > > Now the problems I had: > > 1. Having to enter the values is annoying. Sorry, I read the rationale and > all that, and I *still* want to write a C-Like enum { A, B, C }. I fully > expect to edit and reorder enums (if I ever use them) and get irritated with > having to update the value assignments. I guess there are cultural differences around this. Anyway, you can use the functional/convenience API for this purpose. > 2. Enums are not orderable by default. Yuk. I doubt I'll care about this > often (iteration is more important) but when I do, I'll be annoyed. I personally agree with you, but not strongly enough to override Barry and Eli who seem to be strongly for unordered enums. > 3. This is just a thought, but I suspect that IntEnums iterating in > definition order but ordering by value could trip people up and cause hard > to diagnose bugs. This is somewhat in conflict with your #1. :-) > 4. I'll either use the functional form all the time (because I don't have to > specify values) or never (because it's ugly as sin). I can't work out which > aspect will win yet. But not everybody will make the same choice. > And one omission that struck me. There's no mention of the common case of > bitmap enums. > > class Example(Enum): > a = 1 > b = 2 > c = 4 > > Do I need to use an IntEnum (given the various warnings in the PEP about how > "most people won't need it") if I want to be able to do things like flags = > "Example.a | Example.c"? 
I think there should at least be an extended > example in the PEP covering a bitmap enum case. (And certainly the final > documentation should include a cookbook-style example of bitmap enums). You'd have to use IntEnum. Plus, these are hardly enums -- they are a particularly obscure old school hack for representing sets of flags. (I liked Pascal's solution for this better -- it had a bit set data structure that supported sets of enums.) > Summary - good job, I like the PEP a lot. But Python's enums are very unlike > those of other languages, and I suspect that's going to be more of an issue > than you'd hope... We're pretty confident that we're doing about the best job possible given the constraints (one of which is getting this accepted into Python 3.4 without any of the participants incurring permanent brain damage). -- --Guido van Rossum (python.org/~guido) From p.f.moore at gmail.com Sun May 5 20:36:17 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Sun, 5 May 2013 19:36:17 +0100 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: References: Message-ID: On 5 May 2013 18:49, Guido van Rossum wrote: > > Summary - good job, I like the PEP a lot. But Python's enums are very > unlike > > those of other languages, and I suspect that's going to be more of an > issue > > than you'd hope... > > We're pretty confident that we're doing about the best job possible > given the constraints (one of which is getting this accepted into > Python 3.4 without any of the participants incurring permanent brain > damage). Agreed. My points were definitely minor ones (and addressed by your reply, thanks). Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sun May 5 22:09:50 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 05 May 2013 13:09:50 -0700 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes Message-ID: <5186BC8E.6030502@stoneleaf.us> On 05/05/2013 10:07 AM, ? 
wrote: > I'm chiming in late, but am I the only one who's really bothered by the syntax? > > class Color(Enum): > red = 1 > green = 2 > blue = 3 No, you are not the only one that's bothered by it. I tried it without assignments until I discovered that bugs are way too easy to introduce. The problem is a successful name lookup looks just like a name failure, but of course no error is raised and no new enum item is created: --> class Color(Enum): ... red, green, blue ... --> class MoreColor(Color): ... red, orange, yellow ... --> type(MoreColor.red) is MoreColor False --> MoreColor.orange # value should be 5 About the closest you're going to be able to get is something like: def e(_next=[1]): e, _next[0] = _next[0], _next[0] + 1 return e class Color(Enum): red = e() green = e() blue = e() and you can keep using `e()` for all your enumerations, since you don't care what actual value each enumeration member happens to get. -- ~Ethan~ From timothy.c.delaney at gmail.com Sun May 5 23:55:22 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Mon, 6 May 2013 07:55:22 +1000 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: <5186BC8E.6030502@stoneleaf.us> References: <5186BC8E.6030502@stoneleaf.us> Message-ID: On 6 May 2013 06:09, Ethan Furman wrote: > On 05/05/2013 10:07 AM, ? wrote: > I'm chiming in late, but am I the only > one who's really bothered by the syntax? > >> >> class Color(Enum): >> red = 1 >> green = 2 >> blue = 3 >> > > No, you are not the only one that's bothered by it. I tried it without > assignments until I discovered that bugs are way too easy to introduce. > The problem is a successful name lookup looks just like a name failure, > but of course no error is raised and no new enum item is created: > > --> class Color(Enum): > ... red, green, blue > ... > > --> class MoreColor(Color): > ... red, orange, yellow > ...
> > --> type(MoreColor.red) is MoreColor > False > > --> MoreColor.orange > # value should be 5 > Actually, my implementation at https://bitbucket.org/magao/enum (the one mentioned in the PEP) does detect MoreColor.red as a duplicate. It's possible to do it, but it's definitely black magic and also involves use of sys._getframe() for more than just getting module name. >>> from enum import Enum >>> class Color(Enum): ... red, green, blue ... >>> class MoreColor(Color): ... red, orange, yellow ... Traceback (most recent call last): File "", line 1, in File ".\enum.py", line 388, in __new__ raise AttributeError("Duplicate enum key '%s.%s' (overriding '%s')" % (result.__name__, v.key, k eys[v.key])) AttributeError: Duplicate enum key 'MoreColor.red' (overriding 'Color.red') >>> So long as I can get one of the requirements documented to implement an auto-number syntax I'll be happy enough with stdlib enums I think. class Color(AutoIntEnum): red = ... green = ... blue = ... Not as pretty, but ends up being less magical. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon May 6 00:00:20 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 5 May 2013 15:00:20 -0700 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: References: <5186BC8E.6030502@stoneleaf.us> Message-ID: On Sun, May 5, 2013 at 2:55 PM, Tim Delaney wrote: > So long as I can get one of the requirements documented to implement an > auto-number syntax I'll be happy enough with stdlib enums I think. Specifically what do you want the PEP to promise? 
-- --Guido van Rossum (python.org/~guido) From mcepl at redhat.com Mon May 6 00:01:56 2013 From: mcepl at redhat.com (Matej Cepl) Date: Sun, 5 May 2013 18:01:56 -0400 (EDT) Subject: [Python-Dev] Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial) In-Reply-To: References: <1362575394.23949.2.camel@wycliff.ceplovi.cz> <20130307100840.GA24941@wycliff.ceplovi.cz> Message-ID: <660198223.6059416.1367791316158.JavaMail.root@redhat.com> ----- Original Message ----- > From: "Armin Rigo" > To: "Matej Cepl" > Cc: python-dev at python.org > Sent: Saturday, May 4, 2013 11:59:42 AM > Subject: Re: [Python-Dev] Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial) > > Hi Matej, > > On Thu, Mar 7, 2013 at 11:08 AM, Matej Cepl wrote: > > if c is not ' ' and c is not ' ': > > if c != ' ' and c != ' ': > > Sorry for the delay in answering, but I just noticed what is wrong in > this "fix": it compares c with the same single-character ' ' twice, > whereas the original compared it with ' ' and with the two-character ' Comments on https://github.com/mcepl/html2text/commit/f511f3c78e60d7734d677f8945580f52ef7ef742#L0R765 (perhaps in https://github.com/aaronsw/html2text/pull/77) are more than welcome. When using SPACE_RE = re.compile(r'\s\+') for checking, whole onlywhite function is not needed anymore (and it still made me wonder what Aaron meant when he wrote it). Why line.isspace() doesn't work is weird though. 
Best, Matěj From python at mrabarnett.plus.com Mon May 6 00:20:01 2013 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 05 May 2013 23:20:01 +0100 Subject: [Python-Dev] Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial) In-Reply-To: <660198223.6059416.1367791316158.JavaMail.root@redhat.com> References: <1362575394.23949.2.camel@wycliff.ceplovi.cz> <20130307100840.GA24941@wycliff.ceplovi.cz> <660198223.6059416.1367791316158.JavaMail.root@redhat.com> Message-ID: <5186DB11.7000504@mrabarnett.plus.com> On 05/05/2013 23:01, Matej Cepl wrote: > ----- Original Message ----- >> From: "Armin Rigo" >> To: "Matej Cepl" >> Cc: python-dev at python.org >> Sent: Saturday, May 4, 2013 11:59:42 AM >> Subject: Re: [Python-Dev] Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial) >> >> Hi Matej, >> >> On Thu, Mar 7, 2013 at 11:08 AM, Matej Cepl wrote: >> > if c is not ' ' and c is not ' ': >> > if c != ' ' and c != ' ': >> >> Sorry for the delay in answering, but I just noticed what is wrong in >> this "fix": it compares c with the same single-character ' ' twice, >> whereas the original compared it with ' ' and with the two-character ' > > Comments on https://github.com/mcepl/html2text/commit/f511f3c78e60d7734d677f8945580f52ef7ef742#L0R765 (perhaps in https://github.com/aaronsw/html2text/pull/77) are more than welcome. When using > > SPACE_RE = re.compile(r'\s\+') > That will match a whitespace character followed by a '+'. > for checking, whole onlywhite function is not needed anymore (and it still made me wonder what Aaron meant when he wrote it). Why line.isspace() doesn't work is weird though. > What do you mean by "doesn't work"?
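MRAB's point about the pattern can be checked with a short sketch. The second pattern, `\s+`, is only an assumption about what was intended (one or more whitespace characters); the variable names here are illustrative and do not come from html2text.

```python
import re

# Pattern as written in the message: "\s" followed by an escaped, literal "+".
literal_plus = re.compile(r'\s\+')
# Assumed intent: one or more whitespace characters.
one_or_more = re.compile(r'\s+')

# The escaped form only matches when a real "+" follows the whitespace.
print(bool(literal_plus.search('a +b')))      # True
print(bool(literal_plus.search('a  b')))      # False

# The unescaped form matches a whitespace-only string in full.
print(bool(one_or_more.fullmatch(' \t\n')))   # True

# str.isspace() gives the same answer for non-empty strings,
# but returns False for the empty string.
print(' \t\n'.isspace())                      # True
print(''.isspace())                           # False
```

The empty-string case is one place where `isspace()` and a "contains only whitespace" check can disagree; whether that explains the behaviour reported in the thread is not established here.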
From eliben at gmail.com Mon May 6 00:27:36 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 5 May 2013 15:27:36 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: <20130505194603.142cbace@fsol> References: <20130505120530.08b62855@fsol> <20130505194603.142cbace@fsol> Message-ID: On Sun, May 5, 2013 at 10:46 AM, Antoine Pitrou wrote: > On Sun, 5 May 2013 07:09:14 -0700 > Eli Bendersky wrote: > > On Sun, May 5, 2013 at 3:05 AM, Antoine Pitrou > wrote: > > > > > On Sat, 4 May 2013 15:04:49 -0700 > > > Eli Bendersky wrote: > > > > Hello pydev, > > > > > > > > PEP 435 is ready for final review. A lot of the feedback from the > last > > > few > > > > weeks of discussions has been incorporated. > > > > > > I still would like to see Nick's class-based API preferred over the > > > functional API: > > > > > > class Season(Enum, members='spring summer autumn'): > > > pass > > > > > > The PEP doesn't even mention it, even though you got significant > > > pushback on the proposed _getframe() hack for pickling (including > > > mentions that IronPython and Cython may not support it), > > > > Plenty of points were raised against having this members= API. > > The main point seems to be "I don't like it". If you consider this a > strong argument against the concrete issues with the functional API, > then good for you. > > > Guido publicly asked to decide in favor of the > > functional API, and we added an explicit warning about pickling (which > was > > lifted from the docs of pickle itself). > > This is not true. The pickling restrictions which have been raised are > specifically caused by the functional syntax, something which your > warning omits. > > > If you feel this has to be > > discussed further, please open a new thread. I don't want another 100 > > bikeshedding emails to go into this one. > > This is not bikeshedding since it addresses concrete functional issues. 
> (but apparently you would very much like to sweep those issues under > the carpet in the name of "bikeshedding") > I'm sorry that you're taking this issue so personally, Antoine. As for pickling enums created with the functional API, I don't think we now provide less than the pickle module dictates in the general sense. The pickle docs say: The following types can be pickled: [...] - classes that are defined at the top level of a module - instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section *Pickling Class Instances*for details). I'll open a separate thread about how this can be implemented and documented in the best way possible, but I really don't see it as an unsolvable issue. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.c.delaney at gmail.com Mon May 6 00:34:54 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Mon, 6 May 2013 08:34:54 +1000 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: References: <5186BC8E.6030502@stoneleaf.us> Message-ID: On 6 May 2013 08:00, Guido van Rossum wrote: > On Sun, May 5, 2013 at 2:55 PM, Tim Delaney > wrote: > > So long as I can get one of the requirements documented to implement an > > auto-number syntax I'll be happy enough with stdlib enums I think. > > Specifically what do you want the PEP to promise? > It was mentioned in the other threads, but the requirement is either: 1. That the dictionary returned from .__prepare__ provide a way to obtain the enum instance names once it's been populated (e.g. once it's been passed as the classdict to __new__). The reference implementation provides a _enum_names list attribute. The enum names need to be available to a metaclass subclass before calling the base metaclass __new__. OR 2. 
A way for subclasses of Enum to modify the value before it's assigned to the actual enum - see the PEP 435 reference implementation - discussion thread where I modified the reference implementation to give enum instances 2-phase construction, passing the value to Enum.__init__. This way is more limited, as you need to use an appropriate mix-in type which puts certain constraints on the behaviour of the enum instances (e.g. they *have* to be int instances for auto-numbering). The implementation is also more complex, and as noted in that thread, __init__ might not be appropriate for an Enum. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon May 6 00:43:48 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 5 May 2013 15:43:48 -0700 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: References: <5186BC8E.6030502@stoneleaf.us> Message-ID: On Sun, May 5, 2013 at 3:34 PM, Tim Delaney wrote: > On 6 May 2013 08:00, Guido van Rossum wrote: >> >> On Sun, May 5, 2013 at 2:55 PM, Tim Delaney >> wrote: >> > So long as I can get one of the requirements documented to implement an >> > auto-number syntax I'll be happy enough with stdlib enums I think. >> >> Specifically what do you want the PEP to promise? > It was mentioned in the other threads, but the requirement is either: > > 1. That the dictionary returned from .__prepare__ provide a > way to obtain the enum instance names once it's been populated (e.g. once > it's been passed as the classdict to __new__). The reference implementation > provides a _enum_names list attribute. The enum names need to be available > to a metaclass subclass before calling the base metaclass __new__. > > OR > > 2. 
A way for subclasses of Enum to modify the value before it's assigned to > the actual enum - see the PEP 435 reference implementation - discussion > thread where I modified the reference implementation to give enum instances > 2-phase construction, passing the value to Enum.__init__. This way is more > limited, as you need to use an appropriate mix-in type which puts certain > constraints on the behaviour of the enum instances (e.g. they *have* to be > int instances for auto-numbering). The implementation is also more complex, > and as noted in that thread, __init__ might not be appropriate for an Enum. I'll let Eli or Ethan respond to this. It sounds fine to me to support you with some kind of hook in the spec, even though I personally would rather assign my enum values explicitly (I'm old-fashioned that way :-). -- --Guido van Rossum (python.org/~guido) From eliben at gmail.com Mon May 6 00:55:46 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 5 May 2013 15:55:46 -0700 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: References: <5186BC8E.6030502@stoneleaf.us> Message-ID: On Sun, May 5, 2013 at 3:34 PM, Tim Delaney wrote: > On 6 May 2013 08:00, Guido van Rossum wrote: > >> On Sun, May 5, 2013 at 2:55 PM, Tim Delaney >> wrote: >> > So long as I can get one of the requirements documented to implement an >> > auto-number syntax I'll be happy enough with stdlib enums I think. >> >> Specifically what do you want the PEP to promise? >> > > It was mentioned in the other threads, but the requirement is either: > > 1. That the dictionary returned from .__prepare__ provide > a way to obtain the enum instance names once it's been populated (e.g. once > it's been passed as the classdict to __new__). The reference implementation > provides a _enum_names list attribute. The enum names need to be available > to a metaclass subclass before calling the base metaclass __new__. > > OR > > 2. 
A way for subclasses of Enum to modify the value before it's assigned > to the actual enum - see the PEP 435 reference implementation - discussion > thread where I modified the reference implementation to give enum instances > 2-phase construction, passing the value to Enum.__init__. This way is more > limited, as you need to use an appropriate mix-in type which puts certain > constraints on the behaviour of the enum instances (e.g. they *have* to be > int instances for auto-numbering). The implementation is also more complex, > and as noted in that thread, __init__ might not be appropriate for an Enum. > So your preferred solution is (1), which requires exposing the metaclass and an attribute publicly? I have to ask - to what end? What is the goal of this? To have an AutoNumberedEnum which is guaranteed to be compatible with stdlib's Enum? IMHO this goal is not important enough, and I'm not aware of other stdlib modules that go to such lengths exposing implementation details publicly (but I'd be happy to be educated on this!) Assuming ref435 goes as-is into stdlib in 3.4, can't you just assume its implementation? And then change yours if it changes? Python's stdlib doesn't change that often, but if we do want to change the implementation at some point, this documented piece of internals is surely going to be in the way. Why should the future malleability of a stdlib module be sacrificed for the sake of this extension? Eli -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From v+python at g.nevcal.com Mon May 6 00:57:57 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sun, 05 May 2013 15:57:57 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <518659AB.6090302@stoneleaf.us> References: <5185FCB1.6030702@g.nevcal.com> <51860872.8020800@stoneleaf.us> <518659AB.6090302@stoneleaf.us> Message-ID: <5186E3F5.3020206@g.nevcal.com> On 5/5/2013 6:07 AM, Ethan Furman wrote: > class NEI( NamedInt, Enum ): > x = NamedInt('the-x', 1 ) > y = NamedInt('the-y', 2 ) > @property > def __name__(self): > return self.value.__name__ This cured it, thank you. But I really still don't understand why the numbers showed up as names... seems to me that either the name of the enumeration member, or the name of the NamedInt should have showed up, without this property definition. The PEP is the only documentation I had, other than the reference implementation, but I can't say I fully understand the reference implementation, not having dealt with metaclass much. Hopefully the documentation will explain all the incantations necessary to make things work in an expected manner. I guess I don't understand why Enum can't wrap the __str__ and __repr__ of the type of the mixed class, instead of replacing it, and then forcing overrides in subclasses. But empirically, it didn't seem to be __str__ and __repr__ that caused the visible problem, it was __name__. So you asked why would I want to put a named object as the value of something else with a name... and that is a fair question... really I don't... but I see one of the beneficial uses of Enum being collecting flags values together, and constructing a flag that is useful for debugging (has a name or expression telling what the values are). So while the PEP thinks IntEnum is an odd case, I think it is important. And since IntEnum loses its name when included in an expression, I was trying to marry it to NamedInt to fill the gap. 
So you asked why would I want to put a named object as the value of something else with a name... maybe Enum should make provision for that... if the primary type ( Int for IntEnum, NamedInt for NamedIntEnum) happens to have a __name__ property, maybe the name of enumeration members should be passed to the constructor for the members... in other words, class NIE( NamedInt, Enum ): x = 1 y = 2 could construct enumeration members x and y whose values are NamedInt('x', 1) and NamedInt('y', 2)... From Nikolaus at rath.org Mon May 6 00:16:57 2013 From: Nikolaus at rath.org (Nikolaus Rath) Date: Sun, 05 May 2013 15:16:57 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: (Guido van Rossum's message of "Sun, 5 May 2013 10:49:22 -0700") References: Message-ID: <87txmh3wdi.fsf@vostro.rath.org> Guido van Rossum writes: >> 1. Having to enter the values is annoying. Sorry, I read the rationale and >> all that, and I *still* want to write a C-Like enum { A, B, C }. I fully >> expect to edit and reorder enums (if I ever use them) and get irritated with >> having to update the value assignments. > > I guess there are cultural differences around this. Anyway, you can > use the functional/convenience API for this purpose. Would it be wise to forbid ... as an enum value to preserve the option to use it for automatic value assignment in some indefinite future? Best, -Nikolaus -- »Time flies like an arrow, fruit flies like a Banana.« PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C From guido at python.org Mon May 6 01:06:23 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 5 May 2013 16:06:23 -0700 Subject: [Python-Dev] PEP 435: initial values must be specified?
Yes In-Reply-To: References: <5186BC8E.6030502@stoneleaf.us> Message-ID: On Sun, May 5, 2013 at 3:55 PM, Eli Bendersky wrote: > > > > On Sun, May 5, 2013 at 3:34 PM, Tim Delaney > wrote: >> >> On 6 May 2013 08:00, Guido van Rossum wrote: >>> >>> On Sun, May 5, 2013 at 2:55 PM, Tim Delaney >>> wrote: >>> > So long as I can get one of the requirements documented to implement an >>> > auto-number syntax I'll be happy enough with stdlib enums I think. >>> >>> Specifically what do you want the PEP to promise? >> >> >> It was mentioned in the other threads, but the requirement is either: >> >> 1. That the dictionary returned from .__prepare__ provide >> a way to obtain the enum instance names once it's been populated (e.g. once >> it's been passed as the classdict to __new__). The reference implementation >> provides a _enum_names list attribute. The enum names need to be available >> to a metaclass subclass before calling the base metaclass __new__. >> >> OR >> >> 2. A way for subclasses of Enum to modify the value before it's assigned >> to the actual enum - see the PEP 435 reference implementation - discussion >> thread where I modified the reference implementation to give enum instances >> 2-phase construction, passing the value to Enum.__init__. This way is more >> limited, as you need to use an appropriate mix-in type which puts certain >> constraints on the behaviour of the enum instances (e.g. they *have* to be >> int instances for auto-numbering). The implementation is also more complex, >> and as noted in that thread, __init__ might not be appropriate for an Enum. > > > So your preferred solution is (1), which requires exposing the metaclass and > an attribute publicly? I have to ask - to what end? What is the goal of > this? To have an AutoNumberedEnum which is guaranteed to be compatible with > stdlib's Enum? 
> > IMHO this goal is not important enough, and I'm not aware of other stdlib > modules that go to such lengths exposing implementation details publicly > (but I'd be happy to be educated on this!) > > Assuming ref435 goes as-is into stdlib in 3.4, can't you just assume its > implementation? And then change yours if it changes? Python's stdlib doesn't > change that often, but if we do want to change the implementation at some > point, this documented piece of internals is surely going to be in the way. > Why should the future malleability of a stdlib module be sacrificed for the > sake of this extension? Hm. Either you should argue much more strongly against Tim's solution, or you should expose the implementation detail he needs. Recommending that he should just use an internal detail of the implementation and hope it never changes sounds like encouraging a bad habit. It also seems you're contradicting yourself by saying that the code is unlikely to change and at the same time wanting to reserve the right to change it. Also note that the future malleability of a stdlib module is affected even by 3rd party use that goes beyond the documented API -- it all depends on a pragmatic weighing of how important a proposed change is against how likely it is to break existing use, and there are plenty of examples in the past where we have resisted changing an implementation detail because it would break too much code. If you really don't want to guarantee this part of the implementation, you should recommend that Tim just copy all of ref435. TBH I don't see what deriving AutoNumberEnum from the stdlib Enum class buy him except that he has to maintain less code. I don't expect there to be a lot of opportunities anywhere for writing isinstance(x, Enum). 
-- --Guido van Rossum (python.org/~guido) From timothy.c.delaney at gmail.com Mon May 6 01:14:36 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Mon, 6 May 2013 09:14:36 +1000 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: References: <5186BC8E.6030502@stoneleaf.us> Message-ID: On 6 May 2013 08:55, Eli Bendersky wrote: > 1. That the dictionary returned from .__prepare__ provide > a way to obtain the enum instance names once it's been populated (e.g. once > it's been passed as the classdict to __new__). The reference implementation > provides a _enum_names list attribute. The enum names need to be available > to a metaclass subclass before calling the base metaclass __new__. > >> So your preferred solution is (1), which requires exposing the metaclass >> and an attribute publicly? I have to ask - to what end? What is the goal of >> this? To have an AutoNumberedEnum which is guaranteed to be compatible with >> stdlib's Enum? >> > My preferred solution is 1 (for the reason mentioned above) but it does not require exposing the metaclass publically (that's obtainable via type(Enum)). It does require a way to get the enum names before calling the base metaclass __new__, but that does not necessarily imply that I'm advocating exposing _enum_names (or at least, not directly). My preferred way would probably be a note that the dictionary returned from the enum metaclass __prepare__ implements an enum_names() or maybe __enum_names__() method which returns an iterator over the enum instance names in definition order. The way this is implemented by the dictionary would be an implementation detail. The enum metaclass __new__ needs access to the enum instance names in definition order, so I think making it easily available to enum metaclass subclasses as well just makes sense. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Mon May 6 01:15:05 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 05 May 2013 16:15:05 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: <87txmh3wdi.fsf@vostro.rath.org> References: <87txmh3wdi.fsf@vostro.rath.org> Message-ID: <5186E7F9.30701@stoneleaf.us> On 05/05/2013 03:16 PM, Nikolaus Rath wrote: > Guido van Rossum writes: >>> 1. Having to enter the values is annoying. Sorry, I read the rationale and >>> all that, and I *still* want to write a C-Like enum { A, B, C }. I fully >>> expect to edit and reorder enums (if I ever use them) and get irritated with >>> having to update the value assignments. >> >> I guess there are cultural differences around this. Anyway, you can >> use the functional/convenience API for this purpose. > > Would it be wise to forbid ... as an enum value to preserve the option > to use it for automatic value assignment in some indefinite future? No. If somebody has a use for ... as a value, we're not going to say no on the very remote chance that Guido someday changes his mind on that point. ;) -- ~Ethan~ From barry at python.org Mon May 6 01:16:10 2013 From: barry at python.org (Barry Warsaw) Date: Sun, 5 May 2013 19:16:10 -0400 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: References: <5186BC8E.6030502@stoneleaf.us> Message-ID: <20130505191610.1d691275@anarchist> On May 05, 2013, at 03:43 PM, Guido van Rossum wrote: >I'll let Eli or Ethan respond to this. It sounds fine to me to support >you with some kind of hook in the spec, even though I personally would >rather assign my enum values explicitly (I'm old-fashioned that way >:-). Assuming the picklability of functional API created Enums is fixed (sorry, another threadsplosion still awaits me), if you *really* have to have autonumbering, use the functional API. IMHO, the class API doesn't need them.
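For reference, the autonumbering route suggested above can be sketched with the functional API. This is a minimal sketch that assumes the `enum` module as it eventually shipped in the standard library (Python 3.4 and later), not the ref435 code under review in this thread; the `Color` name is only an example.

```python
from enum import Enum

# Functional API: only the member names are given;
# values are assigned automatically, starting at 1.
Color = Enum('Color', 'red green blue')

print([member.name for member in Color])   # ['red', 'green', 'blue']
print(Color.green.value)                   # 2
print(Color(3) is Color.blue)              # True
```

Reordering the names in the string renumbers the members, which is the convenience being asked for, at the cost of values that depend on definition order.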
-Barry From timothy.c.delaney at gmail.com Mon May 6 01:22:58 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Mon, 6 May 2013 09:22:58 +1000 Subject: [Python-Dev] PEP 435 - reference implementation discussion In-Reply-To: References: <5185F54A.6060909@stoneleaf.us> <5185F96F.1000303@stoneleaf.us> Message-ID: On 5 May 2013 21:58, Tim Delaney wrote: > On 5 May 2013 16:17, Ethan Furman wrote: > >> On 05/04/2013 10:59 PM, Ethan Furman wrote: >> >>> On 05/04/2013 08:50 PM, Tim Delaney wrote: >>> >>>> 2. Instead of directly setting the _name and _value of the enum_item, >>>> it lets the Enum class do it via Enum.__init__(). >>>> >>> Subclasses can override this. This gives Enums a 2-phase construction >>>> just like other classes. >>>> >>> >>> Not sure I care for this. Enums are, at least in theory, immutable >>> objects, and immutable objects don't call __init__. >>> >> >> Okay, still thinking about `value`, but as far as `name` goes, it should >> not be passed -- it must be the same as it was in the class definition >> > > Agreed - name should not be passed. > > I would have preferred to use __new__, but Enum.__new__ doesn't get called > at all from enum_type (and the implementation wouldn't be at all > appropriate anyway). > *If* I can manage to convince Guido and Eli over in that other (initial values) thread, I think it's still probably worthwhile calling __init__ on the enum instance, but with no parameters. That would allow more behaviour-based enums to set up any other initial state they require. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From eliben at gmail.com Mon May 6 01:24:10 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 5 May 2013 16:24:10 -0700 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: References: <5186BC8E.6030502@stoneleaf.us> Message-ID: > >> It was mentioned in the other threads, but the requirement is either: > >> > >> 1. 
That the dictionary returned from .__prepare__ > provide > >> a way to obtain the enum instance names once it's been populated (e.g. > once > >> it's been passed as the classdict to __new__). The reference > implementation > >> provides a _enum_names list attribute. The enum names need to be > available > >> to a metaclass subclass before calling the base metaclass __new__. > >> > >> OR > >> > >> 2. A way for subclasses of Enum to modify the value before it's assigned > >> to the actual enum - see the PEP 435 reference implementation - > discussion > >> thread where I modified the reference implementation to give enum > instances > >> 2-phase construction, passing the value to Enum.__init__. This way is > more > >> limited, as you need to use an appropriate mix-in type which puts > certain > >> constraints on the behaviour of the enum instances (e.g. they *have* to > be > >> int instances for auto-numbering). The implementation is also more > complex, > >> and as noted in that thread, __init__ might not be appropriate for an > Enum. > > > > > > So your preferred solution is (1), which requires exposing the metaclass > and > > an attribute publicly? I have to ask - to what end? What is the goal of > > this? To have an AutoNumberedEnum which is guaranteed to be compatible > with > > stdlib's Enum? > > > > IMHO this goal is not important enough, and I'm not aware of other stdlib > > modules that go to such lengths exposing implementation details publicly > > (but I'd be happy to be educated on this!) > > > > Assuming ref435 goes as-is into stdlib in 3.4, can't you just assume its > > implementation? And then change yours if it changes? Python's stdlib > doesn't > > change that often, but if we do want to change the implementation at some > > point, this documented piece of internals is surely going to be in the > way. > > Why should the future malleability of a stdlib module be sacrificed for > the > > sake of this extension? > > Hm. 
Either you should argue much more strongly against Tim's solution, > or you should expose the implementation detail he needs. Recommending > that he should just use an internal detail of the implementation and > hope it never changes sounds like encouraging a bad habit. It also > seems you're contradicting yourself by saying that the code is > unlikely to change and at the same time wanting to reserve the right > to change it. > OK, then I'll say without contradictions that I don't expect the implementation of Enum to be stable at this point. We don't even *have* an implementation yet. All we have is some (pretty good!) code Ethan wrote and I only partially reviewed. The final implementation may be completely different, and then again we may want to change it in light of new input. I wouldn't want to constrain ourselves at this point. Perhaps when 3.4 is branched will be a point in time in which this can be re-visited. Makes sense? > Also note that the future malleability of a stdlib module is affected > even by 3rd party use that goes beyond the documented API -- it all > depends on a pragmatic weighing of how important a proposed change is > against how likely it is to break existing use, and there are plenty > of examples in the past where we have resisted changing an > implementation detail because it would break too much code. > Agreed, but if we document these details, we're forever bound, pragmatic weighing notwithstanding. Also, in this particular case if auto-numbered enums in the class API are deemed super-useful we may end up incorporating a syntax for them anyway, which will render the external module obsolete. > If you really don't want to guarantee this part of the implementation, > you should recommend that Tim just copy all of ref435. TBH I don't see > what deriving AutoNumberEnum from the stdlib Enum class buy him except > that he has to maintain less code. 
I don't expect there to be a lot of > opportunities anywhere for writing isinstance(x, Enum). > That's what I was trying to say, I guess. Even if the chance of changing the implementation of Enum is pretty low (after 3.4 I mean, before that it's pretty damn high), I don't think that restricting ourselves here is justified by Tim's maintaining less code in his external module. With all due respect, of course ;-) Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon May 6 01:24:21 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 5 May 2013 16:24:21 -0700 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: <5186E7F9.30701@stoneleaf.us> References: <87txmh3wdi.fsf@vostro.rath.org> <5186E7F9.30701@stoneleaf.us> Message-ID: On Sun, May 5, 2013 at 4:15 PM, Ethan Furman wrote: > On 05/05/2013 03:16 PM, Nikolaus Rath wrote: >> >> Guido van Rossum writes: >>>> >>>> 1. Having to enter the values is annoying. Sorry, I read the rationale >>>> and >>>> all that, and I *still* want to write a C-Like enum { A, B, C }. I fully >>>> expect to edit and reorder enums (if I ever use them) and get irritated >>>> with >>>> having to update the value assignments. >>> >>> >>> I guess there are cultural differences around this. Anyway, you can >>> use the functional/convenience API for this purpose. >> >> >> Would it be wise to forbid ... as an enum value to preserve the option >> to use it for automatic value assignment in some indefinite future? > > > No. If somebody has a use for ... is a value we're not going to say no on > the very remote chance that Guido someday changes his mind on that point. > ;) Correct. *If* we were to have a change of heart on this issue, we'd just introduce a class AutoNumberEnum. But I find the "..." syntax sufficiently ugly that I really don't expect I'll ever change my mind. 
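A rough sketch of what such a class could look like, assuming the enum module as it eventually shipped. The class name `AutoNumberEnum` is hypothetical (no such class exists in the stdlib); the technique of assigning `_value_` inside `__new__` follows the auto-numbering recipe in the enum documentation, not anything agreed in this thread.

```python
from enum import Enum


class AutoNumberEnum(Enum):
    """Hypothetical base class: members get 1, 2, 3, ... in definition order."""

    def __new__(cls):
        # __members__ holds the members created so far for this class,
        # so its length gives the next ordinal.
        value = len(cls.__members__) + 1
        member = object.__new__(cls)
        member._value_ = value
        return member


class Color(AutoNumberEnum):
    # The empty tuple means "no arguments for __new__"; the value is computed.
    red = ()
    green = ()
    blue = ()


auto_values = [member.value for member in Color]
print(auto_values)
```

Unlike a shared counter, the numbering restarts for every enumeration derived from the base class.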
-- --Guido van Rossum (python.org/~guido) From greg.ewing at canterbury.ac.nz Mon May 6 01:57:13 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 06 May 2013 11:57:13 +1200 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: <5186BC8E.6030502@stoneleaf.us> References: <5186BC8E.6030502@stoneleaf.us> Message-ID: <5186F1D9.3040604@canterbury.ac.nz> Ethan Furman wrote: > --> class Color(Enum): > ... red, green, blue > ... > > --> class MoreColor(Color): > ... red, orange, yellow > ... > > --> type(MoreColor.red) is MoreColor > False This argument no longer applies, since we're not allowing enums to be extended. > class Color(Enum): > red = e() > green = e() > blue = e() > > and you can keep using `e()` for all your enumerations, since you don't > care what actual value each enumeration member happens to get. I don't think it's true that people wanting auto-numbering don't care what values they get. Rather, they probably want ordinal values assigned separately and consecutively for each type, as in every other language I'm aware of that provides auto-numbered enums. If you *really* don't care what the values are, there's no need for the items to have values at all. -- Greg From ethan at stoneleaf.us Mon May 6 02:03:52 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 05 May 2013 17:03:52 -0700 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: References: <5186BC8E.6030502@stoneleaf.us> Message-ID: <5186F368.4050501@stoneleaf.us> On 05/05/2013 04:14 PM, Tim Delaney wrote: > > 1. That the dictionary returned from .__prepare__ provide a way to obtain the enum instance names > once it's been populated (e.g. once it's been passed as the classdict to __new__). The reference implementation > provides a _enum_names list attribute. The enum names need to be available to a metaclass subclass before calling > the base metaclass __new__. [...] 
> My preferred solution is 1 (for the reason mentioned above) but it does not require exposing the metaclass publically > (that's obtainable via type(Enum)). It does require a way to get the enum names before calling the base metaclass > __new__, but that does not necessarily imply that I'm advocating exposing _enum_names (or at least, not directly). > > My preferred way would probably be a note that the dictionary returned from the enum metaclass __prepare__ implements an > enum_names() or maybe __enum_names__() method which returns an iterator over the enum instance names in definition > order. The way this is implemented by the dictionary would be an implementation detail. I like having an __enum_names__() that returns a list or tuple of (name, value) pairs in definition order. -- ~Ethan~ From ethan at stoneleaf.us Mon May 6 02:44:42 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 05 May 2013 17:44:42 -0700 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: <5186F1D9.3040604@canterbury.ac.nz> References: <5186BC8E.6030502@stoneleaf.us> <5186F1D9.3040604@canterbury.ac.nz> Message-ID: <5186FCFA.7030401@stoneleaf.us> On 05/05/2013 04:57 PM, Greg Ewing wrote: > Ethan Furman wrote: >> --> class Color(Enum): >> ... red, green, blue >> ... >> >> --> class MoreColor(Color): >> ... red, orange, yellow >> ... >> >> --> type(MoreColor.red) is MoreColor >> False > > This argument no longer applies, since we're not > allowing enums to be extended. Actually, it does: --> class Color(Enum): ... black, red, green, blue, cyan, magenta, yellow, black # oops! 
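For comparison, a small sketch of how the same mistake behaves when values are assigned explicitly, assuming the enum module as it later shipped: reusing a member name in the class body is rejected with a TypeError at class-definition time. The helper name `define_with_duplicate` is only illustrative.

```python
from enum import Enum


def define_with_duplicate():
    # The class body below reuses the name 'black'; the enum machinery
    # notices the second assignment while the class is being built.
    class Color(Enum):
        black = 1
        red = 2
        black = 3  # oops: name reused by mistake
    return Color


try:
    define_with_duplicate()
except TypeError as exc:
    duplicate_error = exc
else:
    duplicate_error = None

print(type(duplicate_error).__name__)
```

With implicit bare names there is no assignment to intercept, which is the difficulty being pointed out above.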
-- ~Ethan~ From ncoghlan at gmail.com Mon May 6 04:14:58 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 6 May 2013 12:14:58 +1000 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <5186E3F5.3020206@g.nevcal.com> References: <5185FCB1.6030702@g.nevcal.com> <51860872.8020800@stoneleaf.us> <518659AB.6090302@stoneleaf.us> <5186E3F5.3020206@g.nevcal.com> Message-ID: On Mon, May 6, 2013 at 8:57 AM, Glenn Linderman wrote: > So you asked why would I want to put a named object as the value of > something else with a name... maybe Enum should make provision for that... > if the primary type ( Int for IntEnum, NamedInt for NamedIntEnum) happens to > have a __name__ property, maybe the name of enumeration members should be > passed to the constructor for the members... in other words, > > class NIE( NamedInt, Enum ): > x = 1 > y = 2 > > could construct enumeration members x and y whose values are NamedInt('x', > 1) and > NamedInt('y', 2)... I think there comes a point where "subclass the metaclass" is the right answer to "how do I do X with this type?". I believe making two different kinds of value labelling mechanisms play nice is such a case :) Abstract Base Classes were the first real example of metaclass magic in the standard library, and they avoided many of the confusing aspects of metaclass magic by banning instantiation - once you get to a concrete subclass, instances behave pretty much like any other instance, even though type(type(obj)) is abc.ABCMeta rather than type: >>> import collections >>> class Example(collections.Hashable): ... def __hash__(self): return 0 ... >>> type(Example) >>> type(Example()) >>> type(type(Example())) Enumerations will only be the second standard library instance of using metaclasses to create classes and objects that behave substantially differently from conventional ones that use type as the metaclass. 
The difference in this case relative to ABCs is that more of the behavioural changes are visible on the instances. This is *not* a bad thing (this is exactly what the metaclass machinery is designed to enable), but it does mean we're going to have to up our game in terms of documenting some of the consequences (as Ethan noted, this doesn't need to go into the PEP, since that's aimed at convincing people like Guido that already understand this stuff - it's the enum module documentation, and potentially the language reference, that is going to need enhancement). "Here's the section on metaclasses in the language reference, figure out the consequences for yourselves" is a defensible situation when we're providing metaclasses primarily as a toolkit for third party frameworks, but a standard library module that exploits them the way the enum module does places additional obligations on us. The upside is that the very presence of the enum module provides a concrete non-trivial example for us to lean on in those explanations, rather than having to come up with toy examples :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From v+python at g.nevcal.com Mon May 6 04:46:55 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sun, 05 May 2013 19:46:55 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: References: <5185FCB1.6030702@g.nevcal.com> <51860872.8020800@stoneleaf.us> <518659AB.6090302@stoneleaf.us> <5186E3F5.3020206@g.nevcal.com> Message-ID: <5187199F.1080500@g.nevcal.com> On 5/5/2013 7:14 PM, Nick Coghlan wrote: > I think there comes a point where "subclass the metaclass" is the > right answer to "how do I do X with this type?". I believe making two > different kinds of value labelling mechanisms play nice is such a case > :) Could be. Could be that sufficient operators added to an IntEnum subclass might work too. 
Although it might be unexpected that adding two enumeration members would produce a NamedInt :) But of course, if the enumeration were defined by NamedIntEnum, it might be less surprising. On the other hand, if Enum is in stdlib, and lots of flags parameters get defined with Enum, then it would be a pain in the neck to redefine them with NamedIntEnum, so that debugging of flag combinations is easier. There are enough flag parameters in the stdlib APIs to make me think this would be a deficiency. Sadly, once the Enums are defined, there is to be no way to subclass them to add functionality, like producing a NamedInt result from operations on them. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon May 6 06:51:33 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 6 May 2013 14:51:33 +1000 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <5187199F.1080500@g.nevcal.com> References: <5185FCB1.6030702@g.nevcal.com> <51860872.8020800@stoneleaf.us> <518659AB.6090302@stoneleaf.us> <5186E3F5.3020206@g.nevcal.com> <5187199F.1080500@g.nevcal.com> Message-ID: On Mon, May 6, 2013 at 12:46 PM, Glenn Linderman wrote: > Sadly, once the Enums are defined, there is to be no way to subclass them to > add functionality, like producing a NamedInt result from operations on them. That rule is enforced by the metaclass, so... ;) Custom metaclasses are amazingly powerful, the trick is to deploy their power judiciously, such that people can use the custom classes associated with them without getting confused. SQL Alchemy, Django and other frameworks do this quite well, but it *does* create subsections of the type hierarchy which don't play well with others (for example, having the same class be both an SQL Alchemy Table definition and a Django Model definition probably isn't going to work). 
Enums are the same - they carve out a subtree in the type hierarchy that *doesn't* behave the same as the standard tree anchored directly on type. This *is* going to cause conflicts with meta-tools that only handle ordinary types - the trick is that the cause of the problem (a custom metaclass) is also the solution (a custom metaclass derived from enum.enum_type). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From v+python at g.nevcal.com Mon May 6 07:50:24 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sun, 05 May 2013 22:50:24 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: References: <5185FCB1.6030702@g.nevcal.com> <51860872.8020800@stoneleaf.us> <518659AB.6090302@stoneleaf.us> <5186E3F5.3020206@g.nevcal.com> <5187199F.1080500@g.nevcal.com> Message-ID: <518744A0.1030207@g.nevcal.com> On 5/5/2013 9:51 PM, Nick Coghlan wrote: > On Mon, May 6, 2013 at 12:46 PM, Glenn Linderman wrote: >> Sadly, once the Enums are defined, there is to be no way to subclass them to >> add functionality, like producing a NamedInt result from operations on them. > That rule is enforced by the metaclass, so... ;) Sure. But: stdlib contains: Enum (with subclass-prohibiting metaclass) APIs with flags (assumed, in time) Enums defining the flag values (complete with enforcement by the metaclass) user code: have to recreate all the Enums defining flag values using custom enum_type metaclass. Seems like FlagEnum might be a good thing to invent before (re-)defining Enums for all the flag values (that's what I'm after in combining NamedInt and Enum, really). I suppose that a mere mortal could simply define a subclass of int that keeps track of expressions during arithmetic... using __name__ attributes of its operands if they exist, and as long as the mere mortal remembered to use it, it would achieve the same goal.
But having that built in with the flag value definitions would ensure that it was available for everyone, all the time. From solipsis at pitrou.net Mon May 6 08:27:45 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 6 May 2013 08:27:45 +0200 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: References: <20130505120530.08b62855@fsol> <20130505194603.142cbace@fsol> Message-ID: <20130506082745.37748bbe@fsol> On Sun, 5 May 2013 15:27:36 -0700 Eli Bendersky wrote: > > As for pickling enums created with the functional API, I don't think we now > provide less than the pickle module dictates in the general sense. The > pickle docs say: Next time, please try reading the message(s) you are replying to before posting. Thanks Antoine. From arigo at tunes.org Mon May 6 10:46:33 2013 From: arigo at tunes.org (Armin Rigo) Date: Mon, 6 May 2013 10:46:33 +0200 Subject: [Python-Dev] Fighting the theoretical randomness of "is" on immutables Message-ID: Hi all, In the context of PyPy, we've recently seen again the issue of "x is y" not being well-defined on immutable constants. I've tried to summarize the issues and possible solutions in a mail to pypy-dev [1] and got some answers already. Having been convinced that the core is a language design issue, I'm asking for help from people on this list. (Feel free to cross-post.) [1] http://mail.python.org/pipermail/pypy-dev/2013-May/011299.html To summarize: the issue is a combination of various optimizations that work great otherwise. For example we can store integers directly in lists of integers, so when we read them back, we need to put them into fresh W_IntObjects (equivalent of PyIntObject). We solved temporarily the issue of "I'm getting an object which isn't ``is``-identical to the one I put in!" by making all equal integers ``is``-identical.
This required hacking at ``id(x)`` as well to keep the requirement ``x is y <=> id(x)==id(y)``. This is getting annoying for strings, though -- how do you compute the id() of a long string? Give a unique long integer? And if we do the same for tuples, what about their id()? The long-term solution that seems the most stable to me would be to relax the requirement ``x is y <=> id(x)==id(y)``. If we can get away with only ``x is y <= id(x)==id(y)`` then it would allow us to implement ``is`` in a consistent way (e.g. two strings with equal content would always be ``is``-identical) while keeping id() reasonable (both in terms of complexity and of size of the resulting long number). Obviously ``x is y <=> id(x)==id(y)`` would still be true if any of ``x`` or ``y`` is not an immutable "by-value" built-in type. This is clearly a language design issue though. I can't really think of a use case that would break if we relax the requirement, but I might be wrong. It seems to me that at most some modules like pickle which use id()-keyed dictionaries will fail to find some otherwise-identical objects, but would still work (even if tuples are "relaxed" in this way, you can't have cycles with only tuples). À bientôt, Armin. From tjreedy at udel.edu Mon May 6 14:43:38 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Mon, 06 May 2013 08:43:38 -0400 Subject: [Python-Dev] Fighting the theoretical randomness of "is" on immutables In-Reply-To: References: Message-ID: On 5/6/2013 4:46 AM, Armin Rigo wrote: 'is' *is* well-defined. In production code, the main use of 'is' is for builtin singletons, the bool doubleton, and object instances used as sentinels. The most common use, in particular, is 'if a is None:'. For such code, the result must be independent of implementation. For other immutable classes, for which 'is' is mostly irrelevant and useless, the result of some code is intentionally implementation dependent to allow optional optimizations.
'Implementation dependent' is different from 'random'. For such classes (int, tuple, set, string), the main use of 'is' is to test if the intended optimization is being done. In other words, for these classes, the implementation dependence is a feature. The general advice given to newbies by python-list regulars is to limit the use of 'is' with immutables to the first group of classes and never use it for the second. > In the context of PyPy, we've recently seen again the issue of "x is y" > not being well-defined on immutable constants. Since immutable objects have a constant value by definition of immutable, I am not sure if you are trying to say anything more by adding the extra word. > I've tried to > summarize the issues and possible solutions in a mail to pypy-dev [1] > and got some answers already. Having been convinced that the core is > a language design issue, I'm asking for help from people on this list. > (Feel free to cross-post.) > > [1] http://mail.python.org/pipermail/pypy-dev/2013-May/011299.html > > To summarize: the issue is a combination of various optimizations that > work great otherwise. For example we can store integers directly in > lists of integers, so when we read them back, we need to put them into > fresh W_IntObjects (equivalent of PyIntObject). Interesting. I presume you only do this when the ints all fit in a machine int so that all require the same number of bytes so you can efficiently index and slice. This is sort of what strings do with characters, except for there being no char class. The similarity is that if you concatenate a string to another string and then slice it back out, you generally get a different object, but may get the same object if some optimization has that effect. For instance, in current CPython, s is ''+s is s+''. The details depend on the CPython version. > We solved temporarily the issue of "I'm getting an object which isn't > ``is``-identical to the one I put in!"
Does the definition of list operations guarantee preservation of object identity? After 'somelist.append(a)', must 'somelist.pop() is a' be true? I am not sure. For immutables, it could be an issue if someone stores the id. But I don't know why someone would do that for an int. As I already said, we routinely tell people on python-list (c.l.p) that they shouldn't care about ids of ints. The identity of an int cannot (and should not) affect the result of a numerical calculation. > by making all equal integers ``is``-identical. Which changes the definition of 'is', or rather, makes the definition implementation dependent. > This required hacking at ``id(x)`` as well to keep the requirement ``x > is y <=> id(x)==id(y)``. This is getting annoying for strings, though > -- how do you compute the id() of a long string? Give a unique long > integer? And if we do the same for tuples, what about their id()? The solution to the annoyance is to not do this ;-). More seriously, are you planning to unbox strings or tuples? > The long-term solution that seems the most stable to me would be to > relax the requirement ``x is y <=> id(x)==id(y)``. I see this as a definition, not a requirement. Changing the definition would break any use that depends on the definition being what it is. -- Terry Jan Reedy From ncoghlan at gmail.com Mon May 6 15:18:54 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 6 May 2013 23:18:54 +1000 Subject: [Python-Dev] Fighting the theoretical randomness of "is" on immutables In-Reply-To: References: Message-ID: On Mon, May 6, 2013 at 6:46 PM, Armin Rigo wrote: > This is clearly a language design issue though. I can't really think > of a use case that would break if we relax the requirement, but I > might be wrong.
It seems to me that at most some modules like pickle > which use id()-keyed dictionaries will fail to find some > otherwise-identical objects, but would still work (even if tuples are > "relaxed" in this way, you can't have cycles with only tuples). IIRC, Jython just delays calculating the object id() until it is called, and lives with it potentially being incredibly expensive to calculate. Is there some way PyPy can run with a model where "is" is defined in terms of values for immutable objects, with a lazily populated mapping from values to numeric ids if you're forced to define them through an explicit call to id()? We're not going to change the language design because people don't understand the difference between "is" and "==" and then wrongly blame PyPy for breaking their code. If you're tired of explaining to people that it's their code which is buggy rather than PyPy, then your Solution 2 (mimicking CPython's caching) is likely your best bet. Alternatively, we've offered to add CompatibilityWarning to CPython in the past (there may even be a preliminary patch for it on the tracker). That offer is still open, and would be applicable to this case. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Mon May 6 15:26:56 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 6 May 2013 15:26:56 +0200 Subject: [Python-Dev] Fighting the theoretical randomness of "is" on immutables References: Message-ID: <20130506152656.279a0a7b@pitrou.net> Le Mon, 6 May 2013 23:18:54 +1000, Nick Coghlan a écrit : > > IIRC, Jython just delays calculating the object id() until it is > called, and lives with it potentially being incredibly expensive to > calculate. Is there some way PyPy can run with a model where "is" is > defined in terms of values for immutable objects, with a lazily > populated mapping from values to numeric ids if you're forced to > define them through an explicit call to id()? This sounds reasonable.
Actually, for small ints, id() could simply be a tagged pointer (e.g. "1 + 2 * myint.value"). > We're not going to change the language design because people don't > understand the difference between "is" and "==" and then wrongly blame > PyPy for breaking their code. Well, if I'm doing: mylist = [x] and ``mylist[0] is x`` returns False, then I pretty much consider the Python implementation to be broken, not my code :-) Regards Antoine. From ncoghlan at gmail.com Mon May 6 16:20:56 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 7 May 2013 00:20:56 +1000 Subject: [Python-Dev] Fighting the theoretical randomness of "is" on immutables In-Reply-To: <20130506152656.279a0a7b@pitrou.net> References: <20130506152656.279a0a7b@pitrou.net> Message-ID: On Mon, May 6, 2013 at 11:26 PM, Antoine Pitrou wrote: > Le Mon, 6 May 2013 23:18:54 +1000, > Nick Coghlan a écrit : >> We're not going to change the language design because people don't >> understand the difference between "is" and "==" and then wrongly blame >> PyPy for breaking their code. > > Well, if I'm doing: > > mylist = [x] > > and ``mylist[0] is x`` returns False, then I pretty much consider the > Python implementation to be broken, not my code :-) Yeah, that's a rather good point - I briefly forgot that the trigger here was PyPy's specialised single type containers. Cheers, Nick.
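A short sketch of the two situations being distinguished in this thread. It is written against CPython; the variable names are illustrative only. Identity of an ordinary object survives a round trip through a list, and value equality of ints always holds, whereas identity of two equal ints is left unasserted because it is implementation-dependent.

```python
# An ordinary object used as a sentinel: what goes into the list is
# exactly what comes back out.
sentinel = object()
container = [sentinel]
same_object = container[0] is sentinel

# Two ints with equal value, built separately.
a = int("1000")
b = int("1000")
equal_by_value = (a == b)   # guaranteed by the language
identical = (a is b)        # implementation-dependent; do not rely on it

# The common, portable use of 'is': comparing against a singleton.
result = None
is_missing = result is None

print(same_object, equal_by_value, is_missing)
```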
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From barry at python.org Mon May 6 18:31:06 2013 From: barry at python.org (Barry Warsaw) Date: Mon, 6 May 2013 12:31:06 -0400 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: References: <5185FCB1.6030702@g.nevcal.com> <51860872.8020800@stoneleaf.us> <518659AB.6090302@stoneleaf.us> <5186E3F5.3020206@g.nevcal.com> <5187199F.1080500@g.nevcal.com> Message-ID: <20130506123106.7760cf61@anarchist> On May 06, 2013, at 02:51 PM, Nick Coghlan wrote: >Enums are the same - they carve out a subtree in the type hierarchy >that *doesn't* behave the same as the standard tree anchored directly >on type. This *is* going to cause conflicts with meta-tools that only >handle ordinary types - the trick is that the cause of the problem (a >custom metaclass) is also the solution (a custom metaclass derived >from enum.enum_type). Agreed. When the time is right, I think we should consider implementation details that allow for useful flexibility. An example would be the prohibition on extension through subclassing. I'm perfectly willing to accept that as the standard behavior on stdlib Enums, but I'd like to be able to override this with my own custom metaclass subclass. I think it could be done quite easily with the right refactoring of the implementation, although there would be some discussion around what is blessed private API and what is YOYO[1] API. -Barry [1] You're on your own. From barry at python.org Mon May 6 18:53:52 2013 From: barry at python.org (Barry Warsaw) Date: Mon, 6 May 2013 12:53:52 -0400 Subject: [Python-Dev] PEP 435 - requesting pronouncement In-Reply-To: <51866223.6090003@stoneleaf.us> References: <20130505120530.08b62855@fsol> <51866223.6090003@stoneleaf.us> Message-ID: <20130506125352.7af662c6@anarchist> On May 05, 2013, at 06:44 AM, Ethan Furman wrote: > to easily create enums when prototyping or at the interactive prompt (I'll > use it all the time -- it's convenient!
;) +1billion (That's literally the number of times I've used the functional API when discussion various aspects of enum behavior :). -Barry From g.brandl at gmx.net Mon May 6 19:48:32 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 06 May 2013 19:48:32 +0200 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: <5186BC8E.6030502@stoneleaf.us> References: <5186BC8E.6030502@stoneleaf.us> Message-ID: Am 05.05.2013 22:09, schrieb Ethan Furman: > About the closest you going to be able to get is something like: > > def e(_next=[1]): > e, _next[0] = _next[0], _next[0] + 1 > return e > > class Color(Enum): > red = e() > green = e() > blue = e() Uh, that's surely more nicely spelled as "e = itertools.count()"? Georg From ethan at stoneleaf.us Mon May 6 19:53:57 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 06 May 2013 10:53:57 -0700 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: References: <5186BC8E.6030502@stoneleaf.us> Message-ID: <5187EE35.2020807@stoneleaf.us> On 05/06/2013 10:48 AM, Georg Brandl wrote: > Am 05.05.2013 22:09, schrieb Ethan Furman: > >> About the closest you going to be able to get is something like: >> >> def e(_next=[1]): >> e, _next[0] = _next[0], _next[0] + 1 >> return e >> >> class Color(Enum): >> red = e() >> green = e() >> blue = e() > > Uh, that's surely more nicely spelled as "e = itertools.count()"? Why, yes, I believe it is. 
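A sketch of the itertools spelling, with one caveat: an `itertools.count` object is an iterator, not a callable, so `e()` only works if `e` is bound to its `__next__` method (or if `next(counter)` is written out instead). Starting the count at 1 is an assumption made here to match the hand-written helper quoted above; the `Shape` class is added only to show that the counter is shared.

```python
import itertools
from enum import Enum

# Bind the iterator's __next__ so that e() yields 1, 2, 3, ...
e = itertools.count(1).__next__


class Color(Enum):
    red = e()
    green = e()
    blue = e()


class Shape(Enum):
    # Same counter, so numbering continues where Color stopped.
    circle = e()


color_values = [member.value for member in Color]
print(color_values, Shape.circle.value)
```

Because one counter serves every enumeration, values do not restart per type, which is the objection raised earlier about wanting ordinals assigned separately for each enum.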
:) -- ~Ethan~ From solipsis at pitrou.net Mon May 6 21:17:01 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 6 May 2013 21:17:01 +0200 Subject: [Python-Dev] cpython: Issue #11816: multiple improvements to the dis module References: <3b45DP41NszNgv@mail.python.org> Message-ID: <20130506211701.67a86ad4@fsol> On Mon, 6 May 2013 15:59:49 +0200 (CEST) nick.coghlan wrote: > http://hg.python.org/cpython/rev/f65b867ce817 > changeset: 83644:f65b867ce817 > user: Nick Coghlan > date: Mon May 06 23:59:20 2013 +1000 > summary: > Issue #11816: multiple improvements to the dis module > > * get_instructions generator > * ability to redirect output to a file > * Bytecode and Instruction abstractions > > Patch by Nick Coghlan, Ryan Kelly and Thomas Kluyver. > > files: > Doc/library/dis.rst | 269 +++++++++++++++++++----- > Doc/whatsnew/3.4.rst | 15 + > Lib/dis.py | 341 +++++++++++++++++++++--------- > Lib/test/test_dis.py | 339 +++++++++++++++++++++++++++--- > Misc/NEWS | 4 + > 5 files changed, 775 insertions(+), 193 deletions(-) Looks like you forgot to add bytecode_helper.py. Regards Antoine. From Steve.Dower at microsoft.com Mon May 6 21:46:56 2013 From: Steve.Dower at microsoft.com (Steve Dower) Date: Mon, 6 May 2013 19:46:56 +0000 Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support" In-Reply-To: References: <51846747.9060507@pearwood.info> <87vc6zcopu.fsf@uwakimon.sk.tsukuba.ac.jp> <5184A779.6010108@pearwood.info> Message-ID: <21fa0d713f49484298d479d348d185d4@BLUPR03MB035.namprd03.prod.outlook.com> > From: Nick Coghlan [mailto:ncoghlan at gmail.com] > Sent: Friday, May 3, 2013 2348 > > We don't need examples of arbitrary data file extentions, we need examples > of 4 letter extensions that are known to work correctly when placed on > PATHEXT, including when called from PowerShell. 
In the absence of > confirmation that 4-letter extensions work reliably in such cases, it seems > wise to abbreviate the Windows GUI application extension as .pzw. > > I've also cc'ed Steve Dower, since investigation of this kind of Windows > behavioural question is one of the things he offered distuils-sig help with > after PyCon US :) Thanks, Nick. I've been following along with this but haven't really been able to add anything. I can certainly say that I've never had any issue with more than 3 letters in an extension, and I deal with those every day (.pyproj, .csproj, .vxcproj.filters, etc). The PowerShell bug (which I hadn't heard of before) may be a complete non-issue, depending on how the associations are set up. To summarise the bug, when PowerShell invokes a command based on an extension in PATHEXT, only the first three characters of the extension are used to determine the associated program. I tested this by creating a file "test.txta" and adding ".TXTA" to my PATHEXT variable. Typing ".\test" in PowerShell opened the file in my text editor. This only affects PowerShell (cmd.exe handles it correctly) and only in the case where you don't specify the extension (".\test.txta" works fine, and with tab completion, this is probably more likely). It also ignores the associated command line and only uses the executable. (I'll pass this on to the PowerShell team, though I have no idea how they'll prioritise it, and of course there's probably no fix for existing versions.) Because we'd be claiming both .pyz and .pyzw, it's possible to work around this issue if we accept that .pyzw files may run with the .pyz program instead of the .pyzw program. Maybe it goes to something other than py.exe that could choose based on the extension. (Since other command-line arguments get stripped, adding an option to py.exe can't be done, and unless the current behaviour is for it to open .pyw files in pythonw.exe, I wouldn't want it to be different for .pyzw files.) 
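A toy model of the lookup described above may make the failure mode easier
to see. It is only a sketch: it is not PowerShell's code, and the
association table, the program names and both function names are
assumptions made up for illustration. It shows why a lookup that keeps only
the dot plus three characters sends .pyzw to the .pyz program, while a
3-letter GUI extension such as .pzw is unaffected.

```python
# Toy model only; NOT PowerShell's implementation.
# The table and the program names are illustrative assumptions.
ASSOCIATIONS = {
    ".pyz": "py.exe",    # console launcher (assumed association)
    ".pyzw": "pyw.exe",  # GUI launcher (assumed association)
    ".pzw": "pyw.exe",   # the 3-letter GUI spelling suggested in the thread
}


def full_lookup(extension):
    """Resolve using the whole extension (the behaviour reported for cmd.exe)."""
    return ASSOCIATIONS.get(extension.lower())


def truncating_lookup(extension):
    """Resolve using only the dot plus the first three characters."""
    return ASSOCIATIONS.get(extension.lower()[:4])


for ext in (".pyz", ".pyzw", ".pzw"):
    print(ext, full_lookup(ext), truncating_lookup(ext))
```

In this model the only extension whose result changes under truncation is
.pyzw, which matches the workaround discussed above: either accept that
.pyzw may run with the .pyz program, or pick a GUI extension that survives
truncation.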
However, anywhere in Windows that uses ShellExecute rather than FindExecutable will handle long extensions without any issue. AFAIK, this is everywhere except PowerShell, so I don't really see a strong case for breaking the w-suffix convention here. Cheers, Steve From p.f.moore at gmail.com Mon May 6 22:30:02 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 6 May 2013 21:30:02 +0100 Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support" In-Reply-To: <21fa0d713f49484298d479d348d185d4@BLUPR03MB035.namprd03.prod.outlook.com> References: <51846747.9060507@pearwood.info> <87vc6zcopu.fsf@uwakimon.sk.tsukuba.ac.jp> <5184A779.6010108@pearwood.info> <21fa0d713f49484298d479d348d185d4@BLUPR03MB035.namprd03.prod.outlook.com> Message-ID: On 6 May 2013 20:46, Steve Dower wrote: > To summarise the bug, when PowerShell invokes a command based on an > extension in PATHEXT, only the first three characters of the extension are > used to determine the associated program. I tested this by creating a file > "test.txta" and adding ".TXTA" to my PATHEXT variable. Typing ".\test" in > PowerShell opened the file in my text editor. This only affects PowerShell > (cmd.exe handles it correctly) and only in the case where you don't specify > the extension (".\test.txta" works fine, and with tab completion, this is > probably more likely). It also ignores the associated command line and only > uses the executable. > The form in which I hit the bug is that I tried to create a "script" extension so that "foo.script" would be a generic script with a #! extension specifying the interpreter. I was adding the extension to PATHEXT so that scripts would be run "inline" (displaying the output at the console prompt, rather than in a new transient console window) - again this is a Powershell-specific issue which does not affect CMD. But when I added .script to PATHEXT, the script ran, but in a separate console window, which flashed up too fast for me to see the output. 
(It may be that it's the clash with .scr screensaver files that caused the file to be treated as a windows executable rather than a console executable, but it's hard to tell when you can't see the error message :-() > (I'll pass this on to the PowerShell team, though I have no idea how > they'll prioritise it, and of course there's probably no fix for existing > versions.) > Thanks. In my view, it's a vaguely irritating rough edge rather than a dealbreaker. But it does cause problems as here. And the number of rough edges in powershell when dealing with "traditional" console commands (e.g., see the point above about needing PATHEXT to get inline output, rather than just to be able to omit the extension) are sufficient to make the accumulation greater than the sum of its individual parts. > Because we'd be claiming both .pyz and .pyzw, it's possible to work around > this issue if we accept that .pyzw files may run with the .pyz program > instead of the .pyzw program. Maybe it goes to something other than py.exe > that could choose based on the extension. (Since other command-line > arguments get stripped, adding an option to py.exe can't be done, and > unless the current behaviour is for it to open .pyw files in pythonw.exe, I > wouldn't want it to be different for .pyzw files.) > I'm not sure the behaviour is clearly defined enough to be certain of this (although I'll defer to someone who's looked at the Powershell source code :-)). In my experiments, it was frustratingly difficult to pin down the exact behaviour with any certainty. And given that the choice is over running a console executable or a Windows one, it can be particularly bad if the wrong one is run (console program pops up in a transient second window and the output gets lost, for example). Add the fact that the powershell behaviour is essentially undocumented, and it's hard to guarantee anything. 
On the plus side, I suspect (but haven't proved) that if the GUI extension (pyzw) gets misread as the console one (pyz) the behaviour is less serious, because PATHEXT is not as relevant for GUI programs. > However, anywhere in Windows that uses ShellExecute rather than > FindExecutable will handle long extensions without any issue. AFAIK, this > is everywhere except PowerShell, so I don't really see a strong case for > breaking the w-suffix convention here. > To be blunt, I see no point in using a pair of extensions that are known to be broken, even if only in Powershell, over a pair that will work everywhere (but are no more than mildly less consistent with other cases - note that while there's a py/pyw pair, there is no pycw corresponding to pyc, or pyow corresponding to pyo). Paul -------------- next part -------------- An HTML attachment was scrubbed... URL: From shibturn at gmail.com Mon May 6 22:30:41 2013 From: shibturn at gmail.com (Richard Oudkerk) Date: Mon, 06 May 2013 21:30:41 +0100 Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support" In-Reply-To: <21fa0d713f49484298d479d348d185d4@BLUPR03MB035.namprd03.prod.outlook.com> References: <51846747.9060507@pearwood.info> <87vc6zcopu.fsf@uwakimon.sk.tsukuba.ac.jp> <5184A779.6010108@pearwood.info> <21fa0d713f49484298d479d348d185d4@BLUPR03MB035.namprd03.prod.outlook.com> Message-ID: So the bug would just cause .pyzw files to be opened with py instead of pyw? Won't this be harmless? I think the worst that would happen would be that you get a redundant console window if you are not already running powershell inside a console. 
-- Richard From dholth at gmail.com Mon May 6 22:35:38 2013 From: dholth at gmail.com (Daniel Holth) Date: Mon, 6 May 2013 16:35:38 -0400 Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support" In-Reply-To: References: <51846747.9060507@pearwood.info> <87vc6zcopu.fsf@uwakimon.sk.tsukuba.ac.jp> <5184A779.6010108@pearwood.info> <21fa0d713f49484298d479d348d185d4@BLUPR03MB035.namprd03.prod.outlook.com> Message-ID: As the PEP author I declare we can have 3-letter extensions. It is not a big deal. Daniel Holth On Mon, May 6, 2013 at 4:30 PM, Richard Oudkerk wrote: > So the bug would just cause .pyzw files to be opened with py instead of pyw? > Won't this be harmless? > > I think the worst that would happen would be that you get a redundant > console window if you are not already running powershell inside a console. > > -- > Richard > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/dholth%40gmail.com From pje at telecommunity.com Mon May 6 23:20:38 2013 From: pje at telecommunity.com (PJ Eby) Date: Mon, 6 May 2013 17:20:38 -0400 Subject: [Python-Dev] Fighting the theoretical randomness of "is" on immutables In-Reply-To: References: Message-ID: On Mon, May 6, 2013 at 4:46 AM, Armin Rigo wrote: > This is clearly a language design issue though. I can't really think > of a use case that would break if we relax the requirement, but I > might be wrong. It seems to me that at most some modules like pickle > which use id()-keyed dictionaries will fail to find some > otherwise-identical objects, but would still work (even if tuples are > "relaxed" in this way, you can't have cycles with only tuples). 
I don't know if I've precisely understood the change you're proposing, but I do know that in PEAK-Rules I use id() as an approximation for "is" in order to build indexes of various "parameter is some_object" conditions, for various "some_objects" and a given parameter. The rule engine takes id(parameter) at call time and then looks it up to obtain a subset of applicable rules. IIUC, this would require that either "x is y" equates to "id(x)==id(y)", or else that there be some way to determine in advance all the possible id(y)s that are now or would ever be "is x", so they can be placed in the index. Otherwise, any use of an "is" condition would require a linear search of the possibilities, as you could not rule out the possibility that a given x was "is" to *some* y already in the index. Of course, rules using "is" tend to be few and far between, outside of some special cases, and their use with simple integers and strings would be downright silly. And on top of that, I'm not even sure whether the "a <= b" notation you used was meant to signify "a implies b" or "b implies a". ;-) But since you mentioned id()-keyed dictionaries and this is another use of them that I know of, I figured I should at least throw it out there for information's sake, regardless of which side of the issue it lands on. ;-) From tjreedy at udel.edu Tue May 7 00:23:02 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Mon, 06 May 2013 18:23:02 -0400 Subject: [Python-Dev] Fighting the theoretical randomness of "is" on immutables In-Reply-To: References: <20130506152656.279a0a7b@pitrou.net> Message-ID: On 5/6/2013 10:20 AM, Nick Coghlan wrote: > On Mon, May 6, 2013 at 11:26 PM, Antoine Pitrou wrote: >> Le Mon, 6 May 2013 23:18:54 +1000, >> Nick Coghlan a ?crit : >>> We're not going to change the language design because people don't >>> understand the difference between "is" and "==" For sure. 
The definition "The operators is and is not test for object identity: x is y
is true if and only if x and y are the same object. x is not y yields the
inverse truth value. [4]" is clear enough as far as it goes. But perhaps it
should be said that whether or not x and y *are* the same object, in a
particular situation, may depend on the implementation. The footnote [4]
"Due to automatic garbage-collection, free lists, and the dynamic nature of
descriptors, you may notice seemingly unusual behaviour in certain uses of
the is operator, like those involving comparisons between instance methods,
or constants." tells only part of the story, and the less common part at
that.

>>> and then wrongly blame PyPy for breaking their code.

The language definition intentionally leaves 'isness' implementation defined
for number and string operations in order to allow but not require
optimizations. Preserving isness when mixing numbers and strings with
mutable collections is a different issue.

>> Well, if I'm doing:
>>
>> mylist = [x]
>>
>> and ``mylist[0] is x`` returns False, then I pretty much consider the
>> Python implementation to be broken, not my code :-)

If x were constrained to be an int, the comparison would not make much
sense, but part of the essential nature of lists is that x could be
literally any object. So unless False were a documented possibility, I might
be inclined to agree with you, based on CPython precedent.

The situation *is* different with type-limited arrays.

>>> from array import array
>>> x = 1001
>>> myray = array('i', [x])
>>> myray[0] is x
False

I think the possibility of False is implicit in "an object type which can
compactly represent an array of basic values". The later phrase "the type of
objects stored in them is constrained" is incorrectly worded because arrays
store constrained *values*, not *objects* or even object references as lists
do.
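The contrast between the two containers can be put side by side in a short
script. This is a minimal sketch that assumes CPython; the variable names
follow the session above. Only the equality check and the list identity
check are relied on here, because the array identity result is exactly the
implementation-dependent part under discussion.

```python
from array import array

x = 1001                 # assumed to lie outside CPython's small-int cache
mylist = [x]
myray = array('i', [x])

# On CPython a list stores a reference, so indexing hands back the
# very same object that was put in.
print(mylist[0] is x)

# An array stores the raw C value and builds an int object on each access,
# so equality holds but identity is not something to rely on.
print(myray[0] == x)
print(myray[0] is x)     # implementation-dependent; False in the session above
```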
> Yeah, that's a rather good point - I briefly forgot that the trigger
> here was PyPy's specialised single type containers.

Does implicitly replacing or implementing a list with something that is
internally more like CPython arrays than CPython lists (as I understand what
PyPy is doing) violate the language spec? I re-read the doc and I am not
sure.

Sequences are sequences of 'items'. For example:
"s[i] ith item of s, origin 0"
'Items' are not defined, but pragmatically, they can be defined either by
value or identity. Containment is defined in terms of equality, which itself
can be defined in terms of either value or identity. For strings and ranges,
the 'items' are values, not objects. They also are for bytes even though
identity is recovered when objects for all possible byte values are
pre-cached, as in CPython.

'Item' is necessarily left vague for mutable sequences as bytearrays also
store values. The fact that Antoine's example 'works' for bytearrays is an
artifact of the caching, not a language-mandated necessity.

>>> b = bytearray()
>>> b.append(98)
>>> b[0] is 98
True

The definition for lists does not narrow 'item' either.

"Lists are mutable sequences, typically used to store collections of
homogeneous items (where the precise degree of similarity will vary by
application)."

Antoine's opinion would be more supportable if 'item' were replaced by
'object'.

Guido's notion of 'homogeneous' could be interpreted as supporting
specialized 'lists'. On the other hand, I think explicit import, as with the
array module and numarray package, is a better idea. This is especially true
if an implementation intends to be a drop-in replacement for CPython. It
seems to me that Armin's pain comes from trying to be both different and
compatible at the same time.
--
Terry Jan Reedy

From solipsis at pitrou.net Tue May 7 00:34:04 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 7 May 2013 00:34:04 +0200
Subject: [Python-Dev] Fighting the theoretical randomness of "is" on immutables
References: <20130506152656.279a0a7b@pitrou.net>
Message-ID: <20130507003404.6a1312fd@fsol>

On Mon, 06 May 2013 18:23:02 -0400
Terry Jan Reedy wrote:
>
> 'Item' is necessarily left vague for mutable sequences as bytearrays
> also store values. The fact that Antoine's example 'works' for
> bytearrays is an artifact of the caching, not a language-mandated
> necessity.

No, it isn't. You are mixing up values and references. A bytearray or
an array.array may indeed store values, but a list stores references to
objects.

I'm pretty sure that not respecting identity of objects stored in
general-purpose containers would break a *lot* of code out there.

Regards

Antoine.

From ethan at stoneleaf.us Tue May 7 03:26:56 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 06 May 2013 18:26:56 -0700
Subject: [Python-Dev] PEP 435 - ref impl disc 2
In-Reply-To: <518611C8.8070901@g.nevcal.com>
References: <5185FCB1.6030702@g.nevcal.com> <51860872.8020800@stoneleaf.us> <518611C8.8070901@g.nevcal.com>
Message-ID: <51885860.3090702@stoneleaf.us>

On 05/05/2013 01:01 AM, Glenn Linderman wrote:
>
> The bigger problem is that the arithmetic on enumeration items, which seems like it should be inherited from NamedInt
> (and seems to be, because the third value from each print is a NamedInt), doesn't pick up "x" or "y", nor does it pick
> up "the-x" or "the-y", but rather, it somehow picks up the str of the value.

Indeed, the bigger problem is that we ended up having a (NamedInt, Enum)
wrapping a NamedInt, so we had both NEI.x._intname /and/
NEI.x.value._intname, and it was just one big mess.

But I think it is solved. Try the new code.
Here's what your example should look like:

    class NamedInt( int ):
        def __new__( cls, *args, **kwds ):
            _args = args
            name, *args = args
            if len( args ) == 0:
                raise TypeError("name and value must be specified")
            self = int.__new__( cls, *args, **kwds )
            self._intname = name
            return self
        @property
        def __name__( self ):
            return self._intname
        def __repr__( self ):
            # repr() is updated to include the name and type info
            return "{}({!r}, {})".format(type(self).__name__,
                                         self.__name__,
                                         int.__repr__(self))
        def __str__( self ):
            # str() is unchanged, even if it relies on the repr() fallback
            base = int
            base_str = base.__str__
            if base_str.__objclass__ is object:
                return base.__repr__(self)
            return base_str(self)
        # for testing, we only define one operator that propagates expressions
        def __add__(self, other):
            temp = int( self ) + int( other )
            if isinstance( self, NamedInt ) and isinstance( other, NamedInt ):
                return NamedInt(
                    '({0} + {1})'.format(self.__name__, other.__name__),
                    temp )
            else:
                return temp

    class NEI( NamedInt, Enum ):
        x = ('the-x', 1 )
        y = ('the-y', 2 )

    NEI.x + NEI.y

--
~Ethan~

From tjreedy at udel.edu Tue May 7 04:50:55 2013
From: tjreedy at udel.edu (Terry Jan Reedy)
Date: Mon, 06 May 2013 22:50:55 -0400
Subject: [Python-Dev] Fighting the theoretical randomness of "is" on immutables
In-Reply-To: <20130507003404.6a1312fd@fsol>
References: <20130506152656.279a0a7b@pitrou.net> <20130507003404.6a1312fd@fsol>
Message-ID:

On 5/6/2013 6:34 PM, Antoine Pitrou wrote:
> On Mon, 06 May 2013 18:23:02 -0400
> Terry Jan Reedy wrote:
>>
>> 'Item' is necessarily left vague for mutable sequences as bytearrays
>> also store values. The fact that Antoine's example 'works' for
>> bytearrays is an artifact of the caching, not a language-mandated
>> necessity.
>
> No, it isn't.

Yes it is. Look again at the array example.

>>> from array import array
>>> x = 1001
>>> myray = array('i', [x])
>>> myray[0] is x
False

Change 1001 to a cached int value such as 98 and the result is True
instead of False.
For the equivalent bytearray example

>>> b = bytearray()
>>> b.append(98)
>>> b[0] is 98
True

the result is always True *because*, and only because, all byte values are
(now) cached. I believe the test for that is marked as CPython-specific.

> You are mixing up values and references.

No I am not. My whole post was about being careful not to confuse the
two. I noted, however, that the Python *docs* use 'item' to mean either or
both. If you do not like the *doc* being unclear, clarify it.

> A bytearray or an array.array may indeed store values, but a list stores references to
> objects.

I said exactly that in reference to CPython. As far as I know, the same is
true of lists in every other implementation up until PyPy decided to
optimize that away. What I also said is that I cannot read the *current*
doc as guaranteeing that characterization. The reason is that the members of
sequences, mutable sequences, and lists are all described as 'items'. In the
first two cases, 'item' means 'value or object reference'. I see nothing in
the doc to force a reader to change or particularize the meaning of 'item'
in the third case. If I missed something *in the specification*, please
point it out to me.

> I'm pretty sure that not respecting identity of objects stored in
> general-purpose containers would break a *lot* of code out there.

Me too. Hence I suggested that if lists, etc, are intended to respect
identity, with 'is' as currently defined, in any implementation, then the
docs should say so and end the discussion. I would be happy to commit an
approved patch, but I am not in a position to decide the substantive
content. Hence, I tried to provide a neutral analysis that avoided confusing
the CPython implementation with the Python specification.

In my final paragraph, however, I did suggest that PyPy respect precedent,
to avoid breaking existing code and expectations, and call their mutable
sequences something other than 'list'.
--
Terry Jan Reedy

From ethan at stoneleaf.us Tue May 7 04:29:32 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 06 May 2013 19:29:32 -0700
Subject: [Python-Dev] PEP 435: initial values must be specified? Yes
In-Reply-To:
References: <5186BC8E.6030502@stoneleaf.us>
Message-ID: <5188670C.7020203@stoneleaf.us>

On 05/05/2013 02:55 PM, Tim Delaney wrote:
>
> So long as I can get one of the requirements documented to implement an auto-number syntax I'll be happy enough with
> stdlib enums I think.
>
> class Color(AutoIntEnum):
>     red = ...
>     green = ...
>     blue = ...
>

Will this do?

    class AutoNumber(Enum):
        def __new__(cls):
            value = len(cls.__enum_info__) + 1
            obj = object.__new__(cls)
            obj._value = value
            return obj
        def __int__(self):
            return self._value
    class Color(AutoNumber):
        red = ()
        green = ()
        blue = ()

--
~Ethan~

From timothy.c.delaney at gmail.com Tue May 7 04:58:42 2013
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Tue, 7 May 2013 12:58:42 +1000
Subject: [Python-Dev] PEP 435: initial values must be specified? Yes
In-Reply-To: <5188670C.7020203@stoneleaf.us>
References: <5186BC8E.6030502@stoneleaf.us> <5188670C.7020203@stoneleaf.us>
Message-ID:

On 7 May 2013 12:29, Ethan Furman wrote:

> On 05/05/2013 02:55 PM, Tim Delaney wrote:
>
>>
>> So long as I can get one of the requirements documented to implement an
>> auto-number syntax I'll be happy enough with
>> stdlib enums I think.
>>
>> class Color(AutoIntEnum):
>>     red = ...
>>     green = ...
>>     blue = ...
>>
>>
> Will this do?
>
>     class AutoNumber(Enum):
>         def __new__(cls):
>             value = len(cls.__enum_info__) + 1
>             obj = object.__new__(cls)
>             obj._value = value
>             return obj
>         def __int__(self):
>             return self._value
>     class Color(AutoNumber):
>         red = ()
>         green = ()
>         blue = ()

Considering that doesn't actually work with the reference implementation
(AutoNumber.__new__ is never called) ... no.
    print(Color.red._value)
    print(int(Color.red))

---------- Run Python3 ----------
()
Traceback (most recent call last):
  File "D:\home\repos\mercurial\ref435\ref435.py", line 292, in <module>
    print(int(Color.red))
TypeError: __int__ returned non-int (type tuple)

Plus I would not want to use the empty tuple for the purpose - at least ...
implies something ongoing.

Tim Delaney

From v+python at g.nevcal.com Tue May 7 05:35:54 2013
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Mon, 06 May 2013 20:35:54 -0700
Subject: [Python-Dev] PEP 435 - ref impl disc 2
In-Reply-To: <51885860.3090702@stoneleaf.us>
References: <5185FCB1.6030702@g.nevcal.com> <51860872.8020800@stoneleaf.us> <518611C8.8070901@g.nevcal.com> <51885860.3090702@stoneleaf.us>
Message-ID: <5188769A.5030108@g.nevcal.com>

On 5/6/2013 6:26 PM, Ethan Furman wrote:
> On 05/05/2013 01:01 AM, Glenn Linderman wrote:
>>
>> The bigger problem is that the arithmetic on enumeration items, which
>> seems like it should be inherited from NamedInt
>> (and seems to be, because the third value from each print is a
>> NamedInt), doesn't pick up "x" or "y", nor does it pick
>> up "the-x" or "the-y", but rather, it somehow picks up the str of the
>> value.
>
> Indeed, the bigger problem is that we ended up having a (NamedInt,
> Enum) wrapping a NamedInt, so we had both NEI.x._intname /and/
> NEI.x.value._intname, and it was just one big mess.
>
> But I think it is solved. Try the new code. Here's what your example
> should look like:

OK.
I notice you changed some super()s to specific int calls; I think I understand why, having recently reread about the specific problem that super() solves regarding diamond inheritance, and with that understanding, it is clear that super() is not always the right thing to use, particularly when unrelated classes may still have particular methods with name clashes (dunder methods would commonly have the same names in unrelated classes). So my use of super likely contributed to the multiple wrappings that you allude to above, although I haven't (yet) tried to figure out the exact details of how that happened. > > class NamedInt( int ): > def __new__( cls, *args, **kwds ): > _args = args > name, *args = args > if len( args ) == 0: > raise TypeError("name and value must be specified") > self = int.__new__( cls, *args, **kwds ) > self._intname = name > return self > @property > def __name__( self ): > return self._intname > def __repr__( self ): > # repr() is updated to include the name and type info > return "{}({!r}, {})".format(type(self).__name__, > self.__name__, > int.__repr__(self)) > def __str__( self ): > # str() is unchanged, even if it relies on the repr() > fallback > base = int > base_str = base.__str__ > if base_str.__objclass__ is object: > return base.__repr__(self) > return base_str(self) > # for testing, we only define one operator that propagates > expressions > def __add__(self, other): > temp = int( self ) + int( other ) > if isinstance( self, NamedInt ) and isinstance( other, > NamedInt ): > return NamedInt( > '({0} + {1})'.format(self.__name__, other.__name__), > temp ) > else: > return temp > > class NEI( NamedInt, Enum ): > x = ('the-x', 1 ) > y = ('the-y', 2 ) I had tried this sort of constructor, thinking it should work, but couldn't tell that it helped or hindered, but it probably took eliminating the super() problem earlier, and likely your preservation of __new__ in your ref435 changes, to enable this syntax to do what I expected it might. 
This certainly does allow the name definitions to be better grouped, even though still somewhat redundant. It may take a subclass of the enum_type, as Nick was suggesting, to make the NamedInt and the enumeration member actually share a single name... but this (after fleshing out NamedInt with more operators) would be a functional method of producing enumerations for the various flag parameters in the Python API (that are mostly inherited from wrapped C APIs, I suppose... at least, I haven't found a need or benefit of creating flag parameters in new Python APIs that I have created). > NEI.x + NEI.y And this works as expected, now. Can't say I still fully understand the changes, but the test case works, and the constructors for the NamedInt inside the NEI class works, so this is pretty much what I was hoping to be able when I started down this path... but since it found some issues that you were able to fix in ref435, I guess I wasn't totally wasting your time presenting the issue. Thanks for investigating, and fixing, rather than blowing it off, even given my amateurish presentation. And I should have called this NIE, not NEI, because it was intended to stand for NamedIntEnum... but it is just a name, so doesn't affect the functionality. N.B. In your latest ref435.py code, line 105, should be "An Enum class _is_ final..." rather than "in". -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Tue May 7 06:28:52 2013 From: greg at krypto.org (Gregory P. 
Smith) Date: Mon, 6 May 2013 21:28:52 -0700 Subject: [Python-Dev] Fighting the theoretical randomness of "is" on immutables In-Reply-To: References: <20130506152656.279a0a7b@pitrou.net> <20130507003404.6a1312fd@fsol> Message-ID: On Mon, May 6, 2013 at 7:50 PM, Terry Jan Reedy wrote: > On 5/6/2013 6:34 PM, Antoine Pitrou wrote: > >> On Mon, 06 May 2013 18:23:02 -0400 >> Terry Jan Reedy wrote: >> >>> >>> 'Item' is necessarily left vague for mutable sequences as bytearrays >>> also store values. The fact that Antoine's example 'works' for >>> bytearrays is an artifact of the caching, not a language-mandated >>> necessity. >>> >> >> No, it isn't. >> > > Yes it is. Look again at the array example. > > >>> from array import array > >>> x = 1001 > >>> myray = array('i', [x]) > >>> myray[0] is x > False > > Change 1001 to a cached int value such as 98 and the result is True > instead of False. For the equivalent bytearray example > > > >>> b = bytearray() > >>> b.append(98) > >>> b[0] is 98 > True > > the result is always True *because*, and only because, all byte value are > (now) cached. I believe the test for that is marked as CPython-specific. > > > > You are mixing up values and references. > > No I am not. My whole post was about being careful to not to confuse the > two. I noted, however, that the Python *docs* use 'item' to mean either or > both. If you do not like the *doc* being unclear, clarify it. > > > A bytearray or a array.array may indeed store values, but a list stores >> references to >> objects. >> > > I said exactly that in reference to CPython. As far as I know, the same is > true of lists in every other implementation up until Pypy decided to > optimize that away. What I also said is that I cannot read the *current* > doc as guaranteeing that characterization. The reason is that the members > of sequences, mutable sequences, and lists are all described as 'items'. In > the first two cases, 'item' means 'value or object reference'. 
I see > nothing in the doc to force a reader to change or particularized the > meaning of 'item' in the third case. If I missed something *in the > specification*, please point it out to me. > > > I'm pretty sure that not respecting identity of objects stored in >> general-purpose containers would break a *lot* of code out there. >> > > Me too. Hence I suggested that if lists, etc, are intended to respect > identity, with 'is' as currently defined, in any implementation, then the > docs should say so and end the discussion. I would be happy to commit an > approved patch, but I am not in a position to decide the substantive > content. Hence, I tried to provide a neutral analysis that avoided > confusing the CPython implementation with the Python specification. > > In my final paragraph, however, I did suggest that Pypy respect precedent, > to avoid breaking existing code and expectations, and call their mutable > sequences something other than 'list'. > Wouldn't the entire point of such things existing in pypy be that the implementation is irrelevant to the user and used behind the scenes automatically in the common case when a container is determined to fit the special constraint? I personally do not think we should guarantee that "mylist[0] = x; assert x is mylist[0]" succeeds when x is an immutable type other than None. If something is immutable and not intended to be a singleton and does not define equality (like None or sentinel values commonly tested using is such as arbitrary object() instances) it needs to be up to the language VM to determine when to copy or not in most situations. You already gave the example of the interned small integers in CPython. String constants and names used in code are also interned in today's CPython implementation. This doesn't tend to trip any real code up. -gps -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Tue May 7 06:54:35 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 06 May 2013 21:54:35 -0700 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: References: <5186BC8E.6030502@stoneleaf.us> <5188670C.7020203@stoneleaf.us> Message-ID: <5188890B.9090904@stoneleaf.us> On 05/06/2013 07:58 PM, Tim Delaney wrote: > > Considering that doesn't actually work with the reference implementation (AutoNumber.__new__ is never called) ... no. Two points: 1) Did you grab the latest code? That exact implementation passes in the tests. 2) You can write your __new__ however you want -- use ... ! ;) -- ~Ethan~ From v+python at g.nevcal.com Tue May 7 06:58:39 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Mon, 06 May 2013 21:58:39 -0700 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: References: <5186BC8E.6030502@stoneleaf.us> <5188670C.7020203@stoneleaf.us> Message-ID: <518889FF.4050805@g.nevcal.com> On 5/6/2013 7:58 PM, Tim Delaney wrote: > On 7 May 2013 12:29, Ethan Furman > wrote: > > On 05/05/2013 02:55 PM, Tim Delaney wrote: > > > So long as I can get one of the requirements documented to > implement an auto-number syntax I'll be happy enough with > stdlib enums I think. > > class Color(AutoIntEnum): > red = ... > green = ... > blue = ... > > > Will this do? > > class AutoNumber(Enum): > def __new__(cls): > value = len(cls.__enum_info__) + 1 > obj = object.__new__(cls) > obj._value = value > return obj > def __int__(self): > return self._value > class Color(AutoNumber): > red = () > green = () > blue = () > > > Considering that doesn't actually work with the reference > implementation (AutoNumber.__new__ is never called) ... no. 
Maybe you should have tried with the latest version of the reference
implementation, where Ethan kindly fixed the reference implementation to
work better with NamedInt (per my thread "ref impl disc 2") and
apparently also with the above class's __new__...

>
> print(Color.red._value)
> print(int(Color.red))
>
> ---------- Run Python3 ----------
> ()
> Traceback (most recent call last):
>   File "D:\home\repos\mercurial\ref435\ref435.py", line 292, in <module>
>     print(int(Color.red))
> TypeError: __int__ returned non-int (type tuple)
>
> Plus I would not want to use the empty tuple for the purpose - at
> least ... implies something ongoing.

Why not? For classes derived from Enum, having __new__, the value/tuple
assigned to the enumeration member becomes the set of parameters to
__new__... so why would you want to provide a parameter? Well, you
could, with a minor tweak. If you don't like Ethan's AutoNumber class,
you can now write your own, like the following one that I derived from
his, but to use your preferred ...

class AutoNumber(Enum):
    def __new__(cls, parm):
        obj = object.__new__(cls)
        if parm is ...:
            value = len(cls.__enum_info__) + 1
            obj._value = value
        else:
            obj._value = parm
        return obj
    def __int__(self):
        return self._value

class Color(AutoNumber):
    red = ...
    green = ...
    blue = 7
    purple = ...

print ( Color.red, repr( Color.red ))
print ( Color.green, repr( Color.green ))
print ( Color.blue, repr( Color.blue ))
print ( Color.purple, repr( Color.purple ))

Since you want to provide a parameter, I decided in my example
AutoNumber class that I would use ... as a flag to use his count, and
anything else would be an actual value for the enumeration member. You
could do whatever else you like, of course, should you write your own,
including using someone's suggested itertools.count()

From timothy.c.delaney at gmail.com Tue May 7 07:16:00 2013
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Tue, 7 May 2013 15:16:00 +1000
Subject: [Python-Dev] PEP 435: initial values must be specified? Yes
In-Reply-To: 
References: <5186BC8E.6030502@stoneleaf.us> <5188670C.7020203@stoneleaf.us> <5188890B.9090904@stoneleaf.us>
Message-ID: 

On 7 May 2013 15:14, Tim Delaney wrote:
> D'oh! I had my default path being my forked repo ... so didn't see the
> changes. BTW I can't see how that exact implementation passes ... not
> enough parameters declared in AutoNumber.__new__ ...
>

Sorry - my fault again - I'd already changed () to ...

Tim Delaney

From ethan at stoneleaf.us Tue May 7 06:52:30 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 06 May 2013 21:52:30 -0700
Subject: [Python-Dev] PEP 435 - ref impl disc 2
In-Reply-To: <5188769A.5030108@g.nevcal.com>
References: <5185FCB1.6030702@g.nevcal.com> <51860872.8020800@stoneleaf.us> <518611C8.8070901@g.nevcal.com> <51885860.3090702@stoneleaf.us> <5188769A.5030108@g.nevcal.com>
Message-ID: <5188888E.4090702@stoneleaf.us>

On 05/06/2013 08:35 PM, Glenn Linderman wrote:
>
> N.B. In your latest ref435.py code, line 105, should be "An Enum class
> _is_ final..." rather than "in".

Thanks, fixed.

--
~Ethan~

From timothy.c.delaney at gmail.com Tue May 7 07:18:41 2013
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Tue, 7 May 2013 15:18:41 +1000
Subject: [Python-Dev] PEP 435: initial values must be specified? Yes
In-Reply-To: 
References: <5186BC8E.6030502@stoneleaf.us> <5188670C.7020203@stoneleaf.us> <5188890B.9090904@stoneleaf.us>
Message-ID: 

On 7 May 2013 15:14, Tim Delaney wrote:
> Unfortunately, if you subclass AutoNumber from IntEnum it breaks.
> > ---------- Run Python3 ---------- > Traceback (most recent call last): > File "D:\home\repos\mercurial\ref435\ref435.py", line 346, in > class Color(AutoNumber): > File "D:\home\repos\mercurial\ref435\ref435.py", line 184, in __new__ > enum_item = __new__(enum_class, *args) > TypeError: int() argument must be a string or a number, not 'ellipsis' > Or using your exact implementation, but subclassing AutoNumber from IntEnum: class AutoNumber(IntEnum): def __new__(cls): value = len(cls.__enum_info__) + 1 obj = object.__new__(cls) obj._value = value return obj def __int__(self): return self._value class Color(AutoNumber): red = () green = () blue = () print(repr(Color.red)) ---------- Run Python3 ---------- Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothy.c.delaney at gmail.com Tue May 7 07:14:25 2013 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Tue, 7 May 2013 15:14:25 +1000 Subject: [Python-Dev] PEP 435: initial values must be specified? Yes In-Reply-To: <5188890B.9090904@stoneleaf.us> References: <5186BC8E.6030502@stoneleaf.us> <5188670C.7020203@stoneleaf.us> <5188890B.9090904@stoneleaf.us> Message-ID: On 7 May 2013 14:54, Ethan Furman wrote: > On 05/06/2013 07:58 PM, Tim Delaney wrote: > >> >> Considering that doesn't actually work with the reference implementation >> (AutoNumber.__new__ is never called) ... no. >> > > Two points: > > 1) Did you grab the latest code? That exact implementation passes in > the tests. > D'oh! I had my default path being my forked repo ... so didn't see the changes. BTW I can't see how that exact implementation passes ... not enough parameters declared in AutoNumber.__new__ ... > 2) You can write your __new__ however you want -- use ... ! 
;)

class AutoNumber(Enum):
    def __new__(cls, value):
        if value is Ellipsis:
            try:
                value = cls._auto_number
            except AttributeError:
                value = cls._auto_number = 0
        else:
            cls._auto_number = int(value)
        obj = object.__new__(cls)
        obj._value = value
        cls._auto_number += 1
        return obj
    def __int__(self):
        return self._value

class Color(AutoNumber):
    red = ...
    green = 3
    blue = ...

print(repr(Color.red))
print(repr(Color.green))
print(repr(Color.blue))

---------- Run Python3 ----------

Unfortunately, if you subclass AutoNumber from IntEnum it breaks.

---------- Run Python3 ----------
Traceback (most recent call last):
  File "D:\home\repos\mercurial\ref435\ref435.py", line 346, in <module>
    class Color(AutoNumber):
  File "D:\home\repos\mercurial\ref435\ref435.py", line 184, in __new__
    enum_item = __new__(enum_class, *args)
TypeError: int() argument must be a string or a number, not 'ellipsis'

I would probably also suggest 2 changes:

1. Set enum_item._name before calling enum_item.__init__.

2. Don't pass any arguments to enum_item.__init__ - the value should be
set in enum_item.__new__.

Tim Delaney

From solipsis at pitrou.net Tue May 7 08:25:57 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 7 May 2013 08:25:57 +0200
Subject: [Python-Dev] Fighting the theoretical randomness of "is" on immutables
References: <20130506152656.279a0a7b@pitrou.net> <20130507003404.6a1312fd@fsol>
Message-ID: <20130507082557.28daa804@fsol>

On Mon, 06 May 2013 22:50:55 -0400
Terry Jan Reedy wrote:
> >
> > A bytearray or a array.array may indeed store values, but a list
> > stores references to objects.
>
> I said exactly that in reference to CPython. As far as I know, the same
> is true of lists in every other implementation up until Pypy decided to
> optimize that away. What I also said is that I cannot read the *current*
> doc as guaranteeing that characterization.
In the absence of more precise specification, the reference is IMO the reference interpreter, a.k.a. CPython, and its behaviour is more than well-known and stable over time here. > > I'm pretty sure that not respecting identity of objects stored in > > general-purpose containers would break a *lot* of code out there. > > Me too. Hence I suggested that if lists, etc, are intended to respect > identity, with 'is' as currently defined, in any implementation, then > the docs should say so and end the discussion. I would be happy to > commit an approved patch, but I am not in a position to decide the > substantive content. For me, a patch that mandated general-purpose containers (list, dict, etc.) respect object identity would be ok. Regards Antoine. From victor.stinner at gmail.com Tue May 7 09:34:45 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 7 May 2013 09:34:45 +0200 Subject: [Python-Dev] All 3.x stable buildbots are red Message-ID: http://buildbot.python.org/all/waterfall?category=3.x.stable x86 Windows Server 2003 [SB] 3.x: 3 tests failed, test___all__ test_gc test_ssl x86 Windows7 3.x: 3 tests failed, test___all__ test_gc test_ssl x86 Gentoo Non-Debug 3.x: 3 tests failed, test_logging test_multiprocessing test_urllib2net x86 Gentoo 3.x: 2 tests failed, test_logging test_urllib2net x86 Ubuntu Shared 3.x: 1 test failed, test_logging AMD64 Windows7 SP1 3.x: 4 tests failed, test___all__ test_gc test_logging test_ssl AMD64 OpenIndiana 3.x: 1 test failed, test_logging AMD64 Ubuntu LTS 3.x: 1 test failed, test_logging AMD64 FreeBSD 9.0 3.x: 1 test failed, test_logging Can someone please look at these failures? (I don't have time for this right now.) 
Victor

From arigo at tunes.org Tue May 7 10:27:42 2013
From: arigo at tunes.org (Armin Rigo)
Date: Tue, 7 May 2013 10:27:42 +0200
Subject: [Python-Dev] Fighting the theoretical randomness of "is" on immutables
In-Reply-To: <20130507082557.28daa804@fsol>
References: <20130506152656.279a0a7b@pitrou.net> <20130507003404.6a1312fd@fsol> <20130507082557.28daa804@fsol>
Message-ID: 

Hi Antoine,

On Tue, May 7, 2013 at 8:25 AM, Antoine Pitrou wrote:
> For me, a patch that mandated general-purpose containers (list, dict,
> etc.) respect object identity would be ok.

Thanks, that's also my opinion.

In PyPy's approach, in trying to emulate CPython vs. trying to
convince users that "is" is sometimes a bad idea, we might eventually
end up at the extreme side, which can be seen as where CPython would
be if it cached *all* ints, longs, floats, complexes, strings,
unicodes and tuples.

The original question in this thread was about if it's ok for two
objects x and y to satisfy "x is y" while at the same time "id(x) !=
id(y)". I think by now that it would only create more confusion (even
if only in some very special cases). We'll continue to maintain the
invariant then, and if it requires creating extremely large values for
id(), too bad. (1)

A bientôt,

Armin.

(1) the Jython approach of caching the id's is not applicable here:
the objects whose id are hard to get are precisely those that don't
have a long-living representation as object in memory. You can't
cache an id with a key that is, say, a double-word "long" --- if this
double-word is not an object, but merely a value, it can't be used as
key in a weakdict. You don't have a way of knowing when you can
remove it from the cache.

From ethan at stoneleaf.us Tue May 7 11:49:47 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 07 May 2013 02:49:47 -0700
Subject: [Python-Dev] PEP 435: initial values must be specified?
Yes In-Reply-To: References: <5186BC8E.6030502@stoneleaf.us> <5188670C.7020203@stoneleaf.us> <5188890B.9090904@stoneleaf.us> Message-ID: <5188CE3B.9020503@stoneleaf.us> On 05/06/2013 10:18 PM, Tim Delaney wrote: > On 7 May 2013 15:14, Tim Delaney > wrote: > > Unfortunately, if you subclass AutoNumber from IntEnum it breaks. > > ---------- Run Python3 ---------- > Traceback (most recent call last): > File "D:\home\repos\mercurial\ref435\ref435.py", line 346, in > class Color(AutoNumber): > File "D:\home\repos\mercurial\ref435\ref435.py", line 184, in __new__ > enum_item = __new__(enum_class, *args) > TypeError: int() argument must be a string or a number, not 'ellipsis' > > > Or using your exact implementation, but subclassing AutoNumber from IntEnum: > > class AutoNumber(IntEnum): > def __new__(cls): > value = len(cls.__enum_info__) + 1 > obj = object.__new__(cls) > obj._value = value > return obj > def __int__(self): > return self._value > class Color(AutoNumber): > red = () > green = () > blue = () > > print(repr(Color.red)) > > ---------- Run Python3 ---------- > Thanks for the test case. It now passes. -- ~Ethan~ From eliben at gmail.com Tue May 7 15:34:50 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 7 May 2013 06:34:50 -0700 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API Message-ID: One of the contended issues with PEP 435 on which Guido pronounced was the functional API, that allows created enumerations dynamically in a manner similar to namedtuple: Color = Enum('Color', 'red blue green') The biggest complaint reported against this API is interaction with pickle. As promised, I want to discuss here how we're going to address this concern. At this point, the pickle docs say that module-top-level classes can be pickled. This obviously works for the normal Enum classes, but is a problem with the functional API because the class is created dynamically and has no __module__. 
To solve this, the reference implementation uses the same approach as
namedtuple (*). In the metaclass's __new__ (this is an excerpt, the real
code has some safeguards):

    module_name = sys._getframe(1).f_globals['__name__']
    enum_class.__module__ = module_name

According to an earlier discussion, this works on CPython, PyPy and
Jython, but not on IronPython. The alternative that works everywhere is to
define the Enum like this:

    Color = Enum('the_module.Color', 'red blue green')

The reference implementation supports this as well.

Some points for discussion:

1) We can say that using the functional API when pickling can happen is
not recommended, but maybe a better way would be to just explain the way
things are and let users decide?

2) namedtuple should also support the fully qualified name syntax. If this
is agreed upon, I can create an issue.

3) Antoine mentioned that work is being done in 3.4 to enable pickling of
nested classes (http://www.python.org/dev/peps/pep-3154/). If that gets
implemented, I don't see a reason why Enum and namedtuple can't be
adjusted to find the __qualname__ of the class they're internal to. Am I
missing something?

4) Using _getframe(N) here seems like overkill to me. What we really need
is just the module in which execution currently is (i.e. the metaclass's
__new__ in our case). Would it make sense to add a new function somewhere
in the stdlib of 3.4 (in sys or inspect or ...) that just provides the
current module name? It seems that all Pythons should be able to easily
provide it; it's certainly a very small subset of the functionality
provided by walking the callframe stack. This function can then be used to
build fully qualified names for pickling of Enum and namedtuple. Moreover,
it can be useful even more widely - dynamic class building is quite common
in Python code, and as Nick mentioned somewhere earlier, the extra power of
metaclasses in the recent 3.x's will probably make it even more common.
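
As a rough illustration of the two options discussed above (frame
inspection versus a fully qualified name), here is a minimal sketch. It
uses a plain class factory rather than the reference implementation's
metaclass; make_class and the demo_mod module are hypothetical names, and
the module is synthesized only so that the example is self-contained.

```python
import pickle
import sys
import types


def make_class(qualified_name):
    """Build a class dynamically and record which module it belongs to.

    Hypothetical helper, not the ref435 code. 'pkg.mod.Name' takes the
    module from the string; a bare 'Name' falls back to frame inspection.
    """
    module_name, _, class_name = qualified_name.rpartition('.')
    if not module_name:
        # Frame inspection: use the module of whoever called make_class().
        module_name = sys._getframe(1).f_globals.get('__name__', '__main__')
    cls = type(class_name, (), {})
    cls.__module__ = module_name
    return cls


# Stand-in for a real importable module (an assumption of this sketch).
demo_mod = types.ModuleType('demo_mod')
sys.modules['demo_mod'] = demo_mod

# Explicit form: pickle can locate the class because the module is spelled
# out and the class is bound to the same name inside that module.
demo_mod.Color = make_class('demo_mod.Color')
restored = pickle.loads(pickle.dumps(demo_mod.Color()))

# Implicit form: the module is whatever the calling frame says it is.
Local = make_class('Local')
```

The implicit form only gives the right answer when make_class() is called
directly from the module that binds the name; the explicit form does not
depend on the call stack at all.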
Eli

(*) namedtuple uses an explicit function to build the resulting class, not
a metaclass (there's no class syntax for namedtuple).

From ncoghlan at gmail.com Tue May 7 16:33:07 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 8 May 2013 00:33:07 +1000
Subject: [Python-Dev] Fighting the theoretical randomness of "is" on immutables
In-Reply-To: 
References: <20130506152656.279a0a7b@pitrou.net> <20130507003404.6a1312fd@fsol> <20130507082557.28daa804@fsol>
Message-ID: 

On Tue, May 7, 2013 at 6:27 PM, Armin Rigo wrote:
> Hi Antoine,
>
> On Tue, May 7, 2013 at 8:25 AM, Antoine Pitrou wrote:
>> For me, a patch that mandated general-purpose containers (list, dict,
>> etc.) respect object identity would be ok.
>
> Thanks, that's also my opinion.
>
> In PyPy's approach, in trying to emulate CPython vs. trying to
> convince users that "is" is sometimes a bad idea, we might eventually
> end up at the extreme side, which can be seen as where CPython would
> be if it cached *all* ints, longs, floats, complexes, strings,
> unicodes and tuples.
>
> The original question in this thread was about if it's ok for two
> objects x and y to satisfy "x is y" while at the same time "id(x) !=
> id(y)". I think by now that it would only create more confusion (even
> if only in some very special cases). We'll continue to maintain the
> invariant then, and if it requires creating extremely large values for
> id(), too bad. (1)

Yeah, I've been trying to come up with a way to phrase the end result
that doesn't make my brain hurt, but I've mostly failed. The details
below are the closest I've come to something that makes sense to me.

With equality, the concepts of hashing and value have a clear
relationship: x == y implies hash(x) == hash(y), but there's no
implication going in the other direction.
Even if the hashes are the same, the values may be different (you can have hash(x) == hash(y) without having x == y). NaN's aside, you also have the relationship that x is y implies x == y (and the standard containers assume this). Again, there's no implication in the other direction. Two objects may be equal, while having different identities (such as 0 == 0.0 == 0j) The definition of object identity is that x is y implies id(x) == id(y) *and* vice-versa. The suggested change would actually involving defining a *new* language concept, a "reference id", where ref_id(x) == ref_id(y) implies x is y, but the reverse is not true. Thus, this is actually a suggestion for two changes rolled into one: 1. Add the "reference id" concept 2. Change the id() builtin to return the reference id rather than the object id I think the array.array solution is a more tolerable one: provide explicit value based containers that are known not to be identity preserving. If you want maximum speed, you have to be prepared to deal with the difference in semantics. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From duda.piotr at gmail.com Tue May 7 16:48:26 2013 From: duda.piotr at gmail.com (Piotr Duda) Date: Tue, 7 May 2013 16:48:26 +0200 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: References: Message-ID: 2013/5/7 Eli Bendersky : > > 4) Using _getframe(N) here seems like an overkill to me. What we really need > is just the module in which the current execution currently is (i.e. the > metaclass's __new__ in our case). Would it make sense to add a new function > somewhere in the stdlib of 3.4 (in sys or inspect or ...) that just provides > the current module name? It seems that all Pythons should be able to easily > provide it, it's certainly a very small subset of the functionality provided > by walking the callframe stack. 
> This function can then be used for build
> fully qualified names for pickling of Enum and namedtuple. Moreover, it can
> be general even more widely - dynamic class building is quite common in
> Python code, and as Nick mentioned somewhere earlier, the extra power of
> metaclasses in the recent 3.x's will probably make it even more common.

What about adding simple syntax (I proposed this earlier, but no one
commented) that take care of assigning name and module, something
like:

def name = expression

which would be rough equivalent for:

name = expression
name.__name__ = 'name'
name.__module__ = __name__

From ethan at stoneleaf.us Tue May 7 16:53:25 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 07 May 2013 07:53:25 -0700
Subject: [Python-Dev] PEP 435: pickling enums created with the functional API
In-Reply-To: 
References: 
Message-ID: <51891565.2010107@stoneleaf.us>

On 05/07/2013 07:48 AM, Piotr Duda wrote:
>
> What about adding simple syntax (I proposed this earlier, but no one
> commented) that take care of assigning name and module, something
> like:
>
> def name = expression
>
> which would be rough equivalent for:
>
> name = expression
> name.__name__ = 'name'
> name.__module__ = __name__

How is that different from

--> name = Enum('module.name', ... )

?
-- ~Ethan~ From duda.piotr at gmail.com Tue May 7 17:01:44 2013 From: duda.piotr at gmail.com (Piotr Duda) Date: Tue, 7 May 2013 17:01:44 +0200 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: <51891565.2010107@stoneleaf.us> References: <51891565.2010107@stoneleaf.us> Message-ID: 2013/5/7 Ethan Furman : > On 05/07/2013 07:48 AM, Piotr Duda wrote: >> >> >> What about adding simple syntax (I proposed this earlier, but no one >> commented) that take care of assigning name and module, something >> like: >> >> def name = expression >> >> which would be rough equivalent for: >> >> name = expression >> name.__name__ = 'name' >> name.__module__ = __name__ > > > How is that different from > > --> name = Enum('module.name', ... ) > > ? It's DRY. -- ???????? ?????? From ncoghlan at gmail.com Tue May 7 17:03:38 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 8 May 2013 01:03:38 +1000 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: References: Message-ID: On Tue, May 7, 2013 at 11:34 PM, Eli Bendersky wrote: > One of the contended issues with PEP 435 on which Guido pronounced was the > functional API, that allows created enumerations dynamically in a manner > similar to namedtuple: > > Color = Enum('Color', 'red blue green') > > The biggest complaint reported against this API is interaction with pickle. > As promised, I want to discuss here how we're going to address this concern. > > At this point, the pickle docs say that module-top-level classes can be > pickled. This obviously works for the normal Enum classes, but is a problem > with the functional API because the class is created dynamically and has no > __module__. > > To solve this, the reference implementation is used the same approach as > namedtuple (*). 
In the metaclass's __new__ (this is an excerpt, the real > code has some safeguards): > > module_name = sys._getframe(1).f_globals['__name__'] > enum_class.__module__ = module_name > > According to an earlier discussion, this is works on CPython, PyPy and > Jython, but not on IronPython. The alternative that works everywhere is to > define the Enum like this: > > Color = Enum('the_module.Color', 'red blue green') > > The reference implementation supports this as well. > > Some points for discussion: > > 1) We can say that using the functional API when pickling can happen is not > recommended, but maybe a better way would be to just explain the way things > are and let users decide? It's probably worth creating a section in the pickle docs and explaining the vagaries of naming things and the dependency on knowing the module name. The issue comes up with defining classes in __main__ and when implementing pseudo-modules as well (see PEP 395). > 2) namedtuple should also support the fully qualified name syntax. If this > is agreed upon, I can create an issue. Yes, I think that part should be done. > 3) Antoine mentioned that work is being done in 3.4 to enable pickling of > nested classes (http://www.python.org/dev/peps/pep-3154/). If that gets > implemented, I don't see a reason why Enum and namedtuple can't be adjusted > to find the __qualname__ of the class they're internal to. Am I missing > something? The class based form should still work (assuming only classes are involved), the stack inspection will likely fail. > 4) Using _getframe(N) here seems like an overkill to me. It's not just overkill, it's fragile - it only works if you call the constructor directly. If you use a convenience function in a utility module, it will try to load your pickles from there rather than wherever you bound the name. > What we really need > is just the module in which the current execution currently is (i.e. the > metaclass's __new__ in our case). 
Would it make sense to add a new function > somewhere in the stdlib of 3.4 (in sys or inspect or ...) that just provides > the current module name? It seems that all Pythons should be able to easily > provide it, it's certainly a very small subset of the functionality provided > by walking the callframe stack. This function can then be used for build > fully qualified names for pickling of Enum and namedtuple. Moreover, it can > be general even more widely - dynamic class building is quite common in > Python code, and as Nick mentioned somewhere earlier, the extra power of > metaclasses in the recent 3.x's will probably make it even more common. Yes, I've been thinking along these lines myself, although in a slightly more expanded form that also touches on the issues that stalled PEP 406 (the import engine API that tries to better encapsulate the import state). It may also potentially address some issues with initialisation of C extensions (I don't remember the exact details off the top of my head, but there's some info we want to get from the import machinery to modules initialised from Cython, but the loader API and the C module initialisation API both get in the way). Specifically, what I'm talking about is some kind of implicit context similar to the approach the decimal module uses to control operations on Decimal instances. In this case, what we're trying to track is the "active module", either __main__ (if the code has been triggered directly through an operation in that module), or else the module currently being imported (if the import machinery has been invoked). The bare minimum would just be to store the __name__ (using sys.modules to get access to the full module if needed) in a way that adequately handles nested, circular and threaded imports, but there may be a case for tracking a richer ModuleContext object instead. However, there's also a separate question of whether implicitly tracking the active module is really what we want. 
Do we want that, or is what we actually want the ability to define an arbitrary "naming context" in order to use functional APIs to construct classes without losing the pickle integration of class statements? What if there was a variant of the class statement that bound the result of a function call rather than using the normal syntax: class Animal from enum.Enum(members="dog cat bear") And it was only class statements in that form which manipulated the naming context? (you could also use the def keyword rather than class) Either form would essentially be an ordinary assignment statement, *except* that they would manipulate the naming context to record the name being bound *and* relevant details of the active module. Regardless, I think the question is not really well enough defined to be a topic for python-dev, even though it came up in a python-dev discussion - it's more python-ideas territory. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Tue May 7 17:06:29 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 8 May 2013 01:06:29 +1000 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: <51891565.2010107@stoneleaf.us> References: <51891565.2010107@stoneleaf.us> Message-ID: On Wed, May 8, 2013 at 12:53 AM, Ethan Furman wrote: > On 05/07/2013 07:48 AM, Piotr Duda wrote: >> >> >> What about adding simple syntax (I proposed this earlier, but no one >> commented) that take care of assigning name and module, something >> like: >> >> def name = expression >> >> which would be rough equivalent for: >> >> name = expression >> name.__name__ = 'name' >> name.__module__ = __name__ > > > How is that different from > > --> name = Enum('module.name', ... ) With the repetition, you're setting yourself up for bugs in future maintenance when either the module name or the assigned name change. 
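
To make that maintenance hazard concrete, here is a small sketch under
stated assumptions: the shapes module is a hypothetical stand-in created
on the fly, and a bare type() call stands in for any functional API that
takes the class name as a string. It also shows the after-the-fact
assignment from the quoted expansion, with __qualname__ added because that
is the attribute pickle consults on Python 3.4 and later.

```python
import pickle
import sys
import types

# Stand-in for a real importable module (hypothetical name).
shapes = types.ModuleType('shapes')
sys.modules['shapes'] = shapes

# The class was created as 'Colour' but is bound as 'Color': the string
# and the assignment target have drifted apart.
shapes.Color = type('Colour', (), {})
shapes.Color.__module__ = 'shapes'

try:
    pickle.dumps(shapes.Color())
    drifted_ok = True
except (pickle.PicklingError, AttributeError):
    # pickle looks for shapes.Colour, which does not exist.
    drifted_ok = False

# Repair the names after the fact, in the spirit of the expansion above.
shapes.Color.__name__ = 'Color'
shapes.Color.__qualname__ = 'Color'
fixed = pickle.loads(pickle.dumps(shapes.Color()))
```

Once the recorded name matches the binding again, the round trip succeeds.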
I like Piotr's suggestion of simply assigning to __name__ and
__module__ after the fact, though - much simpler than my naming
context idea.

Cheers,
Nick.

From solipsis at pitrou.net Tue May 7 17:24:02 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 7 May 2013 17:24:02 +0200
Subject: [Python-Dev] PEP 435: pickling enums created with the functional API
References: 
Message-ID: <20130507172402.38d0755d@pitrou.net>

On Wed, 8 May 2013 01:03:38 +1000,
Nick Coghlan wrote:
>
> What if there was a variant of the class statement that bound the
> result of a function call rather than using the normal syntax:
>
> class Animal from enum.Enum(members="dog cat bear")

Apparently you're trying hard to invent syntaxes just to avoid
subclassing.

Regards

Antoine.

From ethan at stoneleaf.us Tue May 7 17:07:58 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 07 May 2013 08:07:58 -0700
Subject: [Python-Dev] PEP 435: pickling enums created with the functional API
In-Reply-To: 
References: <51891565.2010107@stoneleaf.us>
Message-ID: <518918CE.2010402@stoneleaf.us>

On 05/07/2013 08:01 AM, Piotr Duda wrote:
> 2013/5/7 Ethan Furman :
>> On 05/07/2013 07:48 AM, Piotr Duda wrote:
>>>
>>>
>>> What about adding simple syntax (I proposed this earlier, but no one
>>> commented) that take care of assigning name and module, something
>>> like:
>>>
>>> def name = expression
>>>
>>> which would be rough equivalent for:
>>>
>>> name = expression
>>> name.__name__ = 'name'
>>> name.__module__ = __name__
>>
>>
>> How is that different from
>>
>> --> name = Enum('module.name', ... )
>>
>> ?
>
> It's DRY.

How? You need to provide a complete example: Do you mean something like:

--> def mymodule.Color('red green blue')

?
-- ~Ethan~ From duda.piotr at gmail.com Tue May 7 17:35:11 2013 From: duda.piotr at gmail.com (Piotr Duda) Date: Tue, 7 May 2013 17:35:11 +0200 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: <518918CE.2010402@stoneleaf.us> References: <51891565.2010107@stoneleaf.us> <518918CE.2010402@stoneleaf.us> Message-ID: 2013/5/7 Ethan Furman : > On 05/07/2013 08:01 AM, Piotr Duda wrote: >> >> 2013/5/7 Ethan Furman : >>> >>> On 05/07/2013 07:48 AM, Piotr Duda wrote: >>>> >>>> >>>> >>>> What about adding simple syntax (I proposed this earlier, but no one >>>> commented) that take care of assigning name and module, something >>>> like: >>>> >>>> def name = expression >>>> >>>> which would be rough equivalent for: >>>> >>>> name = expression >>>> name.__name__ = 'name' >>>> name.__module__ = __name__ >>> >>> >>> >>> How is that different from >>> >>> --> name = Enum('module.name', ... ) >>> >>> ? >> >> >> It's DRY. > > > How? You need to provide a complete example: > > Do you mean something like: > > --> def mymodule.Color('red green blue') > def Color = Enum('red green blue') -- ???????? ?????? From eliben at gmail.com Tue May 7 17:44:46 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 7 May 2013 08:44:46 -0700 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: References: Message-ID: On Tue, May 7, 2013 at 8:03 AM, Nick Coghlan wrote: > On Tue, May 7, 2013 at 11:34 PM, Eli Bendersky wrote: > > One of the contended issues with PEP 435 on which Guido pronounced was > the > > functional API, that allows created enumerations dynamically in a manner > > similar to namedtuple: > > > > Color = Enum('Color', 'red blue green') > > > > The biggest complaint reported against this API is interaction with > pickle. > > As promised, I want to discuss here how we're going to address this > concern. > > > > At this point, the pickle docs say that module-top-level classes can be > > pickled. 
This obviously works for the normal Enum classes, but is a > problem > > with the functional API because the class is created dynamically and has > no > > __module__. > > > > To solve this, the reference implementation is used the same approach as > > namedtuple (*). In the metaclass's __new__ (this is an excerpt, the real > > code has some safeguards): > > > > module_name = sys._getframe(1).f_globals['__name__'] > > enum_class.__module__ = module_name > > > > According to an earlier discussion, this is works on CPython, PyPy and > > Jython, but not on IronPython. The alternative that works everywhere is > to > > define the Enum like this: > > > > Color = Enum('the_module.Color', 'red blue green') > > > > The reference implementation supports this as well. > > > > Some points for discussion: > > > > 1) We can say that using the functional API when pickling can happen is > not > > recommended, but maybe a better way would be to just explain the way > things > > are and let users decide? > > It's probably worth creating a section in the pickle docs and > explaining the vagaries of naming things and the dependency on knowing > the module name. The issue comes up with defining classes in __main__ > and when implementing pseudo-modules as well (see PEP 395). > > Any pickle-expert volunteers to do this? I guess we can start by creating a documentation issue. > > 2) namedtuple should also support the fully qualified name syntax. If > this > > is agreed upon, I can create an issue. > > Yes, I think that part should be done. > OK, I'll create an issue. > > > 3) Antoine mentioned that work is being done in 3.4 to enable pickling of > > nested classes (http://www.python.org/dev/peps/pep-3154/). If that gets > > implemented, I don't see a reason why Enum and namedtuple can't be > adjusted > > to find the __qualname__ of the class they're internal to. Am I missing > > something? 
> > The class based form should still work (assuming only classes are > involved), the stack inspection will likely fail. > I can probably be made to work with a bit more effort than the current "hack", but I don't see why it wouldn't be doable. > > 4) Using _getframe(N) here seems like an overkill to me. > > It's not just overkill, it's fragile - it only works if you call the > constructor directly. If you use a convenience function in a utility > module, it will try to load your pickles from there rather than > wherever you bound the name. > In theory you can climb the frame stack until the desired place, but this is specifically what my proposal of adding a function tries to avoid. > > > What we really need > > is just the module in which the current execution currently is (i.e. the > > metaclass's __new__ in our case). Would it make sense to add a new > function > > somewhere in the stdlib of 3.4 (in sys or inspect or ...) that just > provides > > the current module name? It seems that all Pythons should be able to > easily > > provide it, it's certainly a very small subset of the functionality > provided > > by walking the callframe stack. This function can then be used for build > > fully qualified names for pickling of Enum and namedtuple. Moreover, it > can > > be general even more widely - dynamic class building is quite common in > > Python code, and as Nick mentioned somewhere earlier, the extra power of > > metaclasses in the recent 3.x's will probably make it even more common. > > Yes, I've been thinking along these lines myself, although in a > slightly more expanded form that also touches on the issues that > stalled PEP 406 (the import engine API that tries to better > encapsulate the import state). 
It may also potentially address some > issues with initialisation of C extensions (I don't remember the exact > details off the top of my head, but there's some info we want to get > from the import machinery to modules initialised from Cython, but the > loader API and the C module initialisation API both get in the way). > > Specifically, what I'm talking about is some kind of implicit context > similar to the approach the decimal module uses to control operations > on Decimal instances. In this case, what we're trying to track is the > "active module", either __main__ (if the code has been triggered > directly through an operation in that module), or else the module > currently being imported (if the import machinery has been invoked). > > The bare minimum would just be to store the __name__ (using > sys.modules to get access to the full module if needed) in a way that > adequately handles nested, circular and threaded imports, but there > may be a case for tracking a richer ModuleContext object instead. > > However, there's also a separate question of whether implicitly > tracking the active module is really what we want. Do we want that, or > is what we actually want the ability to define an arbitrary "naming > context" in order to use functional APIs to construct classes without > losing the pickle integration of class statements? > > What if there was a variant of the class statement that bound the > result of a function call rather than using the normal syntax: > > class Animal from enum.Enum(members="dog cat bear") > > And it was only class statements in that form which manipulated the > naming context? (you could also use the def keyword rather than class) > > Either form would essentially be an ordinary assignment statement, > *except* that they would manipulate the naming context to record the > name being bound *and* relevant details of the active module. 
>
> Regardless, I think the question is not really well enough defined to
> be a topic for python-dev, even though it came up in a python-dev
> discussion - it's more python-ideas territory.
>

Wait... I agree that a special syntax for this is a novel idea that's not well defined and can be discussed on python-ideas. But the utility function I mentioned is a pretty simple idea, and it's well defined. It can be very useful in contexts where code is created dynamically, because it reduces the number of explicit frame-walking hacks.

Eli

From python-dev at masklinn.net Tue May 7 17:46:09 2013
From: python-dev at masklinn.net (Xavier Morel)
Date: Tue, 7 May 2013 17:46:09 +0200
Subject: [Python-Dev] PEP 435: pickling enums created with the functional API
In-Reply-To:
References:
Message-ID:

On 2013-05-07, at 17:03 , Nick Coghlan wrote:
>
> Specifically, what I'm talking about is some kind of implicit context
> similar to the approach the decimal module uses to control operations
> on Decimal instances.

Wouldn't it be a good occasion to add actual, full-fledged and correctly implemented (and working) dynamically scoped variables? Or to extend exceptions to signals (in the Smalltalk/Lisp sense) providing the same feature?
From eliben at gmail.com Tue May 7 17:47:41 2013 From: eliben at gmail.com (Eli Bendersky) Date: Tue, 7 May 2013 08:47:41 -0700 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: References: <51891565.2010107@stoneleaf.us> <518918CE.2010402@stoneleaf.us> Message-ID: On Tue, May 7, 2013 at 8:35 AM, Piotr Duda wrote: > 2013/5/7 Ethan Furman : > > On 05/07/2013 08:01 AM, Piotr Duda wrote: > >> > >> 2013/5/7 Ethan Furman : > >>> > >>> On 05/07/2013 07:48 AM, Piotr Duda wrote: > >>>> > >>>> > >>>> > >>>> What about adding simple syntax (I proposed this earlier, but no one > >>>> commented) that take care of assigning name and module, something > >>>> like: > >>>> > >>>> def name = expression > >>>> > >>>> which would be rough equivalent for: > >>>> > >>>> name = expression > >>>> name.__name__ = 'name' > >>>> name.__module__ = __name__ > >>> > >>> > >>> > >>> How is that different from > >>> > >>> --> name = Enum('module.name', ... ) > >>> > >>> ? > >> > >> > >> It's DRY. > > > > > > How? You need to provide a complete example: > > > > Do you mean something like: > > > > --> def mymodule.Color('red green blue') > > > > def Color = Enum('red green blue') > It's an interesting idea, but as NIck suggested we should probably discuss it on the python-ideas list. It occurred to me while thinking about the duplication in "Color = Enum(Color, '...')" that if "Enum" had some magical way to know the name of the variable it's assigned to, the duplication would not be needed. But then, it obviously is fragile because what's this: somedict[key] = Enum(Color, ...). A special syntax raises more questions though, because it has to be defined very precisely. Feel free to come up with a complete proposal to python-ideas, defining the interesting semantics. Eli -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From ethan at stoneleaf.us Tue May 7 18:07:45 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 07 May 2013 09:07:45 -0700
Subject: [Python-Dev] PEP 435: pickling enums created with the functional API
In-Reply-To:
References:
Message-ID: <518926D1.4000808@stoneleaf.us>

On 05/07/2013 08:03 AM, Nick Coghlan wrote:
> On Tue, May 7, 2013 at 11:34 PM, Eli Bendersky wrote:
>>
>> 4) Using _getframe(N) here seems like an overkill to me.
>
> It's not just overkill, it's fragile - it only works if you call the
> constructor directly. If you use a convenience function in a utility
> module, it will try to load your pickles from there rather than
> wherever you bound the name.
>
>> What we really need
>> is just the module in which the current execution currently is (i.e. the
>> metaclass's __new__ in our case). Would it make sense to add a new function
>> somewhere in the stdlib of 3.4 (in sys or inspect or ...) that just provides
>> the current module name? It seems that all Pythons should be able to easily
>> provide it, it's certainly a very small subset of the functionality provided
>> by walking the callframe stack. This function can then be used for build
>> fully qualified names for pickling of Enum and namedtuple. Moreover, it can
>> be general even more widely - dynamic class building is quite common in
>> Python code, and as Nick mentioned somewhere earlier, the extra power of
>> metaclasses in the recent 3.x's will probably make it even more common.

Perhaps I am being too pedantic, or maybe I'm not thinking in low enough detail, but it seems to me that the module in which execution currently is, is the module in which the currently running code was defined. What we need is a way to find out where the currently running code was called from. And to support those dreaded utility functions, we need a way to pass along where you were called from, so the utility function can lie and say, "Hey, you! Yeah, you Enum! You were called from app.main, not app.utils.misc!"
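The "pass along where you were called from" idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the reference implementation: `make_class`, `build_color` and the stand-in module `app_main` are hypothetical names, and a plain `type()` call stands in for the Enum factory. The factory takes an explicit `module` argument and falls back to the `_getframe` trick only when none is given, so a utility wrapper can forward the module it is working for.

```python
import pickle
import sys
import types


def make_class(name, module=None):
    """Hypothetical factory: create a class and record where it 'lives'."""
    if module is None:
        # Fallback mirroring the _getframe approach from the thread; it
        # reports the *direct* caller, which is wrong for wrapper functions.
        module = sys._getframe(1).f_globals.get('__name__', '__main__')
    cls = type(name, (object,), {})
    cls.__module__ = module  # pickle looks the class up through this name
    return cls


def build_color(module):
    """Hypothetical utility wrapper: forwards the real caller's module."""
    return make_class('Color', module=module)


# Stand-in for a real application module (an assumption for this demo).
app_main = types.ModuleType('app_main')
sys.modules['app_main'] = app_main

# The wrapper "lies" on behalf of app_main, and the name is bound there.
app_main.Color = build_color(module='app_main')

# Round-trip an instance: pickle finds the class as app_main.Color.
restored = pickle.loads(pickle.dumps(app_main.Color()))
```

The design choice here is that the caller, not the factory, is responsible for knowing the destination module; the frame lookup is kept only as a convenience for direct calls.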
-- ~Ethan~

From solipsis at pitrou.net Tue May 7 18:14:24 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 7 May 2013 18:14:24 +0200
Subject: [Python-Dev] PEP 435: pickling enums created with the functional API
References:
Message-ID: <20130507181424.43e2f2eb@pitrou.net>

On Tue, 7 May 2013 08:44:46 -0700, Eli Bendersky wrote:
> > > 4) Using _getframe(N) here seems like an overkill to me.
> >
> > It's not just overkill, it's fragile - it only works if you call the
> > constructor directly. If you use a convenience function in a utility
> > module, it will try to load your pickles from there rather than
> > wherever you bound the name.
>
> In theory you can climb the frame stack until the desired place, but
> this is specifically what my proposal of adding a function tries to
> avoid.

I don't know how you could do it without walking the frame stack. Granted, you don't need all the information that the stack holds (you don't need to know about line numbers, instruction numbers and local variables, for instance :-)), but you still have to walk *some* kind of dynamically-created stack. This isn't something that is solvable statically (as opposed to e.g. a class's __qualname__, which is computed at compile-time).

Regards

Antoine.

From eliben at gmail.com Tue May 7 18:25:33 2013
From: eliben at gmail.com (Eli Bendersky)
Date: Tue, 7 May 2013 09:25:33 -0700
Subject: [Python-Dev] PEP 435: pickling enums created with the functional API
In-Reply-To: <20130507181424.43e2f2eb@pitrou.net>
References: <20130507181424.43e2f2eb@pitrou.net>
Message-ID:

On Tue, May 7, 2013 at 9:14 AM, Antoine Pitrou wrote:
> On Tue, 7 May 2013 08:44:46 -0700,
> Eli Bendersky wrote:
> > > > 4) Using _getframe(N) here seems like an overkill to me.
> > >
> > > It's not just overkill, it's fragile - it only works if you call the
> > > constructor directly.
If you use a convenience function in a utility > > > module, it will try to load your pickles from there rather than > > > wherever you bound the name. > > > > In theory you can climb the frame stack until the desired place, but > > this is specifically what my proposal of adding a function tries to > > avoid. > > I don't know how you could do it without walking the frame stack. > Granted, you don't need all the information that the stack holds > (you don't need to know about line numbers, instruction numbers > and local variables, for instance :-)), but you still have to walk > *some* kind of dynamically-created stack. This isn't something > that is solvable statically (as opposed to e.g. a class's __qualname__, > which is computed at compile-time). > Yes, I fully realize that. I guess I should have phrased my reply differently - this is what the proposal helps *user code to avoid*. For CPython and PyPy and Jython it will be perfectly reasonable to actually climb the frame stack inside that function. For IronPython, another solution may be required if no such frame stack exists. However, even in IronPython there must be a way to get to the module name? In other words, the goal is to hide an ugly piece of exposed implementation detail behind a library call. The library call can be implemented by each platform according to its own internals, but the user won't care. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From murman at gmail.com Tue May 7 20:15:11 2013 From: murman at gmail.com (Michael Urman) Date: Tue, 7 May 2013 13:15:11 -0500 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: References: Message-ID: On Tue, May 7, 2013 at 8:34 AM, Eli Bendersky wrote: > According to an earlier discussion, this is works on CPython, PyPy and > Jython, but not on IronPython. 
The alternative that works everywhere is to > define the Enum like this: > > Color = Enum('the_module.Color', 'red blue green') > > The reference implementation supports this as well. > As an alternate bikeshed color, why not pass the receiving module to the class factory when pickle support is desirable? That should be less brittle than its name. The class based syntax can still be recommended to libraries that won't know ahead of time if their values need to be pickled. >>> Color = Enum('Color', 'red blue green', module=__main__) Functions that wrap class factories could similarly accept and pass a module along. The fundamental problem is that the class factory cannot know what the intended destination module is without either syntax that provides this ('class' today, proposed 'def' or 'class from' in the thread, or the caller passing additional information around (module name, or module instance). Syntax changes are clearly beyond the scope of PEP 435, otherwise a true enum syntax might have been born. So that leaves us with requiring the caller to provide it. Michael -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solipsis at pitrou.net Wed May 8 00:11:45 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 8 May 2013 00:11:45 +0200 Subject: [Python-Dev] All 3.x stable buildbots are red References: Message-ID: <20130508001145.7c0902a5@fsol> On Tue, 7 May 2013 09:34:45 +0200 Victor Stinner wrote: > http://buildbot.python.org/all/waterfall?category=3.x.stable > > x86 Windows Server 2003 [SB] 3.x: 3 tests failed, test___all__ test_gc test_ssl > x86 Windows7 3.x: 3 tests failed, test___all__ test_gc test_ssl > x86 Gentoo Non-Debug 3.x: 3 tests failed, test_logging > test_multiprocessing test_urllib2net > x86 Gentoo 3.x: 2 tests failed, test_logging test_urllib2net > x86 Ubuntu Shared 3.x: 1 test failed, test_logging > AMD64 Windows7 SP1 3.x: 4 tests failed, test___all__ test_gc > test_logging test_ssl > AMD64 OpenIndiana 3.x: 1 test failed, test_logging > AMD64 Ubuntu LTS 3.x: 1 test failed, test_logging > AMD64 FreeBSD 9.0 3.x: 1 test failed, test_logging test_ssl is because of http://bugs.python.org/issue17425 test_gc is because of http://bugs.python.org/issue1545463 Regards Antoine. From larry at hastings.org Wed May 8 00:36:06 2013 From: larry at hastings.org (Larry Hastings) Date: Tue, 07 May 2013 15:36:06 -0700 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: References: <51891565.2010107@stoneleaf.us> <518918CE.2010402@stoneleaf.us> Message-ID: <518981D6.60901@hastings.org> On 05/07/2013 08:47 AM, Eli Bendersky wrote: > > def Color = Enum('red green blue') > > > It's an interesting idea, but as NIck suggested we should probably > discuss it on the python-ideas list. [...] > > A special syntax raises more questions though, because it has to be > defined very precisely. Feel free to come up with a complete proposal > to python-ideas, defining the interesting semantics. 
We don't need a special syntax, we can already do this: @Enum('red green blue') def Color(): pass Here, Enum would take the one argument, and return a function working as a function decorator. That decorator would ignore the body of the function and return the Enum. It's awful, but then so is the idea of creating special syntax just for the functional form of Enum--if we're willing to go down that road, let's just add new syntax for enums and be done with it. As for the non-pickleability of enums created with the functional interface, why can't it use the same mechanism (whatever it is) as the three-argument form of type? Types created that way are dynamic, yet have a __module__ and are pickleable. //arry/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed May 8 03:00:45 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 08 May 2013 11:00:45 +1000 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: References: Message-ID: <5189A3BD.8040401@pearwood.info> On 07/05/13 23:34, Eli Bendersky wrote: > One of the contended issues with PEP 435 on which Guido pronounced was the > functional API, that allows created enumerations dynamically in a manner > similar to namedtuple: > > Color = Enum('Color', 'red blue green') > > The biggest complaint reported against this API is interaction with pickle. > As promised, I want to discuss here how we're going to address this concern. Does this issue really need to be solved before 435 is accepted? As the Zen says: Now is better than never. Although never is often better than *right* now. Solving the pickle issue is a hard problem, but not a critical issue. namedtuple has had the same issue since its inception, only worse because there is no class syntax for namedtuple. This has not been a barrier to the success of namedtuple. Or rather, the issue is not with Enum, or namedtuple, but pickle. 
Any dynamically-created type will have this issue:

>>> import pickle
>>> def example(name):
...     return type(name, (object,), {})
...
>>> instance = example("Foo")()
>>> pickle.dumps(instance)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle <class '__main__.Foo'>: attribute lookup __main__.Foo failed

I don't think it is unreasonable to chalk it up to a limitation of pickle, and say that unless you can meet certain conditions, you won't be able to pickle your instance.

Either way, approval of PEP 435 should not be dependent on fixing the pickle issue.

-- Steven

From eliben at gmail.com Wed May 8 03:55:24 2013
From: eliben at gmail.com (Eli Bendersky)
Date: Tue, 7 May 2013 18:55:24 -0700
Subject: [Python-Dev] PEP 435: pickling enums created with the functional API
In-Reply-To: <5189A3BD.8040401@pearwood.info>
References: <5189A3BD.8040401@pearwood.info>
Message-ID:

On Tue, May 7, 2013 at 6:00 PM, Steven D'Aprano wrote:
> On 07/05/13 23:34, Eli Bendersky wrote:
>
>> One of the contended issues with PEP 435 on which Guido pronounced was the
>> functional API, that allows created enumerations dynamically in a manner
>> similar to namedtuple:
>>
>> Color = Enum('Color', 'red blue green')
>>
>> The biggest complaint reported against this API is interaction with
>> pickle.
>> As promised, I want to discuss here how we're going to address this
>> concern.
>>
>
>
> Does this issue really need to be solved before 435 is accepted? As the
> Zen says:
>
> Now is better than never.
> Although never is often better than *right* now.
>
> Solving the pickle issue is a hard problem, but not a critical issue.
> namedtuple has had the same issue since its inception, only worse because
> there is no class syntax for namedtuple. This has not been a barrier to the
> success of namedtuple.
>
>

Agreed

> Or rather, the issue is not with Enum, or namedtuple, but pickle.
Any > dynamically-created type will have this issue: > > import pickle >>>> def example(name): >>>> >>> ... return type(name, (object,), {}) > ... > >> instance = example("Foo")() >>>> pickle.dumps(instance) >>>> >>> Traceback (most recent call last): > File "", line 1, in > _pickle.PicklingError: Can't pickle : attribute > lookup __main__.Foo failed > > > I don't think it is unreasonable to chalk it up to a limitation of pickle, > and say that unless you can meet certain conditions, you won't be able to > pickle your instance. > > Either way, approval of PEP 435 should not be dependent on fixing the > pickle issue. > Just to be clear- it was not my intention to delay PEP 435 because of this issue. I don't see it as a blocker to pronouncement and from a private correspondence with Guido, he doesn't either. I merely wanted to start a separate thread because I didn't want this discussion to overwhelm the pronouncement thread. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Wed May 8 12:29:29 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 8 May 2013 20:29:29 +1000 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: <20130507172402.38d0755d@pitrou.net> References: <20130507172402.38d0755d@pitrou.net> Message-ID: On 8 May 2013 01:26, "Antoine Pitrou" wrote: > > Le Wed, 8 May 2013 01:03:38 +1000, > Nick Coghlan a ?crit : > > > > What if there was a variant of the class statement that bound the > > result of a function call rather than using the normal syntax: > > > > class Animal from enum.Enum(members="dog cat bear") > > Apparently you're trying hard to invent syntaxes just to avoid > subclassing. Yeah, just accepting an auto-numbered "members" arg still seems cleaner to me. If we decouple autonumbering from using the functional API, then the rules for pickle support are simple: * use the class syntax; or * pass a fully qualified name. 
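A rough sketch of the second rule, assuming a factory that accepts a dotted name. The names `make_class` and `the_module` are hypothetical, and a plain `type()` call stands in for the Enum functional API; the point is only that everything before the last dot is recorded as `__module__`, which is what pickle uses to find the class again.

```python
import pickle
import sys
import types


def make_class(qualified_name):
    """Hypothetical factory accepting 'module.Name' or a bare 'Name'."""
    module_name, _, name = qualified_name.rpartition('.')
    cls = type(name, (object,), {})
    if module_name:
        # Record the module the caller says the class will be bound in.
        cls.__module__ = module_name
    return cls


# Stand-in for a real importable module (an assumption for this demo).
the_module = types.ModuleType('the_module')
sys.modules['the_module'] = the_module

# The name must really be bound in that module for unpickling to work.
the_module.Color = make_class('the_module.Color')

restored = pickle.loads(pickle.dumps(the_module.Color()))
```

No frame inspection is involved, so the same call behaves identically whether it is made directly or through a helper in another module.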
The fragile getframe hack should not be propagated beyond namedtuple. Cheers, Nick. > > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From aloknayak29 at gmail.com Wed May 8 13:31:20 2013 From: aloknayak29 at gmail.com (Alok Nayak) Date: Wed, 8 May 2013 17:01:20 +0530 Subject: [Python-Dev] this python string literals documentation couldn't explain me: single quote presence inside double quoted string and viceversa. Can Anyone explain me? In-Reply-To: References: Message-ID: I asked this question here, http://stackoverflow.com/questions/16435233/this-python-string-literals-documentation-couldnt-explain-me-single-quote-pres, . I was advised to ask here On Wed, May 8, 2013 at 4:56 PM, Alok Nayak wrote: > > This python string literals documentationcouldn't explain: single quote presence inside double quoted string and > viceversa. > > I think both double quoted string and single quoted string need to be > defined differently for representing the 'stringliteral' lexical definition. > > shortstringchar ::= > > here in this definition 'the quote' isn't specific whether single (') or > double ("). > > > -- > Alok Nayak > Gwalior, India > -- Alok Nayak Gwalior, India -------------- next part -------------- An HTML attachment was scrubbed... URL: From aloknayak29 at gmail.com Wed May 8 13:56:18 2013 From: aloknayak29 at gmail.com (Alok Nayak) Date: Wed, 8 May 2013 17:26:18 +0530 Subject: [Python-Dev] this python string literals documentation couldn't explain me: single quote presence inside double quoted string and viceversa. Can Anyone explain me? 
Message-ID: This python string literals documentationcouldn't explain: single quote presence inside double quoted string and viceversa. I think both double quoted string and single quoted string need to be defined differently for representing the 'stringliteral' lexical definition. shortstringchar ::= here in this definition 'the quote' isn't specific whether single (') or double ("). I asked this question here, http://stackoverflow.com/questions/16435233/this-python-string-literals-documentation-couldnt-explain-me-single-quote-pres, . I was advised to ask here -- Alok Nayak Gwalior, India -- Alok Nayak Gwalior, India -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Wed May 8 14:16:14 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 08 May 2013 22:16:14 +1000 Subject: [Python-Dev] this python string literals documentation couldn't explain me: single quote presence inside double quoted string and viceversa. Can Anyone explain me? In-Reply-To: References: Message-ID: <518A420E.2050805@pearwood.info> On 08/05/13 21:31, Alok Nayak wrote: > I asked this question here, > http://stackoverflow.com/questions/16435233/this-python-string-literals-documentation-couldnt-explain-me-single-quote-pres, > . I was advised to ask here They were wrong. It is not relevant here, since it is not a question about development of Python. But I will answer your question anyway. The relevant parts of the documentation are: shortstring ::= "'" shortstringitem* "'" | '"' shortstringitem* '"' shortstringitem ::= shortstringchar | escapeseq shortstringchar ::= So let's look at a string: 'a"b' This is a shortstring, made up of single-quote followed by three shortstringitems, followed by single-quote. 
All three shortstring items are shortstringchar, not escapeseq: a is a source character, not including "\" or newline or single-quote " is a source character, not including "\" or newline or single-quote b is a source character, not including "\" or newline or single-quote [...] >> shortstringchar ::= >> >> here in this definition 'the quote' isn't specific whether single (') or >> double ("). Correct. You are expected to understand that it means either single-quote or double-quote according to context. This is documentation aimed at a human reader who should be able to use human reasoning skills to understand what "the quote" means, it is not the literal BNF grammar used by the compiler to compile Python's parser. For brevity and simplicity, some definitions may be simplified. -- Steven From aloknayak29 at gmail.com Wed May 8 15:09:52 2013 From: aloknayak29 at gmail.com (Alok Nayak) Date: Wed, 8 May 2013 18:39:52 +0530 Subject: [Python-Dev] this python string literals documentation couldn't explain me: single quote presence inside double quoted string and viceversa. Can Anyone explain me? In-Reply-To: <518A420E.2050805@pearwood.info> References: <518A420E.2050805@pearwood.info> Message-ID: Thanks for you answer sir, I was thinking its regular expressions(automata) not BNF grammer. Aint I right ? And I thought even if it is for human reading, if we write literal grammer ( regular expression, in my view) using documentation, we would end up with python not allowing strings like"python's rule" and ' python"the master" ' Here I changed the defination of string literal for explaining my thinking, inserted short-singlequoted-stringitem, short-sq-tstringchar, short-doublequoted-stringitem and short-dq-stringchar . 
I think we should use this in documentation stringliteral ::= [stringprefix](shortstring | longstring) stringprefix ::= "r" | "u" | "ur" | "R" | "U" | "UR" | "Ur" | "uR" | "b" | "B" | "br" | "Br" | "bR" | "BR" shortstring ::= "'" short-singlequoted-stringitem* "'" | '"' short-doublequoted-stringitem* '"' longstring ::= "'''" longstringitem* "'''" | '"""' longstringitem* '"""' short-singlequoted-stringitem ::= short-sq-stringchar | escapeseq short-doublequoted-stringitem ::= short-dq-stringchar | escapeseq longstringitem ::= longstringchar | escapeseq short-sq-tstringchar ::= short-dq-stringchar ::= longstringchar ::= escapeseq ::= "\" -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Wed May 8 18:14:59 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 8 May 2013 18:14:59 +0200 Subject: [Python-Dev] Call for testing: generator finalization Message-ID: <20130508181459.5e36374a@fsol> Hello, In http://bugs.python.org/issue17807 I've committed a patch to allow generator finalization (execution of "finally" blocks) even when a generator is part of a reference cycle. If you have some workload which is known for problems with generator finalization (or otherwise makes a heavy use of generators), it would nice to have some feedback on this change. (the commit is only on the default branch) Regards Antoine. From JMao at rocketsoftware.com Thu May 9 02:37:43 2013 From: JMao at rocketsoftware.com (Jianfeng Mao) Date: Thu, 9 May 2013 00:37:43 +0000 Subject: [Python-Dev] Any script to create the installation pacakge of Python 3.3.1 on Windows and *NIX? Message-ID: <6478B930E8DB4545BD8EEADFAA73FED878CF152C@nwt-s-mbx1.rocketsoftware.com> To Python-Dev committers: I am working on a project to embed a slightly customized Python interpreter in our own software. For easy installation and setup, we want to be able to do the standard Python installation as part of the installation of our product. 
So far I have successfully customized and built Python 3.3.1 (including the subprojects) on Windows, but I can't find anything in the source distribution that lets me package the binaries, modules, etc. into an MSI just like the one on the download page on python.org. So I am asking for information on how to package a Python build for installation on both Windows and *NIX platforms. Your help will be greatly appreciated.

Thanks,
Jianfeng

From brian at python.org Thu May 9 05:08:45 2013
From: brian at python.org (Brian Curtin)
Date: Wed, 8 May 2013 22:08:45 -0500
Subject: [Python-Dev] Any script to create the installation pacakge of Python 3.3.1 on Windows and *NIX?
In-Reply-To: <6478B930E8DB4545BD8EEADFAA73FED878CF152C@nwt-s-mbx1.rocketsoftware.com>
References: <6478B930E8DB4545BD8EEADFAA73FED878CF152C@nwt-s-mbx1.rocketsoftware.com>
Message-ID:

On Wed, May 8, 2013 at 7:37 PM, Jianfeng Mao wrote:
> To Python-Dev committers:
>
>
>
> I am working on a project to embed a slightly customized Python interpreter
> in our own software. For easy installation and setup, we want to be able to
> do the standard Python installation as part of the installation of our
> product. So far I have successfully customized and built Python 3.3.1
> (including the subprojects) on Windows but I can't find anything in the
> source distribution to allow me package the binaries/modules etc into a MSI
> just like the one on the download page on python.org. So I am asking for
> information regarding how to package Python build for installation on both
> Windows and *NIX platforms. Your help will be greatly appreciated.

See Tools/msi/msi.py for the Windows MSI builder.
From eliben at gmail.com Thu May 9 05:47:46 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 8 May 2013 20:47:46 -0700 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: References: Message-ID: On Tue, May 7, 2013 at 8:03 AM, Nick Coghlan wrote: > On Tue, May 7, 2013 at 11:34 PM, Eli Bendersky wrote: > > One of the contended issues with PEP 435 on which Guido pronounced was > the > > functional API, that allows created enumerations dynamically in a manner > > similar to namedtuple: > > > > Color = Enum('Color', 'red blue green') > > > > The biggest complaint reported against this API is interaction with > pickle. > > As promised, I want to discuss here how we're going to address this > concern. > > > > At this point, the pickle docs say that module-top-level classes can be > > pickled. This obviously works for the normal Enum classes, but is a > problem > > with the functional API because the class is created dynamically and has > no > > __module__. > > > > To solve this, the reference implementation is used the same approach as > > namedtuple (*). In the metaclass's __new__ (this is an excerpt, the real > > code has some safeguards): > > > > module_name = sys._getframe(1).f_globals['__name__'] > > enum_class.__module__ = module_name > > > > According to an earlier discussion, this is works on CPython, PyPy and > > Jython, but not on IronPython. The alternative that works everywhere is > to > > define the Enum like this: > > > > Color = Enum('the_module.Color', 'red blue green') > > > > The reference implementation supports this as well. > > > > Some points for discussion: > > > > 1) We can say that using the functional API when pickling can happen is > not > > recommended, but maybe a better way would be to just explain the way > things > > are and let users decide? 
> > It's probably worth creating a section in the pickle docs and > explaining the vagaries of naming things and the dependency on knowing > the module name. The issue comes up with defining classes in __main__ > and when implementing pseudo-modules as well (see PEP 395). > > > 2) namedtuple should also support the fully qualified name syntax. If > this > > is agreed upon, I can create an issue. > > Yes, I think that part should be done. > > http://bugs.python.org/issue17941 Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu May 9 16:17:18 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 10 May 2013 00:17:18 +1000 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: References: Message-ID: On 9 May 2013 13:48, "Eli Bendersky" wrote: > > > > > On Tue, May 7, 2013 at 8:03 AM, Nick Coghlan wrote: >> >> On Tue, May 7, 2013 at 11:34 PM, Eli Bendersky wrote: >> > One of the contended issues with PEP 435 on which Guido pronounced was the >> > functional API, that allows created enumerations dynamically in a manner >> > similar to namedtuple: >> > >> > Color = Enum('Color', 'red blue green') >> > >> > The biggest complaint reported against this API is interaction with pickle. >> > As promised, I want to discuss here how we're going to address this concern. >> > >> > At this point, the pickle docs say that module-top-level classes can be >> > pickled. This obviously works for the normal Enum classes, but is a problem >> > with the functional API because the class is created dynamically and has no >> > __module__. >> > >> > To solve this, the reference implementation is used the same approach as >> > namedtuple (*). 
In the metaclass's __new__ (this is an excerpt, the real >> > code has some safeguards): >> > >> > module_name = sys._getframe(1).f_globals['__name__'] >> > enum_class.__module__ = module_name >> > >> > According to an earlier discussion, this is works on CPython, PyPy and >> > Jython, but not on IronPython. The alternative that works everywhere is to >> > define the Enum like this: >> > >> > Color = Enum('the_module.Color', 'red blue green') >> > >> > The reference implementation supports this as well. >> > >> > Some points for discussion: >> > >> > 1) We can say that using the functional API when pickling can happen is not >> > recommended, but maybe a better way would be to just explain the way things >> > are and let users decide? >> >> It's probably worth creating a section in the pickle docs and >> explaining the vagaries of naming things and the dependency on knowing >> the module name. The issue comes up with defining classes in __main__ >> and when implementing pseudo-modules as well (see PEP 395). >> >> > 2) namedtuple should also support the fully qualified name syntax. If this >> > is agreed upon, I can create an issue. >> >> Yes, I think that part should be done. >> > > http://bugs.python.org/issue17941 As Eric noted on the tracker issue, a keyword only "module" argument may be a better choice for both than allowing dotted names. A separate parameter is easier to use with __name__ to avoid hardcoding the module name. At the very least, the PEP should provide a rationale for the current choice. Cheers, Nick. > > Eli > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Thu May 9 18:24:04 2013 From: guido at python.org (Guido van Rossum) Date: Thu, 9 May 2013 09:24:04 -0700 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: References: Message-ID: On Thu, May 9, 2013 at 7:17 AM, Nick Coghlan wrote: > As Eric noted on the tracker issue, a keyword only "module" argument may > be a better choice for both than allowing dotted names. A separate > parameter is easier to use with __name__ to avoid hardcoding the module > name. +1. This is a good one. While adding module=__name__ is actually more typing than passing __name__ + '.Color' as the class name, the current proposal (parsing for dots) makes it very attractive to do the wrong thing and hardcode the module name. Then typing the module incorrectly is very easy, and the mistake is easily overlooked because it won't be noticed until you actually try to pickle a member. At the very least, the PEP should provide a rationale for the current > choice. > > Cheers, > Nick. > > > > > Eli > > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Thu May 9 18:31:34 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 9 May 2013 12:31:34 -0400 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: References: Message-ID: <20130509123134.2003492a@anarchist> On May 09, 2013, at 09:24 AM, Guido van Rossum wrote: >+1. This is a good one. 
While adding module=__name__ is actually more >typing than passing __name__ + '.Color' as the class name, the current >proposal (parsing for dots) makes it very attractive to do the wrong thing >and hardcode the module name. Then typing the module incorrectly is very >easy, and the mistake is easily overlooked because it won't be noticed >until you actually try to pickle a member. Seems reasonable. The `module` argument should be keyword-only, and obviously namedtuple should support the same API. -Barry From eliben at gmail.com Thu May 9 18:47:31 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 9 May 2013 09:47:31 -0700 Subject: [Python-Dev] PEP 435: pickling enums created with the functional API In-Reply-To: <20130509123134.2003492a@anarchist> References: <20130509123134.2003492a@anarchist> Message-ID: On Thu, May 9, 2013 at 9:31 AM, Barry Warsaw wrote: > On May 09, 2013, at 09:24 AM, Guido van Rossum wrote: > > >+1. This is a good one. While adding module=__name__ is actually more > >typing than passing __name__ + '.Color' as the class name, the current > >proposal (parsing for dots) makes it very attractive to do the wrong thing > >and hardcode the module name. Then typing the module incorrectly is very > >easy, and the mistake is easily overlooked because it won't be noticed > >until you actually try to pickle a member. > > Seems reasonable. The `module` argument should be keyword-only, and > obviously > namedtuple should support the same API. > Yes, this was already pointed out by Eric in http://bugs.python.org/issue17941 which tracks this feature for namedtuple. Eli -------------- next part -------------- An HTML attachment was scrubbed... 

From diegotolentino at gmail.com Thu May 9 21:58:02 2013
From: diegotolentino at gmail.com (Diego Tolentino)
Date: Thu, 9 May 2013 16:58:02 -0300
Subject: [Python-Dev] I want contribute to the project
Message-ID:

Hi guys,

I have 3 computers and want to contribute to the project:

Windows 7, Intel I5 64bits
Ubuntu 13.04, Intel I5 32bits
Ubuntu 13.04, Intel I5 64bits

How can I proceed? I'm nice in this world of free software and python

best regards

Diego Tolentino
SENIOR DEVELOPER
Skype: diegotolentino

"Do not go where the path may lead, go instead where there is no path and leave a trail." - Ralph Waldo Emerson

From eliben at gmail.com Thu May 9 22:44:50 2013
From: eliben at gmail.com (Eli Bendersky)
Date: Thu, 9 May 2013 13:44:50 -0700
Subject: [Python-Dev] I want contribute to the project
In-Reply-To:
References:
Message-ID:

On Thu, May 9, 2013 at 12:58 PM, Diego Tolentino wrote:
> Hi guys,
>
> I have 3 computers and want to contribute to the project:
>
> Windows 7, Intel I5 64bits
> Ubuntu 13.04, Intel I5 32bits
> Ubuntu 13.04, Intel I5 64bits
>
> How can I proceed? I'm nice in this world of free software and python
>

Hi Diego,

Welcome! The best way to get started is to read this short page - http://pythonmentors.com/, subscribe to the mentorship mailing list it mentions, and then go over the (not-so-short, but very useful) developers' guide it links to.

Good luck,
Eli

From guido at python.org Fri May 10 01:01:15 2013
From: guido at python.org (Guido van Rossum)
Date: Thu, 9 May 2013 16:01:15 -0700
Subject: [Python-Dev] PEP 435 (Enums) is Accepted
Message-ID:

I have reviewed the latest version of PEP 435 and I see that it is very good. I hereby declare PEP 435 as Accepted.
Congratulations go to Barry, Eli and Ethan for pulling it through one of the most thorough reviewing and bikeshedding processes any PEP has seen. Thanks to everyone else for the many review comments. It is a better PEP because of the reviews. Barry or Eli, you can update the PEP's status. (I also expect there will be some copy-editing still.) Ethan: the stdlib implementation should probably be assigned a bug tracker issue (if there isn't one already) and start code review now. -- --Guido van Rossum (python.org/~guido) From g.rodola at gmail.com Fri May 10 02:01:40 2013 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Fri, 10 May 2013 02:01:40 +0200 Subject: [Python-Dev] Help requested for issue 9285 (profile.py) Message-ID: http://bugs.python.org/issue9285#msg182986 I'm stuck as I really have no clue what that error means. Any help from someone experienced with profile.py code is welcome. --- Giampaolo https://code.google.com/p/pyftpdlib/ https://code.google.com/p/psutil/ https://code.google.com/p/pysendfile/ From barry at python.org Fri May 10 02:12:45 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 9 May 2013 20:12:45 -0400 Subject: [Python-Dev] PEP 435 (Enums) is Accepted In-Reply-To: References: Message-ID: <20130509201245.7361d30c@limelight.wooz.org> On May 09, 2013, at 04:01 PM, Guido van Rossum wrote: >I have reviewed the latest version of PEP 435 and I see that it is >very good. I hereby declare PEP 435 as Accepted. Congratulations go to >Barry, Eli and Ethan for pulling it through one of the most thorough >reviewing and bikeshedding processes any PEP has seen. Thanks to >everyone else for the many review comments. It is a better PEP because >of the reviews. Let me echo Guido's thanks to everyone on python-dev and python-ideas, and especially Eli and Ethan. Our ability to come together and produce agreement on not only a contentious, but long wished for, feature shows off the best of our community. 
Huge thanks also to Guido for the invaluable pronouncements along the way. I can honestly say I'm happy with the results, and the experience of participating. Great work everyone. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From ncoghlan at gmail.com Fri May 10 08:46:33 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 10 May 2013 16:46:33 +1000 Subject: [Python-Dev] PEP 435 (Enums) is Accepted In-Reply-To: References: Message-ID: On Fri, May 10, 2013 at 9:01 AM, Guido van Rossum wrote: > I have reviewed the latest version of PEP 435 and I see that it is > very good. I hereby declare PEP 435 as Accepted. Congratulations go to > Barry, Eli and Ethan for pulling it through one of the most thorough > reviewing and bikeshedding processes any PEP has seen. Thanks to > everyone else for the many review comments. It is a better PEP because > of the reviews. And there was much rejoicing, huzzah! :) As an added bonus, people trying to understand the details of metaclasses will now have a non-trivial standard library example to investigate (we may want to update the language reference at some point to highlight that). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ethan at stoneleaf.us Fri May 10 08:56:00 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 09 May 2013 23:56:00 -0700 Subject: [Python-Dev] PEP 435 (Enums) is Accepted In-Reply-To: References: Message-ID: <518C9A00.106@stoneleaf.us> On 05/09/2013 11:46 PM, Nick Coghlan wrote: > > As an added bonus, people trying to understand the details of > metaclasses will now have a non-trivial standard library example to > investigate Hmmm... __prepare__ really isn't doing very much at the moment... I could have it do more... 
maybe create some kind of little helper class, name it the same as the class being created, and stuff it in the class dict... ;)

--
~Ethan~

From ncoghlan at gmail.com Fri May 10 09:14:21 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 10 May 2013 17:14:21 +1000
Subject: [Python-Dev] PEP 0 maintenance - deferring some currently open PEPs
Message-ID:

I'd like to mark a few PEPs that are not currently being actively considered for 3.4 as Deferred:

 S  286  Enhanced Argument Tuples                               von Löwis
 S  337  Logging Usage in the Standard Library                  Dubner
 S  368  Standard image protocol and class                      Mastrodomenico
 I  396  Module Version Numbers                                 Warsaw
 S  400  Deprecate codecs.StreamReader and codecs.StreamWriter  Stinner
 S  419  Protecting cleanup statements from interruptions       Colomiets
 I  423  Naming conventions and recipes related to packaging    Bryon
 I  444  Python Web3 Interface                                  McDonough, Ronacher
 S 3124  Overloading, Generic Functions, Interfaces, and ...    Eby
 S 3142  Add a "while" clause to generator expressions          Britton
 S 3143  Standard daemon process library                        Finney
 S 3145  Asynchronous I/O For subprocess.Popen                  Pruitt, McCreary, Carlson
 S 3152  Cofunctions                                            Ewing

Obviously, they can be reactivated at any time, but I think it would be beneficial to have the "Open" list more accurately reflect proposals that are currently being championed.

I'd also like to mark this one as rejected by Guido at PyCon US 2013 in favour of an updated PEP 436:

 S  437  A DSL for specifying signatures, annotations and ...   Krah

Cheers,
Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From solipsis at pitrou.net Fri May 10 10:18:49 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 10 May 2013 10:18:49 +0200
Subject: [Python-Dev] PEP 0 maintenance - deferring some currently open PEPs
References:
Message-ID: <20130510101849.5e4a1820@pitrou.net>

Hello Nick,

On Fri, 10 May 2013 17:14:21 +1000, Nick Coghlan wrote:
> I'd like to mark a few PEPs that are not currently being actively
> considered for 3.4 as Deferred:
>
>  S  286  Enhanced Argument Tuples                               von Löwis
>  S  337  Logging Usage in the Standard Library                  Dubner
>  S  368  Standard image protocol and class                      Mastrodomenico
>  I  396  Module Version Numbers                                 Warsaw
>  S  400  Deprecate codecs.StreamReader and codecs.StreamWriter  Stinner
>  S  419  Protecting cleanup statements from interruptions       Colomiets
>  I  423  Naming conventions and recipes related to packaging    Bryon
>  I  444  Python Web3 Interface                                  McDonough, Ronacher
>  S 3124  Overloading, Generic Functions, Interfaces, and ...    Eby
>  S 3142  Add a "while" clause to generator expressions          Britton
>  S 3143  Standard daemon process library                        Finney
>  S 3145  Asynchronous I/O For subprocess.Popen                  Pruitt, McCreary, Carlson
>  S 3152  Cofunctions                                            Ewing
>
> Obviously, they can be reactivated at any time, but I think it would
> be beneficial to have the "Open" list more accurately reflect
> proposals that are currently being championed.

Sounds fine to me.

> I'd also like to mark this one as rejected by Guido at PyCon US 2013
> in favour of an updated PEP 436:
>  S  437  A DSL for specifying signatures, annotations and ...   Krah

I haven't followed enough to have an opinion :)

Regards

Antoine.
From benhoyt at gmail.com Fri May 10 12:55:56 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Fri, 10 May 2013 22:55:56 +1200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info Message-ID: A few of us were having a discussion at http://bugs.python.org/issue11406 about adding os.scandir(): a generator version of os.listdir() to make iterating over very large directories more memory efficient. This also reflects how the OS gives things to you -- it doesn't give you a big list, but you call a function to iterate and fetch the next entry. While I think that's a good idea, I'm not sure just that much is enough of an improvement to make adding the generator version worth it. But what would make this a killer feature is making os.scandir() generate tuples of (name, stat_like_info). The Windows directory iteration functions (FindFirstFile/FindNextFile) give you the full stat information for free, and the Linux and OS X functions (opendir/readdir) give you partial file information (d_type in the dirent struct, which is basically the st_mode part of a stat, whether it's a file, directory, link, etc). Having this available at the Python level would mean we can vastly speed up functions like os.walk() that otherwise need to make an os.stat() call for every file returned. In my benchmarks of such a generator on Windows, it speeds up os.walk() by 9-10x. On Linux/OS X, it's more like 1.5-3x. In my opinion, that kind of gain is huge, especially on Windows, but also on Linux/OS X. So the idea is to add this relatively low-level function that exposes the extra information the OS gives us for free, but which os.listdir() currently throws away. Then higher-level, platform-independent functions like os.walk() could use os.scandir() to get much better performance. People over at Issue 11406 think this is a good idea. HOWEVER, there's debate over what kind of object the second element in the tuple, "stat_like_info", should be. 
My strong vote is for it to be a stat_result-like object, but where the fields are None if they're unknown. There would be basically three scenarios:

1) stat_result with all fields set: this would happen on Windows, where you get as much info from FindFirst/FindNext as from an os.stat()
2) stat_result with just st_mode set, and all other fields None: this would be the usual case on Linux/OS X
3) stat_result with all fields None: this would happen on systems whose readdir()/dirent doesn't have d_type, or on Linux/OS X when d_type was DT_UNKNOWN

Higher-level functions like os.walk() would then check the fields they needed are not None, and only call os.stat() if needed, for example:

    # Build lists of files and directories in path
    files = []
    dirs = []
    for name, st in os.scandir(path):
        if st.st_mode is None:
            st = os.stat(os.path.join(path, name))
        if stat.S_ISDIR(st.st_mode):
            dirs.append(name)
        else:
            files.append(name)

Not bad for a 2-10x performance boost, right? What do folks think?

Cheers,
Ben.

P.S. A few non-essential further notes:

1) As a Windows guy, a nice-to-have addition to os.scandir() would be a keyword arg like win_wildcard which defaulted to '*.*', but power users can pass in to utilize the wildcard feature of FindFirst/FindNext on Windows. We have plenty of other low-level functions that expose OS-specific features in the OS module, so this would be no different. But then again, it's not nearly as important as exposing the stat info.

2) I've been dabbling with this concept for a while in my BetterWalk library: https://github.com/benhoyt/betterwalk

Note that the benchmarks there are old, and I've made further improvements in my local copy. The ctypes version gives speed gains for os.walk() of 2-3x on Windows, but I've also got a C version, which is giving 9-10x speed gains. I haven't yet got a Linux/OS X version written in C.

3) See also the previous python-ideas thread on BetterWalk: http://mail.python.org/pipermail/python-ideas/2012-November/017944.html

From christian at python.org Fri May 10 13:46:30 2013
From: christian at python.org (Christian Heimes)
Date: Fri, 10 May 2013 13:46:30 +0200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To:
References:
Message-ID: <518CDE16.6010104@python.org>

On 10.05.2013 12:55, Ben Hoyt wrote:
> Higher-level functions like os.walk() would then check the fields they
> needed are not None, and only call os.stat() if needed, for example:
>
>     # Build lists of files and directories in path
>     files = []
>     dirs = []
>     for name, st in os.scandir(path):
>         if st.st_mode is None:
>             st = os.stat(os.path.join(path, name))
>         if stat.S_ISDIR(st.st_mode):
>             dirs.append(name)
>         else:
>             files.append(name)

Have you actually tried the code? It can't give you correct answers. The struct dirent.d_type member as returned by readdir() has different values than stat.st_mode's file type.

For example on my system readdir() returns DT_DIR for a directory but S_ISDIR() checks different bits:

    DT_DIR = 4
    S_ISDIR(mode) ((mode) & 0170000) == 0040000

Or are you proposing to map d_type to st_mode? That's also problematic because st_mode would only have file type bits, not permission bits. Also, the POSIX standards state that new file types will not be assigned additional S_IF* constants. Some operating systems have IFTODT() / DTTOIF() macros which convert bits between st_mode and d_type, but the macros aren't part of the POSIX standard.

Hence I'm +1 on the general idea but -1 on something stat like. IMHO os.scandir() should yield four objects:

 * name
 * inode
 * file type or DT_UNKNOWN
 * stat_result or None

stat_result shall only be returned when the operating system provides a full stat result as returned by os.stat().

Christian

From solipsis at pitrou.net Fri May 10 14:16:22 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 10 May 2013 14:16:22 +0200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
References: <518CDE16.6010104@python.org>
Message-ID: <20130510141622.250aaa65@pitrou.net>

On Fri, 10 May 2013 13:46:30 +0200, Christian Heimes wrote:
>
> Hence I'm +1 on the general idea but -1 on something stat like. IMHO
> os.scandir() should yield four objects:
>
> * name
> * inode
> * file type or DT_UNKNOWN
> * stat_result or None
>
> stat_result shall only be returned when the operating systems
> provides a full stat result as returned by os.stat().

But what if some systems return more than the file type and less than a full stat result? The general problem is POSIX's terrible inertia. I feel that a stat result with some None fields would be an acceptable compromise here.

Regards

Antoine.

From dholth at gmail.com Fri May 10 14:30:20 2013
From: dholth at gmail.com (Daniel Holth)
Date: Fri, 10 May 2013 08:30:20 -0400
Subject: [Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support"
In-Reply-To:
References: <51846747.9060507@pearwood.info> <87vc6zcopu.fsf@uwakimon.sk.tsukuba.ac.jp> <5184A779.6010108@pearwood.info> <21fa0d713f49484298d479d348d185d4@BLUPR03MB035.namprd03.prod.outlook.com>
Message-ID:

Everyone seems to like the first half of this simple PEP adding the extensions. The 3-letter extension for windowed apps can be "pzw" while the "pyz" extension for console apps stays the same.

The second half, the tool https://bitbucket.org/dholth/pyzaa/src/tip/pyzaa.py?at=default is less mature, but there's not a whole lot to do in a simple tool that may serve more as an example: you can open any file with ZipFile in append mode, even one that is not a zip file and just contains the #!python shebang line.
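
A minimal sketch of that last point, using only the stdlib zipfile module. It is not taken from pyzaa.py: the helper name `build_pyz`, the exact shebang text, and the temporary file location are assumptions made for illustration.

```python
import os
import tempfile
import zipfile


def build_pyz(path, main_source):
    """Write a shebang line, then append a zip archive holding __main__.py."""
    with open(path, 'wb') as f:
        f.write(b'#!/usr/bin/env python3\n')  # assumed shebang text
    # In mode 'a', a file that is not yet a zip archive gets a new archive
    # appended after its existing bytes.
    with zipfile.ZipFile(path, 'a') as zf:
        zf.writestr('__main__.py', main_source)
    return path


demo_path = build_pyz(
    os.path.join(tempfile.mkdtemp(), 'hello.pyz'),
    'print("hello from a pyz")\n',
)
```

The resulting file keeps its shebang as the first line and is still readable as a zip archive, which is what lets the interpreter run it directly.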
Thanks, Daniel From ronaldoussoren at mac.com Fri May 10 15:25:01 2013 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 10 May 2013 15:25:01 +0200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: <20130510141622.250aaa65@pitrou.net> References: <518CDE16.6010104@python.org> <20130510141622.250aaa65@pitrou.net> Message-ID: On 10 May, 2013, at 14:16, Antoine Pitrou wrote: > Le Fri, 10 May 2013 13:46:30 +0200, > Christian Heimes a ?crit : >> >> Hence I'm +1 on the general idea but -1 on something stat like. IMHO >> os.scandir() should yield four objects: >> >> * name >> * inode >> * file type or DT_UNKNOWN >> * stat_result or None >> >> stat_result shall only be returned when the operating systems >> provides a full stat result as returned by os.stat(). > > But what if some systems return more than the file type and less than a > full stat result? The general problem is POSIX's terrible inertia. > I feel that a stat result with some None fields would be an acceptable > compromise here. But how do you detect that the st_mode field on systems with a d_type is incomplete, as oposed to a system that can return a full st_mode from its readdir equivalent and where the permission bits happen to be 0o0000? One option would be to add a file type field to stat_result, IIRC this was mentioned in some revisions of the extended stat_result proposal over on python-ideas. Ronald > > Regards > > Antoine. 

From christian at python.org Fri May 10 15:46:21 2013
From: christian at python.org (Christian Heimes)
Date: Fri, 10 May 2013 15:46:21 +0200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To: <20130510141622.250aaa65@pitrou.net>
References: <518CDE16.6010104@python.org> <20130510141622.250aaa65@pitrou.net>
Message-ID: <518CFA2D.3020405@python.org>

On 10.05.2013 14:16, Antoine Pitrou wrote:
> But what if some systems return more than the file type and less than a
> full stat result? The general problem is POSIX's terrible inertia.
> I feel that a stat result with some None fields would be an acceptable
> compromise here.

POSIX only defines the d_ino and d_name members of struct dirent. Linux, BSD and probably some other platforms also happen to provide d_type. The other members of struct dirent (d_reclen, d_namlen) aren't useful in Python space by themselves.

d_type and st_mode aren't compatible in any way. As you know st_mode also contains POSIX permission information. The file type is encoded with a different set of bits, too. Future file types aren't mapped to S_IF* constants for st_mode.

For d_ino you also need the device number from the directory because the inode is only unique within a device.

I don't really see how to map struct dirent to struct stat on POSIX.

Christian

From solipsis at pitrou.net Fri May 10 15:54:11 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 10 May 2013 15:54:11 +0200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
References: <518CDE16.6010104@python.org> <20130510141622.250aaa65@pitrou.net> <518CFA2D.3020405@python.org>
Message-ID: <20130510155411.277ea9cb@pitrou.net>

On Fri, 10 May 2013 15:46:21 +0200, Christian Heimes wrote:
> On 10.05.2013 14:16, Antoine Pitrou wrote:
> > But what if some systems return more than the file type and less
> > than a full stat result? The general problem is POSIX's terrible
> > inertia. I feel that a stat result with some None fields would be
> > an acceptable compromise here.
>
> POSIX only defines the d_ino and d_name members of struct dirent.
> Linux, BSD and probably some other platforms also happen to provide
> d_type. The other members of struct dirent (d_reclen, d_namlen)
> aren't useful in Python space by themselves.
>
> d_type and st_mode aren't compatible in any way. As you know st_mode
> also contains POSIX permission information. The file type is encoded
> with a different set of bits, too. Future file types aren't mapped to
> S_IF* constants for st_mode.

Thank you and Ronald for clarifying. This does make the API design a bit bothersome. We want to expose as much information as possible in a cross-platform way and with a flexible granularity, but doing so might require a gazillion namedtuple fields (platonically, as many as one field per stat bit).

> For d_ino you also need the device number from the directory because
> the inode is only unique within a device.

But hopefully you've already stat'ed the directory ;)

Regards

Antoine.
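
The d_type versus st_mode mismatch raised in this thread can be sketched in a few lines. The DT_* values below are the Linux ones (DT_DIR = 4 is the value quoted earlier) and are an assumption on other platforms; `dttoif` mimics the non-POSIX DTTOIF() macro as glibc defines it, and both function names are hypothetical.

```python
import stat

# Linux <dirent.h> values; assumed here, not guaranteed on other platforms.
DT_UNKNOWN = 0
DT_DIR = 4
DT_REG = 8


def dttoif(d_type):
    """Mimic DTTOIF(): shift d_type into the file-type bits of st_mode."""
    return d_type << 12


def is_dir_from_d_type(d_type):
    """Return True or False when d_type is known, None when stat() is needed."""
    if d_type == DT_UNKNOWN:
        return None
    return stat.S_ISDIR(dttoif(d_type))
```

Feeding a raw d_type to stat.S_ISDIR() gives the wrong answer, and the converted value carries file-type bits only, with every permission bit zero. That is why a stat_result built from d_type alone cannot be told apart from a real mode of 0o0000.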
From barry at python.org Fri May 10 15:57:37 2013 From: barry at python.org (Barry Warsaw) Date: Fri, 10 May 2013 09:57:37 -0400 Subject: [Python-Dev] PEP 0 maintenance - deferring some currently open PEPs In-Reply-To: References: Message-ID: <20130510095737.06143376@anarchist> On May 10, 2013, at 05:14 PM, Nick Coghlan wrote: > I 396 Module Version Numbers Warsaw I do want to eventually return to this PEP, but I probably won't any time soon. -Barry From ncoghlan at gmail.com Fri May 10 15:53:37 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 10 May 2013 23:53:37 +1000 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: <518CFA2D.3020405@python.org> References: <518CDE16.6010104@python.org> <20130510141622.250aaa65@pitrou.net> <518CFA2D.3020405@python.org> Message-ID: On Fri, May 10, 2013 at 11:46 PM, Christian Heimes wrote: > Am 10.05.2013 14:16, schrieb Antoine Pitrou: >> But what if some systems return more than the file type and less than a >> full stat result? The general problem is POSIX's terrible inertia. >> I feel that a stat result with some None fields would be an acceptable >> compromise here. > > POSIX only defines the d_ino and d_name members of struct dirent. Linux, > BSD and probably some other platforms also happen to provide d_type. The > other members of struct dirent (d_reclen, d_namlen) aren't useful in > Python space by themselves. > > d_type and st_mode aren't compatible in any way. As you know st_mode > also contains POSIX permission information. The file type is encoded > with a different set of bits, too. Future file types aren't mapped to > S_IF* constants for st_mode. Why are we exposing a bitfield as the primary Python level API, anyway? It makes sense for the well defined permission bits, but why are we copying the C level concept for the other flags? Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri May 10 16:12:31 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 11 May 2013 00:12:31 +1000 Subject: [Python-Dev] PEP 0 maintenance - deferring some currently open PEPs In-Reply-To: <20130510095737.06143376@anarchist> References: <20130510095737.06143376@anarchist> Message-ID: On Fri, May 10, 2013 at 11:57 PM, Barry Warsaw wrote: > On May 10, 2013, at 05:14 PM, Nick Coghlan wrote: > >> I 396 Module Version Numbers Warsaw > > I do want to eventually return to this PEP, but I probably won't any time > soon. Yeah, I have a couple of PEPs like that - they pretty much live in Deferred and I update them when inspiration strikes :) PEP-a-holic'ly, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rdmurray at bitdance.com Fri May 10 16:19:14 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Fri, 10 May 2013 10:19:14 -0400 Subject: [Python-Dev] PEP 368 In-Reply-To: References: Message-ID: <20130510141914.A942C250498@webabinitio.net> On Fri, 10 May 2013 17:14:21 +1000, Nick Coghlan wrote: > S 368 Standard image protocol and class Mastrodomenico I haven't read through it in detail yet, but this PEP looks interesting in the context of the further enhancements planned for the email module (ie: a MIME image object returned by the email parser is a candidate to provide the PEP 368 interface). Does anyone know if there is any associated code? 
--David From ronaldoussoren at mac.com Fri May 10 16:20:29 2013 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 10 May 2013 16:20:29 +0200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: <20130510155411.277ea9cb@pitrou.net> References: <518CDE16.6010104@python.org> <20130510141622.250aaa65@pitrou.net> <518CFA2D.3020405@python.org> <20130510155411.277ea9cb@pitrou.net> Message-ID: <092C18DC-4076-4983-B9A8-F9B795CE1C4A@mac.com> On 10 May, 2013, at 15:54, Antoine Pitrou wrote: > Le Fri, 10 May 2013 15:46:21 +0200, > Christian Heimes a ?crit : > >> Am 10.05.2013 14:16, schrieb Antoine Pitrou: >>> But what if some systems return more than the file type and less >>> than a full stat result? The general problem is POSIX's terrible >>> inertia. I feel that a stat result with some None fields would be >>> an acceptable compromise here. >> >> POSIX only defines the d_ino and d_name members of struct dirent. >> Linux, BSD and probably some other platforms also happen to provide >> d_type. The other members of struct dirent (d_reclen, d_namlen) >> aren't useful in Python space by themselves. >> >> d_type and st_mode aren't compatible in any way. As you know st_mode >> also contains POSIX permission information. The file type is encoded >> with a different set of bits, too. Future file types aren't mapped to >> S_IF* constants for st_mode. > > Thank you and Ronald for clarifying. This does make the API design a > bit bothersome. We want to expose as much information as possible in a > cross-platform way and with a flexible granularity, but doing so might > require a gazillion of namedtuple fields (platonically, as much as one > field per stat bit). One field per stat bit is overkill, file permissions are well known enough to keep them as a single item. Most if not all uses of the st_mode field can be covered by adding just "filetype" and "permissions" fields. 
That would also make it possible to use stat_result in os.scandir() without
losing information (it would have filetype != None and permissions and
st_mode == None on systems with d_type).

>
>> For d_ino you also need the device number from the directory because
>> the inode is only unique within a device.
>
> But hopefully you've already stat'ed the directory ;)

Why? There's no need to stat the directory when implementing os.walk using
os.scandir (for systems that return filetype information in the API used by
os.scandir). Anyway, setting st_ino in the result of os.scandir is harmless,
even though using st_ino is uncommon.

Getting st_dev from the directory isn't good anyway, for example when using
rebind mounts to mount a single file into a different directory (which is a
convenient way to make a configuration file available in a chroot
environment).

Ronald

>
> Regards
>
> Antoine.

From python at mrabarnett.plus.com Fri May 10 16:30:54 2013
From: python at mrabarnett.plus.com (MRAB)
Date: Fri, 10 May 2013 15:30:54 +0100
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To:
References:
Message-ID: <518D049E.4080900@mrabarnett.plus.com>

On 10/05/2013 11:55, Ben Hoyt wrote:
> A few of us were having a discussion at
> http://bugs.python.org/issue11406 about adding os.scandir(): a
> generator version of os.listdir() to make iterating over very large
> directories more memory efficient. This also reflects how the OS gives
> things to you -- it doesn't give you a big list, but you call a
> function to iterate and fetch the next entry.
> > While I think that's a good idea, I'm not sure just that much is > enough of an improvement to make adding the generator version worth > it. > > But what would make this a killer feature is making os.scandir() > generate tuples of (name, stat_like_info). The Windows directory > iteration functions (FindFirstFile/FindNextFile) give you the full > stat information for free, and the Linux and OS X functions > (opendir/readdir) give you partial file information (d_type in the > dirent struct, which is basically the st_mode part of a stat, whether > it's a file, directory, link, etc). > > Having this available at the Python level would mean we can vastly > speed up functions like os.walk() that otherwise need to make an > os.stat() call for every file returned. In my benchmarks of such a > generator on Windows, it speeds up os.walk() by 9-10x. On Linux/OS X, > it's more like 1.5-3x. In my opinion, that kind of gain is huge, > especially on Windows, but also on Linux/OS X. > > So the idea is to add this relatively low-level function that exposes > the extra information the OS gives us for free, but which os.listdir() > currently throws away. Then higher-level, platform-independent > functions like os.walk() could use os.scandir() to get much better > performance. People over at Issue 11406 think this is a good idea. > > HOWEVER, there's debate over what kind of object the second element in > the tuple, "stat_like_info", should be. My strong vote is for it to be > a stat_result-like object, but where the fields are None if they're > unknown. 
There would be basically three scenarios:
>
> 1) stat_result with all fields set: this would happen on Windows,
> where you get as much info from FindFirst/FindNext as from an
> os.stat()
> 2) stat_result with just st_mode set, and all other fields None: this
> would be the usual case on Linux/OS X
> 3) stat_result with all fields None: this would happen on systems
> whose readdir()/dirent doesn't have d_type, or on Linux/OS X when
> d_type was DT_UNKNOWN
>
> Higher-level functions like os.walk() would then check the fields they
> needed are not None, and only call os.stat() if needed, for example:
>
> # Build lists of files and directories in path
> files = []
> dirs = []
> for name, st in os.scandir(path):
>     if st.st_mode is None:
>         st = os.stat(os.path.join(path, name))
>     if stat.S_ISDIR(st.st_mode):
>         dirs.append(name)
>     else:
>         files.append(name)
>
> Not bad for a 2-10x performance boost, right? What do folks think?
>
> Cheers,
> Ben.
>
[snip]

In the python-ideas list there's a thread "PEP: Extended stat_result"
about adding methods to stat_result.

Using that, you wouldn't necessarily have to look at st.st_mode. The method
could perform an additional os.stat() if the field was None. For example:

    # Build lists of files and directories in path
    files = []
    dirs = []
    for name, st in os.scandir(path):
        if st.is_dir():
            dirs.append(name)
        else:
            files.append(name)

That looks much nicer.

From ronaldoussoren at mac.com Fri May 10 16:42:49 2013
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Fri, 10 May 2013 16:42:49 +0200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To: <518D049E.4080900@mrabarnett.plus.com>
References: <518D049E.4080900@mrabarnett.plus.com>
Message-ID: <561D1ABE-2299-40F7-B325-B724135160C7@mac.com>

On 10 May, 2013, at 16:30, MRAB wrote:

>>
> [snip]
> In the python-ideas list there's a thread "PEP: Extended stat_result"
> about adding methods to stat_result.
>
> Using that, you wouldn't necessarily have to look at st.st_mode. The method
> could perform an additional os.stat() if the field was None. For example:
>
> # Build lists of files and directories in path
> files = []
> dirs = []
> for name, st in os.scandir(path):
>     if st.is_dir():
>         dirs.append(name)
>     else:
>         files.append(name)
>
> That looks much nicer.

I'd prefer a filetype field, with 'st.filetype == "dir"' instead of
'st.is_dir()'. The actual type of filetype values is less important; an enum
type would also work, although bootstrapping that type could be interesting.

Ronald

From solipsis at pitrou.net Fri May 10 17:01:06 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 10 May 2013 17:01:06 +0200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
References: <518CDE16.6010104@python.org> <20130510141622.250aaa65@pitrou.net> <518CFA2D.3020405@python.org>
Message-ID: <20130510170106.33098109@pitrou.net>

On Fri, 10 May 2013 23:53:37 +1000,
Nick Coghlan wrote:
> On Fri, May 10, 2013 at 11:46 PM, Christian Heimes
> wrote:
> > On 10.05.2013 14:16, Antoine Pitrou wrote:
> >> But what if some systems return more than the file type and less
> >> than a full stat result? The general problem is POSIX's terrible
> >> inertia. I feel that a stat result with some None fields would be
> >> an acceptable compromise here.
> >
> > POSIX only defines the d_ino and d_name members of struct dirent.
> > Linux, BSD and probably some other platforms also happen to provide
> > d_type. The other members of struct dirent (d_reclen, d_namlen)
> > aren't useful in Python space by themselves.
> >
> > d_type and st_mode aren't compatible in any way. As you know st_mode
> > also contains POSIX permission information. The file type is encoded
> > with a different set of bits, too. Future file types aren't mapped
> > to S_IF* constants for st_mode.
>
> Why are we exposing a bitfield as the primary Python level API,
> anyway? It makes sense for the well defined permission bits, but why
> are we copying the C level concept for the other flags?

Precisely because they are not well-defined, hence any interpretation by
us may be incorrect or incomplete (e.g. obscure system-specific bits).

Regards

Antoine.

From tjreedy at udel.edu Fri May 10 18:01:57 2013
From: tjreedy at udel.edu (Terry Jan Reedy)
Date: Fri, 10 May 2013 12:01:57 -0400
Subject: [Python-Dev] PEP 0 maintenance - deferring some currently open PEPs
In-Reply-To:
References:
Message-ID:

On 5/10/2013 3:14 AM, Nick Coghlan wrote:
> I'd like to mark a few PEPs that are not currently being actively
> considered for 3.4 as Deferred:
>
> S  286  Enhanced Argument Tuples                               von Löwis
> S  337  Logging Usage in the Standard Library                  Dubner
> S  368  Standard image protocol and class                      Mastrodomenico
> I  396  Module Version Numbers                                 Warsaw
> S  400  Deprecate codecs.StreamReader and codecs.StreamWriter  Stinner
> S  419  Protecting cleanup statements from interruptions       Colomiets
> I  423  Naming conventions and recipes related to packaging    Bryon
> I  444  Python Web3 Interface                                  McDonough, Ronacher
> S 3124  Overloading, Generic Functions, Interfaces, and ...    Eby
> S 3142  Add a "while" clause to generator expressions          Britton

I had the impression that this had more or less been rejected. I suppose I
could try to dig up the discussion.

> S 3143  Standard daemon process library                        Finney
> S 3145  Asynchronous I/O For subprocess.Popen                  Pruitt, McCreary, Carlson
> S 3152  Cofunctions                                            Ewing

You might also ask the authors if they are still really in favor of them or
have any hope for them, considering whatever discussion occurred, or whether
they have abandoned them (which would mean withdrawn).
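
A note on the os.scandir() thread above: the os.walk()-style loop from Ben
Hoyt's proposal can be tried out with a small emulation. What follows is only
a sketch under stated assumptions: scandir_sketch(), PartialStat,
split_entries() and the pretend_unknown flag are hypothetical names that
appear nowhere in the thread, and the emulation calls os.lstat() itself, so
it illustrates the calling convention (fall back to a real stat call only
when st_mode is None) rather than any speed-up. It also uses os.lstat()
rather than os.stat() for the fallback, on the assumption that a d_type-based
result would describe a symlink itself and not its target.

```python
import collections
import os
import stat

# Hypothetical stand-in for the stat_result-like object discussed in the
# thread: only st_mode is modelled, and it may be None when unknown.
PartialStat = collections.namedtuple("PartialStat", ["st_mode"])


def scandir_sketch(path, pretend_unknown=False):
    """Yield (name, PartialStat) pairs, emulating the proposed API shape.

    Assumption: this is NOT the proposed implementation. It calls
    os.lstat() per entry, so it only demonstrates how callers would
    consume the tuples.
    """
    for name in os.listdir(path):
        if pretend_unknown:
            # Emulates a platform without d_type, or a DT_UNKNOWN entry.
            yield name, PartialStat(st_mode=None)
        else:
            mode = os.lstat(os.path.join(path, name)).st_mode
            yield name, PartialStat(st_mode=mode)


def split_entries(path, pretend_unknown=False):
    """Return (dirs, files) for path, stat-ing only when st_mode is None."""
    dirs, files = [], []
    for name, st in scandir_sketch(path, pretend_unknown):
        if st.st_mode is None:
            # Fallback: the directory scan gave us no type information.
            st = os.lstat(os.path.join(path, name))
        if stat.S_ISDIR(st.st_mode):
            dirs.append(name)
        else:
            files.append(name)
    return sorted(dirs), sorted(files)
```

Calling split_entries(some_directory) with and without pretend_unknown=True
should give the same two lists, which is the point of the "None means unknown,
go and ask" convention: callers get one code path for every platform.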
From status at bugs.python.org Fri May 10 18:07:34 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 10 May 2013 18:07:34 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20130510160734.3F02256921@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-05-03 - 2013-05-10) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 3963 (+10) closed 25758 (+44) total 29721 (+54) Open issues with patches: 1774 Issues opened (48) ================== #5845: rlcompleter should be enabled automatically http://bugs.python.org/issue5845 reopened by mark.dickinson #15902: imp.load_module won't accept None for the file argument for a http://bugs.python.org/issue15902 reopened by brett.cannon #17656: Python 2.7.4 breaks ZipFile extraction of zip files with unico http://bugs.python.org/issue17656 reopened by pitrou #17883: Fix buildbot testing of Tkinter http://bugs.python.org/issue17883 reopened by ezio.melotti #17895: TemporaryFile name returns an integer in python3 http://bugs.python.org/issue17895 opened by jtaylor #17896: Move Windows external libs from \..\ to \externals http://bugs.python.org/issue17896 opened by zach.ware #17897: Optimize unpickle prefetching http://bugs.python.org/issue17897 opened by serhiy.storchaka #17898: gettext bug while parsing plural-forms metadata http://bugs.python.org/issue17898 opened by straz #17899: os.listdir() leaks FDs if invoked on FD pointing to a non-dire http://bugs.python.org/issue17899 opened by abacabadabacaba #17900: Recursive OrderedDict pickling http://bugs.python.org/issue17900 opened by serhiy.storchaka #17901: _elementtree.TreeBuilder raises IndexError on end if construct http://bugs.python.org/issue17901 opened by Aaron.Oakley #17902: Document that _elementtree C API cannot use custom TreeBuilder http://bugs.python.org/issue17902 opened by Aaron.Oakley #17903: 
Python launcher for windows should search path for #!/usr/bin/ http://bugs.python.org/issue17903 opened by pmoore #17904: bytes should be listed as built-in function for 2.7 http://bugs.python.org/issue17904 opened by flox #17905: Add check for locale.h http://bugs.python.org/issue17905 opened by cavallo71 #17906: JSON should accept lone surrogates http://bugs.python.org/issue17906 opened by serhiy.storchaka #17907: Deprecate imp.new_module() in favour of types.ModuleType http://bugs.python.org/issue17907 opened by brett.cannon #17908: Unittest runner needs an option to call gc.collect() after eac http://bugs.python.org/issue17908 opened by gvanrossum #17909: Autodetecting JSON encoding http://bugs.python.org/issue17909 opened by serhiy.storchaka #17911: Extracting tracebacks does too much work http://bugs.python.org/issue17911 opened by gvanrossum #17913: stat.filemode returns "-" for sockets and unknown types http://bugs.python.org/issue17913 opened by christian.heimes #17914: add os.cpu_count() http://bugs.python.org/issue17914 opened by neologix #17915: Encoding error with sax and codecs http://bugs.python.org/issue17915 opened by sconseil #17916: Provide dis.Bytecode based equivalent of dis.distb http://bugs.python.org/issue17916 opened by ncoghlan #17917: use PyModule_AddIntMacro() instead of PyModule_AddIntConstant( http://bugs.python.org/issue17917 opened by neologix #17919: AIX POLLNVAL definition causes problems http://bugs.python.org/issue17919 opened by delhallt #17920: Documentation: "complete ordering" should be "total ordering" http://bugs.python.org/issue17920 opened by abcdef #17922: Crash in clear_weakref http://bugs.python.org/issue17922 opened by jsafrane #17923: test glob with trailing slash fail http://bugs.python.org/issue17923 opened by delhallt #17924: Deprecate stat.S_IF* integer constants http://bugs.python.org/issue17924 opened by christian.heimes #17925: asynchat.async_chat.initiate_send : del deque[0] is not safe 
http://bugs.python.org/issue17925 opened by Pierrick.Koch #17927: Argument copied into cell still referenced by frame http://bugs.python.org/issue17927 opened by gvanrossum #17930: Search not needed in combinations_with_replacement http://bugs.python.org/issue17930 opened by tim_one #17931: PyLong_FromPid() is not correctly defined on Windows 64-bit http://bugs.python.org/issue17931 opened by haypo #17932: Win64: possible integer overflow in iterobject.c http://bugs.python.org/issue17932 opened by haypo #17933: test_ftp failure / ftplib error formatting issue http://bugs.python.org/issue17933 opened by pitrou #17934: Add a frame method to clear expensive details http://bugs.python.org/issue17934 opened by pitrou #17936: O(n**2) behaviour when adding/removing classes http://bugs.python.org/issue17936 opened by kristjan.jonsson #17937: Collect garbage harder at shutdown http://bugs.python.org/issue17937 opened by pitrou #17939: Misleading information about slice assignment in docs http://bugs.python.org/issue17939 opened by stefanchrobot #17940: extra code in argparse.py http://bugs.python.org/issue17940 opened by aho #17941: namedtuple should support fully qualified name for more portab http://bugs.python.org/issue17941 opened by eli.bendersky #17942: IDLE Debugger: names, values misaligned http://bugs.python.org/issue17942 opened by terry.reedy #17943: AttributeError: 'long' object has no attribute 'release' in Qu http://bugs.python.org/issue17943 opened by georg.brandl #17944: Refactor test_zipfile http://bugs.python.org/issue17944 opened by serhiy.storchaka #17945: tkinter/Python 3.3.0: peer_create doesn't instantiate Text http://bugs.python.org/issue17945 opened by ghoul #17947: Code, test, and doc review for PEP-0435 Enum http://bugs.python.org/issue17947 opened by ethan.furman #17948: HTTPS and sending a big file size hangs. 
http://bugs.python.org/issue17948 opened by jesusvpct Most recent 15 issues with no replies (15) ========================================== #17944: Refactor test_zipfile http://bugs.python.org/issue17944 #17942: IDLE Debugger: names, values misaligned http://bugs.python.org/issue17942 #17940: extra code in argparse.py http://bugs.python.org/issue17940 #17939: Misleading information about slice assignment in docs http://bugs.python.org/issue17939 #17937: Collect garbage harder at shutdown http://bugs.python.org/issue17937 #17934: Add a frame method to clear expensive details http://bugs.python.org/issue17934 #17933: test_ftp failure / ftplib error formatting issue http://bugs.python.org/issue17933 #17931: PyLong_FromPid() is not correctly defined on Windows 64-bit http://bugs.python.org/issue17931 #17924: Deprecate stat.S_IF* integer constants http://bugs.python.org/issue17924 #17923: test glob with trailing slash fail http://bugs.python.org/issue17923 #17916: Provide dis.Bytecode based equivalent of dis.distb http://bugs.python.org/issue17916 #17909: Autodetecting JSON encoding http://bugs.python.org/issue17909 #17905: Add check for locale.h http://bugs.python.org/issue17905 #17904: bytes should be listed as built-in function for 2.7 http://bugs.python.org/issue17904 #17902: Document that _elementtree C API cannot use custom TreeBuilder http://bugs.python.org/issue17902 Most recent 15 issues waiting for review (15) ============================================= #17947: Code, test, and doc review for PEP-0435 Enum http://bugs.python.org/issue17947 #17944: Refactor test_zipfile http://bugs.python.org/issue17944 #17937: Collect garbage harder at shutdown http://bugs.python.org/issue17937 #17936: O(n**2) behaviour when adding/removing classes http://bugs.python.org/issue17936 #17932: Win64: possible integer overflow in iterobject.c http://bugs.python.org/issue17932 #17931: PyLong_FromPid() is not correctly defined on Windows 64-bit http://bugs.python.org/issue17931 
#17927: Argument copied into cell still referenced by frame http://bugs.python.org/issue17927 #17925: asynchat.async_chat.initiate_send : del deque[0] is not safe http://bugs.python.org/issue17925 #17923: test glob with trailing slash fail http://bugs.python.org/issue17923 #17919: AIX POLLNVAL definition causes problems http://bugs.python.org/issue17919 #17917: use PyModule_AddIntMacro() instead of PyModule_AddIntConstant( http://bugs.python.org/issue17917 #17915: Encoding error with sax and codecs http://bugs.python.org/issue17915 #17913: stat.filemode returns "-" for sockets and unknown types http://bugs.python.org/issue17913 #17909: Autodetecting JSON encoding http://bugs.python.org/issue17909 #17906: JSON should accept lone surrogates http://bugs.python.org/issue17906 Top 10 most discussed issues (10) ================================= #5845: rlcompleter should be enabled automatically http://bugs.python.org/issue5845 24 msgs #17922: Crash in clear_weakref http://bugs.python.org/issue17922 16 msgs #17927: Argument copied into cell still referenced by frame http://bugs.python.org/issue17927 15 msgs #11406: There is no os.listdir() equivalent returning generator instea http://bugs.python.org/issue11406 14 msgs #17883: Fix buildbot testing of Tkinter http://bugs.python.org/issue17883 14 msgs #17914: add os.cpu_count() http://bugs.python.org/issue17914 13 msgs #1545463: New-style classes fail to cleanup attributes http://bugs.python.org/issue1545463 12 msgs #11016: stat module in C http://bugs.python.org/issue11016 10 msgs #17810: Implement PEP 3154 (pickle protocol 4) http://bugs.python.org/issue17810 9 msgs #17868: pprint long non-printable bytes as hexdump http://bugs.python.org/issue17868 8 msgs Issues closed (44) ================== #2262: Helping the compiler avoid memory references in PyEval_EvalFra http://bugs.python.org/issue2262 closed by pitrou #6178: Core error in Py_EvalFrameEx 2.6.2 http://bugs.python.org/issue6178 closed by pitrou #7330: 
PyUnicode_FromFormat: implement width and precision for %s, %S http://bugs.python.org/issue7330 closed by haypo #7855: Add test cases for ctypes/winreg for issues found in IronPytho http://bugs.python.org/issue7855 closed by ezio.melotti #9687: dbmmodule.c:dbm_contains fails on 64bit big-endian (test_dbm.p http://bugs.python.org/issue9687 closed by pitrou #10363: Embedded python, handle (memory) leak http://bugs.python.org/issue10363 closed by pitrou #11816: Refactor the dis module to provide better building blocks for http://bugs.python.org/issue11816 closed by ncoghlan #12181: SIGBUS error on OpenBSD (sparc64) http://bugs.python.org/issue12181 closed by neologix #13495: IDLE: Regressions - Two ColorDelegator instances loaded http://bugs.python.org/issue13495 closed by roger.serwy #13831: get method of multiprocessing.pool.Async should return full t http://bugs.python.org/issue13831 closed by sbt #14173: PyOS_FiniInterupts leaves signal.getsignal segfaulty http://bugs.python.org/issue14173 closed by pitrou #14187: add "function annotation" entry to Glossary http://bugs.python.org/issue14187 closed by r.david.murray #14878: Improve documentation for generator.send method http://bugs.python.org/issue14878 closed by akuchling #15528: Better support for finalization with weakrefs http://bugs.python.org/issue15528 closed by sbt #15834: 2to3 benchmark not working under Python 3 http://bugs.python.org/issue15834 closed by brett.cannon #16445: SEGFAULT when deleting Exception.message http://bugs.python.org/issue16445 closed by pitrou #16523: attrgetter and itemgetter signatures in docs need cleanup http://bugs.python.org/issue16523 closed by ezio.melotti #16584: unhandled IOError filecmp.cmpfiles() if file not readable http://bugs.python.org/issue16584 closed by terry.reedy #16601: Restarting iteration over tarfile continues from where it left http://bugs.python.org/issue16601 closed by serhiy.storchaka #16631: tarfile.extractall() doesn't extract everything if .next() 
was http://bugs.python.org/issue16631 closed by serhiy.storchaka #17094: sys._current_frames() reports too many/wrong stack frames http://bugs.python.org/issue17094 closed by pitrou #17115: __loader__ = None should be fine http://bugs.python.org/issue17115 closed by brett.cannon #17116: xml.parsers.expat.(errors|model) don't set the __loader__ attr http://bugs.python.org/issue17116 closed by brett.cannon #17289: readline.set_completer_delims() doesn't play well with others http://bugs.python.org/issue17289 closed by pitrou #17408: second python execution fails when embedding http://bugs.python.org/issue17408 closed by pitrou #17714: str.encode('base64') add trailing new line character. It is no http://bugs.python.org/issue17714 closed by ezio.melotti #17798: IDLE: can not edit new file names when using -e http://bugs.python.org/issue17798 closed by roger.serwy #17805: No such class: multiprocessing.pool.AsyncResult http://bugs.python.org/issue17805 closed by sbt #17807: Generator cleanup without tp_del http://bugs.python.org/issue17807 closed by pitrou #17809: FAIL: test_expanduser when $HOME ends with / http://bugs.python.org/issue17809 closed by ezio.melotti #17833: test_gdb broken PPC64 Linux http://bugs.python.org/issue17833 closed by dmalcolm #17841: Remove missing aliases from codecs documentation http://bugs.python.org/issue17841 closed by ezio.melotti #17871: Wrong signature of TextTestRunner's init function http://bugs.python.org/issue17871 closed by ezio.melotti #17877: Skip test_variable_tzname when the zoneinfo database is missin http://bugs.python.org/issue17877 closed by ezio.melotti #17910: Usage error in multiprocessing documentation http://bugs.python.org/issue17910 closed by amysyk #17912: thread states should use a doubly-linked list http://bugs.python.org/issue17912 closed by neologix #17918: failed incoming SSL connection stays open forever http://bugs.python.org/issue17918 closed by pitrou #17921: explicit empty check instead of implicit 
booleaness
http://bugs.python.org/issue17921  closed by r.david.murray

#17926: PowerLinux dbm failure in 2.7
http://bugs.python.org/issue17926  closed by pitrou

#17928: PowerLinux getargs.c FETCH_SIZE endianness bug
http://bugs.python.org/issue17928  closed by pitrou

#17929: TypeError using tarfile.addfile() with io.StringIO replacing S
http://bugs.python.org/issue17929  closed by r.david.murray

#17935: Failed compile on XP buildbot
http://bugs.python.org/issue17935  closed by pitrou

#17938: Duplicate text in docs/reference/import statement
http://bugs.python.org/issue17938  closed by ezio.melotti

#17946: base64 encoding result should be str, not bytes
http://bugs.python.org/issue17946  closed by r.david.murray

From guido at python.org Fri May 10 18:19:50 2013
From: guido at python.org (Guido van Rossum)
Date: Fri, 10 May 2013 09:19:50 -0700
Subject: [Python-Dev] PEP 0 maintenance - deferring some currently open PEPs
In-Reply-To:
References:
Message-ID:

On Fri, May 10, 2013 at 9:01 AM, Terry Jan Reedy wrote:
> On 5/10/2013 3:14 AM, Nick Coghlan wrote:
>>
>> I'd like to mark a few PEPs that are not currently being actively
>> considered for 3.4 as Deferred:
>>
>> S  286  Enhanced Argument Tuples                               von Löwis
>> S  337  Logging Usage in the Standard Library                  Dubner
>> S  368  Standard image protocol and class                      Mastrodomenico
>> I  396  Module Version Numbers                                 Warsaw
>> S  400  Deprecate codecs.StreamReader and codecs.StreamWriter  Stinner
>> S  419  Protecting cleanup statements from interruptions       Colomiets
>> I  423  Naming conventions and recipes related to packaging    Bryon
>> I  444  Python Web3 Interface                                  McDonough, Ronacher
>> S 3124  Overloading, Generic Functions, Interfaces, and ...    Eby
>> S 3142  Add a "while" clause to generator expressions          Britton
>
>
> I had the impression that this had more or less been rejected. I suppose I
> could try to dig up the discussion.

I didn't know there was a PEP for that. I hereby reject it. No point
wasting more time on it.
>> S 3143  Standard daemon process library                        Finney
>> S 3145  Asynchronous I/O For subprocess.Popen                  Pruitt, McCreary, Carlson
>> S 3152  Cofunctions                                            Ewing
>
>
> You might also ask the authors if they are still really in favor of them or
> have any hope for them, considering whatever discussion occurred, or whether
> they have abandoned them (which would mean withdrawn).

--
--Guido van Rossum (python.org/~guido)

From barry at python.org Fri May 10 18:29:41 2013
From: barry at python.org (Barry Warsaw)
Date: Fri, 10 May 2013 12:29:41 -0400
Subject: [Python-Dev] PEP 0 maintenance - deferring some currently open PEPs
In-Reply-To:
References:
Message-ID: <20130510122941.7f001c2d@limelight.wooz.org>

On May 10, 2013, at 09:19 AM, Guido van Rossum wrote:

>>> S 3142  Add a "while" clause to generator expressions          Britton
>>
>> I had the impression that this had more or less been rejected. I suppose I
>> could try to dig up the discussion.
>
>I didn't know there was a PEP for that. I hereby reject it. No point
>wasting more time on it.

Done.

-Barry

From ncoghlan at gmail.com Fri May 10 18:30:54 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 11 May 2013 02:30:54 +1000
Subject: [Python-Dev] PEP 0 maintenance - deferring some currently open PEPs
In-Reply-To:
References:
Message-ID:

On Sat, May 11, 2013 at 2:01 AM, Terry Jan Reedy wrote:
> You might also ask the authors if they are still really in favor of them or
> have any hope for them, considering whatever discussion occurred, or whether
> they have abandoned them (which would mean withdrawn).

That's real work though, compared to just marking them as Deferred :)

Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
From JMao at rocketsoftware.com Fri May 10 19:31:39 2013
From: JMao at rocketsoftware.com (Jianfeng Mao)
Date: Fri, 10 May 2013 17:31:39 +0000
Subject: [Python-Dev] make a Windows installation package (.msi) for Python 3.3
Message-ID: <6478B930E8DB4545BD8EEADFAA73FED878CF1753@nwt-s-mbx1.rocketsoftware.com>

To Python Windows Release Managers:

My name is Jianfeng Mao and I am a software developer at the U2 group in
Rocket Software (http://u2.rocketsoftware.com/). I am currently working on
a project to embed a slightly customized Python interpreter in our product.
For easy installation and setup, we hope to be able to do the standard
Python installation during the installation of our software. Basically I
want to create a .msi file that can be called to install the full Python if
the user needs this new feature. Brian Curtin (brian at python.org) pointed me
to Tools/msi/msi.py for the Windows MSI builder. I tried to follow the
instructions in the README but couldn't get it to work after a few twists
and turns. Brian mentioned that few people need to do this and only
release managers handle the packaging of Python. I have listed the steps I
have taken in my attempt to create the .msi file. Please let me know if I
have missed anything or done anything wrong.

1. hg clone http://hg.python.org/cpython
2. cd cpython
3. hg update 3.3
4. cd tools\buildbot, edit build.bat to change the configuration from
   Debug to Release; edit external.bat, change DEBUG=1 to DEBUG=0
5. go back to cpython\ and run tools\buildbot\build.bat
6. cd PC, then do 'nmake -f icons.mak'
7. cd ..\tools\msi
8. c:\python27\python msi.py

WARNING: nm did not run successfully - libpythonXX.a not built
cl /O2 /D WIN32 /D NDEBUG /D _WINDOWS /MT /W3 /c msisupport.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
msisupport.c
link.exe /OUT:msisupport.dll /INCREMENTAL:NO /NOLOGO /DLL /SUBSYSTEM:WINDOWS /OPT:REF /OPT:ICF msisupport.obj msi.lib kernel32.lib
Creating library msisupport.lib and object msisupport.exp
Traceback (most recent call last):
  File "msi.py", line 1336, in <module>
    add_files(db)
  File "msi.py", line 961, in add_files
    generate_license()
  File "msi.py", line 914, in generate_license
    raise ValueError, "Could not find "+srcdir+"/../"+pat
ValueError: Could not find C:\temp\cpython/../tcl8*

From larry at hastings.org Fri May 10 23:00:09 2013
From: larry at hastings.org (Larry Hastings)
Date: Fri, 10 May 2013 14:00:09 -0700
Subject: [Python-Dev] PEP 0 maintenance - deferring some currently open PEPs
In-Reply-To:
References:
Message-ID: <518D5FD9.1060800@hastings.org>

On 05/10/2013 12:14 AM, Nick Coghlan wrote:
> I'd like to mark a few PEPs that are not currently being actively
> considered for 3.4 as Deferred:

I swear I posted a list like this a couple years ago. Now I can't find
it. Anyway it was completely ignored then, probably because I'm not
Nick Coghlan.

//arry/

From solipsis at pitrou.net Fri May 10 23:08:27 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 10 May 2013 23:08:27 +0200
Subject: [Python-Dev] PEP 0 maintenance - deferring some currently open PEPs
References: <518D5FD9.1060800@hastings.org>
Message-ID: <20130510230827.144ef614@fsol>

On Fri, 10 May 2013 14:00:09 -0700
Larry Hastings wrote:
>
> I swear I posted a list like this a couple years ago. Now I can't find
> it. Anyway it was completely ignored then, probably because I'm not
> Nick Coghlan.

Many people in the world suffer with this problem.

Regards

Antoine.
From brian at python.org Fri May 10 23:13:04 2013
From: brian at python.org (Brian Curtin)
Date: Fri, 10 May 2013 16:13:04 -0500
Subject: [Python-Dev] make a Windows installation package (.msi) for Python 3.3
In-Reply-To: <6478B930E8DB4545BD8EEADFAA73FED878CF1753@nwt-s-mbx1.rocketsoftware.com>
References: <6478B930E8DB4545BD8EEADFAA73FED878CF1753@nwt-s-mbx1.rocketsoftware.com>
Message-ID:

On Fri, May 10, 2013 at 12:31 PM, Jianfeng Mao wrote:
> To Python Windows Release Managers:
>
> My name is Jianfeng Mao and I am a software developer at the U2 group in
> Rocket Software (http://u2.rocketsoftware.com/). I am currently working on
> a project to embed a slightly customized Python interpreter in our product.
> For easy installation and setup, we hope to be able to do the standard
> Python installation during the installation of our software. Basically I
> want to create a .msi file that can be called to install the full Python if
> the user needs this new feature. Brian Curtin (brian at python.org) pointed me
> to Tools/msi/msi.py for the Windows MSI builder. I tried to follow the
> instructions in the README but couldn't get it to work after a few twists
> and turns. Brian mentioned that few people need to do this and only
> release managers handle the packaging of Python. I have listed the steps I
> have taken in my attempt to create the .msi file. Please let me know if I
> have missed anything or done anything wrong.
>
> 1. hg clone http://hg.python.org/cpython
> 2. cd cpython
> 3. hg update 3.3
> 4. cd tools\buildbot, edit build.bat to change the configuration from
>    Debug to Release; edit external.bat, change DEBUG=1 to DEBUG=0
> 5. go back to cpython\ and run tools\buildbot\build.bat
> 6. cd PC, then do 'nmake -f icons.mak'
> 7. cd ..\tools\msi
> 8. c:\python27\python msi.py
>
> WARNING: nm did not run successfully - libpythonXX.a not built
> cl /O2 /D WIN32 /D NDEBUG /D _WINDOWS /MT /W3 /c msisupport.c
> Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86
> Copyright (C) Microsoft Corporation. All rights reserved.
>
> msisupport.c
> link.exe /OUT:msisupport.dll /INCREMENTAL:NO /NOLOGO /DLL /SUBSYSTEM:WINDOWS /OPT:REF /OPT:ICF msisupport.obj msi.lib kernel32.lib
> Creating library msisupport.lib and object msisupport.exp
> Traceback (most recent call last):
>   File "msi.py", line 1336, in <module>
>     add_files(db)
>   File "msi.py", line 961, in add_files
>     generate_license()
>   File "msi.py", line 914, in generate_license
>     raise ValueError, "Could not find "+srcdir+"/../"+pat
> ValueError: Could not find C:\temp\cpython/../tcl8*

I'm in an airport and on a Mac right now so I can't test it, but IIRC
you just need to adjust the script to look for tcl-8* and not tcl8* on
line 908 of msi.py. You'll probably have to do the same for tk.

If you come across other exceptions about tcl, tk, or other
dependencies, it's likely that the paths are just incorrect. There may
be a patch for this on bugs.python.org because I know I've gone
through it.

From benhoyt at gmail.com Sat May 11 03:29:00 2013
From: benhoyt at gmail.com (Ben Hoyt)
Date: Sat, 11 May 2013 13:29:00 +1200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To: <518D049E.4080900@mrabarnett.plus.com>
References: <518D049E.4080900@mrabarnett.plus.com>
Message-ID:

> In the python-ideas list there's a thread "PEP: Extended stat_result"
> about adding methods to stat_result.
>
> Using that, you wouldn't necessarily have to look at st.st_mode. The method
> could perform an additional os.stat() if the field was None. For example:
>
> # Build lists of files and directories in path
> files = []
> dirs = []
> for name, st in os.scandir(path):
>     if st.is_dir():
>         dirs.append(name)
>     else:
>         files.append(name)

That's not too bad. However, the st.is_dir() function could
potentially call os.stat(), so you'd have to be specific about how
errors are handled. Also, I'm not too enthusiastic about how much "API
weight" this would add -- do you need st.is_link() and st.size() and
st.everything_else() as well?

-Ben

From benhoyt at gmail.com Sat May 11 06:24:48 2013
From: benhoyt at gmail.com (Ben Hoyt)
Date: Sat, 11 May 2013 16:24:48 +1200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To: <518CDE16.6010104@python.org>
References: <518CDE16.6010104@python.org>
Message-ID:

> Have you actually tried the code? It can't give you correct answers. The
> struct dirent.d_type member as returned by readdir() has different
> values than stat.st_mode's file type.

Yes, I'm quite aware of that. In the first version of BetterWalk
that's exactly how it did it, and this approach worked fine.
However...

> Or are you proposing to map d_type to st_mode?

Yes, that's exactly what I was proposing -- sorry if that wasn't clear.

> Hence I'm +1 on the general idea but -1 on something stat like. IMHO
> os.scandir() should yield four objects:
>
> * name
> * inode
> * file type or DT_UNKNOWN
> * stat_result or None

This feels quite heavy to me. And I don't like how, for the normal
case (checking whether something was a file or directory), you'd have
to check file_type against DT_UNKNOWN as well as stat_result against
None before doing anything with it:

    for item in os.scandir():
        if item.file_type == DT_UNKNOWN and item.stat_result is None:
            # call os.stat()

I guess that's not *too* bad.

> That's also problematic because st_mode would only have file type
> bits, not permission bits.

You're right.
However, given that scandir() is intended as a low-level, OS-specific function, couldn't we just document this and move on? Keep the API nice and simple and still cover 95% of the use cases. How often does anyone actually iterate through a directory doing stuff with the permission bits?

The nice thing about having it return a stat-like object is that in almost all cases you don't have to have two different code paths (d_type and st_mode), you just deal with st_mode. And we already have the stat module for dealing with st_mode stuff, so we wouldn't need another bunch of code/constants for dealing with d_type.

The documentation could just say something like:

"The exact information returned in st_mode is OS-specific. In practice, on Windows it returns all the information that stat() does. On Linux and OS X, it's either None or it includes the mode bits (but not the permissions bits)."

Antoine said: "But what if some systems return more than the file type and less than a full stat result?"

Again, I just think that debating the very fine points like this to get that last 5% of use cases will mean we never have this very useful function in the library.

In all the *practical* examples I've seen (and written myself), I iterate over a directory and I just need to know whether it's a file or directory (or maybe a link). Occasionally you need the size as well, but that would just mean a similar check "if st.st_size is None: st = os.stat(...)", which on Linux/OS X would call stat(), but it'd still be free and fast on Windows.
-Ben From v+python at g.nevcal.com Sat May 11 07:15:59 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Fri, 10 May 2013 22:15:59 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <5185FCB1.6030702@g.nevcal.com> References: <5185FCB1.6030702@g.nevcal.com> Message-ID: <518DD40F.1070005@g.nevcal.com> So, thanks everyone for helping me understand the metaclass issues, and helping fix my code and the reference implementation, so that I got a working workaround for enumerations. Twiddling some more.... newly using hg and bitbucket... learned a lot today... at Premise: For API parameters that are bitfields, it would be nice to have an Enum-type class that tracks the calculations done with the named values to create bit masks. Currently available: Enum (or IntEnum) that can group the collection of named bit-field values into values of a single and unique type, but loses the names during calculations. Just written: a class IntET (for "int expression tracker" which has an expression name as well as a value. As a side effect, this could be used to track effective calculations of integers for debugging, because that is the general mechanism needed to track combined flag values, also for debug reporting. So it is quite possible to marry the two, as Ethan helped me figure out using an earlier NamedInt class: class NIE( IntET, Enum ): x = ('NIE.x', 1) y = ('NIE.y', 2) z = ('NIE.z', 4) and then expressions involving members of NIE (and even associated integers) will be tracked... see demo1.py. But the last few lines of demo1 demonstrate that NIE doesn't like, somehow, remember that its values, deep down under the covers, are really int. And doesn't even like them when they are wrapped into IntET objects. This may or may not be a bug in the current Enum implementation. It is cumbersome to specify redundant names for the enumeration members and the underlying IntET separately, however. 
Turns out that adding one line to ref435 (which I did in ref435a) will allow (nay, require) that base types for these modified Enums must have names, which must be supplied as the first parameter to their constructor. This also works around whatever problem "real" Enum has with using named items internally, as demonstrated by demo2.py (reproduced in part here): class NIE( IntET, Enum ): x = 1 y = 2 z = 4 print( repr( NIE.x + NIE.y )) IntET('(NIE.x + NIE.y)', 3) So the questions are: 1) Is there a bug in ref435 Enum that makes demo1 report errors instead of those lines working? 2) Is something like demo2 interesting to anyone but me? Of course, I think it would be great for reporting flag values using names rather than a number representing combined bit fields. 3) I don't see a way to subclass the ref435 EnumMeta except by replacing the whole __new__ method... does this mechanism warrant a slight refactoring of EnumMeta to make this mechanism easier to subclass with less code redundancy? 4) Or is it simple enough and useful enough to somehow make it a feature of EnumMeta, enabled by a keyword parameter? Or one _could_ detect the existence of a __name__ property on the first base type, and key off of that, but that may sometimes be surprising (of course, that is what documentation is for: to explain away the surprises people get when they don't read it). 5) All this is based on "IntET"... which likely suffices for API flags parameters... but when I got to __truediv__ and __rtruediv__, which don't return int, then I started wondering how to write a vanilla ET class that inherits from "number" instead of "int" or "float"? One could, of course, make cooperating classes FloatET and DecimalET .... is this a language limitation, or is there more documentation I haven't read? :) (I did read footnote [1] of , and trembled.) 
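Editorial sketch: the expression-tracking idea described above can be illustrated with a small int subclass. This is not the IntET from the flags.py mentioned in the thread (that file is not reproduced here); the attribute name `expr` and the helper methods `_label` and `_combine` are assumptions made for illustration, and only a few operators with integer operands are covered.

```python
class IntET(int):
    """Hypothetical sketch: an int that remembers the expression that produced it."""

    def __new__(cls, expr, value):
        self = super().__new__(cls, value)
        self.expr = expr  # assumed attribute name, not taken from the thread
        return self

    def __repr__(self):
        return "IntET(%r, %d)" % (self.expr, int(self))

    @staticmethod
    def _label(operand):
        # Tracked operands contribute their expression, plain ints their literal.
        return operand.expr if isinstance(operand, IntET) else repr(operand)

    def _combine(self, other, symbol, result):
        expr = "(%s %s %s)" % (self.expr, symbol, self._label(other))
        return IntET(expr, result)

    def __add__(self, other):
        return self._combine(other, "+", int(self) + int(other))

    def __or__(self, other):
        return self._combine(other, "|", int(self) | int(other))

    def __and__(self, other):
        return self._combine(other, "&", int(self) & int(other))


x = IntET("NIE.x", 1)
y = IntET("NIE.y", 2)
print(repr(x + y))
print(repr((x | y) & 3))
```

Because the result of each operation is again an IntET, combined flag values keep a readable name that could be shown in debug output or exception messages, which is the use case the message argues for.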
Probably some of these questions should be on stackoverflow or python ideas, but it is certainly an outgrowth of the Enum PEP, and personally, I'd hate to see flag APIs converted to Enum without the ability to track combinations of them... so I hope that justifies parts of this discussion continuing here. I'm happy to take pieces to other places, if so directed. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sat May 11 08:02:21 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 10 May 2013 23:02:21 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <518DD40F.1070005@g.nevcal.com> References: <5185FCB1.6030702@g.nevcal.com> <518DD40F.1070005@g.nevcal.com> Message-ID: <518DDEED.9050407@stoneleaf.us> On 05/10/2013 10:15 PM, Glenn Linderman wrote: > > But the last few lines of demo1 demonstrate that NIE doesn't like, somehow, remember that its values, deep down under > the covers, are really int. And doesn't even like them when they are wrapped into IntET objects. This may or may not > be a bug in the current Enum implementation. You're right, sort of. ;) If you do print( repr( NIE1.x.value )) you'll see ('NIE1.x', 1) In other words, the value of NEI.x is `('NEI1.x', 1)` and that is what you would have to pass back into NEI to get the enum member. As an aside, I suspect you are doing this the hard way. Perhaps writing your own __new__ in NIE will have better results (I'd try, but I gotta get some sleep! ;) . Oh, newest code posted. 
-- ~Ethan~ From v+python at g.nevcal.com Sat May 11 09:11:49 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sat, 11 May 2013 00:11:49 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <518DDEED.9050407@stoneleaf.us> References: <5185FCB1.6030702@g.nevcal.com> <518DD40F.1070005@g.nevcal.com> <518DDEED.9050407@stoneleaf.us> Message-ID: <518DEF35.2010604@g.nevcal.com> On 5/10/2013 11:02 PM, Ethan Furman wrote: > On 05/10/2013 10:15 PM, Glenn Linderman wrote: >> >> But the last few lines of demo1 demonstrate that NIE doesn't like, >> somehow, remember that its values, deep down under >> the covers, are really int. And doesn't even like them when they are >> wrapped into IntET objects. This may or may not >> be a bug in the current Enum implementation. > > You're right, sort of. ;) > > If you do > > print( repr( NIE1.x.value )) > > you'll see > > ('NIE1.x', 1) > > In other words, the value of NEI.x is `('NEI1.x', 1)` and that is what > you would have to pass back into NEI to get the enum member. Ah! But the value of NIE.x should be IntET('NIE.x', 1), no? So Enum is presently saving the constructor parameters as the value, rather than the constructed object? So for Enums of builtin types, there is little difference, but for complex types (as opposed to complex numbers), there is a difference, and I guess I ran into the consequences of that difference. > As an aside, I suspect you are doing this the hard way. Perhaps > writing your own __new__ in NIE will have better results NIE.__new__ wouldn't have the name available, unless it is passed in. So it seems to me that Enum (or Enum+) has to pass in the parameter... in which case NIE.__new__ can be pretty ordinary. Other implementation strategies that occurred to me... maybe I'll try them all, if I have time... but time looks to get scarcer soon... 
* I'm playing with adding another keyword parameter to Enum, but it is presently giving me an error about unknown keyword parameter passed to __prepare__ even though I added **kwds to the list of its parameters. I'll learn something by doing this. * Implement a subclass of Enum that has a bunch of operator tracking methods like IntET. However, the results of arithmetic operations couldn't add a new enumeration member (legally), so some other type would still have to exist... and it would also have to have those operation tracking methods. * Implement a subclass of EnumMeta that installs all the operator tracking methods. Same legal issue. * Do one of the above, but allow new calculated members to exist, although perhaps not in the initial, iterable set. Might have to call it something besides Enum to obey the proclamations :) FlagBits, maybe. But it could use much the same technology as Enum. > (I'd try, but I gotta get some sleep! ;) . Thanks for the response... it cleared up the demo1 mystery, anyway. > Oh, newest code posted. That'll give me some practice pulling from your repository into mine :) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sat May 11 16:34:14 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 May 2013 00:34:14 +1000 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: References: <518CDE16.6010104@python.org> Message-ID: On Sat, May 11, 2013 at 2:24 PM, Ben Hoyt wrote: > In all the *practical* examples I've seen (and written myself), I > iterate over a directory and I just need to know whether it's a file > or directory (or maybe a link). Occassionally you need the size as > well, but that would just mean a similar check "if st.st_size is None: > st = os.stat(...)", which on Linux/OS X would call stat(), but it'd > still be free and fast on Windows. 
Here's the full set of fields on a current stat object: st_atime st_atime_ns st_blksize st_blocks st_ctime st_ctime_ns st_dev st_gid st_ino st_mode st_mtime st_mtime_ns st_nlink st_rdev st_size st_uid Do we really want to publish an object with all of those as attributes potentially set to None, when the abstraction we're trying to present is intended primarily for the benefit of os.walk? And if we're creating a custom object instead, why return a 2-tuple rather than making the entry's name an attribute of the custom object? To me, that suggests a more reasonable API for os.scandir() might be for it to be an iterator over "dir_entry" objects: name (as a string) is_file() is_dir() is_link() stat() cached_stat (None or a stat object) On all platforms, the query methods would not require a separate stat() call. On Windows, cached_stat would be populated with a full stat object when scandir builds the entry. On non-Windows platforms, cached_stat would initially be None, and you would have to call stat() to populate it. If we find other details that we can reliably provide cross-platform from the dir information, then we can add more query methods or attributes to the dir_entry object. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From christian at python.org Sat May 11 17:42:39 2013 From: christian at python.org (Christian Heimes) Date: Sat, 11 May 2013 17:42:39 +0200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: References: <518CDE16.6010104@python.org> Message-ID: Am 11.05.2013 16:34, schrieb Nick Coghlan: > Here's the full set of fields on a current stat object: > > st_atime > st_atime_ns > st_blksize > st_blocks > st_ctime > st_ctime_ns > st_dev > st_gid > st_ino > st_mode > st_mtime > st_mtime_ns > st_nlink > st_rdev > st_size > st_uid And there are more fields on some platforms, e.g. st_birthtime. 
> To me, that suggests a more reasonable API for os.scandir() might be > for it to be an iterator over "dir_entry" objects: > > name (as a string) > is_file() > is_dir() > is_link() > stat() > cached_stat (None or a stat object) I suggest that we call it .lstat() and .cached_lstat to make clear that we are talking about no-follow stat() here. On platforms that support fstatat() it should use fstatat(dir_fd, name, &buf, AT_SYMLINK_NOFOLLOW) where dir_fd is the fd from dirfd() of opendir()'s return value. > On all platforms, the query methods would not require a separate > stat() call. On Windows, cached_stat would be populated with a full > stat object when scandir builds the entry. On non-Windows platforms, > cached_stat would initially be None, and you would have to call stat() > to populate it. +1 > If we find other details that we can reliably provide cross-platform > from the dir information, then we can add more query methods orst > attributes to the dir_entry object. I'd like to see d_type and d_ino, too. d_type should default to DT_UNKNOWN, d_ino to None. Christian From ncoghlan at gmail.com Sat May 11 18:30:29 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 May 2013 02:30:29 +1000 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: References: <518CDE16.6010104@python.org> Message-ID: On Sun, May 12, 2013 at 1:42 AM, Christian Heimes wrote: > I suggest that we call it .lstat() and .cached_lstat to make clear that > we are talking about no-follow stat() here. Fair point. > On platforms that support > fstatat() it should use fstatat(dir_fd, name, &buf, AT_SYMLINK_NOFOLLOW) > where dir_fd is the fd from dirfd() of opendir()'s return value. It may actually make sense to expose the dir_fd as another attribute of the dir_entry object. 
>> If we find other details that we can reliably provide cross-platform >> from the dir information, then we can add more query methods orst >> attributes to the dir_entry object. > > I'd like to see d_type and d_ino, too. d_type should default to > DT_UNKNOWN, d_ino to None. I'd prefer to see a more minimal set to start with - just the features needed to implement os.walk and os.fwalk more efficiently, and provide ready access to the full stat result. Once that core functionality is in place, *then* start debating what other use cases to optimise based on which platforms would support those optimisations and which would require dropping back to the full stat implementation anyway. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat May 11 18:36:28 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 12 May 2013 02:36:28 +1000 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: References: <518CDE16.6010104@python.org> Message-ID: On Sun, May 12, 2013 at 2:30 AM, Nick Coghlan wrote: > Once that core functionality is in place, *then* start debating what > other use cases to optimise based on which platforms would support > those optimisations and which would require dropping back to the full > stat implementation anyway. Alternatively, we could simply have a full "dirent" attribute that is None on Windows. That would actually make sense at an implementation level anyway - is_file() etc would check self.cached_lstat first, and if that was None they would check self.dirent, and if that was also None they would raise an error. Construction of a dir_entry would require either a stat object or a dirent object, but complain if it received both. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From v+python at g.nevcal.com Sun May 12 04:56:54 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Sat, 11 May 2013 19:56:54 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <518DEF35.2010604@g.nevcal.com> References: <5185FCB1.6030702@g.nevcal.com> <518DD40F.1070005@g.nevcal.com> <518DDEED.9050407@stoneleaf.us> <518DEF35.2010604@g.nevcal.com> Message-ID: <518F04F6.4050107@g.nevcal.com> On 5/11/2013 12:11 AM, Glenn Linderman wrote: > * I'm playing with adding another keyword parameter to Enum, but it is > presently giving me an error about unknown keyword parameter passed to > __prepare__ even though I added **kwds to the list of its parameters. > I'll learn something by doing this. OK, I figured out the error was because __prepare__ didn't have a **kwds parameters, but then got a similar one regarding __init__ ? but EnumMeta doesn't even have an __init__ so I guess it was using type.__init__, but then I wondered if type.__init__ even does anything, because when I added __init__ to (my modified ref435a) EnumMeta, it didn't seem to matter if my __init__ did nothing, called super().__init__, or called type.__init__. Anyway, defining one seems to get past the errors, and then the keyword can work. So compare your ref435.py and my ref435a.py at to see the code required to support a keyword parameter that would expect a base type containing a name parameter to its __new__ and __init__, by providing the module-qualified name as that parameter. Would this be a controversial enhancement to EnumMeta? Together with my flags.py at the same link, it would enable definitions of enumeration values which have names, and which names could be reported in exceptions.... see demo3.py at the same link. I suppose it might also be good to validate that no unexpected keyword parameters are passed in, rather than just ignoring them, as my code presently does? 
Not sure what the general thinking is regarding such parameters in such usages; it could be that some mixin class might want to define some keyword parameters too, so ignoring them seems a more flexible option.

From v+python at g.nevcal.com Sun May 12 04:59:29 2013
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Sat, 11 May 2013 19:59:29 -0700
Subject: [Python-Dev] PEP 435 - ref impl disc 2
In-Reply-To: <518DEF35.2010604@g.nevcal.com>
References: <5185FCB1.6030702@g.nevcal.com> <518DD40F.1070005@g.nevcal.com> <518DDEED.9050407@stoneleaf.us> <518DEF35.2010604@g.nevcal.com>
Message-ID: <518F0591.6040604@g.nevcal.com>

On 5/11/2013 12:11 AM, Glenn Linderman wrote:
>> Oh, newest code posted.
>
> That'll give me some practice pulling from your repository into mine :)

Well, I had to bring your changes to my local repository, and then push them up to my bitbucket repo... not sure if there is a way to merge from your bitbucket repo to mine directly... I couldn't find it, if there is.

From benjamin at python.org Sun May 12 06:03:18 2013
From: benjamin at python.org (Benjamin Peterson)
Date: Sat, 11 May 2013 23:03:18 -0500
Subject: [Python-Dev] 2.7.5 baking
Message-ID:

The long anticipated "emergency" 2.7.5 release has now been tagged. It will be publicly announced as binaries arrive.

Originally, I was just going to cherrypick regression fixes onto the 2.7.4 release and release those as 2.7.5. I started to do this but ran into some conflicts. Since we don't have buildbot testing of release branches, I decided it would be best to just cut from the maintenance branch.
-- Regards, Benjamin From g.brandl at gmx.net Sun May 12 13:24:45 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 12 May 2013 13:24:45 +0200 Subject: [Python-Dev] 2.7.5 baking In-Reply-To: References: Message-ID: Am 12.05.2013 06:03, schrieb Benjamin Peterson: > The long anticipated "emergency" 2.7.5 release has now been tagged. It > will be publicly announced as binaries arrive. > > Originally, I was just going to cherrypick regression fixes onto the > 2.7.4 release and release those as 2.7.5. I started to this but ran > into some conflicts. Since we don't have buildbot testing of release > branches, I decided it would be best to just cut from the maintenance > branch. 3.2.5 and 3.3.2 are coming along as well. Georg From solipsis at pitrou.net Sun May 12 14:01:51 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 12 May 2013 14:01:51 +0200 Subject: [Python-Dev] Tightening up the specification for locals() References: <5183245D.2000009@pearwood.info> Message-ID: <20130512140151.1116d35e@fsol> On Fri, 03 May 2013 12:43:41 +1000 Steven D'Aprano wrote: > On 03/05/13 11:29, Nick Coghlan wrote: > > An exchange in one of the enum threads prompted me to write down > > something I've occasionally thought about regarding locals(): it is > > currently severely underspecified, and I'd like to make the current > > CPython behaviour part of the language/library specification. (We > > recently found a bug in the interaction between the __prepare__ method > > and lexical closures that was indirectly related to this > > underspecification) > > Fixing the underspecification is good. Enshrining a limitation as the > one correct way, not so good. I have to say, I agree with Steven here. Mutating locals() is currently an implementation detail, and it should IMHO stay that way. Only reading a non-mutated locals() should be well-defined. Regards Antoine. 
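Editorial sketch: the distinction under discussion (locals() at function scope versus module or class scope) can be observed on CPython with a few lines. The function and variable names below are made up for illustration, and the function-scope behaviour shown is CPython's, not something the language guaranteed at the time of this thread.

```python
def mutate_function_locals():
    x = 1
    locals()["x"] = 2  # on CPython this writes to a dict, not to the local variable
    return x


# Single-namespace exec behaves like module execution.
module_ns = {}
exec("same = locals() is globals()", module_ns)

# Two-namespace exec behaves like a class body: locals() is the second mapping.
globals_ns, locals_ns = {}, {}
exec("same = locals() is globals()\nlocals()['added'] = 1", globals_ns, locals_ns)

print(mutate_function_locals())
print(module_ns["same"])
print(locals_ns["same"], locals_ns["added"])
```

On CPython the function returns 1, the single-namespace run reports that locals() is globals(), and the two-namespace run shows that writes through locals() land in the mapping that was passed in.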
From ncoghlan at gmail.com Sun May 12 15:22:39 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 12 May 2013 23:22:39 +1000
Subject: [Python-Dev] Tightening up the specification for locals()
In-Reply-To: <20130512140151.1116d35e@fsol>
References: <5183245D.2000009@pearwood.info> <20130512140151.1116d35e@fsol>
Message-ID:

On Sun, May 12, 2013 at 10:01 PM, Antoine Pitrou wrote:
> On Fri, 03 May 2013 12:43:41 +1000
> Steven D'Aprano wrote:
>> On 03/05/13 11:29, Nick Coghlan wrote:
>> > An exchange in one of the enum threads prompted me to write down
>> > something I've occasionally thought about regarding locals(): it is
>> > currently severely underspecified, and I'd like to make the current
>> > CPython behaviour part of the language/library specification. (We
>> > recently found a bug in the interaction between the __prepare__ method
>> > and lexical closures that was indirectly related to this
>> > underspecification)
>>
>> Fixing the underspecification is good. Enshrining a limitation as the
>> one correct way, not so good.
>
> I have to say, I agree with Steven here. Mutating locals() is currently
> an implementation detail, and it should IMHO stay that way. Only
> reading a non-mutated locals() should be well-defined.

At global and class scope (and, equivalently, in exec), I strongly disagree. There, locals() is (or should be) well defined, either as identical to globals(), or as the value returned from __prepare__() (which will be passed to the metaclass as the namespace). The exec case corresponds to those two instances, depending on whether the single namespace or dual namespace version is performed.

What Steven was objecting to was my suggestion that CPython's current behaviour where mutating locals() may not change the local namespace be elevated to an actual requirement where mutating locals *must not* change the local namespace.
He felt that was overspecifying a CPython-specific limitation, and I think he's right - at function scope, the best we can say is that modifying the result of locals() may or may not make those changes visible to other code in that function (or closures that reference the local variables in that function). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sun May 12 15:28:47 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 12 May 2013 15:28:47 +0200 Subject: [Python-Dev] Tightening up the specification for locals() In-Reply-To: References: <5183245D.2000009@pearwood.info> <20130512140151.1116d35e@fsol> Message-ID: <20130512152847.2805adbe@fsol> On Sun, 12 May 2013 23:22:39 +1000 Nick Coghlan wrote: > The exec case > corresponds to those two instances, depending on whether the single > namespace or dual namespace version is performed. I don't get the point. exec() *passes* a locals dictionary, but the compiled code itself isn't expected to use locals() as a way to access (let alone mutate) that dictionary. Regards Antoine. From solipsis at pitrou.net Sun May 12 15:55:39 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 12 May 2013 15:55:39 +0200 Subject: [Python-Dev] 2.7.5 baking References: Message-ID: <20130512155539.3b99174e@fsol> On Sun, 12 May 2013 13:24:45 +0200 Georg Brandl wrote: > Am 12.05.2013 06:03, schrieb Benjamin Peterson: > > The long anticipated "emergency" 2.7.5 release has now been tagged. It > > will be publicly announced as binaries arrive. > > > > Originally, I was just going to cherrypick regression fixes onto the > > 2.7.4 release and release those as 2.7.5. I started to this but ran > > into some conflicts. Since we don't have buildbot testing of release > > branches, I decided it would be best to just cut from the maintenance > > branch. > > 3.2.5 and 3.3.2 are coming along as well. 3.3.2 can't be released before http://bugs.python.org/issue17962 is fixed. 
Regards Antoine. From ncoghlan at gmail.com Sun May 12 16:27:22 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 13 May 2013 00:27:22 +1000 Subject: [Python-Dev] Tightening up the specification for locals() In-Reply-To: <20130512152847.2805adbe@fsol> References: <5183245D.2000009@pearwood.info> <20130512140151.1116d35e@fsol> <20130512152847.2805adbe@fsol> Message-ID: On Sun, May 12, 2013 at 11:28 PM, Antoine Pitrou wrote: > On Sun, 12 May 2013 23:22:39 +1000 > Nick Coghlan wrote: >> The exec case >> corresponds to those two instances, depending on whether the single >> namespace or dual namespace version is performed. > > I don't get the point. exec() *passes* a locals dictionary, but the > compiled code itself isn't expected to use locals() as a way to access > (let alone mutate) that dictionary. Right, the main reason for the proposal is to lock down "locals() is globals()" for module namespaces and "locals() is the namespace that was returned from __prepare__ and will be passed to the metaclass constructor" for class bodies. The change to exec merely follows because the single argument form corresponds to module execution and the two argument form to class body execution. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From benhoyt at gmail.com Mon May 13 00:04:11 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Mon, 13 May 2013 10:04:11 +1200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: References: <518CDE16.6010104@python.org> Message-ID: > And if we're creating a custom object instead, why return a 2-tuple > rather than making the entry's name an attribute of the custom object? > > To me, that suggests a more reasonable API for os.scandir() might be > for it to be an iterator over "dir_entry" objects: > > name (as a string) > is_file() > is_dir() > is_link() > stat() > cached_stat (None or a stat object) Nice! 
I really like your basic idea of returning a custom object instead of a 2-tuple. And I agree with Christian that .stat() would be clearer called .lstat(). I also like your later idea of simply exposing .dirent (would be None on Windows).

One tweak I'd suggest is that is_file() etc be called isfile() etc without the underscore, to match the naming of the os.path.is* functions.

> That would actually make sense at an implementation
> level anyway - is_file() etc would check self.cached_lstat first, and
> if that was None they would check self.dirent, and if that was also
> None they would raise an error.

Hmm, I'm not sure about this at all. Are you suggesting that the DirEntry object's is* functions would raise an error if both cached_lstat and dirent were None? Wouldn't it make for a much simpler API to just call os.lstat() and populate cached_lstat instead? As far as I'm concerned, that'd be the point of making DirEntry.lstat() a function.

In fact, I don't think .cached_lstat should be exposed to the user. They just call entry.lstat(), and it returns a cached stat or calls os.lstat() to get the real stat if required (and populates the internal cached stat value). And the entry.is* functions would call entry.lstat() if dirent was None or d_type was DT_UNKNOWN. This would change relatively nasty code like this:

    files = []
    dirs = []
    for entry in os.scandir(path):
        try:
            isdir = entry.isdir()
        except NotPresentError:
            st = os.lstat(os.path.join(path, entry.name))
            isdir = stat.S_ISDIR(st.st_mode)
        if isdir:
            dirs.append(entry.name)
        else:
            files.append(entry.name)

Into nice clean code like this:

    files = []
    dirs = []
    for entry in os.scandir(path):
        if entry.isdir():
            dirs.append(entry.name)
        else:
            files.append(entry.name)

This change would make scandir() usable by ordinary mortals, rather than just hardcore library implementors.
In other words, I'm proposing that the DirEntry objects yielded by
scandir() would have .name and .dirent attributes, and .isdir(),
.isfile(), .islink(), .lstat() methods, and look basically like this
(though presumably implemented in C):

    class DirEntry:
        def __init__(self, name, dirent, lstat, path='.'):
            # User shouldn't need to call this, but called internally by scandir()
            self.name = name
            self.dirent = dirent
            self._lstat = lstat  # non-public attributes
            self._path = path

        def lstat(self):
            if self._lstat is None:
                self._lstat = os.lstat(os.path.join(self._path, self.name))
            return self._lstat

        def isdir(self):
            if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
                return self.dirent.d_type == DT_DIR
            else:
                return stat.S_ISDIR(self.lstat().st_mode)

        def isfile(self):
            if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
                return self.dirent.d_type == DT_REG
            else:
                return stat.S_ISREG(self.lstat().st_mode)

        def islink(self):
            if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
                return self.dirent.d_type == DT_LNK
            else:
                return stat.S_ISLNK(self.lstat().st_mode)

Oh, and the .dirent would either be None (Windows) or would have
.d_type and .d_ino attributes (Linux, OS X).

This would make the scandir() API nice and simple to use for callers,
but still expose all the information the OS provides (both the
meaningful fields in dirent, and a full stat on Windows, nicely cached
in the DirEntry object).

Thoughts?

-Ben

From christian at python.org Mon May 13 01:28:33 2013
From: christian at python.org (Christian Heimes)
Date: Mon, 13 May 2013 01:28:33 +0200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To:
References: <518CDE16.6010104@python.org>
Message-ID: <519025A1.2010805@python.org>

On 13.05.2013 00:04, Ben Hoyt wrote:
> In fact, I don't think .cached_lstat should be exposed to the user.
> They just call entry.lstat(), and it returns a cached stat or calls
> os.lstat() to get the real stat if required (and populates the
> internal cached stat value). And the entry.is* functions would call
> entry.lstat() if dirent was None or d_type was DT_UNKNOWN. This would
> change relatively nasty code like this:

I would prefer to go the other route and not expose lstat(). It's
cleaner and less confusing to have a property cached_lstat on the object
because it actually says what it contains. The property's internal code
can do an lstat() call if necessary.

Your code example doesn't handle the case of a failing lstat() call. It
can happen when the file is removed or the permissions of a parent
directory change.

> This change would make scandir() usable by ordinary mortals, rather
> than just hardcore library implementors.

Why not have both? The os module exposes and leaks the platform details
on more than one occasion. A low level function can expose name + dirent
struct on POSIX and name + stat_result on Windows. Then you can build a
high level API like os.scandir() in pure Python code.

> class DirEntry:
>     def __init__(self, name, dirent, lstat, path='.'):
>         # User shouldn't need to call this, but called internally by scandir()
>         self.name = name
>         self.dirent = dirent
>         self._lstat = lstat  # non-public attributes
>         self._path = path

You should include the fd of the DIR pointer here for the new *at()
function family.

>     def lstat(self):
>         if self._lstat is None:
>             self._lstat = os.lstat(os.path.join(self._path, self.name))
>         return self._lstat

The function should use the fstatat(2) function (os.lstat with dir_fd)
when it is available on the current platform. It's better and more
secure than lstat() with a joined path.
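[Editor's sketch, not part of the original message.] The dir_fd suggestion above can be illustrated with the os functions that take a `dir_fd` keyword (available since Python 3.3). This is a minimal sketch under stated assumptions: the helper names lstat_in_dir and demo are hypothetical, and opening the directory on every call is done only to keep the example short; a real DirEntry would presumably reuse one descriptor for the whole scan.

```python
import os
import stat
import tempfile


def lstat_in_dir(dir_path, name):
    """lstat() `name` inside `dir_path`, via a directory fd when supported."""
    if os.stat in os.supports_dir_fd:
        # Resolve `name` relative to an open directory descriptor
        # (fstatat(2) underneath) instead of joining path strings.
        dir_fd = os.open(dir_path, os.O_RDONLY)
        try:
            return os.stat(name, dir_fd=dir_fd, follow_symlinks=False)
        finally:
            os.close(dir_fd)
    # Fallback for platforms without dir_fd support (for example Windows).
    return os.lstat(os.path.join(dir_path, name))


def demo():
    """Create a 5-byte file and report (is regular file, size)."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "a.txt"), "w") as f:
            f.write("hello")
        st = lstat_in_dir(root, "a.txt")
        return stat.S_ISREG(st.st_mode), st.st_size


if __name__ == "__main__":
    print(demo())
```

The descriptor-relative lookup keeps working if the directory is renamed while it is being scanned, which is one reason it is considered safer than a joined path.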
>     def isdir(self):
>         if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
>             return self.dirent.d_type == DT_DIR
>         else:
>             return stat.S_ISDIR(self.lstat().st_mode)
>
>     def isfile(self):
>         if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
>             return self.dirent.d_type == DT_REG
>         else:
>             return stat.S_ISREG(self.lstat().st_mode)
>
>     def islink(self):
>         if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
>             return self.dirent.d_type == DT_LNK
>         else:
>             return stat.S_ISLNK(self.lstat().st_mode)

A bit faster:

    d_type = getattr(self.dirent, "d_type", DT_UNKNOWN)
    if d_type != DT_UNKNOWN:
        return d_type == DT_LNK

The code doesn't handle a failing lstat() call.

Christian

From raymond.hettinger at gmail.com Mon May 13 01:49:44 2013
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sun, 12 May 2013 16:49:44 -0700
Subject: [Python-Dev] Best practices for Enum
Message-ID:

After the long design effort for the enum module,
I'm sure there will be a forthcoming effort to apply
them pervasively throughout the standard library.

I would like to ask for a little restraint and for there to
be individual cost/benefit evaluations for each case.

On the plus-side, the new integer-enums have a better
repr than plain integers.

For internal constants such as those in idlelib and regex,
the user won't see any benefit at all. But there will be
a cost in terms of code churn, risk of introducing errors
in stable code, modestly slowing-down the code, making
it more difficult to apply bug fixes across multiple versions
of Python, and increased code verbosity (i.e. changing
"if direction == LEFT: ..." to "if direction is Direction.LEFT: ...")

For external constants, some thought needs to be given to:
* is the current API working just fine (i.e. decimal's ROUND_DOWN)
* will enums break doctests or any existing user code
* will it complicate users converting from Python 2
* do users now have to learn an additional concept
* does it complicate the module in any way

I'm hoping that enums get used only in cases where they
clearly improve the public API (i.e. cases such as sockets
that have a large number of integer constants) rather
than having a frenzy of every constant, everywhere getting
turned into an enum.

I would like to see enums used as a tool for managing complexity,
rather than becoming a cause of added complexity by being used
for every problem, the tall and small, even where it is not needed at all.

my-two-cents-ly yours,

Raymond

From victor.stinner at gmail.com Mon May 13 02:11:28 2013
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 13 May 2013 02:11:28 +0200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To:
References: <518CDE16.6010104@python.org>
Message-ID:

2013/5/13 Ben Hoyt :
> class DirEntry:
>     def __init__(self, name, dirent, lstat, path='.'):
>         # User shouldn't need to call this, but called internally by scandir()
>         self.name = name
>         self.dirent = dirent
>         self._lstat = lstat  # non-public attributes
>         self._path = path
>
>     def lstat(self):
>         if self._lstat is None:
>             self._lstat = os.lstat(os.path.join(self._path, self.name))
>         return self._lstat
> ...

You need to provide a way to invalidate the stat cache,
DirEntry.clearcache() for example.

Victor

From benhoyt at gmail.com Mon May 13 02:21:36 2013
From: benhoyt at gmail.com (Ben Hoyt)
Date: Mon, 13 May 2013 12:21:36 +1200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To: <519025A1.2010805@python.org>
References: <518CDE16.6010104@python.org> <519025A1.2010805@python.org>
Message-ID:

> I would prefer to go the other route and not expose lstat().
It's > cleaner and less confusing to have a property cached_lstat on the object > because it actually says what it contains. The property's internal code > can do a lstat() call if necessary. Are you suggesting just accessing .cached_lstat could call os.lstat()? That seems very bad to me. It's a property access -- it looks cheap, therefore people will expect it to be. From PEP 8 "Avoid using properties for computationally expensive operations; the attribute notation makes the caller believe that access is (relatively) cheap." Even worse is error handling -- I'd expect the expression "entry.cached_lstat" to only ever raise AttributeError, not OSError in the case it calls stat under the covers. Calling code would have to have a try/except around what looked like a simple attribute access. For these two reasons I think lstat() should definitely be a function. > Your code example doesn't handle the case of a failing lstat() call. It > can happen when the file is removed or permission of a parent directory > changes. True. My isdir/isfile/islink implementations should catch any OSError from the lstat() and return False (like os.path.isdir etc do). But then calling code still doesn't need try/excepts around the isdir() calls. This is how os.walk() is implemented -- there's no extra error handling around the isdir() call. > Why not have both? The os module exposes and leaks the platform details > on more than on occasion. A low level function can expose name + dirent > struct on POSIX and name + stat_result on Windows. Then you can build a > high level API like os.scandir() in pure Python code. I wouldn't be opposed to that, but it's a scandir() implementation detail. If there's a scandir_helper_win() and scandir_helper_posix() written in C, and the rest is written in Python, that'd be fine by me. As long as the Python part didn't slow it down much. > The function should use fstatat(2) function (os.lstat with dir_fd) when > it is available on the current platform. 
It's better and more secure > than lstat() with a joined path. Sure. I'm primarily a Windows dev, so not too familiar with all the fancy stat* functions. But what you're saying makes sense. -Ben From benhoyt at gmail.com Mon May 13 02:24:16 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Mon, 13 May 2013 12:24:16 +1200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: References: <518CDE16.6010104@python.org> Message-ID: On Mon, May 13, 2013 at 12:11 PM, Victor Stinner wrote: > 2013/5/13 Ben Hoyt : >> class DirEntry: >> ... >> def lstat(self): >> if self._lstat is None: >> self._lstat = os.lstat(os.path.join(self._path, self.name)) >> return self._lstat >> ... > > You need to provide a way to invalidate the stat cache, > DirEntry.clearcache() for example. Hmm, I'm not sure why, as the stat result is cached on the DirEntry instance (not the class). If you don't want the cached version, just call os.stat() yourself, or throw away the DirEntry instance. DirEntry instances would just be used for dealing with scandir() results. -Ben From ethan at stoneleaf.us Mon May 13 04:25:09 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 12 May 2013 19:25:09 -0700 Subject: [Python-Dev] Best practices for Enum In-Reply-To: References: Message-ID: <51904F05.2070300@stoneleaf.us> On 05/12/2013 04:49 PM, Raymond Hettinger wrote: > After the long design effort for the enum module, > I'm sure there will be a forthcoming effort to apply > them pervasively throughout the standard library. I'd like to apply them where it makes sense. It would be a good way for me to learn all that's in the stdlib while doing something modestly useful. > For internal constants such as those in idlelib and regex, > the user won't see any benefit at all. Devs are users, too! If it makes our lives easier, then, ultimately, it will make our users lives easier as well. 
> But there will be
> a cost in terms of code churn, risk of introducing errors
> in stable code, modestly slowing-down the code, making
> it more difficult to apply bug fixes across multiple versions
> of Python, and increased code verbosity (i.e. changing
> "if direction == LEFT: ..." to "if direction is Direction.LEFT: ...")

There is no need for increased verbosity, as Enums support __eq__ as well:

    class Direction(Enum):
        LEFT = 1
        RIGHT = 2
        UP = 3
        DOWN = 4

    globals().update(Direction.__members__)

    direction = ...
    if direction == LEFT:
        ...

> For external constants, some thought needs to be given to:
> * is the current API working just fine (i.e. decimal's ROUND_DOWN)

just fine? or working great?

> * will enums break doctests or any existing user code

doctests rely on repr's, don't they? Then yes. User code? I would
think only if the user was relying on a repr or str of the value. At
any rate, that's why this isn't going in until 3.4.

> * will it complicate users converting from Python 2

I would hope it would simplify; I'll backport a 2.x version, though, so
anyone interested can play with it.

> * do users now have to learn an additional concept

I don't think enumerations would be a new concept to a computer
programmer.

> * does it complicate the module in any way

A little bit of setup at the top, but then it should be easier
everywhere else.

> I'm hoping that enums get used only in cases where they
> clearly improve the public API (i.e. cases such as sockets
> that have a large number of integer constants) rather
> than having a frenzy of every constant, everywhere getting
> turned into an enum.
>
> I would like to see enums used as a tool for managing complexity,
> rather than becoming a cause of added complexity by being used
> for every problem, the tall and small, even where it is not needed at all.

I will certainly ask for advice on which modules to spend my time on.
I know enums are not a cure-all, but they are great for debugging and
interactive work.
I don't know about you, but I sure spend a lot of time in those two places. -- ~Ethan~ From stephen at xemacs.org Mon May 13 05:15:55 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 13 May 2013 12:15:55 +0900 Subject: [Python-Dev] Best practices for Enum In-Reply-To: <51904F05.2070300@stoneleaf.us> References: <51904F05.2070300@stoneleaf.us> Message-ID: <87bo8fefis.fsf@uwakimon.sk.tsukuba.ac.jp> Ethan Furman writes: > I will certainly ask for advice on which modules to spend my time > on. I know enums are not a cure-all, but they are great for > debugging and interactive work. Especially in new code where they are used throughout. Not so in the existing stdlib, I expect. The concrete limitation on that theory that I envision with retrofitting the stdlib is that cooperative modules (those that call into and are called from the module being converted to use enums) are going to be expecting values, not enums. So you need to convert return values and arguments, and not only do you *not* get the benefit of enum reprs in the cooperating modules, but you introduce additional complexity in the converted module. Nor can you say "OK, it's more than I expected but I'll do the whole stdlib," because you don't know who is calling into or supplying callbacks to the stdlib modules. I expect you would recognize these cases quickly, but I imagine Raymond is feeling a more generic unease, and I can't say I blame him. In many cases you could convert code to use IntEnum instead of Enum, preserving the old semantics, and probably not needing to convert return values, but again I expect the benefits of Enum-ness would attenuate quickly as cooperating code converts them to int internally. From eliben at gmail.com Mon May 13 05:26:39 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sun, 12 May 2013 20:26:39 -0700 Subject: [Python-Dev] Best practices for Enum In-Reply-To: References: Message-ID: Thanks for the insights, Raymond. 
I don't think anyone is planning on rushing anything. We still have to get the enum module itself committed and a serious review process has just started for that, so it will take time. There's no general "let's replace all constants with enums" TODO item that I know of. It's my hope that such changes will happen very gradually and only when deemed important and useful by core developers. So it's not different from any other changes made in the Python repository, really. Issues will be opened, discussed, code will be reviewed by whomever is willing to participate. IIRC Guido wanted to have a printable representation for the socket module constants like socket.AF_* and socket.SOCK_* because that would be useful in developing Tulip. Implementing those with IntEnum may be a relatively non-controversial first foray into actually putting enums to use. But again, at least as far as I'm concerned there's no concrete todo list at this point. Eli On Sun, May 12, 2013 at 4:49 PM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > After the long design effort for the enum module, > I'm sure there will be a forthcoming effort to apply > them pervasively throughout the standard library. > > I would like to ask for a little restraint and for there to > be individual cost/benefit evaluations for each case. > > On the plus-side, the new integer-enums have a better > repr than plain integers. > > For internal constants such as those in idlelib and regex, > the user won't see any benefit at all. But there will be > a cost in terms of code churn, risk of introducing errors > in stable code, modestly slowing-down the code, making > it more difficult to apply bug fixes across multiple versions > of Python, and increased code verbosity (i.e. changing > "if direction=LEFT: ..." to "if direction is Direction.LEFT: ...") > > For external constants, some thought needs to be given to: > * is the current API working just fine (i.e. 
decimal's ROUND_DOWN)
> * will enums break doctests or any existing user code
> * will it complicate users converting from Python 2
> * do users now have to learn an additional concept
> * does it complicate the module in any way
>
> I'm hoping that enums get used only in cases where they
> clearly improve the public API (i.e. cases such as sockets
> that have a large number of integer constants) rather
> than having a frenzy of every constant, everywhere getting
> turned into an enum.
>
> I would like to see enums used as a tool for managing complexity,
> rather than becoming a cause of added complexity by being used
> for every problem, the tall and small, even where it is not needed at all.
>
> my-two-cents-ly yours,
>
>
> Raymond

From ethan at stoneleaf.us Mon May 13 06:50:55 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sun, 12 May 2013 21:50:55 -0700
Subject: [Python-Dev] Best practices for Enum
In-Reply-To: <87bo8fefis.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <51904F05.2070300@stoneleaf.us> <87bo8fefis.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <5190712F.6090300@stoneleaf.us>

On 05/12/2013 08:15 PM, Stephen J. Turnbull wrote:
> Ethan Furman writes:
>
> > I will certainly ask for advice on which modules to spend my time
> > on. I know enums are not a cure-all, but they are great for
> > debugging and interactive work.
>
> Especially in new code where they are used throughout. Not so in the
> existing stdlib, I expect.

Perhaps not to somebody who is already well versed in it. It would be
very helpful to me.
;) > The concrete limitation on that theory that I envision with > retrofitting the stdlib is that cooperative modules (those that call > into and are called from the module being converted to use enums) are > going to be expecting values, not enums. So you need to convert > return values and arguments, and not only do you *not* get the benefit > of enum reprs in the cooperating modules, but you introduce additional > complexity in the converted module. Nor can you say "OK, it's more > than I expected but I'll do the whole stdlib," because you don't know > who is calling into or supplying callbacks to the stdlib modules. Well, somebody else might, but I know how much (little?) time I have. It'll be great to have new modules use Enums; retrofitted modules should use Psuedonums (okay, I made that word up -- it's supposed to be an Enum but with some other type mixed in so it's no longer a pure Enum, more like a psuedo enum). As I was saying, if tkinter was up for conversion it would just be to StrEnum, and that would mostly consist of adding the enumeration at the top, exporting it to global, then browsing for locations where the string value was used and removing the quotes. Of course, having said that I'm sure somebody will chime in with "yes, but..." > In many cases you could convert code to use IntEnum instead of Enum, > preserving the old semantics, and probably not needing to convert > return values, but again I expect the benefits of Enum-ness would > attenuate quickly as cooperating code converts them to int internally. Hmmm... yeah, that would suck. Well, I'm sure I can help in others ways if this doesn't pan out. Maybe some other new module that Raymond objects to. 
;)

--
~Ethan~

From raymond.hettinger at gmail.com Mon May 13 09:06:52 2013
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 13 May 2013 00:06:52 -0700
Subject: [Python-Dev] Best practices for Enum
In-Reply-To:
References:
Message-ID: <0C619F7A-BD13-4171-9611-1FCB6E9CBE47@gmail.com>

On May 12, 2013, at 8:26 PM, Eli Bendersky wrote:

> Thanks for the insights, Raymond. I don't think anyone is planning on
> rushing anything. We still have to get the enum module itself committed
> and a serious review process has just started for that, so it will take
> time.
>
> There's no general "let's replace all constants with enums" TODO item
> that I know of. It's my hope that such changes will happen very
> gradually and only when deemed important and useful by core developers.

Ethan's email suggests that against my advice he is in fact going to
go through the standard library, applying enums quite broadly.
That is somewhat at odds with the notions of holistic refactoring
and gradual change. Nor does it reflect sufficient appreciation for
concerns about maintenance issues, code stability, the effect on
2-to-3 migration, doctests, performance, the wishes of the module
authors, or whether users will see any actual benefits (particularly
for internal constants).

I fully understand the enthusiasm to take the car out for a spin,
but the standard library isn't really a great place for
experimentation. And "trying to learn the standard library" isn't a
good rationale for making extensive changes to it.

So, please do help make sure there is some restraint and careful
consideration.

Raymond

From stephen at xemacs.org Mon May 13 09:50:56 2013
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 13 May 2013 16:50:56 +0900
Subject: [Python-Dev] Best practices for Enum
In-Reply-To: <0C619F7A-BD13-4171-9611-1FCB6E9CBE47@gmail.com>
References: <0C619F7A-BD13-4171-9611-1FCB6E9CBE47@gmail.com>
Message-ID: <877gj3e2sf.fsf@uwakimon.sk.tsukuba.ac.jp>

Raymond Hettinger writes:

 > whether users will see any actual benefits (particularly for
 > internal constants).

I don't understand the parenthetical remark. It seems to me that
changing internal constants should have the benefits that Ethan points
to for understanding, debugging, and interactive exploration, with the
least risk of destabilizing external APIs. I agree with you that the
*net* benefit is still likely to be negative due to effects of code
churn and the potential for new bugs, but (considering benefit
separately from cost) the advertised benefits would be achieved.

The point is that I think it's clear that using enums to provide name
spaces for the large number of constants whose names are well-known,
but not so much the values, in the os module and other OS library
wrappers is highest priority, and IMO a net plus. But I would say the
next place to look would be exactly these internal constants, where
they have similar characteristics (many of them, well-known names,
hard-to-remember values: the fds for stdin, stdout, and stderr are
non-candidates!)

As I said, *I* don't think it's worth doing internal constants, but I
couldn't defend that opinion well to somebody who thinks it is worth
doing.

From solipsis at pitrou.net Mon May 13 10:38:00 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 13 May 2013 10:38:00 +0200
Subject: [Python-Dev] Best practices for Enum
References: <0C619F7A-BD13-4171-9611-1FCB6E9CBE47@gmail.com>
Message-ID: <20130513103800.2a94f9bf@pitrou.net>

On Mon, 13 May 2013 00:06:52 -0700, Raymond Hettinger wrote:
>
> On May 12, 2013, at 8:26 PM, Eli Bendersky wrote:
>
> > Thanks for the insights, Raymond.
I don't think anyone is planning > > on rushing anything. We still have to get the enum module itself > > committed and a serious review process has just started for that, > > so it will take time. > > > > There's no general "let's replace all constants with enums" TODO > > item that I know of. It's my hope that such changes will happen > > very gradually and only when deemed important and useful by core > > developers. > > Ethan's email suggests that against my advice he is in-fact going to > go through the standard library, applying enums quite broadly. It probably won't go in without reviews, so there's no need to be too concerned IMHO. The fact that one of the enum classes is an int subclass should make the behaviour changes minimal, if any. Regards Antoine. From stefan at bytereef.org Mon May 13 12:14:04 2013 From: stefan at bytereef.org (Stefan Krah) Date: Mon, 13 May 2013 12:14:04 +0200 Subject: [Python-Dev] Best practices for Enum In-Reply-To: References: Message-ID: <20130513101404.GA2069@sleipnir.bytereef.org> Raymond Hettinger wrote: > I would like to ask for a little restraint and for there to > be individual cost/benefit evaluations for each case. +1 > For external constants, some thought needs to be given to: > * is the current API working just fine (i.e. decimal's ROUND_DOWN) For compatibility with the Python version, I recently changed the rounding constants of the C version to strings. This was at the request of a user who wanted to exchange (Decimal, ROUNDING) pickles between the versions. I think the strings are working fine and personally I have no plans to change the type again. The episode shows that pickling backwards compatibility is one thing to consider, but I'm probably stating the obvious here. 
:) Stefan Krah From fijall at gmail.com Mon May 13 13:40:27 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Mon, 13 May 2013 13:40:27 +0200 Subject: [Python-Dev] Tightening up the specification for locals() In-Reply-To: <20130512140151.1116d35e@fsol> References: <5183245D.2000009@pearwood.info> <20130512140151.1116d35e@fsol> Message-ID: On Sun, May 12, 2013 at 2:01 PM, Antoine Pitrou wrote: > On Fri, 03 May 2013 12:43:41 +1000 > Steven D'Aprano wrote: >> On 03/05/13 11:29, Nick Coghlan wrote: >> > An exchange in one of the enum threads prompted me to write down >> > something I've occasionally thought about regarding locals(): it is >> > currently severely underspecified, and I'd like to make the current >> > CPython behaviour part of the language/library specification. (We >> > recently found a bug in the interaction between the __prepare__ method >> > and lexical closures that was indirectly related to this >> > underspecification) >> >> Fixing the underspecification is good. Enshrining a limitation as the >> one correct way, not so good. > > I have to say, I agree with Steven here. Mutating locals() is currently > an implementation detail, and it should IMHO stay that way. Only > reading a non-mutated locals() should be well-defined. > > Regards > > Antoine. Like it or not, people rely on this behavior. I don't think CPython (or PyPy) can actually afford to change it. If so, documenting it sounds like a better idea than leaving it undocumented only known to the "inner shrine" Cheers, fijal From kristjan at ccpgames.com Mon May 13 13:49:29 2013 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=) Date: Mon, 13 May 2013 11:49:29 +0000 Subject: [Python-Dev] weak refs in descriptors (http://bugs.python.org/issue17950) Message-ID: Hello python-dev. 
I'm working on a patch to remove reference cycles from heap-allocated
classes: http://bugs.python.org/issue17950

Part of the patch involves making sure that descriptors in the class
dictionary don't contain strong references to the class itself.
This is item 2) in the defect description.

I have implemented this via weak references and hit no issues at all
when running the test suite.

But I'd like to ask the oracle if there is anything I may be
overlooking with this approach? Any hidden problems we might encounter?

K

From benhoyt at gmail.com Mon May 13 14:25:08 2013
From: benhoyt at gmail.com (Ben Hoyt)
Date: Tue, 14 May 2013 00:25:08 +1200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To:
References:
Message-ID:

Okay, I've renamed my "BetterWalk" module to "scandir" and updated it
as per our discussion:

https://github.com/benhoyt/scandir/#readme

It's not yet production-ready, and is basically still in the API and
performance testing stage. For instance, the underlying scandir_helper
functions don't even return iterators yet -- they're just glorified
versions of os.listdir() that return an additional d_ino/d_type
(Linux) or stat_result (Windows).

In any case, I really like the API (thanks mostly to Nick Coghlan),
and performance is great, even with DirEntry being written in Python.

PERFORMANCE: On Windows I'm seeing that scandir.walk() on a large test
tree (see benchmark.py) is 8-9 times faster than os.walk(), and on
Linux it's 3-4 times faster. Yes, it is that much faster, and yes,
those numbers are real. :-)

Please critique away. At this stage it'd be most helpful to critique
any API or performance-related issues rather than coding style or
minor bugs, as I'm expecting the code itself will change quite a bit
still.
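[Editor's sketch, not part of the original message.] To make the idea behind the speedup concrete, here is a minimal top-down walk built on the os.scandir() that later landed in the standard library (assuming Python 3.6+ for the `with` form). It is not the scandir.walk() from the repository above; the names walk_sketch and demo are hypothetical, and unlike os.walk() it treats a symlink to a directory as a non-directory.

```python
import os
import tempfile


def walk_sketch(top):
    """Yield (dirpath, dirnames, filenames) top-down, like a minimal os.walk()."""
    dirnames = []
    filenames = []
    with os.scandir(top) as entries:
        for entry in entries:
            # The directory listing itself usually carries the entry type,
            # so no separate stat() call per entry is made here.
            if entry.is_dir(follow_symlinks=False):
                dirnames.append(entry.name)
            else:
                filenames.append(entry.name)
    dirnames.sort()
    filenames.sort()
    yield top, dirnames, filenames
    for name in dirnames:
        yield from walk_sketch(os.path.join(top, name))


def demo():
    """Walk a two-level temporary tree and return paths relative to its root."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        for rel in ("a.txt", os.path.join("sub", "b.txt")):
            with open(os.path.join(root, rel), "w") as f:
                f.write("x")
        return [
            (os.path.relpath(dirpath, root), dirs, files)
            for dirpath, dirs, files in walk_sketch(root)
        ]


if __name__ == "__main__":
    for row in demo():
        print(row)
```

A listdir()-based walk has to stat every name just to separate directories from files; skipping that per-entry call is where the gain reported in this message would come from.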
Todos:

* Make _scandir.scandir_helper functions return real iterators instead of lists
* Move building of DirEntry objects into C module, so basically the
  entire scandir() is in C
* Add tests

-Ben

From stefan at drees.name Mon May 13 14:47:21 2013
From: stefan at drees.name (Stefan Drees)
Date: Mon, 13 May 2013 14:47:21 +0200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To:
References:
Message-ID: <5190E0D9.3090507@drees.name>

Hi Ben,

On 13.05.13 14:25, Ben Hoyt wrote:
> ...It's not yet production-ready, and is basically still in API and
> performance testing stage. ...
>
> In any case, I really like the API (thanks mostly to Nick Coghlan),
> and performance is great, even with DirEntry being written in Python.
>
> PERFORMANCE: On Windows I'm seeing that scandir.walk() on a large test
> tree (see benchmark.py) is 8-9 times faster than os.walk(), and on
> Linux it's 3-4 times faster. Yes, it is that much faster, and yes,
> those numbers are real. :-)
>
> Please critique away. At this stage it'd be most helpful to critique
> any API or performance-related issues ...

you asked for critique, but all I can report is that the speedup also
seems to be 2-3 times (as stated by benchmark.py) on Mac OS X 10.8.3
(on a MacBook Pro 13 inch, start of 2011, solid state disk) with
python 2.7.4 (the homebrew one):

$> git clone git://github.com/benhoyt/scandir.git
$> cd scandir && python setup.py install
$> python benchmark.py
USING FAST C version
Creating tree at benchtree: depth=4, num_dirs=5, num_files=50
Priming the system's cache...
Benchmarking walks on benchtree, repeat 1/3...
Benchmarking walks on benchtree, repeat 2/3...
Benchmarking walks on benchtree, repeat 3/3...
os.walk took 0.104s, scandir.walk took 0.031s -- 3.3x as fast
$> python benchmark.py -s
USING FAST C version
Priming the system's cache...
Benchmarking walks on benchtree, repeat 1/3...
Benchmarking walks on benchtree, repeat 2/3...
Benchmarking walks on benchtree, repeat 3/3...
os.walk size 226395000, scandir.walk size 226395000 -- equal
os.walk took 0.246s, scandir.walk took 0.125s -- 2.0x as fast

So for now, all well and thank you.

All the best,
Stefan.

From ncoghlan at gmail.com Mon May 13 15:13:03 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 13 May 2013 23:13:03 +1000
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To:
References:
Message-ID:

On Mon, May 13, 2013 at 10:25 PM, Ben Hoyt wrote:
> Okay, I've renamed my "BetterWalk" module to "scandir" and updated it
> as per our discussion:
>
> https://github.com/benhoyt/scandir/#readme

Nice!

> PERFORMANCE: On Windows I'm seeing that scandir.walk() on a large test
> tree (see benchmark.py) is 8-9 times faster than os.walk(), and on
> Linux it's 3-4 times faster. Yes, it is that much faster, and yes,
> those numbers are real. :-)

I'd like to see the numbers for NFS or CIFS - stat() can be brutally
slow over a network connection (that's why we added a caching
mechanism to importlib).

> Please critique away. At this stage it'd be most helpful to critique
> any API or performance-related issues rather than coding style or
> minor bugs, as I'm expecting the code itself will change quite a bit
> still.

I initially quite liked the idea of not offering any methods on
DirEntry, only properties, to make it obvious that they don't touch
the file system, but just report info from the scandir call. However,
I think that it ends up reading strangely, and would be confusing
relative to the os.path() APIs.

What you have now seems like a good, simple alternative.

Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From fijall at gmail.com Mon May 13 15:20:40 2013
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Mon, 13 May 2013 15:20:40 +0200
Subject: [Python-Dev] weak refs in descriptors (http://bugs.python.org/issue17950)
In-Reply-To:
References:
Message-ID:

On Mon, May 13, 2013 at 1:49 PM, Kristján Valur Jónsson wrote:
> Hello python-dev.
>
> I'm working on a patch to remove reference cycles from heap-allocated
> classes: http://bugs.python.org/issue17950
>
> Part of the patch involves making sure that descriptors in the class
> dictionary don't contain strong references to the class itself.
>
> This is item 2) in the defect description.
>
> I have implemented this via weak references and hit no issues at all when
> running the test suite.
>
> But I'd like to ask the oracle if there is anything I may be overlooking
> with this approach? Any hidden problems we might encounter?
>
>
>
> K

Hi Kristjan

The strong reference there is a feature. Descriptors keep the class
alive if somehow the class disappears and the descriptor itself does
not. Please don't change language semantics (yes, this is a change in
semantics), just because the test suite passes - I can assure you
there are people doing convoluted stuff that expect this to work.

Cheers,
fijal

From jsbueno at python.org.br Mon May 13 15:35:00 2013
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Mon, 13 May 2013 10:35:00 -0300
Subject: [Python-Dev] weak refs in descriptors (http://bugs.python.org/issue17950)
In-Reply-To:
References:
Message-ID:

On 13 May 2013 10:20, Maciej Fijalkowski wrote:
> On Mon, May 13, 2013 at 1:49 PM, Kristján Valur Jónsson
> wrote:
>> Hello python-dev.
>>
>> I'm working on a patch to remove reference cycles from heap-allocated
>> classes: http://bugs.python.org/issue17950
>>
>> Part of the patch involves making sure that descriptors in the class
>> dictionary don't contain strong references to the class itself.
>>
>> This is item 2) in the defect description.
>>
>> I have implemented this via weak references and hit no issues at all when
>> running the test suite.
>>
>> But I'd like to ask the oracle if there is anything I may be overlooking
>> with this approach? Any hidden problems we might encounter?
>>
>>
>>
>> K
>
> Hi Kristjan
>
> The strong reference there is a feature. Descriptors keep the class
> alive if somehow the class disappears and the descriptor itself does
> not. Please don't change language semantics (yes, this is a change in
> semantics), just because the test suite passes - I can assure you
> there are people doing convoluted stuff that expect this to work.
>

+1 for it being an expected behavior. So I think
it would be a nice thing to write a test that breaks under this condition

js
-><-

> Cheers,
> fijal

From fabiosantosart at gmail.com Mon May 13 15:35:53 2013
From: fabiosantosart at gmail.com (Fábio Santos)
Date: Mon, 13 May 2013 14:35:53 +0100
Subject: [Python-Dev] Tightening up the specification for locals()
In-Reply-To:
References: <5183245D.2000009@pearwood.info> <20130512140151.1116d35e@fsol>
Message-ID:

> Like it or not, people rely on this behavior. I don't think CPython
> (or PyPy) can actually afford to change it. If so, documenting it
> sounds like a better idea than leaving it undocumented only known to
> the "inner shrine"
>

+1. I am relying on this.

From barry at python.org Mon May 13 16:19:38 2013
From: barry at python.org (Barry Warsaw)
Date: Mon, 13 May 2013 10:19:38 -0400
Subject: [Python-Dev] Best practices for Enum
In-Reply-To:
References:
Message-ID: <20130513101938.1d91bed0@limelight.wooz.org>

On May 12, 2013, at 04:49 PM, Raymond Hettinger wrote:

>After the long design effort for the enum module, I'm sure there will be a
>forthcoming effort to apply them pervasively throughout the standard library.

We usually, explicitly, try not to do such wholesale adoptions in the stdlib
when new features land. This is almost always a good idea in order to gain
more experience with the new feature, reduce code churn (and thus the
introduction of bugs), and aid in back/forward porting.

It seems entirely reasonable to me to be just as conservative about adoption
of enums in the stdlib. As Eli mentions, making the socket constants enums
seems like a good test case.

-Barry

From christian at python.org Mon May 13 16:49:19 2013
From: christian at python.org (Christian Heimes)
Date: Mon, 13 May 2013 16:49:19 +0200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To:
References: <518CDE16.6010104@python.org> <519025A1.2010805@python.org>
Message-ID: <5190FD6F.5070605@python.org>

On 13.05.2013 02:21, Ben Hoyt wrote:
> Are you suggesting just accessing .cached_lstat could call os.lstat()?
> That seems very bad to me. It's a property access -- it looks cheap,
> therefore people will expect it to be. From PEP 8 "Avoid using
> properties for computationally expensive operations; the attribute
> notation makes the caller believe that access is (relatively) cheap."
>
> Even worse is error handling -- I'd expect the expression
> "entry.cached_lstat" to only ever raise AttributeError, not OSError in
> the case it calls stat under the covers. Calling code would have to
> have a try/except around what looked like a simple attribute access.
> > For these two reasons I think lstat() should definitely be a function. OK, you got me! I'm now convinced that a property is a bad idea. I still like to annotate that the function may return a cached value. Perhaps lstat() could require an argument? def lstat(self, cached): if not cached or self._lstat is None: self._lstat = os.lstat(...) return self._lstat > True. My isdir/isfile/islink implementations should catch any OSError > from the lstat() and return False (like os.path.isdir etc do). But > then calling code still doesn't need try/excepts around the isdir() > calls. This is how os.walk() is implemented -- there's no extra error > handling around the isdir() call. You could take the opportunity and take the 'file was deleted' case into account. I admit it has a very low priority. Please regard the case for bonus points only. ;) > Sure. I'm primarily a Windows dev, so not too familiar with all the > fancy stat* functions. But what you're saying makes sense. I'm glad to be of assistance! The feature is new (added in 3.3) and is available on most POSIX platforms. http://docs.python.org/3/library/os.html#dir-fd If you need any help or testing please feel free to ask me. I really like to get this feature into 3.4. Christian From ethan at stoneleaf.us Mon May 13 17:38:45 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 13 May 2013 08:38:45 -0700 Subject: [Python-Dev] Best practices for Enum In-Reply-To: <0C619F7A-BD13-4171-9611-1FCB6E9CBE47@gmail.com> References: <0C619F7A-BD13-4171-9611-1FCB6E9CBE47@gmail.com> Message-ID: <51910905.6060206@stoneleaf.us> On 05/13/2013 12:06 AM, Raymond Hettinger wrote: > > Ethan's email suggests that against my advice he is in-fact going to go through the standard library, applying enums > quite broadly. I think you are falling victim to Wizard's First Rule: people will believe what they want to be true, or are afraid is true. What I said was > I'd like to apply them where it makes sense. 
Which does not mean quite broadly, unless you think it would make sense to have them everywhere? Because that's not
the impression I have from your posts.

Furthermore, what's wrong with going through the stdlib, examining the various modules, and then asking questions to
see if, indeed, it does "make sense" to use enums in that module? How else will I know? Are you going to give me a
list of which ones are acceptable?

I don't know you well enough to guess at your motivations for misrepresenting me, but please stop. If you have a
question, ask me.

--
~Ethan~

From tjreedy at udel.edu Mon May 13 19:21:17 2013
From: tjreedy at udel.edu (Terry Jan Reedy)
Date: Mon, 13 May 2013 13:21:17 -0400
Subject: [Python-Dev] weak refs in descriptors (http://bugs.python.org/issue17950)
In-Reply-To:
References:
Message-ID:

On 5/13/2013 9:20 AM, Maciej Fijalkowski wrote:

> The strong reference there is a feature. Descriptors keep the class
> alive if somehow the class disappears and the descriptor itself does

Is this feature stated or implied in the reference manual?
3.3.2.1. Implementing Descriptors
3.3.2.2. Invoking Descriptors
???
or is it an implementation detail that people have come to rely on?

From solipsis at pitrou.net Mon May 13 19:36:45 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 13 May 2013 19:36:45 +0200
Subject: [Python-Dev] weak refs in descriptors (http://bugs.python.org/issue17950)
References:
Message-ID: <20130513193645.62c9d679@fsol>

On Mon, 13 May 2013 13:21:17 -0400
Terry Jan Reedy wrote:
> On 5/13/2013 9:20 AM, Maciej Fijalkowski wrote:
>
> > The strong reference there is a feature. Descriptors keep the class
> > alive if somehow the class disappears and the descriptor itself does
>
> Is this feature stated or implied in the reference manual?
> 3.3.2.1. Implementing Descriptors
> 3.3.2.2. Invoking Descriptors
> ???
> or is it an implementation detail that people have come to rely on?
Any reference that is not documentedly weak is strong by definition;
this is Python's basic semantics, there's no need to ask about
documentation pointers. The only question is whether some people rely
on this particular one.

Regards

Antoine.

From benhoyt at gmail.com Tue May 14 00:41:01 2013
From: benhoyt at gmail.com (Ben Hoyt)
Date: Tue, 14 May 2013 10:41:01 +1200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To:
References:
Message-ID:

> I'd like to see the numbers for NFS or CIFS - stat() can be brutally slow
> over a network connection (that's why we added a caching mechanism to
> importlib).

How do I know what file system Windows networking is using? In any
case, here are some numbers on Windows -- it's looking pretty good!
This is with default DEPTH/NUM_DIRS/NUM_FILES on a LAN:

Benchmarking walks on \\anothermachine\docs\Ben\bigtree, repeat 3/3...
os.walk took 11.345s, scandir.walk took 0.340s -- 33.3x as fast

And this is on a VPN on a remote network with the benchmark.py values
cranked down to DEPTH = 3, NUM_DIRS = 3, NUM_FILES = 20 (because
otherwise it was taking far too long):

Benchmarking walks on \\ben1.titanmt.local\c$\dev\scandir\benchtree,
repeat 3/3...
os.walk took 122.310s, scandir.walk took 5.452s -- 22.4x as fast

If anyone can run benchmark.py on Linux / NFS or similar, that'd be
great. You'll probably have to lower DEPTH/NUM_DIRS/NUM_FILES first
and then move the "benchtree" to the network file system to run it
against that.

> I initially quite liked the idea of not offering any methods on
> DirEntry, only properties, to make it obvious that they don't touch
> the file system, but just report info from the scandir call. However,
> I think that it ends up reading strangely, and would be confusing
> relative to the os.path() APIs.
>
> What you have now seems like a good, simple alternative.

Thanks.
Yeah, I kinda liked the "DirEntry doesn't make any OS calls" at
first too, but then as I got into it I realized it made for a really
nasty API for most use cases. I like how it's ended up.

-Ben

From benhoyt at gmail.com Tue May 14 00:50:13 2013
From: benhoyt at gmail.com (Ben Hoyt)
Date: Tue, 14 May 2013 10:50:13 +1200
Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info
In-Reply-To: <5190FD6F.5070605@python.org>
References: <518CDE16.6010104@python.org> <519025A1.2010805@python.org> <5190FD6F.5070605@python.org>
Message-ID:

> OK, you got me! I'm now convinced that a property is a bad idea.

Thanks. :-)

> I still like to annotate that the function may return a cached value.
> Perhaps lstat() could require an argument?
>
> def lstat(self, cached):
>     if not cached or self._lstat is None:
>         self._lstat = os.lstat(...)
>     return self._lstat

Hmm, I'm just not sure I like the API. Setting cached to True to me
would imply it's only ever going to come from the cache (i.e., just
return self._lstat). Also, isdir() etc have the same issue, so if
you're going this route, their signatures would need this too.

The DirEntry instance is really a cached value in itself. ".name" is
cached, ".dirent" is cached, and the methods return cached if they
can. That's more or less the point of the object.

But you have a fair point, and this would need to be explicit in the
documentation.

-Ben

>
>
>> True. My isdir/isfile/islink implementations should catch any OSError
>> from the lstat() and return False (like os.path.isdir etc do). But
>> then calling code still doesn't need try/excepts around the isdir()
>> calls. This is how os.walk() is implemented -- there's no extra error
>> handling around the isdir() call.
>
> You could take the opportunity and take the 'file was deleted' case into
> account. I admit it has a very low priority. Please regard the case for
> bonus points only. ;)
>
>> Sure.
I'm primarily a Windows dev, so not too familiar with all the >> fancy stat* functions. But what you're saying makes sense. > > I'm glad to be of assistance! The feature is new (added in 3.3) and is > available on most POSIX platforms. > http://docs.python.org/3/library/os.html#dir-fd > > If you need any help or testing please feel free to ask me. I really > like to get this feature into 3.4. > > Christian From ethan at stoneleaf.us Tue May 14 04:36:56 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 13 May 2013 19:36:56 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <518DD40F.1070005@g.nevcal.com> References: <5185FCB1.6030702@g.nevcal.com> <518DD40F.1070005@g.nevcal.com> Message-ID: <5191A348.90805@stoneleaf.us> On 05/10/2013 10:15 PM, Glenn Linderman wrote: > > So it is quite possible to marry the two, as Ethan helped me figure out using an earlier NamedInt class: > > class NIE( IntET, Enum ): > x = ('NIE.x', 1) > y = ('NIE.y', 2) > z = ('NIE.z', 4) > > and then expressions involving members of NIE (and even associated integers) will be tracked... see demo1.py. > > But the last few lines of demo1 demonstrate that NIE doesn't like, somehow, remember that its values, deep down under > the covers, are really int. And doesn't even like them when they are wrapped into IntET objects. This may or may not > be a bug in the current Enum implementation. [demo1.py excerpt] print( repr( NIE1( 1 ) + NIE1(2))) print( repr( NIE1( IntET('NIE1.x', 1 )) + NIE1(2))) > So the questions are: > 1) Is there a bug in ref435 Enum that makes demo1 report errors instead of those lines working? Nope. > 2) Is something like demo2 interesting to anyone but me? Of course, I think it would be great for reporting flag values > using names rather than a number representing combined bit fields. No idea. ;) > 3) I don't see a way to subclass the ref435 EnumMeta except by replacing the whole __new__ method... 
does this mechanism > warrant a slight refactoring of EnumMeta to make this mechanism easier to subclass with less code redundancy? I've broken it down to make subclassing easier. > 4) Or is it simple enough and useful enough to somehow make it a feature of EnumMeta, enabled by a keyword parameter? Probably not. > 5) All this is based on "IntET"... which likely suffices for API flags parameters... but when I got to __truediv__ and > __rtruediv__, which don't return int, then I started wondering how to write a vanilla ET class that inherits from > "number" instead of "int" or "float"? One could, of course, make cooperating classes FloatET and DecimalET .... is this > a language limitation, or is there more documentation I haven't read? :) (I did read footnote [1] of > , and trembled.) Sounds like a fun project (for some value of fun ;) Okay, sorry for the long delay. What it comes down to is if you want to marry two complex types together, you may have to be the counselor as well. ;) Here's your code, revamped. I did make a slight change in the meta -- I moved the name assignment above the __init__ call so it's available in __init__. 
--8<--------------------------------------------------------
from ref435 import Enum
from flags import IntET

class NIE1( IntET, Enum ):
    x = 1
    y = 2
    z = 4
    def __new__(cls, value):
        member = IntET.__new__(cls, 'temp', value)
        member._value = value
        return member
    def __init__(self, value):
        self._etname = self._name

print( repr( NIE1.x.value ))
print( repr( NIE1.x + NIE1.y ))
print( repr( NIE1.x + ~ NIE1.y))
print( repr( NIE1.x + ~ 2 ))
print( repr( NIE1.z * 3 ))

print( repr( NIE1( 1 ) + NIE1(2)))
print( repr( NIE1( IntET('NIE1.x', 1 )) + NIE1(2)))
--8<--------------------------------------------------------

and my results:

1
IntET('(x + y)', 3)
IntET('(x + ~y)', -2)
IntET('(x + -3)', -2)
IntET('(z * 3)', 12)
IntET('(x + y)', 3)
IntET('(x + y)', 3)

Oh, and if you really wanted the 'NIE' in the _etname, change the name assignment:

    self._etname = 'NIE.' + self._name

--
~Ethan~

From ethan at stoneleaf.us Tue May 14 04:43:37 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 13 May 2013 19:43:37 -0700
Subject: [Python-Dev] PEP 435 - ref impl disc 2
In-Reply-To: <5191A348.90805@stoneleaf.us>
References: <5185FCB1.6030702@g.nevcal.com> <518DD40F.1070005@g.nevcal.com> <5191A348.90805@stoneleaf.us>
Message-ID: <5191A4D9.30604@stoneleaf.us>

On 05/13/2013 07:36 PM, Ethan Furman wrote:
> Here's your code, revamped. I did make a slight change in the meta -- I moved the name assignment above the __init__
> call so it's available in __init__.
>
> --8<--------------------------------------------------------
> from ref435 import Enum
> from flags import IntET
>
> class NIE1( IntET, Enum ):
>     x = 1
>     y = 2
>     z = 4
>     def __new__(cls, value):
>         member = IntET.__new__(cls, 'temp', value)
>         member._value = value
>         return member
>     def __init__(self, value):
>         self._etname = self._name
>
> print( repr( NIE1.x.value ))
> print( repr( NIE1.x + NIE1.y ))
> print( repr( NIE1.x + ~ NIE1.y))
> print( repr( NIE1.x + ~ 2 ))
> print( repr( NIE1.z * 3 ))
>
>
> print( repr( NIE1( 1 ) + NIE1(2)))
> print( repr( NIE1( IntET('NIE1.x', 1 )) + NIE1(2)))
> --8<--------------------------------------------------------
>
> and my results:
>
> 1
> IntET('(x + y)', 3)
> IntET('(x + ~y)', -2)
> IntET('(x + -3)', -2)
> IntET('(z * 3)', 12)
> IntET('(x + y)', 3)
> IntET('(x + y)', 3)

Forgot to mention the good part -- in the custom __new__ you are able to set the value to whatever you want (not a big
deal in this case, but if you had several parameters going in you could still make _value be a single, simple int).

--
~Ethan~

From v+python at g.nevcal.com Tue May 14 07:01:17 2013
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Mon, 13 May 2013 22:01:17 -0700
Subject: [Python-Dev] PEP 435 - ref impl disc 2
In-Reply-To: <5191A348.90805@stoneleaf.us>
References: <5185FCB1.6030702@g.nevcal.com> <518DD40F.1070005@g.nevcal.com> <5191A348.90805@stoneleaf.us>
Message-ID: <5191C51D.3000304@g.nevcal.com>

On 5/13/2013 7:36 PM, Ethan Furman wrote:
> On 05/10/2013 10:15 PM, Glenn Linderman wrote:
>>
>> So it is quite possible to marry the two, as Ethan helped me figure
>> out using an earlier NamedInt class:
>>
>> class NIE( IntET, Enum ):
>>     x = ('NIE.x', 1)
>>     y = ('NIE.y', 2)
>>     z = ('NIE.z', 4)
>>
>> and then expressions involving members of NIE (and even associated
>> integers) will be tracked... see demo1.py.
>> >> But the last few lines of demo1 demonstrate that NIE doesn't like, >> somehow, remember that its values, deep down under >> the covers, are really int. And doesn't even like them when they are >> wrapped into IntET objects. This may or may not >> be a bug in the current Enum implementation. > > [demo1.py excerpt] > print( repr( NIE1( 1 ) + NIE1(2))) > print( repr( NIE1( IntET('NIE1.x', 1 )) + NIE1(2))) > > >> So the questions are: >> 1) Is there a bug in ref435 Enum that makes demo1 report errors >> instead of those lines working? > > Nope. Well, if it isn't a bug, it will be interesting to read the documentation that explains the behavior, when the documentation is written: The "obvious" documentation would be that Enum names values of any type, particularly the first type in the multiple-inheritance list. The values assigned to the enumeration members are used as parameters to the constructor of that first type, but the value of the enumeration member itself is an item of the type, created by the constructor. The __call__ syntax [ EnumDerivation( value ) ] looks up enumeration members by value. The obvious documentation would stop there. But if demo1 doesn't demonstrate a bug, it would have to continue, saying something like: However, if you have a complex type, you can't look up by value, but rather have to resupply the constructor parameters used to create the item. This means that for simple types EnumDerivation( EnumerationMember.value ) is EnumerationMember but that doesn't hold for complex types. I think it should. >> 2) Is something like demo2 interesting to anyone but me? Of course, I >> think it would be great for reporting flag values >> using names rather than a number representing combined bit fields. > > No idea. ;) > >> 3) I don't see a way to subclass the ref435 EnumMeta except by >> replacing the whole __new__ method... 
does this mechanism
>> warrant a slight refactoring of EnumMeta to make this mechanism
>> easier to subclass with less code redundancy?
>
> I've broken it down to make subclassing easier.

Thanks... I'll take a look, eventually, but I'll be offline until next
week.

>> 4) Or is it simple enough and useful enough to somehow make it a
>> feature of EnumMeta, enabled by a keyword parameter?
>
> Probably not.
>
>> 5) All this is based on "IntET"... which likely suffices for API
>> flags parameters... but when I got to __truediv__ and
>> __rtruediv__, which don't return int, then I started wondering how to
>> write a vanilla ET class that inherits from
>> "number" instead of "int" or "float"? One could, of course, make
>> cooperating classes FloatET and DecimalET .... is this
>> a language limitation, or is there more documentation I haven't read?
>> :) (I did read footnote [1] of
>> ,
>> and trembled.)
>
> Sounds like a fun project (for some value of fun ;)

Not sure I'll get there, for a few years... such might be useful in
certain debugging scenarios, but not sure it is useful enough to
implement, given the footnote, except, perhaps, to truly become an
expert in the Python object model.

> Okay, sorry for the long delay.
>
> What it comes down to is if you want to marry two complex types
> together, you may have to be the counselor as well. ;)

:) I assume by "counselor" you mean the code for __new__ and __init__
below, which, when I get a chance to understand them, will probably
explain some of your earlier remarks about it maybe being easier to
implement in such a manner. Of course, I don't particularly want to
marry the types, just have XxxEnum work for IntET as well as it does for
int... I was bumping into name conflicts between Nick's implementation
and yours, that weren't immediately obvious to me, because I haven't
done multiple inheritance much -- Enum is dragging me into that and
metaclasses, though, which is a good thing for me, likely.
The one piece of "marriage" that is interesting is to avoid specifying
the name twice, and it seems your code

> Here's your code, revamped. I did make a slight change in the meta
> -- I moved the name assignment above the __init__ call so it's
> available in __init__.

That's handy, thanks.

>
> --8<--------------------------------------------------------
> from ref435 import Enum
> from flags import IntET
>
> class NIE1( IntET, Enum ):
>     x = 1
>     y = 2
>     z = 4
>     def __new__(cls, value):
>         member = IntET.__new__(cls, 'temp', value)
>         member._value = value
>         return member
>     def __init__(self, value):
>         self._etname = self._name
>
> print( repr( NIE1.x.value ))
> print( repr( NIE1.x + NIE1.y ))
> print( repr( NIE1.x + ~ NIE1.y))
> print( repr( NIE1.x + ~ 2 ))
> print( repr( NIE1.z * 3 ))
>
>
> print( repr( NIE1( 1 ) + NIE1(2)))
> print( repr( NIE1( IntET('NIE1.x', 1 )) + NIE1(2)))
> --8<--------------------------------------------------------
>
> and my results:
>
> 1
> IntET('(x + y)', 3)
> IntET('(x + ~y)', -2)
> IntET('(x + -3)', -2)
> IntET('(z * 3)', 12)
> IntET('(x + y)', 3)
> IntET('(x + y)', 3)

I'd expect NIE1.x.value to be IntET('x', 1) but I'll have to look more
carefully at what you've done, when I have some time next week. You may
have made some "simplifying assumptions", and things _should_ be as
simple as possible, but no simpler... especially not if it leads to
unexpected results.

> Oh, and if you really wanted the 'NIE' in the _etname, change the name
> assignment:
>
>     self._etname = 'NIE.' + self._name

Sure. I did, because one problem that might arise is the combination of
NIE-style enums from different enumerations... not prohibited for
IntEnum or NIE, because it gets converted to the base type (int or
IntET, respectively). But if someone accidentally combines an
enumeration member from NIE1 and an enumeration member from NIE2, and it
has the same member name, the expression could "look right" without the
class name included.
So you see, including the class name was not just a whim,
but the result of analyzing potential error cases.

> Forgot to mention the good part -- in the custom __new__ you are able
> to set the value to whatever you want (not a big deal in this case,
> but if you had several parameters going in you could still make _value
> be a single, simple int).

This will take more thought than I have time for tonight, also. Right
now, I think I want the value for NIE.x to be IntET('NIE.x', 1 ). And
your code isn't achieving that at present, but maybe I just need to
tweak __new__ and then can... and maybe it cures the discrepancy in
expectations mentioned earlier too...

On the other hand, when I think about it more, maybe I'll see what you
are suggesting as a better path, for some reason. But I think it is the
case at present, that what you think I want, is different than what I
think I want :)

From v+python at g.nevcal.com Tue May 14 07:01:26 2013
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Mon, 13 May 2013 22:01:26 -0700
Subject: [Python-Dev] PEP 435 doesn't help with bitfields [Was: Re: PEP 435 - ref impl disc 2]
In-Reply-To: <5191A348.90805@stoneleaf.us>
References: <5185FCB1.6030702@g.nevcal.com> <518DD40F.1070005@g.nevcal.com> <5191A348.90805@stoneleaf.us>
Message-ID: <5191C526.1000303@g.nevcal.com>

On 5/13/2013 7:36 PM, Ethan Furman wrote:
>> 2) Is something like demo2 interesting to anyone but me? Of course, I
>> think it would be great for reporting flag values
>> using names rather than a number representing combined bit fields.
>
> No idea. ;)

There's been some talk of Enum-ing constants in the Socket library...
I'm no socket programmer, so I'd have to go read the APIs to know if any
of them are for bitfields which are typically combined together with |
or + being the typical operators... and which would convert them to
plain integers, and lose the reporting by name.
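
A minimal sketch of that concern, assuming the `enum.IntEnum` described by PEP 435 as it later shipped in Python 3.4. The `Perm` class and its members are hypothetical, made up purely for illustration; they are not socket constants.

```python
from enum import IntEnum


class Perm(IntEnum):
    # Hypothetical flag-style members; each value is a single bit.
    READ = 4
    WRITE = 2
    EXEC = 1


single = Perm.READ
# IntEnum defines no __or__ of its own, so int.__or__ runs and the
# result is a plain int: the member names are gone from the repr.
combined = Perm.READ | Perm.WRITE

print(repr(single))            # still reported by name
print(repr(combined))          # 6
print(type(combined) is int)   # True
```

A single member keeps its name, but as soon as two members are combined the result is only the number 6, which is the loss of "reporting by name" described above.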
That's the problem I see with IntEnum used for bitfields. For simple
selection of choices, one choice per parameter, Enum will be great. But
for bitfields, it is lacking.

Sorry if this sounds repetitious, but all the other times I've mentioned
it, it has been in a big discussion of other stuff too.

Glenn

From ethan at stoneleaf.us Tue May 14 07:35:42 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 13 May 2013 22:35:42 -0700
Subject: [Python-Dev] PEP 435 doesn't help with bitfields [Was: Re: PEP 435 - ref impl disc 2]
In-Reply-To: <5191C526.1000303@g.nevcal.com>
References: <5185FCB1.6030702@g.nevcal.com> <518DD40F.1070005@g.nevcal.com> <5191A348.90805@stoneleaf.us> <5191C526.1000303@g.nevcal.com>
Message-ID: <5191CD2E.3050109@stoneleaf.us>

On 05/13/2013 10:01 PM, Glenn Linderman wrote:
>
> Sorry if this sounds repetitious, but all the other times I've mentioned it, it has been in a big discussion of other
> stuff too.

It's a while 'til 3.4. A bitfield-type enum may show up in the docs, if nowhere else. ;)

--
~Ethan~

From ethan at stoneleaf.us Tue May 14 08:11:28 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 13 May 2013 23:11:28 -0700
Subject: [Python-Dev] PEP 435 - ref impl disc 2
In-Reply-To: <5191C51D.3000304@g.nevcal.com>
References: <5185FCB1.6030702@g.nevcal.com> <518DD40F.1070005@g.nevcal.com> <5191A348.90805@stoneleaf.us> <5191C51D.3000304@g.nevcal.com>
Message-ID: <5191D590.2070601@stoneleaf.us>

On 05/13/2013 10:01 PM, Glenn Linderman wrote:
> On 5/13/2013 7:36 PM, Ethan Furman wrote:
>> On 05/10/2013 10:15 PM, Glenn Linderman wrote:
>>>
>>> So it is quite possible to marry the two, as Ethan helped me figure out using an earlier NamedInt class:
>>>
>>> class NIE( IntET, Enum ):
>>>     x = ('NIE.x', 1)
>>>     y = ('NIE.y', 2)
>>>     z = ('NIE.z', 4)
>>>
>>> and then expressions involving members of NIE (and even associated integers) will be tracked...
see demo1.py. >>> >>> But the last few lines of demo1 demonstrate that NIE doesn't like, somehow, remember that its values, deep down under >>> the covers, are really int. And doesn't even like them when they are wrapped into IntET objects. This may or may not >>> be a bug in the current Enum implementation. >> >> [demo1.py excerpt] >> print( repr( NIE1( 1 ) + NIE1(2))) >> print( repr( NIE1( IntET('NIE1.x', 1 )) + NIE1(2))) >> >> >>> So the questions are: >>> 1) Is there a bug in ref435 Enum that makes demo1 report errors instead of those lines working? >> >> Nope. > > Well, if it isn't a bug, it will be interesting to read the documentation that explains the behavior, when the > documentation is written: > > The "obvious" documentation would be that Enum names values of any type, particularly the first type in the > multiple-inheritance list. The values assigned to the enumeration members are used as parameters to the constructor of > that first type, but the value of the enumeration member itself is an item of the type, created by the constructor. > > The __call__ syntax [ EnumDerivation( value ) ] looks up enumeration members by value. > > The obvious documentation would stop there. > > The one piece of "marriage" that is interesting is to avoid specifying the name twice, and it seems your code > > I'd expect NIE1.x.value to be IntET('x', 1) but I'll have to look more carefully at what you've done, when I have > some time next week. You may have made some "simplifying assumptions", and things _should_ be as simple as possible, but > no simpler... especially not if it leads to unexpected results. > > This will take more thought than I have time for tonight, also. Right now, I think I want the value for NIE.x to be > IntET('NIE.x', 1 ). And your code isn't achieving that at present, but maybe I just need to tweak __new__ and then > can... and maybe it cures the discrepancy in expectations mentioned earlier too... Thank you for being persistent. 
You are correct, the value should be an IntET (at least, with a custom __new__ ;). I'll look into it. -- ~Ethan~ From greg at krypto.org Tue May 14 08:39:21 2013 From: greg at krypto.org (Gregory P. Smith) Date: Mon, 13 May 2013 23:39:21 -0700 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: References: <518CDE16.6010104@python.org> Message-ID: On Sun, May 12, 2013 at 3:04 PM, Ben Hoyt wrote: > > And if we're creating a custom object instead, why return a 2-tuple > > rather than making the entry's name an attribute of the custom object? > > > > To me, that suggests a more reasonable API for os.scandir() might be > > for it to be an iterator over "dir_entry" objects: > > > > name (as a string) > > is_file() > > is_dir() > > is_link() > > stat() > > cached_stat (None or a stat object) > > Nice! I really like your basic idea of returning a custom object > instead of a 2-tuple. And I agree with Christian that .stat() would be > clearer called .lstat(). I also like your later idea of simply > exposing .dirent (would be None on Windows). > > One tweak I'd suggest is that is_file() etc be called isfile() etc > without the underscore, to match the naming of the os.path.is* > functions. > > > That would actually make sense at an implementation > > level anyway - is_file() etc would check self.cached_lstat first, and > > if that was None they would check self.dirent, and if that was also > > None they would raise an error. > > Hmm, I'm not sure about this at all. Are you suggesting that the > DirEntry object's is* functions would raise an error if both > cached_lstat and dirent were None? Wouldn't it make for a much simpler > API to just call os.lstat() and populate cached_lstat instead? As far > as I'm concerned, that'd be the point of making DirEntry.lstat() a > function. > > In fact, I don't think .cached_lstat should be exposed to the user. 
> They just call entry.lstat(), and it returns a cached stat or calls
> os.lstat() to get the real stat if required (and populates the
> internal cached stat value). And the entry.is* functions would call
> entry.lstat() if dirent was None or d_type was DT_UNKNOWN. This would
> change relatively nasty code like this:
>
>     files = []
>     dirs = []
>     for entry in os.scandir(path):
>         try:
>             isdir = entry.isdir()
>         except NotPresentError:
>             st = os.lstat(os.path.join(path, entry.name))
>             isdir = stat.S_ISDIR(st.st_mode)
>         if isdir:
>             dirs.append(entry.name)
>         else:
>             files.append(entry.name)
>
> Into nice clean code like this:
>
>     files = []
>     dirs = []
>     for entry in os.scandir(path):
>         if entry.isdir():
>             dirs.append(entry.name)
>         else:
>             files.append(entry.name)
>
> This change would make scandir() usable by ordinary mortals, rather
> than just hardcore library implementors.
>
> In other words, I'm proposing that the DirEntry objects yielded by
> scandir() would have .name and .dirent attributes, and .isdir(),
> .isfile(), .islink(), .lstat() methods, and look basically like this
> (though presumably implemented in C):
>
>     class DirEntry:
>         def __init__(self, name, dirent, lstat, path='.'):
>             # User shouldn't need to call this, but called internally by scandir()
>             self.name = name
>             self.dirent = dirent
>             self._lstat = lstat  # non-public attributes
>             self._path = path
>
>         def lstat(self):
>             if self._lstat is None:
>                 self._lstat = os.lstat(os.path.join(self._path, self.name))
>             return self._lstat
>
>         def isdir(self):
>             if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
>                 return self.dirent.d_type == DT_DIR
>             else:
>                 return stat.S_ISDIR(self.lstat().st_mode)
>
>         def isfile(self):
>             if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
>                 return self.dirent.d_type == DT_REG
>             else:
>                 return stat.S_ISREG(self.lstat().st_mode)
>
>         def islink(self):
>             if self.dirent is not None and self.dirent.d_type != DT_UNKNOWN:
>                 return self.dirent.d_type == DT_LNK
>             else:
>                 return stat.S_ISLNK(self.lstat().st_mode)
>
> Oh, and the .dirent would either be None (Windows) or would have
> .d_type and .d_ino attributes (Linux, OS X).
>
> This would make the scandir() API nice and simple to use for callers,
> but still expose all the information the OS provides (both the
> meaningful fields in dirent, and a full stat on Windows, nicely cached
> in the DirEntry object).
>
> Thoughts?
>

I like the sound of this (which sounds like what you've implemented now
though I haven't looked at your code).

-gps

>
> -Ben
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/greg%40krypto.org
>

From greg at krypto.org Tue May 14 08:51:35 2013
From: greg at krypto.org (Gregory P. Smith)
Date: Mon, 13 May 2013 23:51:35 -0700
Subject: [Python-Dev] Best practices for Enum
In-Reply-To:
References:
Message-ID:

On Sun, May 12, 2013 at 4:49 PM, Raymond Hettinger <
raymond.hettinger at gmail.com> wrote:

>
> * will enums break doctests or any existing user code
>

Those are already broken by design. We shouldn't be limited just because
someone wrote a bad test that assumed a particular repr of a value. We've
already broken that assumption several times in the past from the recent
hash randomization change to removing the evil trailing Ls on the old long
type, changing the float str vs repr, particular information and wording of
exception error messages, etc.

This sounds like a feature request for doctest. doctest could be educated
about enums and automatically compare to the integer value for such cases.

Regardless, it sounds like the consensus agrees with your overall
sentiment: refrain from mass converting things "just because" and start
with an obvious improvement like the socket module.
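
The repr-versus-value point can be illustrated with a small sketch. It assumes the enum module as it later shipped in Python 3.4, which this thread predates. The Family class and its values are hypothetical stand-ins, not the socket module's real constants, and run() is a helper invented for this illustration. The first docstring expects the old integer repr and fails once the constant becomes an enum member; the second compares values and keeps passing.

```python
import doctest
from enum import IntEnum


class Family(IntEnum):
    """Hypothetical stand-in for a group of socket-style constants."""
    INET = 2
    INET6 = 10


# Fragile: the expected output assumes the constant still prints as a bare int.
REPR_BASED = """
>>> Family.INET
2
"""

# Robust: the examples compare values and never depend on the repr.
VALUE_BASED = """
>>> Family.INET == 2
True
>>> int(Family.INET)
2
"""


def run(docstring, name):
    """Run the examples in docstring and return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(docstring, {"Family": Family}, name, "<sketch>", 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Failure reports go to a no-op writer so the sketch stays quiet.
    results = runner.run(test, out=lambda text: None)
    return results.failed, results.attempted
```

Calling run(REPR_BASED, "repr_based") reports one failure because the member's repr is no longer "2", while run(VALUE_BASED, "value_based") reports none.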
-gps From solipsis at pitrou.net Tue May 14 10:37:35 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 14 May 2013 10:37:35 +0200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info References: Message-ID: <20130514103735.53310d61@pitrou.net> Le Tue, 14 May 2013 10:41:01 +1200, Ben Hoyt a écrit : > > I'd like to see the numbers for NFS or CIFS - stat() can be brutally slow > > over a network connection (that's why we added a caching mechanism > > to importlib). > > How do I know what file system Windows networking is using? In any > case, here's some numbers on Windows -- it's looking pretty good! This > is with default DEPTH/NUM_DIRS/NUM_FILES on a LAN: > > Benchmarking walks on \\anothermachine\docs\Ben\bigtree, repeat 3/3... > os.walk took 11.345s, scandir.walk took 0.340s -- 33.3x as fast > > And this is on a VPN on a remote network with the benchmark.py values > cranked down to DEPTH = 3, NUM_DIRS = 3, NUM_FILES = 20 (because > otherwise it was taking far too long): > > Benchmarking walks on \\ben1.titanmt.local\c$\dev\scandir\benchtree, > repeat 3/3... > os.walk took 122.310s, scandir.walk took 5.452s -- 22.4x as fast > > If anyone can run benchmark.py on Linux / NFS or similar, that'd be > great. You'll probably have to lower DEPTH/NUM_DIRS/NUM_FILES first > and then move the "benchtree" to the network file system to run it > against that. Why does your benchmark create such large files? It doesn't make sense. Regards Antoine.
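
For readers following the benchmark numbers, the two patterns being compared can be sketched roughly as follows. This is only an illustration, not the benchmark.py from the thread. It uses os.scandir() as it later shipped in Python 3.5 (with context-manager support from 3.6), whose method names (is_dir(), is_file(), stat()) differ from the isdir()/lstat() spelling proposed here. The function names and the demo() helper are made up for this example.

```python
import os
import stat
import tempfile


def split_with_listdir(path):
    """Classic approach: one os.lstat() call per directory entry."""
    dirs, files = [], []
    for name in os.listdir(path):
        st = os.lstat(os.path.join(path, name))
        (dirs if stat.S_ISDIR(st.st_mode) else files).append(name)
    return sorted(dirs), sorted(files)


def split_with_scandir(path):
    """scandir approach: the entry type usually comes with the listing itself."""
    dirs, files = [], []
    with os.scandir(path) as entries:
        for entry in entries:
            # follow_symlinks=False mirrors the lstat() semantics used above.
            is_dir = entry.is_dir(follow_symlinks=False)
            (dirs if is_dir else files).append(entry.name)
    return sorted(dirs), sorted(files)


def demo():
    """Build a tiny tree and return the result of both approaches."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        with open(os.path.join(root, "a.txt"), "w") as fh:
            fh.write("x")
        return split_with_listdir(root), split_with_scandir(root)
```

Both functions return the same answer. The difference the benchmarks measure is how many stat calls are needed to get there, which is why the gain depends so heavily on the file system.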
From solipsis at pitrou.net Tue May 14 10:50:55 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 14 May 2013 10:50:55 +0200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info References: Message-ID: <20130514105055.08dc954f@pitrou.net> Le Tue, 14 May 2013 10:41:01 +1200, Ben Hoyt a écrit : > > If anyone can run benchmark.py on Linux / NFS or similar, that'd be > great. You'll probably have to lower DEPTH/NUM_DIRS/NUM_FILES first > and then move the "benchtree" to the network file system to run it > against that. On a locally running VM: os.walk took 0.400s, scandir.walk took 0.120s -- 3.3x as fast Same VM accessed from the host through a local sshfs: os.walk took 2.261s, scandir.walk took 2.055s -- 1.1x as fast Same, but with "sshfs -o cache=no": os.walk took 24.060s, scandir.walk took 25.906s -- 0.9x as fast Regards Antoine. From p.f.moore at gmail.com Tue May 14 10:54:29 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 14 May 2013 09:54:29 +0100 Subject: [Python-Dev] PEP 379 Python launcher for Windows - behaviour for #!/usr/bin/env python line is wrong In-Reply-To: References: Message-ID: On 5 May 2013 18:10, Paul Moore wrote: > > On 4 May 2013 16:42, Vinay Sajip wrote: > >> I've taken a quick look at it, but I probably won't be able to make any >> changes until near the end of the coming week. Feel free to have a go; >> > > OK, I have a patch against the standalone pylauncher repo at > https://bitbucket.org/pmoore/pylauncher. I'm not sure what the best > approach is - I didn't want to patch the python core version directly (a) > because I wouldn't be able to test it easily, and (b) because I'd want a > standalone version anyway until 3.4 comes out. > Vinay, Did you get a chance to have a look at this?
I didn't manage to create a pull request against your copy of pylauncher as my repo is a fork of the pypa one - I'm not sure if that's a limitation of bitbucket or if I just don't know how to do it... I've created a pull request against the pypa version in case that's of use... Paul From benhoyt at gmail.com Tue May 14 10:54:50 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 14 May 2013 20:54:50 +1200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: <20130514103735.53310d61@pitrou.net> References: <20130514103735.53310d61@pitrou.net> Message-ID: >> If anyone can run benchmark.py on Linux / NFS or similar, that'd be >> great. You'll probably have to lower DEPTH/NUM_DIRS/NUM_FILES first >> and then move the "benchtree" to the network file system to run it >> against that. > > Why does your benchmark create such large files? It doesn't make sense. Yeah, I was just thinking about that last night, and I should probably change that. Originally I did it because I thought it might affect the speed of directory walking, so I was trying to make some of the files large to be more "real world". I've just tested it, and in practice file system doesn't make much difference, so I've fixed that now: https://github.com/benhoyt/scandir/commit/9663c0afcc5c020d5d1fe34a120b0331b8c9d2e0 Thanks, Ben From solipsis at pitrou.net Tue May 14 11:05:25 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 14 May 2013 11:05:25 +0200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info References: <20130514103735.53310d61@pitrou.net> Message-ID: <20130514110525.242c64df@pitrou.net> Le Tue, 14 May 2013 20:54:50 +1200, Ben Hoyt a écrit : > >> If anyone can run benchmark.py on Linux / NFS or similar, that'd be > >> great.
You'll probably have to lower DEPTH/NUM_DIRS/NUM_FILES first > >> and then move the "benchtree" to the network file system to run it > >> against that. > > > > Why does your benchmark create such large files? It doesn't make > > sense. > > Yeah, I was just thinking about that last night, and I should probably > change that. Originally I did it because I thought it might affect the > speed of directory walking, so I was trying to make some of the files > large to be more "real world". I've just tested it, and in practice > file system doesn't make much difference, so I've fixed that now: Thanks. I had bumped the number of files, thinking it would make things more interesting, and it filled my disk. Regards Antoine. From benhoyt at gmail.com Tue May 14 11:08:27 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 14 May 2013 21:08:27 +1200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: <20130514110525.242c64df@pitrou.net> References: <20130514103735.53310d61@pitrou.net> <20130514110525.242c64df@pitrou.net> Message-ID: >> large to be more "real world". I've just tested it, and in practice >> file system doesn't make much difference, so I've fixed that now: > > Thanks. I had bumped the number of files, thinking it would make things > more interesting, and it filled my disk. Denial of Pitrou attack -- sorry! :-) Anyway, it shouldn't fill your disk now. Though it still does use more on-disk space than 3 bytes per file on most FSs, depending on the smallest block size. 
-Ben From benhoyt at gmail.com Tue May 14 11:10:08 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 14 May 2013 21:10:08 +1200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: <20130514105055.08dc954f@pitrou.net> References: <20130514105055.08dc954f@pitrou.net> Message-ID: > On a locally running VM: > os.walk took 0.400s, scandir.walk took 0.120s -- 3.3x as fast > > Same VM accessed from the host through a local sshfs: > os.walk took 2.261s, scandir.walk took 2.055s -- 1.1x as fast > > Same, but with "sshfs -o cache=no": > os.walk took 24.060s, scandir.walk took 25.906s -- 0.9x as fast Thanks. I take it those are "USING FAST C version"? What is "-o cache=no"? I'm guessing the last one isn't giving dirents, so my version is slightly slower than the built-in listdir/stat version due to building and calling methods on the DirEntry objects in Python. It should be no slower when it's all moved to C. -Ben From solipsis at pitrou.net Tue May 14 11:57:16 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 14 May 2013 11:57:16 +0200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info References: <20130514105055.08dc954f@pitrou.net> Message-ID: <20130514115716.169af254@pitrou.net> Le Tue, 14 May 2013 21:10:08 +1200, Ben Hoyt a écrit : > > On a locally running VM: > > os.walk took 0.400s, scandir.walk took 0.120s -- 3.3x as fast > > > > Same VM accessed from the host through a local sshfs: > > os.walk took 2.261s, scandir.walk took 2.055s -- 1.1x as fast > > > > Same, but with "sshfs -o cache=no": > > os.walk took 24.060s, scandir.walk took 25.906s -- 0.9x as fast > > Thanks. I take it those are "USING FAST C version"? Yes. > What is "-o cache=no"? I'm guessing the last one isn't giving dirents, > so my version is slightly slower than the built-in listdir/stat > version due to building and calling methods on the DirEntry objects in > Python.
It disables sshfs's built-in cache (I suppose it's a filesystem metadata cache). The man page doesn't tell much more about it. > It should be no slower when it's all moved to C. The slowdown is too small to be interesting. The main point is that there was no speedup, though. Regards Antoine. From benhoyt at gmail.com Tue May 14 12:14:42 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Tue, 14 May 2013 22:14:42 +1200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: <20130514115716.169af254@pitrou.net> References: <20130514105055.08dc954f@pitrou.net> <20130514115716.169af254@pitrou.net> Message-ID: >> It should be no slower when it's all moved to C. > > The slowdown is too small to be interesting. The main point is that > there was no speedup, though. True, and thanks for testing. I don't think that's a big issue, however. If it's 3-8x faster in the majority of cases (local disk on all systems, Windows networking), and no slower in a minority (sshfs), I'm not too sad about that. I wonder how sshfs compared to nfs. -Ben From solipsis at pitrou.net Tue May 14 12:34:25 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 14 May 2013 12:34:25 +0200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info References: <20130514105055.08dc954f@pitrou.net> <20130514115716.169af254@pitrou.net> Message-ID: <20130514123425.427176ba@pitrou.net> Le Tue, 14 May 2013 22:14:42 +1200, Ben Hoyt a écrit : > >> It should be no slower when it's all moved to C. > > > > The slowdown is too small to be interesting. The main point is that > > there was no speedup, though. > > True, and thanks for testing. > > I don't think that's a big issue, however. If it's 3-8x faster in the > majority of cases (local disk on all systems, Windows networking), and > no slower in a minority (sshfs), I'm not too sad about that. > > I wonder how sshfs compared to nfs.
Ok, with a NFS mount (default options, especially "sync") to the same local VM: First run: os.walk took 17.137s, scandir.walk took 0.625s -- 27.4x as fast Second run: os.walk took 1.535s, scandir.walk took 0.617s -- 2.5x as fast (something fishy with caches?) Regards Antoine. From cf.natali at gmail.com Tue May 14 12:35:45 2013 From: cf.natali at gmail.com (Charles-François Natali) Date: Tue, 14 May 2013 12:35:45 +0200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: References: <20130514105055.08dc954f@pitrou.net> <20130514115716.169af254@pitrou.net> Message-ID: > I wonder how sshfs compared to nfs. (I've modified your benchmark to also test the case where data isn't in the page cache). Local ext3: cached: os.walk took 0.096s, scandir.walk took 0.030s -- 3.2x as fast uncached: os.walk took 0.320s, scandir.walk took 0.130s -- 2.5x as fast NFSv3, 1Gb/s network: cached: os.walk took 0.220s, scandir.walk took 0.078s -- 2.8x as fast uncached: os.walk took 0.269s, scandir.walk took 0.139s -- 1.9x as fast From matthieu.brucher at gmail.com Tue May 14 12:53:42 2013 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Tue, 14 May 2013 11:53:42 +0100 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: References: <20130514105055.08dc954f@pitrou.net> <20130514115716.169af254@pitrou.net> Message-ID: Very interesting. Although os.walk may not be widely used in cluster applications, anything that lowers the number of calls to stat() in an application is worthwhile for parallel filesystems as stat() is handled by the only non-parallel node, the MDS. Small test on another NFS drive: Creating tree at benchtree: depth=4, num_dirs=5, num_files=50 Priming the system's cache... Benchmarking walks on benchtree, repeat 1/3... Benchmarking walks on benchtree, repeat 2/3... Benchmarking walks on benchtree, repeat 3/3...
os.walk took 0.117s, scandir.walk took 0.041s -- 2.8x as fast I may try it on a Lustre FS if I have some time and if I don't forget about this. Cheers, Matthieu 2013/5/14 Charles-François Natali > > I wonder how sshfs compared to nfs. > > (I've modified your benchmark to also test the case where data isn't > in the page cache). > > Local ext3: > cached: > os.walk took 0.096s, scandir.walk took 0.030s -- 3.2x as fast > uncached: > os.walk took 0.320s, scandir.walk took 0.130s -- 2.5x as fast > > NFSv3, 1Gb/s network: > cached: > os.walk took 0.220s, scandir.walk took 0.078s -- 2.8x as fast > uncached: > os.walk took 0.269s, scandir.walk took 0.139s -- 1.9x as fast -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher Music band: http://liliejay.com/ From vinay_sajip at yahoo.co.uk Tue May 14 13:34:12 2013 From: vinay_sajip at yahoo.co.uk (Vinay Sajip) Date: Tue, 14 May 2013 12:34:12 +0100 (BST) Subject: [Python-Dev] PEP 379 Python launcher for Windows - behaviour for #!/usr/bin/env python line is wrong In-Reply-To: References: Message-ID: <1368531252.98974.YahooMailNeo@web171402.mail.ir2.yahoo.com> > From: Paul Moore >Did you get a chance to have a look at this? I didn't manage to create a pull request against your copy of pylauncher as my repo > is a fork of the pypa one - I'm not sure if that's a limitation of bitbucket or if I just don't know how to do it... I've created a pull request > against the pypa version in case that's of use... Hi Paul, Sorry I haven't had a chance yet - real life is very busy at the moment.
A pull request against the pypa version is fine, and I will get to it soon - thanks for your patience. Regards, Vinay Sajip From steve at pearwood.info Tue May 14 14:08:15 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 14 May 2013 22:08:15 +1000 Subject: [Python-Dev] Best practices for Enum In-Reply-To: References: Message-ID: <5192292F.6050406@pearwood.info> On 14/05/13 16:51, Gregory P. Smith wrote: [...] > This sounds like a feature request for doctest. doctest could be educated > about enums and automatically compare to the integer value for such cases. Please no. Enums are not special enough to break the rules. Good: "Doctests look at the object's repr." Bad: "Doctests look at an object's repr, unless the object is an Enum, when it will look at the enum's value." If I want a test that checks the enum's value, then I will write a doctest that explicitly checks the enum's value. -- Steven From dirkjan at ochtman.nl Tue May 14 14:18:10 2013 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Tue, 14 May 2013 14:18:10 +0200 Subject: [Python-Dev] Issue 11406: adding os.scandir(), a directory iterator returning stat-like info In-Reply-To: References: <20130514105055.08dc954f@pitrou.net> <20130514115716.169af254@pitrou.net> Message-ID: On Tue, May 14, 2013 at 12:14 PM, Ben Hoyt wrote: > I don't think that's a big issue, however. If it's 3-8x faster in the > majority of cases (local disk on all systems, Windows networking), and > no slower in a minority (sshfs), I'm not too sad about that. Might be interesting to test something status calls with a hacked Mercurial. Cheers, Dirkjan From phil at freehackers.org Tue May 14 14:32:27 2013 From: phil at freehackers.org (Philippe Fremy) Date: Tue, 14 May 2013 14:32:27 +0200 Subject: [Python-Dev] How to debug python crashes Message-ID: <51922EDB.1030404@freehackers.org> Hi, I have a reproducible crash on Windows XP with Python 2.7 which I would like to investigate.
I have Visual Studio 2008 installed and I downloaded the pdb files. However I could not find any instructions on how to use them and was unsuccessful at getting anything out of it. I checked the developer guide but could not find anything on debugging crashes. On internet, this seems to be also an underdocumented topic. So, a few questions : - is there some documentation to help debugging crashes ? - are the pdb files released along python usable with Visual Studio and stock Python ? Or do you need a hand-compiled version ? cheers, Philippe From solipsis at pitrou.net Tue May 14 14:47:59 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 14 May 2013 14:47:59 +0200 Subject: [Python-Dev] How to debug python crashes References: <51922EDB.1030404@freehackers.org> Message-ID: <20130514144759.0009c023@pitrou.net> Le Tue, 14 May 2013 14:32:27 +0200, Philippe Fremy a écrit : > Hi, > > I have a reproducible crash on Windows XP with Python 2.7 which I > would like to investigate. I have Visual Studio 2008 installed and I > downloaded the pdb files. However I could not find any instructions on > how to use them and was unsuccessful at getting anything out of it. You may as well recompile Python in debug mode and then run it under the Visual Studio debugger. VS 2008 is adequate for building Python 2.7. See http://docs.python.org/devguide/setup.html#windows-compiling (that doesn't answer your question about pdb files, it's simply that I don't know the answer :-)) Regards Antoine. From mail at timgolden.me.uk Tue May 14 14:49:58 2013 From: mail at timgolden.me.uk (Tim Golden) Date: Tue, 14 May 2013 13:49:58 +0100 Subject: [Python-Dev] How to debug python crashes In-Reply-To: <51922EDB.1030404@freehackers.org> References: <51922EDB.1030404@freehackers.org> Message-ID: <519232F6.2050101@timgolden.me.uk> On 14/05/2013 13:32, Philippe Fremy wrote: > I have a reproducible crash on Windows XP with Python 2.7 which I would > like to investigate.
I have Visual Studio 2008 installed and I > downloaded the pdb files. However I could not find any instructions on > how to use them and was unsuccessful at getting anything out of it. > > I checked the developer guide but could not find anything on debugging > crashes. On internet, this seems to be also an underdocumented topic. > > So, a few questions : > - is there some documentation to help debugging crashes ? I don't think there is. As you say, it's somewhat underdocumented. Maybe someone else can point to something, but I'm not aware of anything. > - are the pdb files released along python usable with Visual Studio and > stock Python ? Or do you need a hand-compiled version ? I actually have no idea whether you drop in the .pdb files, but if you have VS anyway, it's easy enough to build and run within VS and let the debugger drop you into the code when it crashes. Are you in a position to post a reproducible test case to the tracker? Or were you holding back until you'd done some analysis? TJG From ethan at stoneleaf.us Tue May 14 16:16:25 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 14 May 2013 07:16:25 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <5191D590.2070601@stoneleaf.us> References: <5185FCB1.6030702@g.nevcal.com> <518DD40F.1070005@g.nevcal.com> <5191A348.90805@stoneleaf.us> <5191C51D.3000304@g.nevcal.com> <5191D590.2070601@stoneleaf.us> Message-ID: <51924739.1050803@stoneleaf.us> On 05/13/2013 11:11 PM, Ethan Furman wrote: > On 05/13/2013 10:01 PM, Glenn Linderman wrote: >> On 5/13/2013 7:36 PM, Ethan Furman wrote: >>> On 05/10/2013 10:15 PM, Glenn Linderman wrote: >>>> >>>> So it is quite possible to marry the two, as Ethan helped me figure out using an earlier NamedInt class: >>>> >>>> class NIE( IntET, Enum ): >>>> x = ('NIE.x', 1) >>>> y = ('NIE.y', 2) >>>> z = ('NIE.z', 4) >>>> >>>> and then expressions involving members of NIE (and even associated integers) will be tracked... see demo1.py. 
>>>> >>>> But the last few lines of demo1 demonstrate that NIE doesn't like, somehow, remember that its values, deep down under >>>> the covers, are really int. And doesn't even like them when they are wrapped into IntET objects. This may or may not >>>> be a bug in the current Enum implementation. >>> >>> [demo1.py excerpt] >>> print( repr( NIE1( 1 ) + NIE1(2))) >>> print( repr( NIE1( IntET('NIE1.x', 1 )) + NIE1(2))) >>> >>> >>>> So the questions are: >>>> 1) Is there a bug in ref435 Enum that makes demo1 report errors instead of those lines working? >>> >>> Nope. >> >> Well, if it isn't a bug, it will be interesting to read the documentation that explains the behavior, when the >> documentation is written: >> >> The "obvious" documentation would be that Enum names values of any type, particularly the first type in the >> multiple-inheritance list. The values assigned to the enumeration members are used as parameters to the constructor of >> that first type, but the value of the enumeration member itself is an item of the type, created by the constructor. >> >> The __call__ syntax [ EnumDerivation( value ) ] looks up enumeration members by value. >> >> The obvious documentation would stop there. > >> >> The one piece of "marriage" that is interesting is to avoid specifying the name twice, and it seems your code >> >> I'd expect NIE1.x.value to be IntET('x', 1) but I'll have to look more carefully at what you've done, when I have >> some time next week. You may have made some "simplifying assumptions", and things _should_ be as simple as possible, but >> no simpler... especially not if it leads to unexpected results. > >> >> This will take more thought than I have time for tonight, also. Right now, I think I want the value for NIE.x to be >> IntET('NIE.x', 1 ). And your code isn't achieving that at present, but maybe I just need to tweak __new__ and then >> can... and maybe it cures the discrepancy in expectations mentioned earlier too... 
> > Thank you for being persistent. You are correct, the value should be an IntET (at least, with a custom __new__ ;). You know, when you look at something you wrote the night before, and have no idea what you were trying to say, you know you were tired. Ignore my parenthetical remark. Okay, the value is now an IntET, as expected and appropriate. -- ~Ethan~ From carlosnepomuceno at outlook.com Tue May 14 17:22:05 2013 From: carlosnepomuceno at outlook.com (Carlos Nepomuceno) Date: Tue, 14 May 2013 18:22:05 +0300 Subject: [Python-Dev] First post Message-ID: Hi guys! This is my first post on this list. I'd like to have your opinion on how to safely implement WSGI on a production server. My benchmarks show no performance differences between our PHP and Python environments. I'm using mod_wsgi v3.4 with Apache 2.4. Is that ok or can it get faster? Thanks in advance. Regards, Carlos From phil at freehackers.org Tue May 14 17:29:52 2013 From: phil at freehackers.org (Philippe Fremy) Date: Tue, 14 May 2013 17:29:52 +0200 Subject: [Python-Dev] How to debug python crashes In-Reply-To: <519232F6.2050101@timgolden.me.uk> References: <51922EDB.1030404@freehackers.org> <519232F6.2050101@timgolden.me.uk> Message-ID: <51925870.3070504@freehackers.org> On 14/05/2013 14:49, Tim Golden wrote: > On 14/05/2013 13:32, Philippe Fremy wrote: >> I have a reproducible crash on Windows XP with Python 2.7 which I would >> like to investigate. I have Visual Studio 2008 installed and I >> downloaded the pdb files. However I could not find any instructions on >> how to use them and was unsuccessful at getting anything out of it. >> >> I checked the developer guide but could not find anything on debugging >> crashes. On internet, this seems to be also an underdocumented topic. >> >> So, a few questions : >> - is there some documentation to help debugging crashes ? > I don't think there is. As you say, it's somewhat underdocumented.
Maybe > someone else can point to something, but I'm not aware of anything. But what's the reason for releasing them ? If you need to recompile Python to use them, that would be strange because they are generated as part of the compilation process anyway. > Are you in a position to post a reproducible test case to the tracker? > Or were you holding back until you'd done some analysis? I can reproduce it systematically, with an open source project (I am debugging winpdb) but this occurs in a middle of a multithreaded XML-RPC server running a python debugger in another thread. So no, I don't have a test case and identifying clearly the bug would make me a small and happy python contributor. cheers, Philippe From brian at python.org Tue May 14 17:34:41 2013 From: brian at python.org (Brian Curtin) Date: Tue, 14 May 2013 10:34:41 -0500 Subject: [Python-Dev] First post In-Reply-To: References: Message-ID: On Tue, May 14, 2013 at 10:22 AM, Carlos Nepomuceno wrote: > Hi guys! This is my first post on this list. > > I'd like to have your opinion on how to safely implement WSGI on a production server. > > My benchmarks show no performance differences between our PHP and Python environments. I'm using mod_wsgi v3.4 with Apache 2.4. > > Is that ok or can it get faster? > > Thanks in advance. Hi - this list is about the development of Python. For user questions, python-list is a better place to ask this. From ethan at stoneleaf.us Tue May 14 17:36:28 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 14 May 2013 08:36:28 -0700 Subject: [Python-Dev] First post In-Reply-To: References: Message-ID: <519259FC.4080002@stoneleaf.us> On 05/14/2013 08:22 AM, Carlos Nepomuceno wrote: > Hi guys! This is my first post on this list. Hi Carlos! > I'd like to have your opinion on how to safely implement WSGI on a production server. Unfortunately this list is for the development /of/ Python, not development /with/ Python.
Try asking again over on the regular Python list: http://mail.python.org/mailman/listinfo/python-list -- ~Ethan~ From greg at krypto.org Tue May 14 18:39:10 2013 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 14 May 2013 09:39:10 -0700 Subject: [Python-Dev] Best practices for Enum In-Reply-To: <5192292F.6050406@pearwood.info> References: <5192292F.6050406@pearwood.info> Message-ID: Bad: doctests. On Tue, May 14, 2013 at 5:08 AM, Steven D'Aprano wrote: > On 14/05/13 16:51, Gregory P. Smith wrote: > [...] > > This sounds like a feature request for doctest. doctest could be educated >> about enums and automatically compare to the integer value for such cases. >> > > Please no. Enums are not special enough to break the rules. > > Good: "Doctests look at the object's repr." > > Bad: "Doctests look at an object's repr, unless the object is an Enum, > when it will look at the enum's value." > > If I want a test that checks the enum's value, then I will write a doctest > that explicitly checks the enum's value. > > > > -- > Steven From victor.stinner at gmail.com Tue May 14 20:52:11 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Tue, 14 May 2013 20:52:11 +0200 Subject: [Python-Dev] How to debug python crashes In-Reply-To: <51922EDB.1030404@freehackers.org> References: <51922EDB.1030404@freehackers.org> Message-ID: Hi, I don't know if it can help, but if you really don't know where your program crash/hang occurs, you can use the faulthandler module: https://pypi.python.org/pypi/faulthandler It can be used to display the backtrace of all threads on an event like a signal or a timeout.
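
As a rough sketch of what dumping the backtrace of all threads looks like in practice, the following uses the faulthandler module from the standard library (Python 3.3 and later; the PyPI package mentioned above is the backport for older versions). The dump_all_threads() helper and its worker thread are invented for this illustration and are not taken from the thread.

```python
import faulthandler
import tempfile
import threading


def dump_all_threads():
    """Dump the traceback of every running thread and return it as text."""
    ready = threading.Event()
    stop = threading.Event()

    def worker():
        ready.set()
        stop.wait()  # Park here so the thread shows up in the dump.

    thread = threading.Thread(target=worker, name="worker")
    thread.start()
    ready.wait()
    try:
        # faulthandler writes straight to a file descriptor, so a real
        # file (not an in-memory buffer) is required.
        with tempfile.TemporaryFile(mode="w+") as out:
            faulthandler.dump_traceback(file=out, all_threads=True)
            out.seek(0)
            return out.read()
    finally:
        stop.set()
        thread.join()
```

When chasing a real crash or hang, one would more likely call faulthandler.enable() at startup so a traceback is written on a fatal signal, or faulthandler.dump_traceback_later(timeout) so it is written if the program is still running after the timeout.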
It works with Python, but you will need a compiler (like Visual Studio) to install it on Windows. I failed to build an MSI installer on Windows 64-bit with Visual Studio 2010 express. If someone can help me to build the MSI, please contact me. The documentation: http://docs.python.org/dev/library/faulthandler.html Victor On Tuesday, 14 May 2013, Philippe Fremy wrote: > Hi, > > I have a reproducible crash on Windows XP with Python 2.7 which I would > like to investigate. I have Visual Studio 2008 installed and I > downloaded the pdb files. However I could not find any instructions on > how to use them and was unsuccessful at getting anything out of it. > > I checked the developer guide but could not find anything on debugging > crashes. On the internet, this seems to be also an underdocumented topic. > > So, a few questions : > - is there some documentation to help debugging crashes ? > - are the pdb files released along python usable with Visual Studio and > stock Python ? Or do you need a hand-compiled version ? > > cheers, > > Philippe From iacobcatalin at gmail.com Tue May 14 21:55:02 2013 From: iacobcatalin at gmail.com (Catalin Iacob) Date: Tue, 14 May 2013 21:55:02 +0200 Subject: [Python-Dev] How to debug python crashes In-Reply-To: <51925870.3070504@freehackers.org> References: <51922EDB.1030404@freehackers.org> <519232F6.2050101@timgolden.me.uk> <51925870.3070504@freehackers.org> Message-ID: Hi Philippe, I don't have access to VS right now but off the top of my head what you need to do is roughly outlined below. On Tue, May 14, 2013 at 5:29 PM, Philippe Fremy wrote: > But what's the reason for releasing them ?
If you need to recompile > Python to use them, that would be strange because they are generated as > part of the compilation process anyway. They can indeed be used like this: You should launch the python.exe process that is going to crash, attach to it with the Visual Studio debugger and then reproduce the crash. This should drop you in the debugger. Once you're in the debugger and python.exe is stopped at the point of the crash you should see the stack trace of each thread in a VS window; the stack trace will probably have lots of entries of the form python27.dll! (no function names because VS doesn't know where to find the PDB files). If you right click one of those entries there's an option named "Symbol load information" or similar; this will show a window from which you can make VS ask you where on disk you have the PDB files. You then tell VS where to find python27.pdb and then the stack trace entries should automatically get function names. Catalin From ethan at stoneleaf.us Tue May 14 22:09:16 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 14 May 2013 13:09:16 -0700 Subject: [Python-Dev] Pickling failure on Enums In-Reply-To: <1368469970.49.0.194089170746.issue17947@psf.upfronthosting.co.za> References: <1368469970.49.0.194089170746.issue17947@psf.upfronthosting.co.za> Message-ID: <519299EC.7070600@stoneleaf.us> On 05/13/2013 11:32 AM, Guido van Rossum wrote: > > But now you enter a different phase of your project, or one of your collaborators does, or perhaps you've released your code on PyPI and one of your users does. So someone tries to pickle some class instance that happens to contain an unpicklable enum. That's not a great experience. Pickling and unpickling errors are often remarkably hard to debug. (Especially the latter, so I have privately admonished Ethan to ensure that if the getframe hack doesn't work, the pickle failure should happen at pickling time, not at unpickle time.)
I can get pickle failure on members created using the functional syntax with no module set; I cannot get pickle failure on those same classes; I cannot get pickle failure on class syntax enums that inherit complex types (such as the NEI class in the tests). If anybody has any insight on how to make that work, I'm all ears. -- ~Ethan~ From guido at python.org Tue May 14 22:58:45 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 14 May 2013 13:58:45 -0700 Subject: [Python-Dev] Pickling failure on Enums In-Reply-To: <519299EC.7070600@stoneleaf.us> References: <1368469970.49.0.194089170746.issue17947@psf.upfronthosting.co.za> <519299EC.7070600@stoneleaf.us> Message-ID: On Tue, May 14, 2013 at 1:09 PM, Ethan Furman wrote: > On 05/13/2013 11:32 AM, Guido van Rossum wrote: >> >> >> But now you enter a different phase of your project, or one of your >> collaborators does, or perhaps you've released your code on PyPI and one of >> your users does. So someone tries to pickle some class instance that >> happens to contain an unpicklable enum. That's not a great experience. >> Pickling and unpickling errors are often remarkably hard to debug. >> (Especially the latter, so I have privately admonished Ethan to ensure that >> if the getframe hack doesn't work, the pickle failure should happen at >> pickling time, not at unpickle time.) > > > I can get pickle failure on members created using the functional syntax with > no module set; That's the case I care most about. > I cannot get pickle failure on those same classes; I suppose you mean "if you create the same enums using class syntax"? Sounds fine to me. > I cannot > get pickle failure on class syntax enums that inherit complex types (such as > the NEI class in the tests). Is the NEI base class picklable? > If anybody has any insight on how to make that work, I'm all ears. I'm not 100% sure I know what "that" refers to here. 
-- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Tue May 14 23:13:00 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 14 May 2013 14:13:00 -0700 Subject: [Python-Dev] Pickling failure on Enums In-Reply-To: References: <1368469970.49.0.194089170746.issue17947@psf.upfronthosting.co.za> <519299EC.7070600@stoneleaf.us> Message-ID: <5192A8DC.2050007@stoneleaf.us> On 05/14/2013 01:58 PM, Guido van Rossum wrote: > On Tue, May 14, 2013 at 1:09 PM, Ethan Furman wrote: >> On 05/13/2013 11:32 AM, Guido van Rossum wrote: >>> >>> >>> But now you enter a different phase of your project, or one of your >>> collaborators does, or perhaps you've released your code on PyPI and one of >>> your users does. So someone tries to pickle some class instance that >>> happens to contain an unpicklable enum. That's not a great experience. >>> Pickling and unpickling errors are often remarkably hard to debug. >>> (Especially the latter, so I have privately admonished Ethan to ensure that >>> if the getframe hack doesn't work, the pickle failure should happen at >>> pickling time, not at unpickle time.) >> >> >> I can get pickle failure on members created using the functional syntax with >> no module set; > > That's the case I care most about. Good, 'cause that one is handled. :) >> I cannot get pickle failure on those same classes; > > I suppose you mean "if you create the same enums using class syntax"? > Sounds fine to me. No. Example class: --> Example = Enum('Example', 'example ie eg') # no module name given, frame hack fails --> pickle(Example.ie) # blows up --# pickle(Example) # succeeds here, but unpickle will fail >> I cannot get pickle failure on class syntax enums that inherit complex types >> (such as the NEI class in the tests). > > Is the NEI base class picklable? No. If it is, then the derived enum is also picklable (at least the variation I have tested, which is when the NEI base class has __getnewargs__). 
I'm really hoping you'll say that can be a documentation issue. ;) -- ~Ethan~ From guido at python.org Tue May 14 23:35:29 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 14 May 2013 14:35:29 -0700 Subject: [Python-Dev] Pickling failure on Enums In-Reply-To: <5192A8DC.2050007@stoneleaf.us> References: <1368469970.49.0.194089170746.issue17947@psf.upfronthosting.co.za> <519299EC.7070600@stoneleaf.us> <5192A8DC.2050007@stoneleaf.us> Message-ID: On Tue, May 14, 2013 at 2:13 PM, Ethan Furman wrote: > On 05/14/2013 01:58 PM, Guido van Rossum wrote: >> >> On Tue, May 14, 2013 at 1:09 PM, Ethan Furman wrote: >>> I can get pickle failure on members created using the functional syntax >>> with no module set; >> >> >> That's the case I care most about. > > > Good, 'cause that one is handled. :) Then we're good. >>> I cannot get pickle failure on those same classes; >> >> >> I suppose you mean "if you create the same enums using class syntax"? >> Sounds fine to me. > > > No. Example class: > > --> Example = Enum('Example', 'example ie eg') # no module name given, > frame hack fails > > --> pickle(Example.ie) > # blows up > > --# pickle(Example) > # succeeds here, but unpickle will fail Not great, but (a) few people pickle classes, and (b) there's probably something you can do to the metaclass to sabotage this. But it's fine to punt on this now. >>> I cannot get pickle failure on class syntax enums that inherit complex >>> types >>> (such as the NEI class in the tests). >> >> >> Is the NEI base class picklable? > > > No. If it is, then the derived enum is also picklable (at least the > variation I have tested, which is when the NEI base class has > __getnewargs__). > > I'm really hoping you'll say that can be a documentation issue. ;) Essentially the same response -- with enough hackery you can probably get this to do what you want, but I wouldn't hold up a release for it. 
For example you could file low-priority bugs for both issues in the hope that someone else figures it out. -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Wed May 15 00:16:36 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 15 May 2013 08:16:36 +1000 Subject: [Python-Dev] Pickling failure on Enums In-Reply-To: References: <1368469970.49.0.194089170746.issue17947@psf.upfronthosting.co.za> <519299EC.7070600@stoneleaf.us> <5192A8DC.2050007@stoneleaf.us> Message-ID: On 15 May 2013 07:38, "Guido van Rossum" wrote: > > On Tue, May 14, 2013 at 2:13 PM, Ethan Furman wrote: > > On 05/14/2013 01:58 PM, Guido van Rossum wrote: > >> > >> On Tue, May 14, 2013 at 1:09 PM, Ethan Furman wrote: > >>> I can get pickle failure on members created using the functional syntax > >>> with no module set; > >> > >> > >> That's the case I care most about. > > > > > > Good, 'cause that one is handled. :) > > Then we're good. > > >>> I cannot get pickle failure on those same classes; > >> > >> > >> I suppose you mean "if you create the same enums using class syntax"? > >> Sounds fine to me. > > > > > > No. Example class: > > > > --> Example = Enum('Example', 'example ie eg') # no module name given, > > frame hack fails > > > > --> pickle(Example.ie) > > # blows up > > > > --# pickle(Example) > > # succeeds here, but unpickle will fail > > Not great, but (a) few people pickle classes, and (b) there's probably > something you can do to the metaclass to sabotage this. But it's fine > to punt on this now. It may be a bug in pickle - it sounds like it is sanity checking type(obj), but not checking for cases where obj itself is a class. Cheers, Nick. > > >>> I cannot get pickle failure on class syntax enums that inherit complex > >>> types > >>> (such as the NEI class in the tests). > >> > >> > >> Is the NEI base class picklable? > > > > > > No. 
If it is, then the derived enum is also picklable (at least the > > variation I have tested, which is when the NEI base class has > > __getnewargs__). > > > > I'm really hoping you'll say that can be a documentation issue. ;) > > Essentially the same response -- with enough hackery you can probably > get this to do what you want, but I wouldn't hold up a release for it. > > For example you could file low-priority bugs for both issues in the > hope that someone else figures it out. > > -- > --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Wed May 15 02:57:31 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 14 May 2013 17:57:31 -0700 Subject: [Python-Dev] Pickling failure on Enums In-Reply-To: References: <1368469970.49.0.194089170746.issue17947@psf.upfronthosting.co.za> <519299EC.7070600@stoneleaf.us> <5192A8DC.2050007@stoneleaf.us> Message-ID: <5192DD7B.3050205@stoneleaf.us> On 05/14/2013 03:16 PM, Nick Coghlan wrote: > > On 15 May 2013 07:38, "Guido van Rossum" > wrote: >> >> On Tue, May 14, 2013 at 2:13 PM, Ethan Furman > wrote: >> > On 05/14/2013 01:58 PM, Guido van Rossum wrote: >> >> >> >> On Tue, May 14, 2013 at 1:09 PM, Ethan Furman > wrote: >> >>> I can get pickle failure on members created using the functional syntax >> >>> with no module set; >> >> >> >> >> >> That's the case I care most about. >> > >> > >> > Good, 'cause that one is handled. :) >> >> Then we're good. >> >> >>> I cannot get pickle failure on those same classes; >> >> >> >> >> >> I suppose you mean "if you create the same enums using class syntax"? >> >> Sounds fine to me. >> > >> > >> > No.
Example class: >> > >> > --> Example = Enum('Example', 'example ie eg') # no module name given, >> > frame hack fails >> > >> > --> pickle(Example.ie) >> > # blows up >> > >> > --# pickle(Example) >> > # succeeds here, but unpickle will fail >> >> Not great, but (a) few people pickle classes, and (b) there's probably >> something you can do to the metaclass to sabotage this. But it's fine >> to punt on this now. > > It may be a bug in pickle - it sounds like it is sanity checking type(obj), but not checking for cases where obj itself > is a class. Well, it's definitely not calling the metaclass' __reduce__ as that's where I put the bomb (hmm, will I be visited by men in dark suits now?) so maybe that's a bug in pickle. At any rate, I figured it out -- give the class' __module__ a dummy name (I like 'uh uh' ;) and when pickle can't find that module it'll blow itself up. -- ~Ethan~ From ethan at stoneleaf.us Wed May 15 04:02:55 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 14 May 2013 19:02:55 -0700 Subject: [Python-Dev] Pickling failure on Enums In-Reply-To: References: <1368469970.49.0.194089170746.issue17947@psf.upfronthosting.co.za> <519299EC.7070600@stoneleaf.us> <5192A8DC.2050007@stoneleaf.us> Message-ID: <5192ECCF.7070007@stoneleaf.us> On 05/14/2013 02:35 PM, Guido van Rossum wrote: > > For example you could file low-priority bugs for both issues in the > hope that someone else figures it out. Got it figured out. -- ~Ethan~ From mal at egenix.com Wed May 15 09:55:08 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 15 May 2013 09:55:08 +0200 Subject: [Python-Dev] 2.7.5 baking In-Reply-To: References: Message-ID: <51933F5C.1030703@egenix.com> On 12.05.2013 06:03, Benjamin Peterson wrote: > The long anticipated "emergency" 2.7.5 release has now been tagged. It > will be publicly announced as binaries arrive. > > Originally, I was just going to cherrypick regression fixes onto the > 2.7.4 release and release those as 2.7.5. 
I started to this but ran > into some conflicts. Since we don't have buildbot testing of release > branches, I decided it would be best to just cut from the maintenance > branch. Has the release been postponed ? I don't see it on http://www.python.org/download/ Incidentally, the schedule already lists 2.7.5 as released on 2013-05-12 (http://www.python.org/dev/peps/pep-0373/) and the release calendar on 2013-05-11: https://www.google.com/calendar/feeds/b6v58qvojllt0i6ql654r1vh00 at group.calendar.google.com/public/basic?orderby=starttime&sortorder=descending :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 15 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-05-07: Released mxODBC Zope DA 2.1.2 ... http://egenix.com/go46 2013-05-06: Released mxODBC 3.2.3 ... http://egenix.com/go45 ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From tismer at stackless.com Wed May 15 13:32:39 2013 From: tismer at stackless.com (Christian Tismer) Date: Wed, 15 May 2013 13:32:39 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> Message-ID: <51937257.4020103@stackless.com> Hi Raymond, On 08.01.13 15:49, Maciej Fijalkowski wrote: > On Mon, Dec 10, 2012 at 3:44 AM, Raymond Hettinger > wrote: >> The current memory layout for dictionaries is >> unnecessarily inefficient. 
It has a sparse table of >> 24-byte entries containing the hash value, key pointer, >> and value pointer. >> >> Instead, the 24-byte entries should be stored in a >> dense table referenced by a sparse table of indices. >> >> For example, the dictionary: >> >> d = {'timmy': 'red', 'barry': 'green', 'guido': 'blue'} >> >> is currently stored as: >> >> entries = [['--', '--', '--'], >> [-8522787127447073495, 'barry', 'green'], >> ['--', '--', '--'], >> ['--', '--', '--'], >> ['--', '--', '--'], >> [-9092791511155847987, 'timmy', 'red'], >> ['--', '--', '--'], >> [-6480567542315338377, 'guido', 'blue']] >> >> Instead, the data should be organized as follows: >> >> indices = [None, 1, None, None, None, 0, None, 2] >> entries = [[-9092791511155847987, 'timmy', 'red'], >> [-8522787127447073495, 'barry', 'green'], >> [-6480567542315338377, 'guido', 'blue']] >> >> Only the data layout needs to change. The hash table >> algorithms would stay the same. All of the current >> optimizations would be kept, including key-sharing >> dicts and custom lookup functions for string-only >> dicts. There is no change to the hash functions, the >> table search order, or collision statistics. >> >> The memory savings are significant (from 30% to 95% >> compression depending on the how full the table is). >> Small dicts (size 0, 1, or 2) get the most benefit. >> >> For a sparse table of size t with n entries, the sizes are: >> >> curr_size = 24 * t >> new_size = 24 * n + sizeof(index) * t >> >> In the above timmy/barry/guido example, the current >> size is 192 bytes (eight 24-byte entries) and the new >> size is 80 bytes (three 24-byte entries plus eight >> 1-byte indices). That gives 58% compression. >> >> Note, the sizeof(index) can be as small as a single >> byte for small dicts, two bytes for bigger dicts and >> up to sizeof(Py_ssize_t) for huge dict. >> >> In addition to space savings, the new memory layout >> makes iteration faster. 
Currently, keys(), values(), and >> items() loop over the sparse table, skipping over free >> slots in the hash table. Now, keys/values/items can >> loop directly over the dense table, using fewer memory >> accesses. >> >> Another benefit is that resizing is faster and >> touches fewer pieces of memory. Currently, every >> hash/key/value entry is moved or copied during a >> resize. In the new layout, only the indices are >> updated. For the most part, the hash/key/value entries >> never move (except for an occasional swap to fill a >> hole left by a deletion). >> >> With the reduced memory footprint, we can also expect >> better cache utilization. >> >> For those wanting to experiment with the design, >> there is a pure Python proof-of-concept here: >> >> http://code.activestate.com/recipes/578375 >> >> YMMV: Keep in mind that the above size statistics assume a >> build with 64-bit Py_ssize_t and 64-bit pointers. The >> space savings percentages are a bit different on other >> builds. Also, note that in many applications, the size >> of the data dominates the size of the container (i.e. >> the weight of a bucket of water is mostly the water, >> not the bucket). >> >> >> Raymond > One question Raymond. > > The compression ratios stay true provided you don't overallocate entry > list. If you do overallocate you don't really gain that much (it all > depends vastly on details), or even lose in some cases. What do you > think should the strategy be? > What is the current status of this discussion? I'd like to know whether it is a considered alternative implementation.
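For readers who want to see the quoted layout in action, here is a rough toy model of it. It is neither Raymond's recipe nor CPython's implementation: the class name CompactDict is hypothetical, deletion is left out, and plain linear probing replaces the real probe sequence to keep the sketch short. The size arithmetic at the end reuses the figures from the quoted message (24-byte entries, 1-byte indices, a table of 8 slots holding 3 items).

```python
class CompactDict:
    """Toy model of the proposed layout; no deletion support."""

    def __init__(self, size=8):
        # size must be a power of two so that `hash & mask` picks a slot.
        self.indices = [None] * size  # sparse table: slot -> position in entries
        self.entries = []             # dense table: [hash, key, value], insertion order

    def _probe(self, key, hashvalue):
        """Return (slot, position); position is None if the key is absent."""
        mask = len(self.indices) - 1
        slot = hashvalue & mask
        while True:
            position = self.indices[slot]
            if position is None:
                return slot, None
            entry = self.entries[position]
            if entry[0] == hashvalue and entry[1] == key:
                return slot, position
            slot = (slot + 1) & mask  # linear probing (a simplification)

    def _resize(self, new_size):
        # Only the index table is rebuilt; the entries never move.
        self.indices = [None] * new_size
        mask = new_size - 1
        for position, entry in enumerate(self.entries):
            slot = entry[0] & mask
            while self.indices[slot] is not None:
                slot = (slot + 1) & mask
            self.indices[slot] = position

    def __setitem__(self, key, value):
        hashvalue = hash(key)
        slot, position = self._probe(key, hashvalue)
        if position is not None:
            self.entries[position][2] = value  # overwrite an existing key
            return
        self.indices[slot] = len(self.entries)
        self.entries.append([hashvalue, key, value])
        # Keep the index table under two-thirds full so probing terminates.
        if len(self.entries) * 3 >= len(self.indices) * 2:
            self._resize(len(self.indices) * 2)

    def __getitem__(self, key):
        _slot, position = self._probe(key, hash(key))
        if position is None:
            raise KeyError(key)
        return self.entries[position][2]

    def keys(self):
        # Iteration walks the dense table directly, with no free slots to skip.
        return [entry[1] for entry in self.entries]


d = CompactDict()
d['timmy'] = 'red'
d['barry'] = 'green'
d['guido'] = 'blue'
keys = d.keys()

# Size comparison for a table of t slots holding n items.
t, n, index_bytes = 8, 3, 1
curr_size = 24 * t                   # one 24-byte entry per slot
new_size = 24 * n + index_bytes * t  # dense entries plus small indices
savings = 1 - new_size / curr_size
```

Because new entries are only ever appended to the dense table, keys() comes back in insertion order here, which is the property that makes this layout interesting for ordered **kwargs.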
There is also a discussion in python-ideas right now where this alternative is mentioned, and I think especially for small dicts as **kwargs, it could be a cheap way to introduce order. Is this going on, somewhere? I'm quite interested on that. cheers - chris -- Christian Tismer :^) Software Consulting : Have a break! Take a ride on Python's Karl-Liebknecht-Str. 121 : *Starship* http://starship.python.net/ 14482 Potsdam : PGP key -> http://pgp.uni-mainz.de phone +49 173 24 18 776 fax +49 (30) 700143-0023 PGP 0x57F3BF04 9064 F4E1 D754 C2FF 1619 305B C09C 5A3B 57F3 BF04 whom do you want to sponsor today? http://www.stackless.com/ From stefan at drees.name Wed May 15 14:01:31 2013 From: stefan at drees.name (Stefan Drees) Date: Wed, 15 May 2013 14:01:31 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <51937257.4020103@stackless.com> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <51937257.4020103@stackless.com> Message-ID: <5193791B.6020709@drees.name> Hi Chris, On 15.05.13 13:32 Christian Tismer wrote: > Hi Raymond, > > On 08.01.13 15:49, Maciej Fijalkowski wrote: >> On Mon, Dec 10, 2012 at 3:44 AM, Raymond Hettinger >> wrote: >>> The current memory layout for dictionaries is >>> unnecessarily inefficient. It has a sparse table of >>> 24-byte entries containing the hash value, key pointer, >>> and value pointer. >>> >>> ... >> > > What is the current status of this discussion? > I'd like to know whether it is a considered alternative implementation. > > There is also a discussion in python-ideas right now where this > alternative is mentioned, and I think especially for small dicts > as **kwargs, it could be a cheap way to introduce order. > > Is this going on, somewhere? I'm quite interested on that. +1 I am also interested on the status. Many people seemed to have copied the recipe from the activestate site (was it?) 
but I wonder if it maybe was too cool to be progressed into "the field" or simply some understandable lack of resources? All the best, Stefan From tismer at stackless.com Wed May 15 14:36:35 2013 From: tismer at stackless.com (Christian Tismer) Date: Wed, 15 May 2013 14:36:35 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <5193791B.6020709@drees.name> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <51937257.4020103@stackless.com> <5193791B.6020709@drees.name> Message-ID: <51938153.4050003@stackless.com> On 15.05.13 14:01, Stefan Drees wrote: > Hi Chris, > > On 15.05.13 13:32 Christian Tismer wrote: >> Hi Raymond, >> >> On 08.01.13 15:49, Maciej Fijalkowski wrote: >>> On Mon, Dec 10, 2012 at 3:44 AM, Raymond Hettinger >>> wrote: >>>> The current memory layout for dictionaries is >>>> unnecessarily inefficient. It has a sparse table of >>>> 24-byte entries containing the hash value, key pointer, >>>> and value pointer. >>>> >>>> ... >>> >> >> What is the current status of this discussion? >> I'd like to know whether it is a considered alternative implementation. >> >> There is also a discussion in python-ideas right now where this >> alternative is mentioned, and I think especially for small dicts >> as **kwargs, it could be a cheap way to introduce order. >> >> Is this going on, somewhere? I'm quite interested on that. > > +1 I am also interested on the status. Many people seemed to have > copied the recipe from the activestate site (was it?) but I wonder if > it maybe was too cool to be progressed into "the field" or simply some > understandable lack of resources?
> Right, found the references: http://mail.python.org/pipermail/python-dev/2012-December/123028.html http://stackoverflow.com/questions/14664620/python-dictionary-details http://code.activestate.com/recipes/578375-proof-of-concept-for-a-more-space-efficient-faster/?in=user-178123 cheers - chris From fijall at gmail.com Wed May 15 16:31:40 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 15 May 2013 16:31:40 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <51938153.4050003@stackless.com> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <51937257.4020103@stackless.com> <5193791B.6020709@drees.name> <51938153.4050003@stackless.com> Message-ID: On Wed, May 15, 2013 at 2:36 PM, Christian Tismer wrote: > On 15.05.13 14:01, Stefan Drees wrote: >> >> Hi Chris, >> >> On 15.05.13 13:32 Christian Tismer wrote: >>> >>> Hi Raymond, >>> >>> On 08.01.13 15:49, Maciej Fijalkowski wrote: >>>> >>>> On Mon, Dec 10, 2012 at 3:44 AM, Raymond Hettinger >>>> wrote: >>>>> >>>>> The current memory layout for dictionaries is >>>>> unnecessarily inefficient. It has a sparse table of >>>>> 24-byte entries containing the hash value, key pointer, >>>>> and value pointer. >>>>> >>>>> ... >>>> >>>> >>> >>> What is the current status of this discussion? >>> I'd like to know whether it is a considered alternative implementation. >>> >>> There is also a discussion in python-ideas right now where this >>> alternative is mentioned, and I think especially for small dicts >>> as **kwargs, it could be a cheap way to introduce order. >>> >>> Is this going on, somewhere?
I'm quite interested on that. >> >> >> +1 I am also interested on the status. Many people seemed to have copied >> the recipe from the activestate site (was it?) but I wonder if it maybe was >> too cool to be progressed into "the field" or simply some understandable lack >> of resources? >> > > Right, found the references: > http://mail.python.org/pipermail/python-dev/2012-December/123028.html > http://stackoverflow.com/questions/14664620/python-dictionary-details > http://code.activestate.com/recipes/578375-proof-of-concept-for-a-more-space-efficient-faster/?in=user-178123 > > > cheers - chris I implemented one for pypy btw (it's parked on a branch for messiness reasons) Cheers, fijal From benjamin at python.org Wed May 15 19:11:30 2013 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 15 May 2013 12:11:30 -0500 Subject: [Python-Dev] 2.7.5 baking In-Reply-To: <51933F5C.1030703@egenix.com> References: <51933F5C.1030703@egenix.com> Message-ID: 2013/5/15 M.-A. Lemburg : > On 12.05.2013 06:03, Benjamin Peterson wrote: >> The long anticipated "emergency" 2.7.5 release has now been tagged. It >> will be publicly announced as binaries arrive. >> >> Originally, I was just going to cherrypick regression fixes onto the >> 2.7.4 release and release those as 2.7.5. I started to this but ran >> into some conflicts.
Since we don't have buildbot testing of release >> branches, I decided it would be best to just cut from the maintenance >> branch. > > Has the release been postponed ? > > I don't see it on http://www.python.org/download/ We're waiting for binaries. > > Incidentally, the schedule already lists 2.7.5 as released on > 2013-05-12 (http://www.python.org/dev/peps/pep-0373/) and > the release calendar on 2013-05-11: > https://www.google.com/calendar/feeds/b6v58qvojllt0i6ql654r1vh00 at group.calendar.google.com/public/basic?orderby=starttime&sortorder=descending > :-) In practice, those dates mean when I tag the release. -- Regards, Benjamin From g.brandl at gmx.net Wed May 15 20:07:08 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 15 May 2013 20:07:08 +0200 Subject: [Python-Dev] 2.7.5 baking In-Reply-To: <51933F5C.1030703@egenix.com> References: <51933F5C.1030703@egenix.com> Message-ID: Am 15.05.2013 09:55, schrieb M.-A. Lemburg: > On 12.05.2013 06:03, Benjamin Peterson wrote: >> The long anticipated "emergency" 2.7.5 release has now been tagged. It >> will be publicly announced as binaries arrive. >> >> Originally, I was just going to cherrypick regression fixes onto the >> 2.7.4 release and release those as 2.7.5. I started to this but ran >> into some conflicts. Since we don't have buildbot testing of release >> branches, I decided it would be best to just cut from the maintenance >> branch. > > Has the release been postponed ? > > I don't see it on http://www.python.org/download/ > > Incidentally, the schedule already lists 2.7.5 as released on > 2013-05-12 (http://www.python.org/dev/peps/pep-0373/) and > the release calendar on 2013-05-11: > https://www.google.com/calendar/feeds/b6v58qvojllt0i6ql654r1vh00 at group.calendar.google.com/public/basic?orderby=starttime&sortorder=descending > :-) > We're still waiting for the Windows binaries. I think I will publish the source and Mac releases on the website now and make a note that Windows is coming shortly. 
Has anybody heard from Martin recently? I hope he's well and just overworked... Georg From brian at python.org Wed May 15 20:16:32 2013 From: brian at python.org (Brian Curtin) Date: Wed, 15 May 2013 13:16:32 -0500 Subject: [Python-Dev] 2.7.5 baking In-Reply-To: References: <51933F5C.1030703@egenix.com> Message-ID: On Wed, May 15, 2013 at 1:07 PM, Georg Brandl wrote: > Am 15.05.2013 09:55, schrieb M.-A. Lemburg: >> On 12.05.2013 06:03, Benjamin Peterson wrote: >>> The long anticipated "emergency" 2.7.5 release has now been tagged. It >>> will be publicly announced as binaries arrive. >>> >>> Originally, I was just going to cherrypick regression fixes onto the >>> 2.7.4 release and release those as 2.7.5. I started to this but ran >>> into some conflicts. Since we don't have buildbot testing of release >>> branches, I decided it would be best to just cut from the maintenance >>> branch. >> >> Has the release been postponed ? >> >> I don't see it on http://www.python.org/download/ >> >> Incidentally, the schedule already lists 2.7.5 as released on >> 2013-05-12 (http://www.python.org/dev/peps/pep-0373/) and >> the release calendar on 2013-05-11: >> https://www.google.com/calendar/feeds/b6v58qvojllt0i6ql654r1vh00 at group.calendar.google.com/public/basic?orderby=starttime&sortorder=descending >> :-) >> > > We're still waiting for the Windows binaries. > > I think I will publish the source and Mac releases on the website now > and make a note that Windows is coming shortly. I'm going to get started building the MSIs this evening. I'm looking into how I can obtain a code signing certificate, otherwise we'd potentially be shipping unsigned security releases...*ducks* > Has anybody heard from Martin recently? I hope he's well and just > overworked... I asked some folks on the infrastructure team and the last they heard from him was 11 April. 
From zachary.ware+pydev at gmail.com Wed May 15 20:23:06 2013 From: zachary.ware+pydev at gmail.com (Zachary Ware) Date: Wed, 15 May 2013 13:23:06 -0500 Subject: [Python-Dev] 2.7.5 baking In-Reply-To: References: <51933F5C.1030703@egenix.com> Message-ID: > I asked some folks on the infrastructure team and the last they heard > from him was 11 April. Martin replied on issue17883 on May 10. From mal at egenix.com Wed May 15 22:01:30 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 15 May 2013 22:01:30 +0200 Subject: [Python-Dev] 2.7.5 baking In-Reply-To: References: <51933F5C.1030703@egenix.com> Message-ID: <5193E99A.3010105@egenix.com> On 15.05.2013 19:11, Benjamin Peterson wrote: > 2013/5/15 M.-A. Lemburg : >> On 12.05.2013 06:03, Benjamin Peterson wrote: >>> The long anticipated "emergency" 2.7.5 release has now been tagged. It >>> will be publicly announced as binaries arrive. >>> >>> Originally, I was just going to cherrypick regression fixes onto the >>> 2.7.4 release and release those as 2.7.5. I started to this but ran >>> into some conflicts. Since we don't have buildbot testing of release >>> branches, I decided it would be best to just cut from the maintenance >>> branch. >> >> Has the release been postponed ? >> >> I don't see it on http://www.python.org/download/ > > We're waiting for binaries. Ah, ok. Thanks for the heads-up. >> Incidentally, the schedule already lists 2.7.5 as released on >> 2013-05-12 (http://www.python.org/dev/peps/pep-0373/) and >> the release calendar on 2013-05-11: >> https://www.google.com/calendar/feeds/b6v58qvojllt0i6ql654r1vh00 at group.calendar.google.com/public/basic?orderby=starttime&sortorder=descending >> :-) > > In practice, those dates mean when I tag the release. Ok. Was just wondering whether something went wrong with the website. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 15 2013) >>> Python Projects, Consulting and Support ... 
http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-05-07: Released mxODBC Zope DA 2.1.2 ... http://egenix.com/go46 2013-05-06: Released mxODBC 3.2.3 ... http://egenix.com/go45 ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From barry at python.org Wed May 15 22:58:08 2013 From: barry at python.org (Barry Warsaw) Date: Wed, 15 May 2013 16:58:08 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems Message-ID: <20130515165808.0d99a3df@anarchist> I am looking into a particularly vexing Python problem on Ubuntu that manifests in several different ways. I think the problem is the same one described in http://bugs.python.org/issue13146 and I sent a message on the subject to the ubuntu-devel list: https://lists.ubuntu.com/archives/ubuntu-devel/2013-May/037129.html I don't know what's causing the problem and have no way to reproduce it, but all the clues point to corrupt pyc files in Pythons < 3.3. The common way this manifests is a traceback on an import statement. The actual error can be a "ValueError: bad marshal data (unknown type code)" such as in http://pad.lv/1010077 or an "EOFError: EOF read where not expected" as in http://pad.lv/1060842. We have many more instances of both of these. Since both error messages come from marshal.c when trying to read the pyc for a module being imported, I suspect that something is causing the pyc files to get partially overwritten or corrupted. The workaround is always to essentially blow away the .pyc file and re-create it. 
(Various different techniques can be used, but they all boil down to the same thing.) Another commonality is that this bug -- so far -- has not been observed in any Python 3.3 code, only 3.2 and earlier, including 2.7 and 2.6. This strengthens my hypothesis, since importlib in Python 3.3 included an atomic rename of the .pyc file whereas older Pythons only do an exclusive open on the pyc files, but do *not* do an atomic rename AFAICT. This leads me to hypothesize that the bug is due to an as yet unidentified race condition during installation of Python source code on Ubuntu, which is normally when we automatically byte compile the source to .pyc files. This can happen at package installation/upgrade time, or during a fresh install. In each of these cases there *should* be only one process attempting to write the .pyc, but my guess is that for some reason, multiple processes are trying to do this, triggering a truncation or other bogus content of .pyc files. Even in Python < 3.3, it should not be possible to corrupt a .pyc when only a single process is involved, due to the import lock and/or GIL. The exclusive open of the .pyc file is clearly not enough of a protection in a multiprocess situation, since the bug has already been identified in Python on buildbots during test_multiprocessing. See http://bugs.python.org/issue13146 I think the list of errors we've seen is too extensive to chalk up to a hardware bug, and I think the systems involved are modern enough to not be subject to file system data loss. There could be a missing fsync somewhere though that might be involved. I think it's doubtful that buggy remote file systems (e.g. NFSv2) are involved. I could be wrong about any of that. I have not succeeded in writing a standalone reproducer using Python 2.7. So, the mystery is: what process on Ubuntu is exploiting holes in the exclusive open and causing this problem? 
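For readers following the atomic-rename point above, here is a minimal sketch of the write-then-rename pattern, assuming Python 3.3 or later for os.replace. The helper name write_atomically is hypothetical and this is not the importlib code itself. It only illustrates why a reader should see either the old file or the complete new one, never a half-written file.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write bytes to path so readers see either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the target directory so the final
    # rename stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Swap the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

Note that mkstemp creates the file with owner-only permissions, so a real implementation would presumably also have to set the mode expected for .pyc files.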
Issue 13146 is closed because the fix was applied to Python 3.3 (see above), but it was not backported to earlier versions. I think it would not be that difficult to backport it, and I would be willing to do so for Python 2.7 and 3.2. We might include 2.6 in that, but only in Ubuntu since I can't see how this bug could be exploited as a security vulnerability. Thoughts? -Barry From Steve.Dower at microsoft.com Wed May 15 23:00:37 2013 From: Steve.Dower at microsoft.com (Steve Dower) Date: Wed, 15 May 2013 21:00:37 +0000 Subject: [Python-Dev] How to debug python crashes In-Reply-To: References: <51922EDB.1030404@freehackers.org> <519232F6.2050101@timgolden.me.uk> <51925870.3070504@freehackers.org> Message-ID: > From: Catalin Iacob > Hi Philippe, > > I don't have access to VS right now but off the top of my head what you need > to do is roughly outlined below. > > On Tue, May 14, 2013 at 5:29 PM, Philippe Fremy > wrote: > > But what's the reason for releasing them ? If you need to recompile > > Python to use them, that would be strange because they are generated as > > part of the compilation process anyway. > > They can indeed be used like this: > > You should launch the python.exe process that is going to crash, > attach to it with the Visual Studio debugger and then reproduce the > crash. This should drop you in the debugger. > > Once you're in the debugger and python.exe is stopped at the point of > the crash you should see the stack trace of each thread in a VS > window, the stacktrace will probably have lots of entries of the form > python27.dll! (no function names because VS doesn't > know where to find the PDB files). 
If you right click one of those > entries there's an option named "Symbol load information" or similar, > this will show a window from which you can make VS ask you where on > disk do you have PDB files. You then tell VS where to find > python27.pdb and then the stacktrace entries should automatically get > function names. Copying the .pdb files to the same directories as the matching DLL/EXE files (which may be C:\Windows\System32 or C:\Windows\SysWOW64 for python27.dll) should also make this work. VS will always look next to the executable file. Cheers, Steve From brett at python.org Wed May 15 23:34:02 2013 From: brett at python.org (Brett Cannon) Date: Wed, 15 May 2013 17:34:02 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: <20130515165808.0d99a3df@anarchist> References: <20130515165808.0d99a3df@anarchist> Message-ID: On Wed, May 15, 2013 at 4:58 PM, Barry Warsaw wrote: > I am looking into a particularly vexing Python problem on Ubuntu that > manifests in several different ways. I think the problem is the same one > described in http://bugs.python.org/issue13146 and I sent a message on the > subject to the ubuntu-devel list: > https://lists.ubuntu.com/archives/ubuntu-devel/2013-May/037129.html > > I don't know what's causing the problem and have no way to reproduce it, but > all the clues point to corrupt pyc files in Pythons < 3.3. > > The common way this manifests is a traceback on an import statement. The > actual error can be a "ValueError: bad marshal data (unknown type code)" such > as in http://pad.lv/1010077 or an "EOFError: EOF read where not expected" as > in http://pad.lv/1060842. We have many more instances of both of these. > > Since both error messages come from marshal.c when trying to read the pyc for > a module being imported, I suspect that something is causing the pyc files to > get partially overwritten or corrupted. The workaround is always to > essentially blow away the .pyc file and re-create it. 
(Various different > techniques can be used, but they all boil down to the same thing.) > > Another commonality is that this bug -- so far -- has not been observed in any > Python 3.3 code, only 3.2 and earlier, including 2.7 and 2.6. This > strengthens my hypothesis, since importlib in Python 3.3 included an atomic > rename of the .pyc file whereas older Pythons only do an exclusive open on the > pyc files, but do *not* do an atomic rename AFAICT. Just an FYI, the renaming has caught at least one person off-guard: http://bugs.python.org/issue17222, so you might have to be careful about considering a backport. -Brett From tseaver at palladion.com Thu May 16 00:06:49 2013 From: tseaver at palladion.com (Tres Seaver) Date: Wed, 15 May 2013 18:06:49 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: <20130515165808.0d99a3df@anarchist> References: <20130515165808.0d99a3df@anarchist> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/15/2013 04:58 PM, Barry Warsaw wrote: > This leads me to hypothesize that the bug is due to an as yet > unidentified race condition during installation of Python source code > on Ubuntu, which is normally when we automatically byte compile the > source to .pyc files. Any chance you are using 'detox' or the equivalent to run tests on multiple interpreters in parallel? The only "bad marshal data" errors I have seen lately seemed to be provoked by that kind of practice. Tres. 
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iEYEARECAAYFAlGUBvkACgkQ+gerLs4ltQ7nCwCcCfcAEGEN26qjQ9sGPaFRx1o4 DhwAoIlNwVU2lcJQ/hs5vQ1PXYT1uUwl =0s+X -----END PGP SIGNATURE----- From ncoghlan at gmail.com Thu May 16 00:33:08 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 16 May 2013 08:33:08 +1000 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: References: <20130515165808.0d99a3df@anarchist> Message-ID: On 16 May 2013 08:11, "Tres Seaver" wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 05/15/2013 04:58 PM, Barry Warsaw wrote: > > This leads me to hypothesize that the bug is due to an as yet > > unidentified race condition during installation of Python source code > > on Ubuntu, which is normally when we automatically byte compile the > > source to .pyc files. > > Any chance you are using 'detox' or the equivalent to run tests on > mutliple interpreters in parallel? The only "bad marshall data" errors I > have seen lately seemed to be provoked by that kind of practice. 3.2 shouldn't have a problem with that if the interpreters are different versions. Personally, I would be suspicious of developmental web services doing auto-reloading while an installer is recompiling the world. I don't have enough context to be sure how plausible that is as a possible explanation, though. Cheers, Nick. > > > > Tres. 
> - -- > =================================================================== > Tres Seaver +1 540-429-0999 tseaver at palladion.com > Palladion Software "Excellence by Design" http://palladion.com > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.11 (GNU/Linux) > Comment: Using GnuPG with undefined - http://www.enigmail.net/ > > iEYEARECAAYFAlGUBvkACgkQ+gerLs4ltQ7nCwCcCfcAEGEN26qjQ9sGPaFRx1o4 > DhwAoIlNwVU2lcJQ/hs5vQ1PXYT1uUwl > =0s+X > -----END PGP SIGNATURE----- From martin at v.loewis.de Thu May 16 00:42:23 2013 From: martin at v.loewis.de (Martin v. Löwis) Date: Thu, 16 May 2013 00:42:23 +0200 Subject: [Python-Dev] 2.7.5 baking In-Reply-To: References: <51933F5C.1030703@egenix.com> Message-ID: <51940F4F.8030105@v.loewis.de> On 15.05.13 20:07, Georg Brandl wrote: > Has anybody heard from Martin recently? I hope he's well and just > overworked... True on both counts. I was travelling over the weekend, and then didn't manage to catch up with email. Sorry for the delay. Regards, Martin From benjamin at python.org Thu May 16 06:19:06 2013 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 15 May 2013 23:19:06 -0500 Subject: [Python-Dev] [RELEASED] Python 2.7.5 Message-ID: It is my greatest pleasure to announce the release of Python 2.7.5. 2.7.5 is the latest maintenance release in the Python 2.7 series. You may be surprised to hear from me so soon, as Python 2.7.4 was released slightly more than a month ago. As it turns out, 2.7.4 had several regressions and incompatibilities with 2.7.3. Among them were regressions in the zipfile, gzip, and logging modules. 2.7.5 fixes these. 
In addition, a data file for testing in the 2.7.4 tarballs and binaries aroused the suspicion of some virus checkers. The 2.7.5 release removes this file to resolve that issue. For details, see the Misc/NEWS file in the distribution or view it at http://hg.python.org/cpython/file/ab05e7dd2788/Misc/NEWS Downloads are at http://python.org/download/releases/2.7.5/ As always, please report bugs to http://bugs.python.org/ (Thank you to those who reported these bugs in 2.7.4.) This is a production release. Happy May, Benjamin Peterson 2.7 Release Manager (on behalf of all of Python 2.7's contributors) From carlosnepomuceno at outlook.com Thu May 16 06:48:09 2013 From: carlosnepomuceno at outlook.com (Carlos Nepomuceno) Date: Thu, 16 May 2013 07:48:09 +0300 Subject: [Python-Dev] [RELEASED] Python 2.7.5 In-Reply-To: References: Message-ID: test_asynchat still hangs! What it does? Should I care? ---------------------------------------- > Date: Wed, 15 May 2013 23:19:06 -0500 > Subject: [RELEASED] Python 2.7.5 > From: benjamin at python.org > To: python-dev at python.org; python-list at python.org; python-announce-list at python.org > > It is my greatest pleasure to announce the release of Python 2.7.5. > > 2.7.5 is the latest maintenance release in the Python 2.7 series. You may be > surprised to hear from me so soon, as Python 2.7.4 was released slightly more > than a month ago. As it turns out, 2.7.4 had several regressions and > incompatibilities with 2.7.3. Among them were regressions in the zipfile, gzip, > and logging modules. 2.7.5 fixes these. In addition, a data file for testing in > the 2.7.4 tarballs and binaries aroused the suspicion of some virus > checkers. The 2.7.5 release removes this file to resolve that issue. 
> > For details, see the Misc/NEWS file in the distribution or view it at > > http://hg.python.org/cpython/file/ab05e7dd2788/Misc/NEWS > > Downloads are at > > http://python.org/download/releases/2.7.5/ > > As always, please report bugs to > > http://bugs.python.org/ > > (Thank you to those who reported these bugs in 2.7.4.) > > This is a production release. > > Happy May, > Benjamin Peterson > 2.7 Release Manager > (on behalf of all of Python 2.7's contributors) > -- > http://mail.python.org/mailman/listinfo/python-list From benjamin at python.org Thu May 16 06:51:00 2013 From: benjamin at python.org (Benjamin Peterson) Date: Wed, 15 May 2013 23:51:00 -0500 Subject: [Python-Dev] [RELEASED] Python 2.7.5 In-Reply-To: References: Message-ID: 2013/5/15 Carlos Nepomuceno : > test_asynchat still hangs! What it does? Should I care? Is there an issue filed for that? -- Regards, Benjamin From carlosnepomuceno at outlook.com Thu May 16 06:56:04 2013 From: carlosnepomuceno at outlook.com (Carlos Nepomuceno) Date: Thu, 16 May 2013 07:56:04 +0300 Subject: [Python-Dev] [RELEASED] Python 2.7.5 In-Reply-To: References: , , Message-ID: Just filed 17992! http://bugs.python.org/issue17992 ---------------------------------------- > Date: Wed, 15 May 2013 23:51:00 -0500 > Subject: Re: [Python-Dev] [RELEASED] Python 2.7.5 > From: benjamin at python.org > To: carlosnepomuceno at outlook.com > CC: python-dev at python.org > > 2013/5/15 Carlos Nepomuceno : >> test_asynchat still hangs! What it does? Should I care? > > Is there an issue filed for that? > > > > -- > Regards, > Benjamin From georg at python.org Thu May 16 07:20:30 2013 From: georg at python.org (Georg Brandl) Date: Thu, 16 May 2013 07:20:30 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.5 and Python 3.3.2 Message-ID: <51946C9E.10607@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On behalf of the Python development team, I am pleased to announce the releases of Python 3.2.5 and 3.3.2. 
The releases fix a few regressions in 3.2.4 and 3.3.1 in the zipfile, gzip and xml.sax modules. Details can be found in the changelogs: http://hg.python.org/cpython/file/v3.2.5/Misc/NEWS and http://hg.python.org/cpython/file/v3.3.2/Misc/NEWS To download Python 3.2.5 or Python 3.3.2, visit: http://www.python.org/download/releases/3.2.5/ or http://www.python.org/download/releases/3.3.2/ respectively. As always, please report bugs to http://bugs.python.org/ (Thank you to those who reported these regressions.) Enjoy! - -- Georg Brandl, Release Manager georg at python.org (on behalf of the entire python-dev team and all contributors) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iEYEARECAAYFAlGUbJ4ACgkQN9GcIYhpnLDH8ACdEM4k7bobLJsFmCb49zuwQR3W EjgAoIWAOFNhJNdTAWEGSWqFWUP20wrb =YnPr -----END PGP SIGNATURE----- From benhoyt at gmail.com Thu May 16 07:18:09 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Thu, 16 May 2013 17:18:09 +1200 Subject: [Python-Dev] [RELEASED] Python 2.7.5 In-Reply-To: References: Message-ID: Thanks, Benjamin -- that's great! This may not be a python-dev question exactly. But on Windows, is it safe to update to 2.7.5 on top of 2.7.4 (at C:\Python27) using the .msi installer? In other words, will it update/add/remove all the files correctly? What if python.exe is running? -Ben On Thu, May 16, 2013 at 4:19 PM, Benjamin Peterson wrote: > It is my greatest pleasure to announce the release of Python 2.7.5. > > 2.7.5 is the latest maintenance release in the Python 2.7 series. You may > be > surprised to hear from me so soon, as Python 2.7.4 was released slightly > more > than a month ago. As it turns out, 2.7.4 had several regressions and > incompatibilities with 2.7.3. Among them were regressions in the zipfile, > gzip, > and logging modules. 2.7.5 fixes these. In addition, a data file for > testing in > the 2.7.4 tarballs and binaries aroused the suspicion of some virus > checkers. 
The 2.7.5 release removes this file to resolve that issue. > > For details, see the Misc/NEWS file in the distribution or view it at > > http://hg.python.org/cpython/file/ab05e7dd2788/Misc/NEWS > > Downloads are at > > http://python.org/download/releases/2.7.5/ > > As always, please report bugs to > > http://bugs.python.org/ > > (Thank you to those who reported these bugs in 2.7.4.) > > This is a production release. > > Happy May, > Benjamin Peterson > 2.7 Release Manager > (on behalf of all of Python 2.7's contributors) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/benhoyt%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Thu May 16 10:14:17 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Thu, 16 May 2013 04:14:17 -0400 Subject: [Python-Dev] [RELEASED] Python 2.7.5 In-Reply-To: References: Message-ID: On 5/16/2013 1:18 AM, Ben Hoyt wrote: > Thanks, Benjamin -- that's great! > > This may not be a python-dev question exactly. But on Windows, is it > safe to update to 2.7.5 on top of 2.7.4 (at C:\Python27) using the .msi > installer? In other words, will it update/add/remove all the files > correctly? What if python.exe is running? Yes, I update all the time, but without python running. From benhoyt at gmail.com Thu May 16 10:22:25 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Thu, 16 May 2013 20:22:25 +1200 Subject: [Python-Dev] [RELEASED] Python 2.7.5 In-Reply-To: References: Message-ID: This may not be a python-dev question exactly. But on Windows, is it > safe to update to 2.7.5 on top of 2.7.4 (at C:\Python27) using the .msi >> installer? In other words, will it update/add/remove all the files >> correctly? What if python.exe is running? >> > > Yes, I update all the time, but without python running. 
Great to know -- thanks. -Ben From benhoyt at gmail.com Thu May 16 10:42:25 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Thu, 16 May 2013 20:42:25 +1200 Subject: [Python-Dev] [RELEASED] Python 2.7.5 In-Reply-To: References: Message-ID: >> Yes, I update all the time, but without python running. FYI, I tried this just now with Python 2.7.4 running, and the installer nicely tells you that "some files that need to be updated are currently in use ... the following applications are using files, please close them and click Retry ... python.exe (Process Id: 5388)". So you can't do it while python.exe is running, but at least it notifies you and gives you the option to retry. Good work, whoever did this installer. From storchaka at gmail.com Thu May 16 12:15:46 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 16 May 2013 13:15:46 +0300 Subject: [Python-Dev] [RELEASED] Python 3.2.5 and Python 3.3.2 In-Reply-To: <51946C9E.10607@python.org> References: <51946C9E.10607@python.org> Message-ID: 16.05.13 08:20, Georg Brandl wrote: > On behalf of the Python development team, I am pleased to announce the > releases of Python 3.2.5 and 3.3.2. > > The releases fix a few regressions in 3.2.4 and 3.3.1 in the zipfile, gzip > and xml.sax modules. Details can be found in the changelogs: It seems that I'm the main culprit of these releases. From martin at v.loewis.de Thu May 16 12:23:12 2013 From: martin at v.loewis.de (Martin v. Löwis) Date: Thu, 16 May 2013 12:23:12 +0200 Subject: [Python-Dev] [RELEASED] Python 2.7.5 In-Reply-To: References: Message-ID: <5194B390.6060700@v.loewis.de> On 16.05.13 10:42, Ben Hoyt wrote: > FYI, I tried this just now with Python 2.7.4 running, and the > installer nicely tells you that "some files that need to be updated > are currently in use ... 
the following applications are using files, > please close them and click Retry ... python.exe (Process Id: 5388)". > > So you can't do it while python.exe is running, but at least it > notifies you and gives you the option to retry. Good work, whoever did > this installer. This specific feature is part of the MSI technology itself, so the honor goes to Microsoft in this case. They also have an advanced feature where the installer can tell the running application to terminate, and then restart after installation (since Vista, IIRC). Unfortunately, this doesn't apply to Python, as a "safe restart" is typically not feasible. FWIW, I'm the one who put together the Python installer. Regards, Martin From cf.natali at gmail.com Thu May 16 13:24:36 2013 From: cf.natali at gmail.com (Charles-François Natali) Date: Thu, 16 May 2013 13:24:36 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.5 and Python 3.3.2 In-Reply-To: References: <51946C9E.10607@python.org> Message-ID: 2013/5/16 Serhiy Storchaka : > 16.05.13 08:20, Georg Brandl wrote: >> >> On behalf of the Python development team, I am pleased to announce the >> releases of Python 3.2.5 and 3.3.2. >> >> The releases fix a few regressions in 3.2.4 and 3.3.1 in the zipfile, gzip >> and xml.sax modules. Details can be found in the changelogs: > > > It seems that I'm the main culprit of these releases. Well, when I look at the changelogs, what strikes me more is that you're the author of *many* fixes, and also a lot of new features/improvements. So I wouldn't feel bad if I were you, this kind of thing happens (and it certainly did to me). 
Cheers, Charles From benjamin at python.org Thu May 16 17:04:56 2013 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 16 May 2013 10:04:56 -0500 Subject: [Python-Dev] [RELEASED] Python 3.2.5 and Python 3.3.2 In-Reply-To: References: <51946C9E.10607@python.org> Message-ID: 2013/5/16 Serhiy Storchaka : > 16.05.13 08:20, Georg Brandl wrote: > >> On behalf of the Python development team, I am pleased to announce the >> releases of Python 3.2.5 and 3.3.2. >> >> The releases fix a few regressions in 3.2.4 and 3.3.1 in the zipfile, gzip >> and xml.sax modules. Details can be found in the changelogs: > > > It seems that I'm the main culprit of these releases. You've now passed your Python-dev initiation. -- Regards, Benjamin From barry at python.org Thu May 16 17:40:02 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 16 May 2013 11:40:02 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: References: <20130515165808.0d99a3df@anarchist> Message-ID: <20130516114002.1d584f79@anarchist> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 On May 15, 2013, at 06:06 PM, Tres Seaver wrote: >On 05/15/2013 04:58 PM, Barry Warsaw wrote: >> This leads me to hypothesize that the bug is due to an as yet >> unidentified race condition during installation of Python source code >> on Ubuntu, which is normally when we automatically byte compile the >> source to .pyc files. > >Any chance you are using 'detox' or the equivalent to run tests on >multiple interpreters in parallel? The only "bad marshal data" errors I >have seen lately seemed to be provoked by that kind of practice. Nope. PyPI's detox isn't even available in Ubuntu currently. (The detox package in Ubuntu is something else.) Tests should only be run at package build time, not installation time, and the byte compiling of source files at installation time *should* be single threaded and single process. 
We've since found a few cases where Python 3.3 pyc files are probably corrupted, so that shoots down my theory about a race condition on reading/writing pyc files, since 3.3 implements atomic-rename and *should* be immune to that kind of thing. It's still a mystery though. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAEBCAAGBQJRlP3SAAoJEBJutWOnSwa/Xy8QAI6Ul/s0nF+rac8nGy6fieLB FHHEfmjIgj6MkyUUw/zcbR48ELiOPkkV+GM4HJnY/H2ZG9vZqwuQYWFI0cgIBmxj EmHfKK7OTdaaHZeNTt83RDwGRLUS/gYXQ7JVikGyFnSbftfmUoN/y/ndlX5DX1hT ecDHVtXCH/ti/kcOWe2OlMABONZQPW0qYB7/0PiCCmaOxulqUsz20Ofy8SfWmSPd Rbig5i8fSnI98dkLVUzyy1tbUkdRkLBro/hawu1V9y7qVkoYx1Jz6p8XkQLp3jES m22m+6CLrnD39HxvJGGNkIaYmu5xTW+rK/Li8OrfOKx6QVIZ+XQRJFkiXnKmiezk sMYv/psySWJ/BSImsQOSt/sLHJWAGh8fkMIBpx9tI3BWMvyMkI0Hs9l7JyQn0moo oSTNb9AbgRSrkh0rVv4fhOhd1Ir3LXYTGwwYE5+o7tB/Pp0AKi2tX/XTBctDpy86 xqNHOaCV0hRA2Y+/C2QAAA7LRruP0yv10DfkciVUHR7UzbXgViICEEUizGmnkni7 utGg9EDk5VcSeg2ySxhX9Uj3E2M/ijOuYpXUJ2Gwd4UNUT5XGK1+6i2JTO2pEQOM HqhsGqk4WJsfEBTIrAt4NSxZyEuQ2nRV3MIsNaVCDp1FDySZWt3Cckq8hkZ/6vOM 7ncE6aG1cJgq4WKErvCk =BMFW -----END PGP SIGNATURE----- From barry at python.org Thu May 16 17:42:30 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 16 May 2013 11:42:30 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: References: <20130515165808.0d99a3df@anarchist> Message-ID: <20130516114230.507eefd0@anarchist> On May 16, 2013, at 08:33 AM, Nick Coghlan wrote: >Personally, I would be suspicious of developmental web services doing >auto-reloading while an installer is recompiling the world. I don't have >enough context to be sure how plausible that is as a possible explanation, >though. It's possible that some Python written system service is getting invoked during the bytecompile-on-installation phase. But now that we've found cases of this problem with Python 3.3, I'm less inclined to follow that line of reasoning since it implements atomic rename. 
Unless there's a flaw in importlib we haven't identified yet. -Barry From christian at python.org Thu May 16 18:22:20 2013 From: christian at python.org (Christian Heimes) Date: Thu, 16 May 2013 18:22:20 +0200 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: <20130516114002.1d584f79@anarchist> References: <20130515165808.0d99a3df@anarchist> <20130516114002.1d584f79@anarchist> Message-ID: <519507BC.2080600@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 On 16.05.2013 17:40, Barry Warsaw wrote: > We've since found a few cases where Python 3.3 pyc files are > probably corrupted, so that shoots down my theory about a race > condition on reading/writing pyc files, since 3.3 implements > atomic-rename and *should* be immune to that kind of thing. > > It's still a mystery though. Are you able to reproduce the issue? Perhaps you could use inotify to track down file activity. It shouldn't affect timing much and you can track if more than one process is writing to the same file. 
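Alongside file-activity monitoring, a periodic check for .pyc files that no longer load might help narrow down when corruption appears. What follows is only a rough sketch under stated assumptions: pyc_problem is a hypothetical helper, it relies on importlib.util.MAGIC_NUMBER (Python 3.4+), and the header size is an assumption tied to the running interpreter rather than to the Python versions discussed in this thread. It surfaces the same two failure modes reported above (EOFError and ValueError from marshal).

```python
import importlib.util
import marshal
import sys

# Header layout is an assumption tied to the running interpreter:
# 16 bytes on CPython 3.7+, 12 bytes on 3.3 through 3.6.
HEADER_SIZE = 16 if sys.version_info >= (3, 7) else 12


def pyc_problem(path):
    """Return a description of what is wrong with a .pyc, or None if it loads."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) < HEADER_SIZE:
        return "truncated header"
    if data[:4] != importlib.util.MAGIC_NUMBER:
        return "magic number mismatch"
    try:
        # Unmarshal the code object that follows the header.
        marshal.loads(data[HEADER_SIZE:])
    except (EOFError, ValueError, TypeError) as exc:
        return "bad marshal data: %s" % exc
    return None
```

A magic number mismatch only means the file was written by a different interpreter version, so a real checker would need to treat that case separately from genuine corruption.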
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iQIcBAEBCAAGBQJRlQe4AAoJEMeIxMHUVQ1F7K8P/1QoH8/iJP60dHQHfU12AYFY nUu1ztRmwTSx0eYEumwsRF5iUuNse7kFzYM0u02lEmZyuk34hLtBBcnNGA0wJ4mZ SdiXL1ZA2levg9Qlr8cPQqgnlm9aXnIazQKbUJ+/MOGBdTPBemMunMyMSsNg5ENT gLEVb/lufNssAoo+M0QKq9EjE2xSQEsFjUDM575KHbq006EzdHp7on2xQ20pJzLc iq/qWAFh+kjS42Udk9luvAKy3iGJcGXnG9AY0hkLBh8tQYhISWplsBo5wiigZLyv PZ0tbh5h3bsi80FjDlSfVPFOzlt34xI6tPRONUj/XWLPfvBCGzwqjYGc5Z6+CkAF pPAamF0ntwq76mNBl9EABAY1q85SgEU+toft8KQdxm+SHuKINJc95R8x6ypsnYIQ Ol+L5nUy+zV3vCJe9TM2U2cUB/UWHLM0qGTSYowLqTXtv+1Y+J55g63kOLkfCrnF znVXMU5FMotlh6i1rK/uwBttJ+NdjOTL0+eVbVqm39bBA6PU7UgANNNSIWVPCbfu HwucdVwkY932TxiVpWZBSPVLQmjNHIOlVj8uFIkhBnEeWSYkpIu+wV4f+Gc048AP 7EYYMMHTdlodNdKRlr+ksczhJvO67STjKH0a+vB2fro/wjuxwcqD7A0qG0PfvURg 7vofW0fs9lKap0wk9DsT =v709 -----END PGP SIGNATURE----- From barry at python.org Thu May 16 18:38:27 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 16 May 2013 12:38:27 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: <519507BC.2080600@python.org> References: <20130515165808.0d99a3df@anarchist> <20130516114002.1d584f79@anarchist> <519507BC.2080600@python.org> Message-ID: <20130516123827.65a22866@anarchist> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 On May 16, 2013, at 06:22 PM, Christian Heimes wrote: >Are you able to reproduce the issue? Perhaps you could use inotify to >track down file activity. It shouldn't affect timing much and you can >track if more than one process it writing to the same file. Sadly, no. I've never seen it on any of my own desktops or servers, and none of the bug reporters have been able to provide a reproducible test case. I tried some smaller test cases to see if I could trigger pyc writing race conditions, but failed to reproduce it there either. 
- -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAEBCAAGBQJRlQuDAAoJEBJutWOnSwa/03AP/04BxVqqm0nrGIdtuKDej3MW 4qhXxbYExgpcFpclesFw479TatGCh3hBwDoosrYdk5Lrf8Bwa9FGUNbRJozdAGmE wAB1vq30mGm+2QtBuVPoXu3xrNWGGmUUtI0yzBSwnvxlfuIzsLkibZPMIMfdVCi+ f3/LSldowWx0DJp8V5TUq4GIhOfe3yccgIxMU55YbDj8cplzFJuBuBtO4DOGsoFI IPlFLwGPG503Nju37zzdkoq3Xkw4Og+vXtXsCv/rhAWIqnZgKYNF/CLv0dolZWFy GhjM5bfQtUWwxH6Ng1Wl2kcuCVmF1/vD2vTUCsgpA4qQc0nYrTy/q1OPho72x40o DvvaVHueDqH7N1xm64KL75sFxu6QDIniBbgV7gklU1z6P6ZVADwoilon8HC9FnJN w5I0sYLTnIHxUIrM0h0wi517gQTZHTSF0bQxKqynNV+PrZBprvB9lEkYCpy5tV0s LEqf+oUwXvGIOZ6Nmv2MyjQb0xajxHmzz+RO1qQ3R4tbiQjwGoqc43CrlxhVduJh 1VGM6b7ysZ2iwyJG+q0aVi9YSaStzzUvMPO2F+HTmE+r3MvgdTcKQQzLDuRF6LfV 74eWwtHBpiJuvdBG37uDQj5bU/oLWiYyfM52vASgHB4zoKOx0EUxAd1Wf5nyxc1E Bo0G3kYwbFaNvSnwcJZw =a4x0 -----END PGP SIGNATURE----- From ethan at stoneleaf.us Thu May 16 18:44:55 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 16 May 2013 09:44:55 -0700 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: <20130516123827.65a22866@anarchist> References: <20130515165808.0d99a3df@anarchist> <20130516114002.1d584f79@anarchist> <519507BC.2080600@python.org> <20130516123827.65a22866@anarchist> Message-ID: <51950D07.2050107@stoneleaf.us> On 05/16/2013 09:38 AM, Barry Warsaw wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA256 > > On May 16, 2013, at 06:22 PM, Christian Heimes wrote: > >> Are you able to reproduce the issue? Perhaps you could use inotify to >> track down file activity. It shouldn't affect timing much and you can >> track if more than one process it writing to the same file. > > Sadly, no. I've never seen it on any of my own desktops or servers, and none > of the bug reporters have been able to provide a reproducible test case. I > tried some smaller test cases to see if I could trigger pyc writing race > conditions, but failed to reproduce it there either. Is it happening on the same machines? 
If so, perhaps a daemon to monitor those files and then scream and shout when one changes. Might help track down what's going on at the time. (Yeah, that does sound like saying 'inotify' but with more words...) -- ~Ethan~ From barry at python.org Thu May 16 20:04:57 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 16 May 2013 14:04:57 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: <51950D07.2050107@stoneleaf.us> References: <20130515165808.0d99a3df@anarchist> <20130516114002.1d584f79@anarchist> <519507BC.2080600@python.org> <20130516123827.65a22866@anarchist> <51950D07.2050107@stoneleaf.us> Message-ID: <20130516140457.288859bd@anarchist> On May 16, 2013, at 09:44 AM, Ethan Furman wrote: >Is it happening on the same machines? If so, perhaps a daemon to monitor >those files and then scream and shout when one changes. Might help track >down what's going on at the time. (Yeah, that does sound like saying >'inotify' but with more words...) No, it's all different kinds of machines, at different times, on different files. So far, there's no rhyme or reason to the corruptions that I can tell. We're trying to instrument things to collect more data when these failures do occur. -Barry From greg at krypto.org Thu May 16 22:37:38 2013 From: greg at krypto.org (Gregory P. Smith) Date: Thu, 16 May 2013 13:37:38 -0700 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: <20130516140457.288859bd@anarchist> References: <20130515165808.0d99a3df@anarchist> <20130516114002.1d584f79@anarchist> <519507BC.2080600@python.org> <20130516123827.65a22866@anarchist> <51950D07.2050107@stoneleaf.us> <20130516140457.288859bd@anarchist> Message-ID: On Thu, May 16, 2013 at 11:04 AM, Barry Warsaw wrote: > On May 16, 2013, at 09:44 AM, Ethan Furman wrote: > > >Is it happening on the same machines? If so, perhaps a daemon to monitor > >those files and then scream and shout when one changes. 
Might help track
> >down what's going on at the time.  (Yeah, that does sound like saying
> >'inotify' but with more words...)
>
> No, it's all different kinds of machines, at different times, on different
> files.  So far, there's no rhyme or reason to the corruptions that I can
> tell.  We're trying to instrument things to collect more data when these
> failures do occur.
>

Even on machines with ECC ram and reliable storage, not owned by l33t
gam0rzs weenies who overclock things?

-gps

From tjreedy at udel.edu Thu May 16 22:52:19 2013
From: tjreedy at udel.edu (Terry Jan Reedy)
Date: Thu, 16 May 2013 16:52:19 -0400
Subject: [Python-Dev] Mysterious Python pyc file corruption problems
In-Reply-To: <20130516140457.288859bd@anarchist>
References: <20130515165808.0d99a3df@anarchist> <20130516114002.1d584f79@anarchist> <519507BC.2080600@python.org> <20130516123827.65a22866@anarchist> <51950D07.2050107@stoneleaf.us> <20130516140457.288859bd@anarchist>
Message-ID:

On 5/16/2013 2:04 PM, Barry Warsaw wrote:

> No, it's all different kinds of machines, at different times, on different
> files.  So far, there's no rhyme or reason to the corruptions that I can
> tell.

If the corruption only happens on Ubuntu, that would constitute 'rhyme'
;-). I realize that asking for reports on other systems is part of the
reason you posted, but I don't remember seeing any others yet.

> We're trying to instrument things to collect more data when these
> failures do occur.

Do failures only occur during the compileall process (or whatever
substitute you use)?

At the end of py_compile.compile, after the with block that opens,
writes, flushes, and closes, you could add

     with open(cfile, 'rb') as fc:

This would be a high-level write and verify. Verify would be a bit
faster if marshal.dump were replaced by marshal.dumps + write to keep
alive the string version of the code object.
Then the code object comparison
in the verify step would be replaced by string comparison.

You could also read and verify (by unmarshal) after the compile-all
process (faster than importing).

Terry

From guido at python.org Thu May 16 23:19:14 2013
From: guido at python.org (Guido van Rossum)
Date: Thu, 16 May 2013 14:19:14 -0700
Subject: [Python-Dev] Mysterious Python pyc file corruption problems
In-Reply-To: <20130515165808.0d99a3df@anarchist>
References: <20130515165808.0d99a3df@anarchist>
Message-ID:

This reminds me of the following bug, which can happen when two
processes are both writing the .pyc file and a third is reading it.
First some background.

When writing a .pyc file, we use the following strategy:
- open the file for writing
- write a dummy header (four null bytes)
- write the .py file's mtime
- write the marshalled code object
- replace the dummy header with the correct magic word

Even py_compile.py (used by compileall.py) uses this strategy.

When reading a .pyc file, we ignore it when the magic word isn't there
(or when the mtime doesn't match that of the .py file exactly), and
then we will write it back like described above.

Now consider the following scenario. It involves *three* processes.

- Two unrelated processes both start and want to import the same module.
- They both see the .pyc file is missing/corrupt and decide to write it.
- The first process finishes writing the file, writing the correct header.
- Now a third process wants to import the module, sees the valid
header, and starts reading the file.
- However, while this is going on, the second process gets ready to
write the file.
- The second process truncates the file, writes the dummy header, and
then stalls.
- At this point the third process (which thought it was reading a
valid file) sees an unexpected EOF because the file has been
truncated.

Now, this would explain the EOFError, but not necessarily the
ValueError with "unknown type code".
However, it looks like marshal
doesn't always check for EOF immediately (sometimes it calls getc()
without checking the result, and sometimes it doesn't check the error
state after calling r_string()), so I think all the errors are
actually explainable from this scenario.

--
--Guido van Rossum (python.org/~guido)

From brett at python.org Thu May 16 23:30:26 2013
From: brett at python.org (Brett Cannon)
Date: Thu, 16 May 2013 17:30:26 -0400
Subject: [Python-Dev] Mysterious Python pyc file corruption problems
In-Reply-To:
References: <20130515165808.0d99a3df@anarchist>
Message-ID:

On Thu, May 16, 2013 at 5:19 PM, Guido van Rossum wrote:
> This reminds me of the following bug, which can happen when two
> processes are both writing the .pyc file and a third is reading it.
> First some background.
>
> When writing a .pyc file, we use the following strategy:
> - open the file for writing
> - write a dummy header (four null bytes)
> - write the .py file's mtime
> - write the marshalled code object
> - replace the dummy header with the correct magic word
>

Just so people know, this is how we used to do it. In importlib we
write the entire file to a temp file and then do an atomic rename.

> Even py_compile.py (used by compileall.py) uses this strategy.

py_compile as of Python 3.4 now just uses importlib directly, so it
matches its semantics.

-Brett

>
> When reading a .pyc file, we ignore it when the magic word isn't there
> (or when the mtime doesn't match that of the .py file exactly), and
> then we will write it back like described above.
>
> Now consider the following scenario. It involves *three* processes.
>
> - Two unrelated processes both start and want to import the same module.
> - They both see the .pyc file is missing/corrupt and decide to write it.
> - The first process finishing writing the file, writing the correct header.
> - Now a third process wants to import the module, sees the valid
> header, and starts reading the file.
> - However, while this is going on, the second process gets ready to
> write the file.
> - The second process truncates the file, writes the dummy header, and
> then stalls.
> - At this point the third process (which thought it was reading a
> valid file) sees an unexpected EOF because the file has been
> truncated.
>
> Now, this would explain the EOFError, but not necessarily the
> ValueError with "unknown type code". However, it looks like marshal
> doesn't always check for EOF immediately (sometimes it calls getc()
> without checking the result, and sometimes it doesn't check the error
> state after calling r_string()), so I think all the errors are
> actually explainable from this scenario.
>
> --
> --Guido van Rossum (python.org/~guido)

From gvanrossum at gmail.com Thu May 16 23:40:07 2013
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Thu, 16 May 2013 14:40:07 -0700 (PDT)
Subject: [Python-Dev] Mysterious Python pyc file corruption problems
In-Reply-To:
References:
Message-ID: <1368740407310.26db1622@Nodemailer>

I still suspect this might explain most of what Barry saw, if not all.

On Thu, May 16, 2013 at 2:36 PM, Brett Cannon wrote:

> On Thu, May 16, 2013 at 5:19 PM, Guido van Rossum wrote:
>> This reminds me of the following bug, which can happen when two
>> processes are both writing the .pyc file and a third is reading it.
>> First some background.
>>
>> When writing a .pyc file, we use the following strategy:
>> - open the file for writing
>> - write a dummy header (four null bytes)
>> - write the .py file's mtime
>> - write the marshalled code object
>> - replace the dummy header with the correct magic word
>>
> Just so people know, this is how we used to do it.
In importlib we > write the entire file to a temp file and then to an atomic rename. >> Even py_compile.py (used by compileall.py) uses this strategy. > py_compile as of Python 3.4 now just uses importlib directly, so it > matches its semantics. > -Brett >> >> When reading a .pyc file, we ignore it when the magic word isn't there >> (or when the mtime doesn't match that of the .py file exactly), and >> then we will write it back like described above. >> >> Now consider the following scenario. It involves *three* processes. >> >> - Two unrelated processes both start and want to import the same module. >> - They both see the .pyc file is missing/corrupt and decide to write it. >> - The first process finishing writing the file, writing the correct header. >> - Now a third process wants to import the module, sees the valid >> header, and starts reading the file. >> - However, while this is going on, the second process gets ready to >> write the file. >> - The second process truncates the file, writes the dummy header, and >> then stalls. >> - At this point the third process (which thought it was reading a >> valid file) sees an unexpected EOF because the file has been >> truncated. >> >> Now, this would explain the EOFError, but not necessarily the >> ValueError with "unknown type code". However, it looks like marshal >> doesn't always check for EOF immediately (sometimes it calls getc() >> without checking the result, and sometimes it doesn't check the error >> state after calling r_string()), so I think all the errors are >> actually explainable from this scenario. >> >> -- >> --Guido van Rossum (python.org/~guido) >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: http://mail.python.org/mailman/options/python-dev/brett%40python.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brett at python.org Thu May 16 23:43:46 2013 From: brett at python.org (Brett Cannon) Date: Thu, 16 May 2013 17:43:46 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: <1368740407310.26db1622@Nodemailer> References: <1368740407310.26db1622@Nodemailer> Message-ID: On Thu, May 16, 2013 at 5:40 PM, Guido van Rossum wrote: > I still suspect this might explain most of what Barry saw, if not all. Quite possible, especially since he is seeing more issues on 3.2 than 3.3. Just wanted to fill people in on how 3.3 onwards does things is all. -Brett > ? > Sent from Mailbox > > > On Thu, May 16, 2013 at 2:36 PM, Brett Cannon wrote: >> >> On Thu, May 16, 2013 at 5:19 PM, Guido van Rossum >> wrote: >> > This reminds me of the following bug, which can happen when two >> > processes are both writing the .pyc file and a third is reading it. >> > First some background. >> > >> > When writing a .pyc file, we use the following strategy: >> >> > - open the file for writing >> > - write a dummy header (four null bytes) >> > - write the .py file's mtime >> > - write the marshalled code object >> > - replace the dummy heaer with the correct magic word >> > >> >> Just so people know, this is how we used to do it. In importlib we >> write the entire file to a temp file and then to an atomic rename. >> >> > Even py_compile.py (used by compileall.py) uses this strategy. >> >> py_compile as of Python 3.4 now just uses importlib directly, so it >> matches its semantics. >> >> -Brett >> >> > >> > When reading a .pyc file, we ignore it when the magic word isn't there >> > (or when the mtime doesn't match that of the .py file exactly), and >> > then we will write it back like described above. >> > >> > Now consider the following scenario. It involves *three* processes. >> > >> > - Two unrelated processes both start and want to import the same module. >> > - They both see the .pyc file is missing/corrupt and decide to write it. 
>> > - The first process finishing writing the file, writing the correct >> > header. >> > - Now a third process wants to import the module, sees the valid >> > header, and starts reading the file. >> > - However, while this is going on, the second process gets ready to >> > write the file. >> > - The second process truncates the file, writes the dummy header, and >> > then stalls. >> > - At this point the third process (which thought it was reading a >> > valid file) sees an unexpected EOF because the file has been >> > truncated. >> > >> > Now, this would explain the EOFError, but not necessarily the >> > ValueError with "unknown type code". However, it looks like marshal >> > doesn't always check for EOF immediately (sometimes it calls getc() >> > without checking the result, and sometimes it doesn't check the error >> > state after calling r_string()), so I think all the errors are >> > actually explainable from this scenario. >> > >> > -- >> > --Guido van Rossum (python.org/~guido) >> > _______________________________________________ >> > Python-Dev mailing list >> > Python-Dev at python.org >> > http://mail.python.org/mailman/listinfo/python-dev >> > Unsubscribe: >> > http://mail.python.org/mailman/options/python-dev/brett%40python.org > > From tjreedy at udel.edu Fri May 17 00:00:49 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Thu, 16 May 2013 18:00:49 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: References: <20130515165808.0d99a3df@anarchist> Message-ID: On 5/16/2013 5:30 PM, Brett Cannon wrote: > On Thu, May 16, 2013 at 5:19 PM, Guido van Rossum wrote: >> This reminds me of the following bug, which can happen when two >> processes are both writing the .pyc file and a third is reading it. >> First some background. 
>> >> When writing a .pyc file, we use the following strategy: > >> - open the file for writing >> - write a dummy header (four null bytes) >> - write the .py file's mtime >> - write the marshalled code object >> - replace the dummy heaer with the correct magic word >> > > Just so people know, this is how we used to do it. In importlib we > write the entire file to a temp file and then to an atomic rename. > >> Even py_compile.py (used by compileall.py) uses this strategy. > > py_compile as of Python 3.4 now just uses importlib directly, so it > matches its semantics. But in 3.3, it still is as Guido describes, even though importlib is improved. From thomas at python.org Fri May 17 00:10:13 2013 From: thomas at python.org (Thomas Wouters) Date: Fri, 17 May 2013 00:10:13 +0200 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: References: <20130515165808.0d99a3df@anarchist> Message-ID: On Thu, May 16, 2013 at 11:19 PM, Guido van Rossum wrote: > This reminds me of the following bug, which can happen when two > processes are both writing the .pyc file and a third is reading it. > First some background. > > When writing a .pyc file, we use the following strategy: > - open the file for writing > - write a dummy header (four null bytes) > - write the .py file's mtime > - write the marshalled code object > - replace the dummy heaer with the correct magic word > > Even py_compile.py (used by compileall.py) uses this strategy. > > When reading a .pyc file, we ignore it when the magic word isn't there > (or when the mtime doesn't match that of the .py file exactly), and > then we will write it back like described above. > > Now consider the following scenario. It involves *three* processes. > > - Two unrelated processes both start and want to import the same module. > - They both see the .pyc file is missing/corrupt and decide to write it. > - The first process finishing writing the file, writing the correct header. 
> - Now a third process wants to import the module, sees the valid > header, and starts reading the file. > - However, while this is going on, the second process gets ready to > write the file. > - The second process truncates the file, writes the dummy header, and > then stalls. > - At this point the third process (which thought it was reading a > valid file) sees an unexpected EOF because the file has been > truncated. > > Now, this would explain the EOFError, but not necessarily the > ValueError with "unknown type code". The 'unknown type codes' can also be explained if the two processes writing to the .pyc files are *different Python versions*. As you may recall, at Google we used to use modified Python interpreters that used '.pyc-2.2', '.pyc-2.4', etc, for the pyc extension. That was because otherwise different Python versions would keep overwriting the .pyc files of shared Python modules, and "at Google scale" it caused all manner of problems... I guess Ubuntu is approaching Google scale ;-) (The decision to rename to an awkward extension broke a lot of third-party tools; it was made before I -- or you, for that matter -- joined Google... Now we just turn on -B by default :) > However, it looks like marshal > doesn't always check for EOF immediately (sometimes it calls getc() > without checking the result, and sometimes it doesn't check the error > state after calling r_string()), so I think all the errors are > actually explainable from this scenario. > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/thomas%40python.org > -- Thomas Wouters Hi! I'm an email virus! Think twice before sending your email to help me spread! -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From greg.ewing at canterbury.ac.nz Fri May 17 00:27:01 2013
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 17 May 2013 10:27:01 +1200
Subject: [Python-Dev] Mysterious Python pyc file corruption problems
In-Reply-To:
References: <20130515165808.0d99a3df@anarchist>
Message-ID: <51955D35.8060909@canterbury.ac.nz>

Guido van Rossum wrote:
> This reminds me of the following bug, which can happen when two
> processes are both writing the .pyc file and a third is reading it.
> ... I think all the errors are
> actually explainable from this scenario.

The second writer will still carry on to write a valid
.pyc file, though, won't it? So this wouldn't result in
a permanently broken .pyc file being left behind, which
is what the original problem description seemed to say
was happening.

--
Greg

From guido at python.org Fri May 17 00:34:41 2013
From: guido at python.org (Guido van Rossum)
Date: Thu, 16 May 2013 15:34:41 -0700
Subject: [Python-Dev] Mysterious Python pyc file corruption problems
In-Reply-To: <51955D35.8060909@canterbury.ac.nz>
References: <20130515165808.0d99a3df@anarchist> <51955D35.8060909@canterbury.ac.nz>
Message-ID:

On Thu, May 16, 2013 at 3:27 PM, Greg Ewing wrote:
> Guido van Rossum wrote:
>>
>> This reminds me of the following bug, which can happen when two
>> processes are both writing the .pyc file and a third is reading it.
>> ... I think all the errors are
>>
>> actually explainable from this scenario.
>
>
> The second writer will still carry on to write a valid
> .pyc file, though, won't it? So this wouldn't result in
> a permanently broken .pyc file being left behind, which
> is what the original problem description seemed to say
> was happening.

From the evidence that is not completely clear to me.

Thomas Wouters' scenario with two different Python versions writing
the same .pyc file could cause that; I don't know if Barry has ruled
that possibility out yet.
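
[Editor's sketch] The temp-file-plus-rename strategy that Brett attributes to importlib earlier in this thread can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not importlib's actual code: the function name `write_atomic` is hypothetical, and it assumes Python 3.3 or later so that `os.replace` is available. The point is only that a concurrent reader sees either the old file or the complete new one, never a truncated file.

```python
import os
import tempfile


def write_atomic(path, data):
    """Write bytes to `path` via a temporary file and a rename.

    Hypothetical helper for illustration; not taken from importlib.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename stays on
    # one filesystem, which is what lets it replace the target in one step.
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(path) + "."
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # Swap the finished file into place over any existing target.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind if anything failed.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

With this shape, two writers racing on the same target each produce a complete file and the last rename wins, so the truncate-then-stall window in the three-process scenario does not arise.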
-- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Fri May 17 00:59:05 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 17 May 2013 08:59:05 +1000 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: References: <20130515165808.0d99a3df@anarchist> <51955D35.8060909@canterbury.ac.nz> Message-ID: On 17 May 2013 08:37, "Guido van Rossum" wrote: > > On Thu, May 16, 2013 at 3:27 PM, Greg Ewing wrote: > > Guido van Rossum wrote: > >> > >> This reminds me of the following bug, which can happen when two > >> processes are both writing the .pyc file and a third is reading it. > >> ... I think all the errors are > >> > >> actually explainable from this scenario. > > > > > > The second writer will still carry on to write a valid > > .pyc file, though, won't it? So this wouldn't result in > > a permanently broken .pyc file being left behind, which > > is what the original problem description seemed say > > was happening. > > From the evidence that is not completely clear to me. > > Thomas Wouters' scenario with two different Python versions writing > the same .pyc file could cause that; I don't know if Barry has ruled > that possibility out yet. 3.2 uses __pycache__, so it should only potentially conflict within the same version. I haven't heard any rumblings about anything like this in Fedora or RHEL, so my suspicions still lean towards a Debian or Ubuntu specific background service somehow managing to interfere. However, I'll ask explicitly on the Fedora Python list to see if anyone has encountered anything similar. Cheers, Nick. > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From newellm at blur.com Fri May 17 01:17:20 2013 From: newellm at blur.com (Matt Newell) Date: Thu, 16 May 2013 16:17:20 -0700 Subject: [Python-Dev] Why is nb_inplace_add copied to sq_inplace_concat? Message-ID: <201305161617.20758.newellm@blur.com> I have encountered what I believe to be a bug but I'm sure there is some reason things are done as they are and I am hoping someone can shed some light or confirm it is indeed a bug. As a bit of background I have a c++ class that I use sip to generate python bindings. The potential python bug manifests itself as: >>> rl = RecordList() >>> rl += [] >>> rl NotImplemented The bindings fill in nb_inplace_add which appears to be implemented properly, returning a new reference to Py_NotImplemented if the right hand argument is not as expected. Where things appear to go wrong is that PyNumber_InPlaceAdd, after getting a NotImplemented return value from nb_inplace_add, then attempts to call sq_inplace_concat. From reading the code it appears that sq_inplace_concat is not supposed to return NotImplemented, instead it should set an exception and return null if the right hand arg is not supported. In my case sq_inplace_concat ends up being the same function as nb_inplace_add, which results in the buggy behavior. When I figured this out I tried to find out why sq_inplace_concat was set to the same function as nb_inplace_add, and ended up having to set a watchpoint in gdb which finally gave me the answer that python itself is setting sq_inplace_concat during type creation in the various functions in typeobject.c. Stack trace is below. I don't really understand what the fixup_slot_dispatchers function is doing, but it does seem like there must be a bug either in what it's doing, or in PyNumber_InPlaceAdd's handling of a NotImplemented return value from sq_inplace_concat. 
Thanks, Matt Python 2.7.3 (default, Jan 2 2013, 13:56:14) [GCC 4.7.2] on linux2 Stack trace where a watch on sq->sq_inplace_concat reveals the change: Hardware watchpoint 5: *(binaryfunc *) 0xcf6f88 Old value = (binaryfunc) 0 New value = (binaryfunc) 0x7ffff4d41c78 #0 update_one_slot.25588 (type=type at entry=0xcf6c70, p=0x86ba90) at ../Objects/typeobject.c:6203 #1 0x00000000004b96d0 in fixup_slot_dispatchers (type=0xcf6c70) at ../Objects/typeobject.c:6299 #2 type_new.part.40 (kwds=, args=0x0, metatype=) at ../Objects/typeobject.c:2464 #3 type_new.25999 (metatype=, args=0x0, kwds=) at ../Objects/typeobject.c:2048 #4 0x0000000000463c08 in type_call.25547 (type=0x7ffff65953a0, args=('RecordList', (,), {'__module__': 'blur.Stone'}), kwds=0x0) at ../Objects/typeobject.c:721 #5 0x00000000004644eb in PyObject_Call (func=, arg=, kw=) at ../Objects/abstract.c:2529 From tjreedy at udel.edu Fri May 17 02:31:50 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 16 May 2013 20:31:50 -0400 Subject: [Python-Dev] [Python-checkins] cpython: fix compilation on Windows In-Reply-To: <3bBP7q4xSRzRkQ@mail.python.org> References: <3bBP7q4xSRzRkQ@mail.python.org> Message-ID: <51957A76.7030601@udel.edu> On 5/16/2013 4:17 PM, victor.stinner wrote: > summary: fix compilation on Windows That fixed my problem with compiling 3.4, 32 bit, Win 7. Thanks. But I cannot compile 3.3 python_d since May 6. In fact, there are more errors now than 8 hours ago. 7 things failed to build instead of 5 (3 is normal for me, given the lack of some dependencies). I believe the following is new. Red error box with .../p33/PCBuild/make_versioninfo_d.exe is not a valid Win32 application. The VS gui output box has "Please verify that you have sufficient rights to run this command." 
Some more errors:

10>..\PC\pylauncher.rc(16): error RC2104: undefined keyword or key name: FIELD3
10>
9>  symtable.c
9>..\Python\symtable.c(1245): error C2143: syntax error : missing ';' before 'type'
9>..\Python\symtable.c(1246): error C2065: 'cur' : undeclared identifier
9>..\Python\symtable.c(1248): error C2065: 'cur' : undeclared identifier
9>..\Python\symtable.c(1253): error C2065: 'cur' : undeclared identifier
23>  Traceback (most recent call last):
23>    File "build_ssl.py", line 253, in
23>      main()
23>    File "build_ssl.py", line 187, in main
23>      os.chdir(ssl_dir)
23>  FileNotFoundError: [WinError 2] The system cannot find the file specified: '..\\..\\openssl-1.0.1e'

Earlier, about 4 other files had several warnings. I do not see the
warnings now because they compiled then and have not changed.

Errors are more urgent, but should warnings be ignored?

Terry

From ncoghlan at gmail.com Fri May 17 05:38:21 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 17 May 2013 13:38:21 +1000
Subject: [Python-Dev] Why is nb_inplace_add copied to sq_inplace_concat?
In-Reply-To: <201305161617.20758.newellm@blur.com>
References: <201305161617.20758.newellm@blur.com>
Message-ID:

On Fri, May 17, 2013 at 9:17 AM, Matt Newell wrote:
> I don't really understand what the fixup_slot_dispatchers function is doing,
> but it does seem like there must be a bug either in what it's doing, or in
> PyNumber_InPlaceAdd's handling of a NotImplemented return value from
> sq_inplace_concat.

I didn't read your post in detail, but operand precedence in CPython
is known to be broken for types which only populate the sq_* slots
without also populating the corresponding nb_* slots:
http://bugs.python.org/issue11477

The bug doesn't affect types implemented in Python, as the interpreter
always populates both slots (I believe Cython also populates both
slots for types defined that way).
I made one attempt at fixing it (by changing the fallback handling in abstract.c) but it turned out to be completely unmaintainable (and didn't really work right anyway). There's another suggested approach that would likely work better (automatically populating the nb_* slots with delegation wrappers and losing the fallback code in abstract.c entirely), but it still needs a patch (the test cases from my failed attempt may still prove useful, though). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri May 17 05:41:32 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 17 May 2013 13:41:32 +1000 Subject: [Python-Dev] Why is nb_inplace_add copied to sq_inplace_concat? In-Reply-To: References: <201305161617.20758.newellm@blur.com> Message-ID: On Fri, May 17, 2013 at 1:38 PM, Nick Coghlan wrote: > On Fri, May 17, 2013 at 9:17 AM, Matt Newell wrote: >> I don't really understand what the fixup_slot_dispatchers function is doing, >> but it does seem like there must be a bug either in what it's doing, or in >> PyNumber_InPlaceAdd's handling of a NotImplemented return value from >> sq_inplace_concat. > > I didn't read your post in detail, but operand precedence in CPython > is known to be broken for types which only populate the sq_* slots > without also populating the corresponding nb_* slots: > http://bugs.python.org/issue11477 Oops, I meant to state that one of the consequences of the bug is that returning NotImplemented from the sq_* methods doesn't work at all - it's never checked and thus never turned into a TypeError. That's why changing to delegation from the nb_* slots is the most promising approach - all that handling is there and correct for the numeric types, but pure sequence types (which can only be created from C code) bypass that handling. I *did* read enough of the original post to know that was the symptom you were seeing, I just failed to mention that in my initial reply... Cheers, Nick. 
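
[Editor's sketch] The symptom discussed in this sub-thread can be modelled in a few lines of Python. This is a deliberately simplified toy, not CPython's code: `Slots`, `inplace_add` and `strict_iadd` are hypothetical names, and the model ignores the further fallbacks to the non-in-place slots. It only illustrates the order Matt describes (numeric slot first, then the sequence slot) and the point Nick makes, that the sequence slot's return value is passed back without being checked for NotImplemented.

```python
class Slots:
    """Hypothetical stand-in for a C type's method slots (None = empty)."""

    def __init__(self, nb_inplace_add=None, sq_inplace_concat=None):
        self.nb_inplace_add = nb_inplace_add
        self.sq_inplace_concat = sq_inplace_concat


def inplace_add(slots, v, w):
    """Toy model of the fallback order described for PyNumber_InPlaceAdd."""
    if slots.nb_inplace_add is not None:
        result = slots.nb_inplace_add(v, w)
        if result is not NotImplemented:
            return result
    if slots.sq_inplace_concat is not None:
        # The sequence slot's result is handed back unchecked, so a
        # NotImplemented returned here leaks out to the caller.
        return slots.sq_inplace_concat(v, w)
    raise TypeError("unsupported operand type(s) for +=")


def strict_iadd(v, w):
    """Extend v only when w is a tuple; otherwise decline the operation."""
    if not isinstance(w, tuple):
        return NotImplemented
    v.extend(w)
    return v


# Same function in both slots, as happens after the slot is copied.
copied = Slots(nb_inplace_add=strict_iadd, sq_inplace_concat=strict_iadd)
# Sequence slot left empty, so the numeric handling gets the last word.
cleared = Slots(nb_inplace_add=strict_iadd)
```

In this model, `inplace_add(copied, [], "x")` returns NotImplemented (the reported symptom), while `inplace_add(cleared, [], "x")` raises TypeError.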
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From tseaver at palladion.com Fri May 17 05:48:24 2013
From: tseaver at palladion.com (Tres Seaver)
Date: Thu, 16 May 2013 23:48:24 -0400
Subject: [Python-Dev] Mysterious Python pyc file corruption problems
In-Reply-To:
References: <20130515165808.0d99a3df@anarchist> <51955D35.8060909@canterbury.ac.nz>
Message-ID:

On 05/16/2013 06:59 PM, Nick Coghlan wrote:
> 3.2 uses __pycache__, so it should only potentially conflict within
> the same version.
>
> I haven't heard any rumblings about anything like this in Fedora or
> RHEL, so my suspicions still lean towards a Debian or Ubuntu specific
> background service somehow managing to interfere. However, I'll ask
> explicitly on the Fedora Python list to see if anyone has encountered
> anything similar.

I can confirm at least that I have seen this problem within the last two weeks on Ubuntu boxes unrelated to the Debian / Ubuntu build infrastructure.

Tres.

--
===================================================================
Tres Seaver +1 540-429-0999 tseaver at palladion.com
Palladion Software "Excellence by Design" http://palladion.com

From newellm at blur.com Fri May 17 06:10:54 2013
From: newellm at blur.com (Matt Newell)
Date: Thu, 16 May 2013 21:10:54 -0700
Subject: [Python-Dev] Why is nb_inplace_add copied to sq_inplace_concat?
In-Reply-To: References: <201305161617.20758.newellm@blur.com> Message-ID: <201305162110.54756.newellm@blur.com> On Thursday, May 16, 2013 08:41:32 PM you wrote: > On Fri, May 17, 2013 at 1:38 PM, Nick Coghlan wrote: > > On Fri, May 17, 2013 at 9:17 AM, Matt Newell wrote: > >> I don't really understand what the fixup_slot_dispatchers function is > >> doing, but it does seem like there must be a bug either in what it's > >> doing, or in PyNumber_InPlaceAdd's handling of a NotImplemented return > >> value from sq_inplace_concat. > > > > I didn't read your post in detail, but operand precedence in CPython > > is known to be broken for types which only populate the sq_* slots > > without also populating the corresponding nb_* slots: > > http://bugs.python.org/issue11477 In this case it's the other way around. Only nb_inplace_add is populated, and python forces the buggy behavior that you describe below by copying nb_inplace_add to sq_inplace_concat. > > Oops, I meant to state that one of the consequences of the bug is that > returning NotImplemented from the sq_* methods doesn't work at all - > it's never checked and thus never turned into a TypeError. That's why > changing to delegation from the nb_* slots is the most promising > approach - all that handling is there and correct for the numeric > types, but pure sequence types (which can only be created from C code) > bypass that handling. > > I *did* read enough of the original post to know that was the symptom > you were seeing, I just failed to mention that in my initial reply... > I read through the bug and it looks like whatever solution you choose will fix this problem also. In the meantime I guess the solution for me is to always define sq_inplace_concat with a function that simply raises a TypeError. Hmm, even simpler would be to reset sq_inplace_concat to 0 after python sets it. I actually tested the latter in gdb and it gave the correct results. 
I'll just have to keep an eye out to make sure my workaround doesn't break things when the real fix gets into python.

Matt

From solipsis at pitrou.net Fri May 17 08:47:54 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 17 May 2013 08:47:54 +0200
Subject: [Python-Dev] Mysterious Python pyc file corruption problems
References: <20130515165808.0d99a3df@anarchist> <20130516114230.507eefd0@anarchist>
Message-ID: <20130517084754.61878b28@fsol>

On Thu, 16 May 2013 11:42:30 -0400 Barry Warsaw wrote:
> On May 16, 2013, at 08:33 AM, Nick Coghlan wrote:
>
> >Personally, I would be suspicious of developmental web services doing
> >auto-reloading while an installer is recompiling the world. I don't have
> >enough context to be sure how plausible that is as a possible explanation,
> >though.
>
> It's possible that some Python written system service is getting invoked
> during the bytecompile-on-installation phase. But now that we've found cases
> of this problem with Python 3.3, I'm less inclined to follow that line of
> reasoning since it implements atomic rename.

Please try to reproduce it by adding e.g. some sleep() calls in the middle of the writing routine.

Regards

Antoine.

From arigo at tunes.org Fri May 17 09:36:00 2013
From: arigo at tunes.org (Armin Rigo)
Date: Fri, 17 May 2013 09:36:00 +0200
Subject: [Python-Dev] Mysterious Python pyc file corruption problems
In-Reply-To: <20130517084754.61878b28@fsol>
References: <20130515165808.0d99a3df@anarchist> <20130516114230.507eefd0@anarchist> <20130517084754.61878b28@fsol>
Message-ID:

Hi all,

How about using the shared-or-exclusive advisory file locks (with flock() or fcntl())? It may only work on Posix though.

A bientôt,

Armin.

From solipsis at pitrou.net Fri May 17 15:01:19 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Fri, 17 May 2013 15:01:19 +0200
Subject: [Python-Dev] HAVE_FSTAT?
Message-ID: <20130517150119.01077496@pitrou.net> Hello, Some pieces of code are still guarded by: #ifdef HAVE_FSTAT ... #endif I would expect all systems to have fstat() these days. It's pretty basic POSIX, and even Windows has had it for ages. Shouldn't we simply make those code blocks unconditional? It would avoid having to maintain unused fallback paths. Regards Antoine. From benjamin at python.org Fri May 17 15:56:55 2013 From: benjamin at python.org (Benjamin Peterson) Date: Fri, 17 May 2013 08:56:55 -0500 Subject: [Python-Dev] HAVE_FSTAT? In-Reply-To: <20130517150119.01077496@pitrou.net> References: <20130517150119.01077496@pitrou.net> Message-ID: 2013/5/17 Antoine Pitrou : > > Hello, > > Some pieces of code are still guarded by: > #ifdef HAVE_FSTAT > ... > #endif > > I would expect all systems to have fstat() these days. It's pretty > basic POSIX, and even Windows has had it for ages. Shouldn't we simply > make those code blocks unconditional? It would avoid having to maintain > unused fallback paths. +1 (Maybe Snakebite has such an exotic system, though?) :) -- Regards, Benjamin From skip at pobox.com Fri May 17 16:15:29 2013 From: skip at pobox.com (Skip Montanaro) Date: Fri, 17 May 2013 09:15:29 -0500 Subject: [Python-Dev] HAVE_FSTAT? In-Reply-To: <20130517150119.01077496@pitrou.net> References: <20130517150119.01077496@pitrou.net> Message-ID: > Some pieces of code are still guarded by: > #ifdef HAVE_FSTAT > ... > #endif Are there other guards for similarly common libc functions? If so, perhaps each one should be removed in a series of change sets, one per guard. Skip From solipsis at pitrou.net Fri May 17 17:56:08 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 17 May 2013 17:56:08 +0200 Subject: [Python-Dev] HAVE_FSTAT? 
In-Reply-To: References: <20130517150119.01077496@pitrou.net> Message-ID: <20130517175608.58631665@fsol> On Fri, 17 May 2013 09:15:29 -0500 Skip Montanaro wrote: > > Some pieces of code are still guarded by: > > #ifdef HAVE_FSTAT > > ... > > #endif > > Are there other guards for similarly common libc functions? I don't think so. Someone should take a look though :-) Regards Antoine. From status at bugs.python.org Fri May 17 18:07:32 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 17 May 2013 18:07:32 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20130517160732.AAA1356A3F@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-05-10 - 2013-05-17) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 3966 ( +3) closed 25805 (+47) total 29771 (+50) Open issues with patches: 1776 Issues opened (32) ================== #17487: wave.Wave_read.getparams should be more user friendly http://bugs.python.org/issue17487 reopened by Claudiu.Popa #17807: Generator cleanup without tp_del http://bugs.python.org/issue17807 reopened by pitrou #17905: Add check for locale.h http://bugs.python.org/issue17905 reopened by pitrou #17951: TypeError during gdb backtracing http://bugs.python.org/issue17951 opened by Catalin.Patulea #17953: sys.modules cannot be reassigned http://bugs.python.org/issue17953 opened by Valentin.Lorentz #17955: Minor updates to Functional HOWTO http://bugs.python.org/issue17955 opened by akuchling #17956: add ScheduledExecutor http://bugs.python.org/issue17956 opened by neologix #17957: remove outdated (and unexcellent) paragraph in whatsnew http://bugs.python.org/issue17957 opened by tshepang #17960: Clarify the required behaviour of locals() http://bugs.python.org/issue17960 opened by ncoghlan #17961: Use enum names as values in enum.Enum convenience API http://bugs.python.org/issue17961 opened 
by ncoghlan #17963: Deprecate the frame hack for implicitly getting module details http://bugs.python.org/issue17963 opened by ncoghlan #17967: urllib2.open failed to access a url when a perent directory of http://bugs.python.org/issue17967 opened by foxkiller #17969: multiprocessing crash on exit http://bugs.python.org/issue17969 opened by kristjan.jonsson #17970: Mutlithread XML parsing cause segfault http://bugs.python.org/issue17970 opened by mrDoctorWho0.. #17972: inspect module docs omits many functions http://bugs.python.org/issue17972 opened by s7v7nislands at gmail.com #17974: Migrate unittest to argparse http://bugs.python.org/issue17974 opened by pitrou #17975: libpython3.so conflicts between $VERSIONs http://bugs.python.org/issue17975 opened by prlw1 #17976: file.write doesn't raise IOError when it should http://bugs.python.org/issue17976 opened by jasujm #17978: Python crashes if Py_Initialize/Py_Finalize are called multipl http://bugs.python.org/issue17978 opened by romuloceccon #17979: Cannot build 2.7 with --enable-unicode=no http://bugs.python.org/issue17979 opened by amaury.forgeotdarc #17980: CVE-2013-2099 ssl.match_hostname() trips over crafted wildcard http://bugs.python.org/issue17980 opened by fweimer #17984: io and _pyio modules require the _io module http://bugs.python.org/issue17984 opened by serhiy.storchaka #17985: multiprocessing Queue.qsize() and Queue.empty() with different http://bugs.python.org/issue17985 opened by aod #17986: Alternative async subprocesses (pep 3145) http://bugs.python.org/issue17986 opened by sbt #17987: test.support.captured_stderr, captured_stdin not documented http://bugs.python.org/issue17987 opened by fdrake #17988: ElementTree.Element != ElementTree._ElementInterface http://bugs.python.org/issue17988 opened by jwilk #17989: ElementTree.Element broken attribute setting http://bugs.python.org/issue17989 opened by jwilk #17991: ctypes.c_char gives a misleading error when passed a one-chara 
http://bugs.python.org/issue17991 opened by Steven.Barker #17994: Change necessary in platform.py to support IronPython http://bugs.python.org/issue17994 opened by icordasc #17996: socket module should expose AF_LINK http://bugs.python.org/issue17996 opened by giampaolo.rodola #17997: ssl.match_hostname(): sub string wildcard should not match IDN http://bugs.python.org/issue17997 opened by christian.heimes #17998: internal error in regular expression engine http://bugs.python.org/issue17998 opened by jdemeyer Most recent 15 issues with no replies (15) ========================================== #17998: internal error in regular expression engine http://bugs.python.org/issue17998 #17997: ssl.match_hostname(): sub string wildcard should not match IDN http://bugs.python.org/issue17997 #17996: socket module should expose AF_LINK http://bugs.python.org/issue17996 #17994: Change necessary in platform.py to support IronPython http://bugs.python.org/issue17994 #17991: ctypes.c_char gives a misleading error when passed a one-chara http://bugs.python.org/issue17991 #17987: test.support.captured_stderr, captured_stdin not documented http://bugs.python.org/issue17987 #17986: Alternative async subprocesses (pep 3145) http://bugs.python.org/issue17986 #17975: libpython3.so conflicts between $VERSIONs http://bugs.python.org/issue17975 #17942: IDLE Debugger: names, values misaligned http://bugs.python.org/issue17942 #17933: test_ftp failure / ftplib error formatting issue http://bugs.python.org/issue17933 #17924: Deprecate stat.S_IF* integer constants http://bugs.python.org/issue17924 #17923: test glob with trailing slash fail http://bugs.python.org/issue17923 #17916: Provide dis.Bytecode based equivalent of dis.distb http://bugs.python.org/issue17916 #17909: Autodetecting JSON encoding http://bugs.python.org/issue17909 #17902: Document that _elementtree C API cannot use custom TreeBuilder http://bugs.python.org/issue17902 Most recent 15 issues waiting for review (15) 
============================================= #17988: ElementTree.Element != ElementTree._ElementInterface http://bugs.python.org/issue17988 #17980: CVE-2013-2099 ssl.match_hostname() trips over crafted wildcard http://bugs.python.org/issue17980 #17979: Cannot build 2.7 with --enable-unicode=no http://bugs.python.org/issue17979 #17978: Python crashes if Py_Initialize/Py_Finalize are called multipl http://bugs.python.org/issue17978 #17976: file.write doesn't raise IOError when it should http://bugs.python.org/issue17976 #17974: Migrate unittest to argparse http://bugs.python.org/issue17974 #17956: add ScheduledExecutor http://bugs.python.org/issue17956 #17951: TypeError during gdb backtracing http://bugs.python.org/issue17951 #17947: Code, test, and doc review for PEP-0435 Enum http://bugs.python.org/issue17947 #17945: tkinter/Python 3.3.0: peer_create doesn't instantiate Text http://bugs.python.org/issue17945 #17944: Refactor test_zipfile http://bugs.python.org/issue17944 #17941: namedtuple should support fully qualified name for more portab http://bugs.python.org/issue17941 #17940: extra code in argparse.py http://bugs.python.org/issue17940 #17937: Collect garbage harder at shutdown http://bugs.python.org/issue17937 #17936: O(n**2) behaviour when adding/removing classes http://bugs.python.org/issue17936 Top 10 most discussed issues (10) ================================= #17914: add os.cpu_count() http://bugs.python.org/issue17914 36 msgs #17980: CVE-2013-2099 ssl.match_hostname() trips over crafted wildcard http://bugs.python.org/issue17980 31 msgs #17947: Code, test, and doc review for PEP-0435 Enum http://bugs.python.org/issue17947 27 msgs #15392: Create a unittest framework for IDLE http://bugs.python.org/issue15392 16 msgs #17961: Use enum names as values in enum.Enum convenience API http://bugs.python.org/issue17961 15 msgs #17936: O(n**2) behaviour when adding/removing classes http://bugs.python.org/issue17936 14 msgs #17969: multiprocessing crash on exit 
http://bugs.python.org/issue17969 13 msgs #17976: file.write doesn't raise IOError when it should http://bugs.python.org/issue17976 10 msgs #8604: Adding an atomic FS write API http://bugs.python.org/issue8604 9 msgs #17974: Migrate unittest to argparse http://bugs.python.org/issue17974 9 msgs Issues closed (41) ================== #6208: path separator output ignores shell's path separator: / instea http://bugs.python.org/issue6208 closed by terry.reedy #14596: struct.unpack memory leak http://bugs.python.org/issue14596 closed by pitrou #17237: m68k aligns on 16bit boundaries. http://bugs.python.org/issue17237 closed by pitrou #17468: Generator memory leak http://bugs.python.org/issue17468 closed by pitrou #17547: "checking whether gcc supports ParseTuple __format__... " erro http://bugs.python.org/issue17547 closed by python-dev #17563: Excessive resizing of dicts when used as a cache http://bugs.python.org/issue17563 closed by rhettinger #17606: xml.sax.saxutils.XMLGenerator doesn't support byte strings http://bugs.python.org/issue17606 closed by serhiy.storchaka #17732: distutils.cfg Can Break venv http://bugs.python.org/issue17732 closed by georg.brandl #17742: Add _PyBytesWriter API http://bugs.python.org/issue17742 closed by haypo #17754: test_ctypes assumes LANG=C LC_ALL=C http://bugs.python.org/issue17754 closed by doko #17843: Lib/test/testbz2_bigmem.bz2 trigger virus warnings http://bugs.python.org/issue17843 closed by georg.brandl #17895: TemporaryFile name returns an integer in python3 http://bugs.python.org/issue17895 closed by terry.reedy #17906: JSON should accept lone surrogates http://bugs.python.org/issue17906 closed by serhiy.storchaka #17915: Encoding error with sax and codecs http://bugs.python.org/issue17915 closed by georg.brandl #17920: Documentation: "complete ordering" should be "total ordering" http://bugs.python.org/issue17920 closed by rhettinger #17927: Argument copied into cell still referenced by frame 
http://bugs.python.org/issue17927 closed by benjamin.peterson #17943: AttributeError: 'long' object has no attribute 'release' in Qu http://bugs.python.org/issue17943 closed by georg.brandl #17948: HTTPS and sending a big file size hangs. http://bugs.python.org/issue17948 closed by pitrou #17949: operator documentation mixup http://bugs.python.org/issue17949 closed by ezio.melotti #17950: Dynamic classes contain non-breakable reference cycles http://bugs.python.org/issue17950 closed by gvanrossum #17952: editors-and-tools section of devguide does not appear to be ac http://bugs.python.org/issue17952 closed by ned.deily #17954: Support creation of extensible enums through metaclass subclas http://bugs.python.org/issue17954 closed by ncoghlan #17958: int(math.log(2**i, 2)) http://bugs.python.org/issue17958 closed by mark.dickinson #17959: Alternate approach to aliasing for PEP 435 http://bugs.python.org/issue17959 closed by ncoghlan #17962: Broken OpenSSL version in Windows builds http://bugs.python.org/issue17962 closed by python-dev #17964: os.sysconf(): return type of the C function sysconf() is long, http://bugs.python.org/issue17964 closed by haypo #17965: argparse does not dest.replace('-', '_') for positionals http://bugs.python.org/issue17965 closed by r.david.murray #17966: Lack of consistency in PEP 8 -- Style Guide for Python Code http://bugs.python.org/issue17966 closed by gvanrossum #17968: memory leak in listxattr() http://bugs.python.org/issue17968 closed by pitrou #17971: Weird interaction between Komodo Python debugger C module & Py http://bugs.python.org/issue17971 closed by benjamin.peterson #17973: '+=' on a list inside tuple both succeeds and raises an except http://bugs.python.org/issue17973 closed by ronaldoussoren #17977: urllib.request.urlopen() cadefault argument is documented with http://bugs.python.org/issue17977 closed by barry #17981: SysLogHandler closes connection before using it http://bugs.python.org/issue17981 closed by vinay.sajip 
#17982: Syntax Error in IDLE3 not in IDLE http://bugs.python.org/issue17982 closed by terry.reedy #17983: global __class__ statement in class declaration http://bugs.python.org/issue17983 closed by python-dev #17990: 2.7 builds can fail due to unconditional inclusion of include http://bugs.python.org/issue17990 closed by benjamin.peterson #17992: test_asynchat hangs http://bugs.python.org/issue17992 closed by neologix #17993: Missed comma causes unintentional implicit string literal conc http://bugs.python.org/issue17993 closed by serhiy.storchaka #17995: report??????????????????????????????????????????158766 http://bugs.python.org/issue17995 closed by fdrake #995907: memory leak with threads and enhancement of the timer class http://bugs.python.org/issue995907 closed by neologix #1662581: the re module can perform poorly: O(2**n) versus O(n**2) http://bugs.python.org/issue1662581 closed by gregory.p.smith From barry at python.org Fri May 17 18:17:14 2013 From: barry at python.org (Barry Warsaw) Date: Fri, 17 May 2013 12:17:14 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: References: <20130515165808.0d99a3df@anarchist> Message-ID: <20130517121714.591a3097@anarchist> On May 16, 2013, at 02:19 PM, Guido van Rossum wrote: >Now consider the following scenario. It involves *three* processes. > >- Two unrelated processes both start and want to import the same module. >- They both see the .pyc file is missing/corrupt and decide to write it. >- The first process finishing writing the file, writing the correct header. >- Now a third process wants to import the module, sees the valid >header, and starts reading the file. >- However, while this is going on, the second process gets ready to >write the file. >- The second process truncates the file, writes the dummy header, and >then stalls. >- At this point the third process (which thought it was reading a >valid file) sees an unexpected EOF because the file has been >truncated. 
>
>Now, this would explain the EOFError, but not necessarily the
>ValueError with "unknown type code". However, it looks like marshal
>doesn't always check for EOF immediately (sometimes it calls getc()
>without checking the result, and sometimes it doesn't check the error
>state after calling r_string()), so I think all the errors are
>actually explainable from this scenario.

Thanks for this, it's a very interesting scenario. I think this isn't a complete explanation of what's going on though. I've spoken with our defect analyst and looked at a bunch of the bug reports, and as far as we can tell, the corruptions are permanent. Users generally have to take manual action to delete the .pyc files and re-create them.

One thing I hadn't realized until now is that until Python 3.4, py_compile.py doesn't write the pyc files atomically, and in fact this is the mechanism we're using to create the pyc files at package installation time. That could explain why we're still seeing these issues even in Python 3.3.

I've also uncovered a bug from 2010 reported in Debian[1] about pyc file corruptions that happened when the byte-compilation driver program exited before its workers[2] could complete. We're definitely seeing issues post-landing of this fix, so I need to do some more analysis to see if that fix was enough. If it wasn't, and we're not doing atomic renames, then that could explain the permanent corruptions.

Cheers,
-Barry

[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=590224
[2] the workers each call `$PYTHON -m py_compile - < py-filenames`

-------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL:

From barry at python.org Fri May 17 18:26:23 2013
From: barry at python.org (Barry Warsaw)
Date: Fri, 17 May 2013 12:26:23 -0400
Subject: [Python-Dev] Mysterious Python pyc file corruption problems
In-Reply-To:
References: <20130515165808.0d99a3df@anarchist> <51955D35.8060909@canterbury.ac.nz>
Message-ID: <20130517122623.78b4bbd5@anarchist>

On May 16, 2013, at 11:48 PM, Tres Seaver wrote:
>I can confirm at least that I have seen this problem within the last two
>weeks on Ubuntu boxes unrelated to the Debian / Ubuntu build
>infrastructure.

Hi Tres. If you see this happen, *please* get in touch with me, preferably before you fix it ;). I'd like to do some additional analysis on a broken system in semi-realtime.

Cheers,
-Barry

From barry at python.org Fri May 17 18:32:35 2013
From: barry at python.org (Barry Warsaw)
Date: Fri, 17 May 2013 12:32:35 -0400
Subject: [Python-Dev] Mysterious Python pyc file corruption problems
In-Reply-To:
References: <20130515165808.0d99a3df@anarchist>
Message-ID:
<20130517123235.4c3494ae@anarchist> On May 17, 2013, at 12:10 AM, Thomas Wouters wrote: >The 'unknown type codes' can also be explained if the two processes writing >to the .pyc files are *different Python versions*. As you may recall, at >Google we used to use modified Python interpreters that used '.pyc-2.2', >'.pyc-2.4', etc, for the pyc extension. That was because otherwise >different Python versions would keep overwriting the .pyc files of shared >Python modules, and "at Google scale" it caused all manner of problems... I >guess Ubuntu is approaching Google scale ;-) I'd like to think so. :) But I don't think this is part of the equation. For Python 2 on Debian/Ubuntu, we use an elaborate symlink farm to keep all pyc files in Python-version-specific directories. The stdlib files are already segregated, but the symlink farm takes care of package installs. Note that the symlinks are to the .py files, not the pyc files. Fortunately we almost don't care about this anymore. We dropped 2.6 in Ubuntu a while ago and we'll very likely drop 2.6 in Debian Jessie. We don't care about any Python 3s earlier than 3.2, and getting rid of the symlink farm was the primary motivator for PEP 3147. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From barry at python.org Fri May 17 18:34:34 2013 From: barry at python.org (Barry Warsaw) Date: Fri, 17 May 2013 12:34:34 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: References: <20130515165808.0d99a3df@anarchist> Message-ID: <20130517123434.033a22e7@anarchist> On May 16, 2013, at 05:30 PM, Brett Cannon wrote: >Just so people know, this is how we used to do it. In importlib we >write the entire file to a temp file and then to an atomic rename. 
Yep, and I suspect that our fix, even if we don't completely identify the root cause, will be to change py_compile.py to do atomic renames. Whether that would be an appropriate fix for 3.2, 3.3 and 2.7 is a different discussion.

-Barry

From barry at python.org Fri May 17 18:42:25 2013
From: barry at python.org (Barry Warsaw)
Date: Fri, 17 May 2013 12:42:25 -0400
Subject: [Python-Dev] Mysterious Python pyc file corruption problems
In-Reply-To:
References: <20130515165808.0d99a3df@anarchist> <20130516114002.1d584f79@anarchist> <519507BC.2080600@python.org> <20130516123827.65a22866@anarchist> <51950D07.2050107@stoneleaf.us> <20130516140457.288859bd@anarchist>
Message-ID: <20130517124225.55a5ed29@anarchist>

On May 16, 2013, at 04:52 PM, Terry Jan Reedy wrote:
>If the corruption only happens on Ubuntu, that would constitute 'rhyme'
>;-). I realize that asking for reports on other systems is part of the reason
>you posted, but I don't remember seeing any others yet.

Right. :) It's harder to dig out similar problems in Debian[1] but it's pretty clear that there have been *some* similar reports in Debian. Ubuntu and Debian share almost all their Python infrastructure. It would definitely be interesting to know whether Fedora/RedHat or any other Linux distros have seen similar problems.

I don't know how Fedora/RH does package installation. In Debian/Ubuntu, we do not ship pyc files, but instead they are generated in "post-installation" scripts, which boil down to calls to `$PYTHON -m py_compile - < filenames`.

>Do failures only occur during compileall process? (or whatever substitute you
>use).

No, they are all post-installation failures in unrelated packages that try to import pure-Python modules. AFAICT, the post-installation byte-compilation scripts are not erroring.
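
The atomic-rename approach discussed in this thread (write the whole file under a temporary name, then rename it over the target) can be sketched roughly as follows. This is a minimal illustration, not the code used by importlib or py_compile; the function name write_atomic and the ".tmp" suffix are assumptions. It relies on os.replace, which is available from Python 3.3.

```python
import os
import tempfile


def write_atomic(path, data):
    """Write bytes to `path` so readers see the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the target directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            tmp_file.write(data)
        # Swap the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temporary file behind on failure.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise


demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "module.pyc")
write_atomic(demo_path, b"first version")
write_atomic(demo_path, b"second version")
```

With this pattern a concurrent reader never opens a truncated target, because the target name only ever points at a fully written file.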
Doing a post-compilation verification step might be interesting, but I bet backporting atomic renames to py_compile.py will fix the problem, or at least band-aid over it. ;) -Barry From dmalcolm at redhat.com Fri May 17 19:19:27 2013 From: dmalcolm at redhat.com (David Malcolm) Date: Fri, 17 May 2013 13:19:27 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: <20130517124225.55a5ed29@anarchist> References: <20130515165808.0d99a3df@anarchist> <20130516114002.1d584f79@anarchist> <519507BC.2080600@python.org> <20130516123827.65a22866@anarchist> <51950D07.2050107@stoneleaf.us> <20130516140457.288859bd@anarchist> <20130517124225.55a5ed29@anarchist> Message-ID: <1368811167.13771.23.camel@surprise> On Fri, 2013-05-17 at 12:42 -0400, Barry Warsaw wrote: > On May 16, 2013, at 04:52 PM, Terry Jan Reedy wrote: > > >If the corruption only happens on Ubuntu, that would constitute 'rhyme' > >;-). I realize that asking for reports on other systems is part of the reason > >you posted, but I don't remember seeing any others yet. > > Right. :) It's harder to dig out similar problems in Debian[1] but it's > pretty clear that there have been *some* similar reports in Debian. Ubuntu > and Debian share almost all their Python infrastructure. It would definitely > be interesting to whether Fedora/RedHat or any other Linux distros have seen > similar problems. FWIW I don't recall seeing such problems on Fedora/RH, though that could be due to... > I don't know how Fedora/RH does package installation. In Debian/Ubuntu, we do > not ship pyc files, but instead they are generated in "post-installation" > scripts, which boil down to calls to `$PYTHON -m py_compile - < filenames`. Fedora/RH pregenerate the .pyc files during rpm creation, and they exist as part of the rpm payload. 
Dave

From barry at python.org Fri May 17 20:23:56 2013
From: barry at python.org (Barry Warsaw)
Date: Fri, 17 May 2013 14:23:56 -0400
Subject: [Python-Dev] Mysterious Python pyc file corruption problems
In-Reply-To: <1368811167.13771.23.camel@surprise>
References: <20130515165808.0d99a3df@anarchist> <20130516114002.1d584f79@anarchist> <519507BC.2080600@python.org> <20130516123827.65a22866@anarchist> <51950D07.2050107@stoneleaf.us> <20130516140457.288859bd@anarchist> <20130517124225.55a5ed29@anarchist> <1368811167.13771.23.camel@surprise>
Message-ID: <20130517142356.21d937c8@anarchist>

On May 17, 2013, at 01:19 PM, David Malcolm wrote:
>Fedora/RH pregenerate the .pyc files during rpm creation, and they exist
>as part of the rpm payload.

Good to know, thanks. Do you use `$PYTHON -m py_compile` to generate the pyc files at build time?

-Barry

From tseaver at palladion.com Fri May 17 20:51:12 2013
From: tseaver at palladion.com (Tres Seaver)
Date: Fri, 17 May 2013 14:51:12 -0400
Subject: [Python-Dev] Mysterious Python pyc file corruption problems
In-Reply-To: <20130517122623.78b4bbd5@anarchist>
References: <20130515165808.0d99a3df@anarchist> <51955D35.8060909@canterbury.ac.nz> <20130517122623.78b4bbd5@anarchist>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 05/17/2013 12:26 PM, Barry Warsaw wrote:
> On May 16, 2013, at 11:48 PM, Tres Seaver wrote:
>
>> I can confirm at least that I have seen this problem within the last
>> two weeks on Ubuntu boxes unrelated to the Debian / Ubuntu
>> build infrastructure.
>
> Hi Tres. If you see this happen, *please* get in touch with me,
> preferably before you fix it ;). I'd like to do some additional
> analysis on a broken system in semi-realtime.
Wilco (although I don't know for sure what provoked it: my memory is that it was while running 'tox' or 'detox' for ZODB). Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iEYEARECAAYFAlGWfCAACgkQ+gerLs4ltQ5YcQCguzlxAP8InrLEgdGx7JiK0as4 z9MAnR53bubpntt+272Y0BNYlEO8YcdI =LSAR -----END PGP SIGNATURE----- From tjreedy at udel.edu Fri May 17 21:02:00 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Fri, 17 May 2013 15:02:00 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: <20130517124225.55a5ed29@anarchist> References: <20130515165808.0d99a3df@anarchist> <20130516114002.1d584f79@anarchist> <519507BC.2080600@python.org> <20130516123827.65a22866@anarchist> <51950D07.2050107@stoneleaf.us> <20130516140457.288859bd@anarchist> <20130517124225.55a5ed29@anarchist> Message-ID: On 5/17/2013 12:42 PM, Barry Warsaw wrote: > On May 16, 2013, at 04:52 PM, Terry Jan Reedy wrote: >> Do failures only occur during compileall process? (or whatever substitute you >> use). > > No, they are all post-installation failures in unrelated packages that try to > import pure-Python modules. What I mean is, is the corruption (not the detection of corruption) only happening during mass compilation of the stdlib? When user imports a single non-stdlib file he has written the first time, does that ever get corrupted. > AFAICT, the post-installation byte-compilation scripts are not erroring. I THINK that you are answering my question by saying that corruption only happens during installation mass compilation. > > Doing a post-compilation verification step might be interesting, but I bet > backporting atomic renames to py_compile.py will fix the problem, or at least > band-aid over it. 
;) I intended to suggest that py_compile be changed to do that. Then Brett said it already had for 3.4. I see no reason why not to backport, but maybe someone else will. The main design use of marshal is to produce .pyc files that consist of a prefix and marshalled codeobject. Perhaps marshal.dump(s) should be extended to take a third prefix='' parameter that would be prepended to the result as produced today. .dump first does .dumps, though inline. I assume that .dumps constructs a string by starting with [], appending pieces, and joining. At least, any composite object dump would. The change would amount to starting with [prefix] instead of []. Then py_compile would amount to pyc = marshal.dump(codeobject, file, pyc) Terry From barry at python.org Fri May 17 21:16:41 2013 From: barry at python.org (Barry Warsaw) Date: Fri, 17 May 2013 15:16:41 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: References: <20130515165808.0d99a3df@anarchist> <20130516114002.1d584f79@anarchist> <519507BC.2080600@python.org> <20130516123827.65a22866@anarchist> <51950D07.2050107@stoneleaf.us> <20130516140457.288859bd@anarchist> <20130517124225.55a5ed29@anarchist> Message-ID: <20130517151641.46133368@anarchist> On May 17, 2013, at 03:02 PM, Terry Jan Reedy wrote: >What I mean is, is the corruption (not the detection of corruption) only >happening during mass compilation of the stdlib? When user imports a single >non-stdlib file he has written the first time, does that ever get corrupted. It's not limited to the stdlib, but yes, as far as we can tell, it happens during package installation on an end-user system, when the trigger script mass byte-compiles a package's Python source files. >I intended to suggest that py_compile be changed to do that. Then Brett said >it already had for 3.4. I see no reason why not to backport, but maybe >someone else will. I tend to agree. 
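For readers following the atomic-rename discussion, here is a minimal sketch of the write-to-a-temporary-file-then-rename pattern being referred to. It is an illustration only: `write_atomic` is a hypothetical helper name, not the code in `py_compile` or importlib, and it assumes Python 3.3+ for `os.replace` (on older POSIX-only code, `os.rename` plays the same role).

```python
import os
import tempfile


def write_atomic(path, data):
    """Write bytes to path so that readers never see a half-written file.

    Hypothetical helper for illustration; not the stdlib implementation.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the target directory so that the final
    # rename stays on a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Swap the finished file into place in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

With this pattern a concurrent reader sees either the old file or the complete new one. It does not by itself stop two writers from each doing redundant work.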
>The main design use of marshal is to produce .pyc files that consist of a >prefix and marshalled codeobject. Perhaps marshal.dump(s) should be extended >to take a third prefix='' parameter that would be prepended to the result as >produced today. .dump first does .dumps, though inline. I assume that .dumps >constructs a string by starting with [], appending pieces, and joining. At >least, any composite object dump would. The change would amount to starting >with [prefix] instead of []. Then py_compile would amount to > pyc = > marshal.dump(codeobject, file, pyc) That wouldn't be a backportable change, and right now I'm trying to stay focused on fixing this specific problem. ;) -Barry From dmalcolm at redhat.com Fri May 17 21:38:46 2013 From: dmalcolm at redhat.com (David Malcolm) Date: Fri, 17 May 2013 15:38:46 -0400 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: <20130517142356.21d937c8@anarchist> References: <20130515165808.0d99a3df@anarchist> <20130516114002.1d584f79@anarchist> <519507BC.2080600@python.org> <20130516123827.65a22866@anarchist> <51950D07.2050107@stoneleaf.us> <20130516140457.288859bd@anarchist> <20130517124225.55a5ed29@anarchist> <1368811167.13771.23.camel@surprise> <20130517142356.21d937c8@anarchist> Message-ID: <1368819526.13771.36.camel@surprise> On Fri, 2013-05-17 at 14:23 -0400, Barry Warsaw wrote: > On May 17, 2013, at 01:19 PM, David Malcolm wrote: > > >Fedora/RH pregenerate the .pyc files during rpm creation, and they exist > >as part of the rpm payload. > > Good to know, thanks. Do you use `$PYTHON -m py_compile` to generate the pyc > files at build time? 
We use compileall.compile_dir() most of the time, but occasionally use py_compile.compile(). Specifically, for python 2, the core rpm-build package has a script: /usr/lib/rpm/brp-python-bytecompile run automatically in a postprocessing phase after the upstream source has installed to a DESTDIR, and this invokes compileall.compile_dir() on all .py files in the package payload, with various logic to segment the different parts of the filesystem to be bytecompiled by the appropriate python binary (since we have duplicate .py files for different python runtimes). This is all done sequentially, so I'd be surprised if different pythons splatted on each other's .pyc files at this time. In addition, python3-devel contains a: /etc/rpm/macros.pybytecompile which defines a py_byte_compile() macro, which can be used for overriding these rules (IIRC), and this does use py_compile.compile(). Hope this is helpful Dave From doko at ubuntu.com Fri May 17 21:32:06 2013 From: doko at ubuntu.com (Matthias Klose) Date: Fri, 17 May 2013 21:32:06 +0200 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: <20130515165808.0d99a3df@anarchist> References: <20130515165808.0d99a3df@anarchist> Message-ID: <519685B6.4010604@ubuntu.com> On 15.05.2013 22:58, Barry Warsaw wrote: > I am looking into a particularly vexing Python problem on Ubuntu that > manifests in several different ways. I think the problem is the same one > described in http://bugs.python.org/issue13146 and I sent a message on the > subject to the ubuntu-devel list: > https://lists.ubuntu.com/archives/ubuntu-devel/2013-May/037129.html Please consider that Ubuntu does have some other upgrade issues, where files in the dpkg database (/var/lib/dpkg/info/*) are corrupted, or are just files containing null bytes. So these and the pyc issues have in common that the files are written after a package is unpacked.
I'm not saying that the problem might be with the pyc writing, but we do see other file corruption as well. Matthias From solipsis at pitrou.net Fri May 17 23:59:51 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 17 May 2013 23:59:51 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.5 and Python 3.3.2 References: <51946C9E.10607@python.org> Message-ID: <20130517235951.41780c67@fsol> On Thu, 16 May 2013 13:24:36 +0200 Charles-François Natali wrote: > 2013/5/16 Serhiy Storchaka : > > 16.05.13 08:20, Georg Brandl wrote: > >> > >> On behalf of the Python development team, I am pleased to announce the > >> releases of Python 3.2.5 and 3.3.2. > >> > >> The releases fix a few regressions in 3.2.4 and 3.3.1 in the zipfile, gzip > >> and xml.sax modules. Details can be found in the changelogs: > > > > > > It seems that I'm the main culprit of this releases. > > Well, when I look at the changelogs, what strikes me more is that > you're the author of *many* fixes, and also a lot of new > features/improvements. > > So I wouldn't feel bad if I were you, this kind of things happens (and > it certainly did to me). Seconded. Thanks Serhiy for your contributions :=) Regards Antoine.
From ncoghlan at gmail.com Sat May 18 05:16:12 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 May 2013 13:16:12 +1000 Subject: [Python-Dev] Mysterious Python pyc file corruption problems In-Reply-To: <1368811167.13771.23.camel@surprise> References: <20130515165808.0d99a3df@anarchist> <20130516114002.1d584f79@anarchist> <519507BC.2080600@python.org> <20130516123827.65a22866@anarchist> <51950D07.2050107@stoneleaf.us> <20130516140457.288859bd@anarchist> <20130517124225.55a5ed29@anarchist> <1368811167.13771.23.camel@surprise> Message-ID: On Sat, May 18, 2013 at 3:19 AM, David Malcolm wrote: > On Fri, 2013-05-17 at 12:42 -0400, Barry Warsaw wrote: >> On May 16, 2013, at 04:52 PM, Terry Jan Reedy wrote: >> >> >If the corruption only happens on Ubuntu, that would constitute 'rhyme' >> >;-). I realize that asking for reports on other systems is part of the reason >> >you posted, but I don't remember seeing any others yet. >> >> Right. :) It's harder to dig out similar problems in Debian[1] but it's >> pretty clear that there have been *some* similar reports in Debian. Ubuntu >> and Debian share almost all their Python infrastructure. It would definitely >> be interesting to whether Fedora/RedHat or any other Linux distros have seen >> similar problems. > > FWIW I don't recall seeing such problems on Fedora/RH, though that could > be due to... > >> I don't know how Fedora/RH does package installation. In Debian/Ubuntu, we do >> not ship pyc files, but instead they are generated in "post-installation" >> scripts, which boil down to calls to `$PYTHON -m py_compile - < filenames`. > > Fedora/RH pregenerate the .pyc files during rpm creation, and they exist > as part of the rpm payload. So in effect, we'd always be doing an atomic rename (and file conflicts between RPMs wouldn't be allowed in the first place). 
Combined with Brett's info that even 3.3 doesn't use atomic renames for pre-compilation (only for implicit compilation), I consider that strong evidence in favour of Guido's theory that Debian are getting write conflicts somewhere. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat May 18 05:21:49 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 May 2013 13:21:49 +1000 Subject: [Python-Dev] [RELEASED] Python 3.2.5 and Python 3.3.2 In-Reply-To: <20130517235951.41780c67@fsol> References: <51946C9E.10607@python.org> <20130517235951.41780c67@fsol> Message-ID: On Sat, May 18, 2013 at 7:59 AM, Antoine Pitrou wrote: > On Thu, 16 May 2013 13:24:36 +0200 > Charles-François Natali wrote: >> 2013/5/16 Serhiy Storchaka : >> > 16.05.13 08:20, Georg Brandl wrote: >> >> >> >> On behalf of the Python development team, I am pleased to announce the >> >> releases of Python 3.2.5 and 3.3.2. >> >> >> >> The releases fix a few regressions in 3.2.4 and 3.3.1 in the zipfile, gzip >> >> and xml.sax modules. Details can be found in the changelogs: >> > >> > >> > It seems that I'm the main culprit of this releases. >> >> Well, when I look at the changelogs, what strikes me more is that >> you're the author of *many* fixes, and also a lot of new >> features/improvements. >> >> So I wouldn't feel bad if I were you, this kind of things happens (and >> it certainly did to me). > > Seconded. Thanks Serhiy for your contributions :=) Indeed! Any need for quick releases to address regressions is always a collective failure - for them to happen, the error has to be in something not checked by our test suite, and the code change has to be one where nobody monitoring python-checkins spotted a potential issue. Hopefully the fixes for these regressions also came with new test cases (although that is obviously difficult for upstream regressions like those in the PyOpenSSL release bundled with the original Windows binaries).
Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From g.brandl at gmx.net Sat May 18 07:32:18 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 18 May 2013 07:32:18 +0200 Subject: [Python-Dev] [RELEASED] Python 3.2.5 and Python 3.3.2 In-Reply-To: References: <51946C9E.10607@python.org> <20130517235951.41780c67@fsol> Message-ID: On 18.05.2013 05:21, Nick Coghlan wrote: > On Sat, May 18, 2013 at 7:59 AM, Antoine Pitrou wrote: >> On Thu, 16 May 2013 13:24:36 +0200 >> Charles-François Natali wrote: >>> 2013/5/16 Serhiy Storchaka : >>> > 16.05.13 08:20, Georg Brandl wrote: >>> >> >>> >> On behalf of the Python development team, I am pleased to announce the >>> >> releases of Python 3.2.5 and 3.3.2. >>> >> >>> >> The releases fix a few regressions in 3.2.4 and 3.3.1 in the zipfile, gzip >>> >> and xml.sax modules. Details can be found in the changelogs: >>> > >>> > >>> > It seems that I'm the main culprit of this releases. >>> >>> Well, when I look at the changelogs, what strikes me more is that >>> you're the author of *many* fixes, and also a lot of new >>> features/improvements. >>> >>> So I wouldn't feel bad if I were you, this kind of things happens (and >>> it certainly did to me). >> >> Seconded. Thanks Serhiy for your contributions :=) > > Indeed! > > Any need for quick releases to address regressions is always a > collective failure - for them to happen, the error has to be in > something not checked by our test suite, and the code change has to be > one where nobody monitoring python-checkins spotted a potential issue. > > Hopefully the fixes for these regressions also came with new test > cases (although that is obviously difficult for upstream regressions > like those in the PyOpenSSL release bundled with the original Windows > binaries). Exactly.
Thanks Serhiy for making us improve the test suite :) Georg From a.cavallo at cavallinux.eu Sat May 18 09:31:36 2013 From: a.cavallo at cavallinux.eu (Antonio Cavallo) Date: Sat, 18 May 2013 08:31:36 +0100 Subject: [Python-Dev] HAVE_FSTAT? In-Reply-To: <20130517175608.58631665@fsol> References: <20130517150119.01077496@pitrou.net> <20130517175608.58631665@fsol> Message-ID: <64691D60-4EB7-49B8-8775-D8D084325284@cavallinux.eu> I've had a quick look with grep -R HAVE_ * | egrep '[.]c:'. Modules/posixmodule.c has HAVE_UTIME_H and it might be standard libc on all posix platforms. Objects/obmalloc.c has HAVE_MMAP? but I guess that's fine given other platforms might not have such facility. Depending on the granularity (on a per platform or per feature) probably yes, there aren't many left. I hope this helps On 17 May 2013, at 16:56, Antoine Pitrou wrote: > On Fri, 17 May 2013 09:15:29 -0500 > Skip Montanaro wrote: >>> Some pieces of code are still guarded by: >>> #ifdef HAVE_FSTAT >>> ... >>> #endif >> >> Are there other guards for similarly common libc functions? > > I don't think so. Someone should take a look though :-) > > Regards > > Antoine. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/a.cavallo%40cavallinux.eu From solipsis at pitrou.net Sat May 18 10:59:10 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 18 May 2013 10:59:10 +0200 Subject: [Python-Dev] PEP 442: Safe object finalization Message-ID: <20130518105910.1eecfa5f@fsol> Hello, I would like to submit the following PEP for discussion and evaluation. Regards Antoine. 
PEP: 442 Title: Safe object finalization Version: $Revision$ Last-Modified: $Date$ Author: Antoine Pitrou Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 2013-05-18 Python-Version: 3.4 Post-History: Resolution: TBD Abstract ======== This PEP proposes to deal with the current limitations of object finalization. The goal is to be able to define and run finalizers for any object, regardless of their position in the object graph. This PEP doesn't call for any change in Python code. Objects with existing finalizers will benefit automatically. Definitions =========== Reference A directional link from an object to another. The target of the reference is kept alive by the reference, as long as the source is itself alive and the reference isn't cleared. Weak reference A directional link from an object to another, which doesn't keep alive its target. This PEP focusses on non-weak references. Reference cycle A cyclic subgraph of directional links between objects, which keeps those objects from being collected in a pure reference-counting scheme. Cyclic isolate (CI) A reference cycle in which no object is referenced from outside the cycle *and* whose objects are still in a usable, non-broken state: they can access each other from their respective finalizers. Cyclic garbage collector (GC) A device able to detect cyclic isolates and turn them into cyclic trash. Objects in cyclic trash are eventually disposed of by the natural effect of the references being cleared and their reference counts dropping to zero. Cyclic trash (CT) A reference cycle, or former reference cycle, in which no object is referenced from outside the cycle *and* whose objects have started being cleared by the GC. Objects in cyclic trash are potential zombies; if they are accessed by Python code, the symptoms can vary from weird AttributeErrors to crashes. Zombie / broken object An object part of cyclic trash. 
The term stresses that the object is not safe: its outgoing references may have been cleared, or one of the objects it references may be zombie. Therefore, it should not be accessed by arbitrary code (such as finalizers). Finalizer A function or method called when an object is intended to be disposed of. The finalizer can access the object and release any resource held by the object (for example mutexes or file descriptors). An example is a ``__del__`` method. Resurrection The process by which a finalizer creates a new reference to an object in a CI. This can happen as a quirky but supported side-effect of ``__del__`` methods. Impact ====== While this PEP discusses CPython-specific implementation details, the change in finalization semantics is expected to affect the Python ecosystem as a whole. In particular, this PEP obsoletes the current guideline that "objects with a ``__del__`` method should not be part of a reference cycle". Benefits ======== The primary benefits of this PEP regard objects with finalizers, such as objects with a ``__del__`` method and generators with a ``finally`` block. Those objects can now be reclaimed when they are part of a reference cycle. The PEP also paves the way for further benefits: * The module shutdown procedure may not need to set global variables to None anymore. This could solve a well-known class of irritating issues. The PEP doesn't change the semantics of: * Weak references caught in reference cycles. * C extension types with a custom ``tp_dealloc`` function. Description =========== Reference-counted disposal -------------------------- In normal reference-counted disposal, an object's finalizer is called just before the object is deallocated. If the finalizer resurrects the object, deallocation is aborted. *However*, if the object was already finalized, then the finalizer isn't called. This prevents us from finalizing zombies (see below). 
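As an illustration of the reference-counted case just described, here is a small sketch of a ``__del__`` method that resurrects its object. The names ``Phoenix``, ``survivors`` and ``calls`` are hypothetical and exist only for this example. It also assumes CPython, where dropping the last reference triggers the finalizer immediately; other implementations may run it later.

```python
survivors = []  # hypothetical global used to resurrect the object
calls = []      # records each finalizer invocation


class Phoenix:
    def __del__(self):
        calls.append("finalized")
        # Storing self creates a new reference, so the pending
        # deallocation is aborted and the object stays alive.
        survivors.append(self)


obj = Phoenix()
del obj  # reference count drops to zero; __del__ runs and resurrects

print(len(calls), len(survivors))
```

The sketch deliberately stops after the first deletion. Whether the finalizer runs again on a later destruction attempt is exactly the point under discussion in this draft, so it is not asserted here.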
Disposal of cyclic isolates --------------------------- Cyclic isolates are first detected by the garbage collector, and then disposed of. The detection phase doesn't change and won't be described here. Disposal of a CI traditionally works in the following order: 1. Weakrefs to CI objects are cleared, and their callbacks called. At this point, the objects are still safe to use. 2. The CI becomes a CT as the GC systematically breaks all known references inside it (using the ``tp_clear`` function). 3. Nothing. All CT objects should have been disposed of in step 2 (as a side-effect of clearing references); this collection is finished. This PEP proposes to turn CI disposal into the following sequence (new steps are in bold): 1. Weakrefs to CI objects are cleared, and their callbacks called. At this point, the objects are still safe to use. 2. **The finalizers of all CI objects are called.** 3. **The CI is traversed again to determine if it is still isolated. If it is determined that at least one object in CI is now reachable from outside the CI, this collection is aborted and the whole CI is resurrected. Otherwise, proceed.** 4. The CI becomes a CT as the GC systematically breaks all known references inside it (using the ``tp_clear`` function). 5. Nothing. All CT objects should have been disposed of in step 4 (as a side-effect of clearing references); this collection is finished. C-level changes =============== Type objects get a new ``tp_finalize`` slot to which ``__del__`` methods are bound. Generators are also modified to use this slot, rather than ``tp_del``. At the C level, a ``tp_finalize`` function is a normal function which will be called with a regular, alive object as its only argument. It should not attempt to revive or collect the object. For compatibility, ``tp_del`` is kept in the type structure. Handling of objects with a non-NULL ``tp_del`` is unchanged: when part of a CI, they are not finalized and end up in ``gc.garbage``. 
However, a non-NULL ``tp_del`` is not encountered anymore in the CPython source tree (except for testing purposes). On the internal side, a bit is reserved in the GC header for GC-managed objects to signal that they were finalized. This helps avoid finalizing an object twice (and, especially, finalizing a CT object after it was broken by the GC). Discussion ========== Predictability -------------- Following this scheme, an object's finalizer is always called exactly once. The only exception is if an object is resurrected: the finalizer will be called again later. For CI objects, the order in which finalizers are called (step 2 above) is undefined. Safety ------ It is important to explain why the proposed change is safe. There are two aspects to be discussed: * Can a finalizer access zombie objects (including the object being finalized)? * What happens if a finalizer mutates the object graph so as to impact the CI? Let's discuss the first issue. We will divide possible cases in two categories: * If the object being finalized is part of the CI: by construction, no objects in CI are zombies yet, since CI finalizers are called before any reference breaking is done. Therefore, the finalizer cannot access zombie objects, which don't exist. * If the object being finalized is not part of the CI/CT: by definition, objects in the CI/CT don't have any references pointing to them from outside the CI/CT. Therefore, the finalizer cannot reach any zombie object (that is, even if the object being finalized was itself referenced from a zombie object). Now for the second issue. There are three potential cases: * The finalizer clears an existing reference to a CI object. The CI object may be disposed of before the GC tries to break it, which is fine (the GC simply has to be aware of this possibility). * The finalizer creates a new reference to a CI object. This can only happen from a CI object's finalizer (see above why). 
Therefore, the new reference will be detected by the GC after all CI finalizers are called (step 3 above), and collection will be aborted without any objects being broken. * The finalizer clears or creates a reference to a non-CI object. By construction, this is not a problem. Implementation ============== An implementation is available in branch ``finalize`` of the repository at http://hg.python.org/features/finalize/. Validation ========== Besides running the normal Python test suite, the implementation adds test cases for various finalization possibilities including reference cycles, object resurrection and legacy ``tp_del`` slots. The implementation has also been checked to not produce any regressions on the following test suites: * `Tulip `_, which makes an extensive use of generators * `Tornado `_ * `SQLAlchemy `_ * `Django `_ * `zope.interface `_ References ========== Notes about reference cycle collection and weak reference callbacks: http://hg.python.org/cpython/file/4e687d53b645/Modules/gc_weakref.txt Generator memory leak: http://bugs.python.org/issue17468 Allow objects to decide if they can be collected by GC: http://bugs.python.org/issue9141 Module shutdown procedure based on GC http://bugs.python.org/issue812369 Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: From ncoghlan at gmail.com Sat May 18 13:05:48 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 May 2013 21:05:48 +1000 Subject: [Python-Dev] PEP 442: Safe object finalization In-Reply-To: <20130518105910.1eecfa5f@fsol> References: <20130518105910.1eecfa5f@fsol> Message-ID: On Sat, May 18, 2013 at 6:59 PM, Antoine Pitrou wrote: > Resurrection > The process by which a finalizer creates a new reference to an > object in a CI. This can happen as a quirky but supported > side-effect of ``__del__`` methods. 
I really like the PEP overall, but could we at least get the option to have cases of object resurrection spit out a warning? And a clear rationale for not turning on such a warning by default? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sat May 18 13:46:54 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 18 May 2013 13:46:54 +0200 Subject: [Python-Dev] PEP 442: Safe object finalization In-Reply-To: References: <20130518105910.1eecfa5f@fsol> Message-ID: <20130518134654.6ae9838c@fsol> On Sat, 18 May 2013 21:05:48 +1000 Nick Coghlan wrote: > On Sat, May 18, 2013 at 6:59 PM, Antoine Pitrou wrote: > > Resurrection > > The process by which a finalizer creates a new reference to an > > object in a CI. This can happen as a quirky but supported > > side-effect of ``__del__`` methods. > > I really like the PEP overall, but could we at least get the option to > have cases of object resurrection spit out a warning? And a clear > rationale for not turning on such a warning by default? Where would you put the option? As for the rationale, it's simply compatibility: resurrection works without warnings right now :) Regards Antoine. From ncoghlan at gmail.com Sat May 18 14:51:35 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 18 May 2013 22:51:35 +1000 Subject: [Python-Dev] PEP 442: Safe object finalization In-Reply-To: <20130518134654.6ae9838c@fsol> References: <20130518105910.1eecfa5f@fsol> <20130518134654.6ae9838c@fsol> Message-ID: On Sat, May 18, 2013 at 9:46 PM, Antoine Pitrou wrote: > On Sat, 18 May 2013 21:05:48 +1000 > Nick Coghlan wrote: >> On Sat, May 18, 2013 at 6:59 PM, Antoine Pitrou wrote: >> > Resurrection >> > The process by which a finalizer creates a new reference to an >> > object in a CI. This can happen as a quirky but supported >> > side-effect of ``__del__`` methods. 
>> >> I really like the PEP overall, but could we at least get the option to >> have cases of object resurrection spit out a warning? And a clear >> rationale for not turning on such a warning by default? > > Where would you put the option? > As for the rationale, it's simply compatibility: resurrection works > without warnings right now :) Command line, probably. However, you're right that's something we can consider later - for the PEP it's enough that it still works, and we just avoid calling the __del__ method a second time. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sat May 18 15:02:52 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 18 May 2013 15:02:52 +0200 Subject: [Python-Dev] PEP 442: Safe object finalization In-Reply-To: References: <20130518105910.1eecfa5f@fsol> <20130518134654.6ae9838c@fsol> Message-ID: <20130518150252.5825eeb5@fsol> On Sat, 18 May 2013 22:51:35 +1000 Nick Coghlan wrote: > On Sat, May 18, 2013 at 9:46 PM, Antoine Pitrou wrote: > > On Sat, 18 May 2013 21:05:48 +1000 > > Nick Coghlan wrote: > >> On Sat, May 18, 2013 at 6:59 PM, Antoine Pitrou wrote: > >> > Resurrection > >> > The process by which a finalizer creates a new reference to an > >> > object in a CI. This can happen as a quirky but supported > >> > side-effect of ``__del__`` methods. > >> > >> I really like the PEP overall, but could we at least get the option to > >> have cases of object resurrection spit out a warning? And a clear > >> rationale for not turning on such a warning by default? > > > > Where would you put the option? > > As for the rationale, it's simply compatibility: resurrection works > > without warnings right now :) > > Command line, probably. However, you're right that's something we can > consider later - for the PEP it's enough that it still works, and we > just avoid calling the __del__ method a second time. 
Actually, the __del__ method is called again on the next destruction attempt - as mentioned in the PEP: "Following this scheme, an object's finalizer is always called exactly once. The only exception is if an object is resurrected: the finalizer will be called again later." I could change it to only ever call __del__ once, it just sounded more logical to call it each time destruction is attempted. (this is in contrast to weakrefs, though, which are cleared once and for all) Regards Antoine. From arigo at tunes.org Sat May 18 15:24:08 2013 From: arigo at tunes.org (Armin Rigo) Date: Sat, 18 May 2013 15:24:08 +0200 Subject: [Python-Dev] PEP 442: Safe object finalization In-Reply-To: <20130518105910.1eecfa5f@fsol> References: <20130518105910.1eecfa5f@fsol> Message-ID: Hi Antoine, On Sat, May 18, 2013 at 10:59 AM, Antoine Pitrou wrote: > Cyclic isolate (CI) > A reference cycle in which no object is referenced from outside the > cycle *and* whose objects are still in a usable, non-broken state: > they can access each other from their respective finalizers. Does this definition include more complicated cases? For example: A -> B -> A and A -> C -> A Neither cycle is isolated. If there is no reference from outside, then the set of all three objects is isolated, but isn't strictly a cycle. I think the term is "strongly connected component". > 1. Weakrefs to CI objects are cleared, and their callbacks called. At > this point, the objects are still safe to use. > > 2. **The finalizers of all CI objects are called.** You need to be very careful about what each call to a finalizer can do to the object graph. It may already be what you're doing, but the most careful solution is to collect in "1." the complete list of objects with finalizers that are in cycles; then incref them all; then call the finalizer of each of them; then decref them all.
Such a solution gives new cases to think about, which are slightly unexpected for CPython's model: for example, if you have a cycle A -> B -> A, let's say the GC calls A.__del__ first; it might cause it to store a reference to B somewhere else, e.g. in some global; but then the GC calls B.__del__ anyway. This is probably fine but should be considered. > 3. **The CI is traversed again to determine if it is still isolated. How is this done? I don't see a clear way to determine it by looking only at the objects in the CI, given that arbitrary modifications of the object graph may have occurred. The solution I can think of doesn't seem robust against minor changes done by the finalizer. Take the example "A -> lst -> B -> A", where the reference from A to B is via a list (e.g. there is an attribute "A.attr = [B]"). If A.__del__ does the seemingly innocent change of replacing the list with a copy of itself, e.g. "A.attr = A.attr[:]", then after the finalizers are called, "lst" is gone and we're left with "A -> lst2 -> B -> A". Checking that this cycle is still isolated requires a possibly large number of checks, as far as I can tell. This can lead to O(n**2) behavior if there are n objects in total and O(n) cycles. The solution seems to be to simply wait for the next GC execution. Assuming that a finalizer is only called once, this only delays a bit freeing objects with finalizers in cycles (but your PEP still works to call finalizers and eventually collect the objects). Alternatively, this might be done immediately: in the point "3." above we can forget everything we found so far, and redo the tracking on all objects (this time ignoring finalizers that were already called). In fact, it may be necessary anyway: anything found before might be invalid after the finalizers are called, so forgetting it all and redoing the tracking from scratch seems to be the only way. > Type objects get a new ``tp_finalize`` slot to which ``__del__`` methods > are bound. 
Generators are also modified to use this slot, rather than > ``tp_del``. At the C level, a ``tp_finalize`` function is a normal > function which will be called with a regular, alive object as its only > argument. It should not attempt to revive or collect the object. Do you mean the opposite in the last sentence? ``tp_finalize`` can do anything... A bientôt, Armin. From eliben at gmail.com Sat May 18 15:37:54 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 18 May 2013 06:37:54 -0700 Subject: [Python-Dev] PEP 442: Safe object finalization In-Reply-To: <20130518105910.1eecfa5f@fsol> References: <20130518105910.1eecfa5f@fsol> Message-ID: Great PEP, I would really like to see this happen as it defines much saner semantics for finalization than what we currently have. One small question below: This PEP proposes to turn CI disposal into the following sequence (new > steps are in bold): > > 1. Weakrefs to CI objects are cleared, and their callbacks called. At > this point, the objects are still safe to use. > > 2. **The finalizers of all CI objects are called.** > > 3. **The CI is traversed again to determine if it is still isolated. > If it is determined that at least one object in CI is now reachable > from outside the CI, this collection is aborted and the whole CI > is resurrected. Otherwise, proceed.** > Not sure if my question is the same as Armin's here, but worth a try: by saying "the CI is traversed again" do you mean the original objects from the CI as discovered earlier, or is a new scan being done? What about a new object entering the CI during step (2)? I.e. the original CI was A->B->A but now one of the finalizers created some C such that B->C and C->A adding it to the connected component? Reading your description in (3) strictly it says: in this case the collection is aborted. This CI will be disposed next time collection is run. Is this correct? Eli > > 4.
The CI becomes a CT as the GC systematically breaks all > known references inside it (using the ``tp_clear`` function). > > 5. Nothing. All CT objects should have been disposed of in step 4 > (as a side-effect of clearing references); this collection is > finished. > Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sat May 18 15:45:52 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 18 May 2013 15:45:52 +0200 Subject: [Python-Dev] PEP 442: Safe object finalization In-Reply-To: References: <20130518105910.1eecfa5f@fsol> Message-ID: <20130518154552.2a879bf6@fsol> Hi Armin, On Sat, 18 May 2013 15:24:08 +0200 Armin Rigo wrote: > Hi Antoine, > > On Sat, May 18, 2013 at 10:59 AM, Antoine Pitrou wrote: > > Cyclic isolate (CI) > > A reference cycle in which no object is referenced from outside the > > cycle *and* whose objects are still in a usable, non-broken state: > > they can access each other from their respective finalizers. > > Does this definition include more complicated cases? For example: > > A -> B -> A and A -> C -> A > > Neither cycle is isolated. If there is no reference from outside, > then the set of all three objects is isolated, but isn't strictly a > cycle. I think the term is "strongly connected component". Yes, I should fix this definition to be more exact. > > 1. Weakrefs to CI objects are cleared, and their callbacks called. At > > this point, the objects are still safe to use. > > > > 2. **The finalizers of all CI objects are called.** > > You need to be very careful about what each call to a finalizer can do > to the object graph. It may already be what you're doing, but the > most careful solution is to collect in "1." the complete list of > objects with finalizers that are in cycles; then incref them all; then > call the finalizer of each of them; then decref them all. 
Such a > solution gives new cases to think about, which are slightly unexpected > for CPython's model: for example, if you have a cycle A -> B -> A, > let's say the GC calls A.__del__ first; it might cause it to store a > reference to B somewhere else, e.g. in some global; but then the GC > calls B.__del__ anyway. This is probably fine but should be > considered. Yes, I know this is possible. My opinion is that it is fine to call B's finalizer anyway. Calling all finalizers regardless of interim changes in the object graph also makes things a bit more deterministic: otherwise, which finalizers are called would depend on the call order, which is undefined. > > 3. **The CI is traversed again to determine if it is still isolated. > > How is this done? I don't see a clear way to determine it by looking > only at the objects in the CI, given that arbitrary modifications of > the object graph may have occurred. The same way a generation is traversed, but restricted to the CI. First the gc_refs field of each CI object is initialized to its ob_refcnt (again). Then, tp_traverse is called on each CI object, and each visited CI object has its gc_refs decremented. This subtracts CI-internal references from the gc_refs fields. At the end of the traversal, if all CI objects have their gc_refs equal to 0, then the CI has no external reference to it and can be cleared. If at least one CI object has non-zero gc_refs, the CI cannot be cleared. > Alternatively, > this might be done immediately: in the point "3." above we can forget > everything we found so far, and redo the tracking on all objects (this > time ignoring finalizers that were already called). This would also be more costly, performance-wise. A CI should generally be quite small, but a whole generation is arbitrarily big. > > Type objects get a new ``tp_finalize`` slot to which ``__del__`` methods > > are bound. Generators are also modified to use this slot, rather than > > ``tp_del``.
At the C level, a ``tp_finalize`` function is a normal > > function which will be called with a regular, alive object as its only > > argument. It should not attempt to revive or collect the object. > > Do you mean the opposite in the latest sentence? ``tp_finalize`` can > do anything... Not exactly, but I worded it poorly. What I meant is that the C code in tp_finalize shouldn't *manually* revive the object, since it is called with an object with a strictly positive refcount. Regards Antoine. From solipsis at pitrou.net Sat May 18 15:47:51 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 18 May 2013 15:47:51 +0200 Subject: [Python-Dev] PEP 442: Safe object finalization References: <20130518105910.1eecfa5f@fsol> Message-ID: <20130518154751.2b9d5bd1@fsol> On Sat, 18 May 2013 06:37:54 -0700 Eli Bendersky wrote: > Great PEP, I would really like to see this happen as it defines much saner > semantics for finalization than what we currently have. One small question > below: > > > This PEP proposes to turn CI disposal into the following sequence (new > > steps are in bold): > > > > 1. Weakrefs to CI objects are cleared, and their callbacks called. At > > this point, the objects are still safe to use. > > > > 2. **The finalizers of all CI objects are called.** > > > > 3. **The CI is traversed again to determine if it is still isolated. > > If it is determined that at least one object in CI is now reachable > > from outside the CI, this collection is aborted and the whole CI > > is resurrected. Otherwise, proceed.** > > > > Not sure if my question is the same as Armin's here, but worth a try: by > saying "the CI is traversed again" do you mean the original objects from > the CI as discovered earlier, or is a new scan being done? What about a new > object entering the CI during step (2)? I.e. the original CI was A->B->A > but now one of the finalizers created some C such that B->C and C->A adding > it to the connected component? 
It is the original CI which is traversed. If a new reference is introduced into the reference chain, the traversal in step 3 will decide to resurrect the CI. This is not necessarily a problem, since the next GC collection will try collecting again. > Reading your description in (3) strictly it says: in this case the > collection is aborted. This CI will be disposed next time collection is > run. Is this correct? Yup. Regards Antoine. From eliben at gmail.com Sat May 18 15:56:26 2013 From: eliben at gmail.com (Eli Bendersky) Date: Sat, 18 May 2013 06:56:26 -0700 Subject: [Python-Dev] PEP 442: Safe object finalization In-Reply-To: <20130518154751.2b9d5bd1@fsol> References: <20130518105910.1eecfa5f@fsol> <20130518154751.2b9d5bd1@fsol> Message-ID: On Sat, May 18, 2013 at 6:47 AM, Antoine Pitrou wrote: > On Sat, 18 May 2013 06:37:54 -0700 > Eli Bendersky wrote: > > Great PEP, I would really like to see this happen as it defines much > saner > > semantics for finalization than what we currently have. One small > question > > below: > > > > > > This PEP proposes to turn CI disposal into the following sequence (new > > > steps are in bold): > > > > > > 1. Weakrefs to CI objects are cleared, and their callbacks called. At > > > this point, the objects are still safe to use. > > > > > > 2. **The finalizers of all CI objects are called.** > > > > > > 3. **The CI is traversed again to determine if it is still isolated. > > > If it is determined that at least one object in CI is now reachable > > > from outside the CI, this collection is aborted and the whole CI > > > is resurrected. Otherwise, proceed.** > > > > > > > Not sure if my question is the same as Armin's here, but worth a try: by > > saying "the CI is traversed again" do you mean the original objects from > > the CI as discovered earlier, or is a new scan being done? What about a > new > > object entering the CI during step (2)? I.e. 
the original CI was A->B->A > > but now one of the finalizers created some C such that B->C and C->A > adding > > it to the connected component? > > It is the original CI which is traversed. If a new reference is > introduced into the reference chain, the traversal in step 3 will > decide to resurrect the CI. This is not necessarily a problem, since > the next GC collection will try collecting again. > > > Reading your description in (3) strictly it says: in this case the > > collection is aborted. This CI will be disposed next time collection is > > run. Is this correct? > > Yup. > Thanks, this actually makes a lot of sense. It's strictly better than the current situation where objects with __del__ are never collected. In the proposed scheme, the weird ones will be delayed and some really weird ones may never be collected, but the vast majority of __del__ methods do no resurrection so usually it will just work. This is a great proposal - killer new feature for 3.4 ;-) Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From shibturn at gmail.com Sat May 18 15:56:38 2013 From: shibturn at gmail.com (Richard Oudkerk) Date: Sat, 18 May 2013 14:56:38 +0100 Subject: [Python-Dev] PEP 442: Safe object finalization In-Reply-To: <20130518105910.1eecfa5f@fsol> References: <20130518105910.1eecfa5f@fsol> Message-ID: On 18/05/2013 9:59am, Antoine Pitrou wrote: > This PEP proposes to turn CI disposal into the following sequence (new > steps are in bold): > > 1. Weakrefs to CI objects are cleared, and their callbacks called. At > this point, the objects are still safe to use. > > 2. **The finalizers of all CI objects are called.** How do you know that one of the finalizers will not do something which causes another to fail? 
Presumably the following would cause an AttributeError to be printed:

    import gc

    class Node:
        def __init__(self):
            self.next = None
        def __del__(self):
            print(self, self.next)
            del self.next                   # break Node object

    a = Node()
    b = Node()
    a.next = b
    b.next = a
    del a, b
    gc.collect()

Are there less contrived examples which will cause errors where
currently there are none?

--
Richard

From solipsis at pitrou.net Sat May 18 16:18:11 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 18 May 2013 16:18:11 +0200
Subject: [Python-Dev] PEP 442: Safe object finalization
References: <20130518105910.1eecfa5f@fsol>
Message-ID: <20130518161811.63647e20@fsol>

On Sat, 18 May 2013 14:56:38 +0100
Richard Oudkerk wrote:
> On 18/05/2013 9:59am, Antoine Pitrou wrote:
> > This PEP proposes to turn CI disposal into the following sequence (new
> > steps are in bold):
> >
> > 1. Weakrefs to CI objects are cleared, and their callbacks called. At
> > this point, the objects are still safe to use.
> >
> > 2. **The finalizers of all CI objects are called.**
>
> How do you know that one of the finalizers will not do something which
> causes another to fail?
>
> Presumably the following would cause an AttributeError to be printed:
>
>     import gc
>
>     class Node:
>         def __init__(self):
>             self.next = None
>         def __del__(self):
>             print(self, self.next)
>             del self.next                   # break Node object
>
>     a = Node()
>     b = Node()
>     a.next = b
>     b.next = a
>     del a, b
>     gc.collect()

It works fine:

$ ./python sbt.py
<__main__.Node object at 0x7f3acbf8f400> <__main__.Node object at 0x7f3acbf8f878>
<__main__.Node object at 0x7f3acbf8f878> <__main__.Node object at 0x7f3acbf8f400>

The reason is that, when you execute "del self.next", this removes the
last reference to self.next and destroys it immediately.

In essence, you were expecting to see:
- enter a.__del__, destroy b
- leave a.__del__
- enter b.__del__ oops?

But what happens is:
- enter a.__del__, destroy b
- enter b.__del__
- leave b.__del__
- leave a.__del__

Regards

Antoine.
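
The nesting described in this message can be made visible with a small script. This is only a minimal sketch, assuming a CPython build in which finalizers of cyclic garbage are run (the behaviour proposed in PEP 442); the `events` list and the `name` attribute are illustrative additions that are not part of the original example, and which of the two nodes is finalized first is an implementation detail.

```python
import gc

events = []  # hypothetical log, added purely for illustration


class Node:
    def __init__(self, name):
        self.name = name
        self.next = None

    def __del__(self):
        events.append("enter " + self.name)
        # Dropping the last reference to the neighbour destroys it right
        # away, so its __del__ runs before this one returns.
        del self.next
        events.append("leave " + self.name)


a = Node("a")
b = Node("b")
a.next = b
b.next = a
del a, b
gc.collect()

print(events)
```

If the assumption holds, the second "enter" is logged before the first "leave", i.e. the two finalizer calls are nested rather than sequential.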
From arigo at tunes.org Sat May 18 16:22:55 2013
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 18 May 2013 16:22:55 +0200
Subject: [Python-Dev] PEP 442: Safe object finalization
In-Reply-To: <20130518154552.2a879bf6@fsol>
References: <20130518105910.1eecfa5f@fsol> <20130518154552.2a879bf6@fsol>
Message-ID:

Hi Antoine,

On Sat, May 18, 2013 at 3:45 PM, Antoine Pitrou wrote:
>> How is this done? I don't see a clear way to determine it by looking
>> only at the objects in the CI, given that arbitrary modifications of
>> the object graph may have occurred.
>
> The same way a generation is traversed, but restricted to the CI.
>
> First the gc_refs field of each CI object is initialized to its
> ob_refcnt (again).
>
> Then, tp_traverse is called on each CI object, and each visited
> CI object has its gc_refs decremented. This subtracts CI-internal
> references from the gc_refs fields.
>
> At the end of the traversal, if all CI objects have their gc_refs equal
> to 0, then the CI has no external reference to it and can be cleared.
> If at least one CI object has non-zero gc_refs, the CI cannot be
> cleared.

Ok, indeed. Then you really should call finalizers only once: in case
one of the finalizers in a cycle did a trivial change like I
described, the algorithm above will conservatively assume the cycle
should be kept alive. At the next GC collection we must not call the
finalizer again, because it's likely to just do a similar trivial
change.

(There are other open questions about calling finalizers multiple
times; e.g. an instance of this class has its finalizer called ad
infinitum and leaks, even though X() is never part of any cycle:

    class X(object):
        def __del__(self):
            print("tick")
            lst = [self]
            lst.append(lst)

Try interactively: every gc.collect() prints "tick", even if you make
only one instance.)

A bientôt,

Armin.
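
The refcount-subtraction check quoted in this message can be illustrated with a toy model. The sketch below is not CPython's implementation: `is_isolated`, the plain-dict object graph, and the hand-written reference counts are all hypothetical stand-ins for `gc_refs`, `tp_traverse` and `ob_refcnt`, and attribute dictionaries are ignored for brevity.

```python
def is_isolated(refcounts, edges):
    """Toy version of the step-3 check.

    refcounts maps each CI member to its total reference count;
    edges maps each CI member to the objects it references.
    """
    gc_refs = dict(refcounts)          # start from the full refcounts
    for targets in edges.values():     # "traverse" every CI member
        for target in targets:
            if target in gc_refs:      # only CI-internal references count
                gc_refs[target] -= 1
    # Anything left over is a reference held from outside the CI.
    return all(count == 0 for count in gc_refs.values())


# A -> lst -> B -> A, with no reference from outside.
before = is_isolated(
    {"A": 1, "lst": 1, "B": 1},
    {"A": ["lst"], "lst": ["B"], "B": ["A"]},
)

# After a finalizer runs "A.attr = A.attr[:]": the old list is gone and B
# is now referenced by a new list (called lst2 here) that is not a CI member.
after = is_isolated(
    {"A": 1, "B": 1},
    {"A": ["lst2"], "B": ["A"]},
)

print(before, after)
```

In this model the second check fails because B keeps a non-zero count, which corresponds to the conservative outcome discussed in the message: the cycle is kept alive until a later collection.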
From solipsis at pitrou.net Sat May 18 16:33:15 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 18 May 2013 16:33:15 +0200
Subject: [Python-Dev] PEP 442: Safe object finalization
In-Reply-To:
References: <20130518105910.1eecfa5f@fsol> <20130518154552.2a879bf6@fsol>
Message-ID: <20130518163315.6a21a0cd@fsol>

On Sat, 18 May 2013 16:22:55 +0200
Armin Rigo wrote:
> Hi Antoine,
>
> On Sat, May 18, 2013 at 3:45 PM, Antoine Pitrou wrote:
> >> How is this done? I don't see a clear way to determine it by looking
> >> only at the objects in the CI, given that arbitrary modifications of
> >> the object graph may have occurred.
> >
> > The same way a generation is traversed, but restricted to the CI.
> >
> > First the gc_refs field of each CI object is initialized to its
> > ob_refcnt (again).
> >
> > Then, tp_traverse is called on each CI object, and each visited
> > CI object has its gc_refs decremented. This subtracts CI-internal
> > references from the gc_refs fields.
> >
> > At the end of the traversal, if all CI objects have their gc_refs equal
> > to 0, then the CI has no external reference to it and can be cleared.
> > If at least one CI object has non-zero gc_refs, the CI cannot be
> > cleared.
>
> Ok, indeed. Then you really should call finalizers only once: in case
> one of the finalizers in a cycle did a trivial change like I
> described, the algorithm above will conservatively assume the cycle
> should be kept alive. At the next GC collection we must not call the
> finalizer again, because it's likely to just do a similar trivial
> change.

Well, the finalizer will only be called if the resurrected object is
dereferenced again; otherwise the object won't be considered by the GC.
So, this will only happen if someone keeps trying to destroy a
resurrected object.

Calling finalizers only once is fine with me, but it would be a change
in behaviour; I don't know if it may break existing code.
(for example, say someone is using __del__ to manage a freelist)

Regards

Antoine.

From shibturn at gmail.com Sat May 18 16:52:56 2013
From: shibturn at gmail.com (Richard Oudkerk)
Date: Sat, 18 May 2013 15:52:56 +0100
Subject: [Python-Dev] PEP 442: Safe object finalization
In-Reply-To: <20130518161811.63647e20@fsol>
References: <20130518105910.1eecfa5f@fsol> <20130518161811.63647e20@fsol>
Message-ID:

On 18/05/2013 3:18pm, Antoine Pitrou wrote:
> It works fine:
>
> $ ./python sbt.py
> <__main__.Node object at 0x7f3acbf8f400> <__main__.Node object at 0x7f3acbf8f878>
> <__main__.Node object at 0x7f3acbf8f878> <__main__.Node object at 0x7f3acbf8f400>
>
> The reason is that, when you execute "del self.next", this removes the
> last reference to self.next and destroys it immediately.

So even more contrived:

    import gc

    class Node:
        def __init__(self, x):
            self.x = x
            self.next = None
        def __del__(self):
            print(self.x, self.next.x)
            del self.x

    a = Node(1)
    b = Node(2)
    a.next = b
    b.next = a
    del a, b
    gc.collect()

--
Richard

From solipsis at pitrou.net Sat May 18 17:22:02 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 18 May 2013 17:22:02 +0200
Subject: [Python-Dev] PEP 442: Safe object finalization
References: <20130518105910.1eecfa5f@fsol> <20130518161811.63647e20@fsol>
Message-ID: <20130518172202.090e9b2c@fsol>

On Sat, 18 May 2013 15:52:56 +0100
Richard Oudkerk wrote:
> On 18/05/2013 3:18pm, Antoine Pitrou wrote:
> > It works fine:
> >
> > $ ./python sbt.py
> > <__main__.Node object at 0x7f3acbf8f400> <__main__.Node object at 0x7f3acbf8f878>
> > <__main__.Node object at 0x7f3acbf8f878> <__main__.Node object at 0x7f3acbf8f400>
> >
> > The reason is that, when you execute "del self.next", this removes the
> > last reference to self.next and destroys it immediately.
>
> So even more contrived:
>
>     import gc
>
>     class Node:
>         def __init__(self, x):
>             self.x = x
>             self.next = None
>         def __del__(self):
>             print(self.x, self.next.x)
>             del self.x
>
>     a = Node(1)
>     b = Node(2)
>     a.next = b
>     b.next = a
>     del a, b
>     gc.collect()

Indeed, there is an exception during destruction (which is ignored as
any exception raised from __del__):

$ ./python sbt.py
1 2
Exception ignored in: >
Traceback (most recent call last):
  File "sbt.py", line 17, in __del__
    print(self.x, self.next.x)
AttributeError: 'Node' object has no attribute 'x'

The only reason this currently succeeds is that the objects end up in
gc.garbage, of course.

Regards

Antoine.

From tjreedy at udel.edu Sat May 18 18:25:30 2013
From: tjreedy at udel.edu (Terry Jan Reedy)
Date: Sat, 18 May 2013 12:25:30 -0400
Subject: [Python-Dev] PEP 442: Safe object finalization
In-Reply-To: <20130518172202.090e9b2c@fsol>
References: <20130518105910.1eecfa5f@fsol> <20130518161811.63647e20@fsol> <20130518172202.090e9b2c@fsol>
Message-ID:

On 5/18/2013 11:22 AM, Antoine Pitrou wrote:
> On Sat, 18 May 2013 15:52:56 +0100
> Richard Oudkerk wrote:
>> So even more contrived:
>>
>>     import gc
>>
>>     class Node:
>>         def __init__(self, x):
>>             self.x = x
>>             self.next = None
>>         def __del__(self):
>>             print(self.x, self.next.x)
>>             del self.x

An attribute reference that can fail should be wrapped with try-except.

>>
>>     a = Node(1)
>>     b = Node(2)
>>     a.next = b
>>     b.next = a
>>     del a, b
>>     gc.collect()
>
> Indeed, there is an exception during destruction (which is ignored as
> any exception raised from __del__):
>
> $ ./python sbt.py
> 1 2
> Exception ignored in: >
> Traceback (most recent call last):
>   File "sbt.py", line 17, in __del__
>     print(self.x, self.next.x)
> AttributeError: 'Node' object has no attribute 'x'

Though ignored, the bug is reported, hinting that you should fix it ;-).
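
The try-except suggestion in this message could look roughly as follows. This is a hedged sketch rather than anyone's posted code: it assumes an interpreter that runs finalizers on cyclic garbage (Python 3.4 or later, where PEP 442 landed), and the `seen` list is a hypothetical log added so the result can be inspected.

```python
import gc

seen = []  # hypothetical log of (own x, neighbour's x) pairs


class Node:
    def __init__(self, x):
        self.x = x
        self.next = None

    def __del__(self):
        try:
            neighbour_x = self.next.x
        except AttributeError:
            # The neighbour's finalizer already ran and removed its x.
            neighbour_x = None
        seen.append((self.x, neighbour_x))
        del self.x


a = Node(1)
b = Node(2)
a.next = b
b.next = a
del a, b
gc.collect()

print(seen)
```

Whichever node is finalized first still sees its neighbour's `x`; the second one records `None` instead of raising an ignored AttributeError.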
From storchaka at gmail.com Sat May 18 21:48:26 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 18 May 2013 22:48:26 +0300 Subject: [Python-Dev] cpython: Undo the deprecation of _asdict(). In-Reply-To: <3bCHTY1dCHzRwP@mail.python.org> References: <3bCHTY1dCHzRwP@mail.python.org> Message-ID: 18.05.13 10:06, raymond.hettinger wrote: > http://hg.python.org/cpython/rev/1b760f926846 > changeset: 83823:1b760f926846 > user: Raymond Hettinger > date: Sat May 18 00:05:20 2013 -0700 > summary: > Undo the deprecation of _asdict(). Why? From storchaka at gmail.com Sat May 18 22:00:58 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 18 May 2013 23:00:58 +0300 Subject: [Python-Dev] cpython: Use PY_FORMAT_SIZE_T because Visual Studio does not understand %zd format. In-Reply-To: <3bCX8R6JBhzSbk@mail.python.org> References: <3bCX8R6JBhzSbk@mail.python.org> Message-ID: 18.05.13 19:37, richard.oudkerk wrote: > http://hg.python.org/cpython/rev/0648e7fe7a72 > changeset: 83829:0648e7fe7a72 > user: Richard Oudkerk > date: Sat May 18 17:35:19 2013 +0100 > summary: > Use PY_FORMAT_SIZE_T because Visual Studio does not understand %zd format. See also DEBUG_PRINT_FORMAT_SPEC() in Python/formatter_unicode.c, _PyDebugAllocatorStats() in Objects/obmalloc.c, and kqueue_event_repr() in Modules/selectmodule.c. From storchaka at gmail.com Sat May 18 22:03:24 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 18 May 2013 23:03:24 +0300 Subject: [Python-Dev] cpython: Use PY_FORMAT_SIZE_T because Visual Studio does not understand %zd format. In-Reply-To: References: <3bCX8R6JBhzSbk@mail.python.org> Message-ID: 18.05.13 23:00, Serhiy Storchaka wrote: > 18.05.13 19:37, richard.oudkerk wrote: >> http://hg.python.org/cpython/rev/0648e7fe7a72 >> changeset: 83829:0648e7fe7a72 >> user: Richard Oudkerk >> date: Sat May 18 17:35:19 2013 +0100 >> summary: >> Use PY_FORMAT_SIZE_T because Visual Studio does not understand %zd >> format.
> > See also DEBUG_PRINT_FORMAT_SPEC() in Python/formatter_unicode.c, > _PyDebugAllocatorStats() in Objects/obmalloc.c, and kqueue_event_repr() > in Modules/selectmodule.c. And _PyUnicode_Dump() in Objects/unicodeobject.c. From raymond.hettinger at gmail.com Sun May 19 07:27:36 2013 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sat, 18 May 2013 22:27:36 -0700 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <51937257.4020103@stackless.com> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <51937257.4020103@stackless.com> Message-ID: <1FD36F75-A4E0-4504-9A38-C80200BAA017@gmail.com> On May 15, 2013, at 4:32 AM, Christian Tismer wrote: > What is the current status of this discussion? > I'd like to know whether it is a considered alternative implementation. As far as I can tell, I'm the only one working on it (and a bit slowly at that). My plan is to implement it for frozensets to see how it works out. Frozensets are a nice first experiment for several reasons: * The current implementation is cleaner than dictionaries (which have become more complicated due to key-sharing). * It will be easy to benchmark (by racing sets vs frozen sets) for an apples-to-apples comparison. * There is no need to have a list-like over-allocation scheme since frozensets can't grow after they are created. That will guarantee a significant space savings and it will simplify the coding. * I wrote the code for setobject.c so I know all the ins-and-outs. > > There is also a discussion in python-ideas right now where this > alternative is mentioned, and I think especially for small dicts > as **kwargs, it could be a cheap way to introduce order. The compaction of keys and values into a dense array was intended to save space, improve cache performance, and improve iteration speed. The ordering was just a side-effect and one that is easily disturbed if keys ever get deleted. 
So a compacted dict might be a cheap way to introduce order for kwargs, but it would need special handling if the user decided to delete keys. BTW, I'm +1 on the idea for ordering keyword-args. It makes it easier to debug if the arguments show-up in the order they were created. AFAICT, no purpose is served by scrambling them (which is exacerbated by the new randomized hashing security feature). Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun May 19 07:47:35 2013 From: guido at python.org (Guido van Rossum) Date: Sat, 18 May 2013 22:47:35 -0700 Subject: [Python-Dev] Ordering keyword dicts Message-ID: On Sat, May 18, 2013 at 10:27 PM, Raymond Hettinger wrote: > BTW, I'm +1 on the idea for ordering keyword-args. It makes > it easier to debug if the arguments show-up in the order they > were created. AFAICT, no purpose is served by scrambling them > (which is exacerbated by the new randomized hashing security feature). I'm slow at warming up to the idea. My main concern is speed -- since most code doesn't need it and function calls are already slow (and obviously very common :-) it would be a shame if this slowed down function calls that don't need it noticeably. An observation is that it's only necessary to preserve order if the function definition uses **kwds. AFAIK we currently don't know if this is the case when the call is made though, but perhaps the information could be made available to the call site somehow. There are also many special cases to consider; e.g. using **kwds in the call where kwds is an unordered dict, or calls from C, or calls to C. But maybe someone considers this a challenge and comes up with a patch? The benefits to *some* use cases would be obvious. 
-- --Guido van Rossum (python.org/~guido) From raymond.hettinger at gmail.com Sun May 19 08:41:59 2013 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sat, 18 May 2013 23:41:59 -0700 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: On May 14, 2013, at 9:39 AM, Gregory P. Smith wrote: > Bad: doctests. I'm hoping that core developers don't get caught up in the "doctests are bad" meme. Instead, we should be clear about their primary purpose which is to test the examples given in docstrings. In many cases, there is a great deal of benefit to docstrings that have worked-out examples (see the docstrings in the decimal module for example). In such cases it is also worthwhile to make sure those examples continue to match reality. Doctests are a vehicle for such assurance. In other words, doctests have a perfectly legitimate use case. We should continue to encourage users to make thorough unit tests and to leave doctests for documentation. That said, it should be recognized that some testing is better than no testing. And doctests may be attractive in that regard because it is almost effortless to cut-and-paste a snippet from the interactive prompt. That isn't a best practice, but it isn't a worst practice either. Another meme that I hope to dispel is the notion that the core developers are free to break user code (such as doctests) if they believe the users aren't coding in accordance with best practices. Our goal is to improve their lives with our modifications, not to make their lives more difficult. Currently, we face an adoption problem with Python 3. At PyCon, an audience of nearly 2500 people said they had tried Python 3 but weren't planning to convert to it in production code. All of the coredevs are working to make Python 3 more attractive than Python 2, but we also have to be careful to not introduce obstacles to conversion.
Breaking tests makes it much harder to convert (especially because people need to rely on their tests to see if the conversion was successful). Raymond P.S. Breaking doctests should also be seen as a "canary in a coal mine." When they break, it also means that printed examples are out of date, that code parsers may break, that diffs start being different, that programs that feed into other programs (perhaps via pipes and filters) may be changing their interface, etc. Occasionally, we may need to break such things but there should be a compelling offsetting benefit (i.e. evaluating thoughtfully whether "I'm trying to help you by making your constant integers have a nicer repr" is worth "Sorry, I broke your tests, made your published examples out of date, and slowed down your code." -- in some modules it will be worth it, but in others we should value stability over micro-improvements). -------------- next part -------------- An HTML attachment was scrubbed... URL: From cf.natali at gmail.com Sun May 19 10:08:39 2013 From: cf.natali at gmail.com (Charles-François Natali) Date: Sun, 19 May 2013 10:08:39 +0200 Subject: [Python-Dev] HAVE_FSTAT? In-Reply-To: <20130517150119.01077496@pitrou.net> References: <20130517150119.01077496@pitrou.net> Message-ID: 2013/5/17 Antoine Pitrou : > > Hello, > > Some pieces of code are still guarded by: > #ifdef HAVE_FSTAT > ... > #endif > > I would expect all systems to have fstat() these days. It's pretty > basic POSIX, and even Windows has had it for ages. Shouldn't we simply > make those code blocks unconditional? It would avoid having to maintain > unused fallback paths. I was sure I'd seen a post/bug report about this: http://bugs.python.org/issue12082 The OP was trying to build Python on an embedded platform without fstat().
cf From ncoghlan at gmail.com Sun May 19 13:49:19 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 19 May 2013 21:49:19 +1000 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: On Sun, May 19, 2013 at 4:41 PM, Raymond Hettinger wrote: > nicer repr" is worth "Sorry, I broke your tests, made your published > examples > out of date, and slowed down your code." While the first two considerations are always potentially applicable when using enums, the latter should only be true for code that uses str() and repr() a lot. For other operations, int-based enums shouldn't add any more overhead than namedtuple does for tuples. I agree with basically everything you said, but I don't want "enums are slower than normal integers" to become a meme - there really shouldn't be a speed difference for any arithmetic operations when using IntEnum. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Sun May 19 14:19:04 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 19 May 2013 14:19:04 +0200 Subject: [Python-Dev] HAVE_FSTAT? In-Reply-To: References: <20130517150119.01077496@pitrou.net> Message-ID: <20130519141904.1dc2ff56@fsol> On Sun, 19 May 2013 10:08:39 +0200 Charles-François Natali wrote: > 2013/5/17 Antoine Pitrou : > > > > Hello, > > > > Some pieces of code are still guarded by: > > #ifdef HAVE_FSTAT > > ... > > #endif > > > > I would expect all systems to have fstat() these days. It's pretty > > basic POSIX, and even Windows has had it for ages. Shouldn't we simply > > make those code blocks unconditional? It would avoid having to maintain > > unused fallback paths. > > I was sure I'd seen a post/bug report about this: > http://bugs.python.org/issue12082 > > The OP was trying to build Python on an embedded platform without fstat(). Ah, right.
Ok, judging by the answers I'm being consistent in my opinions :-) I still wonder why an embedded platform can't provide at least some emulation of fstat(), even by returning fake values. Not providing such a basic function must break a lot of existing third-party software. Regards Antoine. From skip at pobox.com Sun May 19 14:42:53 2013 From: skip at pobox.com (Skip Montanaro) Date: Sun, 19 May 2013 07:42:53 -0500 Subject: [Python-Dev] Ordering keyword dicts In-Reply-To: References: Message-ID: > On Sat, May 18, 2013 at 10:27 PM, Raymond Hettinger > wrote: >> BTW, I'm +1 on the idea for ordering keyword-args. It makes >> it easier to debug if the arguments show-up in the order they >> were created. AFAICT, no purpose is served by scrambling them >> (which is exacerbated by the new randomized hashing security feature). (This is really for Raymond, though I'm replying to Guido's post.) I'm having a hard time understanding why this matters. Maybe I'm just dense on a Sunday morning. Can you explain what makes it difficult to debug about keyword arguments if they are held in a normal dictionary? Debugging at the Python level or the C level? Can you give an example where it would be easier to debug? If it makes it easier here, would it make it easier to debug other dictionary usage if they were ordered? Skip From solipsis at pitrou.net Sun May 19 15:01:48 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 19 May 2013 15:01:48 +0200 Subject: [Python-Dev] Ordering keyword dicts References: Message-ID: <20130519150148.513a1e5b@fsol> On Sat, 18 May 2013 22:47:35 -0700 Guido van Rossum wrote: > On Sat, May 18, 2013 at 10:27 PM, Raymond Hettinger > wrote: > > BTW, I'm +1 on the idea for ordering keyword-args. It makes > > it easier to debug if the arguments show-up in the order they > > were created. AFAICT, no purpose is served by scrambling them > > (which is exacerbated by the new randomized hashing security feature). > > I'm slow at warming up to the idea. 
My main concern is speed -- since > most code doesn't need it and function calls are already slow (and > obviously very common :-) it would be a shame if this slowed down > function calls that don't need it noticeably. > > An observation is that it's only necessary to preserve order if the > function definition uses **kwds. AFAIK we currently don't know if this > is the case when the call is made though, but perhaps the information > could be made available to the call site somehow. > > There are also many special cases to consider; e.g. using **kwds in > the call where kwds is an unordered dict, or calls from C, or calls to > C. > > But maybe someone considers this a challenge and comes up with a > patch? The benefits to *some* use cases would be obvious. The main use case seems to be the OrderedDict constructor itself. Otherwise, I can't think of any situation where I would've wanted it. Changing keyword arguments to be an OrderedDict without impacting performance in all the cases you mentioned (and without breaking C-level compatibility) would be a real, tough challenge. Regards Antoine. From fijall at gmail.com Sun May 19 16:09:01 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Sun, 19 May 2013 16:09:01 +0200 Subject: [Python-Dev] More compact dictionaries with faster iteration In-Reply-To: <1FD36F75-A4E0-4504-9A38-C80200BAA017@gmail.com> References: <9BD2AD6A-125D-4A34-B6BF-A99B167554B6@gmail.com> <51937257.4020103@stackless.com> <1FD36F75-A4E0-4504-9A38-C80200BAA017@gmail.com> Message-ID: On Sun, May 19, 2013 at 7:27 AM, Raymond Hettinger wrote: > > On May 15, 2013, at 4:32 AM, Christian Tismer wrote: > > What is the current status of this discussion? > I'd like to know whether it is a considered alternative implementation. > > > As far as I can tell, I'm the only one working on it (and a bit slowly at > that). > My plan is to implement it for frozensets to see how it works out. 
>
> Frozensets are a nice first experiment for several reasons:
> * The current implementation is cleaner than dictionaries
>   (which have become more complicated due to key-sharing).
> * It will be easy to benchmark (by racing sets vs frozen sets)
>   for an apples-to-apples comparison.
> * There is no need to have a list-like over-allocation scheme
>   since frozensets can't grow after they are created.
>   That will guarantee a significant space savings and
>   it will simplify the coding.
> * I wrote the code for setobject.c so I know all the ins-and-outs.
>
>
>
> There is also a discussion in python-ideas right now where this
> alternative is mentioned, and I think especially for small dicts
> as **kwargs, it could be a cheap way to introduce order.
>
>
> The compaction of keys and values into a dense array was
> intended to save space, improve cache performance, and
> improve iteration speed. The ordering was just a side-effect
> and one that is easily disturbed if keys ever get deleted.
>
> So a compacted dict might be a cheap way to introduce order
> for kwargs, but it would need special handling if the user decided
> to delete keys.
>
> BTW, I'm +1 on the idea for ordering keyword-args. It makes
> it easier to debug if the arguments show-up in the order they
> were created. AFAICT, no purpose is served by scrambling them
> (which is exacerbated by the new randomized hashing security feature).
>
>
> Raymond

The completely ordered dict is easy to get too - you mark deleted
entries instead of removing them (then all the keys are in order) and
every now and then you just compact the whole thing by removing all
the deleted entries, presumably on the resize or so.
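
The mark-then-compact scheme described above can be sketched in pure Python. This is only a toy under stated assumptions, not Raymond's proposal and not CPython code: the names `CompactOrderedDict` and `_DELETED` are hypothetical, a builtin dict stands in for the hash table of indices, and the "more than half tombstones" compaction trigger is an arbitrary choice standing in for "on the resize or so".

```python
class CompactOrderedDict:
    """Toy insertion-ordered mapping: deletions leave tombstones in a dense
    entry list, and the list is compacted once tombstones dominate."""

    _DELETED = object()  # tombstone that marks a removed entry

    def __init__(self):
        self._entries = []  # [key, value] pairs or tombstones, in insertion order
        self._index = {}    # key -> position in _entries (stand-in for the index table)
        self._live = 0      # number of entries that are not tombstones

    def __setitem__(self, key, value):
        pos = self._index.get(key)
        if pos is not None:
            # Existing key: overwrite in place, keeping its original position.
            self._entries[pos][1] = value
            return
        self._index[key] = len(self._entries)
        self._entries.append([key, value])
        self._live += 1

    def __getitem__(self, key):
        return self._entries[self._index[key]][1]

    def __delitem__(self, key):
        pos = self._index.pop(key)          # raises KeyError for a missing key
        self._entries[pos] = self._DELETED  # mark instead of removing
        self._live -= 1
        # Assumed trigger: compact once more than half the slots are tombstones.
        if len(self._entries) > 2 * self._live:
            self._compact()

    def _compact(self):
        # Drop the tombstones and rebuild the positions; order is preserved.
        self._entries = [e for e in self._entries if e is not self._DELETED]
        self._index = {e[0]: i for i, e in enumerate(self._entries)}

    def __iter__(self):
        return (e[0] for e in self._entries if e is not self._DELETED)

    def __len__(self):
        return self._live


d = CompactOrderedDict()
for name in ("a", "b", "c", "d"):
    d[name] = name.upper()
del d["b"]
d["e"] = "E"
print(list(d))  # ['a', 'c', 'd', 'e']
```

Iteration simply skips tombstones, so ordering survives deletion; the cost is that deleted slots linger until the next compaction.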
Cheers, fijal From ncoghlan at gmail.com Sun May 19 16:40:13 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 20 May 2013 00:40:13 +1000 Subject: [Python-Dev] Ordering keyword dicts In-Reply-To: <20130519150148.513a1e5b@fsol> References: <20130519150148.513a1e5b@fsol> Message-ID: On Sun, May 19, 2013 at 11:01 PM, Antoine Pitrou wrote: > The main use case seems to be the OrderedDict constructor itself. > Otherwise, I can't think of any situation where I would've wanted it. I've had a couple related to populating other mappings where order matters, at least from a predictability and readability perspective, even if it's not strictly required from a standards compliance point of view (think writing XML attributes, etc). I quite liked the idea of a simple flag attribute on function objects that the interpreter checked, with a decorator in functools (or even the builtins) to set it. It's not a particularly elegant solution, but it would get the job done with minimal performance impact on existing functions. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From gvanrossum at gmail.com Sun May 19 16:47:14 2013 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 19 May 2013 07:47:14 -0700 (PDT) Subject: [Python-Dev] HAVE_FSTAT? In-Reply-To: <20130519141904.1dc2ff56@fsol> References: <20130519141904.1dc2ff56@fsol> Message-ID: <1368974833994.c2cfd605@Nodemailer> Fake values would probably cause hard to debug problems. It's a long standing Python tradition not to offer low level APIs that the platform doesn't have. ? Sent from Mailbox On Sun, May 19, 2013 at 5:20 AM, Antoine Pitrou wrote: > On Sun, 19 May 2013 10:08:39 +0200 > Charles-Fran?ois Natali wrote: >> 2013/5/17 Antoine Pitrou : >> > >> > Hello, >> > >> > Some pieces of code are still guarded by: >> > #ifdef HAVE_FSTAT >> > ... >> > #endif >> > >> > I would expect all systems to have fstat() these days. It's pretty >> > basic POSIX, and even Windows has had it for ages. 
Shouldn't we simply >> > make those code blocks unconditional? It would avoid having to maintain >> > unused fallback paths. >> >> I was sure I'd seen a post/bug report about this: >> http://bugs.python.org/issue12082 >> >> The OP was trying to build Python on an embedded platform without fstat(). > Ah, right. Ok, judging by the answers I'm being consistent in my > opinions :-) > I still wonder why an embedded platform can't provide at least some > emulation of fstat(), even by returning fake values. Not providing > such a basic function must break a lot of existing third-party software. > Regards > Antoine. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sun May 19 16:51:55 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 19 May 2013 16:51:55 +0200 Subject: [Python-Dev] HAVE_FSTAT? In-Reply-To: <1368974833994.c2cfd605@Nodemailer> References: <20130519141904.1dc2ff56@fsol> <1368974833994.c2cfd605@Nodemailer> Message-ID: <20130519165155.3245995a@fsol> On Sun, 19 May 2013 07:47:14 -0700 (PDT) "Guido van Rossum" wrote: > Fake values would probably cause hard to debug problems. It's a long standing Python tradition not to offer low level APIs that the platform doesn't have. I meant the platform, not Python. Regards Antoine. > ? > Sent from Mailbox > > On Sun, May 19, 2013 at 5:20 AM, Antoine Pitrou > wrote: > > > On Sun, 19 May 2013 10:08:39 +0200 > > Charles-Fran?ois Natali wrote: > >> 2013/5/17 Antoine Pitrou : > >> > > >> > Hello, > >> > > >> > Some pieces of code are still guarded by: > >> > #ifdef HAVE_FSTAT > >> > ... > >> > #endif > >> > > >> > I would expect all systems to have fstat() these days. 
It's pretty
> >> > basic POSIX, and even Windows has had it for ages. Shouldn't we simply
> >> > make those code blocks unconditional? It would avoid having to maintain
> >> > unused fallback paths.
> >>
> >> I was sure I'd seen a post/bug report about this:
> >> http://bugs.python.org/issue12082
> >>
> >> The OP was trying to build Python on an embedded platform without fstat().
> > Ah, right. Ok, judging by the answers I'm being consistent in my
> > opinions :-)
> > I still wonder why an embedded platform can't provide at least some
> > emulation of fstat(), even by returning fake values. Not providing
> > such a basic function must break a lot of existing third-party software.
> > Regards
> > Antoine.

From gvanrossum at gmail.com Sun May 19 16:57:22 2013
From: gvanrossum at gmail.com (Guido van Rossum)
Date: Sun, 19 May 2013 07:57:22 -0700 (PDT)
Subject: [Python-Dev] Ordering keyword dicts
In-Reply-To:
References:
Message-ID: <1368975441504.b153a948@Nodemailer>

Hm. Wouldn't every call site be slowed down by checking for that flag?

Sent from Mailbox

On Sun, May 19, 2013 at 7:42 AM, Nick Coghlan wrote:
> On Sun, May 19, 2013 at 11:01 PM, Antoine Pitrou wrote:
>> The main use case seems to be the OrderedDict constructor itself.
>> Otherwise, I can't think of any situation where I would've wanted it.
> I've had a couple related to populating other mappings where order
> matters, at least from a predictability and readability perspective,
> even if it's not strictly required from a standards compliance point
> of view (think writing XML attributes, etc).
> I quite liked the idea of a simple flag attribute on function objects
> that the interpreter checked, with a decorator in functools (or even
> the builtins) to set it. It's not a particularly elegant solution, but
> it would get the job done with minimal performance impact on existing
> functions.
> Cheers,
> Nick.
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From fijall at gmail.com Sun May 19 16:59:04 2013
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Sun, 19 May 2013 16:59:04 +0200
Subject: [Python-Dev] Ordering keyword dicts
In-Reply-To:
References: <20130519150148.513a1e5b@fsol>
Message-ID:

On Sun, May 19, 2013 at 4:40 PM, Nick Coghlan wrote:
> On Sun, May 19, 2013 at 11:01 PM, Antoine Pitrou wrote:
>> The main use case seems to be the OrderedDict constructor itself.
>> Otherwise, I can't think of any situation where I would've wanted it.
>
> I've had a couple related to populating other mappings where order
> matters, at least from a predictability and readability perspective,
> even if it's not strictly required from a standards compliance point
> of view (think writing XML attributes, etc).
>
> I quite liked the idea of a simple flag attribute on function objects
> that the interpreter checked, with a decorator in functools (or even
> the builtins) to set it. It's not a particularly elegant solution, but
> it would get the job done with minimal performance impact on existing
> functions.
>
> Cheers,
> Nick.

Note that Raymond's proposal would make dicts and OrderedDicts almost
exactly the same speed.

From ncoghlan at gmail.com Sun May 19 17:09:19 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 20 May 2013 01:09:19 +1000
Subject: [Python-Dev] HAVE_FSTAT?
In-Reply-To: <20130519165155.3245995a@fsol>
References: <20130519141904.1dc2ff56@fsol> <1368974833994.c2cfd605@Nodemailer> <20130519165155.3245995a@fsol>
Message-ID:

On Mon, May 20, 2013 at 12:51 AM, Antoine Pitrou wrote:
> On Sun, 19 May 2013 07:47:14 -0700 (PDT)
> "Guido van Rossum" wrote:
>> Fake values would probably cause hard to debug problems. It's a long standing Python tradition not to offer low level APIs that the platform doesn't have.
>
> I meant the platform, not Python.

For CPython derivatives like PyMite, it can help to get things to compile.

Perhaps rather than dropping it, we can just replace all the complex
fallback code with code that triggers 'RuntimeError("Operation
requires fstat, which is not available on this platform")'.

Derivatives that support fstat-free platforms will have a clear place
to put their custom code, but we get the simpler assumption of fstat
always being available for the code paths we care about (and can
reasonably test).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From jsbueno at python.org.br Sun May 19 17:22:54 2013
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Sun, 19 May 2013 12:22:54 -0300
Subject: [Python-Dev] Ordering keyword dicts
In-Reply-To: <1368975441504.b153a948@Nodemailer>
References: <1368975441504.b153a948@Nodemailer>
Message-ID:

On 19 May 2013 11:57, Guido van Rossum wrote:
> Hm. Wouldn't every call site be slowed down by checking for that flag?

Actually, when I was thinking on the subject I came to the same idea,
of having some functions marked differently so they would use a
different call mechanism - but then I wondered about having a
different opcode for the ordered-dict calls.

Would that be feasible?

js
-><-

> Sent from Mailbox > > > On Sun, May 19, 2013 at 7:42 AM, Nick Coghlan wrote: >> >> On Sun, May 19, 2013 at 11:01 PM, Antoine Pitrou >> wrote: >> > The main use case seems to be the OrderedDict constructor itself. >> > Otherwise, I can't think of any situation where I would've wanted it. >> >> I've had a couple related to populating other mappings where order >> matters, at least from a predictability and readability perspective, >> even if it's not strictly required from a standards compliance point >> of view (think writing XML attributes, etc). >> >> I quite liked the idea of a simple flag attribute on function objects >> that the interpreter checked, with a decorator in functools (or even >> the builtins) to set it. It's not a particularly elegant solution, but >> it would get the job done with minimal performance impact on existing >> functions. >> >> Cheers, >> Nick. >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/guido%40python.org > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/jsbueno%40python.org.br > From solipsis at pitrou.net Sun May 19 17:40:40 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 19 May 2013 17:40:40 +0200 Subject: [Python-Dev] HAVE_FSTAT? 
In-Reply-To: References: <20130519141904.1dc2ff56@fsol> <1368974833994.c2cfd605@Nodemailer> <20130519165155.3245995a@fsol> Message-ID: <20130519174040.541938a7@fsol> On Mon, 20 May 2013 01:09:19 +1000 Nick Coghlan wrote: > On Mon, May 20, 2013 at 12:51 AM, Antoine Pitrou wrote: > > On Sun, 19 May 2013 07:47:14 -0700 (PDT) > > "Guido van Rossum" wrote: > >> Fake values would probably cause hard to debug problems. It's a long standing Python tradition not to offer low level APIs that the platform doesn't have. > > > > I meant the platform, not Python. > > For CPython derivatives like PyMite, it can help to get things to compile. It's not a CPython derivative. Regards Antoine. > > Perhaps rather than dropping it, we can just replace all the complex > fallback code with code that triggers 'RuntimeError("Operation > requires fstat, which is not available on this platform")'. > > Derivatives that support fstat-free platforms will have a clear place > to put their custom code, but we get the simpler assumption of fstat > always being available for the code paths we care about (and can > reasonably test). > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From gvanrossum at gmail.com Sun May 19 16:48:22 2013 From: gvanrossum at gmail.com (Guido van Rossum) Date: Sun, 19 May 2013 07:48:22 -0700 (PDT) Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: Message-ID: <1368974902254.f812247c@Nodemailer> Anyway, if you're doing arithmetic on enums you're doing it wrong.? ? Sent from Mailbox On Sun, May 19, 2013 at 4:55 AM, Nick Coghlan wrote: > On Sun, May 19, 2013 at 4:41 PM, Raymond Hettinger > wrote: >> nicer repr" is worth "Sorry, I broke your tests, made your published >> examples >> out of date, and slowed down your code." > While the first two considerations are always potentially applicable > when using enums, the latter should only be true for code that uses > str() and repr() a lot. 
For other operations, int-based enums
> shouldn't add any more overhead than namedtuple does for tuples.
> I agree with basically everything you said, but I don't want "enums
> are slower than normal integers" to become a meme - there really
> shouldn't be a speed difference for any arithmetic operations when
> using IntEnum.
> Cheers,
> Nick.
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From shibturn at gmail.com Sun May 19 17:59:52 2013
From: shibturn at gmail.com (Richard Oudkerk)
Date: Sun, 19 May 2013 16:59:52 +0100
Subject: [Python-Dev] Async subprocesses on Windows with tulip
Message-ID:

Attached is a pretty trivial example of asynchronous interaction with a
python subprocess using tulip on Windows. It does not use transports or
protocols -- instead sock_recv() and sock_sendall() are used inside tasks.

I am not sure what the plan is for dealing with subprocesses currently.
Shall I just add this to the examples folder for now?

--
Richard
-------------- next part --------------
'''
Example of asynchronous interaction with a subprocess on Windows.

This requires use of overlapped pipe handles and (a modified) iocp proactor.
'''

import itertools
import logging
import msvcrt
import os
import subprocess
import sys
import tempfile
import _winapi

import tulip
from tulip import _overlapped, windows_events, events


PIPE = subprocess.PIPE
BUFSIZE = 8192
_mmap_counter=itertools.count()


def _pipe(duplex=True, overlapped=(True, True)):
    '''
    Return handles for a pipe with one or both ends overlapped.
    '''
    address = tempfile.mktemp(prefix=r'\\.\pipe\python-pipe-%d-%d-' %
                              (os.getpid(), next(_mmap_counter)))

    if duplex:
        openmode = _winapi.PIPE_ACCESS_DUPLEX
        access = _winapi.GENERIC_READ | _winapi.GENERIC_WRITE
        obsize, ibsize = BUFSIZE, BUFSIZE
    else:
        openmode = _winapi.PIPE_ACCESS_INBOUND
        access = _winapi.GENERIC_WRITE
        obsize, ibsize = 0, BUFSIZE

    openmode |= _winapi.FILE_FLAG_FIRST_PIPE_INSTANCE

    if overlapped[0]:
        openmode |= _winapi.FILE_FLAG_OVERLAPPED

    if overlapped[1]:
        flags_and_attribs = _winapi.FILE_FLAG_OVERLAPPED
    else:
        flags_and_attribs = 0

    h1 = h2 = None
    try:
        h1 = _winapi.CreateNamedPipe(
            address, openmode, _winapi.PIPE_WAIT,
            1, obsize, ibsize, _winapi.NMPWAIT_WAIT_FOREVER, _winapi.NULL)

        h2 = _winapi.CreateFile(
            address, access, 0, _winapi.NULL, _winapi.OPEN_EXISTING,
            flags_and_attribs, _winapi.NULL)

        ov = _winapi.ConnectNamedPipe(h1, overlapped=True)
        ov.GetOverlappedResult(True)
        return h1, h2
    except:
        if h1 is not None:
            _winapi.CloseHandle(h1)
        if h2 is not None:
            _winapi.CloseHandle(h2)
        raise


class PipeHandle:
    '''
    Wrapper for a pipe handle
    '''
    def __init__(self, handle):
        self._handle = handle

    @property
    def handle(self):
        return self._handle

    def fileno(self):
        return self._handle

    def close(self, *, CloseHandle=_winapi.CloseHandle):
        if self._handle is not None:
            CloseHandle(self._handle)
            self._handle = None

    __del__ = close

    def __enter__(self):
        return self

    def __exit__(self, t, v, tb):
        self.close()


class Popen(subprocess.Popen):
    '''
    Subclass of Popen which uses overlapped pipe handles wrapped with
    PipeHandle instead of normal file objects for stdin, stdout, stderr.
    '''
    _WriteWrapper = PipeHandle
    _ReadWrapper = PipeHandle

    def __init__(self, args, *, executable=None, stdin=None, stdout=None,
                 stderr=None, preexec_fn=None, close_fds=False,
                 shell=False, cwd=None, env=None, startupinfo=None,
                 creationflags=0, restore_signals=True,
                 start_new_session=False, pass_fds=()):
        stdin_rfd = stdout_wfd = stderr_wfd = None
        stdin_wh = stdout_rh = stderr_rh = None
        if stdin == PIPE:
            stdin_rh, stdin_wh = _pipe(False, (False, True))
            stdin_rfd = msvcrt.open_osfhandle(stdin_rh, os.O_RDONLY)
        if stdout == PIPE:
            stdout_rh, stdout_wh = _pipe(False, (True, False))
            stdout_wfd = msvcrt.open_osfhandle(stdout_wh, 0)
        if stderr == PIPE:
            stderr_rh, stderr_wh = _pipe(False, (True, False))
            stderr_wfd = msvcrt.open_osfhandle(stderr_wh, 0)
        try:
            super().__init__(args, stdin=stdin_rfd, stdout=stdout_wfd,
                             stderr=stderr_wfd, executable=executable,
                             preexec_fn=preexec_fn, close_fds=close_fds,
                             shell=shell, cwd=cwd, env=env,
                             startupinfo=startupinfo,
                             creationflags=creationflags,
                             restore_signals=restore_signals,
                             start_new_session=start_new_session,
                             pass_fds=pass_fds)
        except:
            for h in (stdin_wh, stdout_rh, stderr_rh):
                # Only close handles that were actually created; a stream
                # that was not a PIPE leaves its handle as None.
                if h is not None:
                    _winapi.CloseHandle(h)
            raise
        else:
            if stdin_wh is not None:
                self.stdin = self._WriteWrapper(stdin_wh)
            if stdout_rh is not None:
                self.stdout = self._ReadWrapper(stdout_rh)
            if stderr_rh is not None:
                self.stderr = self._ReadWrapper(stderr_rh)
        finally:
            # The child-side descriptors are no longer needed in the parent.
            if stdin == PIPE:
                os.close(stdin_rfd)
            if stdout == PIPE:
                os.close(stdout_wfd)
            if stderr == PIPE:
                os.close(stderr_wfd)


class ProactorEventLoop(windows_events.ProactorEventLoop):
    '''
    Eventloop which uses ReadFile() and WriteFile() instead of
    WSARecv() and WSASend() for PipeHandle objects.
    '''
    def sock_recv(self, conn, n):
        self._proactor._register_with_iocp(conn)
        ov = _overlapped.Overlapped(_winapi.NULL)
        handle = getattr(conn, 'handle', None)
        if handle is None:
            ov.WSARecv(conn.fileno(), n, 0)
        else:
            ov.ReadFile(conn.fileno(), n)
        return self._proactor._register(ov, conn, ov.getresult)

    def sock_sendall(self, conn, data):
        self._proactor._register_with_iocp(conn)
        ov = _overlapped.Overlapped(_winapi.NULL)
        handle = getattr(conn, 'handle', None)
        if handle is None:
            ov.WSASend(conn.fileno(), data, 0)
        else:
            ov.WriteFile(conn.fileno(), data)
        return self._proactor._register(ov, conn, ov.getresult)


if __name__ == '__main__':
    @tulip.task
    def read_and_close(loop, f):
        with f:
            collected = []
            while True:
                s = yield from loop.sock_recv(f, 4096)
                if s == b'':
                    return b''.join(collected)
                collected.append(s)

    @tulip.task
    def write_and_close(loop, f, buf):
        with f:
            return (yield from loop.sock_sendall(f, buf))

    @tulip.task
    def main(loop):
        # start process which upper cases its input
        code = r'''if 1:
                       import os
                       os.write(2, b"starting\n")
                       while True:
                           s = os.read(0, 1024)
                           if not s:
                               break
                           s = s.upper()
                           while s:
                               n = os.write(1, s)
                               s = s[n:]
                       os.write(2, b"exiting\n")
                       '''
        p = Popen([sys.executable, '-c', code],
                  stdin=PIPE, stdout=PIPE, stderr=PIPE)

        # start tasks to write to and read from the process
        bytes_written = write_and_close(loop, p.stdin, b"hello world\n"*100000)
        stdout_data = read_and_close(loop, p.stdout)
        stderr_data = read_and_close(loop, p.stderr)

        # wait for tasks to finish and get the results
        bytes_written = yield from bytes_written
        stdout_data = yield from stdout_data
        stderr_data = yield from stderr_data

        # print results
        print('bytes_written:', bytes_written)
        print('stdout_data[:50]:', stdout_data[:50])
        print('len(stdout_data):', len(stdout_data))
        print('stderr_data:', stderr_data)

    loop = ProactorEventLoop()
    events.set_event_loop(loop)
    try:
        loop.run_until_complete(main(loop))
    finally:
        loop.close()

From benjamin at python.org Sun May 19 18:03:47 2013
From: benjamin at
python.org (Benjamin Peterson) Date: Sun, 19 May 2013 09:03:47 -0700 Subject: [Python-Dev] Async subprocesses on Windows with tulip In-Reply-To: References: Message-ID: Shouldn't this go to the python-tulip list? 2013/5/19 Richard Oudkerk : > Attached is a pretty trivial example of asynchronous interaction with a > python subprocess using tulip on Windows. It does not use transports or > protocols -- instead sock_recv() and sock_sendall() are used inside tasks. > > I am not sure what the plan is for dealing with subprocesses currently. > Shall I just add this to the examples folder for now? > > -- > Richard > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/benjamin%40python.org > -- Regards, Benjamin From drsalists at gmail.com Sun May 19 18:14:12 2013 From: drsalists at gmail.com (Dan Stromberg) Date: Sun, 19 May 2013 09:14:12 -0700 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: On Sat, May 18, 2013 at 11:41 PM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > On May 14, 2013, at 9:39 AM, Gregory P. Smith wrote: > > Bad: doctests. > > > I'm hoping that core developers don't get caught-up in the "doctests are > bad meme". > Don't doctests intended for CPython not work on Jython, Pypy, IronPython... I've been avoiding them for this reason. -------------- next part -------------- An HTML attachment was scrubbed... URL: From shibturn at gmail.com Sun May 19 18:47:44 2013 From: shibturn at gmail.com (Richard Oudkerk) Date: Sun, 19 May 2013 17:47:44 +0100 Subject: [Python-Dev] Async subprocesses on Windows with tulip In-Reply-To: References: Message-ID: On 19/05/2013 5:03pm, Benjamin Peterson wrote: > Shouldn't this go to the python-tulip list? Yes. Sorry about that. 
-- Richard From tseaver at palladion.com Sun May 19 22:13:53 2013 From: tseaver at palladion.com (Tres Seaver) Date: Sun, 19 May 2013 16:13:53 -0400 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: <1368974902254.f812247c@Nodemailer> References: <1368974902254.f812247c@Nodemailer> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/19/2013 10:48 AM, Guido van Rossum wrote: > Anyway, if you're doing arithmetic on enums you're doing it wrong. Hmm, bitwise operations, even? Tres. - -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iEYEARECAAYFAlGZMoAACgkQ+gerLs4ltQ79qwCgq6FWTl6ZDIDctBg69In47YB2 +FkAnj5cEyw1szQ8GCl6aQ9+aGKcwp3y =d/xt -----END PGP SIGNATURE----- From tseaver at palladion.com Sun May 19 22:17:26 2013 From: tseaver at palladion.com (Tres Seaver) Date: Sun, 19 May 2013 16:17:26 -0400 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/19/2013 12:14 PM, Dan Stromberg wrote: > On Sat, May 18, 2013 at 11:41 PM, Raymond Hettinger < > raymond.hettinger at gmail.com> wrote: > >> >> On May 14, 2013, at 9:39 AM, Gregory P. Smith >> wrote: >> >> Bad: doctests. >> >> >> I'm hoping that core developers don't get caught-up in the "doctests >> are bad meme". >> > > Don't doctests intended for CPython not work on Jython, Pypy, > IronPython... > > I've been avoiding them for this reason. "Naive" doctests depend a lot on repr, and hence tend to break even between minor releases of CPython. Folks who use a lot of them apply a great deal of elbow grease to working around that problem, e.g. 
through "renormalizing" the output:

https://github.com/zopefoundation/zope.testing/blob/master/src/zope/testing/renormalizing.txt


Tres.
- --
===================================================================
Tres Seaver +1 540-429-0999 tseaver at palladion.com
Palladion Software "Excellence by Design" http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with undefined - http://www.enigmail.net/

iEYEARECAAYFAlGZM1YACgkQ+gerLs4ltQ6zRACgx266WAzy1RDX0vOm7fThXzi5
zX4AoNyZFGBOML2XR4ZOecXwzG6XaHW+
=yGon
-----END PGP SIGNATURE-----

From tjreedy at udel.edu Sun May 19 22:21:01 2013
From: tjreedy at udel.edu (Terry Jan Reedy)
Date: Sun, 19 May 2013 16:21:01 -0400
Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum]
In-Reply-To:
References: <1368974902254.f812247c@Nodemailer>
Message-ID:

On 5/19/2013 4:13 PM, Tres Seaver wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 05/19/2013 10:48 AM, Guido van Rossum wrote:
>> Anyway, if you're doing arithmetic on enums you're doing it wrong.
>
> Hmm, bitwise operations, even?

Those are logic, not arithmetic as usually understood. (The fact that
one can do each with the other is beside the point.)

From demianbrecht at gmail.com Mon May 20 00:29:37 2013
From: demianbrecht at gmail.com (Demian Brecht)
Date: Sun, 19 May 2013 15:29:37 -0700
Subject: [Python-Dev] Why is documentation not inline?
Message-ID:

This is more out of curiosity than to spark change (although I
wouldn't argue against it): Does anyone know why it was decided to
document external to source files rather than inline?

When rapidly digging through source, it would be much more helpful to
see parameter docs than to have to find source lines (that can
easily be missed) to figure out the intention. Case in point, I've
been digging through cookiejar.py and request.py to figure out their
interactions.
When reading through build_opener, it took me a few minutes to figure out that each element of *handlers can be either an instance /or/ a class definition (I was looking at how to define a custom cookiejar for an HTTPCookieProcessor). Yes, I'm (now) aware that there's some documentation at the top of request.py, but it would have been helpful to have it right in the definition of build_opener. It seems like external docs is standard throughout the stdlib. Is there an actual reason for this? Thanks, -- Demian Brecht http://demianbrecht.github.com From solipsis at pitrou.net Mon May 20 00:32:58 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 20 May 2013 00:32:58 +0200 Subject: [Python-Dev] Why is documentation not inline? References: Message-ID: <20130520003258.23c57fa9@fsol> On Sun, 19 May 2013 15:29:37 -0700 Demian Brecht wrote: > This is more out of curiosity than to spark change (although I > wouldn't argue against it): Does anyone know why it was decided to > document external to source files rather than inline? > > When rapidly digging through source, it would be much more helpful to > see parameter docs than to either have to find source lines (that can > easily be missed) to figure out the intention. Case in point, I've > been digging through cookiejar.py and request.py to figure out their > interactions. When reading through build_opener, it took me a few > minutes to figure out that each element of *handlers can be either an > instance /or/ a class definition (I was looking at how to define a > custom cookiejar for an HTTPCookieProcessor). Yes, I'm (now) aware > that there's some documentation at the top of request.py, but it would > have been helpful to have it right in the definition of build_opener. > > It seems like external docs is standard throughout the stdlib. Is > there an actual reason for this? Have you seen the length of the documentation pages? 
Putting them inline in the stdlib module would make the code much harder to skim through. Regards Antoine. From benjamin at python.org Mon May 20 00:33:18 2013 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 19 May 2013 15:33:18 -0700 Subject: [Python-Dev] Why is documentation not inline? In-Reply-To: References: Message-ID: 2013/5/19 Demian Brecht : > This is more out of curiosity than to spark change (although I > wouldn't argue against it): Does anyone know why it was decided to > document external to source files rather than inline? > > When rapidly digging through source, it would be much more helpful to > see parameter docs than to either have to find source lines (that can > easily be missed) to figure out the intention. Case in point, I've > been digging through cookiejar.py and request.py to figure out their > interactions. When reading through build_opener, it took me a few > minutes to figure out that each element of *handlers can be either an > instance /or/ a class definition (I was looking at how to define a > custom cookiejar for an HTTPCookieProcessor). Yes, I'm (now) aware > that there's some documentation at the top of request.py, but it would > have been helpful to have it right in the definition of build_opener. > > It seems like external docs is standard throughout the stdlib. Is > there an actual reason for this? One is legacy. It certainly wasn't possible with the old LaTeX doc system. Another is that even if you do have API documentation inline, you have to do a lot of juggling in the external file to create the desired narrative structure, which may not be the same as the code layout in the file.
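As an aside on the build_opener behaviour quoted above (each element of *handlers may be a handler instance or a handler class), the following is a minimal sketch rather than anything taken from the thread. It assumes the Python 3 module names urllib.request and http.cookiejar; the variable names and the cookie_processors helper are illustrative only, and no network access takes place.

```python
import http.cookiejar
import urllib.request

# A cookie jar we create ourselves and want the opener to use.
jar = http.cookiejar.CookieJar()

# Passing an instance: the processor keeps a reference to our jar.
opener_with_instance = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(jar)
)

# Passing the class: build_opener instantiates it with no arguments,
# so the processor creates its own default CookieJar.
opener_with_class = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor
)


def cookie_processors(opener):
    """Return the HTTPCookieProcessor handlers installed on an opener."""
    return [
        handler
        for handler in opener.handlers
        if isinstance(handler, urllib.request.HTTPCookieProcessor)
    ]


instance_proc = cookie_processors(opener_with_instance)[0]
class_proc = cookie_processors(opener_with_class)[0]
print(instance_proc.cookiejar is jar)
print(class_proc.cookiejar is jar)
```

So a custom cookie jar has to go in through an instance; passing the bare class only works when the no-argument default is acceptable.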
-- Regards, Benjamin From ncoghlan at gmail.com Mon May 20 00:42:05 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 20 May 2013 08:42:05 +1000 Subject: [Python-Dev] Ordering keyword dicts In-Reply-To: <1368975441504.b153a948@Nodemailer> References: <1368975441504.b153a948@Nodemailer> Message-ID: On 20 May 2013 00:57, "Guido van Rossum" wrote: > > Hm. Wouldn't every call site be slowed down by checking for that flag? Yeah, I forgot about having to push everything through the tp_call slot, so we can't easily limit the ordering check to just those cases where the callable accepts arbitrary kwargs. Cheers, Nick. > > Sent from Mailbox > > > On Sun, May 19, 2013 at 7:42 AM, Nick Coghlan wrote: >> >> On Sun, May 19, 2013 at 11:01 PM, Antoine Pitrou wrote: >> > The main use case seems to be the OrderedDict constructor itself. >> > Otherwise, I can't think of any situation where I would've wanted it. >> >> I've had a couple related to populating other mappings where order >> matters, at least from a predictability and readability perspective, >> even if it's not strictly required from a standards compliance point >> of view (think writing XML attributes, etc). >> >> I quite liked the idea of a simple flag attribute on function objects >> that the interpreter checked, with a decorator in functools (or even >> the builtins) to set it. It's not a particularly elegant solution, but >> it would get the job done with minimal performance impact on existing >> functions. >> >> Cheers, >> Nick. >> >> -- >> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ncoghlan at gmail.com Mon May 20 00:46:53 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 20 May 2013 08:46:53 +1000 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <1368974902254.f812247c@Nodemailer> Message-ID: On 20 May 2013 06:25, "Terry Jan Reedy" wrote: > > On 5/19/2013 4:13 PM, Tres Seaver wrote: >> >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> On 05/19/2013 10:48 AM, Guido van Rossum wrote: >>> >>> Anyway, if you're doing arithmetic on enums you're doing it wrong. >> >> >> Hmm, bitwise operations, even? > > > Those are logic, not arithmetic as usually understood. (The fact that one can do each with the other is beside the point.) I consider those to be binary arithmetic, but it's a fair point. The word I really wanted was "comparison" anyway, since the main intended uses of enums are as flags, lookup keys and marker values. Cheers, Nick. > > > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From demianbrecht at gmail.com Mon May 20 00:47:18 2013 From: demianbrecht at gmail.com (Demian Brecht) Date: Sun, 19 May 2013 15:47:18 -0700 Subject: [Python-Dev] Why is documentation not inline? In-Reply-To: <20130520003258.23c57fa9@fsol> References: <20130520003258.23c57fa9@fsol> Message-ID: @benjamin: Ah, i see. I wasn't around pre-Sphinx. However, unless there's some custom build steps that I'm unaware of that may prevent it, it should still be relatively easy to maintain the desired narrative structure as long as the inline API docs are kept terse. @antoine: Sorry, I may not have been clear. I wasn't advocating the inclusion of the /entire/ doc pages inline. 
I'm advocating terse documentation for the stdlib APIs and parameters. Narrative documentation can (and should be) maintained externally, but could use autodoc to include the terse references when desired. This would ensure that the same docs are available (and consistent) when reading the documentation as well as when neck-deep in code. On Sun, May 19, 2013 at 3:32 PM, Antoine Pitrou wrote: > On Sun, 19 May 2013 15:29:37 -0700 > Demian Brecht wrote: >> This is more out of curiosity than to spark change (although I >> wouldn't argue against it): Does anyone know why it was decided to >> document external to source files rather than inline? >> >> When rapidly digging through source, it would be much more helpful to >> see parameter docs than to either have to find source lines (that can >> easily be missed) to figure out the intention. Case in point, I've >> been digging through cookiejar.py and request.py to figure out their >> interactions. When reading through build_opener, it took me a few >> minutes to figure out that each element of *handlers can be either an >> instance /or/ a class definition (I was looking at how to define a >> custom cookiejar for an HTTPCookieProcessor). Yes, I'm (now) aware >> that there's some documentation at the top of request.py, but it would >> have been helpful to have it right in the definition of build_opener. >> >> It seems like external docs is standard throughout the stdlib. Is >> there an actual reason for this? > > Have you seen the length of the documentation pages? Putting them > inline in the stdlib module would make the code much harder to skim > through. > > Regards > > Antoine. 
> > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/demianbrecht%40gmail.com -- Demian Brecht http://demianbrecht.github.com From prouleau001 at gmail.com Mon May 20 00:48:06 2013 From: prouleau001 at gmail.com (Pierre Rouleau) Date: Sun, 19 May 2013 18:48:06 -0400 Subject: [Python-Dev] [RELEASED] Python 2.7.5 In-Reply-To: <5194B390.6060700@v.loewis.de> References: <5194B390.6060700@v.loewis.de> Message-ID: Hi all, I just installed Python 2.7.5 64-bit on a Windows 7 64-bit OS computer. When I evaluate sys.maxint I don't get what I expected. I get this: Python 2.7.5 (default, May 15 2013, 22:44:16) [MSC v.1500 64 bit (AMD64)] on win32 Type "copyright", "credits" or "license()" for more information. >>> import sys >>> sys.maxint 2147483647 >>> import platform >>> platform.machine() 'AMD64' >>> import os >>> os.environ['PROCESSOR_ARCHITECTURE'] 'AMD64' >>> Should I not get a 64-bit integer maxint (9223372036854775807) for sys.maxint? Or is there something I am missing here? Thanks! / Pierre Rouleau On Thu, May 16, 2013 at 6:23 AM, "Martin v. Löwis" wrote: > On 16.05.13 10:42, Ben Hoyt wrote: > > > FYI, I tried this just now with Python 2.7.4 running, and the > > installer nicely tells you that "some files that need to be updated > > are currently in use ... the following applications are using files, > > please close them and click Retry ... python.exe (Process Id: 5388)". > > > > So you can't do it while python.exe is running, but at least it > > notifies you and gives you the option to retry. Good work, whoever did > > this installer. > > This specific feature is part of the MSI technology itself, so the honor > goes to Microsoft in this case.
They also have an advanced feature where > the installer can tell the running application to terminate, and then > restart after installation (since Vista, IIRC). Unfortunately, this > doesn't apply to Python, as a "safe restart" is typically not feasible. > > FWIW, I'm the one who put together the Python installer. > > Regards, > Martin > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/prouleau001%40gmail.com > -- /Pierre -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Mon May 20 00:56:51 2013 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 19 May 2013 15:56:51 -0700 Subject: [Python-Dev] [RELEASED] Python 2.7.5 In-Reply-To: References: <5194B390.6060700@v.loewis.de> Message-ID: 2013/5/19 Pierre Rouleau : > Hi all, > > I just installed Python 2.7.5 64-bit () on a Windows 7 64-bit OS computer. > When I evaluate sys.maxint I don't get what I was expected. I get this: > > Python 2.7.5 (default, May 15 2013, 22:44:16) [MSC v.1500 64 bit (AMD64)] on > win32 > Type "copyright", "credits" or "license()" for more information. >>>> import sys >>>> sys.maxint > 2147483647 >>>> import platform >>>> platform.machine() > 'AMD64' >>>> import os >>>> os.environ['PROCESSOR_ARCHITECTURE'] > 'AMD64' >>>> > > > Should I not get a 64-bit integer maxint (9223372036854775807) for > sys.maxint ? This is correct. sizeof(long) != sizeof(void *) on Win64, and since Python ints are platform longs, you get the maximum value of a 32-bit int. Check sys.maxsize for comparison.
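A rough way to see the long-versus-pointer distinction described above from Python itself is sketched below. It is an illustration written for Python 3, where sys.maxint no longer exists, so the old sys.maxint value is recomputed from the width of the C long; the helper name c_type_widths is made up for this sketch.

```python
import ctypes
import sys


def c_type_widths():
    """Return the width in bits of the platform's C long and pointer types."""
    return {
        "long": ctypes.sizeof(ctypes.c_long) * 8,
        "pointer": ctypes.sizeof(ctypes.c_void_p) * 8,
    }


widths = c_type_widths()

# sys.maxsize is the largest Py_ssize_t, which follows the pointer width.
maxsize_bits = sys.maxsize.bit_length() + 1

# Python 2's sys.maxint was the largest C long; recompute that value here.
long_max = 2 ** (widths["long"] - 1) - 1

print(widths, maxsize_bits, long_max)
```

On an LLP64 platform such as 64-bit Windows the two widths differ (32-bit long, 64-bit pointer), whereas on LP64 platforms both are 64 bits, which is why the same Python 2 code reports different sys.maxint values.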
-- Regards, Benjamin From dreamingforward at gmail.com Mon May 20 01:22:22 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Sun, 19 May 2013 16:22:22 -0700 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <1368974902254.f812247c@Nodemailer> Message-ID: On Sun, May 19, 2013 at 1:13 PM, Tres Seaver wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 05/19/2013 10:48 AM, Guido van Rossum wrote: >> Anyway, if you're doing arithmetic on enums you're doing it wrong. > > Hmm, bitwise operations, even? I think it's rather pointless to do bitwise operations on python enums. We're not that close to the machine. MarkJ Tacoma, Washington From prouleau001 at gmail.com Mon May 20 01:23:23 2013 From: prouleau001 at gmail.com (Pierre Rouleau) Date: Sun, 19 May 2013 19:23:23 -0400 Subject: [Python-Dev] [RELEASED] Python 2.7.5 In-Reply-To: References: <5194B390.6060700@v.loewis.de> Message-ID: OK thanks, Benjamin, you are correct: sys.maxsize is 2**63-1 on it. I was under the impression that Python was using int64_t for the implementation of Win64-based integers, most probably because I've seen discussions about 64-bit Python, and those posts were most probably in the scope of some Unix-type platform. Regards, On Sun, May 19, 2013 at 6:56 PM, Benjamin Peterson wrote: > 2013/5/19 Pierre Rouleau : > > Hi all, > > > > I just installed Python 2.7.5 64-bit () on a Windows 7 64-bit OS > computer. > > When I evaluate sys.maxint I don't get what I was expected. I get this: > > > > Python 2.7.5 (default, May 15 2013, 22:44:16) [MSC v.1500 64 bit > (AMD64)] on > > win32 > > Type "copyright", "credits" or "license()" for more information. > >>>> import sys > >>>> sys.maxint > > 2147483647 > >>>> import platform > >>>> platform.machine() > > 'AMD64' > >>>> import os > >>>> os.environ['PROCESSOR_ARCHITECTURE'] > > 'AMD64' > >>>> > > > > > > Should I not get a 64-bit integer maxint (9223372036854775807) for > > sys.maxint ?
> > This is correct. sizeof(long) != sizeof(void *) on Win64, and size > Python int's are platform longs, you get the maxsize of a 32-bit int. > Check sys.maxsize for comparison. > > > > -- > Regards, > Benjamin > -- /Pierre -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Mon May 20 01:27:33 2013 From: greg at krypto.org (Gregory P. Smith) Date: Sun, 19 May 2013 16:27:33 -0700 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: On Sat, May 18, 2013 at 11:41 PM, Raymond Hettinger < raymond.hettinger at gmail.com> wrote: > > On May 14, 2013, at 9:39 AM, Gregory P. Smith wrote: > > Bad: doctests. > > > I'm hoping that core developers don't get caught-up in the "doctests are > bad meme". > So long as doctests insist on comparing the repr of things being the number one practice that people use when writing them there is no other position I can hold on the matter. reprs are not stable and never have been. ordering changes, hashes change, ids change, pointer values change, wording and presentation of things change. none of those side effect behaviors were ever part of the public API to be depended on. That one can write doctests that don't depend on such things as the repr doesn't ultimately matter because the easiest thing to do, as encouraged by examples that are pasted from an interactive interpreter session into docs, is to have the interactive interpreter show the repr and not add code to check things in a accurate-for-testing manner that would otherwise make the documentation harder for a human to read. Instead, we should be clear about their primary purpose which is to test > the examples given in docstrings. In many cases, there is a great deal > of benefit to docstrings that have worked-out examples (see the docstrings > in the decimal module for example). 
In such cases it is also worthwhile > to make sure those examples continue to match reality. Doctests are > a vehicle for such assurance. In other words, doctests have a perfectly > legitimate use case. > I really do applaud the goal of keeping examples in documentation up to date. But doctest as it is today is the wrong approach to that. A repr mismatch does not mean the example is out of date. We should continue to encourage users to make thorough unit tests > and to leave doctests for documentation. That said, it should be > recognized that some testing is better than no testing. And doctests > may be attractive in that regard because it is almost effortless to > cut-and-paste a snippet from the interactive prompt. That isn't a > best practice, but it isn't a worst practice either. > Not quite, they at least tested something (yay!) but it is uncomfortably close to a worst practice. It means someone else needs to come understand the body of code containing this doctest when they make an unrelated change that triggered a behavior change as a side effect that the doctested code may or may not actually depend on but does not actually declare its intent one way or another for the purposes of being a readable example rather than accurate test. bikeshed colors: If doctest were never called a test but instead were called docchecker to not imply any testing aspect that might've helped (too late? the cat's out of the bag). Or if it never compared anything but simply ran the example code to generate and update the doc examples from the statements with the current actual results of execution instead of doing string comparisons... (ie: more of an documentation example "keep up to date" tool) Another meme that I hope dispel is the notion that the core developers > are free to break user code (such as doctests) if they believe the > users aren't coding in accordance with best practices. 
Our goal is to > improve their lives with our modifications, not to make their lives > more difficult. > Educating users how to apply best practices and making that easier for them every step of the way is a primary goal. Occasionally we'll have to do something annoying in the process but we do try to limit that. In my earlier message I suggested that someone improve doctest to not do dumb string comparisons of reprs. I still think that is a good goal if doctest is going to continue to be promoted. It would help alleviate many of the issues with doctests and bring them more in line with the issues many people's regular unittests have. As Tres already showed in an example, individual doctest using projects jump through hoops to do some of that today; centralizing saner repr comparisons for less false failures as an actual doctest feature just makes sense. Successful example: We added a bunch of new comparison methods to unittest in 2.7 that make it much easier to write tests that don't depend on implementation details such as ordering. Many users prefer to use those new features; even with older Python's via unittest2 on pypi. It doesn't mean users always write good tests, but a higher percentage of tests written are more future proof than they were before because it became easier. Currently, we face an adoption problem with Python 3. At PyCon, > an audience of nearly 2500 people said they had tried Python 3 > but weren't planning to convert to it in production code. All of the > coredevs are working to make Python 3 more attractive than Python 2, > but we also have to be careful to not introduce obstacles to conversion. > Breaking tests makes it much harder to convert (especially because > people need to rely on their tests to see if the conversion was > successful). > Idea: I don't believe anybody has written a fixer for lib2to3 that applies fixers to doctests. That'd be an interesting project for someone. 
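To make the repr-comparison concern raised earlier in this message concrete, here is a small sketch of doctest examples written so that they do not depend on repr ordering. It is only an illustration: the function unique_letters is hypothetical, and the examples are run through doctest's parser and runner directly rather than through testmod.

```python
import doctest


def unique_letters(word):
    """Return the set of distinct letters in word.

    Echoing the bare set would tie the example to set repr ordering,
    so the examples normalise the result or compare by value instead.

    >>> sorted(unique_letters("banana"))
    ['a', 'b', 'n']
    >>> unique_letters("banana") == {"a", "b", "n"}
    True
    """
    return set(word)


# Parse the docstring into a DocTest and run it with an explicit namespace.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    unique_letters.__doc__,
    {"unique_letters": unique_letters},
    "unique_letters",
    None,
    0,
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

The trade-off is the one discussed in this thread: the examples are slightly less natural to read than a pasted interpreter session, but they no longer fail when only the presentation of the result changes.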
Now you've got me wondering what Python would be like if repr, `` and __repr__ never existed as language features. Upon first thoughts, I actually don't see much downside (no, i'm not advocating making that change). Something to ponder. -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Mon May 20 01:31:33 2013 From: benjamin at python.org (Benjamin Peterson) Date: Sun, 19 May 2013 16:31:33 -0700 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: 2013/5/19 Gregory P. Smith : > Idea: I don't believe anybody has written a fixer for lib2to3 that applies > fixers to doctests. That'd be an interesting project for someone. 2to3 can operate on doctests, though it doesn't do anything different to them than it does to normal sourcecode. -- Regards, Benjamin From prouleau001 at gmail.com Mon May 20 01:37:46 2013 From: prouleau001 at gmail.com (Pierre Rouleau) Date: Sun, 19 May 2013 19:37:46 -0400 Subject: [Python-Dev] [RELEASED] Python 2.7.5 In-Reply-To: References: <5194B390.6060700@v.loewis.de> Message-ID: On that topic of bitness for 64-bit platforms, would it not be better for CPython to be written such that it uses the same 64-bit strategy on all 64-bit platforms, regardless of the OS? As it is now, Python running on 64-bit Windows behaves differently (in terms of bits for the Python's integer) than it is behaving in other platforms. I assume that the Python C code is using the type 'long' instead of something like the C99 int64_t. Since Microsoft is using the LLP64 model and everyone else is using the LP64, code using the C 'long' type would mean something different on Windows than Unix-like platforms. Isn't that unfortunate? Would it not be better to hide the difference at Python level? 
Or is it done this way to allow existing C extension modules to work the way they were and request Python code that depends on integer sizes to check sys.maxint? Also, I would imagine that the performance delta between a Windows 32-bit Python and a 64-bit Python is not as big as it would be on a Unix computer. As far as I can see, 64-bit Python on a 64-bit Windows OS has a larger address space and probably does not benefit from anything else. Does anyone have data on this? Thanks -- /Pierre -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Mon May 20 01:41:28 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 20 May 2013 01:41:28 +0200 Subject: [Python-Dev] [RELEASED] Python 2.7.5 References: <5194B390.6060700@v.loewis.de> Message-ID: <20130520014128.3b7c5300@fsol> On Sun, 19 May 2013 19:37:46 -0400 Pierre Rouleau wrote: > On that topic of bitness for 64-bit platforms, would it not be better for > CPython to be written such that it uses the same 64-bit strategy on all > 64-bit platforms, regardless of the OS? > > As it is now, Python running on 64-bit Windows behaves differently (in > terms of bits for the Python's integer) than it is behaving in other > platforms. I assume that the Python C code is using the type 'long' > instead of something like the C99 int64_t. Since Microsoft is using the > LLP64 model and everyone else is using the LP64, code using the C 'long' > type would mean something different on Windows than Unix-like platforms. > Isn't that unfortunate? Well, it's Microsoft's choice. But from a Python point of view, which C type a Python int maps to is of little relevance. Moreover, the development version is 3.4, and in Python 3 the int type is a variable-length integer type (sys.maxint doesn't exist anymore). So this discussion is largely moot now. Regards Antoine.
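The point about Python 3 can be checked with a few lines. This is a minimal sketch that assumes a Python 3 interpreter; the variable names are arbitrary.

```python
import sys

# Python 3 dropped sys.maxint; sys.maxsize (the largest container index) remains.
has_maxint = hasattr(sys, "maxint")

# Going past sys.maxsize neither overflows nor changes the type.
beyond = sys.maxsize + 1
squared = beyond * beyond

print(has_maxint, type(beyond).__name__, squared.bit_length())
```

Since the single int type grows as needed, the width of the underlying C long is not observable from ordinary Python 3 arithmetic.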
From prouleau001 at gmail.com Mon May 20 01:47:20 2013 From: prouleau001 at gmail.com (Pierre Rouleau) Date: Sun, 19 May 2013 19:47:20 -0400 Subject: [Python-Dev] [RELEASED] Python 2.7.5 In-Reply-To: <20130520014128.3b7c5300@fsol> References: <5194B390.6060700@v.loewis.de> <20130520014128.3b7c5300@fsol> Message-ID: On Sun, May 19, 2013 at 7:41 PM, Antoine Pitrou wrote: > On Sun, 19 May 2013 19:37:46 -0400 > Pierre Rouleau wrote: > > > On that topic of bitness for 64-bit platforms, would it not be better for > > CPython to be written such that it uses the same 64-bit strategy on all > > 64-bit platforms, regardless of the OS? > > > > As it is now, Python running on 64-bit Windows behaves differently (in > > terms of bits for the Python's integer) than it is behaving in other > > platforms. I assume that the Python C code is using the type 'long' > > instead of something like the C99 int64_t. Since Microsoft is using the > > LLP64 model and everyone else is using the LP64, code using the C 'long' > > type would mean something different on Windows than Unix-like platforms. > > Isn't that unfortunate? > > Well, it's Microsoft's choice. But from a Python point of view, which C > type a Python int maps to is of little relevance. > Fair > > Moreover, the development version is 3.4, and in Python 3 the int > type is a variable-length integer type (sys.maxint doesn't exist > anymore). So this discussion is largely moot now. > > Good to know. Too bad there still are libraries not supporting Python 3. Thanks. > Regards > > Antoine. > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/prouleau001%40gmail.com > -- /Pierre -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Mon May 20 01:51:07 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 20 May 2013 09:51:07 +1000 Subject: [Python-Dev] Why is documentation not inline? In-Reply-To: References: <20130520003258.23c57fa9@fsol> Message-ID: On 20 May 2013 08:51, "Demian Brecht" wrote: > > @benjamin: Ah, i see. I wasn't around pre-Sphinx. However, unless > there's some custom build steps that I'm unaware of that may prevent > it, it should still be relatively easy to maintain the desired > narrative structure as long as the inline API docs are kept terse. That's what docstrings are for - abbreviated docs for use when reading the code and at the interactive prompt. The prose docs are designed to be a more discursive introduction to the full details of each operation, whereas docstrings are usually written more to provide someone that already knows what the function does with a reminder of the details. Cheers, Nick. > > @antoine: Sorry, I may not have been clear. I wasn't advocating the > inclusion of the /entire/ doc pages inline. I'm advocating terse > documentation for the stdlib APIs and parameters. Narrative > documentation can (and should be) maintained externally, but could use > autodoc to include the terse references when desired. This would > ensure that the same docs are available (and consistent) when reading > the documentation as well as when neck-deep in code. > > On Sun, May 19, 2013 at 3:32 PM, Antoine Pitrou wrote: > > On Sun, 19 May 2013 15:29:37 -0700 > > Demian Brecht wrote: > >> This is more out of curiosity than to spark change (although I > >> wouldn't argue against it): Does anyone know why it was decided to > >> document external to source files rather than inline? > >> > >> When rapidly digging through source, it would be much more helpful to > >> see parameter docs than to either have to find source lines (that can > >> easily be missed) to figure out the intention. 
Case in point, I've > >> been digging through cookiejar.py and request.py to figure out their > >> interactions. When reading through build_opener, it took me a few > >> minutes to figure out that each element of *handlers can be either an > >> instance /or/ a class definition (I was looking at how to define a > >> custom cookiejar for an HTTPCookieProcessor). Yes, I'm (now) aware > >> that there's some documentation at the top of request.py, but it would > >> have been helpful to have it right in the definition of build_opener. > >> > >> It seems like external docs is standard throughout the stdlib. Is > >> there an actual reason for this? > > > > Have you seen the length of the documentation pages? Putting them > > inline in the stdlib module would make the code much harder to skim > > through. > > > > Regards > > > > Antoine. > > > > > > _______________________________________________ > > Python-Dev mailing list > > Python-Dev at python.org > > http://mail.python.org/mailman/listinfo/python-dev > > Unsubscribe: http://mail.python.org/mailman/options/python-dev/demianbrecht%40gmail.com > > > > -- > Demian Brecht > http://demianbrecht.github.com > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Mon May 20 01:51:21 2013 From: greg at krypto.org (Gregory P. Smith) Date: Sun, 19 May 2013 16:51:21 -0700 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: On May 19, 2013 4:31 PM, "Benjamin Peterson" wrote: > > 2013/5/19 Gregory P. Smith : > > Idea: I don't believe anybody has written a fixer for lib2to3 that applies > > fixers to doctests. That'd be an interesting project for someone. 
> > 2to3 can operate on doctests, though it doesn't do anything different > to them than it does to normal sourcecode. > Oh cool. I didn't realize that already existed! > > -- > Regards, > Benjamin -------------- next part -------------- An HTML attachment was scrubbed... URL: From shibturn at gmail.com Mon May 20 02:04:31 2013 From: shibturn at gmail.com (Richard Oudkerk) Date: Mon, 20 May 2013 01:04:31 +0100 Subject: [Python-Dev] [RELEASED] Python 2.7.5 In-Reply-To: References: <5194B390.6060700@v.loewis.de> <20130520014128.3b7c5300@fsol> Message-ID: On 20/05/2013 12:47am, Pierre Rouleau wrote: > Moreover, the development version is 3.4, and in Python 3 the int > type is a variable-length integer type (sys.maxint doesn't exist > anymore). So this discussion is largely moot now. > > > Good to know. Too bad there still are libraries not supporting Python > 3. Thanks. Even in Python 2, if the result of arithmetic on ints would overflow, it automatically gets promoted to a long integer, which is variable-length. >>> 2**128 340282366920938463463374607431768211456L >>> type(2), type(2**128) (<type 'int'>, <type 'long'>) So the size of an int is pretty much irrelevant. -- Richard From ned at nedbatchelder.com Mon May 20 02:04:03 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sun, 19 May 2013 20:04:03 -0400 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <1368974902254.f812247c@Nodemailer> Message-ID: <51996873.6020404@nedbatchelder.com> On 5/19/2013 7:22 PM, Mark Janssen wrote: > On Sun, May 19, 2013 at 1:13 PM, Tres Seaver wrote: >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA1 >> >> On 05/19/2013 10:48 AM, Guido van Rossum wrote: >>> Anyway, if you're doing arithmetic on enums you're doing it wrong. >> Hmm, bitwise operations, even? > I think it's rather pointless to do bitwise operations on python > enums. We're not that close to the machine.
It makes sense if the enums represent bit-oriented values that will be used close to the machine. Python is used in many disciplines. --Ned. > MarkJ > Tacoma, Washington > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ned%40nedbatchelder.com > From solipsis at pitrou.net Mon May 20 02:14:25 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 20 May 2013 02:14:25 +0200 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] References: <1368974902254.f812247c@Nodemailer> <51996873.6020404@nedbatchelder.com> Message-ID: <20130520021425.4a544ae2@fsol> On Sun, 19 May 2013 20:04:03 -0400 Ned Batchelder wrote: > On 5/19/2013 7:22 PM, Mark Janssen wrote: > > On Sun, May 19, 2013 at 1:13 PM, Tres Seaver wrote: > >> -----BEGIN PGP SIGNED MESSAGE----- > >> Hash: SHA1 > >> > >> On 05/19/2013 10:48 AM, Guido van Rossum wrote: > >>> Anyway, if you're doing arithmetic on enums you're doing it wrong. > >> Hmm, bitwise operations, even? > > I think it's rather pointless to do bitwise operations on python > > enums. We're not that close to the machine. > > It makes sense if the enums represent bit-oriented values that will be > used close to the machine. Python is used in many disciplines. Then it's up to the library writer to not use enums in that case. (assuming the performance of bitwise operations is critical here, which I doubt) Regards Antoine. 
From ncoghlan at gmail.com Mon May 20 02:24:08 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 20 May 2013 10:24:08 +1000 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: <20130520021425.4a544ae2@fsol> References: <1368974902254.f812247c@Nodemailer> <51996873.6020404@nedbatchelder.com> <20130520021425.4a544ae2@fsol> Message-ID: On Mon, May 20, 2013 at 10:14 AM, Antoine Pitrou wrote: > On Sun, 19 May 2013 20:04:03 -0400 > Ned Batchelder wrote: >> On 5/19/2013 7:22 PM, Mark Janssen wrote: >> > On Sun, May 19, 2013 at 1:13 PM, Tres Seaver wrote: >> >> -----BEGIN PGP SIGNED MESSAGE----- >> >> Hash: SHA1 >> >> >> >> On 05/19/2013 10:48 AM, Guido van Rossum wrote: >> >>> Anyway, if you're doing arithmetic on enums you're doing it wrong. >> >> Hmm, bitwise operations, even? >> > I think it's rather pointless to do bitwise operations on python >> > enums. We're not that close to the machine. >> >> It makes sense if the enums represent bit-oriented values that will be >> used close to the machine. Python is used in many disciplines. > > Then it's up to the library writer to not use enums in that case. > (assuming the performance of bitwise operations is critical here, which > I doubt) This is the point I was trying to make: once you use IntEnum (as you would in any case where you need bitwise operators), Enum gets out of the way for everything other than __str__, __repr__, and one other slot (that escapes me for the moment...). The metaclass does extra work at definition time so there shouldn't be any runtime overhead - the slots should be inherited directly from the non-Enum parent. Cheers, Nick. 
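A short sketch of the IntEnum behaviour described above follows. It assumes the enum module in the form that eventually shipped with Python 3.4, not the reference implementation under discussion at the time, and the Permission class with its values is hypothetical.

```python
from enum import IntEnum


class Permission(IntEnum):
    """Hypothetical bit-oriented values, for illustration only."""

    EXECUTE = 1
    WRITE = 2
    READ = 4


# The bitwise operators come straight from int, so the result is a plain int
# rather than a Permission member.
mask = Permission.READ | Permission.WRITE

can_write = bool(mask & Permission.WRITE)
can_execute = bool(mask & Permission.EXECUTE)

print(repr(Permission.READ), mask, can_write, can_execute)
```

Members still compare equal to their integer values and can be looked up by value, so the enum layer mostly shows up in str() and repr() of the members.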
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tim.peters at gmail.com Mon May 20 02:28:42 2013 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 19 May 2013 19:28:42 -0500 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: [Raymond Hettinger] > I'm hoping that core developers don't get caught-up in the "doctests are bad > meme". > > Instead, we should be clear about their primary purpose which is to test > the examples given in docstrings. I disagree. > In many cases, there is a great deal of benefit to docstrings that have > worked-out examples (see the docstrings in the decimal module for > example). In such cases it is also worthwhile to make sure those examples > continue to match reality. Doctests are a vehicle for such assurance. That's representative of how doctest was developed: to help me in keeping some well-defined mathematical functions working as intended. It still excels in that particular area (a few examples to illustrate normal cases, and a few more to illustrate well-defined end and error cases - and that's all there _is_ to be tested). > In other words, doctests have a perfectly legitimate use case. But more than just one ;-) Another great use has nothing to do with docstrings: using an entire file as "a doctest". This encourages writing lots of text explaining what you're doing,. with snippets of code interspersed to illustrate that the code really does behave in the ways you've claimed. > We should continue to encourage users to make thorough unit tests > and to leave doctests for documentation. I'd rather encourage users to turn their brains on when writing doctest files - and when writing unit tests. I've lost count of how many times I've seen a unit test fail, then stared helplessly at the unit test code just trying to figure out what the author thought they were doing. 
A lot of comments in the test code could have helped that, but - outside of doctest-based tests - there's typically very little explanatory text in testing code. picking-your-poison-ly y'rs - tim From demianbrecht at gmail.com Mon May 20 03:19:08 2013 From: demianbrecht at gmail.com (Demian Brecht) Date: Sun, 19 May 2013 18:19:08 -0700 Subject: [Python-Dev] Why is documentation not inline? In-Reply-To: References: <20130520003258.23c57fa9@fsol> Message-ID: @nick: Yes, I realize what docstrings are for (I should have used that term rather than "inline" docs, my bad there :)). I think the problem that I've run into is simply inconsistencies in methods of documenting code (and the few times that it would have been helpful, what I was looking at had not been authored using docstrings). Is the usage of docstrings a requirement (or a strong suggestion) for new commits (I didn't see anything while reading the submission guidelines)? If not, would it perhaps be a worthy addition? On Sun, May 19, 2013 at 4:51 PM, Nick Coghlan wrote: > > On 20 May 2013 08:51, "Demian Brecht" wrote: >> >> @benjamin: Ah, i see. I wasn't around pre-Sphinx. However, unless >> there's some custom build steps that I'm unaware of that may prevent >> it, it should still be relatively easy to maintain the desired >> narrative structure as long as the inline API docs are kept terse. > > That's what docstrings are for - abbreviated docs for use when reading the > code and at the interactive prompt. > > The prose docs are designed to be a more discursive introduction to the full > details of each operation, whereas docstrings are usually written more to > provide someone that already knows what the function does with a reminder of > the details. > > Cheers, > Nick. > >> >> @antoine: Sorry, I may not have been clear. I wasn't advocating the >> inclusion of the /entire/ doc pages inline. I'm advocating terse >> documentation for the stdlib APIs and parameters. 
Narrative >> documentation can (and should be) maintained externally, but could use >> autodoc to include the terse references when desired. This would >> ensure that the same docs are available (and consistent) when reading >> the documentation as well as when neck-deep in code. >> >> On Sun, May 19, 2013 at 3:32 PM, Antoine Pitrou >> wrote: >> > On Sun, 19 May 2013 15:29:37 -0700 >> > Demian Brecht wrote: >> >> This is more out of curiosity than to spark change (although I >> >> wouldn't argue against it): Does anyone know why it was decided to >> >> document external to source files rather than inline? >> >> >> >> When rapidly digging through source, it would be much more helpful to >> >> see parameter docs than to either have to find source lines (that can >> >> easily be missed) to figure out the intention. Case in point, I've >> >> been digging through cookiejar.py and request.py to figure out their >> >> interactions. When reading through build_opener, it took me a few >> >> minutes to figure out that each element of *handlers can be either an >> >> instance /or/ a class definition (I was looking at how to define a >> >> custom cookiejar for an HTTPCookieProcessor). Yes, I'm (now) aware >> >> that there's some documentation at the top of request.py, but it would >> >> have been helpful to have it right in the definition of build_opener. >> >> >> >> It seems like external docs is standard throughout the stdlib. Is >> >> there an actual reason for this? >> > >> > Have you seen the length of the documentation pages? Putting them >> > inline in the stdlib module would make the code much harder to skim >> > through. >> > >> > Regards >> > >> > Antoine. 
>> > >> > >> > _______________________________________________ >> > Python-Dev mailing list >> > Python-Dev at python.org >> > http://mail.python.org/mailman/listinfo/python-dev >> > Unsubscribe: >> > http://mail.python.org/mailman/options/python-dev/demianbrecht%40gmail.com >> >> >> >> -- >> Demian Brecht >> http://demianbrecht.github.com >> _______________________________________________ >> Python-Dev mailing list >> Python-Dev at python.org >> http://mail.python.org/mailman/listinfo/python-dev >> Unsubscribe: >> http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -- Demian Brecht http://demianbrecht.github.com From greg.ewing at canterbury.ac.nz Mon May 20 03:35:49 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 20 May 2013 13:35:49 +1200 Subject: [Python-Dev] Ordering keyword dicts In-Reply-To: References: <1368975441504.b153a948@Nodemailer> Message-ID: <51997DF5.9080902@canterbury.ac.nz> Joao S. O. Bueno wrote: > Actually, when I was thinking on the subject I came to the same idea, of having > some functions marked differently so they would use a different call mechanism - > but them I wondered around having a different opcode for the ordered-dict calls. > > Would that be feasible? No, because the callee is the only one that knows whether it requires its keyword args to be ordered. In fact, not even the callee might know at the time of the call. Consider a function that takes **kwds and passes them on to another function that requires ordered keywords. -- Greg From guido at python.org Mon May 20 03:46:40 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 19 May 2013 18:46:40 -0700 Subject: [Python-Dev] What if we didn't have repr? Message-ID: On Sun, May 19, 2013 at 4:27 PM, Gregory P. Smith wrote: > Now you've got me wondering what Python would be like if repr, `` and > __repr__ never existed as language features. 
Upon first thoughts, I actually
> don't see much downside (no, I'm not advocating making that change).
> Something to ponder.

I have pondered it many times, although usually in the form "Why do we
need both str and repr?"

Unfortunately I always come back to the same issue: I really want
print() of a string to write just the characters of the string (without
quotes), but I also really want the >>> prompt to write the string with
quotes (and escapes). Moreover, these are just two examples of the
different use cases -- repr() is more useful whenever you are writing a
value for a debugging purpose (e.g. when logging), and str() is more
useful when writing a value as "output" of a program.

One could argue that the only type for which it makes sense to
distinguish between the two is str itself -- indeed I rarely define
different __repr__ and __str__ functions for new classes I create (but I
do note that PEP 435 does define them differently for enum members).

But for the str type, it's pretty important to have str(s) equal to s,
and it's also pretty important to have a way to produce a string literal
from a string value. And it would be annoying if we only had
str()/__str__() as a general protocol and repr() was just a string
method -- imagine the number of times people would be calling s.repr()
in order to have unambiguous debug output only to get a traceback
because s is None or some other non-str object...

So it looks like we really need both str(x) and repr(x). But maybe we
only need the __repr__ protocol? str(x) could just special-case its own
type, and use repr(x) (or x.__repr__(), which is the same) in all other
cases. The __repr__() method on the string type would do just what it
does today. But there would not be a __str__() protocol at all.

That would reduce some confusion and make the language spec a tiny bit
smaller, and it might stop people from proposing/requesting that str()
of a list should return something different than its repr().
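
A small sketch of the distinction described above, using only built-ins. The variable names are illustrative. It shows `str()` of a string returning the characters themselves, `repr()` producing a quoted and escaped literal, and `repr()` working uniformly on non-string objects such as `None`.

```python
s = "spam\n"

# str() of a string is the string itself: this is what print() writes.
as_output = str(s)

# repr() yields a quoted, escaped literal: this is what the >>> prompt shows.
as_literal = repr(s)

# The literal can be turned back into an equal string value.
round_tripped = eval(as_literal)

# Because repr() is a built-in function rather than a str method, it also
# works for values that are not strings, which helps in debug output.
debug_fields = [repr(value) for value in (s, None, 42)]
```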
It also would make it a little harder for some classes (like enums) to do something nicer when printed. But IIRC there are almost no builtin types that use this feature. Personally I think that "Color.red" still looks like debug output, intended for the developer of the program, not for its user. If I wanted to show a Color to an end user of my program, I'd be printing x.name anyway. And for debugging, "Color.red" suits me fine as well -- it even fulfills the (always informal!) rule of thumb that the repr() of an object should resemble an expression that has that object as a value better than the repr() currently defined in the PEP. After all, if I really wanted to know what was inside, I could always print x.value... Please don't see this as a proposal to change the language. Like Greg, I'm not advocating, just pondering. (With the exception that if I was allowed to use the time machine to go back a couple of weeks, I'd adjust PEP 435 to define str(x) as x.name, e.g. "red", and repr(x) as what is currently defined as str(x), e.g. "Color.red".) -- --Guido van Rossum (python.org/~guido) From steve at pearwood.info Mon May 20 04:33:44 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 20 May 2013 12:33:44 +1000 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: <51998B88.2020600@pearwood.info> On 20/05/13 09:27, Gregory P. Smith wrote: > On Sat, May 18, 2013 at 11:41 PM, Raymond Hettinger < > raymond.hettinger at gmail.com> wrote: > >> >> On May 14, 2013, at 9:39 AM, Gregory P. Smith wrote: >> >> Bad: doctests. >> >> >> I'm hoping that core developers don't get caught-up in the "doctests are >> bad meme". >> > > So long as doctests insist on comparing the repr of things being the number > one practice that people use when writing them there is no other position I > can hold on the matter. reprs are not stable and never have been. 
I think this *massively* exaggerates the "problem" with doc tests. I make heavy use of them, and have no problem writing doc tests that work in code running over multiple versions, including from 2.4 through 3.3. Objects that I write myself, I control the repr and can make it as stable as I wish. Many built-in types also have stable reprs. The repr for small ints is not going to change, the repr for floats like 0.5, 0.25, 0.125 etc. are stable and predictable, lists and tuples and strings all have stable well-defined reprs. Dicts are a conspicuous counter-example, but there are trivial work-arounds. Doc tests are not limited to a simple-minded "compare the object's repr". You can write as much, or as little, scaffolding around the test as you need. If the scaffolding becomes too large, that's a sign that the test doesn't belong in documentation and should be moved out, perhaps into a unit test, or perhaps into a separate "literate testing" document that can be as big as necessary without overwhelming the doc string. > ordering changes, hashes change, ids change, pointer values change, > wording and presentation of things change. none of those side effect > behaviors were ever part of the public API to be depended on. Then don't write doctests that depend on those things. It really is that simple. There's no rule that says doctests have to test the entire API. Doctests in docstrings are *documentation first*, so you write tests that make good documentation. The fact that things that are not stable parts of the API can be tested is independent of the framework you use to do the testing. If I, as an ignorant and foolish developer, wrote a unit test like this: class MyDumbTest(unittest.TestCase): def testSpamRepr(self): x = Spam(arg) self.assertEquals(repr(x), "") we shouldn't conclude that "unit tests are bad", but that MyDumbTest is bad and needs to be fixed. Perhaps the fix is to re-write the test to care less about the exact repr. 
(Doctest's ellipsis directive is excellent for that.) Perhaps the fix is to give the Spam object a stable repr that doesn't suck. Or perhaps the fix is to just say, this doesn't need to be a test at all. (And doctest has a directive for that too.) They are all good solutions to the "problem" of unit testing things that aren't part of the API, and they are also good solutions to the same problem when it comes to doctests. [...] > I really do applaud the goal of keeping examples in documentation up to > date. But doctest as it is today is the wrong approach to that. A repr > mismatch does not mean the example is out of date. No, it means that either the test was buggy, or the test has failed. I must admit that I don't understand what you think happens with doc testing in practice. You give the impression that there are masses of doc tests being written that look like this: >>> x = Spam(arg) >>> print(x) and therefore the use of doc tests are bad because it leads to broken tests. But I don't understand why you think that nobody has noticed that this test will have failed right from the start, and will have fixed it immediately. I suppose it is possible that some people write doc tests but never run them, not even once, but that's no different from those who write unit tests but never run them. They're hardly representative of the average developer, who either doesn't write tests at all, or who both writes and runs them and will notice if they fail. [...] > In my earlier message I suggested that someone improve doctest to not do > dumb string comparisons of reprs. I still think that is a good goal if > doctest is going to continue to be promoted. It would help alleviate many > of the issues with doctests and bring them more in line with the issues > many people's regular unittests have. 
As Tres already showed in an example, > individual doctest using projects jump through hoops to do some of that > today; centralizing saner repr comparisons for less false failures as an > actual doctest feature just makes sense. If a test needs to jump through hoops to work, then it doesn't belong as a test in the doc string. It should be a unit test, or possibly a separate test file that can be as big and complicated as needed. If you want to keep it as an example, but not actually run it, doctest has a skip directive. There's no need to complicate doctest by making it "smarter" (and therefore more likely to be buggy, harder to use, or both). > Successful example: We added a bunch of new comparison methods to unittest > in 2.7 that make it much easier to write tests that don't depend on > implementation details such as ordering. Many users prefer to use those new > features; even with older Python's via unittest2 on pypi. And that's great, it really is, I'm not being sarcastic. But unit testing is not in competition to doc testing, they are complimentary, not alternatives. If you're not using both, then you're probably missing out on something. -- Steven From ncoghlan at gmail.com Mon May 20 05:08:19 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 20 May 2013 13:08:19 +1000 Subject: [Python-Dev] Why is documentation not inline? In-Reply-To: References: <20130520003258.23c57fa9@fsol> Message-ID: On Mon, May 20, 2013 at 11:19 AM, Demian Brecht wrote: > @nick: Yes, I realize what docstrings are for (I should have used that > term rather than "inline" docs, my bad there :)). I think the problem > that I've run into is simply inconsistencies in methods of documenting > code (and the few times that it would have been helpful, what I was > looking at had not been authored using docstrings). > > Is the usage of docstrings a requirement (or a strong suggestion) for > new commits (I didn't see anything while reading the submission > guidelines)? 
If not, would it perhaps be a worthy addition? It's already covered by PEP 8 : http://www.python.org/dev/peps/pep-0008/#documentation-strings (and yes, reviewers should be checking for that in new patches and commits) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Mon May 20 05:22:34 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 20 May 2013 13:22:34 +1000 Subject: [Python-Dev] What if we didn't have repr? In-Reply-To: References: Message-ID: On Mon, May 20, 2013 at 11:46 AM, Guido van Rossum wrote: > On Sun, May 19, 2013 at 4:27 PM, Gregory P. Smith wrote: >> Now you've got me wondering what Python would be like if repr, `` and >> __repr__ never existed as language features. Upon first thoughts, I actually >> don't see much downside (no, i'm not advocating making that change). >> Something to ponder. > > I have pondered it many times, although usually in the form "Why do we > need both str and repr?" > So it looks like we really need both str(x) and repr(x). But maybe we > only need the __repr__ protocol? str(x) could just special-case its > own type, and use repr(x) (or x.__repr__(), which is the same) in all > other cases. The __repr__() method on the string type would do just > what it does today. But there would not be a __str__() protocol at > all. In my own code, I tend to map "__repr__" to either object identity (the classic "" format) or object reconstruction (the not-quite-as-classic-but-still-popular "ClassName(constructor_args)" format). I then tend to use "__str__" to emit something that matches any equivalence classes defined through "__eq__". This way of thinking about it does correlate with the "for developers" and "for end user" distinction, but without needing to think about it in those terms. 
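
One way to read the convention described above, as a hedged sketch: `__repr__` follows the "ClassName(constructor_args)" reconstruction style, while `__str__` emits a value that is the same for all objects that compare equal. The `Tag` class is hypothetical and exists only for this illustration.

```python
class Tag:
    """Hypothetical case-insensitive label, for illustration only."""

    def __init__(self, raw):
        self.raw = raw

    def __eq__(self, other):
        # Equivalence class: tags are equal when they match ignoring case.
        return isinstance(other, Tag) and self.raw.lower() == other.raw.lower()

    def __hash__(self):
        return hash(self.raw.lower())

    def __repr__(self):
        # Reconstruction style: shows exactly how this object was built.
        return "Tag({!r})".format(self.raw)

    def __str__(self):
        # Same text for every member of the equivalence class.
        return self.raw.lower()


a = Tag("Guido")
b = Tag("GUIDO")

same_class = a == b
same_str = str(a) == str(b)
different_repr = repr(a) != repr(b)
```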
However, even that approach has its limitations, and I offer the existence of both the pprint module and the "__format__" protocol as evidence, along with the multitude of context specific conversion and escaping functions. In many respects, conversion of arbitrary objects to context-appropriate strings is a signature use case for single dispatch generic functions. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Mon May 20 06:17:04 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 20 May 2013 13:17:04 +0900 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: <87ppwmb7zz.fsf@uwakimon.sk.tsukuba.ac.jp> Gregory P. Smith writes: > I really do applaud the goal of keeping examples in documentation up to > date. But doctest as it is today is the wrong approach to that. A repr > mismatch does not mean the example is out of date. Of course it does. The user sees something in the doc that's different from what his interpreter tells him. That may not bother a long-time user of the module, or one who hangs out on python-commits (uh-huh, uh-huh, yeah, right), but it worries new ones, and it should. "What else may have changed?" is what they should be thinking. Also, there are many cases where the output of a function is defined by some external protocol: XML, JSON, RFC xxxx, etc. Here doctests are very valuable. There are lots of testing applications where doctests suck. There are lots of testing applications where doctests are pretty much the optimal balance between ease of creation and ease of maintenance. I wouldn't be surprised if there are applications (RAD?) where *creating* tests as doctests and *converting* to a more precisely-specified framework in maintenance is best practice. Maybe somebody (not me, I do far too little testing even with doctests :-( ) should write an informational PEP. 
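
A minimal sketch of two points raised in this thread: a doctest for output that is fixed by an external format (JSON here), and the ELLIPSIS directive for eliding parts of a repr that are not part of any API. The `to_json` helper is hypothetical. The docstring is run through `doctest.DocTestParser` and `doctest.DocTestRunner` directly so that the result can be inspected in code.

```python
import doctest
import json


def to_json(mapping):
    """Serialize a mapping with a stable key order.

    The output format is defined by JSON, so the example is stable:

    >>> to_json({"b": 1, "a": 2})
    '{"a": 2, "b": 1}'

    The address in a default repr is not stable, so it is elided:

    >>> object()  # doctest: +ELLIPSIS
    <object object at 0x...>
    """
    return json.dumps(mapping, sort_keys=True)


# Build a DocTest from the docstring and run it without printing successes.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    to_json.__doc__, {"to_json": to_json}, "to_json", None, 0
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
```

In everyday use `python -m doctest file.py` or `doctest.testmod()` would be the more common entry points; the explicit runner is used here only to keep the sketch self-contained.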
From ethan at stoneleaf.us Mon May 20 06:08:13 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 19 May 2013 21:08:13 -0700 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <1368974902254.f812247c@Nodemailer> <51996873.6020404@nedbatchelder.com> <20130520021425.4a544ae2@fsol> Message-ID: <5199A1AD.4000306@stoneleaf.us> On 05/19/2013 05:24 PM, Nick Coghlan wrote: > > This is the point I was trying to make: once you use IntEnum (as you > would in any case where you need bitwise operators), Enum gets out of > the way for everything other than __str__, __repr__, and one other > slot (that escapes me for the moment...). __getnewargs__ and __new__ But if you do math, the result is no longer an Enum of any type. -- ~Ethan~ From regebro at gmail.com Mon May 20 07:57:43 2013 From: regebro at gmail.com (Lennart Regebro) Date: Mon, 20 May 2013 07:57:43 +0200 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: On Mon, May 20, 2013 at 1:51 AM, Gregory P. Smith wrote: > > On May 19, 2013 4:31 PM, "Benjamin Peterson" wrote: >> >> 2013/5/19 Gregory P. Smith : >> > Idea: I don't believe anybody has written a fixer for lib2to3 that >> > applies >> > fixers to doctests. That'd be an interesting project for someone. >> >> 2to3 can operate on doctests, though it doesn't do anything different >> to them than it does to normal sourcecode. > > Oh cool. I didn't realize that already existed! It won't change any output, though, which still means that they tend to break. 
From solipsis at pitrou.net Mon May 20 12:45:57 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 20 May 2013 12:45:57 +0200
Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum]
References: <5192292F.6050406@pearwood.info>
Message-ID: <20130520124557.4aad2b85@fsol>

On Sat, 18 May 2013 23:41:59 -0700
Raymond Hettinger wrote:
>
> We should continue to encourage users to make thorough unit tests
> and to leave doctests for documentation. That said, it should be
> recognized that some testing is better than no testing. And doctests
> may be attractive in that regard because it is almost effortless to
> cut-and-paste a snippet from the interactive prompt. That isn't a
> best practice, but it isn't a worst practice either.

There are other reasons to hate doctest, such as the obnoxious
error reporting. Having to wade through ten pages of output to find
what went wrong is no fun.

Also the difficulty of editing them. For some reason, my editor doesn't
offer me facilities to edit interactive prompt session snippets.

All in all, I try hard to ignore any doctest present in the Python
test suite.

Regards

Antoine.

From tseaver at palladion.com Mon May 20 13:59:56 2013
From: tseaver at palladion.com (Tres Seaver)
Date: Mon, 20 May 2013 07:59:56 -0400
Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum]
In-Reply-To:
References: <1368974902254.f812247c@Nodemailer>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 05/19/2013 07:22 PM, Mark Janssen wrote:
> On Sun, May 19, 2013 at 1:13 PM, Tres Seaver
> wrote:
>> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
>>
>> On 05/19/2013 10:48 AM, Guido van Rossum wrote:
>>> Anyway, if you're doing arithmetic on enums you're doing it
>>> wrong.
>>
>> Hmm, bitwise operations, even?
>
> I think it's rather pointless to do bitwise operations on python
> enums. We're not that close to the machine.

What, nobody uses Python to do networking? How about driving the GPIO
on a Raspberry Pi?
Using the bitwise operators to combine named "flag" values seems like a
pretty natural fit to me (if you don't care about the specific values,
you don't need IntEnum anyway).

Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with undefined - http://www.enigmail.net/

iEYEARECAAYFAlGaEDwACgkQ+gerLs4ltQ5eXACfTrmegJsYDvbuwrbr5zyjwWV+
jMUAoIHQBi/qkm+MClGeh/ZwWOUGCMFm
=4ey/
-----END PGP SIGNATURE-----

From arigo at tunes.org Mon May 20 14:30:22 2013
From: arigo at tunes.org (Armin Rigo)
Date: Mon, 20 May 2013 14:30:22 +0200
Subject: [Python-Dev] Ordering keyword dicts
In-Reply-To:
References: <20130519150148.513a1e5b@fsol>
Message-ID:

Hi all,

On Sun, May 19, 2013 at 4:59 PM, Maciej Fijalkowski wrote:
> Note that Raymond's proposal would make dicts and ordereddicts almost
> exactly the same speed.

Just checking: in view of Raymond's proposal, is there a good reason
against having all dicts be systematically ordered? It would
definitely improve the debugging experience, by making multiple runs
of the same program more like each other, instead of depending on the
random address-based ordering. (Performance-wise, I guess it might be
a little bit slower or faster depending on cache issues and so on, but
the emphasis I'd put is on the "little bit".)

I apologize if this was already shot down.

A bientôt,

Armin.

From storchaka at gmail.com Mon May 20 14:37:59 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 20 May 2013 15:37:59 +0300
Subject: [Python-Dev] Why is documentation not inline?
In-Reply-To:
References:
Message-ID:

20.05.13 01:33, Benjamin Peterson wrote:
> 2013/5/19 Demian Brecht :
>> It seems like external docs is standard throughout the stdlib. Is
>> there an actual reason for this?
>
> One is legacy.
It certainly wasn't possible with the old LaTeX doc
> system.

Did you know that TeX itself is written using "literate programming"?
The TeX binary and the TeXbook are compiled from the same source.

From stefan at drees.name Mon May 20 15:02:08 2013
From: stefan at drees.name (Stefan Drees)
Date: Mon, 20 May 2013 15:02:08 +0200
Subject: [Python-Dev] Why is documentation not inline?
In-Reply-To:
References:
Message-ID: <519A1ED0.3030802@drees.name>

On 20.05.13 14:37, Serhiy Storchaka wrote:
> 20.05.13 01:33, Benjamin Peterson wrote:
>> 2013/5/19 Demian Brecht :
>>> It seems like external docs is standard throughout the stdlib. Is
>>> there an actual reason for this?
>>
>> One is legacy. It certainly wasn't possible with the old LaTeX doc
>> system.
>
> Did you know that TeX itself is written using "literate programming"?
> The TeX binary and the TeXbook are compiled from the same source.

Separation of concerns and DRY - tension rising:

Who wants to tangle and weave? Anyone :-?)

All the best,
Stefan

From steve at pearwood.info Mon May 20 15:32:10 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 20 May 2013 23:32:10 +1000
Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum]
In-Reply-To: <20130520124557.4aad2b85@fsol>
References: <5192292F.6050406@pearwood.info> <20130520124557.4aad2b85@fsol>
Message-ID: <519A25DA.6060909@pearwood.info>

On 20/05/13 20:45, Antoine Pitrou wrote:
> On Sat, 18 May 2013 23:41:59 -0700
> Raymond Hettinger wrote:
>>
>> We should continue to encourage users to make thorough unit tests
>> and to leave doctests for documentation. That said, it should be
>> recognized that some testing is better than no testing. And doctests
>> may be attractive in that regard because it is almost effortless to
>> cut-and-paste a snippet from the interactive prompt. That isn't a
>> best practice, but it isn't a worst practice either.
>
> There are other reasons to hate doctest, such as the obnoxious
> error reporting.
Having to wade through ten pages of output to find
> what went wrong is no fun.

Ten pages of broken unit tests are no picnic either. If you have ten
pages of failures, then it doesn't matter *what* testing framework you
use, you're going to have a bad time.

But personally, I find doc test error reports perfectly clear and
readable, and not overly verbose.

File "test.py", line 4, in __main__
Failed example:
    len("abcd")
Expected:
    24
Got:
    4

That's even simpler than a traceback.

> Also the difficulty of editing them. For some reason, my editor doesn't
> offer me facilities to edit interactive prompt session snippets.

Your text editor doesn't allow you to edit text? Even Notepad allows
that!

Seriously, what editor are you using that doesn't allow you to edit
pasted snippets?

--
Steven

From ethan at stoneleaf.us Mon May 20 15:12:41 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 20 May 2013 06:12:41 -0700
Subject: [Python-Dev] PEP 409 and the stdlib
Message-ID: <519A2149.3040903@stoneleaf.us>

As a quick reminder, PEP 409 allows this:

     try:
         ...
     except AnError:
         raise SomeOtherError from None

so that if the exception is not caught, we get the traditional single
exception traceback, instead of the new:

     During handling of the above exception, another exception occurred

My question:

How do we go about putting this in the stdlib? Is this one of the
occasions where we don't do it unless we're modifying a module already
for some other reason?

For that matter, should we?
Pros: Makes tracebacks much less confusing, especially coming from a library Cons: Could hide bugs unrelated to what is being caught and transformed -- ~Ethan~ From solipsis at pitrou.net Mon May 20 15:38:52 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 20 May 2013 15:38:52 +0200 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] References: <5192292F.6050406@pearwood.info> <20130520124557.4aad2b85@fsol> <519A25DA.6060909@pearwood.info> Message-ID: <20130520153852.7996fa1f@fsol> On Mon, 20 May 2013 23:32:10 +1000 Steven D'Aprano wrote: > On 20/05/13 20:45, Antoine Pitrou wrote: > > On Sat, 18 May 2013 23:41:59 -0700 > > Raymond Hettinger wrote: > >> > >> We should continue to encourage users to make thorough unit tests > >> and to leave doctests for documentation. That said, it should be > >> recognized that some testing is better than no testing. And doctests > >> may be attractive in that regard because it is almost effortless to > >> cut-and-paste a snippet from the interactive prompt. That isn't a > >> best practice, but it isn't a worst practice either. > > > > There are other reasons to hate doctest, such as the obnoxious > > error reporting. Having to wade through ten pages of output to find > > what went wrong is no fun. > > Ten pages of broken unit tests are no picnic either. You didn't understand the objection. I'm talking about *one* broken doctest in a sea of non-broken ones. For some reason doctest (or its unittest driver) insists on either displaying everything, or nothing. It doesn't only print the errors and leave the rest silent. > > Also the difficulty of editing them. For some reason, my editor doesn't > > offer me facilities to edit interactive prompt session snippets. > > Your text editor doesn't allow you to edit text? Even Notepad allows that! > > Seriously, what editor are you using that doesn't allow you to edit pasted snippets? I don't know if you're intentionally being stupid. 
Of course I can edit them *by hand*. But I'll have to re-create by hand the various artifacts of an interpreter session, e.g. the prompts. Regards Antoine. From barry at python.org Mon May 20 15:39:45 2013 From: barry at python.org (Barry Warsaw) Date: Mon, 20 May 2013 09:39:45 -0400 Subject: [Python-Dev] Ordering keyword dicts In-Reply-To: References: <20130519150148.513a1e5b@fsol> Message-ID: <20130520093945.00dec293@anarchist> On May 20, 2013, at 02:30 PM, Armin Rigo wrote: >Just checking: in view of Raymond's proposal, is there a good reason >against having all dicts be systematically ordered? It would >definitely improve the debugging experience, by making multiple runs >of the same program more like each other, instead of depending on the >random address-based ordering. (Performance-wise, I guess it might be >a little bit slower or faster depending on cache issues and so on, but >the emphasis I'd put is on the "little bit".) I'm ambivalent on the proposal -- I could get behind it if it was demonstrably *not* a performance hit (I'm already fighting enough "Python is too slow" battles). However, if such a change were made, I think it must be adopted as a change to the language specification. Meaning, if dicts (or even just keyword arguments) are to be ordered, it can't be as a side-effect of the implementation. We've had a lot of churn getting code and tests to the point where most libraries have adjusted to the undefined order of dictionary iteration. I don't want to go back to the situation where lots of implicit ordering assumptions caused broken code when run in one implementation or another. Or in other words, if dicts are to be ordered, let's make it an explicit language feature that we can measure compliance against. -Barry From rdmurray at bitdance.com Mon May 20 15:37:32 2013 From: rdmurray at bitdance.com (R. 
David Murray) Date: Mon, 20 May 2013 09:37:32 -0400 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: <20130520124557.4aad2b85@fsol> References: <5192292F.6050406@pearwood.info> <20130520124557.4aad2b85@fsol> Message-ID: <20130520133732.800302504B4@webabinitio.net> On Mon, 20 May 2013 12:45:57 +0200, Antoine Pitrou wrote: > On Sat, 18 May 2013 23:41:59 -0700 > Raymond Hettinger wrote: > > > > We should continue to encourage users to make thorough unit tests > > and to leave doctests for documentation. That said, it should be > > recognized that some testing is better than no testing. And doctests > > may be attractive in that regard because it is almost effortless to > > cut-and-paste a snippet from the interactive prompt. That isn't a > > best practice, but it isn't a worst practice either. > > There are other reasons to hate doctest, such as the obnoxious > error reporting. Having to wade through ten pages of output to find > what went wrong is no fun. That's why I added the 'failfast' option to doctest. > Also the difficulty of editing them. For some reason, my editor doesn't > offer me facilities to edit interactive prompt session snippets. I don't have much problem with lacking tailored facilities for this in vim. I suppose that is a matter of personal style. I *would* like to teach it the proper indentation, but I haven't been bothered enough yet to do it. (After all, weren't you the one who told me the lack of tab key indentation at the interactive prompt after you enabled completion by default wasn't an issue because one could just use space to indent? :) --David From rdmurray at bitdance.com Mon May 20 15:47:23 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 20 May 2013 09:47:23 -0400 Subject: [Python-Dev] Why is documentation not inline? 
In-Reply-To: <519A1ED0.3030802@drees.name> References: <519A1ED0.3030802@drees.name> Message-ID: <20130520134724.2CCBD2504B4@webabinitio.net> On Mon, 20 May 2013 15:02:08 +0200, Stefan Drees wrote: > On 20.05.13 14:37, Serhiy Storchaka wrote: > > 20.05.13 01:33, Benjamin Peterson wrote: > >> 2013/5/19 Demian Brecht : > >>> It seems like external docs is standard throughout the stdlib. Is > >>> there an actual reason for this? > >> > >> One is legacy. It certainly wasn't possible with the old LaTeX doc > >> system. > > > > Do you know that TeX itself is written using "literate programming"? TeX > > binary and the TeXbook are compiled from the same source. > > Separation of concerns and DRY - tension rising: > > Who wants to tangle and weave? Anyone :-?) I loved that concept so much when I first encountered it that I subsequently wrote systems (in REXX :) for doing something similar on two big projects I worked on in my IBM mainframe days (one of them using SGML, if anyone remembers when there were actual source-to-printed-document systems for SGML). I guess I pretty much forgot about it when I moved to unix, but I suppose it is one of the reasons I do like doctest. A quick google tells me there are some links I should check out :) --David From ncoghlan at gmail.com Mon May 20 15:47:42 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 20 May 2013 23:47:42 +1000 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: <519A2149.3040903@stoneleaf.us> References: <519A2149.3040903@stoneleaf.us> Message-ID: On 20 May 2013 23:38, "Ethan Furman" wrote: > > As a quick reminder, PEP 409 allows this: > > try: > ... > except AnError: > raise SomeOtherError from None > > so that if the exception is not caught, we get the traditional single exception traceback, instead of the new: > > During handling of the above exception, another exception occurred > > > My question: > > How do we go about putting this in the stdlib?
Is this one of the occasions where we don't do it unless we're modifying a module already for some other reason? > > For that matter, should we? > > Pros: Makes tracebacks much less confusing, especially coming from a library > > Cons: Could hide bugs unrelated to what is being caught and transformed Be pretty conservative with this one - we should only use it when we're confident we know the original exception is almost certain to be irrelevant noise. Ensuring the traceback module makes it easy to display both would also be a good preliminary step. Cheers, Nick. > > -- > ~Ethan~ From rdmurray at bitdance.com Mon May 20 15:52:08 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 20 May 2013 09:52:08 -0400 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: <519A2149.3040903@stoneleaf.us> References: <519A2149.3040903@stoneleaf.us> Message-ID: <20130520135208.ED8552504B4@webabinitio.net> On Mon, 20 May 2013 06:12:41 -0700, Ethan Furman wrote: > As a quick reminder, PEP 409 allows this: > > try: > ... > except AnError: > raise SomeOtherError from None > > so that if the exception is not caught, we get the traditional single exception traceback, instead of the new: > > During handling of the above exception, another exception occurred > > > My question: > > How do we go about putting this in the stdlib? Is this one of the occasions where we don't do it unless we're modifying > a module already for some other reason? > > For that matter, should we?
> > Pros: Makes tracebacks much less confusing, especially coming from a library > > Cons: Could hide bugs unrelated to what is being caught and transformed I'm pretty sure the answer is "almost never". I think a case needs to be made for any place that seems like it would actually improve things, because usually I don't think it will, in the stdlib. --David From solipsis at pitrou.net Mon May 20 15:57:35 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 20 May 2013 15:57:35 +0200 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] References: <5192292F.6050406@pearwood.info> <20130520124557.4aad2b85@fsol> <20130520133732.800302504B4@webabinitio.net> Message-ID: <20130520155735.54f9a06e@fsol> On Mon, 20 May 2013 09:37:32 -0400 "R. David Murray" wrote: > On Mon, 20 May 2013 12:45:57 +0200, Antoine Pitrou wrote: > > On Sat, 18 May 2013 23:41:59 -0700 > > Raymond Hettinger wrote: > > > > > > We should continue to encourage users to make thorough unit tests > > > and to leave doctests for documentation. That said, it should be > > > recognized that some testing is better than no testing. And doctests > > > may be attractive in that regard because it is almost effortless to > > > cut-and-paste a snippet from the interactive prompt. That isn't a > > > best practice, but it isn't a worst practice either. > > > > There are other reasons to hate doctest, such as the obnoxious > > error reporting. Having to wade through ten pages of output to find > > what went wrong is no fun. > > That's why I added the 'failfast' option to doctest. I didn't know that. Is it propagated by regrtest? I never use doctest standalone. > > Also the difficulty of editing them. For some reason, my editor doesn't > > offer me facilities to edit interactive prompt session snippets. > > I don't have much problem with lacking tailored facilities for this > in vim. I suppose that is a matter of personal style. 
I *would* like to > teach it the proper indentation, but I haven't been bothered enough yet > to do it. (After all, weren't you the one who told me the lack of tab > key indentation at the interactive prompt after you enabled completion > by default wasn't an issue because one could just use space to indent? :) An interpreter prompt session is throwaway, so you can pretty much indent as you like (which may not be very pretty in a tests file). Besides, I was thinking about the prompts ('>>> ' and '... '), not the indentation itself. Regards Antoine. From ethan at stoneleaf.us Mon May 20 16:12:07 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 20 May 2013 07:12:07 -0700 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: References: <519A2149.3040903@stoneleaf.us> Message-ID: <519A2F37.6020406@stoneleaf.us> On 05/20/2013 06:47 AM, Nick Coghlan wrote: > On 20 May 2013 23:38, Ethan Furman wrote: >> >> As a quick reminder, PEP 409 allows this: >> >> try: >> ... >> except AnError: >> raise SomeOtherError from None >> >> so that if the exception is not caught, we get the traditional single exception traceback, instead of the new: >> >> During handling of the above exception, another exception occurred >> >> >> My question: >> >> How do we go about putting this in the stdlib? Is this one of the occasions where we don't do it unless we're modifying a module already for some other reason? >> >> For that matter, should we? >> >> Pros: Makes tracebacks much less confusing, especially coming from a library >> >> Cons: Could hide bugs unrelated to what is being caught and transformed > > Be pretty conservative with this one - we should only use it when we're confident we know the original exception is > almost certain to be irrelevant noise. > > Ensuring the traceback module makes it easy to display both would also be a good preliminary step. 
As a case in point, base64.py is currently getting a bug fix, and also contains this code:

    def b32decode(s, casefold=False, map01=None):
        .
        .
        .
        for i in range(0, len(s), 8):
            quanta = s[i: i + 8]
            acc = 0
            try:
                for c in quanta:
                    acc = (acc << 5) + b32rev[c]
            except KeyError:
                raise binascii.Error('Non-base32 digit found')
        .
        .
        .
        else:
            raise binascii.Error('Incorrect padding')

Does the KeyError qualify as irrelevant noise? If we're not going to suppress the originating error I think we should at least change the double trace back message as it implies two failures, instead of just one. -- ~Ethan~ From rdmurray at bitdance.com Mon May 20 16:45:31 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 20 May 2013 10:45:31 -0400 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: <20130520155735.54f9a06e@fsol> References: <5192292F.6050406@pearwood.info> <20130520124557.4aad2b85@fsol> <20130520133732.800302504B4@webabinitio.net> <20130520155735.54f9a06e@fsol> Message-ID: <20130520144532.2A74925007D@webabinitio.net> On Mon, 20 May 2013 15:57:35 +0200, Antoine Pitrou wrote: > On Mon, 20 May 2013 09:37:32 -0400 > "R. David Murray" wrote: > > On Mon, 20 May 2013 12:45:57 +0200, Antoine Pitrou wrote: > > > On Sat, 18 May 2013 23:41:59 -0700 > > > Raymond Hettinger wrote: > > > > > > > > We should continue to encourage users to make thorough unit tests > > > > and to leave doctests for documentation. That said, it should be > > > > recognized that some testing is better than no testing. And doctests > > > > may be attractive in that regard because it is almost effortless to > > > > cut-and-paste a snippet from the interactive prompt. That isn't a > > > > best practice, but it isn't a worst practice either. > > > > > > There are other reasons to hate doctest, such as the obnoxious > > > error reporting. Having to wade through ten pages of output to find > > > what went wrong is no fun.
> > > > That's why I added the 'failfast' option to doctest. > > I didn't know that. Is it propagated by regrtest? I never use doctest > standalone. I don't think so. That's a good idea, though. > > > Also the difficulty of editing them. For some reason, my editor doesn't > > > offer me facilities to edit interactive prompt session snippets. > > > > I don't have much problem with lacking tailored facilities for this > > in vim. I suppose that is a matter of personal style. I *would* like to > > teach it the proper indentation, but I haven't been bothered enough yet > > to do it. (After all, weren't you the one who told me the lack of tab > > key indentation at the interactive prompt after you enabled completion > > by default wasn't an issue because one could just use space to indent? :) > > An interpreter prompt session is throwaway, so you can pretty much > indent as you like (which may not be very pretty in a tests file). > Besides, I was thinking about the prompts ('>>> ' and '... '), not the > indentation itself. True. I don't find typing >>> or ... very burdensome, though. Less even than fixing the alignment after hitting tab :) --David From rdmurray at bitdance.com Mon May 20 16:50:03 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Mon, 20 May 2013 10:50:03 -0400 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: <519A2F37.6020406@stoneleaf.us> References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> Message-ID: <20130520145004.1B1FA250BCF@webabinitio.net> On Mon, 20 May 2013 07:12:07 -0700, Ethan Furman wrote: > As a case in point, base64.py is currently getting a bug fix, and also > contains this code: > > def b32decode(s, casefold=False, map01=None): > . > . > . > for i in range(0, len(s), 8): > quanta = s[i: i + 8] > acc = 0 > try: > for c in quanta: > acc = (acc << 5) + b32rev[c] > except KeyError: > raise binascii.Error('Non-base32 digit found') > . > . > . 
> else: > raise binascii.Error('Incorrect padding') > > Does the KeyError qualify as irrelevant noise? I don't see that it is of benefit to suppress it. > If we're not going to suppress the originating error I think we should > at least change the double trace back message as it implies two > failures, instead of just one. I don't understand what you want to do here. --David From barry at python.org Mon May 20 17:15:56 2013 From: barry at python.org (Barry Warsaw) Date: Mon, 20 May 2013 11:15:56 -0400 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: <20130520111556.447a5a9e@anarchist> On May 18, 2013, at 11:41 PM, Raymond Hettinger wrote: >I'm hoping that core developers don't get caught-up in the "doctests are bad >meme". Thanks for your message Raymond. I know that doctests are controversial, but I do firmly believe that when used correctly, they have value and should not be broken without careful consideration. You make excellent points about Python 3 adoption and the "canary-like" nature of doctests. Cheers, -Barry From fwierzbicki at gmail.com Mon May 20 17:21:09 2013 From: fwierzbicki at gmail.com (fwierzbicki at gmail.com) Date: Mon, 20 May 2013 08:21:09 -0700 Subject: [Python-Dev] Ordering keyword dicts In-Reply-To: <20130520093945.00dec293@anarchist> References: <20130519150148.513a1e5b@fsol> <20130520093945.00dec293@anarchist> Message-ID: On Mon, May 20, 2013 at 6:39 AM, Barry Warsaw wrote: > Or in other words, if dicts are to be ordered, let's make it an explicit > language feature that we can measure compliance against. Guaranteeing a dict order would be tough on Jython - today it's nice that we can just have a thin wrapper around ConcurrentHashMap. In a world with hard ordering guarantees I think we'd need to write our own from scratch. 
-Frank From barry at python.org Mon May 20 17:27:09 2013 From: barry at python.org (Barry Warsaw) Date: Mon, 20 May 2013 11:27:09 -0400 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: <20130520112709.5e202b43@anarchist> On May 19, 2013, at 07:28 PM, Tim Peters wrote: >But more than just one ;-) Another great use has nothing to do with >docstrings: using an entire file as "a doctest". This encourages >writing lots of text explaining what you're doing, with snippets of >code interspersed to illustrate that the code really does behave in >the ways you've claimed. Agreed! I love separate-file doctests, and the marriage of doctests and reST/Sphinx is just fantastic. It's a pleasure to write usage documentation that contains code samples that are guaranteed to work, because they pass their doctest. (I personally don't like long-winded docstring doctests because they are harder to edit and distract from the code, but YMMV.) Several years ago, I spent some time experimenting with using doctest for *everything*. I deliberately wanted to go that extreme in order to better explore where doctests are good and where they're not so good. A general rule of thumb I came up with is that reST-style doctests are great for explanations involving mostly good-path usage of a library, or IOW "this is how you're supposed to use this API, and see it works!". IME, doctests are not so good at testing all the failure modes, odd corner cases, and the perhaps less-common good-path use cases. Fortunately, we have another useful tool for testing that stuff. >I'd rather encourage users to turn their brains on when writing >doctest files - and when writing unit tests. I've lost count of how >many times I've seen a unit test fail, then stared helplessly at the >unit test code just trying to figure out what the author thought they >were doing.
A lot of comments in the test code could have helped >that, but - outside of doctest-based tests - there's typically very >little explanatory text in testing code. +1 A rule-of-thumb I use is what I call the FORTH rule[1]. You should be able to understand what your own test is trying to accomplish a week later; otherwise you're not writing very good tests. ;) -Barry [1] or PERL rule maybe, depending on the unit of time. :) From barry at python.org Mon May 20 17:30:37 2013 From: barry at python.org (Barry Warsaw) Date: Mon, 20 May 2013 11:30:37 -0400 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: <20130520113037.283dc721@anarchist> On May 19, 2013, at 04:27 PM, Gregory P. Smith wrote: >Idea: I don't believe anybody has written a fixer for lib2to3 that applies >fixers to doctests. That'd be an interesting project for someone. I'm not sure that's true. I don't use 2to3 anymore if I can help it, but I'm pretty sure you can 2to3 your doctests too. -Barry From guido at python.org Mon May 20 17:35:07 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 20 May 2013 08:35:07 -0700 Subject: [Python-Dev] Ordering keyword dicts In-Reply-To: References: <20130519150148.513a1e5b@fsol> <20130520093945.00dec293@anarchist> Message-ID: I think that kills the "let's make all dicts ordered" idea, even for CPython. I wouldn't want people to start relying on this. The dict type should be clearly recognizable as the hash table it is. Making **kwds ordered is still open, but requires careful design and implementation to avoid slowing down function calls that don't benefit. --Guido van Rossum (sent from Android phone) On May 20, 2013 8:25 AM, "fwierzbicki at gmail.com" wrote: > On Mon, May 20, 2013 at 6:39 AM, Barry Warsaw wrote: > > Or in other words, if dicts are to be ordered, let's make it an explicit > > language feature that we can measure compliance against.
> Guaranteeing a dict order would be tough on Jython - today it's nice > that we can just have a thin wrapper around ConcurrentHashMap. In a > world with hard ordering guarantees I think we'd need to write our own > from scratch. > > -Frank From ethan at stoneleaf.us Mon May 20 17:15:08 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 20 May 2013 08:15:08 -0700 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: <20130520145004.1B1FA250BCF@webabinitio.net> References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520145004.1B1FA250BCF@webabinitio.net> Message-ID: <519A3DFC.5090705@stoneleaf.us> On 05/20/2013 07:50 AM, R. David Murray wrote: > On Mon, 20 May 2013 07:12:07 -0700, Ethan Furman wrote: >> As a case in point, base64.py is currently getting a bug fix, and also >> contains this code: >> >> def b32decode(s, casefold=False, map01=None): >> . >> . >> . >> for i in range(0, len(s), 8): >> quanta = s[i: i + 8] >> acc = 0 >> try: >> for c in quanta: >> acc = (acc << 5) + b32rev[c] >> except KeyError: >> raise binascii.Error('Non-base32 digit found') >> . >> . >> . >> else: >> raise binascii.Error('Incorrect padding') >> >> Does the KeyError qualify as irrelevant noise? > > I don't see that it is of benefit to suppress it. > >> If we're not going to suppress the originating error I think we should >> at least change the double trace back message as it implies two >> failures, instead of just one. > > I don't understand what you want to do here. As a user of the b32decode the KeyError is an implementation detail and noise in the traceback.
If I've got a non-base32 digit in my submitted string then the only exception I care about is the binascii.Error... but I'm going to see both, and the wording is such that it seems like I have two errors to deal with instead of just one. So I guess I see three options here: 1) Do nothing and be happy I use 'raise ... from None' in my own libraries 2) Change the wording of 'During handling of the above exception, another exception occurred' (no ideas as to what at the moment) 3) have the traceback module be configurable to show both exceptions even when 'raise ... from None' is used to help with debugging, then we can make the changes in stdlib confident that in our own testing of bugs we can see all available information. I would prefer 3, but I can live with 1. :) -- ~Ethan~ From steve at pearwood.info Mon May 20 17:39:03 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 21 May 2013 01:39:03 +1000 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: <519A2F37.6020406@stoneleaf.us> References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> Message-ID: <519A4397.4090707@pearwood.info> On 21/05/13 00:12, Ethan Furman wrote: > As a case in point, base64.py is currently getting a bug fix, and also contains this code: > > def b32decode(s, casefold=False, map01=None): > . > . > . > for i in range(0, len(s), 8): > quanta = s[i: i + 8] > acc = 0 > try: > for c in quanta: > acc = (acc << 5) + b32rev[c] > except KeyError: > raise binascii.Error('Non-base32 digit found') > . > . > . > else: > raise binascii.Error('Incorrect padding') > > Does the KeyError qualify as irrelevant noise? IMO, it is irrelevant noise, and obviously so. The binascii.Error raised is not a bug to be fixed, it is a deliberate exception and part of the API of the binascii module. That it occurs inside an "except KeyError" block is a mere implementation detail. 
It merely happens to be that digits are converted by looking up in a mapping, another implementation might use a completely different mechanism. In fact, the implementation in Python 3.3 *is* completely different, and there is no KeyError to suppress. In another reply, R.David Murray answered: "I don't see that it is of benefit to suppress [the KeyError]." Can I suggest that it's obviously been a long, long time since you were a beginner to the language, and you've forgotten how intimidating error messages can be? Error messages should be *relevant*. Irrelevant details don't help, they hinder, and I suggest that the KeyError is irrelevant. -- Steven From solipsis at pitrou.net Mon May 20 17:46:38 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 20 May 2013 17:46:38 +0200 Subject: [Python-Dev] PEP 409 and the stdlib References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> Message-ID: <20130520174638.12fae7ee@fsol> On Mon, 20 May 2013 07:12:07 -0700 Ethan Furman wrote: > > As a case in point, base64.py is currently getting a bug fix, and also contains this code: > > def b32decode(s, casefold=False, map01=None): > . > . > . > for i in range(0, len(s), 8): > quanta = s[i: i + 8] > acc = 0 > try: > for c in quanta: > acc = (acc << 5) + b32rev[c] > except KeyError: > raise binascii.Error('Non-base32 digit found') > . > . > . > else: > raise binascii.Error('Incorrect padding') > > Does the KeyError qualify as irrelevant noise? I think it is a legitimate case where to silence the original exception. However, the binascii.Error would be more informative if it said *which* non-base32 digit was encountered. Regards Antoine. 
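
A minimal sketch of the two suggestions made in this thread: suppress the KeyError with "raise ... from None" (PEP 409) and say which digit was rejected. This is not the actual base64.py code. The names decode_quantum and _b32rev are hypothetical stand-ins for the loop body and the lookup table in the snippet quoted above. The last lines assume Python 3.3 or later, where the suppressed exception stays reachable as __context__, so a debugging tool could still display both.

```python
import binascii

# Hypothetical stand-in for the module's reverse lookup table:
# maps each base32 alphabet byte (an int in Python 3) to its 5-bit value.
_B32_ALPHABET = b'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567'
_b32rev = {byte: value for value, byte in enumerate(_B32_ALPHABET)}


def decode_quantum(quanta):
    """Fold a run of base32 digits into one integer accumulator."""
    acc = 0
    for c in quanta:
        try:
            acc = (acc << 5) + _b32rev[c]
        except KeyError:
            # Report the offending digit, and keep the KeyError out of
            # the default traceback display with 'from None'.
            raise binascii.Error(
                'Non-base32 digit found: %r' % bytes([c])) from None
    return acc


if __name__ == '__main__':
    print(decode_quantum(b'ME'))
    try:
        decode_quantum(b'M!')
    except binascii.Error as exc:
        print(exc)
        # Suppression affects display only: the original KeyError is
        # still attached to the new exception for debugging.
        print(repr(exc.__context__), exc.__suppress_context__)
```

Left uncaught, the error from decode_quantum(b'M!') would print a single traceback ending in binascii.Error, without the "During handling of the above exception" section.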
From steve at pearwood.info Mon May 20 18:00:32 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 21 May 2013 02:00:32 +1000 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: <20130520153852.7996fa1f@fsol> References: <5192292F.6050406@pearwood.info> <20130520124557.4aad2b85@fsol> <519A25DA.6060909@pearwood.info> <20130520153852.7996fa1f@fsol> Message-ID: <519A48A0.2090707@pearwood.info> On 20/05/13 23:38, Antoine Pitrou wrote: > On Mon, 20 May 2013 23:32:10 +1000 > Steven D'Aprano wrote: >> On 20/05/13 20:45, Antoine Pitrou wrote: >>> On Sat, 18 May 2013 23:41:59 -0700 >>> Raymond Hettinger wrote: >>>> >>>> We should continue to encourage users to make thorough unit tests >>>> and to leave doctests for documentation. That said, it should be >>>> recognized that some testing is better than no testing. And doctests >>>> may be attractive in that regard because it is almost effortless to >>>> cut-and-paste a snippet from the interactive prompt. That isn't a >>>> best practice, but it isn't a worst practice either. >>> >>> There are other reasons to hate doctest, such as the obnoxious >>> error reporting. Having to wade through ten pages of output to find >>> what went wrong is no fun. >> >> Ten pages of broken unit tests are no picnic either. > > You didn't understand the objection. I'm talking about *one* broken > doctest in a sea of non-broken ones. For some reason doctest (or its > unittest driver) insists on either displaying everything, or nothing. > It doesn't only print the errors and leave the rest silent. It sounds like you are inadvertently calling doctest with the verbose option. It is not standard behaviour to display "everything or nothing". Here is the output from 1 failing test out of 112, with absolutely nothing edited. 

    [steve at ando ~]$ python test.py
    **********************************************************************
    File "test.py", line 224, in __main__
    Failed example:
        len("abcd")
    Expected:
        24
    Got:
        4
    **********************************************************************
    1 items had failures:
       1 of 112 in __main__
    ***Test Failed*** 1 failures.

If I had any criticism of doctest, it would be that by default it prints nothing at all if all tests pass. I hate that, ever since I had a bunch of doctests that for about a week I thought were passing when in fact they weren't running at all. So now I always write something like this:

    if __name__ == '__main__':
        import doctest
        failed, tried = doctest.testmod()
        if failed == 0:
            print("Successfully ran %d tests" % tried)

-- Steven From solipsis at pitrou.net Mon May 20 18:10:48 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 20 May 2013 18:10:48 +0200 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] References: <5192292F.6050406@pearwood.info> <20130520124557.4aad2b85@fsol> <519A25DA.6060909@pearwood.info> <20130520153852.7996fa1f@fsol> <519A48A0.2090707@pearwood.info> Message-ID: <20130520181048.3dd5d53b@fsol> On Tue, 21 May 2013 02:00:32 +1000 Steven D'Aprano wrote: > On 20/05/13 23:38, Antoine Pitrou wrote: > > On Mon, 20 May 2013 23:32:10 +1000 > > Steven D'Aprano wrote: > >> On 20/05/13 20:45, Antoine Pitrou wrote: > >>> On Sat, 18 May 2013 23:41:59 -0700 > >>> Raymond Hettinger wrote: > >>>> > >>>> We should continue to encourage users to make thorough unit tests > >>>> and to leave doctests for documentation. That said, it should be > >>>> recognized that some testing is better than no testing. And doctests > >>>> may be attractive in that regard because it is almost effortless to > >>>> cut-and-paste a snippet from the interactive prompt. That isn't a > >>>> best practice, but it isn't a worst practice either.
> >>> > >>> There are other reasons to hate doctest, such as the obnoxious > >>> error reporting. Having to wade through ten pages of output to find > >>> what went wrong is no fun. > >> > >> Ten pages of broken unit tests are no picnic either. > > > > You didn't understand the objection. I'm talking about *one* broken > > doctest in a sea of non-broken ones. For some reason doctest (or its > > unittest driver) insists on either displaying everything, or nothing. > > It doesn't only print the errors and leave the rest silent. > > > It sounds like you are inadvertently calling doctest with the verbose option. It is not standard behaviour to display "everything or nothing". Well, I never run doctest directly, I use regrtest (there are some doctests in the standard library). So perhaps the blame lies on regrtest or on the unittest adapter, my bad. Regards Antoine. From olemis at gmail.com Mon May 20 18:27:33 2013 From: olemis at gmail.com (Olemis Lang) Date: Mon, 20 May 2013 11:27:33 -0500 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: Hi ! :) I'll be replying some individual messages in this thread in spite of putting my replies in the right context . Sorry if I repeat something , or this makes the thread hard to read . Indeed , IMHO this is a subject suitable to discuss in TiP ML . On 5/19/13, Gregory P. Smith wrote: > On Sat, May 18, 2013 at 11:41 PM, Raymond Hettinger < > raymond.hettinger at gmail.com> wrote: > >> >> On May 14, 2013, at 9:39 AM, Gregory P. Smith wrote: >> >> Bad: doctests. >> >> >> I'm hoping that core developers don't get caught-up in the "doctests are >> bad meme". >> > > So long as doctests insist on comparing the repr of things being the number > one practice that people use when writing them there is no other position I > can hold on the matter. reprs are not stable and never have been. 
> ordering changes, hashes change, ids change, pointer values change, > wording and presentation of things change. none of those side effect > behaviors were ever part of the public API to be depended on. > ?Bad doctests? slogan is not positive because the subliminal message for new users is ?there's something wrong with that ... let's better not use it? . IMHO that's not true ; doctest is an incredibly awesome testing framework for delta assertions and there is nothing wrong with the philosophy behind that module and its intent . This surfaces an issue I've noticed years ago wrt doctest module (so, yes , it's obvious there's an issue ;) . The way I see it this is more about the fact that module frontend does not offer the means to benefit from all the possibilities of doctest classes in the backend (e.g. output checkers , doctest runners, ...) > That one can write doctests that don't depend on such things as the repr > doesn't ultimately matter because the easiest thing to do, as encouraged by > examples that are pasted from an interactive interpreter session into docs, > is to have the interactive interpreter show the repr and not add code to > check things in a accurate-for-testing manner that would otherwise make the > documentation harder for a human to read. > This is something that could be easily mitigated by a custom output checker . In the end , in docs there is no difference between output messages like '' or '' (i.e. some deterministic label like computed hex number or anything else ...) . You might also avoid printing repr(s) >> Instead, we should be clear about their primary purpose which is to test >> the examples given in docstrings. In many cases, there is a great deal >> of benefit to docstrings that have worked-out examples (see the >> docstrings >> in the decimal module for example). In such cases it is also worthwhile >> to make sure those examples continue to match reality. Doctests are >> a vehicle for such assurance. 
In other words, doctests have a perfectly >> legitimate use case. >> > > I really do applaud the goal of keeping examples in documentation up to > date. But doctest as it is today is the wrong approach to that. A repr > mismatch does not mean the example is out of date. > ... and I confess I never use doctest ?as it is today? in stdlib . So , you are right . > We should continue to encourage users to make thorough unit tests >> and to leave doctests for documentation. That said, it should be >> recognized that some testing is better than no testing. And doctests >> may be attractive in that regard because it is almost effortless to >> cut-and-paste a snippet from the interactive prompt. That isn't a >> best practice, but it isn't a worst practice either. >> > > Not quite, they at least tested something (yay!) but it is uncomfortably > close to a worst practice. > I disagree . IMO what is a bad practice is to spread the rumor that ?doctests are evil? rather than saying ?doctest module has limitations? > It means someone else needs to come understand the body of code containing > this doctest when they make an unrelated change that triggered a behavior > change as a side effect that the doctested code may or may not actually > depend on but does not actually declare its intent one way or another for > the purposes of being a readable example rather than accurate test. > I see no problem in keeping both these aspects . > bikeshed colors: If doctest were never called a test but instead were > called docchecker to not imply any testing aspect No way ! ( IMHO ) I just wrote dutest [1]_ framework , built on top of doctest and unittest , that does the following (among other things) : 1. Implements unittest loaders for doctests 2. Allows for customizing output checkers , doctest runners , ... anything you might find in the backend * For instance , replacing default test runner and output checkers might be useful to write delta assertions for command-line scripts 3. 
Tightly integrated with unittest (e.g. custom TestSuite(s) ...) 4. Access to unittest test case in special __tc__ variable , so all known assertion methods are handy ootb 5. Encapsulate doctest setup code (setUp , tearDown for doctests) e.g. to make doctests like the following work without actually writing in the docs all steps needed to load """Provided that ?complex scenario? is prepared and satisfies some preconditions (performed in hidden unittest setup methods) >>> do_something() >>> do_something_else() >>> ... """ I can report usage of dutest module in practice to test non-trivial web apps . I've even started to write a micro-framework on top of it to test Trac plugins (based on 3. , 4. , 5. above + twill ) , and it's possible to do the same thing for other web frameworks . Use cases might not be restricted to web apps ; like I mentioned above , custom output checkers + doctest runners will make it possible to test cli . ... so , in a few words , delta assertions (like doctests) are about testing and are much more than a doc checker. > that might've helped (too > late? the cat's out of the bag). just to create more confusion ... afaict > Or if it never compared anything but > simply ran the example code to generate and update the doc examples from > the statements with the current actual results of execution instead of > doing string comparisons... (ie: more of an documentation example "keep up > to date" tool) > Considering what I mentioned above , I disagree ... [...] > > In my earlier message I suggested that someone improve doctest to not do > dumb string comparisons of reprs. FWIW , that's possible with dutest [1]_ , indeed one of the main goals it was created for . > I still think that is a good goal if > doctest is going to continue to be promoted. It would help alleviate many > of the issues with doctests and bring them more in line with the issues > many people's regular unittests have. 
> > As Tres already showed in an example, > individual doctest using projects jump through hoops to do some of that > today; centralizing saner repr comparisons for less false failures as an > actual doctest feature just makes sense. > +1 Besides , IMHO doctest is very useful for APIs and lowers the barrier for writing testing code for people not used to XUnit philosophy . > Successful example: We added a bunch of new comparison methods to unittest > in 2.7 that make it much easier to write tests that don't depend on > implementation details such as ordering. Many users prefer to use those new > features; even with older Python's via unittest2 on pypi. It doesn't mean > users always write good tests, but a higher percentage of tests written are > more future proof than they were before because it became easier. > all this is possible with dutest [1]_ . Assertion methods of the underlying unittest test case are available in __tc__ variable e.g. """ >>> __tc__.assertEqual(x, y) """ so both approaches to testing may be combined ... and everybody *should* be happy . [...] .. [1] dutest module @ PyPI (https://pypi.python.org/pypi/dutest) -- Regards, Olemis. Apache? 
Bloodhound contributor http://issues.apache.org/bloodhound Blog ES: http://simelo-es.blogspot.com/ Blog EN: http://simelo-en.blogspot.com/ Featured article: From olemis at gmail.com Mon May 20 18:37:53 2013 From: olemis at gmail.com (Olemis Lang) Date: Mon, 20 May 2013 11:37:53 -0500 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> <20130520124557.4aad2b85@fsol> Message-ID: ---------- Forwarded message ---------- From: Olemis Lang Date: Mon, 20 May 2013 11:33:42 -0500 Subject: Re: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] To: Antoine Pitrou On 5/20/13, Antoine Pitrou wrote: > On Sat, 18 May 2013 23:41:59 -0700 > Raymond Hettinger wrote: >> >> We should continue to encourage users to make thorough unit tests >> and to leave doctests for documentation. That said, it should be >> recognized that some testing is better than no testing. And doctests >> may be attractive in that regard because it is almost effortless to >> cut-and-paste a snippet from the interactive prompt. That isn't a >> best practice, but it isn't a worst practice either. > > There are other reasons to hate doctest, such as the obnoxious > error reporting. Having to wade through ten pages of output to find > what went wrong is no fun. > +1 FWIW , while using dutest [1]_ each interactive example will be a test case and therefore the match for that particular assertion will be reported using the usual unittest output format .. [1] dutest (https://pypi.python.org/pypi/dutest) -- Regards, Olemis. Apache? 
Bloodhound contributor http://issues.apache.org/bloodhound Blog ES: http://simelo-es.blogspot.com/ Blog EN: http://simelo-en.blogspot.com/ Featured article: From dreamingforward at gmail.com Mon May 20 19:26:51 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 20 May 2013 10:26:51 -0700 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> Message-ID: >> I'm hoping that core developers don't get caught-up in the "doctests are bad >> meme". >> >> Instead, we should be clear about their primary purpose which is to test >> the examples given in docstrings. >> In other words, doctests have a perfectly legitimate use case. > > But more than just one ;-) Another great use has nothing to do with > docstrings: using an entire file as "a doctest". This encourages > writing lots of text explaining what you're doing,. with snippets of > code interspersed to illustrate that the code really does behave in > the ways you've claimed. +1, very true. I think doctest excel in almost every way above UnitTests. I don't understand the popularity of UnitTests, except perhaps for GUI testing which doctest can't handle. I think people just aren't very *imaginative* about how to create good doctests that are *also* good documentation. That serves two very good purposes in one. How can you beat that? The issues of teardown and setup are fixable and even more beautifully solved with doctests -- just use the lexical scoping of the program to determine the execution environment for the doctests. 
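For readers who have not tried the "entire file as a doctest" style that Tim
describes in the quoted text, here is a minimal sketch. The document text and
the helper name run_document are made up for this illustration; the only
library calls assumed are doctest.testfile() with its module_relative
parameter, plus tempfile and os from the standard library.

```python
import doctest
import os
import tempfile

# Hypothetical literate test document: prose with interactive examples mixed in.
DOCUMENT = """\
Sorting a list of small ints
============================

The built-in ``sorted`` returns a new list and leaves its argument alone:

    >>> data = [3, 1, 2]
    >>> sorted(data)
    [1, 2, 3]
    >>> data
    [3, 1, 2]
"""


def run_document(text):
    """Write text to a temporary file and run the whole file as one doctest."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        # module_relative=False lets testfile() accept a plain OS path.
        return doctest.testfile(path, module_relative=False)
    finally:
        os.remove(path)


if __name__ == "__main__":
    failed, tried = run_document(DOCUMENT)
    # Report success explicitly, since doctest is silent when everything passes.
    print("failed=%d tried=%d" % (failed, tried))
```

The explicit report at the end follows the habit Steven describes elsewhere
in this thread: a silent run is hard to tell apart from a run that never
happened.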
> picking-your-poison-ly y'rs - tim Cheers, Mark From g.brandl at gmx.net Mon May 20 19:38:28 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 20 May 2013 19:38:28 +0200 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: <519A4397.4090707@pearwood.info> References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <519A4397.4090707@pearwood.info> Message-ID: Am 20.05.2013 17:39, schrieb Steven D'Aprano: > On 21/05/13 00:12, Ethan Furman wrote: > > >> As a case in point, base64.py is currently getting a bug fix, and also >> contains this code: >> >> def b32decode(s, casefold=False, map01=None): . . . for i in range(0, >> len(s), 8): quanta = s[i: i + 8] acc = 0 try: for c in quanta: acc = (acc >> << 5) + b32rev[c] except KeyError: raise binascii.Error('Non-base32 digit >> found') . . . else: raise binascii.Error('Incorrect padding') >> >> Does the KeyError qualify as irrelevant noise? > > > IMO, it is irrelevant noise, and obviously so. The binascii.Error raised is > not a bug to be fixed, it is a deliberate exception and part of the API of > the binascii module. That it occurs inside an "except KeyError" block is a > mere implementation detail. It merely happens to be that digits are converted > by looking up in a mapping, another implementation might use a completely > different mechanism. In fact, the implementation in Python 3.3 *is* > completely different, and there is no KeyError to suppress. > > In another reply, R.David Murray answered: > > "I don't see that it is of benefit to suppress [the KeyError]." > > Can I suggest that it's obviously been a long, long time since you were a > beginner to the language, and you've forgotten how intimidating error > messages can be? Error messages should be *relevant*. Irrelevant details > don't help, they hinder, and I suggest that the KeyError is irrelevant. I agree. This is a case of a well isolated exception where there's no chance of hiding a bug because the KeyError was exceptional (). 
The argument of not making it harder than necessary to beginners (or casual users) seems valid to me, and since the code is being touched anyway, there shouldn't be unnecessary code churn. Georg From olemis at gmail.com Mon May 20 19:52:01 2013 From: olemis at gmail.com (Olemis Lang) Date: Mon, 20 May 2013 12:52:01 -0500 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: <51998B88.2020600@pearwood.info> References: <5192292F.6050406@pearwood.info> <51998B88.2020600@pearwood.info> Message-ID: On 5/19/13, Steven D'Aprano wrote: > On 20/05/13 09:27, Gregory P. Smith wrote: >> On Sat, May 18, 2013 at 11:41 PM, Raymond Hettinger < >> raymond.hettinger at gmail.com> wrote: >> >>> >>> On May 14, 2013, at 9:39 AM, Gregory P. Smith wrote: >>> >>> Bad: doctests. >>> >>> >>> I'm hoping that core developers don't get caught-up in the "doctests are >>> bad meme". >>> >> >> So long as doctests insist on comparing the repr of things being the >> number >> one practice that people use when writing them there is no other position >> I >> can hold on the matter. reprs are not stable and never have been. > > I think this *massively* exaggerates the "problem" with doc tests. I agree , and it is a negative influence for beginners . > I make > heavy use of them, and have no problem writing doc tests that work in code > running over multiple versions, including from 2.4 through 3.3. Objects that > I write myself, I control the repr and can make it as stable as I wish. Many > built-in types also have stable reprs. The repr for small ints is not going > to change, the repr for floats like 0.5, 0.25, 0.125 etc. are stable and > predictable, lists and tuples and strings all have stable well-defined > reprs. Dicts are a conspicuous counter-example, but there are trivial > work-arounds. > +1 > Doc tests are not limited to a simple-minded "compare the object's repr". Yes > You can write as much, or as little, scaffolding around the test as you > need. 
If the scaffolding becomes too large, that's a sign that the test > doesn't belong in documentation and should be moved out, perhaps into a unit > test, or perhaps into a separate "literate testing" document that can be as > big as necessary without overwhelming the doc string. > There is an alternate approach related to a feature of dutest [1]_ I mentioned in a previous message (i.e. doctests setUp and tearDown methods) . The main reason to desire to leave long doctests scaffolding code out (e.g. loading a Trac environment, or setting up a separate Python virtual environment , subversion repository , ... as part of -unit, functional, ...- test setup ) is to focus on SUT / API details , avoid repetition of some steps , and keep tests readable . This code is moved to underlying unittest setUp method and it's still possible to write readable doctests for the particular feature of the SUT . In general there's a need to find a balance to decide what should be ?hidden? in doctests fixture methods and what should be written in doctests . Based on my experience there's no benefit in using unittest over doctests unittests : - are unreadable - require knowledge of XUnit , etc ... - Writing complex assertions might be hard and tedious doctests: - are extremely readable - anybody familiar with the SUT could write tests - especially for modules that are meant to be used by persons who are not (professional / skilled) software developers encapsulating the use of a testing framework is a plus ; your test suite is ?talking in users language? (/me not sure about stdlib ...) > >> ordering changes, hashes change, ids change, pointer values change, >> wording and presentation of things change. none of those side effect >> behaviors were ever part of the public API to be depended on. > > Then don't write doctests that depend on those things. It really is that > simple. There's no rule that says doctests have to test the entire API. 
> Doctests in docstrings are *documentation first*, so you write tests that > make good documentation. > ... but someone could do so , if it wasn't by the current limitations of doctest frontend . ;) > The fact that things that are not stable parts of the API can be tested is > independent of the framework you use to do the testing. If I, as an ignorant > and foolish developer, wrote a unit test like this: > > class MyDumbTest(unittest.TestCase): > def testSpamRepr(self): > x = Spam(arg) > self.assertEquals(repr(x), "") > > > we shouldn't conclude that "unit tests are bad", but that MyDumbTest is bad > and needs to be fixed. +1 [...] > And that's great, it really is, I'm not being sarcastic. But unit testing is > not in competition to doc testing, they are complimentary, not alternatives. > If you're not using both, then you're probably missing out on something. > +1 PS: ... and well , this would be my last message about dutest and how it improves upon what's offered by doctest module ... Summarizing : ?Bad doctests? is not a cool statement .. [1] dutest @ PyPI (https://pypi.python.org/pypi/dutest) -- Regards, Olemis. Apache? Bloodhound contributor http://issues.apache.org/bloodhound Blog ES: http://simelo-es.blogspot.com/ Blog EN: http://simelo-en.blogspot.com/ Featured article: From dreamingforward at gmail.com Mon May 20 20:14:49 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 20 May 2013 11:14:49 -0700 Subject: [Python-Dev] What if we didn't have repr? In-Reply-To: References: Message-ID: > I have pondered it many times, although usually in the form "Why do we > need both str and repr?" Here's an idea: considering python objects are "stateful". Make a general, state-query operator: "?". Then the distinction is clear. >>> ?"This is a string" #Returns the contents of the string This is a string Then repr() is clearly the object "as it is" -- unstripped; i.e., not just it's state (or contents, or whatever). 
--
MarkJ
Tacoma, Washington

From storchaka at gmail.com  Mon May 20 20:31:44 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 20 May 2013 21:31:44 +0300
Subject: [Python-Dev] PEP 409 and the stdlib
In-Reply-To: <519A2149.3040903@stoneleaf.us>
References: <519A2149.3040903@stoneleaf.us>
Message-ID:

20.05.13 16:12, Ethan Furman wrote:
> As a quick reminder, PEP 409 allows this:
>
>     try:
>         ...
>     except AnError:
>         raise SomeOtherError from None
>
> so that if the exception is not caught, we get the traditional single
> exception traceback, instead of the new:
>
>     During handling of the above exception, another exception occurred
>
>
> My question:
>
> How do we go about putting this in the stdlib?  Is this one of the
> occasions where we don't do it unless we're modifying a module already
> for some other reason?

Usually I use "from None" in new code when it hides irrelevant details.
But in the case of b32decode() (changeset 1b5ef05d6ced) I didn't do it.
That is my fault; I'll fix it in the next commit.

From tjreedy at udel.edu  Mon May 20 20:32:53 2013
From: tjreedy at udel.edu (Terry Jan Reedy)
Date: Mon, 20 May 2013 14:32:53 -0400
Subject: [Python-Dev] PEP 409 and the stdlib
In-Reply-To: <519A4397.4090707@pearwood.info>
References: <519A2149.3040903@stoneleaf.us>
 <519A2F37.6020406@stoneleaf.us>
 <519A4397.4090707@pearwood.info>
Message-ID:

On 5/20/2013 11:39 AM, Steven D'Aprano wrote:
> On 21/05/13 00:12, Ethan Furman wrote:
>
>
>> As a case in point, base64.py is currently getting a bug fix, and also
>> contains this code:
>>
>> def b32decode(s, casefold=False, map01=None):
>>     .
>>     .
>>     .
>>     for i in range(0, len(s), 8):
>>         quanta = s[i: i + 8]
>>         acc = 0
>>         try:
>>             for c in quanta:
>>                 acc = (acc << 5) + b32rev[c]
>>         except KeyError:
>>             raise binascii.Error('Non-base32 digit found')
>>         .
>>         .
>>         .
>>         else:
>>             raise binascii.Error('Incorrect padding')
>>
>> Does the KeyError qualify as irrelevant noise?
>
>
> IMO, it is irrelevant noise, and obviously so. The binascii.Error raised
> is not a bug to be fixed, it is a deliberate exception and part of the
> API of the binascii module. That it occurs inside an "except KeyError"
> block is a mere implementation detail.

Yes, the code could be revised to make a check on c before the indexing.
This would be redundant (and a slowdown) in that the check is already
done by the indexing mechanism. The whole point of the above is to
*replace* the default KeyError with a custom binascii.Error for
too-large chars.

And I agree with Georg, please say which bad digit was found.

Terry

From ethan at stoneleaf.us  Mon May 20 20:23:10 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 20 May 2013 11:23:10 -0700
Subject: [Python-Dev] What if we didn't have repr?
In-Reply-To:
References:
Message-ID: <519A6A0E.7050809@stoneleaf.us>

On 05/20/2013 11:14 AM, Mark Janssen wrote:
>> I have pondered it many times, although usually in the form "Why do we
>> need both str and repr?"
>
> Here's an idea: considering python objects are "stateful". Make a
> general, state-query operator: "?". Then the distinction is clear.
>
>--> ?"This is a string"  #Returns the contents of the string
> This is a string
>
> Then repr() is clearly the object "as it is" -- unstripped; i.e., not
> just it's state (or contents, or whatever).

You can have that now, just make your __repr__ do what you want.
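To make that suggestion concrete, here is a small sketch of a class that
defines both methods. The Temperature class is hypothetical and exists only
for this illustration; the only behaviour relied on is the documented one
that repr() calls __repr__, str() calls __str__, and containers display
their items using repr().

```python
class Temperature:
    """Hypothetical value class used only for illustration."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __repr__(self):
        # Unambiguous form, aimed at developers and debuggers.
        return "Temperature(%r)" % (self.degrees,)

    def __str__(self):
        # Friendly form, aimed at end users.
        return "%s degrees" % (self.degrees,)


t = Temperature(21.5)
print(repr(t))  # Temperature(21.5)
print(str(t))   # 21.5 degrees
print([t])      # [Temperature(21.5)], containers show items via repr()
```

If a class defines only __repr__, str() falls back to it, which is one way
to get a single representation without any new operator.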
-- ~Ethan~ From ethan at stoneleaf.us Mon May 20 20:58:24 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 20 May 2013 11:58:24 -0700 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <519A4397.4090707@pearwood.info> Message-ID: <519A7250.8070704@stoneleaf.us> On 05/20/2013 11:32 AM, Terry Jan Reedy wrote: > On 5/20/2013 11:39 AM, Steven D'Aprano wrote: >> On 21/05/13 00:12, Ethan Furman wrote: >> >> >>> As a case in point, base64.py is currently getting a bug fix, and also >>> contains this code: >>> >>> def b32decode(s, casefold=False, map01=None): >>> . >>> . >>> . >>> for i in range(0, len(s), 8): >>> quanta = s[i: i + 8] >>> acc = 0 >>> try: >>> for c in quanta: >>> acc = (acc << 5) + b32rev[c] >>> except KeyError: >>> raise binascii.Error('Non-base32 digit found') >>> . >>> . >>> . >>> else: >>> raise binascii.Error('Incorrect padding') >>> >>> Does the KeyError qualify as irrelevant noise? >> >> >> IMO, it is irrelevant noise, and obviously so. The binascii.Error raised >> is not a bug to be fixed, it is a deliberate exception and part of the >> API of the binascii module. That it occurs inside an "except KeyError" >> block is a mere implementation detail. > > Yes, the code could be revised to make a check on c before the indexing. > This would be redundant (and a slowdown) in that the check is already done by the indexing mechanism. The whole point of > the above is to *replace* the default KeyError with a custom binascii.Error for too-large chars. > > And I agree with Georg, please say which bad digit was found. Actually, that was Antoine, but I'm sure Georg also agrees. ;) -- ~Ethan~ From eric at trueblade.com Mon May 20 21:31:48 2013 From: eric at trueblade.com (Eric V. 
Smith) Date: Mon, 20 May 2013 15:31:48 -0400 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: <20130520174638.12fae7ee@fsol> References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520174638.12fae7ee@fsol> Message-ID: On May 20, 2013, at 11:46 AM, Antoine Pitrou wrote: > On Mon, 20 May 2013 07:12:07 -0700 > Ethan Furman wrote: >> >> As a case in point, base64.py is currently getting a bug fix, and also contains this code: >> >> def b32decode(s, casefold=False, map01=None): >> . >> . >> . >> for i in range(0, len(s), 8): >> quanta = s[i: i + 8] >> acc = 0 >> try: >> for c in quanta: >> acc = (acc << 5) + b32rev[c] >> except KeyError: >> raise binascii.Error('Non-base32 digit found') >> . >> . >> . >> else: >> raise binascii.Error('Incorrect padding') >> >> Does the KeyError qualify as irrelevant noise? > > I think it is a legitimate case where to silence the original > exception. However, the binascii.Error would be more informative if it > said *which* non-base32 digit was encountered. > And, if possible, the location (index) in the string. Eric. From ncoghlan at gmail.com Mon May 20 23:37:38 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 21 May 2013 07:37:38 +1000 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: <519A7250.8070704@stoneleaf.us> References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <519A4397.4090707@pearwood.info> <519A7250.8070704@stoneleaf.us> Message-ID: On 21 May 2013 05:01, "Ethan Furman" wrote: > > On 05/20/2013 11:32 AM, Terry Jan Reedy wrote: >> >> On 5/20/2013 11:39 AM, Steven D'Aprano wrote: >>> >>> On 21/05/13 00:12, Ethan Furman wrote: >>> >>> >>>> As a case in point, base64.py is currently getting a bug fix, and also >>>> contains this code: >>>> >>>> def b32decode(s, casefold=False, map01=None): >>>> . >>>> . >>>> . 
>>>>     for i in range(0, len(s), 8):
>>>>         quanta = s[i: i + 8]
>>>>         acc = 0
>>>>         try:
>>>>             for c in quanta:
>>>>                 acc = (acc << 5) + b32rev[c]
>>>>         except KeyError:
>>>>             raise binascii.Error('Non-base32 digit found')
>>>>         .
>>>>         .
>>>>         .
>>>>         else:
>>>>             raise binascii.Error('Incorrect padding')
>>>>
>>>> Does the KeyError qualify as irrelevant noise?
>>>
>>>
>>>
>>> IMO, it is irrelevant noise, and obviously so. The binascii.Error raised
>>> is not a bug to be fixed, it is a deliberate exception and part of the
>>> API of the binascii module. That it occurs inside an "except KeyError"
>>> block is a mere implementation detail.
>>
>>
>> Yes, the code could be revised to make a check on c before the indexing.
>> This would be redundant (and a slowdown) in that the check is already
>> done by the indexing mechanism. The whole point of
>> the above is to *replace* the default KeyError with a custom
>> binascii.Error for too-large chars.
>>
>> And I agree with Georg, please say which bad digit was found.
>
>
> Actually, that was Antoine, but I'm sure Georg also agrees.  ;)

Indeed, a good question to ask when making use of PEP 409 is what
debugging info is being lost by suppressing the original exception, and
then making sure that info is captured and reported by the outer exception.

There's probably a new PEP 8 guideline in this thread - perhaps something
based on the above paragraph.

Cheers,
Nick.

>
> --
> ~Ethan~
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com
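A sketch of what the suggestions in this thread could look like when
combined: suppress the KeyError with "from None" (PEP 409), but carry the
offending digit and its position in the new message so that nothing useful
is lost. This is not the actual base64.py change; decode_quantum, _b32rev,
the offset parameter and the message wording are all made up for the
illustration.

```python
import binascii

# Toy reverse table for illustration only; base64.py builds its own table.
_B32_ALPHABET = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
_b32rev = {byte: value for value, byte in enumerate(_B32_ALPHABET)}


def decode_quantum(quantum, offset=0):
    """Fold a run of base32 digits (a bytes object) into one integer."""
    acc = 0
    for index, c in enumerate(quantum):
        try:
            acc = (acc << 5) + _b32rev[c]
        except KeyError:
            # "from None" hides the KeyError from the traceback, so the
            # message has to say which digit was bad and where it was.
            raise binascii.Error(
                "Non-base32 digit %r found at index %d"
                % (bytes([c]), offset + index)
            ) from None
    return acc


try:
    decode_quantum(b"MZXW6YT1")
except binascii.Error as exc:
    print(exc)  # Non-base32 digit b'1' found at index 7
```

Note that "from None" only changes what the default traceback display
shows; the original KeyError is still reachable through the __context__
attribute of the new exception for anyone who needs it while debugging.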
From fuzzyman at voidspace.org.uk  Tue May 21 00:26:53 2013
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 20 May 2013 23:26:53 +0100
Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum]
In-Reply-To:
References: <5192292F.6050406@pearwood.info>
Message-ID: <3CD9CCA8-836D-4527-A699-235E6A9BCF90@voidspace.org.uk>

On 20 May 2013, at 18:26, Mark Janssen wrote:

>>> I'm hoping that core developers don't get caught-up in the "doctests are bad
>>> meme".
>>>
>>> Instead, we should be clear about their primary purpose which is to test
>>> the examples given in docstrings.
>>> In other words, doctests have a perfectly legitimate use case.
>>
>> But more than just one ;-)  Another great use has nothing to do with
>> docstrings:  using an entire file as "a doctest".   This encourages
>> writing lots of text explaining what you're doing,. with snippets of
>> code interspersed to illustrate that the code really does behave in
>> the ways you've claimed.
>
> +1, very true.  I think doctest excel in almost every way above
> UnitTests.  I don't understand the popularity of  UnitTests, except
> perhaps for GUI testing which doctest can't handle.  I think people
> just aren't very *imaginative* about how to create good doctests that
> are *also* good documentation.
>

Doc tests have lots of problems for unit testing.

* Every line is a test with *all* output part of the test - in unit tests
  you only assert the specific details you're interested in
* Unordered types are a pain with doctest unless you jump through hoops
* Tool support for editing within doctests is *generally* worse
* A failure on one line doesn't halt execution, so you can get many many
  reported errors from a single failure
* Try adding diagnostic prints and then running your doctests!
* Tool support in terms of test discovery and running individual tests is
  not as smooth
* Typing >>> and ... all the time is really annoying
* Doctests practically beg you to write your code first and then copy and
  paste terminal sessions - they're the enemy of TDD
* Failure messages are not overly helpful and you lose the benefit of some
  of the new asserts (and their diagnostic output) in unittest
* Tests with non-linear code paths (branches) are more painful to express
  in doctests

and so on...

However doctests absolutely rock for testing documentation / docstring
examples. So I'm with Raymond on this one.

All the best,

Michael

> That serves two very good purposes in one.  How can you beat that?
> The issues of teardown and setup are fixable and even more beautifully
> solved with doctests -- just use the lexical scoping of the program to
> determine the execution environment for the doctests.
>
>> picking-your-poison-ly y'rs  - tim
>
> Cheers,
>
> Mark

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing
http://www.sqlite.org/different.html

From rosuav at gmail.com  Tue May 21 00:31:23 2013
From: rosuav at gmail.com (Chris Angelico)
Date: Tue, 21 May 2013 08:31:23 +1000
Subject: [Python-Dev] Ordering keyword dicts
In-Reply-To: <51997DF5.9080902@canterbury.ac.nz>
References: <1368975441504.b153a948@Nodemailer>
 <51997DF5.9080902@canterbury.ac.nz>
Message-ID:

On Mon, May 20, 2013 at 11:35 AM, Greg Ewing wrote:
> Joao S. O. Bueno wrote:
>>
>> Actually, when I was thinking on the subject I came to the same idea, of
>> having
>> some functions marked differently so they would use a different call
>> mechanism -
>> but them I wondered around having a different opcode for the ordered-dict
>> calls.
>> >> Would that be feasible? > > > No, because the callee is the only one that knows whether it > requires its keyword args to be ordered. > > In fact, not even the callee might know at the time of the > call. Consider a function that takes **kwds and passes them > on to another function that requires ordered keywords. I wouldn't be bothered by that case, as it's no different from any other means of stuffing a dictionary through **kwds. If you want to preserve order through a wrapper, the wrapper needs to be declared to preserve order. The trouble is that there can't be any compile-time lookup to determine what (type of) function will be called, ergo this can't be resolved with a unique bytecode based on the destination. How big a deal would it be to bless OrderedDict with a special literal notation? Something like: od = o{'a': 1, 'b': 2, 'c': 3} Much of the need for ordered kwargs is for constructing OrderedDict itself after all (cf Antoine). ChrisA From barry at python.org Tue May 21 00:48:32 2013 From: barry at python.org (Barry Warsaw) Date: Mon, 20 May 2013 18:48:32 -0400 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: <3CD9CCA8-836D-4527-A699-235E6A9BCF90@voidspace.org.uk> References: <5192292F.6050406@pearwood.info> <3CD9CCA8-836D-4527-A699-235E6A9BCF90@voidspace.org.uk> Message-ID: <20130520184832.77df6e8b@anarchist> I don't think a python-dev discussion about the value of doctests is going to change minds one way or the other, but I just *had* to respond to this one point: On May 20, 2013, at 11:26 PM, Michael Foord wrote: >* Doctests practically beg you to write your code first and then copy and >* paste terminal sessions - they're the enemy of TDD In a sense, they're your best friend too. Countless times, when I'm designing an API, writing the documentation first helps clarify how I want the library to work, or where I need to think about the API more deeply. 
In much the same way that TDD is ideal when you know what you're aiming for, when you *don't* exactly know, it's a huge benefit to write the documentation first. Doing so will bring into stark contrast what needs improvement in your API. The fact that you can then test much of this documentation as you go, brings the win of TDD to your documentation. -Barry From v+python at g.nevcal.com Tue May 21 01:10:20 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Mon, 20 May 2013 16:10:20 -0700 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: <5199A1AD.4000306@stoneleaf.us> References: <1368974902254.f812247c@Nodemailer> <51996873.6020404@nedbatchelder.com> <20130520021425.4a544ae2@fsol> <5199A1AD.4000306@stoneleaf.us> Message-ID: <519AAD5C.90003@g.nevcal.com> On 5/19/2013 9:08 PM, Ethan Furman wrote: > On 05/19/2013 05:24 PM, Nick Coghlan wrote: >> >> This is the point I was trying to make: once you use IntEnum (as you >> would in any case where you need bitwise operators), Enum gets out of >> the way for everything other than __str__, __repr__, and one other >> slot (that escapes me for the moment...). > > __getnewargs__ and __new__ > > But if you do math, the result is no longer an Enum of any type. And thus completely loses the debugging benefits of having a nice __repr__. IntEnum isn't useful for bitfields. -------------- next part -------------- An HTML attachment was scrubbed... URL: From olemis at gmail.com Tue May 21 01:35:34 2013 From: olemis at gmail.com (Olemis Lang) Date: Mon, 20 May 2013 18:35:34 -0500 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: <3CD9CCA8-836D-4527-A699-235E6A9BCF90@voidspace.org.uk> References: <5192292F.6050406@pearwood.info> <3CD9CCA8-836D-4527-A699-235E6A9BCF90@voidspace.org.uk> Message-ID: Hi ! ... sorry , I could not avoid to reply this message ... 
On 5/20/13, Michael Foord wrote: > > On 20 May 2013, at 18:26, Mark Janssen wrote: > >>>> I'm hoping that core developers don't get caught-up in the "doctests are >>>> bad >>>> meme". >>>> >>>> Instead, we should be clear about their primary purpose which is to >>>> test >>>> the examples given in docstrings. >>>> In other words, doctests have a perfectly legitimate use case. >>> >>> But more than just one ;-) Another great use has nothing to do with >>> docstrings: using an entire file as "a doctest". This encourages >>> writing lots of text explaining what you're doing,. with snippets of >>> code interspersed to illustrate that the code really does behave in >>> the ways you've claimed. >> >> +1, very true. I think doctest excel in almost every way above >> UnitTests. I don't understand the popularity of UnitTests, except >> perhaps for GUI testing which doctest can't handle. I think people >> just aren't very *imaginative* about how to create good doctests that >> are *also* good documentation. >> > With enhanced doctests solution in mind ... > Doc tests have lots of problems for unit testing. > > * Every line is a test with *all* output part of the test - in unit tests > you only assert the specific details you're interested in custom output checkers > * Unordered types are a pain with doctest unless you jump through hoops ( custom output checkers + doctest runner ) | (dutest __tc__ global var) > * Tool support for editing within doctests is *generally* worse this is true , let's do it ! > * A failure on one line doesn't halt execution, so you can get many many > reported errors from a single failure it should if REPORT_ONLY_FIRST_FAILURE option [1]_ is set . > * Try adding diagnostic prints and then running your doctests! I have ... dutest suites for my Trac plugins do so . However logging information is outputted to /path/to/trac/env/log/trac.log ... so a tail -f is always handy . 
> * Tools support in terms of test discovery and running individual tests is > not as smooth dutest offers two options since years ago MultiTestLoader combines multiple test loaders to *load* different kinds of tests at once from a module , whereas a package loader performs test discovery . These loader objects are composable , so if an instance of MultiTestLoader is supplied in to the package test loader then multiple types of tests are loaded out of modules all over across the package hierarchy . Indeed , in +10 years of Python development I've never used unittest(2) discovery, and even recently implemented the one that's used in Apache? Bloodhound test suite . Unfortunately I've had no much time to spend on improving all this support in dutest et al. > * Typing >>> and ... all the time is really annoying ... I have faith ... there should be something like this for vim ... I have faith ... ;) > * Doctests practically beg you to write your code first and then copy and > paste terminal sessions - they're the enemy of TDD Of course , not , all the opposite . If the approach is understood correctly then the first thing test author will do is to write the code ?expected? to get something done . When everything is ok with API code style then write the code . Many problems in the API and inconsistencies are thus detected early . > * Failure messages are not over helpful and you lose the benefit of some of > the new asserts (and their diagnostic output) in unittest (custom ouput checkers) | ( dutest __tc__ variable ) > * Tests with non-linear code paths (branches) are more painful to express in > doctests > that's a fact , not just branches , but also exceptions Beyond this ... My really short answer is that I do not agree with this . Like I just said in previous messages with enhanced support like the one offered by dutest (i.e. __tc__ global var bound to an instance of unittest.TestCase) it's possible to invoke each and every unittest assertion method . 
So this may be seen all the other way round ?unittest machinery is already used without even declaring a single test class? ... and so on ... ... so , in concept , there is no real benefit in using unittest over doctest *if* doctest module is eventually upgraded . [...] > > However doctests absolutely rock for testing documentation / docstring > examples. > FWIW , +1 [...] .. [1] doctest.REPORT_ONLY_FIRST_FAILURE (http://docs.python.org/2/library/doctest.html#doctest.REPORT_ONLY_FIRST_FAILURE) -- Regards, Olemis. Apache? Bloodhound contributor http://issues.apache.org/bloodhound Blog ES: http://simelo-es.blogspot.com/ Blog EN: http://simelo-en.blogspot.com/ Featured article: From olemis at gmail.com Tue May 21 01:59:11 2013 From: olemis at gmail.com (Olemis Lang) Date: Mon, 20 May 2013 18:59:11 -0500 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> <3CD9CCA8-836D-4527-A699-235E6A9BCF90@voidspace.org.uk> Message-ID: On 5/20/13, Olemis Lang wrote: [...] > On 5/20/13, Michael Foord wrote: [...] > >> * Tool support for editing within doctests is *generally* worse > > this is true , let's do it ! > [...] >> * Typing >>> and ... all the time is really annoying > > ... I have faith ... there should be something like this for vim ... I > have faith ... ;) > FWIW ... an option could be to combine >>> auto-completion (in the end that's yet another indentation ;) to this http://architects.dzone.com/articles/real-time-doctest-checking-vim ... and I could better enjoy my vim + python development experience ;) -- Regards, Olemis. Apache? 
Bloodhound contributor http://issues.apache.org/bloodhound Blog ES: http://simelo-es.blogspot.com/ Blog EN: http://simelo-en.blogspot.com/ Featured article: From dreamingforward at gmail.com Tue May 21 03:55:15 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 20 May 2013 18:55:15 -0700 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> <3CD9CCA8-836D-4527-A699-235E6A9BCF90@voidspace.org.uk> Message-ID: >> * Doctests practically beg you to write your code first and then copy and >> paste terminal sessions - they're the enemy of TDD > > Of course , not , all the opposite . If the approach is understood > correctly then the first thing test author will do is to write the > code ?expected? to get something done . When everything is ok with API > code style then write the code . Many problems in the API and > inconsistencies are thus detected early . Now all we need is a test() built-in, a companion to help() and we have the primo platform for doctest-code-test cycle for TDD and agile development. --Mark From v+python at g.nevcal.com Tue May 21 08:44:50 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Mon, 20 May 2013 23:44:50 -0700 Subject: [Python-Dev] PEP 435 - ref impl disc 2 In-Reply-To: <51924739.1050803@stoneleaf.us> References: <5185FCB1.6030702@g.nevcal.com> <518DD40F.1070005@g.nevcal.com> <5191A348.90805@stoneleaf.us> <5191C51D.3000304@g.nevcal.com> <5191D590.2070601@stoneleaf.us> <51924739.1050803@stoneleaf.us> Message-ID: <519B17E2.5000701@g.nevcal.com> On 5/14/2013 7:16 AM, Ethan Furman wrote: >> Thank you for being persistent. You are correct, the value should be >> an IntET (at least, with a custom __new__ ;). > > You know, when you look at something you wrote the night before, and > have no idea what you were trying to say, you know you were tired. > Ignore my parenthetical remark. Gladly. 
And we now have several more days to have forgotten what we were doing/talking about... > Okay, the value is now an IntET, as expected and appropriate. Maybe. I upgraded my ref435.py from yours at https://bitbucket.org/stoneleaf/ref435 (and your test file there references enum.py which is not there). My demo1.py still doesn't work. The first 4 lines are fine, but not the last two. I still cannot do a lookup (via __call__ syntax) by either int or IntET value. You have my old misnamed NEI class in your test file now, and the tests you use with it work... but you don't have a lookup test. My demo1 does, and that fails. After instrumenting Enum.__new__ it seems that the member.value is still the constructor parameters... Maybe I picked up the wrong version of your code? Oh and demo1.py has leftover __new__ and __init__ definitions for NIE, modeled after your earlier suggestions. Leaving them in causes everything to be named 'temp'. Taking them out makes things not work differently. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hrvoje.niksic at avl.com Tue May 21 09:17:29 2013 From: hrvoje.niksic at avl.com (Hrvoje Niksic) Date: Tue, 21 May 2013 09:17:29 +0200 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: <519A3DFC.5090705@stoneleaf.us> References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520145004.1B1FA250BCF@webabinitio.net> <519A3DFC.5090705@stoneleaf.us> Message-ID: <519B1F89.4030003@avl.com> On 05/20/2013 05:15 PM, Ethan Furman wrote: > 1) Do nothing and be happy I use 'raise ... from None' in my own libraries > > 2) Change the wording of 'During handling of the above exception, another exception occurred' (no ideas as to what at > the moment) The word "occurred" misleads one to think that, during handling of the real exception, an unrelated and unintended exception occurred. This is not the case when the "raise" keyword is used. 
In that case, the exception was intentionally *converted* from one type to
another.  For the "raise" case a wording like the following might work
better:

     The above exception was converted to the following exception:
     ...

That makes it clear that the conversion was explicit and (hopefully)
intentional, and that the latter exception supersedes the former.

Hrvoje

From storchaka at gmail.com  Tue May 21 10:36:29 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 21 May 2013 11:36:29 +0300
Subject: [Python-Dev] PEP 409 and the stdlib
In-Reply-To: <519B1F89.4030003@avl.com>
References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520145004.1B1FA250BCF@webabinitio.net> <519A3DFC.5090705@stoneleaf.us> <519B1F89.4030003@avl.com>
Message-ID:

21.05.13 10:17, Hrvoje Niksic wrote:
> On 05/20/2013 05:15 PM, Ethan Furman wrote:
>> 1) Do nothing and be happy I use 'raise ... from None' in my own
>> libraries
>>
>> 2) Change the wording of 'During handling of the above exception,
>> another exception occurred' (no ideas as to what at
>> the moment)
>
> The word "occurred" misleads one to think that, during handling of the
> real exception, an unrelated and unintended exception occurred. This is
> not the case when the "raise" keyword is used. In that case, the
> exception was intentionally *converted* from one type to another. For
> the "raise" case a wording like the following might work better:
>
>      The above exception was converted to the following exception:
>      ...
>
> That makes it clear that the conversion was explicit and (hopefully)
> intentional, and that the latter exception supersedes the former.

How do you distinguish intentional and unintentional exceptions?
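
[Archive note] The three behaviours this thread keeps returning to (a plain
raise inside an except block, "raise X from Y", and "raise X from None") can
be told apart at runtime by attributes on the new exception. The following
is only a minimal sketch, assuming Python 3.3 or later; the function names
and messages are hypothetical and do not come from any message in the
thread.

```python
import traceback


def implicit():
    # Plain raise inside an except block: the KeyError is kept as __context__.
    try:
        {}["key"]
    except KeyError:
        raise ValueError("implicit chaining")


def explicit():
    # "raise X from Y": the KeyError is also recorded as __cause__.
    try:
        {}["key"]
    except KeyError as exc:
        raise ValueError("explicit chaining") from exc


def suppressed():
    # "raise X from None": the KeyError is still recorded as __context__,
    # but its display is suppressed.
    try:
        {}["key"]
    except KeyError:
        raise ValueError("context suppressed") from None


def describe(func):
    """Call func and summarise how the ValueError it raises is chained."""
    try:
        func()
    except ValueError as err:
        text = "".join(
            traceback.format_exception(type(err), err, err.__traceback__)
        )
        return {
            "cause": type(err.__cause__).__name__,
            "context": type(err.__context__).__name__,
            "suppress_context": err.__suppress_context__,
            # True when the formatted traceback still reports the KeyError.
            "shows_original": "KeyError: 'key'" in text,
        }


results = {f.__name__: describe(f) for f in (implicit, explicit, suppressed)}
for name, summary in results.items():
    print(name, summary)
```

In this sketch only the "from None" variant drops the KeyError from the
formatted traceback, while the original exception remains reachable through
__context__ in all three cases.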

From hrvoje.niksic at avl.com  Tue May 21 11:28:33 2013
From: hrvoje.niksic at avl.com (Hrvoje Niksic)
Date: Tue, 21 May 2013 11:28:33 +0200
Subject: [Python-Dev] PEP 409 and the stdlib
In-Reply-To:
References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520145004.1B1FA250BCF@webabinitio.net> <519A3DFC.5090705@stoneleaf.us> <519B1F89.4030003@avl.com>
Message-ID: <519B3E41.3090501@avl.com>

On 05/21/2013 10:36 AM, Serhiy Storchaka wrote:
>>      The above exception was converted to the following exception:
>>      ...
>>
>> That makes it clear that the conversion was explicit and (hopefully)
>> intentional, and that the latter exception supersedes the former.
>
> How do you distinguish intentional and unintentional exceptions?

By the use of the "raise" keyword.  Given the code:

    try:
        x = d['key']
    except KeyError:
        raise BusinessError(...)

...the explicit raising is a giveaway that the new exception was quite
intentional.

Hrvoje

From v+python at g.nevcal.com  Tue May 21 11:16:23 2013
From: v+python at g.nevcal.com (Glenn Linderman)
Date: Tue, 21 May 2013 02:16:23 -0700
Subject: [Python-Dev] PEP 435 - ref impl disc 2
In-Reply-To: <519B17E2.5000701@g.nevcal.com>
References: <5185FCB1.6030702@g.nevcal.com> <518DD40F.1070005@g.nevcal.com> <5191A348.90805@stoneleaf.us> <5191C51D.3000304@g.nevcal.com> <5191D590.2070601@stoneleaf.us> <51924739.1050803@stoneleaf.us> <519B17E2.5000701@g.nevcal.com>
Message-ID: <519B3B67.5000804@g.nevcal.com>

On 5/20/2013 11:44 PM, Glenn Linderman wrote:
> On 5/14/2013 7:16 AM, Ethan Furman wrote:
>>> Thank you for being persistent.  You are correct, the value should
>>> be an IntET (at least, with a custom __new__ ;).
>>
>> You know, when you look at something you wrote the night before, and
>> have no idea what you were trying to say, you know you were tired.
>> Ignore my parenthetical remark.
>
> Gladly. And we now have several more days to have forgotten what we
> were doing/talking about...
>
>> Okay, the value is now an IntET, as expected and appropriate.
>
> Maybe.
>
> I upgraded my ref435.py from yours at
> https://bitbucket.org/stoneleaf/ref435 (and your test file there
> references enum.py which is not there).
>
> My demo1.py still doesn't work.  The first 4 lines are fine, but not
> the last two.  I still cannot do a lookup (via __call__ syntax) by
> either int or IntET value.
>
> You have my old misnamed NEI class in your test file now, and the
> tests you use with it work... but you don't have a lookup test. My
> demo1 does, and that fails.
>
> After instrumenting Enum.__new__ it seems that the member.value is
> still the constructor parameters...
>
> Maybe I picked up the wrong version of your code?
>
> Oh and demo1.py has leftover __new__ and __init__ definitions for NIE,
> modeled after your earlier suggestions.  Leaving them in causes
> everything to be named 'temp'. Taking them out makes things not work
> differently.

Oh, it was an hg misunderstanding (hg newbie here)... I wasn't getting the
latest code. Things are working much better now.

I notice, however, with my latest code at
https://v_python at bitbucket.org/v_python/ref435a that demo1, which has an
explicit duplicate name, and no __new__ or __init__ code, has a .value which
is actually an IntET (as shown by the last print of the repr of the value).

However, demo2, which attempts to "marry" the classes and avoid the
duplicate name specifications, has a .value which is an "unnamed" IntET,
whereas one would expect it to be named.

Noticing the changes you made, I think it is a result of line 177 in
ref435.py where you actually instantiate a 2nd copy of IntET (using the same
parameters, but a separate instantiation from the "married-with-Enum" copy)
to use as the value. I guess there is no way to "extract" the IntET from the
"married-with-Enum" copy, to use as the value?

So then, this is good, but not quite good enough: the 2nd copy of the IntET
should have the same name as the "married-with-Enum" copy. Now in demo4.py I
figured out how I could fix that, since the second copy is (currently) made
before the __init__ call for the "married-with-Enum" copy, and stored in an
accessible place. On the other hand, it is a bit of a surprise to have to do
that, and it would also be a bit of a surprise for classes that have class
state that affects the instantiation of instances... That might just mean
that some classes can't be mixed with Enum, but I suppose known restrictions
and/or side effects should be documented.

As an example of this, I tried to resurrect your AutoNumber from your
message of 6 May 2013 19:29 -0700 in the "PEP 435: initial values must be
specified? Yes" thread, but couldn't, apparently due to changes in the
implementation of ref435, but after fixing those problems, I still got an
error where it demanded a parameter to new, even though one shouldn't be
needed in that case.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From storchaka at gmail.com  Tue May 21 11:56:14 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 21 May 2013 12:56:14 +0300
Subject: [Python-Dev] PEP 409 and the stdlib
In-Reply-To: <519B3E41.3090501@avl.com>
References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520145004.1B1FA250BCF@webabinitio.net> <519A3DFC.5090705@stoneleaf.us> <519B1F89.4030003@avl.com> <519B3E41.3090501@avl.com>
Message-ID:

21.05.13 12:28, Hrvoje Niksic wrote:
> On 05/21/2013 10:36 AM, Serhiy Storchaka wrote:
>>>      The above exception was converted to the following exception:
>>>      ...
>>>
>>> That makes it clear that the conversion was explicit and (hopefully)
>>> intentional, and that the latter exception supersedes the former.
>>
>> How do you distinguish intentional and unintentional exceptions?
>
> By the use of the "raise" keyword.
> Given the code:
>
>     try:
>         x = d['key']
>     except KeyError:
>         raise BusinessError(...)
>
> ...the explicit raising is a giveaway that the new exception was quite
> intentional.

    try:
        x = d['key']
    except KeyError:
        x = fallback('key')

    def fallback(key):
        if key not in a:
            raise BusinessError(...)
        return 1 / a[key]  # possible TypeError, ZeroDivisionError, etc

From hrvoje.niksic at avl.com  Tue May 21 12:05:37 2013
From: hrvoje.niksic at avl.com (Hrvoje Niksic)
Date: Tue, 21 May 2013 12:05:37 +0200
Subject: [Python-Dev] PEP 409 and the stdlib
In-Reply-To:
References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520145004.1B1FA250BCF@webabinitio.net> <519A3DFC.5090705@stoneleaf.us> <519B1F89.4030003@avl.com> <519B3E41.3090501@avl.com>
Message-ID: <519B46F1.4040801@avl.com>

On 05/21/2013 11:56 AM, Serhiy Storchaka wrote:
>     try:
>         x = d['key']
>     except KeyError:
>         x = fallback('key')
>
>     def fallback(key):
>         if key not in a:
>             raise BusinessError(...)
>         return 1 / a[key]  # possible TypeError, ZeroDivisionError, etc

Yes, in that case the exception will appear unintentional and you get the
old message - it's on a best-effort basis.

Hrvoje

From ncoghlan at gmail.com  Tue May 21 13:23:01 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 21 May 2013 21:23:01 +1000
Subject: [Python-Dev] PEP 409 and the stdlib
In-Reply-To: <519B1F89.4030003@avl.com>
References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520145004.1B1FA250BCF@webabinitio.net> <519A3DFC.5090705@stoneleaf.us> <519B1F89.4030003@avl.com>
Message-ID:

On Tue, May 21, 2013 at 5:17 PM, Hrvoje Niksic wrote:
> On 05/20/2013 05:15 PM, Ethan Furman wrote:
>>
>> 1) Do nothing and be happy I use 'raise ... from None' in my own
>> libraries
>>
>> 2) Change the wording of 'During handling of the above exception, another
>> exception occurred' (no ideas as to what at
>> the moment)
>
>
> The word "occurred" misleads one to think that, during handling of the real
> exception, an unrelated and unintended exception occurred.  This is not the
> case when the "raise" keyword is used.  In that case, the exception was
> intentionally *converted* from one type to another.  For the "raise" case a
> wording like the following might work better:
>
>     The above exception was converted to the following exception:
>     ...
>
> That makes it clear that the conversion was explicit and (hopefully)
> intentional, and that the latter exception supersedes the former.

This ship sailed long ago (it was covered by the original exception
chaining spec in PEP 3134). If you want to deliberately replace an
exception while retaining the full traceback, you use "raise X from
Y", and the intro text will change to something like "This exception
was the direct cause of the following exception:"

This thread is about the case where you want to use "raise X from
None" to suppress the display of the original exception completely,
which is a new capability in Python 3.3.

So whenever we consider changing the standard library, we should also
look at the explicit chaining option, particularly when the original
exception may have happened inside a user provided callback (including
method calls)

Cheers,
Nick.
--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From storchaka at gmail.com  Tue May 21 14:57:02 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 21 May 2013 15:57:02 +0300
Subject: [Python-Dev] PEP 409 and the stdlib
In-Reply-To: <519B46F1.4040801@avl.com>
References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520145004.1B1FA250BCF@webabinitio.net> <519A3DFC.5090705@stoneleaf.us> <519B1F89.4030003@avl.com> <519B3E41.3090501@avl.com> <519B46F1.4040801@avl.com>
Message-ID:

21.05.13 13:05, Hrvoje Niksic wrote:
> On 05/21/2013 11:56 AM, Serhiy Storchaka wrote:
>>     try:
>>         x = d['key']
>>     except KeyError:
>>         x = fallback('key')
>>
>>     def fallback(key):
>>         if key not in a:
>>             raise BusinessError(...)
>>         return 1 / a[key]  # possible TypeError, ZeroDivisionError, etc
>
> Yes, in that case the exception will appear unintentional and you get
> the old message - it's on a best-effort basis.

In both cases the BusinessError exception is raised explicitly. How do you
distinguish one case from another?

From lukasz at langa.pl  Tue May 21 15:16:40 2013
From: lukasz at langa.pl (Łukasz Langa)
Date: Tue, 21 May 2013 15:16:40 +0200
Subject: [Python-Dev] What if we didn't have repr?
In-Reply-To:
References:
Message-ID:

On 20 May 2013, at 03:46, Guido van Rossum wrote:

> On Sun, May 19, 2013 at 4:27 PM, Gregory P. Smith wrote:
>> Now you've got me wondering what Python would be like if repr, `` and
>> __repr__ never existed as language features. Upon first thoughts, I actually
>> don't see much downside (no, i'm not advocating making that change).
>> Something to ponder.
>
> I have pondered it many times, although usually in the form "Why do we
> need both str and repr?"

What if we did the opposite?

1. Make __str__() a protocol for arbitrary string conversion.
2. Move the current __repr__() contracts, both firm and informal to a new,
   extensible version of pprint.

There has been some discussion led by Raymond in 2010 about a general
`pprint rewrite`__ and I'm willing to pick up the idea with a PEP for
inclusion in 3.4.

__ http://bugs.python.org/issue7434

--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com  Tue May 21 15:24:17 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 21 May 2013 23:24:17 +1000
Subject: [Python-Dev] [Python-checkins] cpython (3.3): #17973: Add FAQ entry for ([], )[0] += [1] both extending and raising.
In-Reply-To: <3bDjLs4fyNz7Lkh@mail.python.org>
References: <3bDjLs4fyNz7Lkh@mail.python.org>
Message-ID:

On Tue, May 21, 2013 at 12:35 AM, r.david.murray wrote:

Yay for having this in the FAQ, but...

> +If you wrote::
> +
> +   >>> a_tuple = (1, 2)
> +   >>> a_tuple[0] += 1
> +   Traceback (most recent call last):
> +      ...
> +   TypeError: 'tuple' object does not support item assignment
> +
> +The reason for the exception should be immediately clear: ``1`` is added to the
> +object ``a_tuple[0]`` points to (``1``), producing the result object, ``2``,
> +but when we attempt to assign the result of the computation, ``2``, to element
> +``0`` of the tuple, we get an error because we can't change what an element of
> +a tuple points to.
> +
> +Under the covers, what this augmented assignment statement is doing is
> +approximately this::
> +
> +   >>> result = a_tuple[0].__iadd__(1)
> +   >>> a_tuple[0] = result
> +   Traceback (most recent call last):
> +     ...
> +   TypeError: 'tuple' object does not support item assignment

For the immutable case, this expansion is incorrect:

    >>> hasattr(0, "__iadd__")
    False

With immutable objects, += almost always expands to:

    >>> result = a_tuple[0] + 1
    >>> a_tuple[0] = result

(For containers that support binary operators, the presence of the
relevant __i*__ methods is actually a reasonable heuristic for
distinguishing the mutable and immutable versions)

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From hrvoje.niksic at avl.com  Tue May 21 15:23:44 2013
From: hrvoje.niksic at avl.com (Hrvoje Niksic)
Date: Tue, 21 May 2013 15:23:44 +0200
Subject: [Python-Dev] PEP 409 and the stdlib
In-Reply-To:
References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520145004.1B1FA250BCF@webabinitio.net> <519A3DFC.5090705@stoneleaf.us> <519B1F89.4030003@avl.com> <519B3E41.3090501@avl.com> <519B46F1.4040801@avl.com>
Message-ID: <519B7560.1020302@avl.com>

On 05/21/2013 02:57 PM, Serhiy Storchaka wrote:
> 21.05.13 13:05, Hrvoje Niksic wrote:
>> On 05/21/2013 11:56 AM, Serhiy Storchaka wrote:
>>>     try:
>>>         x = d['key']
>>>     except KeyError:
>>>         x = fallback('key')
>>>
>>>     def fallback(key):
>>>         if key not in a:
>>>             raise BusinessError(...)
>>>         return 1 / a[key]  # possible TypeError, ZeroDivisionError, etc
>>
>> Yes, in that case the exception will appear unintentional and you get
>> the old message - it's on a best-effort basis.
>
> In both cases the BusinessError exception is raised explicitly. How do you
> distinguish one case from another?

In my example code the "raise" keyword appears lexically inside the
"except" clause.  The compiler would automatically emit a different
raise opcode in that case.

NB in your example the "raise" is just as intentional, but invoked from
a different function, which causes the above criterion to result in a
false negative.  Even so, the behavior would be no worse than now,
you'd just get the old message.

Hrvoje

From guido at python.org  Tue May 21 15:36:54 2013
From: guido at python.org (Guido van Rossum)
Date: Tue, 21 May 2013 06:36:54 -0700
Subject: [Python-Dev] What if we didn't have repr?
In-Reply-To:
References:
Message-ID:

Actually changing __str__ or __repr__ is out of the question, best we can
do is discourage making them different. But adding a protocol for pprint
(with extra parameters to convey options) is a fair idea. I note that Nick
suggested to use single-dispatch generic functions for this though. Both
have pros and cons. Post design ideas to python-ideas please, not here!

--Guido

On Tuesday, May 21, 2013, Łukasz Langa wrote:

> On 20 May 2013, at 03:46, Guido van Rossum wrote:
>
> On Sun, May 19, 2013 at 4:27 PM, Gregory P. Smith wrote:
>
> Now you've got me wondering what Python would be like if repr, `` and
> __repr__ never existed as language features. Upon first thoughts, I
> actually don't see much downside (no, i'm not advocating making that
> change). Something to ponder.
>
> I have pondered it many times, although usually in the form "Why do we
> need both str and repr?"
>
> What if we did the opposite?
>
> 1. Make __str__() a protocol for arbitrary string conversion.
> 2. Move the current __repr__() contracts, both firm and informal to a new,
>    extensible version of pprint.
>
> There has been some discussion led by Raymond in 2010 about a general
> `pprint rewrite`__ and I'm willing to pick up the idea with a PEP for
> inclusion in 3.4.
>
> __ http://bugs.python.org/issue7434
>
> --
> Best regards,
> Łukasz Langa
>
> WWW: http://lukasz.langa.pl/
> Twitter: @llanga
> IRC: ambv on #python-dev
>

--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ncoghlan at gmail.com  Tue May 21 15:46:31 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 21 May 2013 23:46:31 +1000
Subject: [Python-Dev] PEP 409 and the stdlib
In-Reply-To: <519B7560.1020302@avl.com>
References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520145004.1B1FA250BCF@webabinitio.net> <519A3DFC.5090705@stoneleaf.us> <519B1F89.4030003@avl.com> <519B3E41.3090501@avl.com> <519B46F1.4040801@avl.com> <519B7560.1020302@avl.com>
Message-ID:

On Tue, May 21, 2013 at 11:23 PM, Hrvoje Niksic wrote:
> In my example code the "raise" keyword appears lexically inside the "except"
> clause.  The compiler would automatically emit a different raise opcode in
> that case.

Hrvoje, can we drop this subthread please. The topic was addressed way
back when PEP 3134 was written, and there is already dedicated syntax
to distinguish incidental exceptions in error handlers ("raise new")
from deliberate replacement of an exception with a new one ("raise new
from original")

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From rdmurray at bitdance.com  Tue May 21 17:55:43 2013
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 21 May 2013 11:55:43 -0400
Subject: [Python-Dev] PEP 409 and the stdlib
In-Reply-To: <519A4397.4090707@pearwood.info>
References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <519A4397.4090707@pearwood.info>
Message-ID: <20130521155544.3295225007D@webabinitio.net>

On Tue, 21 May 2013 01:39:03 +1000, Steven D'Aprano wrote:
> On 21/05/13 00:12, Ethan Furman wrote:
>
> > As a case in point, base64.py is currently getting a bug fix, and also contains this code:
> >
> > def b32decode(s, casefold=False, map01=None):
> >     .
> >     .
> >     .
> >     for i in range(0, len(s), 8):
> >         quanta = s[i: i + 8]
> >         acc = 0
> >         try:
> >             for c in quanta:
> >                 acc = (acc << 5) + b32rev[c]
> >         except KeyError:
> >             raise binascii.Error('Non-base32 digit found')
> >     .
> >     .
> >     .
> >     else:
> >         raise binascii.Error('Incorrect padding')
> >
> > Does the KeyError qualify as irrelevant noise?
[...]
> In another reply, R.David Murray answered:
>
> "I don't see that it is of benefit to suppress [the KeyError]."
>
> Can I suggest that it's obviously been a long, long time since you
> were a beginner to the language, and you've forgotten how intimidating
> error messages can be? Error messages should be *relevant*. Irrelevant
> details don't help, they hinder, and I suggest that the KeyError is
> irrelevant.

Doubtless you are correct.  Now that you mention it I do remember being
confused, even as an experienced programmer, by the chained exceptions
when I first started dealing with them, but at this point I suppose it
has become second nature :).

I agree with the subsequent discussion that this error is a good case
for 'from None', given that any such conversion should make sure all
essential information is contained in the new error message.  And I
agree with Nick that there are probably many more places where 'raise
from' will help clarify things when we *don't* want 'from None'.

--David

From solipsis at pitrou.net  Tue May 21 17:57:39 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 21 May 2013 17:57:39 +0200
Subject: [Python-Dev] PEP 442 delegate
Message-ID: <20130521175739.21e55c68@fsol>

Hello,

I would like to nominate Benjamin as BDFL-Delegate for PEP 442.
Please tell me if you would like to object :)

Regards

Antoine.

From benjamin at python.org  Tue May 21 18:00:28 2013
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 21 May 2013 09:00:28 -0700
Subject: [Python-Dev] PEP 442 delegate
In-Reply-To: <20130521175739.21e55c68@fsol>
References: <20130521175739.21e55c68@fsol>
Message-ID:

2013/5/21 Antoine Pitrou:
>
> Hello,
>
> I would like to nominate Benjamin as BDFL-Delegate for PEP 442.
> Please tell me if you would like to object :)

I think he's a scoundrel.
-- Regards, Benjamin From guido at python.org Tue May 21 18:42:20 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 21 May 2013 09:42:20 -0700 Subject: [Python-Dev] PEP 442 delegate In-Reply-To: References: <20130521175739.21e55c68@fsol> Message-ID: No objections. Benjamin, don't accept it until we've had a chance to talk this over in person. I think we'll see a lot of each other starting next week... :-) On Tue, May 21, 2013 at 9:00 AM, Benjamin Peterson wrote: > 2013/5/21 Antoine Pitrou : >> >> Hello, >> >> I would like to nominate Benjamin as BDFL-Delegate for PEP 442. >> Please tell me if you would like to object :) > > I think he's a scoundrel. > > > > -- > Regards, > Benjamin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Tue May 21 18:03:13 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 21 May 2013 09:03:13 -0700 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520145004.1B1FA250BCF@webabinitio.net> <519A3DFC.5090705@stoneleaf.us> <519B1F89.4030003@avl.com> Message-ID: <519B9AC1.6050808@stoneleaf.us> On 05/21/2013 04:23 AM, Nick Coghlan wrote: > On Tue, May 21, 2013 at 5:17 PM, Hrvoje Niksic wrote: >> On 05/20/2013 05:15 PM, Ethan Furman wrote: >>> >>> 1) Do nothing and be happy I use 'raise ... from None' in my own >>> libraries >>> >>> 2) Change the wording of 'During handling of the above exception, another >>> exception occurred' (no ideas as to what at >>> the moment) >> >> >> The word "occurred" misleads one to think that, during handling of the real >> exception, an unrelated and unintended exception occurred. This is not the >> case when the "raise" keyword is used. 
In that case, the exception was >> intentionally *converted* from one type to another. For the "raise" case a >> wording like the following might work better: >> >> The above exception was converted to the following exception: >> ... >> >> That makes it clear that the conversion was explicit and (hopefully) >> intentional, and that the latter exception supersedes the former. > > This ship sailed long ago (it was covered by the original exception > chaining spec in PEP 3134). If you want to deliberately replace an > exception while retaining the full traceback, you use "raise X from > Y", and the intro text will change to something like "This exception > was the direct cause of the following exception:" I had forgotten about that, Nick, thanks. So the moral of the story for our library code and replacing exceptions is we should either do raise ... from OldException or raise ... from None depending on the importance of the originating exception. And, of course, we only make these changes when we're already modifying the module for some other reason. -- ~Ethan~ From sorin.stelian at axis.com Tue May 21 18:50:23 2013 From: sorin.stelian at axis.com (Sorin Stelian) Date: Tue, 21 May 2013 18:50:23 +0200 Subject: [Python-Dev] Is thread-safe smtpd desired/possible? Message-ID: Hi, I am posting this here since I could find no active maintainer of the smtpd module. In my work as a test engineer for Axis (www.axis.com) I encountered the need of having thread-safe SMTP servers. I know the use case of several SMTP servers running in concurrent threads might seem odd, but it can actually be quite useful for testing purposes. I have implemented (for my own use) a possible solution which basically means that every SMTP channel has its own socket map, instead of using asyncore's global socket map. It would not involve any change in asyncore. Looking at the disucssion from http://bugs.python.org/issue11959 it seems to me that such a solution would not be accepted. 
Do you think that modifying asyncore is more feasible? If not, is this something that might be looked at? I can provide code if needed, but I would first like to know your thoughts about this. Best regards, Sorin From rdmurray at bitdance.com Tue May 21 19:43:32 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Tue, 21 May 2013 13:43:32 -0400 Subject: [Python-Dev] Is thread-safe smtpd desired/possible? In-Reply-To: References: Message-ID: <20130521174332.C1BA9250BD3@webabinitio.net> On Tue, 21 May 2013 18:50:23 +0200, Sorin Stelian wrote: > I am posting this here since I could find no active maintainer of the smtpd module. Currently I am effectively the maintainer of that module, though other people are helping out. > In my work as a test engineer for Axis (www.axis.com) I encountered > the need of having thread-safe SMTP servers. I know the use case of > several SMTP servers running in concurrent threads might seem odd, but > it can actually be quite useful for testing purposes. > > I have implemented (for my own use) a possible solution which > basically means that every SMTP channel has its own socket map, > instead of using asyncore's global socket map. It would not involve > any change in asyncore. > > Looking at the disucssion from http://bugs.python.org/issue11959 it > seems to me that such a solution would not be accepted. Do you think > that modifying asyncore is more feasible? If not, is this something > that might be looked at? > > I can provide code if needed, but I would first like to know your > thoughts about this. I don't think issue 11959 represents a categorical rejection of improvements here; however, I suspect that tulip has an impact on this. Regardless of that, any changes need to be discussed in a wider context than just the smtpd module, no matter where changes are actually made. 
--David From greg.ewing at canterbury.ac.nz Wed May 22 01:14:35 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 22 May 2013 11:14:35 +1200 Subject: [Python-Dev] What if we didn't have repr? In-Reply-To: References: Message-ID: <519BFFDB.9070603@canterbury.ac.nz> Łukasz Langa wrote: > 1. Make __str__() a protocol for arbitrary string conversion. > 2. Move the current __repr__() contracts, both firm and informal to a > new, extensible version of pprint. -1. The purposes of repr() and pprint() are quite different. Please let's not make any sweeping changes about str() and repr(). They're generally okay as they are, IMO. -- Greg From olemis at gmail.com Wed May 22 06:14:42 2013 From: olemis at gmail.com (Olemis Lang) Date: Tue, 21 May 2013 23:14:42 -0500 Subject: [Python-Dev] Purpose of Doctests [Was: Best practices for Enum] In-Reply-To: References: <5192292F.6050406@pearwood.info> <3CD9CCA8-836D-4527-A699-235E6A9BCF90@voidspace.org.uk> Message-ID: On 5/20/13, Mark Janssen wrote: >>> * Doctests practically beg you to write your code first and then copy >>> and >>> paste terminal sessions - they're the enemy of TDD >> >> Of course , not , all the opposite . If the approach is understood >> correctly then the first thing test author will do is to write the >> code "expected" to get something done . When everything is ok with API >> code style then write the code . Many problems in the API and >> inconsistencies are thus detected early . > > Now all we need is a test() built-in, a companion to help() and we > have the primo platform for doctest-code-test cycle for TDD and agile > development. > "test() built-in", "interesting" observation ... at least to me setup.py test is more than enough in real life, and I guess many people really involved in building APIs will surely notice that in real life it's not as simple as "doctest-code-test"; in the same way that TDD is not always exactly like what is read in the books.
However, writing doctests first for APIs could definitely be helpful to think in advance in terms of the clients, especially when there are some aspects looking a bit fuzzy. Nevertheless, what is really needed, like I've been saying (elsewhere) for years, is a better doctest module. The API in stdlib does not offer the means to really benefit from its potential (<= that does not mean it's bad, it might be better ;) . Above I was talking about testing libraries defining APIs. In the meantime, following the approach sketched above, it's been possible (at least to me) to develop tested & documented RESTful + RPC APIs with relatively little effort. Besides, the differences between RPC and functions due to subtle technological & implementation details may be erased. Using the approach I've sketched in previous messages it's also possible to run the very same doctests for APIs that are meant to work transparently locally or hosted online (e.g. pastebins ... or other services in the cloud). The only thing needed is to use the right implementation of doctests setUp / tearDown e.g. switching from Python functions to ServerProxy, or REST, or ... ... so, yes, it's proven to be useful in practice ... -- Regards, Olemis. Apache™ Bloodhound contributor http://issues.apache.org/bloodhound Blog ES: http://simelo-es.blogspot.com/ Blog EN: http://simelo-en.blogspot.com/ Featured article: From greg at krypto.org Wed May 22 09:02:52 2013 From: greg at krypto.org (Gregory P. Smith) Date: Wed, 22 May 2013 00:02:52 -0700 Subject: [Python-Dev] PEP 442 delegate In-Reply-To: <20130521175739.21e55c68@fsol> References: <20130521175739.21e55c68@fsol> Message-ID: +1 I second the scoundrel! fwiw, that pep being implemented is going to be a great addition to Python. :) On Tue, May 21, 2013 at 8:57 AM, Antoine Pitrou wrote: > > Hello, > > I would like to nominate Benjamin as BDFL-Delegate for PEP 442. > Please tell me if you would like to object :) > > Regards > > Antoine.
From kristjan at ccpgames.com Wed May 22 21:49:31 2013 From: kristjan at ccpgames.com (Kristján Valur Jónsson) Date: Wed, 22 May 2013 19:49:31 +0000 Subject: [Python-Dev] PEP 442 delegate In-Reply-To: References: <20130521175739.21e55c68@fsol> Message-ID: Stackless python, already with their own special handling of GC finalization, is excited by this development :) K From: Python-Dev [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] On Behalf Of Gregory P. Smith Sent: 22. maí 2013 07:03 To: Antoine Pitrou Cc: Python-Dev Subject: Re: [Python-Dev] PEP 442 delegate +1 I second the scoundrel! fwiw, that pep being implemented is going to be a great addition to Python. :) On Tue, May 21, 2013 at 8:57 AM, Antoine Pitrou wrote: Hello, I would like to nominate Benjamin as BDFL-Delegate for PEP 442. Please tell me if you would like to object :) Regards Antoine. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From carlosnepomuceno at outlook.com Thu May 23 00:26:14 2013 From: carlosnepomuceno at outlook.com (Carlos Nepomuceno) Date: Thu, 23 May 2013 01:26:14 +0300 Subject: [Python-Dev] PEP 442 delegate In-Reply-To: References: <20130521175739.21e55c68@fsol>, , Message-ID: ________________________________ > From: kristjan at ccpgames.com > To: greg at krypto.org; solipsis at pitrou.net > Date: Wed, 22 May 2013 19:49:31 +0000 > CC: python-dev at python.org > Subject: Re: [Python-Dev] PEP 442 delegate > > > Stackless python, already with their own special handling of GC > finalization, is excited by this development :) > > K > Didn't know about Stackless Python. Is it faster than CPython? I'm developing an application that takes more than 5000 active threads, sometimes up to 100000. Will it benefit from Stackless Python? Can I use it for WSGI with Apache httpd? From lukasz at langa.pl Thu May 23 00:33:35 2013 From: lukasz at langa.pl (Łukasz Langa) Date: Thu, 23 May 2013 00:33:35 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions Message-ID: Hello, I would like to submit the following PEP for discussion and evaluation. PEP: 443 Title: Single-dispatch generic functions Version: $Revision$ Last-Modified: $Date$ Author: Łukasz Langa Discussions-To: Python-Dev Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 22-May-2013 Post-History: 22-May-2013 Replaces: 245, 246, 3124 Abstract ======== This PEP proposes a new mechanism in the ``functools`` standard library module that provides a simple form of generic programming known as single-dispatch generic functions. A **generic function** is composed of multiple functions sharing the same name. Which form should be used during a call is determined by the dispatch algorithm. When the implementation is chosen based on the type of a single argument, this is known as **single dispatch**.
Rationale and Goals =================== Python has always provided a variety of built-in and standard-library generic functions, such as ``len()``, ``iter()``, ``pprint.pprint()``, ``copy.copy()``, and most of the functions in the ``operator`` module. However, it currently: 1. does not have a simple or straightforward way for developers to create new generic functions, 2. does not have a standard way for methods to be added to existing generic functions (i.e., some are added using registration functions, others require defining ``__special__`` methods, possibly by monkeypatching). In addition, it is currently a common anti-pattern for Python code to inspect the types of received arguments, in order to decide what to do with the objects. For example, code may wish to accept either an object of some type, or a sequence of objects of that type. Currently, the "obvious way" to do this is by type inspection, but this is brittle and closed to extension. Abstract Base Classes make it easier to discover present behaviour, but don't help adding new behaviour. A developer using an already-written library may be unable to change how their objects are treated by such code, especially if the objects they are using were created by a third party. Therefore, this PEP proposes a uniform API to address dynamic overloading using decorators. User API ======== To define a generic function, decorate it with the ``@singledispatch`` decorator. Note that the dispatch happens on the type of the first argument, create your function accordingly: .. code-block:: pycon >>> from functools import singledispatch >>> @singledispatch ... def fun(arg, verbose=False): ... if verbose: ... print("Let me just say,", end=" ") ... print(arg) To add overloaded implementations to the function, use the ``register()`` attribute of the generic function. It takes a type parameter: .. code-block:: pycon >>> @fun.register(int) ... def _(arg, verbose=False): ... if verbose: ... 
print("Strength in numbers, eh?", end=" ") ... print(arg) ... >>> @fun.register(list) ... def _(arg, verbose=False): ... if verbose: ... print("Enumerate this:") ... for i, elem in enumerate(arg): ... print(i, elem) To enable registering lambdas and pre-existing functions, the ``register()`` attribute can be used in a functional form: .. code-block:: pycon >>> def nothing(arg, verbose=False): ... print("Nothing.") ... >>> fun.register(type(None), nothing) When called, the function dispatches on the first argument: .. code-block:: pycon >>> fun("Hello, world.") Hello, world. >>> fun("test.", verbose=True) Let me just say, test. >>> fun(42, verbose=True) Strength in numbers, eh? 42 >>> fun(['spam', 'spam', 'eggs', 'spam'], verbose=True) Enumerate this: 0 spam 1 spam 2 eggs 3 spam >>> fun(None) Nothing. The proposed API is intentionally limited and opinionated, as to ensure it is easy to explain and use, as well as to maintain consistency with existing members in the ``functools`` module. Implementation Notes ==================== The functionality described in this PEP is already implemented in the ``pkgutil`` standard library module as ``simplegeneric``. Because this implementation is mature, the goal is to move it largely as-is. Several open issues remain: * the current implementation relies on ``__mro__`` alone, making it incompatible with Abstract Base Classes' ``register()``/``unregister()`` functionality. A possible solution has been proposed by PJE on the original issue for exposing ``pkgutil.simplegeneric`` as part of the ``functools`` API [#issue-5135]_. * the dispatch type is currently specified as a decorator argument. The implementation could allow a form using argument annotations. This usage pattern is out of scope for the standard library [#pep-0008]_. However, whether this registration form would be acceptable for general usage, is up to debate. 
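The first open issue above can be illustrated with a short sketch. It is not the code of ``pkgutil.simplegeneric``: the ``Box`` class and the ``mro_only_lookup`` helper are hypothetical names invented for this example, and the lookup is only an assumption about what "relies on ``__mro__`` alone" means in practice. It shows that a class registered as a virtual subclass of an ABC passes ``issubclass()`` while the ABC never appears in its ``__mro__``, so a lookup that walks the MRO cannot find an implementation registered for that ABC.

```python
from collections.abc import Sized


class Box:
    """Hypothetical class that does not inherit from Sized."""


# Register Box as a virtual subclass of the Sized ABC.
Sized.register(Box)


def mro_only_lookup(registry, cls, default):
    """Return the first implementation whose type appears in cls.__mro__.

    Hypothetical helper: a dispatch lookup that consults the MRO only.
    """
    for base in cls.__mro__:
        if base in registry:
            return registry[base]
    return default


registry = {Sized: "sized implementation"}
found = mro_only_lookup(registry, Box, "default implementation")

print(issubclass(Box, Sized))  # the ABC machinery sees the registration
print(Sized in Box.__mro__)    # the MRO does not contain the ABC
print(found)                   # so the MRO-only lookup falls back to the default
```

Any fix would presumably have to consult ``issubclass()`` in addition to the MRO, which is the direction the referenced issue discusses.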
Based on the current ``pkgutil.simplegeneric`` implementation and following the convention on registering virtual subclasses on Abstract Base Classes, the dispatch registry will not be thread-safe. Usage Patterns ============== This PEP proposes extending behaviour only of functions specifically marked as generic. Just as a base class method may be overridden by a subclass, so too may a function be overloaded to provide custom functionality for a given type. Universal overloading does not equal *arbitrary* overloading, in the sense that we need not expect people to randomly redefine the behavior of existing functions in unpredictable ways. To the contrary, generic function usage in actual programs tends to follow very predictable patterns and overloads are highly-discoverable in the common case. If a module is defining a new generic operation, it will usually also define any required overloads for existing types in the same place. Likewise, if a module is defining a new type, then it will usually define overloads there for any generic functions that it knows or cares about. As a result, the vast majority of overloads can be found adjacent to either the function being overloaded, or to a newly-defined type for which the overload is adding support. It is only in rather infrequent cases that one will have overloads in a module that contains neither the function nor the type(s) for which the overload is added. In the absence of incompetence or deliberate intention to be obscure, the few overloads that are not adjacent to the relevant type(s) or function(s), will generally not need to be understood or known about outside the scope where those overloads are defined. (Except in the "support modules" case, where best practice suggests naming them accordingly.) As mentioned earlier, single-dispatch generics are already prolific throughout the standard library. 
A clean, standard way of doing them provides a way forward to refactor those custom implementations to use a common one, opening them up for user extensibility at the same time. Alternative approaches ====================== In PEP 3124 [#pep-3124]_ Phillip J. Eby proposes a full-grown solution with overloading based on arbitrary rule sets (with the default implementation dispatching on argument types), as well as interfaces, adaptation and method combining. PEAK-Rules [#peak-rules]_ is a reference implementation of the concepts described in PJE's PEP. Such a broad approach is inherently complex, which makes reaching a consensus hard. In contrast, this PEP focuses on a single piece of functionality that is simple to reason about. It's important to note this does not preclude the use of other approaches now or in the future. In a 2005 article on Artima [#artima2005]_ Guido van Rossum presents a generic function implementation that dispatches on types of all arguments on a function. The same approach was chosen in Andrey Popp's ``generic`` package available on PyPI [#pypi-generic]_, as well as David Mertz's ``gnosis.magic.multimethods`` [#gnosis-multimethods]_. While this seems desirable at first, I agree with Fredrik Lundh's comment that "if you design APIs with pages of logic just to sort out what code a function should execute, you should probably hand over the API design to someone else". In other words, the single argument approach proposed in this PEP is not only easier to implement but also clearly communicates that dispatching on a more complex state is an anti-pattern. It also has the virtue of corresponding directly with the familiar method dispatch mechanism in object oriented programming. The only difference is whether the custom implementation is associated more closely with the data (object-oriented methods) or the algorithm (single-dispatch overloading). Acknowledgements ================ Apart from Phillip J. 
Eby's work on PEP 3124 [#pep-3124]_ and PEAK-Rules, influences include Paul Moore's original issue [#issue-5135]_ that proposed exposing ``pkgutil.simplegeneric`` as part of the ``functools`` API, Guido van Rossum's article on multimethods [#artima2005]_, and discussions with Raymond Hettinger on a general pprint rewrite. References ========== .. [#issue-5135] http://bugs.python.org/issue5135 .. [#pep-0008] PEP 8 states in the "Programming Recommendations" section that "the Python standard library will not use function annotations as that would result in a premature commitment to a particular annotation style". (http://www.python.org/dev/peps/pep-0008) .. [#pep-3124] http://www.python.org/dev/peps/pep-3124/ .. [#peak-rules] http://peak.telecommunity.com/DevCenter/PEAK_2dRules .. [#artima2005] http://www.artima.com/weblogs/viewpost.jsp?thread=101605 .. [#pypi-generic] http://pypi.python.org/pypi/generic .. [#gnosis-multimethods] http://gnosis.cx/publish/programming/charming_python_b12.html Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End: -- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev From tjreedy at udel.edu Thu May 23 01:16:01 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Wed, 22 May 2013 19:16:01 -0400 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: References: Message-ID: I like the general idea. Do you have any specific stdlib use cases in mind? I thought of pprint, which at some point dispatches on dict versus set/sequence, but overall it seems more complicated than mere arg type dispatch. unittest.TestCase.assertEqual mostly (but not completely) uses first arg dispatch based on an instance-specific dict, and it has a custom instance registration method addTypeEqualityFunc.
(Since each test_xxx runs in a new instance, a registration for multiple methods has to be done either in a setup method or repeated in each test_method.) Terry From v+python at g.nevcal.com Thu May 23 02:14:55 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 22 May 2013 17:14:55 -0700 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: References: Message-ID: <519D5F7F.1070401@g.nevcal.com> On 5/22/2013 3:33 PM, Łukasz Langa wrote: > 2. does not have a standard way for methods to be added to existing > generic functions (i.e., some are added using registration > functions, others require defining ``__special__`` methods, possibly > by monkeypatching). I assume you are talking about things like __add__, for operator overloading. And later you mention: > To define a generic function, decorate it with the ``@singledispatch`` > decorator. Note that the dispatch happens on the type of the first > argument, create your function accordingly: Yet about half of the operator overloads would be incomplete if there were not corresponding __r*__ methods (__radd__, __rsub__, etc.) because the second parameter is as key to the dispatch as the first. While unary operators, and one argument functions would be fully covered by single dispatch, it is clear that single dispatch doesn't cover a large collection of useful cases for operator overloading. It would seem appropriate to me for the PEP to explain why single dispatch is sufficient, in the presence of a large collection of operations for which it has been demonstrably shown to be insufficient... while the solution is already in place for such operations, single dispatch could clearly not be used as a replacement solution for those operations, opening the door to the thought that maybe single dispatch is an insufficiently useful mechanism, and that perhaps at least two arguments should be used for dispatch (when they exist).
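The role of the reflected ``__r*__`` methods mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the ``Meters`` class is hypothetical and does not come from the thread. It shows the existing mechanism for a non-commutative operation, where the left operand's method is tried first and the right operand's reflected method runs only after the left one returns ``NotImplemented``.

```python
class Meters:
    """Hypothetical value type, used only to illustrate reflected operators."""

    def __init__(self, value):
        self.value = value

    def __sub__(self, other):
        # Called for `Meters - x`: dispatch starts with the left operand.
        if isinstance(other, Meters):
            return Meters(self.value - other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value - other)
        return NotImplemented

    def __rsub__(self, other):
        # Called for `x - Meters` once type(x).__sub__ has returned
        # NotImplemented, so the right operand gets its turn.
        if isinstance(other, (int, float)):
            return Meters(other - self.value)
        return NotImplemented


left = Meters(10) - 3    # handled by Meters.__sub__
right = 10 - Meters(3)   # int.__sub__ gives up, Meters.__rsub__ runs
print(left.value, right.value)
```

Each step here still dispatches on one operand at a time; the second operand only influences the outcome through the ``NotImplemented`` fallback.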
On the other hand, when using function call notation instead of operator notation, maybe single dispatch is sufficient... still, non-commutative operations (subtract, divide, etc.) can be difficult to express without resorting to function names like "backwardsSubtract" (__rsub__). But even with commutative operations between unlike objects, it may be that only one of the objects knows how to perform the operations and must be the one that controls the dispatch... Granted, there are few ternary (or n-ary) operators that are not expressed using functional notation anyway, but certainly there is a case to be made for dispatch to happen based on types of all arguments. While that doesn't necessarily detract from the benefits of a single dispatch system, it does raise the question about whether single dispatch is sufficient, especially in the presence of a large collection of (binary) operations for which it is already known to be insufficient. From v+python at g.nevcal.com Thu May 23 03:03:17 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Wed, 22 May 2013 18:03:17 -0700 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: References: <519D5F7F.1070401@g.nevcal.com> Message-ID: <519D6AD5.5050706@g.nevcal.com> On 5/22/2013 5:55 PM, Guido van Rossum wrote: > On Wed, May 22, 2013 at 5:14 PM, Glenn Linderman wrote: >> Yet about half of the operator overloads would be incomplete if there were >> not corresponding __r*__ methods (__radd__, __rsub__, etc.) because the >> second parameter is as key to the dispatch as the first. > This (and your subsequent argument) sounds like a typical case of > "perfection is the enemy of the good." Łukasz already pointed out that > for dispatch on multiple arguments, consensus has been elusive, and > there are some strong statements in opposition.
While this does not > exclude the possibility that it might be easier to get consensus on > dual-argument dispatch, I think the case for dual-argument dispatch is > still much weaker than that for single-argument dispatch. > > The binary operations, which you use as the primary example, are > already special because they correspond to syntactic forms. Python > intentionally does not have a generalized syntax to invoke arbitrary > binary operations, but only supports a small number of predefined > binary operators -- code in other languages (like Haskell) that uses > "unconventional" binary operators is usually hard to read except for > mathematicians. > > Since the language already offers a way to do dual-argument dispatch > for the predefined operations, your proposed dual-argument dispatch > wouldn't be particularly useful for those. (And retrofitting it would > be a very tricky business, given the many subtleties in the existing > binary operator dispatch -- for example, did you know that there's a > scenario where __radd__ is tried *before* __add__?) > > For standard function calls, it would be very odd if dual-dispatch > were supported but multiple-dispatch weren't. In general, 0, 1 and > infinity are fair game for special treatment, but treating 2 special > as well usually smells. So I'd say that Łukasz's single-dispatch > proposal covers a fairly important patch of new ground, while > dual-dispatch is both much harder and less useful. Ergo, Łukasz has > made the right trade-off. > Yep. The above, plus a recap of the arguments in opposition to multiple argument dispatch, would make the PEP stronger, which is all I was asking for. I sort of agree with his quote of Fredrik Lundh, regarding the complexity of multiple argument dispatch, and multiple argument dispatch/overloading is one of the most complex things to understand and use in C++. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From guido at python.org Thu May 23 02:55:28 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 22 May 2013 17:55:28 -0700 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: <519D5F7F.1070401@g.nevcal.com> References: <519D5F7F.1070401@g.nevcal.com> Message-ID: On Wed, May 22, 2013 at 5:14 PM, Glenn Linderman wrote: > Yet about half of the operator overloads would be incomplete if there were > not corresponding __r*__ methods (__radd__, __rsub__, etc.) because the > second parameter is as key to the dispatch as the first. This (and your subsequent argument) sounds like a typical case of "perfection is the enemy of the good." Łukasz already pointed out that for dispatch on multiple arguments, consensus has been elusive, and there are some strong statements in opposition. While this does not exclude the possibility that it might be easier to get consensus on dual-argument dispatch, I think the case for dual-argument dispatch is still much weaker than that for single-argument dispatch. The binary operations, which you use as the primary example, are already special because they correspond to syntactic forms. Python intentionally does not have a generalized syntax to invoke arbitrary binary operations, but only supports a small number of predefined binary operators -- code in other languages (like Haskell) that uses "unconventional" binary operators is usually hard to read except for mathematicians. Since the language already offers a way to do dual-argument dispatch for the predefined operations, your proposed dual-argument dispatch wouldn't be particularly useful for those. (And retrofitting it would be a very tricky business, given the many subtleties in the existing binary operator dispatch -- for example, did you know that there's a scenario where __radd__ is tried *before* __add__?) For standard function calls, it would be very odd if dual-dispatch were supported but multiple-dispatch weren't.
In general, 0, 1 and infinity are fair game for special treatment, but treating 2 special as well usually smells. So I'd say that Łukasz's single-dispatch proposal covers a fairly important patch of new ground, while dual-dispatch is both much harder and less useful. Ergo, Łukasz has made the right trade-off. -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Thu May 23 04:12:26 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 23 May 2013 12:12:26 +1000 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: <519D5F7F.1070401@g.nevcal.com> References: <519D5F7F.1070401@g.nevcal.com> Message-ID: On Thu, May 23, 2013 at 10:14 AM, Glenn Linderman wrote: > Yet about half of the operator overloads would be incomplete if there were > not corresponding __r*__ methods (__radd__, __rsub__, etc.) because the > second parameter is as key to the dispatch as the first. > > While unary operators, and one argument functions would be fully covered by > single dispatch, it is clear that single dispatch doesn't cover a large > collection of useful cases for operator overloading. The binary operators can be more accurately said to use a complicated single-dispatch dance rather than supporting native dual-dispatch. As you say, the PEP would be strengthened by pointing this out as an argument in favour of staying *away* from a multi-dispatch system (because it isn't obvious how to build a comprehensible one that would even support our existing NotImplemented based dual dispatch system for the binary operators). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From guido at python.org Thu May 23 05:26:56 2013 From: guido at python.org (Guido van Rossum) Date: Wed, 22 May 2013 20:26:56 -0700 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: References: <519D5F7F.1070401@g.nevcal.com> Message-ID: Funny.
I thought that the PEP was quite strong enough already in its desire to stay away from multi-dispatch. But sure, I don't mind making it stronger. :-) On Wed, May 22, 2013 at 7:12 PM, Nick Coghlan wrote: > On Thu, May 23, 2013 at 10:14 AM, Glenn Linderman wrote: >> Yet about half of the operator overloads would be incomplete if there were >> not corresponding __r*__ methods (__radd__, __rsub__, etc.) because the >> second parameter is as key to the dispatch as the first. >> >> While unary operators, and one argument functions would be fully covered by >> single dispatch, it is clear that single dispatch doesn't cover a large >> collection of useful cases for operator overloading. > > The binary operators can be more accurately said to use a complicated > single-dispatch dance rather than supporting native dual-dispatch. As > you say, the PEP would be strengthened by pointing this out as an > argument in favour of staying *away* from a multi-dispatch system > (because it isn't obvious how to build a comprehensible one that would > even support our existing NotImplemented based dual dispatch system > for the binary operators). > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/guido%40python.org -- --Guido van Rossum (python.org/~guido) From carlosnepomuceno at outlook.com Thu May 23 05:53:11 2013 From: carlosnepomuceno at outlook.com (Carlos Nepomuceno) Date: Thu, 23 May 2013 06:53:11 +0300 Subject: [Python-Dev] _PyString_InsertThousandsGrouping() Message-ID: Hi guys! Can someone explain to me where in the CPython 2.7.5 source code is _PyString_InsertThousandsGrouping() implemented? 
I've found the following declaration in 'Objects/stringobject.c' but it just defines _Py_InsertThousandsGrouping() as _PyString_InsertThousandsGrouping(): "#define _Py_InsertThousandsGrouping _PyString_InsertThousandsGrouping" I'm looking for the opposite! I don't even know how that doesn't cause an error! What's the trick? Besides that I've found a lot of code inside some header files, such as 'Objects/stringlib/formatter.h'. Why did you chose that way? Thanks in advance. Carlos From eliben at gmail.com Thu May 23 06:09:50 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 22 May 2013 21:09:50 -0700 Subject: [Python-Dev] _PyString_InsertThousandsGrouping() In-Reply-To: References: Message-ID: On Wed, May 22, 2013 at 8:53 PM, Carlos Nepomuceno < carlosnepomuceno at outlook.com> wrote: > Hi guys! > > Can someone explain to me where in the CPython 2.7.5 source code is > _PyString_InsertThousandsGrouping() implemented? > > I've found the following declaration in 'Objects/stringobject.c' but it > just defines _Py_InsertThousandsGrouping() as > _PyString_InsertThousandsGrouping(): > > "#define _Py_InsertThousandsGrouping _PyString_InsertThousandsGrouping" > > I'm looking for the opposite! > No, you aren't :-) #define _Py_InsertThousandsGrouping _PyString_InsertThousandsGrouping #include "stringlib/localeutil.h" Now look inside "stringlib/localeutil.h" and think what the pre-processor does with the function definition having the #define above. Eli > > I don't even know how that doesn't cause an error! What's the trick? > > Thanks in advance. > > Carlos > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/eliben%40gmail.com > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carlosnepomuceno at outlook.com Thu May 23 06:18:11 2013 From: carlosnepomuceno at outlook.com (Carlos Nepomuceno) Date: Thu, 23 May 2013 07:18:11 +0300 Subject: [Python-Dev] _PyString_InsertThousandsGrouping() In-Reply-To: References: , Message-ID: ________________________________ > From: eliben at gmail.com [...] > I've found the following declaration in 'Objects/stringobject.c' but it > just defines _Py_InsertThousandsGrouping() as > _PyString_InsertThousandsGrouping(): > > "#define _Py_InsertThousandsGrouping _PyString_InsertThousandsGrouping" > > I'm looking for the opposite! > > No, you aren't :-) > > #define _Py_InsertThousandsGrouping _PyString_InsertThousandsGrouping > #include "stringlib/localeutil.h" > > Now look inside "stringlib/localeutil.h" and think what the > pre-processor does with the function definition having the #define > above. > > Eli lol I can see clearly now! :p That reminds me of "Which came first, the chicken or the egg?" Thank you! Somehow I got intrigued by such use... Do you know why they've put a lot of source code inside the header files? From eliben at gmail.com Thu May 23 06:39:46 2013 From: eliben at gmail.com (Eli Bendersky) Date: Wed, 22 May 2013 21:39:46 -0700 Subject: [Python-Dev] _PyString_InsertThousandsGrouping() In-Reply-To: References: Message-ID: On Wed, May 22, 2013 at 9:18 PM, Carlos Nepomuceno < carlosnepomuceno at outlook.com> wrote: > ________________________________ > > From: eliben at gmail.com > [...] > > I've found the following declaration in 'Objects/stringobject.c' but it > > just defines _Py_InsertThousandsGrouping() as > > _PyString_InsertThousandsGrouping(): > > > > "#define _Py_InsertThousandsGrouping _PyString_InsertThousandsGrouping" > > > > I'm looking for the opposite! 
> > > > No, you aren't :-) > > > > #define _Py_InsertThousandsGrouping _PyString_InsertThousandsGrouping > > #include "stringlib/localeutil.h" > > > > Now look inside "stringlib/localeutil.h" and think what the > > pre-processor does with the function definition having the #define > > above. > > > > Eli > > lol I can see clearly now! :p > > That reminds me of "Which came first, the chicken or the egg?" > > Thank you! Somehow I got intrigued by such use... > > Do you know why they've put a lot of source code inside the header files? > _______________________________________________ > > This depends per use-case. Commonly, code is placed in header files in C to achieve some sort of C++-template-like behavior with the preprocessor. In particular, I think Objects/stringlib/formatter.h does this. Note this comment near its top: /* Before including this, you must include either: stringlib/unicodedefs.h stringlib/stringdefs.h Also, you should define the names: FORMAT_STRING FORMAT_LONG FORMAT_FLOAT FORMAT_COMPLEX to be whatever you want the public names of these functions to be. These are the only non-static functions defined here. */ Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu May 23 08:04:03 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 23 May 2013 08:04:03 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions References: <519D5F7F.1070401@g.nevcal.com> Message-ID: <20130523080403.67bb4dd3@fsol> On Thu, 23 May 2013 12:12:26 +1000 Nick Coghlan wrote: > On Thu, May 23, 2013 at 10:14 AM, Glenn Linderman wrote: > > Yet about half of the operator overloads would be incomplete if there were > > not corresponding __r*__ methods (__radd__, __rsub__, etc.) because the > > second parameter is as key to the dispatch as the first. 
> >
> > While unary operators, and one argument functions would be fully covered by
> > single dispatch, it is clear that single dispatch doesn't cover a large
> > collection of useful cases for operator overloading.
>
> The binary operators can be more accurately said to use a complicated
> single-dispatch dance rather than supporting native dual-dispatch.

Not one based on the type of a single argument, though.
I guess you can also reduce every function of several arguments to a
function accepting a single tuple of several items, but that doesn't
sound very interesting.

Regards

Antoine.


From jeanpierreda at gmail.com Thu May 23 08:33:57 2013
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Thu, 23 May 2013 02:33:57 -0400
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To: <20130523080403.67bb4dd3@fsol>
References: <519D5F7F.1070401@g.nevcal.com> <20130523080403.67bb4dd3@fsol>
Message-ID:

On Thu, May 23, 2013 at 2:04 AM, Antoine Pitrou wrote:
> On Thu, 23 May 2013 12:12:26 +1000
> Nick Coghlan wrote:
>> The binary operators can be more accurately said to use a complicated
>> single-dispatch dance rather than supporting native dual-dispatch.
>
> Not one based on the type of a single argument, though.

Why not?

I'd expect it to look something like this:

    @singledispatch
    def ladd(left, right):
        return NotImplemented

    @singledispatch
    def radd(right, left):
        return NotImplemented

    def add(left, right):
        x = ladd(left, right)
        if x is not NotImplemented:
            return x
        x = radd(right, left)
        if x is not NotImplemented:
            return x
        raise TypeError

Then instead of defining __add__ you define an overloaded
implementation of ladd, and instead of defining __radd__ you define an
overloaded implementation of radd.
-- Devin From solipsis at pitrou.net Thu May 23 09:14:35 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 23 May 2013 09:14:35 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: References: <519D5F7F.1070401@g.nevcal.com> <20130523080403.67bb4dd3@fsol> Message-ID: <20130523091435.0dd5f61e@fsol> On Thu, 23 May 2013 02:33:57 -0400 Devin Jeanpierre wrote: > On Thu, May 23, 2013 at 2:04 AM, Antoine Pitrou wrote: > > On Thu, 23 May 2013 12:12:26 +1000 > > Nick Coghlan wrote: > >> The binary operators can be more accurately said to use a complicated > >> single-dispatch dance rather than supporting native dual-dispatch. > > > > Not one based on the type of a single argument, though. > > Why not? > > I'd expect it to look something like this: > > @singledispatch > def ladd(left, right): > return NotImplemented > > @singledispatch > def radd(right, left): > return NotImplemented > > def add(left, right): > x = ladd(left, right) > if x is not NotImplemented: > return x > x = radd(right, left) > if x is not NotImplemented: > return x > raise TypeError > > Then instead of defining __add__ you define an overloaded > implementation of ladd, and instead of defining __radd__ you define an > overloaded implementation of radd. Well, I don't think you can say add() dispatches based on the type of a single argument. But that may be a question of how you like to think about decomposed problems. Regards Antoine. From arigo at tunes.org Thu May 23 09:33:33 2013 From: arigo at tunes.org (Armin Rigo) Date: Thu, 23 May 2013 09:33:33 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: References: Message-ID: Hi, On Thu, May 23, 2013 at 12:33 AM, ?ukasz Langa wrote: > Alternative approaches > ====================== You could also mention "pairtype", used in PyPy: https://bitbucket.org/pypy/pypy/raw/default/rpython/tool/pairtype.py (very short code). 
It's originally about adding double-dispatch, but the usage that grew out
of it is for generic single-dispatch functions that are bound to some
common "state" object as follows (Python 2 syntax):

    class MyRepr(object):
        ...state of my repr...

    class __extend__(pairtype(MyRepr, int)):
        def show((myrepr, x), y):
            print "hi, I'm the integer %d, arg is %s" % (x, y)

    class __extend__(pairtype(MyRepr, list)):
        def show((myrepr, x), y):
            print "hi, I'm a list"
            ...use myrepr to control the state...

    pair(MyRepr(), [2,3,4]).show(42)


- Armin


From ncoghlan at gmail.com Thu May 23 09:34:59 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 23 May 2013 17:34:59 +1000
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <519D5F7F.1070401@g.nevcal.com> <20130523080403.67bb4dd3@fsol>
Message-ID:

On 23 May 2013 16:37, "Devin Jeanpierre" wrote:
>
> On Thu, May 23, 2013 at 2:04 AM, Antoine Pitrou wrote:
> > On Thu, 23 May 2013 12:12:26 +1000
> > Nick Coghlan wrote:
> >> The binary operators can be more accurately said to use a complicated
> >> single-dispatch dance rather than supporting native dual-dispatch.
> >
> > Not one based on the type of a single argument, though.
>
> Why not?
>
> I'd expect it to look something like this:
>
>     @singledispatch
>     def ladd(left, right):
>         return NotImplemented
>
>     @singledispatch
>     def radd(right, left):
>         return NotImplemented
>
>     def add(left, right):
>         x = ladd(left, right)
>         if x is not NotImplemented:
>             return x
>         x = radd(right, left)
>         if x is not NotImplemented:
>             return x
>         raise TypeError
>
> Then instead of defining __add__ you define an overloaded
> implementation of ladd, and instead of defining __radd__ you define an
> overloaded implementation of radd.

That's the basic idea, but there's the extra complication that if
type(right) is a strict subclass of type(left), you try radd first.

Cheers,
Nick.
> > -- Devin > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From v+python at g.nevcal.com Thu May 23 09:31:38 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Thu, 23 May 2013 00:31:38 -0700 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: <20130523091435.0dd5f61e@fsol> References: <519D5F7F.1070401@g.nevcal.com> <20130523080403.67bb4dd3@fsol> <20130523091435.0dd5f61e@fsol> Message-ID: <519DC5DA.6050500@g.nevcal.com> On 5/23/2013 12:14 AM, Antoine Pitrou wrote: > On Thu, 23 May 2013 02:33:57 -0400 > Devin Jeanpierre wrote: >> >On Thu, May 23, 2013 at 2:04 AM, Antoine Pitrou wrote: >>> > >On Thu, 23 May 2013 12:12:26 +1000 >>> > >Nick Coghlan wrote: >>>> > >>The binary operators can be more accurately said to use a complicated >>>> > >>single-dispatch dance rather than supporting native dual-dispatch. >>> > > >>> > >Not one based on the type of a single argument, though. >> > >> >Why not? >> > >> >I'd expect it to look something like this: >> > >> > @singledispatch >> > def ladd(left, right): >> > return NotImplemented >> > >> > @singledispatch >> > def radd(right, left): >> > return NotImplemented >> > >> > def add(left, right): >> > x = ladd(left, right) >> > if x is not NotImplemented: >> > return x >> > x = radd(right, left) >> > if x is not NotImplemented: >> > return x >> > raise TypeError >> > >> >Then instead of defining __add__ you define an overloaded >> >implementation of ladd, and instead of defining __radd__ you define an >> >overloaded implementation of radd. > Well, I don't think you can say add() dispatches based on the type of a > single argument. But that may be a question of how you like to think > about decomposed problems. 
I suspect the point was not that add can be described as doing single
dispatch (it can't), but rather that add could possibly be implemented
in terms of lower-level functions doing single dispatch. If that was the
point, perhaps the next-level point is that single dispatch is a
sufficient mechanism that can be augmented (as above) to handle more
complex cases.

Whether the above (which I think would need to use raise and try instead
of return and if) is sufficient to handle such cases is not yet proven.
The case Guido mentioned, where radd is tried before add, would seem to
require somewhat more complex logic than the above.


From solipsis at pitrou.net Thu May 23 10:24:53 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 23 May 2013 10:24:53 +0200
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
References: <519D5F7F.1070401@g.nevcal.com> <20130523080403.67bb4dd3@fsol> <20130523091435.0dd5f61e@fsol> <519DC5DA.6050500@g.nevcal.com>
Message-ID: <20130523102453.6952dcf0@pitrou.net>

On Thu, 23 May 2013 00:31:38 -0700, Glenn Linderman wrote:
>
> I suspect the point was not that add can be described as doing single
> dispatch (it can't), but rather that add could possibly be
> implemented in terms of lower-level functions doing single dispatch.
> If that was the point, perhaps the next level point is trying to be
> that single dispatch is a sufficient mechanism that can be augmented
> (as above) to handle more complex cases.

This is true, but it is true of anything Turing-complete. Generic
functions don't add anything that you can't already do manually
(for example with custom registries) :-)

Regardless, I also agree that single-dispatch is much easier to reason
about, and good enough for now.

Regards

Antoine.
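
The ladd/radd idea from this subthread, together with the rule that a
strict subclass on the right-hand side gets the first try, can be written
as runnable code. This is only a sketch built on functools.singledispatch
as proposed in PEP 443 (so it assumes Python 3.4 or later). The Base and
Derived classes and the returned strings are hypothetical, and the
subclass check is a simplification of what the interpreter does for the
real binary operators.

```python
from functools import singledispatch


@singledispatch
def ladd(left, right):
    # Default: no left-hand implementation for this type.
    return NotImplemented


@singledispatch
def radd(right, left):
    # Default: no right-hand implementation for this type.
    # Note that dispatch happens on `right`, the first parameter.
    return NotImplemented


def add(left, right):
    """Try ladd then radd, each a single-dispatch generic function."""
    attempts = [(ladd, left, right), (radd, right, left)]
    tl, tr = type(left), type(right)
    # Simplified priority rule: a strict subclass on the right goes first.
    if tr is not tl and issubclass(tr, tl):
        attempts.reverse()
    for func, first, second in attempts:
        result = func(first, second)
        if result is not NotImplemented:
            return result
    raise TypeError("cannot add %s and %s" % (tl.__name__, tr.__name__))


# Hypothetical example types, used only to show which function is chosen.
class Base:
    pass


class Derived(Base):
    pass


@ladd.register(Base)
def _(left, right):
    return "ladd(Base)" if isinstance(right, Base) else NotImplemented


@radd.register(Derived)
def _(right, left):
    return "radd(Derived)" if isinstance(left, Base) else NotImplemented
```

With these registrations, add(Base(), Derived()) is answered by the radd
overload even though an ladd overload for Base would also accept it, while
add(Derived(), Base()) falls back to the ladd overload registered for Base.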
From kristjan at ccpgames.com Thu May 23 11:25:46 2013
From: kristjan at ccpgames.com (Kristján Valur Jónsson)
Date: Thu, 23 May 2013 09:25:46 +0000
Subject: [Python-Dev] PEP 442 delegate
In-Reply-To:
References: <20130521175739.21e55c68@fsol>, ,
Message-ID:

> Didn't know about Stackless Python. Is it faster than CPython?
>
> I'm developing an application that takes more than 5000 active threads,
> sometimes up to 100000.
> Will it benefit from Stackless Python?
>
> Can I use it for WSGI with Apache httpd?
>
Stackless has its own website and mailing list.
Please visit www.stackless.com for full info, since it is offtopic for this list.

K


From lukasz at langa.pl Thu May 23 13:25:47 2013
From: lukasz at langa.pl (Łukasz Langa)
Date: Thu, 23 May 2013 13:25:47 +0200
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References:
Message-ID: <9224C6DF-9C05-415E-AA9A-9C0AF92D77EE@langa.pl>

On 23 May 2013, at 01:16, Terry Jan Reedy wrote:

> I like the general idea. Do you have any specific stdlib use cases in mind?
>
> I thought of pprint, which at some point dispatches on dict versus set/sequence, but overall it seems more complicated than mere arg type dispatch.

I want to make pprint extensible for 3.4 and PEP 443 started out as an
idea to introduce a uniform API for the boilerplate I'm going to need
anyway. It turned out the idea has been around for years.

> Unittest.TestCase.assertEqual mostly (but not completely) uses first arg dispatch based on an instance-specific dict, and it has a custom instance registration method addTypeEqualityFunc. (Since each test_xxx runs in a new instance, a registration for multiple methods has to be done either in a setup method or repeated in each test_method.)

If a registration mechanism is already in place, it will probably need
to stay (backwards compatibility). The feasibility of refactoring to
@singledispatch will have to be considered on a case-by-case basis.
On a more general note, I'm sure that @singledispatch won't cover every use case. Still, PJE implemented both pkgutil.simplegeneric and PEAK-Rules because the former is the proverbial 20% that gets you 80% there. For those use cases the simplicity and transparency provided by a basic solution are a virtue. This is what PEP 443 targets. If @singledispatch turns out so successful that we'll find ourselves longing for multiple dispatch or predicate-based dispatch in the future, I'm sure there's still going to be enough PEP numbers free. The @singledispatch name has been chosen to ensure there's no name clash in that case (thanks Nick for suggesting that!). -- Best regards, ?ukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev From martin at v.loewis.de Thu May 23 13:36:27 2013 From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=) Date: Thu, 23 May 2013 13:36:27 +0200 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: <519B9AC1.6050808@stoneleaf.us> References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520145004.1B1FA250BCF@webabinitio.net> <519A3DFC.5090705@stoneleaf.us> <519B1F89.4030003@avl.com> <519B9AC1.6050808@stoneleaf.us> Message-ID: <519DFF3B.8060003@v.loewis.de> Am 21.05.13 18:03, schrieb Ethan Furman: > And, of course, we only make these changes when we're already modifying > the module for some other reason. In the specific case, the KeyError has indeed useful information that the TypeError does not, namely the specific character that is the culprit. So if you do drop the KeyError entirely, please carry over information about the character into the TypeError. 
Regards, Martin From lukasz at langa.pl Thu May 23 16:13:18 2013 From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=) Date: Thu, 23 May 2013 16:13:18 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: References: Message-ID: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl> On 23 maj 2013, at 09:33, Armin Rigo wrote: > Hi, > > On Thu, May 23, 2013 at 12:33 AM, ?ukasz Langa wrote: >> Alternative approaches >> ====================== > > You could also mention "pairtype", used in PyPy: Thanks for pointing that out. Information on it added in http://hg.python.org/peps/rev/b7979219f3cc#l1.7 +PyPy's RPython offers ``extendabletype`` [#pairtype]_, a metaclass which +enables classes to be externally extended. In combination with +``pairtype()`` and ``pair()`` factories, this offers a form of +single-dispatch generics. +.. [#pairtype] + https://bitbucket.org/pypy/pypy/raw/default/rpython/tool/pairtype.py -- Best regards, ?ukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev From ethan at stoneleaf.us Thu May 23 16:24:48 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 23 May 2013 07:24:48 -0700 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: <519DFF3B.8060003@v.loewis.de> References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520145004.1B1FA250BCF@webabinitio.net> <519A3DFC.5090705@stoneleaf.us> <519B1F89.4030003@avl.com> <519B9AC1.6050808@stoneleaf.us> <519DFF3B.8060003@v.loewis.de> Message-ID: <519E26B0.20403@stoneleaf.us> On 05/23/2013 04:36 AM, "Martin v. L?wis" wrote: > Am 21.05.13 18:03, schrieb Ethan Furman: >> And, of course, we only make these changes when we're already modifying >> the module for some other reason. > > In the specific case, the KeyError has indeed useful information that > the TypeError does not, namely the specific character that is the culprit. 
>
> So if you do drop the KeyError entirely, please carry over information
> about the character into the TypeError.

Here's the code that existed at one point:

    for c in s:
        val = _b32rev.get(c)
        if val is None:
            raise TypeError('Non-base32 digit found')

Even though there is no KeyError to convert in this incarnation, providing the cause of failure is still appreciated by the
user who's trying to figure out what, exactly, went wrong.

--
~Ethan~


From guido at python.org Thu May 23 16:49:22 2013
From: guido at python.org (Guido van Rossum)
Date: Thu, 23 May 2013 07:49:22 -0700
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl>
References: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl>
Message-ID:

Łukasz, are there any open issues? Otherwise I'm ready to accept the PEP.

--
--Guido van Rossum (python.org/~guido)


From lukasz at langa.pl Thu May 23 16:58:27 2013
From: lukasz at langa.pl (Łukasz Langa)
Date: Thu, 23 May 2013 16:58:27 +0200
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl>
Message-ID:

On 23 May 2013, at 16:49, Guido van Rossum wrote:

> Łukasz, are there any open issues? Otherwise I'm ready to accept the PEP.

There's one. Quoting the PEP:

"The dispatch type is currently specified as a decorator argument. The
implementation could allow a form using argument annotations. This usage
pattern is out of scope for the standard library (per PEP 8). However,
whether this registration form would be acceptable for general usage, is
up to debate."

I feel that the PEP should explicitly allow or disallow for the
implementation to accept dispatch on annotations, e.g.:

    @func.register
    def _(arg: int):
        ...

versus

    @func.register(int)
    def _(arg):
        ...
--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev


From p.f.moore at gmail.com Thu May 23 17:11:08 2013
From: p.f.moore at gmail.com (Paul Moore)
Date: Thu, 23 May 2013 16:11:08 +0100
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl>
Message-ID:

On 23 May 2013 15:58, Łukasz Langa wrote:

> On 23 May 2013, at 16:49, Guido van Rossum wrote:
>
> > Łukasz, are there any open issues? Otherwise I'm ready to accept the PEP.
>
> There's one. Quoting the PEP:
>
> "The dispatch type is currently specified as a decorator argument. The
> implementation could allow a form using argument annotations. This usage
> pattern is out of scope for the standard library (per PEP 8). However,
> whether this registration form would be acceptable for general usage, is
> up to debate."
>
> I feel that the PEP should explicitly allow or disallow for the
> implementation to accept dispatch on annotations, e.g.:
>
>     @func.register
>     def _(arg: int):
>         ...
>
> versus
>
>     @func.register(int)
>     def _(arg):
>         ...

Personally, I think the register(int) form seems more natural. But that may
well be because there are no uses of annotations in the wild (at least not
in code I'm familiar with) and having this as an example of how annotations
can be used would help with adoption.

I'm not 100% sure what the options are.

1. Only support the register(int) form
2. Only support the annotation form
3. Support both annotation and argument forms

Is the debate between 1 and 2, or 1 and 3? Is it even possible to implement
3 without having 2 different names for "register"?

If the debate is between 1 and 2, I'd prefer 1. But if it's between 1 and
3, I'm less sure - having the *option* to try annotations for this in my
own code sounds useful.

Paul.
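
Option 3 in Paul's list, a single register name that accepts both the
argument form and the annotation form, can be sketched in a few lines.
The Generic class below is a hypothetical toy registry written for this
illustration, not the PEP 443 reference implementation. The only
assumption it makes is that register() can tell a class apart from a
function, and that a missing annotation should raise TypeError.

```python
import inspect


class Generic:
    """Hypothetical minimal single-dispatch registry (illustration only)."""

    def __init__(self, default):
        # The undecorated function handles any type without a registration.
        self.registry = {object: default}

    def register(self, cls_or_func, func=None):
        if isinstance(cls_or_func, type):
            # Argument form: register(int) as a decorator,
            # or register(int, f) as a plain call.
            if func is None:
                return lambda f: self.register(cls_or_func, f)
            self.registry[cls_or_func] = func
            return func
        # Annotation form: bare @register on an annotated function.
        func = cls_or_func
        params = list(inspect.signature(func).parameters.values())
        annotation = params[0].annotation if params else inspect.Parameter.empty
        if annotation is inspect.Parameter.empty or not isinstance(annotation, type):
            raise TypeError("first parameter needs a class annotation")
        self.registry[annotation] = func
        return func

    def __call__(self, arg, *args, **kwargs):
        # Walk the MRO of the first argument; `object` always matches.
        for cls in type(arg).__mro__:
            if cls in self.registry:
                return self.registry[cls](arg, *args, **kwargs)


@Generic
def describe(arg):
    return "object"


@describe.register(int)
def _(arg):
    return "int"


@describe.register
def _(arg: str):
    return "str"
```

Both registrations go through the same name, and the decorator argument
form keeps working for code that does not want to rely on annotations.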
From guido at python.org Thu May 23 17:14:50 2013
From: guido at python.org (Guido van Rossum)
Date: Thu, 23 May 2013 08:14:50 -0700
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl>
Message-ID:

Ok, happy bikeshedding. I'm outta here until that's settled. :-)

On Thu, May 23, 2013 at 7:58 AM, Łukasz Langa wrote:
> On 23 May 2013, at 16:49, Guido van Rossum wrote:
>
>> Łukasz, are there any open issues? Otherwise I'm ready to accept the PEP.
>
> There's one. Quoting the PEP:
>
> "The dispatch type is currently specified as a decorator argument. The
> implementation could allow a form using argument annotations. This usage
> pattern is out of scope for the standard library (per PEP 8). However,
> whether this registration form would be acceptable for general usage, is
> up to debate."
>
> I feel that the PEP should explicitly allow or disallow for the
> implementation to accept dispatch on annotations, e.g.:
>
>     @func.register
>     def _(arg: int):
>         ...
>
> versus
>
>     @func.register(int)
>     def _(arg):
>         ...
>
> --
> Best regards,
> Łukasz Langa
>
> WWW: http://lukasz.langa.pl/
> Twitter: @llanga
> IRC: ambv on #python-dev
>

--
--Guido van Rossum (python.org/~guido)


From ethan at stoneleaf.us Thu May 23 17:04:19 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 23 May 2013 08:04:19 -0700
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl>
Message-ID: <519E2FF3.9000106@stoneleaf.us>

On 05/23/2013 07:58 AM, Łukasz Langa wrote:
> On 23 May 2013, at 16:49, Guido van Rossum wrote:
>
>> Łukasz, are there any open issues? Otherwise I'm ready to accept the PEP.
>
> There's one. Quoting the PEP:
>
> "The dispatch type is currently specified as a decorator argument. The
> implementation could allow a form using argument annotations.
This usage > pattern is out of scope for the standard library (per PEP 8). However, > whether this registration form would be acceptable for general usage, is > up to debate." > > I feel that the PEP should explicitly allow or disallow for the > implementation to accept dispatch on annotations, e.g.: > > @func.register > def _(arg: int): > ... > > versus > > @func.register(int) > def _(arg): > ... If the stdlib is still staying out of the annotation business, then it should not be allowed. -- ~Ethan~ From walter at livinglogic.de Thu May 23 18:00:09 2013 From: walter at livinglogic.de (=?UTF-8?B?V2FsdGVyIETDtnJ3YWxk?=) Date: Thu, 23 May 2013 18:00:09 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: References: Message-ID: <519E3D09.4000707@livinglogic.de> On 23.05.13 00:33, ?ukasz Langa wrote: > Hello, > I would like to submit the following PEP for discussion and evaluation. > > > PEP: 443 > Title: Single-dispatch generic functions > [...] > >>> @fun.register(int) > ... def _(arg, verbose=False): > ... if verbose: > ... print("Strength in numbers, eh?", end=" ") > ... print(arg) > ... Should it be possible to register multiple types for the generic function with one register() call, i.e. should: @fun.register(int, float) def _(arg, verbose=False): ... be allowed as a synonym for @fun.register(int) @fun.register(float) def _(arg, verbose=False): ... Servus, Walter From p.f.moore at gmail.com Thu May 23 18:56:53 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 23 May 2013 17:56:53 +0100 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: <519E3D09.4000707@livinglogic.de> References: <519E3D09.4000707@livinglogic.de> Message-ID: On 23 May 2013 17:00, Walter D?rwald wrote: > Should it be possible to register multiple types for the generic function > with one register() call, i.e. should: > > @fun.register(int, float) > def _(arg, verbose=False): > ... 
>
> be allowed as a synonym for
>
>     @fun.register(int)
>     @fun.register(float)
>     def _(arg, verbose=False):
>

No, because people will misread register(int, float) as meaning first
argument int, second float. The double decorator is explicit as to what is
going on, and isn't too hard to read or write.

Paul


From merwok at netwok.org Thu May 23 20:13:22 2013
From: merwok at netwok.org (Éric Araujo)
Date: Thu, 23 May 2013 14:13:22 -0400
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References:
Message-ID: <519E5C42.7080009@netwok.org>

Hi,

Thanks for writing this PEP. Blessing one implementation for the stdlib
and one official backport will make programmers' lives a bit easier :)

>   >>> @fun.register(int)
>   ... def _(arg, verbose=False):
>   ...     if verbose:
>   ...         print("Strength in numbers, eh?", end=" ")
>   ...     print(arg)
>   ...

Does this work if the implementation function is called like the first
decorated function? (I don't know the proper terminology) e.g.

  >>> @fun.register(int)
  ... def fun(arg, verbose=False):
  ...     if verbose:
  ...         print("Strength in numbers, eh?", end=" ")
  ...     print(arg)

The precedent is 2.6+ properties, where prop.setter mutates and returns
the property object, which then overwrites the previous name in the
class dictionary.

> * the current implementation relies on ``__mro__`` alone, making it
>   incompatible with Abstract Base Classes'
>   ``register()``/``unregister()`` functionality. A possible solution has
>   been proposed by PJE on the original issue for exposing
>   ``pkgutil.simplegeneric`` as part of the ``functools`` API
>   [#issue-5135]_.

Making generic functions work with ABCs sounds like a requirement to me,
as ABCs are baked into the language (isinstance). ABCs and interfaces
(i.e. zope.interface) are really neat and powerful.

> * the dispatch type is currently specified as a decorator argument.
>   The implementation could allow a form using argument annotations. This
>   usage pattern is out of scope for the standard library [#pep-0008]_.
>   However, whether this registration form would be acceptable for
>   general usage, is up to debate.

+1 to passing the type as argument to the decorator and not supporting annotations. It's simple and works.

Question: what happens if two functions (say in two different modules) are registered for the same type?

From ethan at stoneleaf.us  Thu May 23 19:44:13 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 23 May 2013 10:44:13 -0700
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References:
Message-ID: <519E556D.1070500@stoneleaf.us>

> User API
> ========
>
> To define a generic function, decorate it with the ``@singledispatch``
> decorator. Note that the dispatch happens on the type of the first
> argument, create your function accordingly:
>
> .. code-block:: pycon
>
>   >>> from functools import singledispatch
>   >>> @singledispatch
>   ... def fun(arg, verbose=False):
>   ...     if verbose:
>   ...         print("Let me just say,", end=" ")
>   ...     print(arg)
>
> To add overloaded implementations to the function, use the
> ``register()`` attribute of the generic function. It takes a type
> parameter:
>
> .. code-block:: pycon
>
>   >>> @fun.register(int)
>   ... def _(arg, verbose=False):
>   ...     if verbose:
>   ...         print("Strength in numbers, eh?", end=" ")
>   ...     print(arg)
>   ...
>   >>> @fun.register(list)
>   ... def _(arg, verbose=False):
>   ...     if verbose:
>   ...         print("Enumerate this:")
>   ...     for i, elem in enumerate(arg):
>   ...         print(i, elem)
>
> To enable registering lambdas and pre-existing functions, the
> ``register()`` attribute can be used in a functional form:
>
> .. code-block:: pycon
>
>   >>> def nothing(arg, verbose=False):
>   ...     print("Nothing.")
>   ...
>   >>> fun.register(type(None), nothing)
>

So to have a generic `mapping` function that worked on dicts, namedtuples, user-defined record types, etc., would look something like:

--> from functools import singledispatch

--> @singledispatch
--> def mapping(d):
...     new_d = {}
...     new_d.update(d)
...     return new_d
...

--> @mapping.register(tuple)
... def _(t):
...     names = getattr(t, '_fields', ['f%d' % n for n in range(len(t))])
...     values = list(t)
...     return dict(zip(names, values))
...

--> @mapping.register(user_class)
... def _(uc):
...     blah blah
...     return dict(more blah)
...

Very cool. I'm looking forward to it!

Oh, the tuple example above is intended primarily for named tuples, but since there is no common base class besides tuple I had to also handle the case where a plain tuple is passed in, and personally I'd rather have generic field names than raise an exception.

--
~Ethan~

From pje at telecommunity.com  Thu May 23 20:59:01 2013
From: pje at telecommunity.com (PJ Eby)
Date: Thu, 23 May 2013 14:59:01 -0400
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl>
Message-ID:

On Thu, May 23, 2013 at 11:11 AM, Paul Moore wrote:
> Is the debate between 1 and 2, or 1 and 3? Is it even possible to implement
> 3 without having 2 different names for "register"?

Yes. You could do it as either:

    @func.register
    def doit(foo: int):
        ...

by checking for the first argument to register() being a function, or:

    @func.register()
    def doit(foo: int):
        ...

by using a default None first argument. In either case, you would then raise a TypeError if there wasn't an annotation.

As to the ability to do multiple types registration, you could support it only in type annotations, e.g.:

    @func.register
    def doit(foo: [int, float]):
        ...

without it being confused with being multiple dispatch.

One other thing about the register API that's currently unspecified in the PEP: what does it return, exactly?
I generally lean towards returning the undecorated function, so that if you say:

    @func.register
    def do_int(foo: int):
        ...

You still have the option of calling it explicitly. OTOH, some may prefer to treat it like an overload and call it 'func' every time, in which case register should return the generic function. Some guidance as to what should be the official One Obvious Way would be helpful here.

(Personally, I usually name my methods explicitly because in debugging it's a fast clue as to which piece of code I should be looking at.)

From ethan at stoneleaf.us  Thu May 23 20:38:05 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 23 May 2013 11:38:05 -0700
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To: <519E5C42.7080009@netwok.org>
References: <519E5C42.7080009@netwok.org>
Message-ID: <519E620D.9050805@stoneleaf.us>

On 05/23/2013 11:13 AM, Éric Araujo wrote:
>
> Thanks for writing this PEP. Blessing one implementation for the stdlib
> and one official backport will make programmers' lives a bit easier :)
>
>> >>> @fun.register(int)
>> ... def _(arg, verbose=False):
>> ...     if verbose:
>> ...         print("Strength in numbers, eh?", end=" ")
>> ...     print(arg)
>> ...
>
> Does this work if the implementation function is called like the first
> decorated function? (I don't know the proper terminology) e.g.
>
>    >>> @fun.register(int)
>    ... def fun(arg, verbose=False):
>    ...     if verbose:
>    ...         print("Strength in numbers, eh?", end=" ")
>    ...     print(arg)
>
> The precedent is 2.6+ properties, where prop.setter mutates and returns
> the property object, which then overwrites the previous name in the
> class dictionary.

Actually, properties return new instances:

--> class Test(object):
...     _temp = 'fleeting'
...     @property
...     def temp(self):
...         return self._temp
...     @temp.setter
...     def new_temp(self, value):
...         self._temp = value
...
--> id(Test.temp) 30245384 --> id(Test.new_temp) 30246352 --> Test.temp is Test.new_temp False -- ~Ethan~ From pje at telecommunity.com Thu May 23 21:01:36 2013 From: pje at telecommunity.com (PJ Eby) Date: Thu, 23 May 2013 15:01:36 -0400 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: References: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl> Message-ID: On Thu, May 23, 2013 at 2:59 PM, PJ Eby wrote: > I generally lean towards returning the undecorated function, so that if you say: > > @func.register > def do_int(foo: int): > ... Oops, forgot to mention: one other advantage to returning the undecorated function is that you can do this: @func.register(int) @func.register(float) def do_num(foo): ... Which neatly solves the multiple registration problem, even without argument annotations. From lukasz at langa.pl Thu May 23 22:10:03 2013 From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=) Date: Thu, 23 May 2013 22:10:03 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: References: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl> Message-ID: <38A712BA-0776-494A-8C9B-5A9E0D4D3B91@langa.pl> On 23 maj 2013, at 20:59, PJ Eby wrote: > As to the ability to do multiple types registration, you could support > it only in type annotations, e.g.: > > @func.register > def doit(foo: [int, float]): > ... Initially I thought so, too. But it seems other people might think this means "a sequence with the first element being an integer, and the second a float". The BDFL seems to have yet a different idea: http://mail.python.org/pipermail/python-ideas/2012-December/018129.html This is clearly material for a separate PEP, wink wink, nudge nudge. To the point though. Based on this, and the fact PEP 8 currently disallows annotations within the standard library, I came to the conclusion that currently we should not include the annotation-driven form. 
> I generally lean towards returning the undecorated function, so that if you say:
>
>    @func.register
>    def do_int(foo: int):
>        ...

Me too. The PEP has been updated to specify that explicitly.

--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

From lukasz at langa.pl  Thu May 23 22:10:21 2013
From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=)
Date: Thu, 23 May 2013 22:10:21 +0200
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To: <519E5C42.7080009@netwok.org>
References: <519E5C42.7080009@netwok.org>
Message-ID: <12B446AB-1176-4210-B3E3-22884CB394BE@langa.pl>

On 23 May 2013, at 20:13, Éric Araujo wrote:

> Does this work if the implementation function is called like the first
> decorated function?

No, the ``register()`` attribute returns the undecorated function which enables decorator stacking, as well as creating unit tests for each variant independently.

> Making generic functions work with ABCs sounds like a requirement to me

Yes, I will implement that.

> Question: what happens if two functions (say in two different modules)
> are registered for the same type?

Last one wins. Just like with assigning names in a scope, defining methods in a class or overriding them in a subclass.
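
For readers following along, here is a minimal sketch of the "last one wins" rule described above. It assumes the `functools.singledispatch` API as it later shipped in the standard library; the function names `describe`, `describe_int_first` and `describe_int_second` are made up for illustration.

```python
from functools import singledispatch


@singledispatch
def describe(arg):
    # Fallback used when no more specific implementation is registered.
    return "default"


@describe.register(int)
def describe_int_first(arg):
    return "first int implementation"


# Registering a second function for the same type silently replaces
# the first one in the generic function's registry.
@describe.register(int)
def describe_int_second(arg):
    return "second int implementation"


result = describe(0)
```

Because `register()` hands back the undecorated function, `describe_int_first` is still callable under its own name; it is only the dispatch table entry for `int` that has been overwritten.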
--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

From ethan at stoneleaf.us  Thu May 23 22:19:20 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 23 May 2013 13:19:20 -0700
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To: <38A712BA-0776-494A-8C9B-5A9E0D4D3B91@langa.pl>
References: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl> <38A712BA-0776-494A-8C9B-5A9E0D4D3B91@langa.pl>
Message-ID: <519E79C8.4080805@stoneleaf.us>

On 05/23/2013 01:10 PM, Łukasz Langa wrote:
> On 23 May 2013, at 20:59, PJ Eby wrote:
>
>> As to the ability to do multiple types registration, you could support
>> it only in type annotations, e.g.:
>>
>>    @func.register
>>    def doit(foo: [int, float]):
>>        ...
>
> Initially I thought so, too. But it seems other people might think this
> means "a sequence with the first element being an integer, and the second
> a float". The BDFL seems to have yet a different idea:
>
> http://mail.python.org/pipermail/python-ideas/2012-December/018129.html
>
> This is clearly material for a separate PEP, wink wink, nudge nudge.
>
> To the point though. Based on this, and the fact PEP 8 currently disallows
> annotations within the standard library, I came to the conclusion that
> currently we should not include the annotation-driven form.
>
>> I generally lean towards returning the undecorated function, so that if you say:
>>
>>    @func.register
>>    def do_int(foo: int):
>>        ...
>
> Me too. The PEP has been updated to specify that explicitly.

So with this decision made, are there any open issues left? Or can we invite Guido back to the discussion?
;)

--
~Ethan~

From ronan.lamy at gmail.com  Thu May 23 23:02:46 2013
From: ronan.lamy at gmail.com (Ronan Lamy)
Date: Thu, 23 May 2013 23:02:46 +0200
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To: <12B446AB-1176-4210-B3E3-22884CB394BE@langa.pl>
References: <519E5C42.7080009@netwok.org> <12B446AB-1176-4210-B3E3-22884CB394BE@langa.pl>
Message-ID:

2013/5/23 Łukasz Langa

> On 23 May 2013, at 20:13, Éric Araujo wrote:
>
> > Question: what happens if two functions (say in two different modules)
> > are registered for the same type?
>
> Last one wins. Just like with assigning names in a scope, defining methods
> in a class or overriding them in a subclass.
>

This is a serious annoyance, considering that there are several places where a large library can reasonably define the implementations (i.e. with the class, with the function, or in some utility module). Note that in contrast with the case of functions in a module or methods in a class, linting tools cannot be expected to detect a duplication between functions with different names defined in different modules.

Another thing missing from the PEP is the ability to access the implementation function when you know the generic function and the class. A major use case for this is to define the implementation for a subclass by reusing its parent's implementation, e.g.:

    @some_generic.register(my_int)
    def _(arg):
        print("Hello from my_int!")
        return some_generic[int](arg)

--
Ronan Lamy
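
As a point of reference for the use case above, the sketch below shows how reusing a parent's implementation can be written against `functools.singledispatch` as it later shipped, which exposes the lookup as a `dispatch()` method rather than the subscript syntax used in the message. `MyInt` is a hypothetical stand-in for `my_int`, and the return strings are invented for the example.

```python
from functools import singledispatch


class MyInt(int):
    """Hypothetical int subclass standing in for my_int."""


@singledispatch
def some_generic(arg):
    return "default"


@some_generic.register(int)
def _(arg):
    return "int"


@some_generic.register(MyInt)
def _(arg):
    # Look up the implementation registered for the parent type
    # and delegate to it explicitly.
    parent_impl = some_generic.dispatch(int)
    return "my_int -> " + parent_impl(arg)


result = some_generic(MyInt(3))
```

`dispatch(cls)` returns whichever implementation would be chosen for `cls`, so it also answers "which function handles this type?" without calling anything.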
From merwok at netwok.org  Thu May 23 23:17:47 2013
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Thu, 23 May 2013 17:17:47 -0400
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To: <12B446AB-1176-4210-B3E3-22884CB394BE@langa.pl>
References: <519E5C42.7080009@netwok.org> <12B446AB-1176-4210-B3E3-22884CB394BE@langa.pl>
Message-ID: <519E877B.8050008@netwok.org>

On 23/05/2013 16:10, Łukasz Langa wrote:
>> Does this work if the implementation function is called like the first
>> decorated function?
> No, the ``register()`` attribute returns the undecorated function which
> enables decorator stacking, as well as creating unit tests for each
> variant independently.

Perfect. My web framework of choice uses decorators that register things and return the function as is and I love it. I guess the common pattern will be to use variants of the generic function name, e.g. func is implemented by func_int, func_str and co, which also helps debugging.

>> Making generic functions work with ABCs sounds like a requirement to me
> Yes, I will implement that.

Great!

>> Question: what happens if two functions (say in two different modules)
>> are registered for the same type?
> Last one wins. Just like with assigning names in a scope, defining methods
> in a class or overriding them in a subclass.

Works for me.
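
The naming pattern and the decorator stacking discussed above can be sketched as follows, again assuming `functools.singledispatch` as it later shipped. The names `fun` and `fun_num` are illustrative only.

```python
from functools import singledispatch


@singledispatch
def fun(arg):
    return "default: %r" % (arg,)


# register() returns the function it was given, so the decorators can be
# stacked and the implementation keeps a descriptive name of its own.
@fun.register(int)
@fun.register(float)
def fun_num(arg):
    return "number: %r" % (arg,)


via_generic = fun(1.5)
direct_call = fun_num("x")  # bypasses dispatch, e.g. in a unit test
```

Calling `fun_num` directly skips dispatch entirely, which is what makes testing each variant independently straightforward.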
Cheers

From steve at pearwood.info  Fri May 24 00:32:22 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 24 May 2013 08:32:22 +1000
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To: <519E2FF3.9000106@stoneleaf.us>
References: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl> <519E2FF3.9000106@stoneleaf.us>
Message-ID: <519E98F6.2060309@pearwood.info>

On 24/05/13 01:04, Ethan Furman wrote:
> On 05/23/2013 07:58 AM, Łukasz Langa wrote:
>> I feel that the PEP should explicitly allow or disallow for the
>> implementation to accept dispatch on annotations, e.g.:
>>
>> @func.register
>> def _(arg: int):
>>     ...
>>
>> versus
>>
>> @func.register(int)
>> def _(arg):
>>     ...
>
> If the stdlib is still staying out of the annotation business, then it should not be allowed.

Perhaps it is time to relax that ruling? The standard library acts as a guide to best practice in Python, and I think that uptake of annotations has been hurt due to the lack of good examples. Also, anyone with the conceit that their library or module may someday be in the standard library cannot afford to use annotations at all.

So I'm tentatively +1 on allowing the annotation form in addition to the decorator argument form.
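
For comparison, the two registration spellings debated above look like this side by side. The sketch assumes a later Python (3.7 or newer), where `functools.singledispatch` accepts the annotation form in addition to the decorator-argument form; it does not reflect what the PEP specified at the time of this thread. The generic function `show` is made up for the example.

```python
from functools import singledispatch


@singledispatch
def show(arg):
    return "object"


# Annotation-driven form: the dispatch type is read from the
# annotation on the first parameter (Python 3.7+ only).
@show.register
def _(arg: int):
    return "int"


# Decorator-argument form: the dispatch type is passed explicitly.
@show.register(str)
def _(arg):
    return "str"


results = (show(1), show("a"), show(2.0))
```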
--
Steven

From steve at pearwood.info  Fri May 24 00:33:08 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 24 May 2013 08:33:08 +1000
Subject: [Python-Dev] PEP 409 and the stdlib
In-Reply-To: <519E26B0.20403@stoneleaf.us>
References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520145004.1B1FA250BCF@webabinitio.net> <519A3DFC.5090705@stoneleaf.us> <519B1F89.4030003@avl.com> <519B9AC1.6050808@stoneleaf.us> <519DFF3B.8060003@v.loewis.de> <519E26B0.20403@stoneleaf.us>
Message-ID: <519E9924.6060104@pearwood.info>

On 24/05/13 00:24, Ethan Furman wrote:
> Here's the code that existed at one point:
>
>   for c in s:
>       val = _b32rev.get(c)
>       if val is None:
>           raise TypeError('Non-base32 digit found')
>
> Even though there is no KeyError to convert in this incarnation, providing the cause of failure is still appreciated by the user who's trying to figure out what, exactly, went wrong.

For the record, that is the implementation used in Python 3.3.0rc3, so "at some point" is actually very recently.

--
Steven

From steve at pearwood.info  Fri May 24 00:40:12 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 24 May 2013 08:40:12 +1000
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <519E3D09.4000707@livinglogic.de>
Message-ID: <519E9ACC.6000400@pearwood.info>

On 24/05/13 02:56, Paul Moore wrote:
> On 23 May 2013 17:00, Walter Dörwald wrote:
>
>> Should it be possible to register multiple types for the generic function
>> with one register() call, i.e. should:
>>
>>    @fun.register(int, float)
>>    def _(arg, verbose=False):
>>       ...
>>
>> be allowed as a synonym for
>>
>>    @fun.register(int)
>>    @fun.register(float)
>>    def _(arg, verbose=False):
>>
>
> No, because people will misread register(int, float) as meaning first
> argument int, second float. The double decorator is explicit as to what is
> going on, and isn't too hard to read or write.

I don't think that they will.
Being able to register multiple types with a single call reads very naturally to me, while multiple decorators still looks weird. Even after many years of seeing them, I still get a momentary "What the hell...?" moment when I see two decorators on one function. That's only going to be increased when both decorators are the same (apart from the argument). The double decorator form above looks to me as weird as: x = func(a) x = func(b) would. I have to stop and think about what is going on, and whether or not it is a mistake. So I am a strong +1 on allowing multiple types to be registered in one call. -- Steven From benhoyt at gmail.com Fri May 24 00:58:01 2013 From: benhoyt at gmail.com (Ben Hoyt) Date: Fri, 24 May 2013 10:58:01 +1200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: <519E9ACC.6000400@pearwood.info> References: <519E3D09.4000707@livinglogic.de> <519E9ACC.6000400@pearwood.info> Message-ID: > So I am a strong +1 on allowing multiple types to be registered in one call. Yeah, agreed. It also fits the pattern set by isinstance(), which allows a tuple of types, like isinstance(x, (int, str)). That said, I'm +0 on this PEP itself. It seems no one has provided decent use-case examples (apart from contrived ones), from the stdlib for example. In the fairly large codebase I work on, it'd only be used in one place, and even there the PEP's approach is arguably too simple for what we're doing. It seems to me for the few times this would be used, direct and simple use of isinstance() would be clearer. But maybe that's just our particular codebase. 
-Ben

From pje at telecommunity.com  Fri May 24 02:31:48 2013
From: pje at telecommunity.com (PJ Eby)
Date: Thu, 23 May 2013 20:31:48 -0400
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <519E3D09.4000707@livinglogic.de> <519E9ACC.6000400@pearwood.info>
Message-ID:

On Thu, May 23, 2013 at 6:58 PM, Ben Hoyt wrote:
> It seems no one has provided
> decent use-case examples (apart from contrived ones)

Um, copy.copy(), pprint.pprint(), a bunch of functions in pkgutil which are actually *based on this implementation already* and have been since Python 2.5... I don't see how any of those are contrived examples. If we'd had this in already, all the registration-based functions for copying, pickling, etc. would likely have been implemented this way, and the motivating example for the PEP is the coming refactoring of pprint.pprint.

From ethan at stoneleaf.us  Fri May 24 02:20:23 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 23 May 2013 17:20:23 -0700
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <519E5C42.7080009@netwok.org> <12B446AB-1176-4210-B3E3-22884CB394BE@langa.pl>
Message-ID: <519EB247.7010908@stoneleaf.us>

On 05/23/2013 02:02 PM, Ronan Lamy wrote:
> 2013/5/23 Łukasz Langa
>
>     On 23 May 2013, at 20:13, Éric Araujo wrote:
>
>     > Question: what happens if two functions (say in two different modules)
>     > are registered for the same type?
>
>     Last one wins. Just like with assigning names in a scope, defining methods
>     in a class or overriding them in a subclass.
>
>
> This is a serious annoyance, considering that there are several places where a large library can reasonably define the
> implementations (i.e. with the class, with the function, or in some utility module).
> Note that in contrast with the case
> of functions in a module or methods in a class, linting tools cannot be expected to detect a duplication between
> functions with different names defined in different modules.

What would you suggest happen in this case?

--
~Ethan~

From ericsnowcurrently at gmail.com  Fri May 24 03:30:01 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 23 May 2013 19:30:01 -0600
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To: <519E98F6.2060309@pearwood.info>
References: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl> <519E2FF3.9000106@stoneleaf.us> <519E98F6.2060309@pearwood.info>
Message-ID:

On May 23, 2013 4:37 PM, "Steven D'Aprano" wrote:
>
> On 24/05/13 01:04, Ethan Furman wrote:
>> If the stdlib is still staying out of the annotation business, then it should not be allowed.
>
>
> Perhaps it is time to relax that ruling? The standard library acts as a guide to best practice in Python, and I think that uptake of annotations has been hurt due to the lack of good examples. Also, anyone with the conceit that their library or module may someday be in the standard library cannot afford to use annotations at all.

The idea that decorators determine the meaning of annotations (i.e. they have no meaning without a decorator) really appeals to me. I don't see the imperative for this PEP though, but I'm not opposed. If there were more discussion and consensus on annotations + decorators I'd be more convinced.

-eric
From ericsnowcurrently at gmail.com  Fri May 24 03:45:53 2013
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 23 May 2013 19:45:53 -0600
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl> <519E2FF3.9000106@stoneleaf.us> <519E98F6.2060309@pearwood.info>
Message-ID:

On Thu, May 23, 2013 at 7:30 PM, Eric Snow wrote:
> If there were more
> discussion and consensus on annotations + decorators I'd be more convinced.

However, this PEP should not be gated on any such discussion.

-eric

From fperez.net at gmail.com  Fri May 24 05:04:42 2013
From: fperez.net at gmail.com (Fernando Perez)
Date: Fri, 24 May 2013 03:04:42 +0000 (UTC)
Subject: [Python-Dev] What if we didn't have repr?
References:
Message-ID:

On Tue, 21 May 2013 06:36:54 -0700, Guido van Rossum wrote:

> Actually changing __str__ or __repr__ is out of the question, best we
> can do is discourage making them different. But adding a protocol for
> pprint (with extra parameters to convey options) is a fair idea. I note
> that Nick suggested to use single-dispatch generic functions for this
> though. Both have pros and cons. Post design ideas to python-ideas
> please, not here!

Just in case you guys find this useful, in IPython we've sort of created this kind of 'extended repr protocol', described and illustrated here with examples:

http://nbviewer.ipython.org/url/github.com/ipython/ipython/raw/master/examples/notebooks/Custom%20Display%20Logic.ipynb

It has proven to be widely used in practice.
Cheers, f From ncoghlan at gmail.com Fri May 24 05:53:45 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 May 2013 13:53:45 +1000 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: References: <0F3ACAC0-EE48-4574-B9E2-C00B09E5DCDC@langa.pl> <519E2FF3.9000106@stoneleaf.us> <519E98F6.2060309@pearwood.info> Message-ID: On Fri, May 24, 2013 at 11:45 AM, Eric Snow wrote: > On Thu, May 23, 2013 at 7:30 PM, Eric Snow wrote: >> If there were more >> discussion and consensus on annotations + decorators I'd be more convinced. > > However, this PEP should not be gated on any such discussion. Right, I think the latest update makes the right call by saying "maybe someday, but not for now". Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri May 24 05:57:38 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 May 2013 13:57:38 +1000 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: References: <519E3D09.4000707@livinglogic.de> <519E9ACC.6000400@pearwood.info> Message-ID: On Fri, May 24, 2013 at 10:31 AM, PJ Eby wrote: > On Thu, May 23, 2013 at 6:58 PM, Ben Hoyt wrote: >> It seems no one has provided >> decent use-case examples (apart from contrived ones) > > Um, copy.copy(), pprint.pprint(), a bunch of functions in pkgutil > which are actually *based on this implementation already* and have > been since Python 2.5... I don't see how any of those are contrived > examples. If we'd had this in already, all the registration-based > functions for copying, pickling, etc. would likely have been > implemented this way, and the motivating example for the PEP is the > coming refactoring of pprint.pprint. We should be able to use it to help deal with the "every growing importer API" problem, too. 
I know that's technically what pkgutil already uses it for, but elevating this from "pkgutil implementation detail" to "official stdlib functionality" should make it easier to document properly :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Fri May 24 07:09:30 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 24 May 2013 15:09:30 +1000 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: <519E9ACC.6000400@pearwood.info> References: <519E3D09.4000707@livinglogic.de> <519E9ACC.6000400@pearwood.info> Message-ID: On Fri, May 24, 2013 at 8:40 AM, Steven D'Aprano wrote: > I don't think that they will. Being able to register multiple types with a > single call reads very naturally to me, while multiple decorators still > looks weird. Even after many years of seeing them, I still get a momentary > "What the hell...?" moment when I see two decorators on one function. That's > only going to be increased when both decorators are the same (apart from the > argument). The double decorator form above looks to me as weird as: > > x = func(a) > x = func(b) > > > would. I have to stop and think about what is going on, and whether or not > it is a mistake. The difference is that this idiom quickly becomes familiar and unexceptional: @fun.register(float) @fun.register(Decimal) def fun_floating_point(arg1, arg2): ... "Oh, OK, 'fun' is a generic function, and we're registering this as the implementation for floats and Decimals" By contrast, the following are *always* ambiguous at the point of definition, as it depends on how fun is defined: @fun.register(float, Decimal) def fun_floating_point(arg1, arg2): ... @fun.register([float, Decimal]) def fun_floating_point(arg1, arg2): ... Is that multiple dispatch? Or is it registering for single dispatch on multiple different types? 
Sure, we could pick the latter meaning for the standard library, but existing generic function implementations (cited in the PEP) use the tuple-of-types notation for multiple dispatch. By opting for stacking decorators in the PEP and hence the stdlib, we leave the way clear for 3rd party multi-dispatch libraries to use multiple type arguments without introducing any ambiguity. > So I am a strong +1 on allowing multiple types to be registered in one call. Whereas I'm a strong -1, as the ambiguity problem it would create is persistent and irreversible, while stacking registration decorators is just a new idiom to become accustomed to. Cheers, Nick. From pje at telecommunity.com Fri May 24 08:28:56 2013 From: pje at telecommunity.com (PJ Eby) Date: Fri, 24 May 2013 02:28:56 -0400 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: References: <519E3D09.4000707@livinglogic.de> <519E9ACC.6000400@pearwood.info> Message-ID: On Thu, May 23, 2013 at 11:57 PM, Nick Coghlan wrote: > We should be able to use it to help deal with the "every growing > importer API" problem, too. I know that's technically what pkgutil > already uses it for, but elevating this from "pkgutil implementation > detail" to "official stdlib functionality" should make it easier to > document properly :) Oh, that reminds me. pprint() is actually an instance of a general pattern that single dispatch GF's are good for: "visitor pattern" algorithms. There's a pretty good write-up on the general issues with doing visitor pattern stuff in Python, and how single-dispatch GF's can solve that problem, here: http://peak.telecommunity.com/DevCenter/VisitorRevisited The code samples use a somewhat different API from the PEP, but it's pretty close. The main issues solved are eliminating monkeypatching and fixing the inheritance problems that occur when you use 'visit_foo' methods. One of the samples actually comes from the old 'compiler' package in the stdlib... 
which tells you how long ago I did the write-up. ;-)

From sam.partington at gmail.com  Fri May 24 11:54:27 2013
From: sam.partington at gmail.com (Sam Partington)
Date: Fri, 24 May 2013 10:54:27 +0100
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <519E5C42.7080009@netwok.org> <12B446AB-1176-4210-B3E3-22884CB394BE@langa.pl>
Message-ID:

On 23 May 2013 22:02, Ronan Lamy wrote:
> 2013/5/23 Łukasz Langa
>> Last one wins. Just like with assigning names in a scope, defining methods
>> in a class or overriding them in a subclass.
>
> This is a serious annoyance, considering that there are several places where
> a large library can reasonably define the implementations (i.e. with the
> class, with the function, or in some utility module). Note that in contrast
> with the case of functions in a module or methods in a class, linting tools
> cannot be expected to detect a duplication between functions with different
> names defined in different modules.

But isn't it much much worse than names in scope, as with assigning names in a scope it is only your scope that is affected:

    from os.path import join
    def join(wibble):
        'overloads join in this module only'

Any other module is unaffected, os.path.join still calls os.path.join.

However, with this all scopes globally are affected by the last one wins rule.

-----default.py-------
from pkgutil import simplegeneric

@simplegeneric
def fun(x):
    print('default impl')

-------a.py--------
from default import fun

@fun.register(int)
def impl_a(x):
    print('impl_a')

def f():
    fun(0)  # expect this to call impl_a

-------b.py------
from default import fun

@fun.register(int)
def impl_b(x):
    print('impl_b')

def f():
    fun(0)  # expect this to call impl_b

--------

>>> import a, b
>>> a.f()
impl_b
>>> b.f()
impl_b

>>> import b, a
>>> a.f()
impl_a
>>> b.f()
impl_a
>>> exit()

That is rather worrying. It is more analogous in the above example to

    sys.modules['os.path'].join = myjoin

I don't have a solution mind though.
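
One possible guard against the silent clobbering shown above, offered only as a sketch and not as anything proposed in this thread: a small wrapper that checks the generic function's `registry` before registering. It assumes `functools.singledispatch` as it later shipped; `register_once` is a hypothetical helper, not part of any library.

```python
from functools import singledispatch


def register_once(generic, cls):
    """Hypothetical helper: like generic.register(cls), but refuses to
    replace an implementation that is already registered for cls."""
    def decorator(func):
        if cls in generic.registry:
            raise ValueError("%s already has an implementation for %s"
                             % (generic.__name__, cls.__name__))
        return generic.register(cls)(func)
    return decorator


@singledispatch
def fun(x):
    return "default impl"


@register_once(fun, int)
def impl_a(x):
    return "impl_a"


try:
    # A second registration for int, as module b.py would attempt.
    @register_once(fun, int)
    def impl_b(x):
        return "impl_b"
except ValueError as exc:
    clash = str(exc)
```

Note that `object` is always present in the registry (it maps to the default implementation), so this helper cannot be used to replace the fallback. Whether raising is the right policy, as opposed to warning, is a design choice this sketch does not settle.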
Sam From jeanpierreda at gmail.com Fri May 24 12:20:55 2013 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Fri, 24 May 2013 06:20:55 -0400 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions In-Reply-To: <519E9ACC.6000400@pearwood.info> References: <519E3D09.4000707@livinglogic.de> <519E9ACC.6000400@pearwood.info> Message-ID: On Thu, May 23, 2013 at 6:40 PM, Steven D'Aprano wrote: > I don't think that they will. Being able to register multiple types with a > single call reads very naturally to me, while multiple decorators still > looks weird. Even after many years of seeing them, I still get a momentary > "What the hell...?" moment when I see two decorators on one function. That's > only going to be increased when both decorators are the same (apart from the > argument). The double decorator form above looks to me as weird as: > > x = func(a) > x = func(b) > > > would. I have to stop and think about what is going on, and whether or not > it is a mistake. That's absurd. The above is not comparable to double decorators, the following is: x = func(a) x = func(x) And this is clearly not something anyone has to stop and think about. (more literally, obviously it's actually def x(...): ... ; x = func(a)(x); x = func(b)(x)) There is nothing remotely wrong or distasteful about using multiple decorators. It's a natural thing to want to compose multiple functions together; for example, @functools.lru_cache with @fun.register or @staticmethod or [...]. And it's even natural to want to apply the same decorator with different arguments multiple times to the same thing, if it happens to do something different when given different arguments. 
-- Devin

From steve at pearwood.info Fri May 24 12:53:54 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 24 May 2013 20:53:54 +1000
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <519E3D09.4000707@livinglogic.de>
 <519E9ACC.6000400@pearwood.info>
Message-ID: <519F46C2.9020503@pearwood.info>

On 24/05/13 15:09, Nick Coghlan wrote:
> On Fri, May 24, 2013 at 8:40 AM, Steven D'Aprano wrote:
>> I don't think that they will. Being able to register multiple types with a
>> single call reads very naturally to me, while multiple decorators still
>> looks weird. Even after many years of seeing them, I still get a momentary
>> "What the hell...?" moment when I see two decorators on one function. That's
>> only going to be increased when both decorators are the same (apart from the
>> argument). The double decorator form above looks to me as weird as:
>>
>> x = func(a)
>> x = func(b)
>>
>>
>> would. I have to stop and think about what is going on, and whether or not
>> it is a mistake.
>
> The difference is that this idiom quickly becomes familiar and unexceptional:
>
> @fun.register(float)
> @fun.register(Decimal)
> def fun_floating_point(arg1, arg2):
>     ...

I initially wrote a reply about the nature of ambiguity, why
register(float, Decimal) should not be considered ambiguous, why stacked
decorators that are near-duplicates are a code smell, blah blah blah.
But for the sake of brevity I'm going to skip it. The important point
that you make is here:

> Is that multiple dispatch? Or is it registering for single dispatch on
> multiple different types?
>
> Sure, we could pick the latter meaning for the standard library, but
> existing generic function implementations (cited in the PEP) use the
> tuple-of-types notation for multiple dispatch.

This is an excellent point I had not considered.
By the way, it seems to me that Guido's multimethod implementation
referenced in the PEP actually uses a single decorator argument per
function argument, not a tuple-of-types:

@multimethod(int, int)
def foo(a, b):
    ...code for two ints...

http://www.artima.com/weblogs/viewpost.jsp?thread=101605

You have convinced me: ambiguous or not, for the sake of future
expansion I agree that multiple positional arguments to the register
method should be left for some hypothetical multiple-dispatch generics:

@fun.register(float, Decimal) # not yet supported, but maybe someday

would mean "first argument is a float, second argument is a Decimal".

But that still leaves open how to specify single dispatch on more than
one type:

> stacking registration decorators is
> just a new idiom to become accustomed to.

Python built-ins and the standard library already have a standard idiom
for specifying multiple values at once. A tuple of types is the One
Obvious Way to do this:

@fun.register((float, Decimal))

which matches the same standard idiom that should be familiar to most
people:

isinstance(obj, (float, Decimal))
issubclass(cls, (float, Decimal))

And of course it is forward-compatible with our hypothetical future
multiple-generics.

-- Steven

From ncoghlan at gmail.com Fri May 24 13:41:27 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 24 May 2013 21:41:27 +1000
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To: <519F46C2.9020503@pearwood.info>
References: <519E3D09.4000707@livinglogic.de>
 <519E9ACC.6000400@pearwood.info>
 <519F46C2.9020503@pearwood.info>
Message-ID:

On Fri, May 24, 2013 at 8:53 PM, Steven D'Aprano wrote:
> Python built-ins and the standard library already have a standard idiom for
> specifying multiple values at once.
> A tuple of types is the One Obvious Way
> to do this:
>
> @fun.register((float, Decimal))

It's not obvious, it's ambiguous - some third party libraries use that
notation for multi-method dispatch, and they always will, no matter
what notation we choose for the standard library.

We have three available notations to register the same function for
multiple types: stacked decorators, tuple-of-types and multiple
arguments.

Of those, the first we *cannot avoid* supporting, since we want to
return the undecorated function regardless for pickle support and ease
of testing.

The second two are both used as notations by existing third party
multiple dispatch libraries.

Thus, your request is that we add a second way to do it that is
*known* to conflict with existing third party practices. There is no
practical gain on offer, it merely aligns with your current sense of
aesthetics slightly better than stacked decorators do. While you're
entitled to that aesthetic preference, it isn't a valid justification
for adding an unneeded alternate spelling.

Furthermore, the proposed registration syntax in the PEP is identical
to the syntax which already exists for ABC registration as a class
decorator (http://docs.python.org/3/library/abc#abc.ABCMeta.register).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Fri May 24 13:55:09 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 24 May 2013 21:55:09 +1000
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <519E3D09.4000707@livinglogic.de>
 <519E9ACC.6000400@pearwood.info>
 <519F46C2.9020503@pearwood.info>
Message-ID:

On Fri, May 24, 2013 at 9:41 PM, Nick Coghlan wrote:
> Furthermore, the proposed registration syntax in the PEP is identical
> to the syntax which already exists for ABC registration as a class
> decorator (http://docs.python.org/3/library/abc#abc.ABCMeta.register).

Sorry, I withdraw that observation - it's wrong.
ABC registration obviously doesn't need to provide arguments to the
decorator at all, since the only necessary info is the ABC itself, and
that's provided through the method binding.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Fri May 24 14:08:16 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 24 May 2013 22:08:16 +1000
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <519E5C42.7080009@netwok.org>
 <12B446AB-1176-4210-B3E3-22884CB394BE@langa.pl>
Message-ID:

On Fri, May 24, 2013 at 7:54 PM, Sam Partington wrote:
> But isn't it much much worse than names in scope, as with assigning
> names in a scope it is only your scope that is affected :
>
> from os.path import join
> def join(wibble):
> 'overloads join in this module only'
>
> any other module is unaffected, os.path.join still calls os.path.join
>
> however with this all scopes globally are affected by the last one wins rule.

Indeed, as with any modification of process global state, generic
implementation registration is something to be approached with care.
ABC registration is similar.

There are actually three kinds of registration that can happen, and only
two of them are appropriate for libraries to do implicitly, while the
last should only be explicitly triggered from main:

* registering a class your library defines with a stdlib or third
  party generic function
* registering a stdlib or third party class with a generic function
  your library defines
* registering a stdlib or third party class with a stdlib or third
  party generic function

The first two cases? Those are just part of defining class behaviour
or function behaviour. That's entirely up to the library developer and
an entirely sensible thing for them to be doing.

That third case? It's the moral equivalent of monkey patching, and
it's the strict prerogative of application integrators.
The core assumption is that on import, you're just providing one
component of an application, and you don't know what that application
is or what its needs are. By contrast, when you're explicitly called
from main, then you can assume that this is an explicit request from
the *integrated* application that wants you to modify the global
state.

One of the best examples of a project that gets this distinction right
is gevent - you can generally import gevent without any side effects on
the process global state. However, the gevent.monkey module exposes
monkeypatching that the *application* developer can request.

You know you have a well written library if someone else could import
every single one of your modules into their application and it would
have *zero* effect on them until they call a function. This is often
the tipping point that pushes things over from being libraries to
being frameworks: the frameworks have side effects on import that mean
they don't play well with others (implicitly configuring the logging
system is a common example - the changes to the logging module's
default behaviour in 3.2 were designed to make it easier for library
developers to *stop doing that*, because it causes spurious
incompatibilities. Messing too much with the import system as a side
effect of import is also framework-like behaviour).

Making *any* changes outside your module scope as a side effect of
import can be problematic, since even if it doesn't conflict with
another library, it has a tendency to break module reloading. One of
the minor reasons that ABC registration, and the proposed single
dispatch registration, permit silent overwriting is that being too
aggressive about enforcing "once and only once" can make module
reloading even more fragile than it is already.

Cheers,
Nick.
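
The "register only on explicit request" pattern described above might look roughly like the following. This is a sketch under assumptions: it uses `functools.singledispatch` as later shipped in Python 3.4, `render` is a local stand-in for a generic function that would really live in some other library, and `register_all` is a hypothetical name for the opt-in hook.

```python
from fractions import Fraction
from functools import singledispatch


# Stand-in for a generic function owned by another library.
@singledispatch
def render(obj):
    return "<%s>" % type(obj).__name__


def _render_fraction(obj):
    # Implementation for a class this module does not own either.
    return "%d/%d" % (obj.numerator, obj.denominator)


def register_all():
    """Opt-in hook: nothing is registered as a side effect of import.

    The application's main is expected to call this explicitly, in the
    same spirit as gevent.monkey.
    """
    render.register(Fraction, _render_fraction)
```

Importing the module leaves `render` untouched; only an application that calls `register_all()` gets the new behaviour.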
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ronan.lamy at gmail.com Fri May 24 14:22:19 2013
From: ronan.lamy at gmail.com (Ronan Lamy)
Date: Fri, 24 May 2013 14:22:19 +0200
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To: <519EB247.7010908@stoneleaf.us>
References: <519E5C42.7080009@netwok.org>
 <12B446AB-1176-4210-B3E3-22884CB394BE@langa.pl>
 <519EB247.7010908@stoneleaf.us>
Message-ID:

2013/5/24 Ethan Furman

> On 05/23/2013 02:02 PM, Ronan Lamy wrote:
>
>> 2013/5/23 Łukasz Langa
>>
>> On 23 maj 2013, at 20:13, Éric Araujo <merwok at netwok.org> wrote:
>>
>> > Question: what happens if two functions (say in two different modules)
>> > are registered for the same type?
>>
>> Last one wins. Just like with assigning names in a scope, defining methods
>> in a class or overriding them in a subclass.
>>
>>
>> This is a serious annoyance, considering that there are several places
>> where a large library can reasonably define the implementations (i.e.
>> with the class, with the function, or in some utility module). Note that
>> in contrast with the case of functions in a module or methods in a class,
>> linting tools cannot be expected to detect a duplication between
>> functions with different names defined in different modules.
>>
>
> What would you suggest happen in this case?
>

Raise a ValueError, maybe? In that case, there needs to be a way to force
the overriding when it is explicitly desired. One way would be to allow
unregistering implementations: overriding is then done by unregistering
the old implementation before defining the new one. This is a bit
cumbersome, which IMHO is a good thing for an operation that is just as
disruptive as monkey-patching a class or a module.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lukasz at langa.pl Fri May 24 14:44:24 2013
From: lukasz at langa.pl (Łukasz Langa)
Date: Fri, 24 May 2013 14:44:24 +0200
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <519E5C42.7080009@netwok.org>
 <12B446AB-1176-4210-B3E3-22884CB394BE@langa.pl>
 <519EB247.7010908@stoneleaf.us>
Message-ID: <97C3BEB3-6BF3-455B-AB15-1054E48C6658@langa.pl>

On 24 maj 2013, at 14:22, Ronan Lamy wrote:

>> 2013/5/24 Ethan Furman
>> What would you suggest happen in this case?
> Raise a ValueError, maybe? In that case, there needs to be a way to force the overriding when it is explicitly desired. One way would be to allow unregistering implementations: overriding is then done by unregistering the old implementation before defining the new one.

Unfortunately this isn't going to work because the order of imports
might change during the life cycle of a program. Especially if you wish
to expose registering to users, the order of imports cannot be
guaranteed.

I recognize the need for such behaviour to be discoverable. This is
important for debugging purposes. This is why I'm going to let users
inspect registered overloads, as well as provide their own mapping for
the registry. I'm working on the reference implementation now, stay
tuned.

--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

From ncoghlan at gmail.com Fri May 24 14:37:26 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 24 May 2013 22:37:26 +1000
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <519E5C42.7080009@netwok.org>
 <12B446AB-1176-4210-B3E3-22884CB394BE@langa.pl>
 <519EB247.7010908@stoneleaf.us>
Message-ID:

On Fri, May 24, 2013 at 10:22 PM, Ronan Lamy wrote:
> Raise a ValueError, maybe? In that case, there needs to be a way to force
> the overriding when it is explicitly desired.
> One way would be to allow
> unregistering implementations: overriding is then done by unregistering the
> old implementation before defining the new one. This is a bit cumbersome,
> which IMHO is a good thing for an operation that is just as disruptive as
> monkey-patching a class or a module.

If you're registering an implementation for a type you didn't define
on a generic function you didn't define, it's *exactly* as disruptive
as monkey-patching. Note that the PEP proposes giving exactly as much
of a runtime warning about overwriting a registration as we do about
monkeypatching: none.

The two cases are exactly analogous: you can do it, you don't get a
warning if you do it, but if you do it implicitly as a side effect of
import then you will have developers cursing your name. So don't do
that, put it in a function that people can call if they want to
register your implementations (along the lines of gevent.monkey).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From lukasz at langa.pl Fri May 24 16:13:37 2013
From: lukasz at langa.pl (Łukasz Langa)
Date: Fri, 24 May 2013 16:13:37 +0200
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
In-Reply-To:
References: <519E5C42.7080009@netwok.org>
 <12B446AB-1176-4210-B3E3-22884CB394BE@langa.pl>
 <519EB247.7010908@stoneleaf.us>
 <97C3BEB3-6BF3-455B-AB15-1054E48C6658@langa.pl>
Message-ID:

On 24 maj 2013, at 14:53, Ronan Lamy wrote:

>> 2013/5/24 Łukasz Langa
>>
>> I recognize the need for such behaviour to be discoverable. This is
>> important for debugging purposes. This is why I'm going to let users
>> inspect registered overloads, as well as provide their own mapping
>> for the registry. I'm working on the reference implementation now,
>> stay tuned.
>
> OK, I agree that preventing silent overwriting is actually not desirable. Specifying an interface for the implementation registry sounds like the best possible solution to my concerns.
Users can now inspect the dispatcher and specify their own mapping for
a registry:
http://hg.python.org/features/pep-443/file/tip/Lib/functools.py#l359

Documentation:
http://hg.python.org/features/pep-443/file/tip/Doc/library/functools.rst#l189

ABC support is still not there, I'm working on it.

--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

From status at bugs.python.org Fri May 24 18:07:32 2013
From: status at bugs.python.org (Python tracker)
Date: Fri, 24 May 2013 18:07:32 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20130524160732.2C3FE560D4@psf.upfronthosting.co.za>

ACTIVITY SUMMARY (2013-05-17 - 2013-05-24)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    3972 ( +6)
  closed 25850 (+45)
  total  29822 (+51)

Open issues with patches: 1773

Issues opened (41)
==================

#13146: Writing a pyc file is not atomic
http://bugs.python.org/issue13146 reopened by barry

#15535: Fix pickling efficiency of named tuples in 2.7.3
http://bugs.python.org/issue15535 reopened by rhettinger

#18000: _md5 should be built if _ssl cannot be built
http://bugs.python.org/issue18000 opened by jdemeyer

#18003: New lzma crazy slow with line-oriented reading.
http://bugs.python.org/issue18003 opened by Michael.Fox #18004: test_list.test_overflow crashes Win64 http://bugs.python.org/issue18004 opened by anselm.kruis #18009: os.write.__doc__ is misleading http://bugs.python.org/issue18009 opened by shai #18010: pydoc search chokes on import errors http://bugs.python.org/issue18010 opened by pitrou #18011: Inconsistency between b32decode() documentation, docstring and http://bugs.python.org/issue18011 opened by serhiy.storchaka #18013: cgi.FieldStorage does not parse W3C sample http://bugs.python.org/issue18013 opened by flox #18014: Problem compiling on Windows, VC++Express 2010 http://bugs.python.org/issue18014 opened by terry.reedy #18015: python 2.7.5 fails to unpickle namedtuple pickled by 2.7.3 or http://bugs.python.org/issue18015 opened by anselm.kruis #18016: subprocess should open stdin in mode w+b on windows http://bugs.python.org/issue18016 opened by Jason.Gross #18017: ctypes.PyDLL documentation http://bugs.python.org/issue18017 opened by Marc.Br??nink #18018: SystemError: Parent module '' not loaded, cannot perform relat http://bugs.python.org/issue18018 opened by flox #18020: html.escape 10x slower than cgi.escape http://bugs.python.org/issue18020 opened by flox #18021: Update broken link to Apple Publication Style Guide http://bugs.python.org/issue18021 opened by madison.may #18022: Inconsistency between quopri.decodestring() and email.quoprimi http://bugs.python.org/issue18022 opened by serhiy.storchaka #18023: msi product code for 2.7.5150 not in Tools/msi/uuids.py http://bugs.python.org/issue18023 opened by anselm.kruis #18024: dbm module fails to build on SLES11SP1 using 2.7.5 source http://bugs.python.org/issue18024 opened by wempa #18025: Buffer overflow in BufferedIOBase.readinto() http://bugs.python.org/issue18025 opened by serhiy.storchaka #18027: distutils should access stat_result timestamps via .st_*time a http://bugs.python.org/issue18027 opened by jwilk #18028: Warnings with -fstrict-aliasing 
http://bugs.python.org/issue18028 opened by bkabrda #18029: Python SSL support is missing from SPARC build http://bugs.python.org/issue18029 opened by eeiddne #18032: set methods should specify whether they consume iterators "laz http://bugs.python.org/issue18032 opened by hhm #18033: Example for Profile Module shows incorrect method http://bugs.python.org/issue18033 opened by jough #18034: Last two entries in the programming FAQ are out of date (impor http://bugs.python.org/issue18034 opened by r.david.murray #18035: telnetlib incorrectly assumes that select.error has an errno a http://bugs.python.org/issue18035 opened by gregory.p.smith #18036: "How do I create a .pyc file?" FAQ entry is out of date http://bugs.python.org/issue18036 opened by r.david.murray #18037: 2to3 passes through string literal which causes SyntaxError in http://bugs.python.org/issue18037 opened by vinay.sajip #18038: Unhelpful error message on invalid encoding specification http://bugs.python.org/issue18038 opened by Max.Cantor #18039: dbm.open(..., flag="n") does not work and does not give a warn http://bugs.python.org/issue18039 opened by sonyachiko #18040: SIGINT catching regression on windows in 2.7 http://bugs.python.org/issue18040 opened by David.Gilman #18041: mention issues with code churn in the devguide http://bugs.python.org/issue18041 opened by tshepang #18042: Provide enum.unique class decorator http://bugs.python.org/issue18042 opened by ncoghlan #18043: No mention of `match.regs` in `re` documentation http://bugs.python.org/issue18043 opened by cool-RR #18044: Email headers do not properly decode to unicode. 
http://bugs.python.org/issue18044 opened by Tim.Rawlinson #18045: get_python_version is not import in bdist_rpm.py http://bugs.python.org/issue18045 opened by maoliping455 #18046: Simplify and clarify logging internals http://bugs.python.org/issue18046 opened by alex #18047: Descriptors get invoked in old-style objects and classes http://bugs.python.org/issue18047 opened by icecrime #18048: Merging test_pep263.py and test_coding.py http://bugs.python.org/issue18048 opened by serhiy.storchaka #18049: Re-enable threading test on OSX http://bugs.python.org/issue18049 opened by ronaldoussoren Most recent 15 issues with no replies (15) ========================================== #18049: Re-enable threading test on OSX http://bugs.python.org/issue18049 #18047: Descriptors get invoked in old-style objects and classes http://bugs.python.org/issue18047 #18046: Simplify and clarify logging internals http://bugs.python.org/issue18046 #18045: get_python_version is not import in bdist_rpm.py http://bugs.python.org/issue18045 #18044: Email headers do not properly decode to unicode. http://bugs.python.org/issue18044 #18043: No mention of `match.regs` in `re` documentation http://bugs.python.org/issue18043 #18040: SIGINT catching regression on windows in 2.7 http://bugs.python.org/issue18040 #18039: dbm.open(..., flag="n") does not work and does not give a warn http://bugs.python.org/issue18039 #18036: "How do I create a .pyc file?" 
FAQ entry is out of date http://bugs.python.org/issue18036 #18034: Last two entries in the programming FAQ are out of date (impor http://bugs.python.org/issue18034 #18033: Example for Profile Module shows incorrect method http://bugs.python.org/issue18033 #18032: set methods should specify whether they consume iterators "laz http://bugs.python.org/issue18032 #18027: distutils should access stat_result timestamps via .st_*time a http://bugs.python.org/issue18027 #18025: Buffer overflow in BufferedIOBase.readinto() http://bugs.python.org/issue18025 #18023: msi product code for 2.7.5150 not in Tools/msi/uuids.py http://bugs.python.org/issue18023 Most recent 15 issues waiting for review (15) ============================================= #18049: Re-enable threading test on OSX http://bugs.python.org/issue18049 #18046: Simplify and clarify logging internals http://bugs.python.org/issue18046 #18038: Unhelpful error message on invalid encoding specification http://bugs.python.org/issue18038 #18025: Buffer overflow in BufferedIOBase.readinto() http://bugs.python.org/issue18025 #18020: html.escape 10x slower than cgi.escape http://bugs.python.org/issue18020 #18015: python 2.7.5 fails to unpickle namedtuple pickled by 2.7.3 or http://bugs.python.org/issue18015 #18013: cgi.FieldStorage does not parse W3C sample http://bugs.python.org/issue18013 #18011: Inconsistency between b32decode() documentation, docstring and http://bugs.python.org/issue18011 #18010: pydoc search chokes on import errors http://bugs.python.org/issue18010 #18000: _md5 should be built if _ssl cannot be built http://bugs.python.org/issue18000 #17998: internal error in regular expression engine http://bugs.python.org/issue17998 #17978: Python crashes if Py_Initialize/Py_Finalize are called multipl http://bugs.python.org/issue17978 #17976: file.write doesn't raise IOError when it should http://bugs.python.org/issue17976 #17974: Migrate unittest to argparse http://bugs.python.org/issue17974 #17956: add 
ScheduledExecutor http://bugs.python.org/issue17956 Top 10 most discussed issues (10) ================================= #18003: New lzma crazy slow with line-oriented reading. http://bugs.python.org/issue18003 17 msgs #13612: xml.etree.ElementTree says unknown encoding of a regular encod http://bugs.python.org/issue13612 14 msgs #12641: Remove -mno-cygwin from distutils http://bugs.python.org/issue12641 13 msgs #15392: Create a unittest framework for IDLE http://bugs.python.org/issue15392 11 msgs #13146: Writing a pyc file is not atomic http://bugs.python.org/issue13146 9 msgs #17140: Provide a more obvious public ThreadPool API http://bugs.python.org/issue17140 9 msgs #7727: xmlrpc library returns string which contain null ( \x00 ) http://bugs.python.org/issue7727 7 msgs #14097: Improve the "introduction" page of the tutorial http://bugs.python.org/issue14097 7 msgs #17683: socket.getsockname() inconsistent return type with AF_UNIX http://bugs.python.org/issue17683 7 msgs #17839: base64 module should use memoryview http://bugs.python.org/issue17839 7 msgs Issues closed (42) ================== #3006: subprocess.Popen causes socket to remain open after close http://bugs.python.org/issue3006 closed by neologix #3489: add rotate{left,right} methods to bytearray http://bugs.python.org/issue3489 closed by terry.reedy #7146: platform.uname()[4] returns 'amd64' on Windows and 'x86-64' on http://bugs.python.org/issue7146 closed by r.david.murray #7214: TreeBuilder.end(tag) differs between cElementTree and ElementT http://bugs.python.org/issue7214 closed by eli.bendersky #11011: More functools functions http://bugs.python.org/issue11011 closed by r.david.murray #11995: test_pydoc loads all Python modules http://bugs.python.org/issue11995 closed by pitrou #14009: Clearer documentation for cElementTree http://bugs.python.org/issue14009 closed by eli.bendersky #15758: FileIO.readall() has worst case O(n^2) complexity http://bugs.python.org/issue15758 closed by sbt #16603: 
Sporadic test_socket failures: testFDPassCMSG_SPACE on Mac OS http://bugs.python.org/issue16603 closed by neologix #16986: ElementTree incorrectly parses strings with declared encoding http://bugs.python.org/issue16986 closed by serhiy.storchaka #17269: getaddrinfo segfaults on OS X when provided with invalid argum http://bugs.python.org/issue17269 closed by ronaldoussoren #17453: logging.config.fileConfig error http://bugs.python.org/issue17453 closed by lukasz.langa #17532: IDLE: Always include "Options" menu on MacOSX http://bugs.python.org/issue17532 closed by ned.deily #17644: str.format() crashes http://bugs.python.org/issue17644 closed by python-dev #17684: Skip tests in test_socket like testFDPassSeparate on OS X http://bugs.python.org/issue17684 closed by neologix #17743: Use extended syntax of `set` command in activate.bat/deactivat http://bugs.python.org/issue17743 closed by python-dev #17744: Unset VIRTUAL_ENV environment variable in deactivate.bat http://bugs.python.org/issue17744 closed by python-dev #17812: Quadratic complexity in b32encode http://bugs.python.org/issue17812 closed by serhiy.storchaka #17844: Add link to alternatives for bytes-to-bytes codecs http://bugs.python.org/issue17844 closed by ncoghlan #17900: Recursive OrderedDict pickling http://bugs.python.org/issue17900 closed by serhiy.storchaka #17901: _elementtree.TreeBuilder raises IndexError on end if construct http://bugs.python.org/issue17901 closed by eli.bendersky #17917: use PyModule_AddIntMacro() instead of PyModule_AddIntConstant( http://bugs.python.org/issue17917 closed by neologix #17937: Collect garbage harder at shutdown http://bugs.python.org/issue17937 closed by pitrou #17955: Minor updates to Functional HOWTO http://bugs.python.org/issue17955 closed by akuchling #17957: remove outdated (and unexcellent) paragraph in whatsnew http://bugs.python.org/issue17957 closed by ezio.melotti #17979: Cannot build 2.7 with --enable-unicode=no http://bugs.python.org/issue17979 closed 
by serhiy.storchaka #17980: CVE-2013-2099 ssl.match_hostname() trips over crafted wildcard http://bugs.python.org/issue17980 closed by pitrou #17988: ElementTree.Element != ElementTree._ElementInterface http://bugs.python.org/issue17988 closed by eli.bendersky #17989: ElementTree.Element broken attribute setting http://bugs.python.org/issue17989 closed by eli.bendersky #17996: socket module should expose AF_LINK http://bugs.python.org/issue17996 closed by giampaolo.rodola #17999: test_super fails in refleak runs http://bugs.python.org/issue17999 closed by python-dev #18001: TypeError: dict is not callable in ConfigParser.py http://bugs.python.org/issue18001 closed by ethan.furman #18002: AMD64 Windows7 SP1 3.x buildbot: compilation of _ssl module fa http://bugs.python.org/issue18002 closed by pitrou #18005: faster modular exponentiation in some cases http://bugs.python.org/issue18005 closed by mark.dickinson #18006: Set thread name in linux kernel http://bugs.python.org/issue18006 closed by neologix #18007: CookieJar expects request objects with origin_req_host attribu http://bugs.python.org/issue18007 closed by orsenthil #18008: python33-3.3.2 Parser/pgen: Permission denied http://bugs.python.org/issue18008 closed by ned.deily #18012: cannot assign unicode keys to SimpleCookie http://bugs.python.org/issue18012 closed by flox #18019: dictionary views lead to segmentation fault http://bugs.python.org/issue18019 closed by python-dev #18030: IDLE shell crashes when reporting errors in Windows 7 http://bugs.python.org/issue18030 closed by serhiy.storchaka #18031: The Python Tutorial says % string formatting will be removed http://bugs.python.org/issue18031 closed by rhettinger #18026: Typo in ctypes documentation http://bugs.python.org/issue18026 closed by ned.deily From barry at python.org Fri May 24 21:56:29 2013 From: barry at python.org (Barry Warsaw) Date: Fri, 24 May 2013 15:56:29 -0400 Subject: [Python-Dev] Bilingual scripts Message-ID: 
<20130524155629.7597bdb0@anarchist>

Here's something that seems to come up from time to time in Debian.

Take a Python application like tox, nose, or pyflakes. Their executables work
with both Python 2 and 3, but require a #! line to choose which interpreter to
invoke.

When we add Python 3 support in Debian for such a script, all the library
installations are handled just fine, but we have conflicts about what to name
the thing that lands in /usr/bin. Do we have /usr/bin/pyflakes and
/usr/bin/pyflakes3? Do we call the latter py3flakes (as has been convention
with some other scripts, but which breaks locate(1))? Do we simply remove the
/usr/bin scripts and encourage people to use something like `$python -m nose`?

One interesting recent suggestion is to create a shell driver script and make
that thing accept --py3 and --py2 flags, which would then select the
appropriate interpreter and invoke the actual Python driver. There are some
technical details to iron out with that, but there's some appeal to it.

Have any other *nix distros addressed this, and if so, how do you solve it?
It would be nice if we could have some cross-platform recommendations so
things work the same wherever you go. To that end, if we can reach some
consensus, I'd be willing to put together an informational PEP and some
scripts that might be of general use.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL:

From rdmurray at bitdance.com Fri May 24 22:23:58 2013
From: rdmurray at bitdance.com (R. David Murray)
Date: Fri, 24 May 2013 16:23:58 -0400
Subject: [Python-Dev] Bilingual scripts
In-Reply-To: <20130524155629.7597bdb0@anarchist>
References: <20130524155629.7597bdb0@anarchist>
Message-ID: <20130524202358.C57F9250BDB@webabinitio.net>

On Fri, 24 May 2013 15:56:29 -0400, Barry Warsaw wrote:
> Have any other *nix distros addressed this, and if so, how do you solve it?
> It would be nice if we could have some cross-platform recommendations so
> things work the same wherever you go. To that end, if we can reach some
> consensus, I'd be willing to put together an informational PEP and some
> scripts that might be of general use.

Gentoo has a (fairly complex) driver script that is symlinked to all
of these bin scripts. The system then has the concept of the
"current python", which can be set to python2 or python3. The default
bin then calls the current default interpreter. There are also
xxx2 and xxx3 versions of each bin script, which call the 'current'
version of python2 or python3, respectively.

I'm sure one of the gentoo devs on this list can speak to this more
completely...I'm just a user :) But I must say that the system works
well from my point of view.

--David

From dirkjan at ochtman.nl Fri May 24 22:52:43 2013
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Fri, 24 May 2013 22:52:43 +0200
Subject: [Python-Dev] Bilingual scripts
In-Reply-To: <20130524202358.C57F9250BDB@webabinitio.net>
References: <20130524155629.7597bdb0@anarchist>
 <20130524202358.C57F9250BDB@webabinitio.net>
Message-ID:

On Fri, May 24, 2013 at 10:23 PM, R. David Murray wrote:
> Gentoo has a (fairly complex) driver script that is symlinked to all
> of these bin scripts. The system then has the concept of the
> "current python", which can be set to python2 or python3. The default
> bin then calls the current default interpreter. There are also
> xxx2 and xxx3 versions of each bin script, which call the 'current'
> version of python2 or python3, respectively.

I'm one of the Gentoo devs, on the python team. I haven't actually
written any code for this, but I can show a little of what's going on.
I think most of the code is actually in
https://bitbucket.org/mgorny/python-exec.
We then install three scripts:

lrwxrwxrwx 1 root root  11 May 20 14:06 /usr/bin/sphinx-build -> python-exec
-rwxr-xr-x 1 root root 311 May 20 14:06 /usr/bin/sphinx-build-python2.7
-rwxr-xr-x 1 root root 311 May 20 14:06 /usr/bin/sphinx-build-python3.2

sphinx-build-python2.7 looks like this:

#!/usr/bin/python2.7
# EASY-INSTALL-ENTRY-SCRIPT: 'Sphinx==1.1.3','console_scripts','sphinx-build'
__requires__ = 'Sphinx==1.1.3'
import sys
from pkg_resources import load_entry_point

if __name__ == '__main__':
    sys.exit(
        load_entry_point('Sphinx==1.1.3', 'console_scripts', 'sphinx-build')()
    )

We now use a python2.7 suffix rather than just a 2.7 suffix because we
will install separate wrappers for e.g. pypy1.9 (and we are also
prepared to support jython or other implementations at some point).

If you have any more questions, I'll try to answer them; or, join
#gentoo-python on Freenode, there are generally people hanging out
there who know much more about our setup than I do.

Cheers,

Dirkjan

From chrism at plope.com Sat May 25 09:12:32 2013
From: chrism at plope.com (Chris McDonough)
Date: Sat, 25 May 2013 03:12:32 -0400
Subject: [Python-Dev] Bilingual scripts
In-Reply-To: <20130524155629.7597bdb0@anarchist>
References: <20130524155629.7597bdb0@anarchist>
Message-ID: <1369465952.2673.171.camel@thinko>

On Fri, 2013-05-24 at 15:56 -0400, Barry Warsaw wrote:
> Here's something that seems to come up from time to time in Debian.
>
> Take a Python application like tox, nose, or pyflakes. Their executables work
> with both Python 2 and 3, but require a #! line to choose which interpreter to
> invoke.

You probably already know this, but I'll mention it anyway. This
probably matters a lot for nose and pyflakes, but I'd say that for tox
it should not; it basically just scripts execution of shell commands.
I'd think maybe in cases like tox (and others that are compatible with
both Python 2 and 3) the hashbang should just be set to
"#!/usr/bin/python" unconditionally.
Maybe we could also think about modifying pyflakes so that it can validate both 2 and 3 code (choosing one or the other based on a header line in the validated files and defaulting to the version of Python being run). This is kind of the right thing anyway. Nose is a bit of a special case. I personally never run nosetests directly, I always use setup.py nosetests, which makes it not matter. In general, I'd like to think that scripts that get installed to global bindirs will execute utilities that are useful independent of the version of Python being used to execute them. - C From solipsis at pitrou.net Sat May 25 09:53:16 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 25 May 2013 09:53:16 +0200 Subject: [Python-Dev] Bilingual scripts References: <20130524155629.7597bdb0@anarchist> Message-ID: <20130525095316.010f2b34@fsol> On Fri, 24 May 2013 15:56:29 -0400 Barry Warsaw wrote: > Here's something that seems to come up from time to time in Debian. > > Take a Python application like tox, nose, or pyflakes. Their executables work > with both Python 2 and 3, but require a #! line to choose which interpreter to > invoke. > > When we add Python 3 support in Debian for such a script, all the library > installations are handled just fine, but we have conflicts about what to name > the thing that lands in /usr/bin. Do we have /usr/bin/pyflakes and > /usr/bin/pyflakes3? Do we call the latter py3flakes (as has been convention > with some other scripts, but which breaks locate(1))? Do we simply remove the > /usr/bin scripts and encourage people to use something like `$python -m nose`? How about always running the version specific targets, e.g. nosetests-2.7? Regards Antoine. 
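For readers following the thread: the driver-script idea raised at its start (a wrapper that accepts --py2 and --py3, picks an interpreter, and then runs the real entry point with "-m") could be sketched roughly as below. This is only an illustration under stated assumptions, not taken from any distribution's packaging. The function names split_flags, build_command and main are hypothetical, and the interpreter command names python, python2 and python3 are assumed to exist on PATH.

```python
import os


def split_flags(argv):
    """Separate the --py2/--py3 selector from the remaining arguments.

    Assumption: with no selector, the plain "python" command is used.
    """
    interpreter = "python"
    rest = []
    for arg in argv:
        if arg == "--py2":
            interpreter = "python2"
        elif arg == "--py3":
            interpreter = "python3"
        else:
            rest.append(arg)
    return interpreter, rest


def build_command(module, argv):
    """Build the argv list for running `module` under the chosen interpreter."""
    interpreter, rest = split_flags(argv)
    return [interpreter, "-m", module] + rest


def main(module, argv):
    """Replace the current process with the selected interpreter.

    Hypothetical entry point: a wrapper installed as /usr/bin/nosetests
    might call main("nose", sys.argv[1:]).
    """
    command = build_command(module, argv)
    os.execvp(command[0], command)
```

With this shape the selection logic stays in pure functions that can be checked without launching anything; only main() touches the process.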
From ncoghlan at gmail.com Sat May 25 09:57:28 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 25 May 2013 17:57:28 +1000 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <20130524155629.7597bdb0@anarchist> References: <20130524155629.7597bdb0@anarchist> Message-ID: On Sat, May 25, 2013 at 5:56 AM, Barry Warsaw wrote: > Have any other *nix distros addressed this, and if so, how do you solve it? I believe Fedora follows the lead set by our own makefile and just appends a "3" to the script name when there is also a Python 2 equivalent (thus ``pydoc3`` and ``pyvenv``). (I don't have any other system provided Python 3 scripts on this machine, though) > It would be nice if we could have some cross-platform recommendations so > things work the same wherever you go. To that end, if we can reach some > consensus, I'd be willing to put together an informational PEP and some > scripts that might be of general use. It seems to me the existing recommendation to use ``#!/usr/bin/env python`` instead of referencing a particular binary already covers the general case. The challenge for the distros is that we want a solution that *ignores* user level virtual environments. I think the simplest thing to do is just append the "3" to the binary name (as we do ourselves for pydoc) and then abide by the recommendations in PEP 394 to reference the correct system executable. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From chrism at plope.com Sat May 25 12:17:06 2013 From: chrism at plope.com (Chris McDonough) Date: Sat, 25 May 2013 06:17:06 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: References: <20130524155629.7597bdb0@anarchist> Message-ID: <1369477026.2673.178.camel@thinko> On Sat, 2013-05-25 at 17:57 +1000, Nick Coghlan wrote: > I think the simplest thing to do is just append the "3" to the binary > name (as we do ourselves for pydoc) and then abide by the > recommendations in PEP 394 to reference the correct system executable. I'm curious if folks have other concrete examples of global bindir executables other than nosetests and pydoc that need to be disambiguated by Python version. I'd hate to see it become standard practice to append "3" to scripts generated by packages which happen to use Python 3, as it will just sort of perpetuate its otherness. - C From ncoghlan at gmail.com Sat May 25 13:01:17 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 25 May 2013 21:01:17 +1000 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <1369477026.2673.178.camel@thinko> References: <20130524155629.7597bdb0@anarchist> <1369477026.2673.178.camel@thinko> Message-ID: On Sat, May 25, 2013 at 8:17 PM, Chris McDonough wrote: > On Sat, 2013-05-25 at 17:57 +1000, Nick Coghlan wrote: >> I think the simplest thing to do is just append the "3" to the binary >> name (as we do ourselves for pydoc) and then abide by the >> recommendations in PEP 394 to reference the correct system executable. > > I'm curious if folks have other concrete examples of global bindir > executables other than nosetests and pydoc that need to be disambiguated > by Python version. Single source Python 2/3 packages don't have the problem. They can either run with the system Python explicitly, or use "/usr/bin/env python" in order to respect virtual environments. 
The issue only exists for projects where Python 2 and Python 3 are
separate code bases with distinct scripts to be installed on the
target system. In my opinion, that is just one more reason why single
source is a vastly superior alternative to treating the Python 2 and
Python 3 versions as separate applications.

> I'd hate to see it become standard practice to
> append "3" to scripts generated by packages which happen to use Python
> 3, as it will just sort of perpetuate its otherness.

Fedora only does it for stuff that has to straddle the two with a
parallel install as a system binary. If something is exclusive to Py3
(like "pyvenv") it doesn't get the suffix.

It's certainly not an elegant solution, but Python is far from the
only runtime platform afflicted by similar issues when it comes to
balancing the interests of distro developers wanting to build a single
integrated system against those of application developers wanting to
use more recent versions of their dependencies.

Longer term, we (Red Hat) want to better support application stacks
that are more independent of the system versions, while still being
fully under the control of the system package manager (to ease
security updates).
The mechanism underlying that is Software Collections, which is better
described here:
http://developerblog.redhat.com/2013/01/28/software-collections-on-red-hat-enterprise-linux/

(The software collection system also exists in Fedora, as that's where
it was developed, but the main benefit there is to allow application
developers to use something *older* than what Fedora provides in the
system packages if there are backwards compatibility issues with an
update)

I have no idea if Debian are contemplating anything similar, but the
current situation where application developers have to choose between
letting distro vendors control the upgrade cycle of key dependencies
and divorcing themselves entirely from the benefits offered by the
system package manager is unsustainable in the long term.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From lukasz at langa.pl Sat May 25 14:08:24 2013
From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=)
Date: Sat, 25 May 2013 14:08:24 +0200
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support)
Message-ID:

Hello,

Since the initial version, several minor changes have been made to the
PEP. The history is visible on hg.python.org. The most important
change in this version is that I introduced ABC support and completed
a reference implementation.

No open issues remain from my point of view.


PEP: 443
Title: Single-dispatch generic functions
Version: $Revision$
Last-Modified: $Date$
Author: Łukasz Langa
Discussions-To: Python-Dev
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 22-May-2013
Post-History: 22-May-2013, 25-May-2013
Replaces: 245, 246, 3124


Abstract
========

This PEP proposes a new mechanism in the ``functools`` standard library
module that provides a simple form of generic programming known as
single-dispatch generic functions.

A **generic function** is composed of multiple functions sharing the
same name.
Which form should be used during a call is determined by the dispatch
algorithm. When the implementation is chosen based on the type of a
single argument, this is known as **single dispatch**.


Rationale and Goals
===================

Python has always provided a variety of built-in and standard-library
generic functions, such as ``len()``, ``iter()``, ``pprint.pprint()``,
``copy.copy()``, and most of the functions in the ``operator`` module.
However, it currently:

1. does not have a simple or straightforward way for developers to
   create new generic functions,

2. does not have a standard way for methods to be added to existing
   generic functions (i.e., some are added using registration
   functions, others require defining ``__special__`` methods, possibly
   by monkeypatching).

In addition, it is currently a common anti-pattern for Python code to
inspect the types of received arguments, in order to decide what to do
with the objects. For example, code may wish to accept either an object
of some type, or a sequence of objects of that type.

Currently, the "obvious way" to do this is by type inspection, but this
is brittle and closed to extension. Abstract Base Classes make it easier
to discover present behaviour, but don't help adding new behaviour.
A developer using an already-written library may be unable to change how
their objects are treated by such code, especially if the objects they
are using were created by a third party.

Therefore, this PEP proposes a uniform API to address dynamic
overloading using decorators.


User API
========

To define a generic function, decorate it with the ``@singledispatch``
decorator. Note that the dispatch happens on the type of the first
argument, so create your function accordingly::

  >>> from functools import singledispatch
  >>> @singledispatch
  ... def fun(arg, verbose=False):
  ...     if verbose:
  ...         print("Let me just say,", end=" ")
  ...
  ...     print(arg)

To add overloaded implementations to the function, use the
``register()`` attribute of the generic function. It takes a type
parameter::

  >>> @fun.register(int)
  ... def _(arg, verbose=False):
  ...     if verbose:
  ...         print("Strength in numbers, eh?", end=" ")
  ...     print(arg)
  ...
  >>> @fun.register(list)
  ... def _(arg, verbose=False):
  ...     if verbose:
  ...         print("Enumerate this:")
  ...     for i, elem in enumerate(arg):
  ...         print(i, elem)

To enable registering lambdas and pre-existing functions, the
``register()`` attribute can be used in a functional form::

  >>> def nothing(arg, verbose=False):
  ...     print("Nothing.")
  ...
  >>> fun.register(type(None), nothing)

The ``register()`` attribute returns the undecorated function which
enables decorator stacking, pickling, as well as creating unit tests for
each variant independently::

  >>> @fun.register(float)
  ... @fun.register(Decimal)
  ... def fun_num(arg, verbose=False):
  ...     if verbose:
  ...         print("Half of your number:", end=" ")
  ...     print(arg / 2)
  ...
  >>> fun_num is fun
  False

When called, the generic function dispatches on the first argument::

  >>> fun("Hello, world.")
  Hello, world.
  >>> fun("test.", verbose=True)
  Let me just say, test.
  >>> fun(42, verbose=True)
  Strength in numbers, eh? 42
  >>> fun(['spam', 'spam', 'eggs', 'spam'], verbose=True)
  Enumerate this:
  0 spam
  1 spam
  2 eggs
  3 spam
  >>> fun(None)
  Nothing.
  >>> fun(1.23)
  0.615

To get the implementation for a specific type, use the ``dispatch()``
attribute::

  >>> fun.dispatch(float)
  >>> fun.dispatch(dict)

The proposed API is intentionally limited and opinionated, as to ensure
it is easy to explain and use, as well as to maintain consistency with
existing members in the ``functools`` module.


Implementation Notes
====================

The functionality described in this PEP is already implemented in the
``pkgutil`` standard library module as ``simplegeneric``. Because this
implementation is mature, the goal is to move it largely as-is.
The reference implementation is available on hg.python.org
[#ref-impl]_.

The dispatch type is specified as a decorator argument. An alternative
form using function annotations has been considered but its inclusion
has been deferred. As of May 2013, this usage pattern is out of scope
for the standard library [#pep-0008]_ and the best practices for
annotation usage are still debated.

Based on the current ``pkgutil.simplegeneric`` implementation and
following the convention on registering virtual subclasses on Abstract
Base Classes, the dispatch registry will not be thread-safe.

Abstract Base Classes
---------------------

The ``pkgutil.simplegeneric`` implementation relied on several forms of
method resolution order (MRO). ``@singledispatch`` removes special
handling of old-style classes and Zope's ExtensionClasses. More
importantly, it introduces support for Abstract Base Classes (ABC).

When a generic function overload is registered for an ABC, the dispatch
algorithm switches to a mode of MRO calculation for the provided
argument which includes the relevant ABCs. The algorithm is as follows::

  def _compose_mro(cls, haystack):
      """Calculates the MRO for a given class `cls`, including relevant
      abstract base classes from `haystack`."""
      bases = set(cls.__mro__)
      mro = list(cls.__mro__)
      for regcls in haystack:
          if regcls in bases or not issubclass(cls, regcls):
              continue   # either present in the __mro__ or unrelated
          for index, base in enumerate(mro):
              if not issubclass(base, regcls):
                  break
          if base in bases and not issubclass(regcls, base):
              # Conflict resolution: put classes present in __mro__
              # and their subclasses first.
              index += 1
          mro.insert(index, regcls)
      return mro

While this mode of operation is significantly slower, no caching is
involved because user code may ``register()`` a new class on an ABC at
any time. In such case, it is possible to create a situation with
ambiguous dispatch, for instance::

  >>> from collections import Iterable, Container
  >>> class P:
  ...
  ...     pass
  >>> Iterable.register(P)
  >>> Container.register(P)

Faced with ambiguity, ``@singledispatch`` refuses the temptation to
guess::

  >>> @singledispatch
  ... def g(arg):
  ...     return "base"
  ...
  >>> g.register(Iterable, lambda arg: "iterable")
  <function <lambda> at 0x108b49110>
  >>> g.register(Container, lambda arg: "container")
  <function <lambda> at 0x108b491c8>
  >>> g(P())
  Traceback (most recent call last):
  ...
  RuntimeError: Ambiguous dispatch: <class 'collections.abc.Container'>
  or <class 'collections.abc.Iterable'>

Note that this exception would not be raised if ``Iterable`` and
``Container`` had been provided as base classes during class definition.
In this case dispatch happens in the MRO order::

  >>> class Ten(Iterable, Container):
  ...     def __iter__(self):
  ...         for i in range(10):
  ...             yield i
  ...     def __contains__(self, value):
  ...         return value in range(10)
  ...
  >>> g(Ten())
  'iterable'


Usage Patterns
==============

This PEP proposes extending behaviour only of functions specifically
marked as generic. Just as a base class method may be overridden by
a subclass, so too may a function be overloaded to provide custom
functionality for a given type.

Universal overloading does not equal *arbitrary* overloading, in the
sense that we need not expect people to randomly redefine the behavior
of existing functions in unpredictable ways. To the contrary, generic
function usage in actual programs tends to follow very predictable
patterns and overloads are highly-discoverable in the common case.

If a module is defining a new generic operation, it will usually also
define any required overloads for existing types in the same place.
Likewise, if a module is defining a new type, then it will usually
define overloads there for any generic functions that it knows or cares
about. As a result, the vast majority of overloads can be found adjacent
to either the function being overloaded, or to a newly-defined type for
which the overload is adding support.
It is only in rather infrequent cases that one will have overloads in a module that contains neither the function nor the type(s) for which the overload is added. In the absence of incompetence or deliberate intention to be obscure, the few overloads that are not adjacent to the relevant type(s) or function(s), will generally not need to be understood or known about outside the scope where those overloads are defined. (Except in the "support modules" case, where best practice suggests naming them accordingly.) As mentioned earlier, single-dispatch generics are already prolific throughout the standard library. A clean, standard way of doing them provides a way forward to refactor those custom implementations to use a common one, opening them up for user extensibility at the same time. Alternative approaches ====================== In PEP 3124 [#pep-3124]_ Phillip J. Eby proposes a full-grown solution with overloading based on arbitrary rule sets (with the default implementation dispatching on argument types), as well as interfaces, adaptation and method combining. PEAK-Rules [#peak-rules]_ is a reference implementation of the concepts described in PJE's PEP. Such a broad approach is inherently complex, which makes reaching a consensus hard. In contrast, this PEP focuses on a single piece of functionality that is simple to reason about. It's important to note this does not preclude the use of other approaches now or in the future. In a 2005 article on Artima [#artima2005]_ Guido van Rossum presents a generic function implementation that dispatches on types of all arguments on a function. The same approach was chosen in Andrey Popp's ``generic`` package available on PyPI [#pypi-generic]_, as well as David Mertz's ``gnosis.magic.multimethods`` [#gnosis-multimethods]_. 
While this seems desirable at first, I agree with Fredrik Lundh's comment that "if you design APIs with pages of logic just to sort out what code a function should execute, you should probably hand over the API design to someone else". In other words, the single argument approach proposed in this PEP is not only easier to implement but also clearly communicates that dispatching on a more complex state is an anti-pattern. It also has the virtue of corresponding directly with the familiar method dispatch mechanism in object oriented programming. The only difference is whether the custom implementation is associated more closely with the data (object-oriented methods) or the algorithm (single-dispatch overloading). PyPy's RPython offers ``extendabletype`` [#pairtype]_, a metaclass which enables classes to be externally extended. In combination with ``pairtype()`` and ``pair()`` factories, this offers a form of single-dispatch generics. Acknowledgements ================ Apart from Phillip J. Eby's work on PEP 3124 [#pep-3124]_ and PEAK-Rules, influences include Paul Moore's original issue [#issue-5135]_ that proposed exposing ``pkgutil.simplegeneric`` as part of the ``functools`` API, Guido van Rossum's article on multimethods [#artima2005]_, and discussions with Raymond Hettinger on a general pprint rewrite. Huge thanks to Nick Coghlan for encouraging me to create this PEP and providing initial feedback. References ========== .. [#ref-impl] http://hg.python.org/features/pep-443/file/tip/Lib/functools.py#l359 .. [#pep-0008] PEP 8 states in the "Programming Recommendations" section that "the Python standard library will not use function annotations as that would result in a premature commitment to a particular annotation style". (http://www.python.org/dev/peps/pep-0008) .. [#pep-3124] http://www.python.org/dev/peps/pep-3124/ .. [#peak-rules] http://peak.telecommunity.com/DevCenter/PEAK_2dRules .. [#artima2005] http://www.artima.com/weblogs/viewpost.jsp?thread=101605 .. 
[#pypi-generic] http://pypi.python.org/pypi/generic

.. [#gnosis-multimethods]
   http://gnosis.cx/publish/programming/charming_python_b12.html

.. [#pairtype]
   https://bitbucket.org/pypy/pypy/raw/default/rpython/tool/pairtype.py

.. [#issue-5135] http://bugs.python.org/issue5135


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:


--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

From solipsis at pitrou.net Sat May 25 15:18:23 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 25 May 2013 15:18:23 +0200
Subject: [Python-Dev] __subclasses__() return order
Message-ID: <20130525151823.38477393@fsol>

Hello,

In http://bugs.python.org/issue17936, I proposed making tp_subclasses
(the internal container implementing object.__subclasses__) a dict.
This would make the return order of __subclasses__ completely
undefined, while it is right now slightly predictable. I have never seen
__subclasses__ actually used in production code, so I'm wondering
whether someone might be affected by such a change.

Regards

Antoine.

From eliben at gmail.com Sat May 25 15:23:56 2013
From: eliben at gmail.com (Eli Bendersky)
Date: Sat, 25 May 2013 06:23:56 -0700
Subject: [Python-Dev] __subclasses__() return order
In-Reply-To: <20130525151823.38477393@fsol>
References: <20130525151823.38477393@fsol>
Message-ID:

On Sat, May 25, 2013 at 6:18 AM, Antoine Pitrou wrote:
>
> Hello,
>
> In http://bugs.python.org/issue17936, I proposed making tp_subclasses
> (the internal container implementing object.__subclasses__) a dict.
> This would make the return order of __subclasses__ completely
> undefined, while it is right now slightly predictable. I have never seen
> __subclasses__ actually used in production code, so I'm wondering
> whether someone might be affected by such a change.
> > Regards > Personally I never used it, but it's now explicitly documented as returning a list. Not sure what's the right thing to do here, but perhaps returning an OrderedDict can eliminate the order problem? Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sat May 25 15:26:58 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 25 May 2013 15:26:58 +0200 Subject: [Python-Dev] __subclasses__() return order In-Reply-To: References: <20130525151823.38477393@fsol> Message-ID: <20130525152658.577eab65@fsol> On Sat, 25 May 2013 06:23:56 -0700 Eli Bendersky wrote: > On Sat, May 25, 2013 at 6:18 AM, Antoine Pitrou wrote: > > > > > Hello, > > > > In http://bugs.python.org/issue17936, I proposed making tp_subclasses > > (the internal container implementing object.__subclasses__) a dict. > > This would make the return order of __subclasses__ completely > > undefined, while it is right now slightly predictable. I have never seen > > __subclasses__ actually used in production code, so I'm wondering > > whether someone might be affected by such a change. > > > > Regards > > > > Personally I never used it, but it's now explicitly documented as returning > a list. Not sure what's the right thing to do here, but perhaps returning > an OrderedDict can eliminate the order problem? It would still return a list. Regards Antoine. 
From solipsis at pitrou.net Sat May 25 15:45:25 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 25 May 2013 15:45:25 +0200 Subject: [Python-Dev] __subclasses__() return order References: <20130525151823.38477393@fsol> <20130525152658.577eab65@fsol> Message-ID: <20130525154525.5917817d@fsol> On Sat, 25 May 2013 15:26:58 +0200 Antoine Pitrou wrote: > On Sat, 25 May 2013 06:23:56 -0700 > Eli Bendersky wrote: > > On Sat, May 25, 2013 at 6:18 AM, Antoine Pitrou wrote: > > > > > > > > Hello, > > > > > > In http://bugs.python.org/issue17936, I proposed making tp_subclasses > > > (the internal container implementing object.__subclasses__) a dict. > > > This would make the return order of __subclasses__ completely > > > undefined, while it is right now slightly predictable. I have never seen > > > __subclasses__ actually used in production code, so I'm wondering > > > whether someone might be affected by such a change. > > > > > > Regards > > > > > > > Personally I never used it, but it's now explicitly documented as returning > > a list. Not sure what's the right thing to do here, but perhaps returning > > an OrderedDict can eliminate the order problem? > > It would still return a list. I guess I should explain myself more clearly: __subclasses__() already computes its result on-the-fly (it must weed out dead weakrefs) (*). So the visible behaviour of __subclasses__ wouldn't change, except for ordering. (*) >>> object.__subclasses__() is object.__subclasses__() False Regards Antoine. 
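A small sketch of the behaviour described above, for anyone who wants to check it locally. It assumes CPython 3; the class names Base and Child and the helpers subclass_names() and demo() are made up for illustration. It shows that each call to __subclasses__() builds a new list, that changing that list does not affect the type, and that a subclass disappears from the result once it has been garbage collected. Names are sorted before comparison so the check does not depend on return order.

```python
import gc


class Base:
    pass


def subclass_names():
    """Names of the live direct subclasses of Base, sorted for stable comparison."""
    return sorted(cls.__name__ for cls in Base.__subclasses__())


def demo():
    class Child(Base):
        pass

    first = Base.__subclasses__()
    second = Base.__subclasses__()
    # Same contents, but a distinct list object on every call.
    fresh_each_call = first == second and first is not second

    # Mutating the returned list leaves the type's own bookkeeping alone.
    first.append(int)
    unaffected = int not in Base.__subclasses__()

    before = subclass_names()

    # Drop every strong reference to Child; classes take part in
    # reference cycles, so ask the collector to run explicitly.
    del Child, first, second
    gc.collect()
    after = subclass_names()

    return fresh_each_call, unaffected, before, after
```

Code that needs a stable order can sort the result itself, as subclass_names() does, rather than relying on whatever order the call happens to return.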
From lukasz at langa.pl Sat May 25 15:49:04 2013
From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=)
Date: Sat, 25 May 2013 15:49:04 +0200
Subject: [Python-Dev] __subclasses__() return order
In-Reply-To: <20130525154525.5917817d@fsol>
References: <20130525151823.38477393@fsol> <20130525152658.577eab65@fsol> <20130525154525.5917817d@fsol>
Message-ID: <190C76A9-C252-470C-B93B-4FBFCF22FD50@langa.pl>

On 25 May 2013, at 15:45, Antoine Pitrou wrote:

> On Sat, 25 May 2013 15:26:58 +0200
> Antoine Pitrou wrote:
>
>> On Sat, 25 May 2013 06:23:56 -0700
>> Eli Bendersky wrote:
>>> On Sat, May 25, 2013 at 6:18 AM, Antoine Pitrou wrote:
>>>
>>>>
>>>> Hello,
>>>>
>>>> In http://bugs.python.org/issue17936, I proposed making tp_subclasses
>>>> (the internal container implementing object.__subclasses__) a dict.
>>>> This would make the return order of __subclasses__ completely
>>>> undefined, while it is right now slightly predictable. I have never seen
>>>> __subclasses__ actually used in production code, so I'm wondering
>>>> whether someone might be affected by such a change.
>>>>
>>>> Regards
>>>>
>>>
>>> Personally I never used it, but it's now explicitly documented as returning
>>> a list. Not sure what's the right thing to do here, but perhaps returning
>>> an OrderedDict can eliminate the order problem?
>>
>> It would still return a list.
>
> I guess I should explain myself more clearly: __subclasses__() already
> computes its result on-the-fly (it must weed out dead weakrefs) (*). So
> the visible behaviour of __subclasses__ wouldn't change, except for
> ordering.

+1

Makes sense to me. As currently defined, you cannot rely on the item
order anyway.

--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pje at telecommunity.com Sat May 25 16:08:20 2013
From: pje at telecommunity.com (PJ Eby)
Date: Sat, 25 May 2013 10:08:20 -0400
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support)
In-Reply-To:
References:
Message-ID:

On Sat, May 25, 2013 at 8:08 AM, Łukasz Langa wrote:
> The most important
> change in this version is that I introduced ABC support and completed
> a reference implementation.

Excellent! A couple of thoughts on the implementation...

While the dispatch() method allows you to look up what implementation
would be *selected* for a target type, it does not let you figure out
whether a particular method has been *registered* for a type.

That is, if I have a class MyInt that subclasses int, I can't use
dispatch() to check whether a MyInt implementation has been
registered, because I might get back an implementation registered for
int or object. ISTM there should be some way to get at the raw
registration info, perhaps by exposing a dictproxy for the registry.

Second, it should be possible to memoize dispatch() using a weak key
dictionary that is cleared if new ABC implementations have been
registered or when a call to register() is made. The way to detect
ABC registrations is via the ABCMeta._abc_invalidation_counter
attribute: if its value is different than the previous value saved
with the cache, the cache must be cleared, and the new value stored.

(Unfortunately, this is a private attribute at the moment; it might be
a good idea to make it public, however, because it's needed for any
sort of type dispatching mechanism, not just this one particular
generic function implementation.)

Anyway, doing the memoizing in the wrapper function should bring the
overall performance very close to a hand-written type dispatch.
Code might look something like:

    # imported inside closure so that functools module
    # doesn't force import of these other modules:
    #
    from weakref import ref, WeakKeyDictionary
    from abc import ABCMeta

    cache = WeakKeyDictionary()
    valid_as_of = ABCMeta._abc_invalidation_counter

    def wrapper(*args, **kw):
        nonlocal valid_as_of
        if valid_as_of != ABCMeta._abc_invalidation_counter:
            cache.clear()
            valid_as_of = ABCMeta._abc_invalidation_counter
        cls = args[0].__class__
        try:
            impl = cache.data[ref(cls)]
        except KeyError:
            impl = cache[cls] = dispatch(cls)
        return impl(*args, **kw)

    def register(typ, func=None):
        ...
        cache.clear()
        ...

This would basically eliminate doing any extra (Python) function calls
in the common case, and might actually be faster than my current
simplegeneric implementation on PyPI (which doesn't even do ABCs at
the moment).

From lukasz at langa.pl Sat May 25 16:53:59 2013
From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=)
Date: Sat, 25 May 2013 16:53:59 +0200
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support)
In-Reply-To:
References:
Message-ID: <16DDD0AB-8274-48F0-AB48-05BC7117204C@langa.pl>

On 25 May 2013, at 16:08, PJ Eby wrote:

> ISTM there should be some way to get at the raw
> registration info, perhaps by exposing a dictproxy for the registry.

Is that really useful? Just today Antoine asked about changing
behaviour of __subclasses__(), suspecting it isn't used in real world
code anyway. What you're proposing is the functional equivalent of
__subclasses__().

If you need direct access to the registry, do you think the ability to
specify your own registry container isn't enough?

> The way to detect ABC registrations is via the
> ABCMeta._abc_invalidation_counter attribute: if its value is different
> than the previous value saved with the cache, the cache must be
> cleared, and the new value stored.

Wow, I was looking at it just this morning and somehow missed how
it can be used.
Great idea, I'll play around with it later today. > (Unfortunately, this is a private attribute at the moment; it might be > a good idea to make it public, however, because it's needed for any > sort of type dispatching mechanism, not just this one particular > generic function implementation.) I think we can safely use it within the standard library, anyway. As for making it public, it's an idea for a separate discussion. > This would basically eliminate doing any extra (Python) function calls > in the common case, and might actually be faster than my current > simplegeneric implementation on PyPI (which doesn't even do ABCs at > the moment). Yes, that sounds neat. Thanks for feedback! -- Best regards, ?ukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev From ncoghlan at gmail.com Sat May 25 16:59:09 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 May 2013 00:59:09 +1000 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) In-Reply-To: References: Message-ID: On Sun, May 26, 2013 at 12:08 AM, PJ Eby wrote: > On Sat, May 25, 2013 at 8:08 AM, ?ukasz Langa wrote: >> The most important >> change in this version is that I introduced ABC support and completed >> a reference implementation. > > Excellent! A couple of thoughts on the implementation... > > While the dispatch() method allows you to look up what implementation > would be *selected* for a target type, it does not let you figure out > whether a particular method has been *registered* for a type. > > That is, if I have a class MyInt that subclasses int, I can't use > dispatch() to check whether a MyInt implementation has been > registered, because I might get back an implementation registered for > int or object. ISTM there should be some way to get at the raw > registration info, perhaps by exposing a dictproxy for the registry. 
I like that idea - types.MappingProxyType makes it straightforward to expose a read-only view of the dispatch registry. > Second, it should be possible to memoize dispatch() using a weak key > dictionary that is cleared if new ABC implementations have been > registered or when a call to register() is made. The way to detect > ABC registrations is via the ABCMeta._abc_invalidation_counter > attribute: if its value is different than the previous value saved > with the cache, the cache must be cleared, and the new value stored. > > (Unfortunately, this is a private attribute at the moment; it might be > a good idea to make it public, however, because it's needed for any > sort of type dispatching mechanism, not just this one particular > generic function implementation.) I think I added an issue on the tracker for that somewhere... yup: http://bugs.python.org/issue16832 Given the global nature of the cache invalidation, it may be better as a module level abc.get_cache_token() function. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From lukasz at langa.pl Sat May 25 17:09:39 2013 From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=) Date: Sat, 25 May 2013 17:09:39 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) In-Reply-To: References: Message-ID: <172A3269-16D3-498D-9214-BDF2E394B9BD@langa.pl> On 25 maj 2013, at 16:59, Nick Coghlan wrote: > I think I added an issue on the tracker for that somewhere... yup: > http://bugs.python.org/issue16832 > > Given the global nature of the cache invalidation, it may be better as > a module level abc.get_cache_token() function. I assigned myself to the issue. I hope it doesn't require a PEP? ;-) -- Best regards, ?ukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Sat May 25 17:13:00 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 May 2013 01:13:00 +1000 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) In-Reply-To: <16DDD0AB-8274-48F0-AB48-05BC7117204C@langa.pl> References: <16DDD0AB-8274-48F0-AB48-05BC7117204C@langa.pl> Message-ID: On Sun, May 26, 2013 at 12:53 AM, ?ukasz Langa wrote: > On 25 maj 2013, at 16:08, PJ Eby wrote: > >> ISTM there should be some way to get at the raw >> registration info, perhaps by exposing a dictproxy for the registry. > > Is that really useful? Just today Antoine asked about changing > behaviour of __subclasses__(), suspecting it isn't used in real world > code anyway. What you're proposing is the functional equivalent of > __subclasses__(). > > If you need direct access to the registry, do you think the ability to > specify your own registry container isn't enough? I'm actually wary about allowing that - letting people pass in the registry namespace the can alter it directly and thus bypass any extra code in register (such as setting the flag to enable the ABC support in the reference impl, or clearing the cache in PJE's suggested update). We don't allow custom namespaces for classes either. Sure, we allow them for the class *body*, but the contents of that get copied to a new dict when creating the class instance, precisely so we can ensure user code never gets a direct reference to the underlying mapping that defines the class behaviour. I actually patched Python once to to remove that copy operation while tinkering with the PEP 422 implementation. 
I wish I had kept a recording of the subsequent terminal session, as the terrible interactions with the method lookup caching were thoroughly confusing, but quite entertaining if you're into tinkering with programming languages :) So I think I'd prefer flipping this around - you can't provide a custom registry mapping, but you *can* get access to a read only view of it through a "registry" attribute on the generic function. >> The way to detect ABC registrations is via the >> ABCMeta._abc_invalidation_counter attribute: if its value is different >> than the previous value saved with the cache, the cache must be >> cleared, and the new value stored. > > Wow, I was looking at it just today morning and somehow missed how it > can be used. Great idea, I'll play around with it later today. Yeah, this is exactly how the ABC code itself uses it - sorry I wasn't clearer about why I suggested you look at the caching code for ABC instance and subtype checks :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Sat May 25 17:15:49 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 May 2013 01:15:49 +1000 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) In-Reply-To: <172A3269-16D3-498D-9214-BDF2E394B9BD@langa.pl> References: <172A3269-16D3-498D-9214-BDF2E394B9BD@langa.pl> Message-ID: On Sun, May 26, 2013 at 1:09 AM, ?ukasz Langa wrote: > On 25 maj 2013, at 16:59, Nick Coghlan wrote: > > I think I added an issue on the tracker for that somewhere... yup: > http://bugs.python.org/issue16832 > > Given the global nature of the cache invalidation, it may be better as > a module level abc.get_cache_token() function. > > > I assigned myself to the issue. I hope it doesn't require a PEP? ;-) Heh, I think you're safe on that one :) Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From pje at telecommunity.com Sat May 25 18:48:45 2013 From: pje at telecommunity.com (PJ Eby) Date: Sat, 25 May 2013 12:48:45 -0400 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) In-Reply-To: References: Message-ID: On Sat, May 25, 2013 at 10:59 AM, Nick Coghlan wrote: > Given the global nature of the cache invalidation, it may be better as > a module level abc.get_cache_token() function. Well, since the only reason to ever use it is to improve performance, it'd be better to expose it as an attribute than as a function. ;-) From pje at telecommunity.com Sat May 25 19:03:57 2013 From: pje at telecommunity.com (PJ Eby) Date: Sat, 25 May 2013 13:03:57 -0400 Subject: [Python-Dev] __subclasses__() return order In-Reply-To: <20130525151823.38477393@fsol> References: <20130525151823.38477393@fsol> Message-ID: On Sat, May 25, 2013 at 9:18 AM, Antoine Pitrou wrote: > In http://bugs.python.org/issue17936, I proposed making tp_subclasses > (the internal container implementing object.__subclasses__) a dict. > This would make the return order of __subclasses__ completely > undefined, while it is right now slightly predictable. I have never seen > __subclasses__ actually used in production code, so I'm wondering > whether someone might be affected by such a change. FWIW, when I've used __subclasses__, I've never depended on it having a stable or predictable order. (I find it somewhat difficult to imagine *why* one would do that, but of course that doesn't mean nobody has done it.) 
From ncoghlan at gmail.com Sat May 25 19:15:44 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 May 2013 03:15:44 +1000 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) In-Reply-To: References: Message-ID: On Sun, May 26, 2013 at 2:48 AM, PJ Eby wrote: > On Sat, May 25, 2013 at 10:59 AM, Nick Coghlan wrote: >> Given the global nature of the cache invalidation, it may be better as >> a module level abc.get_cache_token() function. > > Well, since the only reason to ever use it is to improve performance, > it'd be better to expose it as an attribute than as a function. ;-) A single function call is hardly in the same league as arbitrary traversal of the object graph. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From lukasz at langa.pl Sat May 25 22:16:04 2013 From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=) Date: Sat, 25 May 2013 22:16:04 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) In-Reply-To: References: <16DDD0AB-8274-48F0-AB48-05BC7117204C@langa.pl> Message-ID: On 25 maj 2013, at 17:13, Nick Coghlan wrote: > So I think I'd prefer flipping this around - you can't provide a > custom registry mapping, but you *can* get access to a read only view > of it through a "registry" attribute on the generic function. You guys convinced me. Both the PEP and the implementation are updated accordingly. I also introduced the caching as suggested by PJ. So, the latest document is live: http://www.python.org/dev/peps/pep-0443/ The code is here: http://hg.python.org/features/pep-443/file/tip/Lib/functools.py#l363 The documentation here: http://hg.python.org/features/pep-443/file/tip/Doc/library/functools.rst#l189 The tests here: http://hg.python.org/features/pep-443/file/tip/Lib/test/test_functools.py#l855 I say we're ready. 
-- Best regards, ?ukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev From solipsis at pitrou.net Sat May 25 22:25:27 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 25 May 2013 22:25:27 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) References: <16DDD0AB-8274-48F0-AB48-05BC7117204C@langa.pl> Message-ID: <20130525222527.5ed19fa1@fsol> On Sat, 25 May 2013 22:16:04 +0200 ?ukasz Langa wrote: > On 25 maj 2013, at 17:13, Nick Coghlan wrote: > > > So I think I'd prefer flipping this around - you can't provide a > > custom registry mapping, but you *can* get access to a read only view > > of it through a "registry" attribute on the generic function. > > You guys convinced me. Both the PEP and the implementation are updated > accordingly. I also introduced the caching as suggested by PJ. > > So, the latest document is live: > http://www.python.org/dev/peps/pep-0443/ > > The code is here: > http://hg.python.org/features/pep-443/file/tip/Lib/functools.py#l363 > > The documentation here: > http://hg.python.org/features/pep-443/file/tip/Doc/library/functools.rst#l189 > > The tests here: > http://hg.python.org/features/pep-443/file/tip/Lib/test/test_functools.py#l855 > > I say we're ready. Let it cook for a week or two. Regards Antoine. 
From pje at telecommunity.com Sun May 26 01:07:30 2013
From: pje at telecommunity.com (PJ Eby)
Date: Sat, 25 May 2013 19:07:30 -0400
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support)
In-Reply-To:
References: <16DDD0AB-8274-48F0-AB48-05BC7117204C@langa.pl>
Message-ID:

On Sat, May 25, 2013 at 4:16 PM, Łukasz Langa wrote:
> So, the latest document is live:
> http://www.python.org/dev/peps/pep-0443/
>
> The code is here:
> http://hg.python.org/features/pep-443/file/tip/Lib/functools.py#l363
>
> The documentation here:
> http://hg.python.org/features/pep-443/file/tip/Doc/library/functools.rst#l189

Code and tests look great!

Nitpick on the docs and PEP, though: generic functions are not composed of functions sharing the same name; it would probably be more correct to say they're composed of functions that perform the same operations on different types. (I think the "names" language might be left over from discussion of *overloaded* functions in PEP 3124 et al; in any case we're actually recommending people *not* use the same names now, so it's confusing.)

We should probably also standardize on the term used for the registered functions. The standard terminology is "method", but that would be confusing in Python, where methods usually have a self argument. The PEP uses the term "implementation", and I think that actually makes a lot of sense: a generic function is composed of functions that implement the same operation for different types.

So I suggest changing this:

"""
Transforms a function into a single-dispatch generic function. A **generic
function** is composed of multiple functions sharing the same name. Which
form should be used during a call is determined by the dispatch algorithm.
When the implementation is chosen based on the type of a single argument,
this is known as **single dispatch**.

Adding an overload to a generic function is achieved by using the
:func:`register` attribute of the generic function. The
:func:`register` attribute is a decorator, taking a type paramater
and decorating a function implementing the overload for that type."""

to:

"""
Transforms a function into a single-dispatch generic function. A **generic
function** is composed of multiple functions implementing the same
operation for different types. Which
implementation should be used during a call is determined by the
dispatch algorithm.
When the implementation is chosen based on the type of a single argument,
this is known as **single dispatch**.

Adding an implementation to a generic function is achieved by using the
:func:`register` attribute of the generic function. The
:func:`register` attribute is a decorator, taking a type paramater
and decorating a function implementing the operation for that type."""

And replacing "overload" with "implementation" in the remainder of the docs and code.

Last, but not least, there should be a stacking example somewhere in the doc, as in the PEP, and perhaps the suggestion to name individual implementations differently from each other and the main function -- perhaps as an adjunct to documenting that register() always returns its argument unchanged. (Currently, it doesn't mention what register()'s return value is.)

(It may also be useful to note somewhere that, due to caching, changing the base classes of an existing class may not change what implementation is selected the next time the generic function is invoked with an argument of that type or a subclass thereof.)
From ncoghlan at gmail.com Sun May 26 03:37:03 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 May 2013 11:37:03 +1000 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) In-Reply-To: References: <16DDD0AB-8274-48F0-AB48-05BC7117204C@langa.pl> Message-ID: On Sun, May 26, 2013 at 9:07 AM, PJ Eby wrote: > On Sat, May 25, 2013 at 4:16 PM, ?ukasz Langa wrote: >> So, the latest document is live: >> http://www.python.org/dev/peps/pep-0443/ >> >> The code is here: >> http://hg.python.org/features/pep-443/file/tip/Lib/functools.py#l363 Hmm, I find the use of the variable name "dispatch_cache" for a cache that dispatch() doesn't actually use to be confusing. It also doesn't make sense to me that dispatch() itself bypasses the cache - I would expect all the cache manipulation to be in dispatch(), and there to be a separate "_find_impl()" function that is invoked to handle cache misses. If there's a good reason for dispatch() to bypass the cache without refreshing it, then I suggest renaming the cache variable to "impl_cache". > We should probably also standardize on the term used for the > registered functions. The standard terminology is "method", but that > would be confusing in Python, where methods usually have a self > argument. The PEP uses the term "implementation", and I think that > actually makes a lot of sense: a generic function is composed of > functions that implement the same operation for different types. +1 on consistently using "implementation" as the name for individual functions that are passed to register() and that dispatch() may return. It's also consistent with the terminology we (or at least I) tend to use about finding the implementation of methods based on the type hierarchy. Also +1 to your other docs comments - that info is in the PEP, and is relevant to actually using the new generics in practice. Cheers, Nick. 
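The split suggested above (all cache handling in dispatch(), with a separate uncached _find_impl() for misses) might look roughly like the sketch below. It is not the reference implementation: it assumes the public abc.get_cache_token() that later appeared in Python 3.4 in place of the private counter, and make_generic, Shape, Circle and the simplified lookup rule in _find_impl are all invented for illustration.

```python
from abc import ABC, get_cache_token
from weakref import WeakKeyDictionary


def make_generic(default):
    """Return (dispatch, register) for a toy single-dispatch registry."""
    registry = {object: default}
    cache = WeakKeyDictionary()
    token = get_cache_token()

    def _find_impl(cls):
        # Uncached lookup. Simplified rule (an assumption of this sketch):
        # real classes in the MRO win, then any registered ABC that the
        # class is a (possibly virtual) subclass of, then the default.
        for base in cls.__mro__[:-1]:      # skip the trailing `object`
            if base in registry:
                return registry[base]
        for typ, impl in registry.items():
            if typ is not object and issubclass(cls, typ):
                return impl
        return registry[object]

    def dispatch(cls):
        nonlocal token
        current = get_cache_token()
        if current != token:               # some ABC gained a registration
            cache.clear()
            token = current
        try:
            return cache[cls]
        except KeyError:
            impl = cache[cls] = _find_impl(cls)
            return impl

    def register(typ, func):
        registry[typ] = func
        cache.clear()                      # cached choices may now be stale
        return func

    return dispatch, register


class Shape(ABC):      # hypothetical ABC
    pass


class Circle:          # not a Shape until registered below
    pass


dispatch, register = make_generic(lambda obj: "default")
register(Shape, lambda obj: "shape")

before = dispatch(Circle)(Circle())   # lookup result is cached
Shape.register(Circle)                # changes the ABC cache token
after = dispatch(Circle)(Circle())    # cache is cleared, lookup reruns
```

Keeping the token check inside dispatch() means every caller, including the generic function's own wrapper, sees a consistent cache.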
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Sun May 26 04:22:16 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 26 May 2013 12:22:16 +1000 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) In-Reply-To: References: <16DDD0AB-8274-48F0-AB48-05BC7117204C@langa.pl> Message-ID: <51A171D8.2080209@pearwood.info> On 26/05/13 09:07, PJ Eby wrote: > """ > Transforms a function into a single-dispatch generic function. A **generic > function** is composed of multiple functions implementing the same > operation for different types. Which > implementation should be used during a call is determined by the > dispatch algorithm. > When the implementation is chosen based on the type of a single argument, > this is known as **single dispatch**. > > Adding an implementation to a generic function is achieved by using the > :func:`register` attribute of the generic function. The > :func:`register` attribute is a decorator, taking a type paramater Typo: /s/paramater/parameter/ > and decorating a function implementing the operation for that type.""" Otherwise, +1 on the doc changes suggested. Thanks PJ and ?ukasz for seeing this one through. -- Steven From robert.kern at gmail.com Sun May 26 06:19:08 2013 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 26 May 2013 00:19:08 -0400 Subject: [Python-Dev] __subclasses__() return order In-Reply-To: <20130525151823.38477393@fsol> References: <20130525151823.38477393@fsol> Message-ID: On 2013-05-25 09:18, Antoine Pitrou wrote: > > Hello, > > In http://bugs.python.org/issue17936, I proposed making tp_subclasses > (the internal container implementing object.__subclasses__) a dict. > This would make the return order of __subclasses__ completely > undefined, while it is right now slightly predictable. I have never seen > __subclasses__ actually used in production code, so I'm wondering > whether someone might be affected by such a change. 
I do use a package that does use __subclasses__ in production code, but the order is unimportant. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ncoghlan at gmail.com Sun May 26 06:36:28 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 May 2013 14:36:28 +1000 Subject: [Python-Dev] __subclasses__() return order In-Reply-To: <190C76A9-C252-470C-B93B-4FBFCF22FD50@langa.pl> References: <20130525151823.38477393@fsol> <20130525152658.577eab65@fsol> <20130525154525.5917817d@fsol> <190C76A9-C252-470C-B93B-4FBFCF22FD50@langa.pl> Message-ID: On Sat, May 25, 2013 at 11:49 PM, ?ukasz Langa wrote: > I guess I should explain myself more clearly: __subclasses__() already > computes its result on-the-fly (it must weed out dead weakrefs) (*). So > the visible behaviour of __subclasses__ wouldn't change, except for > ordering. > > > +1 > > Makes sense to me. As currently defined, you cannot rely on the item order > anyway. Another concurrence here - if any code in the world depends on __subclasses__ always returning entries in the exact order they happen to be returned in right now, I'm quite happy to declare that code implementation dependent and thus exempt from the normal backwards compatibility guarantees :) Cheers, Nick. From se8.and at gmail.com Sun May 26 11:48:50 2013 From: se8.and at gmail.com (=?ISO-8859-1?Q?S=E9bastien_Durand?=) Date: Sun, 26 May 2013 11:48:50 +0200 Subject: [Python-Dev] PEP 8 and function names Message-ID: Hi all, "There should be one-- and preferably only one --obvious way to do it." We all love this mantra. But one thing that often confuses people : function naming. The standard library is kind of inconsistent. Some functions are separated by underscores and others aren't. It's not intuitive and new pythonistas end up constantly reading the doc. 
(Time saving one char typing vs time guessing function names.) Would it be a good idea to clarify PEP 8 on this ? I mean for future libraries. Regards, S?bastien -------------- next part -------------- An HTML attachment was scrubbed... URL: From andriy.kornatskyy at live.com Sun May 26 12:05:31 2013 From: andriy.kornatskyy at live.com (Andriy Kornatskyy) Date: Sun, 26 May 2013 13:05:31 +0300 Subject: [Python-Dev] PEP 8 and function names In-Reply-To: References: Message-ID: PEP8 consistency is a question to the development team commitment. Nothing prevents you add pep8 checks to build process, contribute fixes. This inconsistency has been analyzed for various web frameworks recently: http://mindref.blogspot.com/2012/10/python-web-pep8-consistency.html No much in the list are paying attention to this... Andriy ________________________________ > Date: Sun, 26 May 2013 11:48:50 +0200 > From: se8.and at gmail.com > To: python-dev at python.org > Subject: [Python-Dev] PEP 8 and function names > > Hi all, > > "There should be one-- and preferably only one --obvious way to do it." > > We all love this mantra. > > But one thing that often confuses people : function naming. The > standard library is kind of inconsistent. Some functions are separated > by underscores and others aren't. It's not intuitive and new > pythonistas end up constantly reading the doc. (Time saving one char > typing vs time guessing function names.) > > Would it be a good idea to clarify PEP 8 on this ? I mean for future > libraries. > > Regards, > > S?bastien > From ncoghlan at gmail.com Sun May 26 12:34:46 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 26 May 2013 20:34:46 +1000 Subject: [Python-Dev] PEP 8 and function names In-Reply-To: References: Message-ID: On Sun, May 26, 2013 at 7:48 PM, S?bastien Durand wrote: > Hi all, > > "There should be one-- and preferably only one --obvious way to do it." > > We all love this mantra. 
> > But one thing that often confuses people : function naming. The standard > library is kind of inconsistent. Some functions are separated by underscores > and others aren't. It's not intuitive and new pythonistas end up constantly > reading the doc. (Time saving one char typing vs time guessing function > names.) > > Would it be a good idea to clarify PEP 8 on this ? I mean for future > libraries. As far as I am aware, there's nothing to clarify: new code should use underscores as word separators, code added to an existing module or based on existing API should follow the conventions of that module or API. This is what PEP 8 already says. The standard library is inconsistent because it's a 20 year old code base with severe backwards compatibility constraints, and much of it was written before there was even a PEP process, let alone PEP 8. We did do one wholesale conversion to PEP 8 compliance (for the threading module) and decided the cost/benefit ratio was too low to justify ever doing that again. We do have a general guideline requiring PEP 8 compliance for *new* modules added to the standard library. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Sun May 26 15:02:18 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sun, 26 May 2013 22:02:18 +0900 Subject: [Python-Dev] PEP 8 and function names In-Reply-To: References: Message-ID: <87a9nhuc6d.fsf@uwakimon.sk.tsukuba.ac.jp> Nick Coghlan writes: > threading module) and decided the cost/benefit ratio was too low to > justify ever doing that again. I think you just failed Econ 101, Nick. I-teach-that-s**t-for-a-living-ly y'rs, P.S. Of course we all understood what you meant. 
:-) From breamoreboy at yahoo.co.uk Sun May 26 15:18:55 2013 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Sun, 26 May 2013 14:18:55 +0100 Subject: [Python-Dev] PEP 8 and function names In-Reply-To: <87a9nhuc6d.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87a9nhuc6d.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 26/05/2013 14:02, Stephen J. Turnbull wrote: > Nick Coghlan writes: > > > threading module) and decided the cost/benefit ratio was too low to > > justify ever doing that again. > > I think you just failed Econ 101, Nick. > > I-teach-that-s**t-for-a-living-ly y'rs, > > P.S. Of course we all understood what you meant. :-) > Yet another reference to Orwell's worst room in the world, what does this imply about Python? :) -- If you're using GoogleCrap? please read this http://wiki.python.org/moin/GoogleGroupsPython. Mark Lawrence From hodgestar+pythondev at gmail.com Mon May 27 00:45:03 2013 From: hodgestar+pythondev at gmail.com (Simon Cross) Date: Mon, 27 May 2013 00:45:03 +0200 Subject: [Python-Dev] __subclasses__() return order In-Reply-To: <20130525151823.38477393@fsol> References: <20130525151823.38477393@fsol> Message-ID: I've used __subclasses__ as an easy way to register components by sub-classing a base component. I didn't rely on the ordering. I guess the current order depends on the order in which modules are imported and so is pretty fragile anyway? From lukasz at langa.pl Mon May 27 14:57:55 2013 From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=) Date: Mon, 27 May 2013 14:57:55 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) In-Reply-To: References: <16DDD0AB-8274-48F0-AB48-05BC7117204C@langa.pl> Message-ID: On 26 maj 2013, at 01:07, PJ Eby wrote: > The PEP uses the term "implementation", and I think that > actually makes a lot of sense: a generic function is composed of > functions that implement the same operation for different types. All suggested changes applied. 
There are still a couple of mentions of "overloads" and "overloading" in the PEP but they are unambiguous now and always refer to the general mechanism.

> Last, but not least, there should be a stacking example somewhere in
> the doc, as in the PEP

I swapped the old examples from the docs and reused the PEP API docs in their entirety. This way it's easier to keep things consistent.

> (It may also be useful to note somewhere that, due to caching,
> changing the base classes of an existing class may not change what
> implementation is selected the next time the generic function is
> invoked with an argument of that type or a subclass thereof.)

I don't think it's necessary. Abstract base classes present the same behaviour and this isn't documented anywhere:

>>> from abc import ABC
>>> class FirstABC(ABC): pass
>>> class SecondABC(ABC): pass
>>> class ImplementsFirst(FirstABC): pass
>>> assert FirstABC in ImplementsFirst.__mro__
>>> assert issubclass(ImplementsFirst, FirstABC)

If we change bases of the class, it no longer reports the first in the MRO:

>>> ImplementsFirst.__bases__ = (SecondABC,)
>>> assert FirstABC not in ImplementsFirst.__mro__
>>> assert SecondABC in ImplementsFirst.__mro__
>>> assert issubclass(ImplementsFirst, SecondABC)

But it still reports being a subclass:

>>> assert issubclass(ImplementsFirst, FirstABC), "sic!"
-- Best regards, ?ukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev From lukasz at langa.pl Mon May 27 15:31:26 2013 From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=) Date: Mon, 27 May 2013 15:31:26 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) In-Reply-To: References: <16DDD0AB-8274-48F0-AB48-05BC7117204C@langa.pl> Message-ID: <8152F6F7-B20D-4555-8BBD-073BD88D6E53@langa.pl> On 26 maj 2013, at 03:37, Nick Coghlan wrote: > On Sun, May 26, 2013 at 9:07 AM, PJ Eby wrote: >> On Sat, May 25, 2013 at 4:16 PM, ?ukasz Langa wrote: >>> So, the latest document is live: >>> http://www.python.org/dev/peps/pep-0443/ >>> >>> The code is here: >>> http://hg.python.org/features/pep-443/file/tip/Lib/functools.py#l363 > > Hmm, I find the use of the variable name "dispatch_cache" for a cache > that dispatch() doesn't actually use to be confusing. Why? It's a cache for dispatches, hence "dispatch_cache". It might not be obvious at first, unless you're Polish ;) > It also doesn't make sense to me that dispatch() itself bypasses the > cache - I would expect all the cache manipulation to be in dispatch(), > and there to be a separate "_find_impl()" function that is invoked to > handle cache misses. This is exactly what I did now. I also exposed ._clear_cache() and the uncached ._find_impl() if somebody finds it necessary to use it. Both are left undocumented. -- Best regards, ?ukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev From skip at pobox.com Mon May 27 18:47:15 2013 From: skip at pobox.com (Skip Montanaro) Date: Mon, 27 May 2013 11:47:15 -0500 Subject: [Python-Dev] PEP 8 and function names In-Reply-To: References: Message-ID: > But one thing that often confuses people : function naming. The standard > library is kind of inconsistent. Some functions are separated by underscores > and others aren't. 
I think there are a number of reasons for this: * Despite PEP 8's age, significant chunks of the standard library predate it * Modules which are thin wrappers around various C libraries tend to mimic those libraries' names * Modules which were heavily influenced by similar libraries from other languages often carry over spelling * PEP 8 hasn't always been a checklist item for inclusion (not sure it even is today) * Sometimes Cerberus was sleeping, and they snuck past him In any case, once a module makes it into the standard library, the cost of changing spelling outweighs the benefits of slavish adherence to PEP 8. Skip From a.badger at gmail.com Mon May 27 20:38:36 2013 From: a.badger at gmail.com (Toshio Kuratomi) Date: Mon, 27 May 2013 11:38:36 -0700 Subject: [Python-Dev] Bilingual scripts In-Reply-To: References: <20130524155629.7597bdb0@anarchist> Message-ID: <20130527183836.GG2038@unaka.lan> On Sat, May 25, 2013 at 05:57:28PM +1000, Nick Coghlan wrote: > On Sat, May 25, 2013 at 5:56 AM, Barry Warsaw wrote: > > Have any other *nix distros addressed this, and if so, how do you solve it? > > I believe Fedora follows the lead set by our own makefile and just > appends a "3" to the script name when there is also a Python 2 > equivalent (thus ``pydoc3`` and ``pyvenv``). (I don't have any other > system provided Python 3 scripts on this machine, though) > Fedora is a bit of a mess... we try to work with upstream's intent when upstream has realized this problem exists and have a single standard when upstream does not. The full guidelines are here: http://fedoraproject.org/wiki/Packaging:Python#Naming Here's the summary: * If the scripts don't care whether they're running on py2 or py3, just use the base name and choose python2 as the interpreter for now (since we can't currently get rid of python2 on an end user system, that is the choice that brings in less dependencies). 
ex: /usr/bin/pygmentize * If the script does two different things depending on python2 or python3 being the interpreter (note: this includes both bilingual scripts and scripts which have been modified by 2to3/exist in two separate versions) then we have to look at what upstream is doing: - If upstream already deals with it (ex: pydoc3, easy_install-3.1) then we use upstream's name. We don't love this from an inter-package consistency standpoint as there are other packages which append a version for their own usage (is /usr/bin/foo-3.4 for python-3.4 or the 3.4 version of the foo package?) (And we sometimes have to do this locally if we need to have multiple versions of a package with the multiple versions having scripts...) We decided to use upstream's name if they account for this issue because it will match with upstream's documentation and nothing else seemed as important in this instance. - If upstream doesn't deal with it, then we use a "python3-" prefix. This matches with our package naming so it seemed to make sense. (But Barry's point about locate and tab completion and such would be a reason to revisit this... Perhaps standardizing on /usr/bin/foo2-python3 [pathological case of having both package version and interpreter version in the name.]) - (tangent from a different portion of this thread: we've found that this is a larger problem than we would hope. There are some obvious ones like - ipython (implements a python interpreter so python2 vs python3 is understandably important and different). - nosetests (the python source being operated on is run through the python interpreter so the version has to match). - easy_install (needs to install python modules to the correct interpreter's site-packages. It decides the correct interpreter according to which interpreter invoked it.) But recently we found a new class of problems: frameworks which are bilingual.
For instance, if you have a web framework which has a /usr/bin/django-admin script that can be used to quickstart a project, run a python shell and automatically load your code, load your ORM db schema and operate on it to make modifications to the db, then that script has to know whether your code is compatible with python2 or python3. > > It would be nice if we could have some cross-platform recommendations so > > things work the same wherever you go. To that end, if we can reach some > > consensus, I'd be willing to put together an informational PEP and some > > scripts that might be of general use. > > It seems to me the existing recommendation to use ``#!/usr/bin/env > python`` instead of referencing a particular binary already covers the > general case. The challenge for the distros is that we want a solution > that *ignores* user level virtual environments. > > I think the simplest thing to do is just append the "3" to the binary > name (as we do ourselves for pydoc) and then abide by the > recommendations in PEP 394 to reference the correct system executable. > I'd rather not have a bare 3 for the issues noted above. Something like py3 would be better. There's still room for confusion when distributions have to push multiple versions of a package with scripts that fall into this category. Should the format be: /usr/bin/foo2-py3 (My preference as it places the version next to the thing that it's a version of.) or /usr/bin/foo-py3-2 (Confusing as the 2 is bare. Something like /usr/bin/foo-py3-v2 is slightly better but still not as nice as the previous IMHO) -Toshio -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From lukasz at langa.pl Mon May 27 23:28:43 2013 From: lukasz at langa.pl (Łukasz Langa) Date: Mon, 27 May 2013 23:28:43 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) In-Reply-To: <8152F6F7-B20D-4555-8BBD-073BD88D6E53@langa.pl> References: <16DDD0AB-8274-48F0-AB48-05BC7117204C@langa.pl> <8152F6F7-B20D-4555-8BBD-073BD88D6E53@langa.pl> Message-ID: <36BAE202-D296-4118-AD2B-48889540983D@langa.pl> On 27 May 2013, at 15:31, Łukasz Langa wrote: > This is exactly what I did now. I also exposed ._clear_cache() and the > uncached ._find_impl() if somebody finds it necessary to use it. Both > are left undocumented. For the record, I moved _find_impl out of the closure for easier testability. I also simplified it a bit as the results are cached anyway. For the most common case where the function argument is of a type that's directly registered, _find_impl isn't even called now. Anyhow, no remaining issues. Somebody call the BDFL. -- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev From benjamin at python.org Tue May 28 08:14:03 2013 From: benjamin at python.org (Benjamin Peterson) Date: Mon, 27 May 2013 23:14:03 -0700 Subject: [Python-Dev] [Python-checkins] cpython (merge 3.3 -> default): Merge with 3.3 In-Reply-To: <3bKHfN4rr2z7LmQ@mail.python.org> References: <3bKHfN4rr2z7LmQ@mail.python.org> Message-ID: 2013/5/27 terry.reedy : > http://hg.python.org/cpython/rev/c5d4c041ab47 > changeset: 83942:c5d4c041ab47 > parent: 83940:2ea849fde22b > parent: 83941:24c3e7e08168 > user: Terry Jan Reedy > date: Mon May 27 21:33:40 2013 -0400 > summary: > Merge with 3.3 > > files: > Lib/idlelib/CallTips.py | 4 +- > Lib/idlelib/PathBrowser.py | 3 +- > Lib/idlelib/idle_test/@README.txt | 63 +++++++++++ Is @README really the intended name of this file?
Would README-TEST or something similar be better? -- Regards, Benjamin From ncoghlan at gmail.com Tue May 28 14:15:25 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 28 May 2013 22:15:25 +1000 Subject: [Python-Dev] Structural cleanups to the main CPython repo Message-ID: I have a feature branch where I'm intermittently working on the bootstrapping changes described in PEP 432. As part of those changes, I've cleaned up a few aspects of the repo layout: * moved the main executable source file from Modules to a separate Apps directory * moved the _freezeimportlib and _testembed source files from Modules to a separate Tools directory * split the monster pythonrun.h/c pair into 3 separate header/impl pairs: * bootstrap.h/bootstrap.c * shutdown.h/shutdown.c * pythonrun.h/pythonrun.c These structural changes generally mean automatic merges touching the build machinery or the startup or shutdown code fail fairly spectacularly and need a lot of TLC to complete them without losing any changes from the main repo. Would anyone object if I went ahead and posted patches for making these changes to the main repo? I found they made the code *much* easier to follow when I started to turn the ideas in PEP 432 into working software, and implementing these shifts should make future merges to my feature branch simpler, as well as producing significantly cleaner diffs when PEP 432 gets closer to completion. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Tue May 28 14:31:17 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 28 May 2013 14:31:17 +0200 Subject: [Python-Dev] Structural cleanups to the main CPython repo References: Message-ID: <20130528143117.516cc583@pitrou.net> On Tue, 28 May 2013 22:15:25 +1000, Nick Coghlan wrote: > I have a feature branch where I'm intermittently working on the > bootstrapping changes described in PEP 432.
> > As part of those changes, I've cleaned up a few aspects of the repo > layout: > > * moved the main executable source file from Modules to a separate > Apps directory Sounds fine (I don't like "Apps" much, but hey :-)). > * moved the _freezeimportlib and _testembed source files from Modules > to a separate Tools directory Well, they should probably go to Apps too, no? > * split the monster pythonrun.h/c pair into 3 separate header/impl > pairs: > * bootstrap.h/bootstrap.c > * shutdown.h/shutdown.c > * pythonrun.h/pythonrun.c I don't think separating bootstrap from shutdown is a good idea. They are quite closely related since one undoes what the other did (and they may also use shared private functions or data). I don't know what goes in the remaining "pythonrun.c", could you detail a bit? Regards Antoine. From barry at python.org Tue May 28 14:33:41 2013 From: barry at python.org (Barry Warsaw) Date: Tue, 28 May 2013 08:33:41 -0400 Subject: [Python-Dev] Structural cleanups to the main CPython repo In-Reply-To: References: Message-ID: <20130528083341.76293567@anarchist> On May 28, 2013, at 10:15 PM, Nick Coghlan wrote: >Would anyone object if I went ahead and posted patches for making >these changes to the main repo? When you say "post[ed] patches", do you mean you want to put them some place for us to review? If so, sure, go ahead of course. -Barry From storchaka at gmail.com Tue May 28 15:02:00 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 28 May 2013 16:02:00 +0300 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: <20130520174638.12fae7ee@fsol> References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520174638.12fae7ee@fsol> Message-ID: 20.05.13 18:46, Antoine Pitrou wrote: > I think it is a legitimate case where to silence the original > exception. However, the binascii.Error would be more informative if it > said *which* non-base32 digit was encountered.
Please open a new issue for this request (note that no other binascii or base64 functions provide such information). From ncoghlan at gmail.com Tue May 28 15:07:37 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 28 May 2013 23:07:37 +1000 Subject: [Python-Dev] Structural cleanups to the main CPython repo In-Reply-To: <20130528143117.516cc583@pitrou.net> References: <20130528143117.516cc583@pitrou.net> Message-ID: On Tue, May 28, 2013 at 10:31 PM, Antoine Pitrou wrote: > On Tue, 28 May 2013 22:15:25 +1000, > Nick Coghlan wrote: >> I have a feature branch where I'm intermittently working on the >> bootstrapping changes described in PEP 432. >> >> As part of those changes, I've cleaned up a few aspects of the repo >> layout: >> >> * moved the main executable source file from Modules to a separate >> Apps directory > > Sounds fine (I don't like "Apps" much, but hey :-)). Unfortunately, I don't know any other short word for "things with main functions that we ship to end users" :) >> * moved the _freezeimportlib and _testembed source files from Modules >> to a separate Tools directory > > Well, they should probably go to Apps too, no? I wanted to split out "part of the build/test infrastructure" from "shipped to end users", but I could also live with a simple "Bin" directory that contained both kinds of executable. >> * split the monster pythonrun.h/c pair into 3 separate header/impl >> pairs: >> * bootstrap.h/bootstrap.c >> * shutdown.h/shutdown.c >> * pythonrun.h/pythonrun.c > > I don't think separating bootstrap from shutdown is a good idea. They > are quite closely related since one undoes what the other did (and they > may also use shared private functions or data). It was deliberate - a big part of PEP 432 is making sure that all the interpreter state lives *in* the interpreter state (as part of the config struct).
Splitting the two into separate compilation modules makes it possible to ensure that all communication goes via the interpreter configuration (statics in other modules are still a problem, but also mostly out of scope for PEP 432). I *really* want to get us to clean phase separation of "the interpreter is starting up", "the interpreter is running normally" and "the interpreter is shutting down". I found that to be incredibly difficult to do when they were all intermixed in one file, which is why I decided to enlist the compiler's help by separating them. > I don't know what goes > in the remaining "pythonrun.c", could you detail a bit? While they have some of the PEP 432 changes in them, the header files in the branch give the general flavour of the separation: Bootstrap is mostly get/init type functions: https://bitbucket.org/ncoghlan/cpython_sandbox/src/ae7fef62b462fb6b559172bd4dbefc185ec28c40/Include/bootstrap.h?at=pep432_modular_bootstrap Pythonrun is mostly PyRun_*, PyParser_*, Py_Compile* and a few other odds and ends: https://bitbucket.org/ncoghlan/cpython_sandbox/src/ae7fef62b462fb6b559172bd4dbefc185ec28c40/Include/pythonrun.h?at=pep432_modular_bootstrap Shutdown covers the various finalisers, atexit handling, etc: https://bitbucket.org/ncoghlan/cpython_sandbox/src/ae7fef62b462fb6b559172bd4dbefc185ec28c40/Include/shutdown.h?at=pep432_modular_bootstrap Cheers, Nick. 
From a.cavallo at cavallinux.eu Tue May 28 15:13:10 2013 From: a.cavallo at cavallinux.eu (a.cavallo at cavallinux.eu) Date: Tue, 28 May 2013 15:13:10 +0200 Subject: [Python-Dev] Structural cleanups to the main CPython repo In-Reply-To: References: <20130528143117.516cc583@pitrou.net> Message-ID: <91c2d05eb05358e48f0997570368d252@cavallinux.eu> >>> As part of those changes, I've cleaned up a few aspects of the repo >>> layout: >>> >>> * moved the main executable source file from Modules to a separate >>> Apps directory >> Do you mean things that go into the shared library (libpythonXX/pythonXX.dll) vs executables? From fred at fdrake.net Tue May 28 15:25:54 2013 From: fred at fdrake.net (Fred Drake) Date: Tue, 28 May 2013 09:25:54 -0400 Subject: [Python-Dev] Structural cleanups to the main CPython repo In-Reply-To: References: <20130528143117.516cc583@pitrou.net> Message-ID: On Tue, May 28, 2013 at 9:07 AM, Nick Coghlan wrote: > Unfortunately, I don't know any other short word for "things with main > functions that we ship to end users" :) We used to call such things "programs", but that term may no longer be in popular parlance. :-) Or is it just too long? -Fred -- Fred L. Drake, Jr. "A storm broke loose in my mind." --Albert Einstein From benjamin at python.org Tue May 28 15:31:08 2013 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 28 May 2013 06:31:08 -0700 Subject: [Python-Dev] Structural cleanups to the main CPython repo In-Reply-To: References: <20130528143117.516cc583@pitrou.net> Message-ID: 2013/5/28 Nick Coghlan : > On Tue, May 28, 2013 at 10:31 PM, Antoine Pitrou wrote: >> Le Tue, 28 May 2013 22:15:25 +1000, >> Nick Coghlan a ?crit : >>> I have a feature branch where I'm intermittently working on the >>> bootstrapping changes described in PEP 432. 
>>> >>> As part of those changes, I've cleaned up a few aspects of the repo >>> layout: >>> >>> * moved the main executable source file from Modules to a separate >>> Apps directory >> >> Sounds fine (I don't like "Apps" much, but hey :-)). > > Unfortunately, I don't know any other short word for "things with main > functions that we ship to end users" :) "Bin" is quite common (if ironic). I think it would be fine too if that stuff was in Python/; anywhere is better than Modules. (Care to move the GC, too?) -- Regards, Benjamin From storchaka at gmail.com Tue May 28 16:03:45 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Tue, 28 May 2013 17:03:45 +0300 Subject: [Python-Dev] Structural cleanups to the main CPython repo In-Reply-To: References: <20130528143117.516cc583@pitrou.net> Message-ID: 28.05.13 16:07, Nick Coghlan wrote: > On Tue, May 28, 2013 at 10:31 PM, Antoine Pitrou wrote: >> On Tue, 28 May 2013 22:15:25 +1000, >> Nick Coghlan wrote: >>> * moved the main executable source file from Modules to a separate >>> Apps directory >> Sounds fine (I don't like "Apps" much, but hey :-)). > Unfortunately, I don't know any other short word for "things with main > functions that we ship to end users" :) main From ncoghlan at gmail.com Tue May 28 16:26:11 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 29 May 2013 00:26:11 +1000 Subject: [Python-Dev] Structural cleanups to the main CPython repo In-Reply-To: References: <20130528143117.516cc583@pitrou.net> Message-ID: On Wed, May 29, 2013 at 12:03 AM, Serhiy Storchaka wrote: > 28.05.13 16:07, Nick Coghlan wrote: >> >> On Tue, May 28, 2013 at 10:31 PM, Antoine Pitrou >> wrote: >>> >>> On Tue, 28 May 2013 22:15:25 +1000, >>> Nick Coghlan wrote: >>>> >>>> * moved the main executable source file from Modules to a separate >>>> Apps directory >>> >>> Sounds fine (I don't like "Apps" much, but hey :-)).
>> >> Unfortunately, I don't know any other short word for "things with main >> functions that we ship to end users" :) > > main IIRC, the reason I avoided that originally was due to the potential confusion between C's main and Python's main. I don't know why I didn't think of Fred's suggestion of "Programs" - I think that contrasts nicely with Modules, so I'd like to run with that. Cleanly separating out the main functions affected the PEP 432 feature branch directly because the whole point of that PEP is to make all of them simpler by moving more of the relevant code into the shared library. However, I really *don't* want to dive into the seemingly random allocation of some things between the Python/ subdir and the Modules/ subdir . If there's a consistent pattern there, I think it may be lost somewhere back in the 20th century, as I've never been able to figure one out... Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From barry at python.org Tue May 28 16:36:19 2013 From: barry at python.org (Barry Warsaw) Date: Tue, 28 May 2013 10:36:19 -0400 Subject: [Python-Dev] PEP 8 and function names In-Reply-To: References: Message-ID: <20130528103619.442c059a@anarchist> On May 26, 2013, at 08:34 PM, Nick Coghlan wrote: >As far as I am aware, there's nothing to clarify: new code should use >underscores as word separators, code added to an existing module or >based on existing API should follow the conventions of that module or >API. This is what PEP 8 already says. Exactly so. -Barry From barry at python.org Tue May 28 17:35:00 2013 From: barry at python.org (Barry Warsaw) Date: Tue, 28 May 2013 11:35:00 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <20130524202358.C57F9250BDB@webabinitio.net> References: <20130524155629.7597bdb0@anarchist> <20130524202358.C57F9250BDB@webabinitio.net> Message-ID: <20130528113500.4d406948@anarchist> On May 24, 2013, at 04:23 PM, R. 
David Murray wrote: >Gentoo has a (fairly complex) driver script that is symlinked to all >of these bin scripts. The system then has the concept of the >"current python", which can be set to python2 or python3. The default >bin then calls the current default interpreter. There are also >xxx2 and xxx3 versions of each bin script, which call the 'current' >version of python2 or python3, respectively. > >I'm sure one of the gentoo devs on this list can speak to this more >completely...I'm just a user :) But I must say that the system works >well from my point of view. Interesting approach, but it doesn't seem to me to be fundamentally different than the BPOS (big pile o' symlinks). Over in Debian-land one of the interesting points against a driver script was that folks like to be able to explicitly override the shebang line interpreter, e.g.

    $ head /usr/bin/foo
    #! /usr/bin/python3 -Es
    $ python3.4 /usr/bin/foo
    ...

One other person mentioned they like to be able to execfile() - or the Python 3 moral equivalent - the /usr/bin script, which obviously would be harder with a sh or binary driver script. -Barry From rdmurray at bitdance.com Tue May 28 17:41:23 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Tue, 28 May 2013 11:41:23 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <20130528113500.4d406948@anarchist> References: <20130524155629.7597bdb0@anarchist> <20130524202358.C57F9250BDB@webabinitio.net> <20130528113500.4d406948@anarchist> Message-ID: <20130528154124.3F820250498@webabinitio.net> On Tue, 28 May 2013 11:35:00 -0400, Barry Warsaw wrote: > On May 24, 2013, at 04:23 PM, R. David Murray wrote: > > >Gentoo has a (fairly complex) driver script that is symlinked to all > >of these bin scripts. The system then has the concept of the > >"current python", which can be set to python2 or python3. The default > >bin then calls the current default interpreter.
There are also > >xxx2 and xxx3 versions of each bin script, which call the 'current' > >version of python2 or python3, respectively. > > > >I'm sure one of the gentoo devs on this list can speak to this more > >completely...I'm just a user :) But I must say that the system works > >well from my point of view. > > Interesting approach, but it doesn't seem to me to be fundamentally different > than the BPOS (big pile o' symlinks). > > Over in Debian-land one of the interesting points against a driver script was > that folks like to be able to explicitly override the shebang line > interpreter, e.g. > > $ head /usr/bin/foo > #! /usr/bin/python3 -Es > $ python3.4 /usr/bin/foo > ... > > One other person mentioned they like to be able to execfile() - or the Python > 3 moral equivalent - the /usr/bin script, which obviously would be harder with a > sh or binary driver script. True. Another big disadvantage is that you can't just look in the file to find out what it is doing, which I *do* find to be a significant drawback. I have the same complaint about setuptools entry-point scripts, where I still haven't figured out how to go from what is in the file to the code that actually gets called. --David From barry at python.org Tue May 28 17:45:33 2013 From: barry at python.org (Barry Warsaw) Date: Tue, 28 May 2013 11:45:33 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: References: <20130524155629.7597bdb0@anarchist> Message-ID: <20130528114533.5cda74c9@anarchist> On May 25, 2013, at 05:57 PM, Nick Coghlan wrote: >It seems to me the existing recommendation to use ``#!/usr/bin/env >python`` instead of referencing a particular binary already covers the >general case. The challenge for the distros is that we want a solution >that *ignores* user level virtual environments. Right. My general recommendation is that upstream's (development version) scripts use #! /usr/bin/env, but that distros and possibly even virtualenv/buildout installs, hardcode the #!
to a specific interpreter. We've just had way too many problems when a /usr/bin script uses /usr/bin/env and breaks the world. We also recommend using -Es to isolate the environment as much as possible. -Barry From barry at python.org Tue May 28 17:47:26 2013 From: barry at python.org (Barry Warsaw) Date: Tue, 28 May 2013 11:47:26 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <1369477026.2673.178.camel@thinko> References: <20130524155629.7597bdb0@anarchist> <1369477026.2673.178.camel@thinko> Message-ID: <20130528114726.6975558d@anarchist> On May 25, 2013, at 06:17 AM, Chris McDonough wrote: >I'm curious if folks have other concrete examples of global bindir >executables other than nosetests and pydoc that need to be disambiguated >by Python version. I'd hate to see it become standard practice to >append "3" to scripts generated by packages which happen to use Python >3, as it will just sort of perpetuate its otherness. tox https://bitbucket.org/hpk42/tox/issue/96/cant-have-a-python-3-setuppy -Barry From v+python at g.nevcal.com Tue May 28 17:49:09 2013 From: v+python at g.nevcal.com (Glenn Linderman) Date: Tue, 28 May 2013 08:49:09 -0700 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520174638.12fae7ee@fsol> Message-ID: <51A4D1F5.70302@g.nevcal.com> On 5/28/2013 6:02 AM, Serhiy Storchaka wrote: > 20.05.13 18:46, Antoine Pitrou wrote: >> I think it is a legitimate case where to silence the original >> exception. However, the binascii.Error would be more informative if it >> said *which* non-base32 digit was encountered.
> > Please open a new issue for this request (note that no other binascii > or base64 functions provide such information). Sounds like perhaps multiple issues would be useful to suggest enhancements for the error messages provided by other binascii and base64 functions. From tseaver at palladion.com Tue May 28 18:17:49 2013 From: tseaver at palladion.com (Tres Seaver) Date: Tue, 28 May 2013 12:17:49 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <20130528154124.3F820250498@webabinitio.net> References: <20130524155629.7597bdb0@anarchist> <20130524202358.C57F9250BDB@webabinitio.net> <20130528113500.4d406948@anarchist> <20130528154124.3F820250498@webabinitio.net> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/28/2013 11:41 AM, R. David Murray wrote: > I have the same complaint about setuptools entry-point scripts, where > I still haven't figured out how to go from what is in the file to the > code that actually gets called. Hmm, just dump the 'entry_points.txt' file in the named distribution's EGG-INFO directory? E.g.:

    $ cat bin/pip
    #!/path/to/virtualenv/bin/pythonX.Y
    # EASY-INSTALL-ENTRY-SCRIPT: 'pip==1.3.1','console_scripts','pip'
    __requires__ = 'pip==1.3.1'
    import sys
    from pkg_resources import load_entry_point

    if __name__ == '__main__':
        sys.exit(
            load_entry_point('pip==1.3.1', 'console_scripts', 'pip')()
        )

    $ cat lib/pythonX.Y/site-packages/pip-1.3.1-pyX.Y.egg/EGG-INFO/entry_points.txt
    [console_scripts]
    pip = pip:main
    pip-X.Y = pip:main

Tres.
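As a rough illustration of the lookup described above, here is a minimal sketch that goes from entry_points.txt-style text to the callable a script would end up invoking. It is not how setuptools or pkg_resources implement `load_entry_point`; the helper names `parse_console_scripts` and `resolve` are hypothetical, it uses only the standard library, and it assumes the simple `name = module:attr` form shown in the listing (no extras, no version requirements).

```python
import configparser
import importlib


def parse_console_scripts(text):
    """Return {script_name: 'module:attr'} from entry_points.txt-style text.

    Hypothetical helper for illustration; only the [console_scripts]
    section is read.
    """
    # Restrict the delimiter to "=" so the ":" in "pip:main" is never
    # mistaken for a key/value separator.
    parser = configparser.ConfigParser(delimiters=("=",))
    parser.optionxform = str  # keep script names case-sensitive
    parser.read_string(text)
    if not parser.has_section("console_scripts"):
        return {}
    return dict(parser.items("console_scripts"))


def resolve(target):
    """Import 'pkg.module:attr.subattr' and return the object it names."""
    module_name, _, attr_path = target.partition(":")
    obj = importlib.import_module(module_name.strip())
    if attr_path:
        # Walk dotted attributes, e.g. "SomeClass.method".
        for part in attr_path.strip().split("."):
            obj = getattr(obj, part)
    return obj


if __name__ == "__main__":
    sample = "[console_scripts]\npip = pip:main\npip-X.Y = pip:main\n"
    for name, target in parse_console_scripts(sample).items():
        print(name, "->", target)
    # Resolving against a standard-library target, so the demo does not
    # depend on pip being importable.
    print(resolve("json:dumps"))
```

Read this way, `pip = pip:main` says that the `pip` script calls the `main` object in the `pip` module, which is where to look for the code that actually runs.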
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iEYEARECAAYFAlGk2K0ACgkQ+gerLs4ltQ6WaACZAbdz7k3sdM21DNx0mzcecY93 hvYAoJTwA2l3OvSoYStzGmsJ+N16JDwM =YHcy -----END PGP SIGNATURE----- From solipsis at pitrou.net Tue May 28 18:20:18 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 28 May 2013 18:20:18 +0200 Subject: [Python-Dev] Structural cleanups to the main CPython repo References: <20130528143117.516cc583@pitrou.net> Message-ID: <20130528182018.41d53a45@pitrou.net> On Tue, 28 May 2013 23:07:37 +1000, Nick Coghlan wrote: > > It was deliberate - a big part of PEP 432 is making sure that all the > interpreter state lives *in* the interpreter state (as part of the > config struct). Splitting the two into separate compilation modules > makes it possible to ensure that all communication goes via the > interpreter configuration (statics in other modules are still a > problem, but also mostly out of scope for PEP 432). > > I *really* want to get us to clean phase separation of "the > interpreter is starting up", "the interpreter is running normally" and > "the interpreter is shutting down". I found that to be incredibly > difficult to do when they were all intermixed in one file, which is > why I decided to enlist the compiler's help by separating them. It sounds a bit exaggerated. We have encoders and decoders in the same (C) modules, compressors and decompressors ditto. Why not keep initialization and finalization in the same source file too? (how long are the resulting C files?) > > I don't know what goes > > in the remaining "pythonrun.c", could you detail a bit?
> > While they have some of the PEP 432 changes in them, the header files > in the branch give the general flavour of the separation: > > Bootstrap is mostly get/init type functions: > https://bitbucket.org/ncoghlan/cpython_sandbox/src/ae7fef62b462fb6b559172bd4dbefc185ec28c40/Include/bootstrap.h?at=pep432_modular_bootstrap > > Pythonrun is mostly PyRun_*, PyParser_*, Py_Compile* and a few other > odds and ends: > https://bitbucket.org/ncoghlan/cpython_sandbox/src/ae7fef62b462fb6b559172bd4dbefc185ec28c40/Include/pythonrun.h?at=pep432_modular_bootstrap > > Shutdown covers the various finalisers, atexit handling, etc: > https://bitbucket.org/ncoghlan/cpython_sandbox/src/ae7fef62b462fb6b559172bd4dbefc185ec28c40/Include/shutdown.h?at=pep432_modular_bootstrap The fact that PyXXX_Init() and PyXXX_Fini() end up in different header files looks like a red flag to me, modularization-wise. I agree with separating the PyRun_* stuff from the initialization/finalization routines, though. Regards Antoine. From martin at v.loewis.de Tue May 28 18:42:48 2013 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 28 May 2013 18:42:48 +0200 Subject: [Python-Dev] Structural cleanups to the main CPython repo In-Reply-To: References: <20130528143117.516cc583@pitrou.net> Message-ID: <51A4DE88.7010609@v.loewis.de> On 28.05.13 15:07, Nick Coghlan wrote: >> Sounds fine (I don't like "Apps" much, but hey :-)).
> > Unfortunately, I don't know any other short word for "things with main > functions that we ship to end users" :) Bike-sheddingly: POSIX calls them "commands and utilities": https://www2.opengroup.org/ogsys/catalog/c436 Regards, Martin From martin at v.loewis.de Tue May 28 18:47:47 2013 From: martin at v.loewis.de ("Martin v. Löwis") Date: Tue, 28 May 2013 18:47:47 +0200 Subject: [Python-Dev] Structural cleanups to the main CPython repo In-Reply-To: <20130528182018.41d53a45@pitrou.net> References: <20130528143117.516cc583@pitrou.net> <20130528182018.41d53a45@pitrou.net> Message-ID: <51A4DFB3.6080901@v.loewis.de> On 28.05.13 18:20, Antoine Pitrou wrote: > On Tue, 28 May 2013 23:07:37 +1000, > Nick Coghlan wrote: >> It was deliberate - a big part of PEP 432 is making sure that all the >> interpreter state lives *in* the interpreter state (as part of the >> config struct). > > It sounds a bit exaggerated. We have encoders and decoders in the same > (C) modules, compressors and decompressors ditto. Why not keep > initialization and finalization in the same source file too? I can sympathize with the motivation. Unlike encoders and decoders, it is *very* tempting to put interpreter state into global variables. With encoders and decoders, it's clear that globals won't work if you have multiple of them. With interpreter state, it's either singletons in the first place, or the globals can be swapped out when switching interpreters. By splitting initialization and finalization into distinct translation units, you make it much more difficult to introduce new "hidden" variables.
Regards, Martin From barry at python.org Tue May 28 19:22:01 2013 From: barry at python.org (Barry Warsaw) Date: Tue, 28 May 2013 13:22:01 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <20130527183836.GG2038@unaka.lan> References: <20130524155629.7597bdb0@anarchist> <20130527183836.GG2038@unaka.lan> Message-ID: <20130528132201.68ed6f98@anarchist> On May 27, 2013, at 11:38 AM, Toshio Kuratomi wrote: >Fedora is a bit of a mess... we try to work with upstream's intent when >upstream has realized this problem exists and have a single standard when >upstream does not. The full guidelines are here: > >http://fedoraproject.org/wiki/Packaging:Python#Naming Thanks. One of the reasons I've brought this up here is so that hopefully we can come up with recommendations for upstreams where this matters. One thing is for sure (IMO, anyway). Utilities that provide version-specific scripts should also provide -m invocation. E.g. there are various places where a package's tests (provided unittest, or other as-built tests) can be invoked. Where those might use nose, we recommend invoking them with `$python -m nose` instead of using nosetests-X.Y. This also makes it easier to loop over all the versions of Python available on the system (which might not be known statically). >- If upstream doesn't deal with it, then we use a "python3-" prefix. This > matches with our package naming so it seemed to make sense. (But > Barry's point about locate and tab completion and such would be a reason > to revisit this... Perhaps standardizing on /usr/bin/foo2-python3 > [pathological case of having both package version and interpreter > version in the name.]) Note that the Gentoo example also takes into account versions that might act differently based on the interpreter's implementation. So a -python3 suffix may not be enough. Maybe now we're getting into PEP 425 compatibility tag territory.
> - (tangent from a different portion of this thread: we've found that this > is a larger problem than we would hope. There are some obvious ones > like > - ipython (implements a python interpreter so python2 vs python3 is > understandably important ad different). > - nosetests (the python source being operated on is run through the > python interpreter so the version has to match). > - easy_install (needs to install python modules to the correct > interpreter's site-packages. It decides the correct interpreter > according to which interpreter invoked it.) > > But recently we found a new class of problems: frameworks which are > bilinugual. For instance, if you have a web framework which has a > /usr/bin/django-admin script that can be used to quickstart a > project, run a python shell and automatically load your code, load your > ORM db schema and operate on it to make modifications to the db then > that script has to know whether your code is compatible with python2 or > python3. Yay. >> I think the simplest thing to do is just append the "3" to the binary >> name (as we do ourselves for pydoc) and then abide by the >> recommendations in PEP 394 to reference the correct system executable. >> >I'd rather not have a bare 3 for the issues notes above. Something like py3 >would be better. Same here. I definitely don't like the current Debian semi-convention (not standardized or consistent) of injecting a '3' in the middle of the name, e.g. py3compile or py3doc. Note that adopting PEP 425 conventions allows for -py3 suffix to mean any Python 3 version, compatible across minor version numbers or implementations. This probably translates into a shebang line of #! /usr/bin/python3 whereas -py33 would mean #! /usr/bin/python3.3 This might be overkill in some cases, but at least it builds on existing standards. >There's still room for confusion when distributions have to push multiple >versions of a package with scripts that fall into this category. 
Should the >format be: > >/usr/bin/foo2-py3 (My preference as it places the version next to the > thing that it's a version of.) > >or > >/usr/bin/foo-py3-2 (Confusing as the 2 is bare. Something like > /usr/bin/foo-py3-v2 is slightly better but still not as nice as the > previous IMHO) Definitely the former, especially if PEP 425 serves at the basis for standardization. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From barry at python.org Tue May 28 19:27:18 2013 From: barry at python.org (Barry Warsaw) Date: Tue, 28 May 2013 13:27:18 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <20130525095316.010f2b34@fsol> References: <20130524155629.7597bdb0@anarchist> <20130525095316.010f2b34@fsol> Message-ID: <20130528132718.2e0410be@anarchist> On May 25, 2013, at 09:53 AM, Antoine Pitrou wrote: >How about always running the version specific targets, e.g. >nosetests-2.7? We have nosetests-2.7 and nosetests3 in /usr/bin, but we generally recommend folks not use these, especially for things like (build time) package tests. It's harder to iterate over when the installed versions are unknown statically, e.g. if you wanted to run all the tests over all available versions of Python. For those, we recommend people use `$python -m nose` since the available versions of Python can be queried from the system. This is why I would really like to see all scripts provide a -m equivalent for command line invocation. This might be a little awkward for < Python 2.7 (where IIRC -m doesn't work with packages). 
-Barry From barry at python.org Tue May 28 19:30:58 2013 From: barry at python.org (Barry Warsaw) Date: Tue, 28 May 2013 13:30:58 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <1369465952.2673.171.camel@thinko> References: <20130524155629.7597bdb0@anarchist> <1369465952.2673.171.camel@thinko> Message-ID: <20130528133058.76300390@anarchist> On May 25, 2013, at 03:12 AM, Chris McDonough wrote: >You probably already know this, but I'll mention it anyway. This >probably matters a lot for nose and pyflakes, but I'd say that for tox >it should not, it basically just scripts execution of shell commands. >I'd think maybe in cases like tox (and others that are compatible with >both Python 2 and 3) the hashbang should just be set to >"#!/usr/bin/python" unconditionally. Unfortunately, not entirely so: https://bitbucket.org/hpk42/tox/issue/96/cant-have-a-python-3-setuppy >Maybe we could also think about modifying pyflakes so that it can >validate both 2 and 3 code (choosing one or the other based on a header >line in the validated files and defaulting to the version of Python >being run). This is kind of the right thing anyway. Agreed. Auto-detection may need to be accompanied by a command line option to override in some cases. But I agree, that in general, it would be very nice if the script itself were actually bilingual. (But then, see my previous comment about cross-interpreter dependencies.) >Nose is a bit of a special case. I personally never run nosetests >directly, I always use setup.py nosetests, which makes it not matter. Which is morally equivalent to `$python -m nose`. >In general, I'd like to think that scripts that get installed to global >bindirs will execute utilities that are useful independent of the >version of Python being used to execute them. Agreed. I'm trying to tease out some conventions we can recommend for when this can't be the case for whatever reason. 
-Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From dholth at gmail.com Tue May 28 19:57:07 2013 From: dholth at gmail.com (Daniel Holth) Date: Tue, 28 May 2013 13:57:07 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <20130528133058.76300390@anarchist> References: <20130524155629.7597bdb0@anarchist> <1369465952.2673.171.camel@thinko> <20130528133058.76300390@anarchist> Message-ID: On Tue, May 28, 2013 at 1:30 PM, Barry Warsaw wrote: > On May 25, 2013, at 03:12 AM, Chris McDonough wrote: > >>You probably already know this, but I'll mention it anyway. This >>probably matters a lot for nose and pyflakes, but I'd say that for tox >>it should not, it basically just scripts execution of shell commands. >>I'd think maybe in cases like tox (and others that are compatible with >>both Python 2 and 3) the hashbang should just be set to >>"#!/usr/bin/python" unconditionally. > > Unfortunately, not entirely so: > > https://bitbucket.org/hpk42/tox/issue/96/cant-have-a-python-3-setuppy > >>Maybe we could also think about modifying pyflakes so that it can >>validate both 2 and 3 code (choosing one or the other based on a header >>line in the validated files and defaulting to the version of Python >>being run). This is kind of the right thing anyway. > > Agreed. Auto-detection may need to be accompanied by a command line option to > override in some cases. But I agree, that in general, it would be very nice > if the script itself were actually bilingual. (But then, see my previous > comment about cross-interpreter dependencies.) > >>Nose is a bit of a special case. I personally never run nosetests >>directly, I always use setup.py nosetests, which makes it not matter. > > Which is morally equivalent to `$python -m nose`. 
> >>In general, I'd like to think that scripts that get installed to global >>bindirs will execute utilities that are useful independent of the >>version of Python being used to execute them. > > Agreed. I'm trying to tease out some conventions we can recommend for when > this can't be the case for whatever reason. > > -Barry Wheel has no mechanism for renaming scripts (or any file) based on the Python version used to install. Instead you would have to build python-version-specific packages for each desired script name. From solipsis at pitrou.net Tue May 28 20:00:02 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 28 May 2013 20:00:02 +0200 Subject: [Python-Dev] PEP 409 and the stdlib References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520174638.12fae7ee@fsol> Message-ID: <20130528200002.1c555431@fsol> On Tue, 28 May 2013 16:02:00 +0300 Serhiy Storchaka wrote: > 20.05.13 18:46, Antoine Pitrou ???????(??): > > I think it is a legitimate case where to silence the original > > exception. However, the binascii.Error would be more informative if it > > said *which* non-base32 digit was encountered. > > Please open a new issue for this request (note that no other binascii or > base64 functions provide such information). No, my point was that the KeyError gives you this information (when displayed as a context), silencing it removes the information. Regards Antoine. From barry at python.org Tue May 28 20:04:33 2013 From: barry at python.org (Barry Warsaw) Date: Tue, 28 May 2013 14:04:33 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: References: <20130524155629.7597bdb0@anarchist> <1369465952.2673.171.camel@thinko> <20130528133058.76300390@anarchist> Message-ID: <20130528140433.165facef@anarchist> On May 28, 2013, at 01:57 PM, Daniel Holth wrote: >Wheel has no mechanism for renaming scripts (or any file) based on the >Python version used to install. 
Instead you would have to build
>python-version-specific packages for each desired script name.

Note that I'm not trying to borrow any implementation details from wheels,
just the file naming conventions (compatibility tags) described in PEP 425.
It would still be up to upstream package or distro tools to fiddle the
installed file names.

-Barry

From solipsis at pitrou.net Tue May 28 20:02:45 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 28 May 2013 20:02:45 +0200
Subject: [Python-Dev] Bilingual scripts
References: <20130524155629.7597bdb0@anarchist>
 <20130525095316.010f2b34@fsol>
 <20130528132718.2e0410be@anarchist>
Message-ID: <20130528200245.4be5532c@fsol>

On Tue, 28 May 2013 13:27:18 -0400
Barry Warsaw wrote:
> On May 25, 2013, at 09:53 AM, Antoine Pitrou wrote:
>
> >How about always running the version specific targets, e.g.
> >nosetests-2.7?
>
> We have nosetests-2.7 and nosetests3 in /usr/bin, but we generally recommend
> folks not use these, especially for things like (build time) package tests.
> It's harder to iterate over when the installed versions are unknown
> statically, e.g. if you wanted to run all the tests over all available
> versions of Python.

It sounds like you want a dedicated script or utility for this ("run
all the tests over all available versions of Python") rather than hack
it every time you package a Python library.

Your use case also doesn't seem to impact end-users.

> This is why I would really like to see all scripts provide a -m equivalent for
> command line invocation. This might be a little awkward for < Python 2.7
> (where IIRC -m doesn't work with packages).

Do you still support Python < 2.7?

Regards

Antoine.
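
[Editorial sketch] The "run it over every available Python" idea in the exchange above could look roughly like the following. This is only a sketch under assumptions: the candidate interpreter names are invented for illustration (a distribution would query its own list of supported versions), the helper names find_interpreters, build_command and run_module_everywhere are hypothetical, and nose is simply the module named in the thread. It relies on shutil.which, which needs Python 3.3 or later.

```python
import os
import shutil
import subprocess

# Hypothetical candidate names; a real tool would ask the system
# which interpreter versions are supported instead of hard-coding them.
CANDIDATES = ("python2.7", "python3.2", "python3.3", "python3")


def find_interpreters(names=CANDIDATES):
    """Return paths of the candidate interpreters found on PATH.

    Candidates that resolve to the same real file (for example a
    python3 symlink pointing at python3.3) are listed only once.
    """
    found = []
    seen = set()
    for name in names:
        path = shutil.which(name)
        if path is None:
            continue
        real = os.path.realpath(path)
        if real not in seen:
            seen.add(real)
            found.append(path)
    return found


def build_command(interpreter, module, args=()):
    """Build the argv list for `<interpreter> -m <module> <args...>`."""
    return [interpreter, "-m", module] + list(args)


def run_module_everywhere(module, args=(), interpreters=None):
    """Run `-m module` under each interpreter; map path -> exit status."""
    if interpreters is None:
        interpreters = find_interpreters()
    results = {}
    for interpreter in interpreters:
        results[interpreter] = subprocess.call(
            build_command(interpreter, module, args))
    return results
```

Calling run_module_everywhere("nose") would then attempt `-m nose` once per discovered interpreter, which is the reason the thread prefers -m invocation over version-suffixed script names: the caller only needs the interpreter path, not a naming convention for the script.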
From dholth at gmail.com Tue May 28 20:19:27 2013 From: dholth at gmail.com (Daniel Holth) Date: Tue, 28 May 2013 14:19:27 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <20130528140433.165facef@anarchist> References: <20130524155629.7597bdb0@anarchist> <1369465952.2673.171.camel@thinko> <20130528133058.76300390@anarchist> <20130528140433.165facef@anarchist> Message-ID: On Tue, May 28, 2013 at 2:04 PM, Barry Warsaw wrote: > On May 28, 2013, at 01:57 PM, Daniel Holth wrote: > >>Wheel has no mechanism for renaming scripts (or any file) based on the >>Python version used to install. Instead you would have to build >>python-version-specific packages for each desired script name. > > Note that I'm not trying to borrow any implementation details from wheels, > just the file naming conventions (compatibility tags) described in PEP 425. > It would still be up to upstream package or distro tools to fiddle the > installed file names. I'm just saying that I prefer a setup.py without too many Python-version-specific differences, since it would look pretty silly to install a wheel of nose generated on Python 3.2 on Python 3.3 and have the wrong version suffix on the scripts. I like the plainly named scripts without version suffixes. From steve at pearwood.info Tue May 28 20:34:23 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 29 May 2013 04:34:23 +1000 Subject: [Python-Dev] PEP 409 and the stdlib In-Reply-To: <20130528200002.1c555431@fsol> References: <519A2149.3040903@stoneleaf.us> <519A2F37.6020406@stoneleaf.us> <20130520174638.12fae7ee@fsol> <20130528200002.1c555431@fsol> Message-ID: <51A4F8AF.7050706@pearwood.info> On 29/05/13 04:00, Antoine Pitrou wrote: > On Tue, 28 May 2013 16:02:00 +0300 > Serhiy Storchaka wrote: >> 20.05.13 18:46, Antoine Pitrou ???????(??): >>> I think it is a legitimate case where to silence the original >>> exception. 
However, the binascii.Error would be more informative if it >>> said *which* non-base32 digit was encountered. >> >> Please open a new issue for this request (note that no other binascii or >> base64 functions provide such information). > > No, my point was that the KeyError gives you this information (when > displayed as a context), silencing it removes the information. That is an accidental side-effect of the specific implementation, and does not occur in any of the versions of Python I have access to (production versions of 2.4 through 2.7, plus 3.2 and 3.3.0rc3). If the implementation changes again in the future, that information will be lost again. Relying on the context in this case to display this information is harmful for at least three reasons: - it's an accident of implementation; - it suggests that the binascii.Error is a bug in the error handling, when that is not the case; - and it is semantically irrelevant to the error being reported. The semantics of the error are "an invalid character has been found", not "an expected key is not found". I try not to throw references to the Zen around too lightly, but I think "Explicit is better than implicit" is appropriate here. If it is helpful for the error to show the invalid character, and I hope that we all agree that it is, then the binascii.Error message should explicitly show that character, rather than rely on the implementation implicitly showing it as a side-effect. -- Steven From rowen at uw.edu Tue May 28 21:21:25 2013 From: rowen at uw.edu (Russell E. Owen) Date: Tue, 28 May 2013 12:21:25 -0700 Subject: [Python-Dev] pep 422 "Safe object finalization" question: why break weakrefs first? Message-ID: Pep 422 proposes the following order for dealing with cyclic isolates: 1. Weakrefs to CI objects are cleared, and their callbacks called. At this point, the objects are still safe to use. 2. The finalizers of all CI objects are called. 3. The CI is traversed again to determine if it is still isolated. 
If it is determined that at least one object in CI is now reachable from outside the CI, this collection is aborted and the whole CI is resurrected. Otherwise, proceed. 4. The CI becomes a CT as the GC systematically breaks all known references inside it (using the tp_clear function). Why are weakrefs are broken first, before determining if any of the objects should become resurrected? Naively I would expect weakrefs to be broken after step 3, once the system is sure no objects have been resurrected. I request that this information be added to PEP 422. -- Russell From a.badger at gmail.com Tue May 28 21:23:26 2013 From: a.badger at gmail.com (Toshio Kuratomi) Date: Tue, 28 May 2013 12:23:26 -0700 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <20130528132201.68ed6f98@anarchist> References: <20130524155629.7597bdb0@anarchist> <20130527183836.GG2038@unaka.lan> <20130528132201.68ed6f98@anarchist> Message-ID: <20130528192325.GJ2038@unaka.lan> On Tue, May 28, 2013 at 01:22:01PM -0400, Barry Warsaw wrote: > On May 27, 2013, at 11:38 AM, Toshio Kuratomi wrote: > > >- If upstream doesn't deal with it, then we use a "python3-" prefix. This > > matches with our package naming so it seemed to make sense. (But > > Barry's point about locate and tab completion and such would be a reason > > to revisit this... Perhaps standardizing on /usr/bin/foo2-python3 > > [pathological case of having both package version and interpreter > > version in the name.] > > Note that the Gentoo example also takes into account versions that might act > differently based on the interpreter's implementation. So a -python3 suffix > may not be enough. Maybe now we're getting into PEP 425 compatibility tag > territory. > This is an interesting, unmapped area in Fedora at the moment... I was hoping to talk to Nick and the Fedora python maintainer at our next Fedora conference. 
I've been looking at how Fedora's ruby guidelines are implemented wrt alternate interpreters and wondering if we could do something similar for python: https://fedoraproject.org/wiki/Packaging:Ruby#Different_Interpreters_Compatibility I'm not sure yet how much of that I'd (or Nick and the python maintainer [bkabrda, the current python maintainer is the one who wrote the rubypick script]) would want to use in python -- replacing /usr/bin/python with a script that chooses between CPython and pypy based on user preference gave me an instinctual feeling of dread the first time I looked at it but it seems to be working well for the ruby folks. My current feeling is that I wouldn't use this same system for interpreters which are not mostly compatible (for instance, python2 vs python3). but I also haven't devoted much actual time to thinking about whether that might have some advantages. -Toshio -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From solipsis at pitrou.net Tue May 28 21:37:15 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 28 May 2013 21:37:15 +0200 Subject: [Python-Dev] pep 422 "Safe object finalization" question: why break weakrefs first? References: Message-ID: <20130528213715.2fd63922@fsol> On Tue, 28 May 2013 12:21:25 -0700 "Russell E. Owen" wrote: > Pep 422 proposes the following order for dealing with cyclic isolates: > > 1. Weakrefs to CI objects are cleared, and their callbacks called. At > this point, the objects are still safe to use. > 2. The finalizers of all CI objects are called. > 3. The CI is traversed again to determine if it is still isolated. If > it is determined that at least one object in CI is now reachable from > outside the CI, this collection is aborted and the whole CI is > resurrected. Otherwise, proceed. > 4. 
The CI becomes a CT as the GC systematically breaks all known > references inside it (using the tp_clear function). > > Why are weakrefs are broken first, before determining if any of the > objects should become resurrected? Naively I would expect weakrefs to be > broken after step 3, once the system is sure no objects have been > resurrected. The answer is that this is how weakrefs currently work: they are cleared (and their callbacks executed) before __del__ is executed, therefore if __del__ revives the object, the weakrefs stay dead. The rationale is simply to minimize disruption for existing code. However, the PEP would indeed make it possible to change that behaviour, if desired. You can read http://hg.python.org/cpython/file/4e687d53b645/Modules/gc_weakref.txt for a detailed (and lengthy) explanation of why weakrefs work that way right now. Regards Antoine. From solipsis at pitrou.net Tue May 28 21:39:33 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 28 May 2013 21:39:33 +0200 Subject: [Python-Dev] cpython (merge 3.3 -> default): Merge with 3.3 References: <3bKHfN4rr2z7LmQ@mail.python.org> <51A5003F.3010603@udel.edu> Message-ID: <20130528213933.720e9167@fsol> On Tue, 28 May 2013 15:06:39 -0400 Terry Reedy wrote: > On 5/28/2013 2:14 AM, Benjamin Peterson wrote: > > 2013/5/27 terry.reedy : > >> http://hg.python.org/cpython/rev/c5d4c041ab47 > >> changeset: 83942:c5d4c041ab47 > >> parent: 83940:2ea849fde22b > >> parent: 83941:24c3e7e08168 > >> user: Terry Jan Reedy > >> date: Mon May 27 21:33:40 2013 -0400 > >> summary: > >> Merge with 3.3 > >> > >> files: > >> Lib/idlelib/CallTips.py | 4 +- > >> Lib/idlelib/PathBrowser.py | 3 +- > >> Lib/idlelib/idle_test/@README.txt | 63 +++++++++++ > > Is @README really the intended name of this file? > Yes, Nick suggested README instead of what I had. I want a prefix to > keep it near the top of a directory listing even when other non > 'test_xxx' files are added. I thing '_' wold be better though. 
I don't think "prefixing with a weird character so that the filename
shows up top" is a very elegant trick, and we don't use it for other
directories. "README.txt" *will* be easily visible because of the
all-caps filename.

> If I can
> find how to rename in hg

"Rename in hg" -> "hg rename"

Antoine.

From rowen at uw.edu Tue May 28 21:41:23 2013
From: rowen at uw.edu (Russell E. Owen)
Date: Tue, 28 May 2013 12:41:23 -0700
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
 (including ABC support)
References:
Message-ID:

In article ,
 Łukasz Langa wrote:

> Hello,
> Since the initial version, several minor changes have been made to the
> PEP. The history is visible on hg.python.org. The most important
> change in this version is that I introduced ABC support and completed
> a reference implementation.
>
> No open issues remain from my point of view.

Is it true that this cannot be used for instance and class methods? It
dispatches based on the first argument, which is "self" for instance
methods, whereas the second argument would almost certainly be the
argument one would want to use for conditional dispatch.

-- Russell

From rowen at uw.edu Tue May 28 21:49:10 2013
From: rowen at uw.edu (Russell E. Owen)
Date: Tue, 28 May 2013 12:49:10 -0700
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
 (including ABC support)
References:
Message-ID:

A question about the example: how hard would it be to modify the example

@fun.register(list)
...

to work with other collections? If it is easy, I think it would make
for a much more useful example.

-- Russell

From pje at telecommunity.com Tue May 28 23:27:17 2013
From: pje at telecommunity.com (PJ Eby)
Date: Tue, 28 May 2013 17:27:17 -0400
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions
 (including ABC support)
In-Reply-To:
References:
Message-ID:

On Tue, May 28, 2013 at 3:41 PM, Russell E. Owen wrote:
> Is it true that this cannot be used for instance and class methods?
It
> dispatches based on the first argument, which is "self" for instance
> methods, whereas the second argument would almost certainly be the
> argument one would want to use for conditional dispatch.

You can use a staticmethod and then delegate to it, of course. But it
probably wouldn't be too difficult to allow specifying which argument
to dispatch on, e.g.:

    @singledispatch.on('someArg')
    def my_method(self, someArg, ...):
        ...

The code would look something like this:

    def singledispatch(func, argPosn=0):
        ...
        # existing code here...
        ...
        def wrapper(*args, **kw):
            # dispatch on args[argPosn] instead of args[0]
            return dispatch(args[argPosn].__class__)(*args, **kw)

    def _dispatch_on(argname):
        def decorate(func):
            argPosn = ...  # code to find argument position of argname for func
            return singledispatch(func, argPosn)
        return decorate

    singledispatch.on = _dispatch_on

So, it's just a few lines added, but of course additional doc, tests,
etc. would have to be added as well. (It also might be a good idea
for there to be some error checking in wrapper() to raise an
appropriate TypeError if len(args)<=arg.)

From solipsis at pitrou.net Tue May 28 23:40:38 2013
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 28 May 2013 23:40:38 +0200
Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager
 which is a context manager to
References: <3bKp9Q5tqHz7LkF@mail.python.org>
Message-ID: <20130528234038.323899c7@fsol>

On Tue, 28 May 2013 23:29:46 +0200 (CEST)
brett.cannon wrote:
>
> +.. class:: ModuleManager(name)
> +
> +    A :term:`context manager` which provides the module to load. The module will
> +    either come from :attr:`sys.modules` in the case of reloading or a fresh
> +    module if loading a new module. Proper cleanup of :attr:`sys.modules` occurs
> +    if the module was new and an exception was raised.

What use case does this API solve?
(FWIW, I think "ModuleManager" is a rather bad name :-))

Regards

Antoine.

From rdmurray at bitdance.com Tue May 28 23:52:43 2013
From: rdmurray at bitdance.com (R.
David Murray) Date: Tue, 28 May 2013 17:52:43 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: References: <20130524155629.7597bdb0@anarchist> <20130524202358.C57F9250BDB@webabinitio.net> <20130528113500.4d406948@anarchist> <20130528154124.3F820250498@webabinitio.net> Message-ID: <20130528215244.31F25250498@webabinitio.net> On Tue, 28 May 2013 12:17:49 -0400, Tres Seaver wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 05/28/2013 11:41 AM, R. David Murray wrote: > > I have the same complaint about setuptools entry-point scripts, where > > I still haven't figured out how to go from what is in the file to the > > code that actually gets called. > > Hmm, just dump the 'entry_points.txt' file in the named distribution's > EGG-INFO directory? E.g.: > > $ cat bin/pip > #!/path/to/virtualenv/bin/pythonX.Y > # EASY-INSTALL-ENTRY-SCRIPT: 'pip==1.3.1','console_scripts','pip' > __requires__ = 'pip==1.3.1' > import sys > from pkg_resources import load_entry_point > > if __name__ == '__main__': > sys.exit( > load_entry_point('pip==1.3.1', 'console_scripts', 'pip')() > ) > > $ cat > lib/pythonX.Y/site-packages/pip-1.3.1-pyX.Y.egg/EGG-INFO/entry_points.txt > [console_scripts] > pip = pip:main > pip-X.Y = pip:main I'm afraid I'm still not enlightened. I'm sure I would understand this if I had ever set up an entry point, since I would have had to read the docs on how to do it. But I never have. So, my point is that the information on what python code is actually being called ought to be in the stub script file, as a comment if nothing else, for discoverability reasons. 
I'm not bothered enough to work up a patch, though :) --David From tjreedy at udel.edu Wed May 29 01:29:33 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Tue, 28 May 2013 19:29:33 -0400 Subject: [Python-Dev] cpython (merge 3.3 -> default): Merge with 3.3 In-Reply-To: <20130528213933.720e9167@fsol> References: <3bKHfN4rr2z7LmQ@mail.python.org> <51A5003F.3010603@udel.edu> <20130528213933.720e9167@fsol> Message-ID: On 5/28/2013 3:39 PM, Antoine Pitrou wrote: > On Tue, 28 May 2013 15:06:39 -0400 > Terry Reedy wrote: >> Yes, Nick suggested README instead of what I had. I want a prefix to >> keep it near the top of a directory listing even when other non >> 'test_xxx' files are added. I thing '_' wold be better though. > > I don't think "prefixing with a weird character '_' is not weird for Python names. > so that the filename show up top" is a very elegant trick, I disagree. Books have Table of Contents, Preface, and Foreword sections at the front for a reason: if they are present, they are easy and obvious to find. READMEs are like a preface*, sometimes with an annotated Contents. They logically belong at the top for the same reason. A long title for this how-to file, which I would prefer, would be something like "_Writing-Testing-Running_Idle_Tests", or "_Idle_Test-Writing_Guidelines", or "A_Guide_to_Idle_Tests", or "An_Idle_Test_HOWTO". *At least this one is. Some are addenda that have little to do with the other files in the directory. They might better have a different name, like 'Manual Corrections' (which would sort after 'Manual', where it belongs), "Starting the game', 'Windows differences', etc. I don't know if 'readme.txt' was common before DOS 8.3 filename limitations, but that limit is mostly gone. I have used this 'trick' for decades. Another file that does not belong in the main alpha list is a project-specific template for the .py files in a directory. > and we don't use it for other directories. I think we should. Seriously. 
Maybe with more descriptive names. > "README.txt" *will* be easily visible because of the all-caps filename. Somewhat easy, but only if one thinks to look for it. I only found 4 used outside of Lib/test/*. Which of the following big directories have a README? /Lib /Lib/idlelib /Lib/test /Tools/scripts Would it not be easier to discover if the 'preface' file were always at or near the top? > "Rename in hg" -> "hg rename" Thanks. Found it with Google and read that it works well in hg. I will also check if TortoiseHG has an easy gui equivalent (rt click on file to rename). Terry From python at mrabarnett.plus.com Wed May 29 01:42:18 2013 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 29 May 2013 00:42:18 +0100 Subject: [Python-Dev] cpython (merge 3.3 -> default): Merge with 3.3 In-Reply-To: References: <3bKHfN4rr2z7LmQ@mail.python.org> <51A5003F.3010603@udel.edu> <20130528213933.720e9167@fsol> Message-ID: <51A540DA.4060007@mrabarnett.plus.com> On 29/05/2013 00:29, Terry Jan Reedy wrote: > On 5/28/2013 3:39 PM, Antoine Pitrou wrote: >> On Tue, 28 May 2013 15:06:39 -0400 >> Terry Reedy wrote: > >>> Yes, Nick suggested README instead of what I had. I want a prefix to >>> keep it near the top of a directory listing even when other non >>> 'test_xxx' files are added. I thing '_' wold be better though. >> >> I don't think "prefixing with a weird character > > '_' is not weird for Python names. > >> so that the filename show up top" is a very elegant trick, > > I disagree. Books have Table of Contents, Preface, and Foreword sections > at the front for a reason: if they are present, they are easy and > obvious to find. READMEs are like a preface*, sometimes with an > annotated Contents. They logically belong at the top for the same reason. > > A long title for this how-to file, which I would prefer, would be > something like > "_Writing-Testing-Running_Idle_Tests", or > "_Idle_Test-Writing_Guidelines", or > "A_Guide_to_Idle_Tests", or > "An_Idle_Test_HOWTO". 
> [snip]

I'm somehow not happy about "_README", what with a single underscore
indicating "internal" in Python code.

Perhaps it would be a bit more Pythonic to have "_README_" instead
(dunder would be overdoing it, perhaps). :-)

From brett at python.org Wed May 29 02:14:41 2013
From: brett at python.org (Brett Cannon)
Date: Tue, 28 May 2013 20:14:41 -0400
Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager
 which is a context manager to
In-Reply-To: <20130528234038.323899c7@fsol>
References: <3bKp9Q5tqHz7LkF@mail.python.org>
 <20130528234038.323899c7@fsol>
Message-ID:

On Tue, May 28, 2013 at 5:40 PM, Antoine Pitrou wrote:
> On Tue, 28 May 2013 23:29:46 +0200 (CEST)
> brett.cannon wrote:
>>
>> +.. class:: ModuleManager(name)
>> +
>> +    A :term:`context manager` which provides the module to load. The module will
>> +    either come from :attr:`sys.modules` in the case of reloading or a fresh
>> +    module if loading a new module. Proper cleanup of :attr:`sys.modules` occurs
>> +    if the module was new and an exception was raised.
>
> What use case does this API solve?

See http://bugs.python.org/issue18088 for the other part of this
story. I'm basically replacing what importlib.util.module_for_loader
does after I realized there is no way in a subclass to override
what/how attributes are set on a module before the code object is
executed. Instead of using the decorator people will be able to use
this context manager with a new method to get the same effect with the
ability to better control attribute initialization.

> (FWIW, I think "ModuleManager" is a rather bad name :-)

I'm open to suggestions, but the thing does manage the module so it at
least makes sense.
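
[Editorial sketch] The behaviour described in the quoted ModuleManager docstring can be illustrated with a small stand-in. This is a sketch under assumptions, not the importlib implementation: the class name ModuleManagerSketch is hypothetical, and it assumes that a fresh module is registered in sys.modules on entry, which is what the "proper cleanup" wording implies.

```python
import sys
import types


class ModuleManagerSketch:
    """Illustrative stand-in for the context manager described above."""

    def __init__(self, name):
        self._name = name
        self._is_reload = False
        self._module = None

    def __enter__(self):
        # Reloading: hand back the module that is already registered.
        self._module = sys.modules.get(self._name)
        self._is_reload = self._module is not None
        if not self._is_reload:
            # New load: create a fresh module and register it
            # (assumed behaviour, see the note above).
            self._module = types.ModuleType(self._name)
            sys.modules[self._name] = self._module
        return self._module

    def __exit__(self, exc_type, exc_value, traceback):
        # Only a module that was new is removed when loading failed;
        # a pre-existing module stays in sys.modules on a failed reload.
        if exc_type is not None and not self._is_reload:
            sys.modules.pop(self._name, None)
        return False  # never swallow the exception
```

A loader could then write `with ModuleManagerSketch(fullname) as module:` and set attributes or execute code on `module` inside the block, which is the extra control over attribute initialization that the decorator form did not allow.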
From tjreedy at udel.edu Wed May 29 02:16:01 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Tue, 28 May 2013 20:16:01 -0400 Subject: [Python-Dev] cpython (merge 3.3 -> default): Merge with 3.3 In-Reply-To: <51A540DA.4060007@mrabarnett.plus.com> References: <3bKHfN4rr2z7LmQ@mail.python.org> <51A5003F.3010603@udel.edu> <20130528213933.720e9167@fsol> <51A540DA.4060007@mrabarnett.plus.com> Message-ID: On 5/28/2013 7:42 PM, MRAB wrote: >> "A_Guide_to_Idle_Tests", or >> "An_Idle_Test_HOWTO". > [snip] > I'm somehow not happy about "_README", what with a single underscore > indicating "internal" in Python code. The file is internal to the subset of IDLE developers writing tests, but... > Perhaps it would be a bit more Pythonic to have "_README_" instead > (dunder would be overdoing it, perhaps). :-) Guido said README.txt on the committers list. From steve at pearwood.info Wed May 29 04:04:57 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 29 May 2013 12:04:57 +1000 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) In-Reply-To: References: Message-ID: <51A56249.1080907@pearwood.info> On 29/05/13 07:27, PJ Eby wrote: > On Tue, May 28, 2013 at 3:41 PM, Russell E. Owen wrote: >> Is it true that this cannot be used for instance and class methods? It >> dispatches based on the first argument, which is "self" for instance >> methods, whereas the second argument would almost certainly be the >> argument one would want to use for conditional dispatch. > > You can use a staticmethod and then delegate to it, of course. But it > probably wouldn't be too difficult to allow specifying which argument > to dispatch on, e.g.: > > @singledispatch.on('someArg') > def my_method(self, someArg, ...): > ... [...] > So, it's just a few lines added, but of course additional doc, tests, > etc. would have to be added as well. 
(It also might be a good idea > for there to be some error checking in wrapper() to raise an > appropriate TypeError if len(args)<=arg.) I feel that specifying the dispatch argument in full generality is overkill, and that supporting two use-cases should be sufficient: - dispatch on the first argument of functions; - dispatch on the second argument of methods, skipping self/cls. After all, is this not supposed to be *simple* generics? :-) I'm vaguely leaning towards @singledispatch and @singledispatch.method for the colour of this bike shed. -- Steven From ncoghlan at gmail.com Wed May 29 04:19:55 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 29 May 2013 12:19:55 +1000 Subject: [Python-Dev] Structural cleanups to the main CPython repo In-Reply-To: <51A4DFB3.6080901@v.loewis.de> References: <20130528143117.516cc583@pitrou.net> <20130528182018.41d53a45@pitrou.net> <51A4DFB3.6080901@v.loewis.de> Message-ID: On Wed, May 29, 2013 at 2:47 AM, "Martin v. Löwis" wrote: > Am 28.05.13 18:20, schrieb Antoine Pitrou: >> Le Tue, 28 May 2013 23:07:37 +1000, >> Nick Coghlan a écrit : >>> It was deliberate - a big part of PEP 432 is making sure that all the >>> interpreter state lives *in* the interpreter state (as part of the >>> config struct). >> >> It sounds a bit exaggerated. We have encoders and decoders in the same >> (C) modules, compressors and decompressors ditto. Why not keep >> initialization and finalization in the same source file too? > > I can sympathize with the motivation. Unlike encoders and decoders, > it is *very* tempting to put interpreter state into global variables. > With encoders and decoders, it's clear that globals won't work if you > have multiple of them. With interpreter state, it's either singletons > in the first place, or the globals can be swapped out when switching > interpreters. > > By splitting initialization and finalization into distinct translation > units, you make it much more difficult to introduce new "hidden" > variables. 
Yep, that was a key part of my motivation (the other part was also to find out what global state we *already had* by making the build blow up for anything that was static and referenced by more than just the bootstrapping code). The part I didn't think through when I did it in a long-lived branch was just how much of a nightmare it was going to make any merges that touched pythonrun.h or pythonrun.c :) I'd also be open to a setup with a single "lifecycle.h" header file, which was split into the bootstrap and shutdown implementation units, since that makes it easier to check that the appropriate setup/finalize pairs exist (by looking at the combined header file), while still enlisting the build chain's assistance in avoiding hidden global state. Anyway, I'll come up with some specific patches and put them on the tracker, starting with moving the source files for the binary executables and making the simpler pythonrun/lifecycle split. I can look into splitting lifecycle.c into separate bootstrap and shutdown translation units after those less controversial changes have been reviewed (the split may not even be all that practical outside the PEP 432 branch, since it would involve exposing quite a few currently static variables to the linker). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From tseaver at palladion.com Wed May 29 04:20:33 2013 From: tseaver at palladion.com (Tres Seaver) Date: Tue, 28 May 2013 22:20:33 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <20130528215244.31F25250498@webabinitio.net> References: <20130524155629.7597bdb0@anarchist> <20130524202358.C57F9250BDB@webabinitio.net> <20130528113500.4d406948@anarchist> <20130528154124.3F820250498@webabinitio.net> <20130528215244.31F25250498@webabinitio.net> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 05/28/2013 05:52 PM, R. 
David Murray wrote: > On Tue, 28 May 2013 12:17:49 -0400, Tres Seaver > wrote: >> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 >> >> On 05/28/2013 11:41 AM, R. David Murray wrote: >>> I have the same complaint about setuptools entry-point scripts, >>> where I still haven't figured out how to go from what is in the >>> file to the code that actually gets called. >> >> Hmm, just dump the 'entry_points.txt' file in the named >> distribution's EGG-INFO directory? E.g.: >> >> $ cat bin/pip #!/path/to/virtualenv/bin/pythonX.Y # >> EASY-INSTALL-ENTRY-SCRIPT: 'pip==1.3.1','console_scripts','pip' >> __requires__ = 'pip==1.3.1' import sys from pkg_resources import >> load_entry_point >> >> if __name__ == '__main__': sys.exit( load_entry_point('pip==1.3.1', >> 'console_scripts', 'pip')() ) >> >> $ cat >> lib/pythonX.Y/site-packages/pip-1.3.1-pyX.Y.egg/EGG-INFO/entry_points.txt >> >> [console_scripts] >> pip = pip:main pip-X.Y = pip:main > > I'm afraid I'm still not enlightened. > > I'm sure I would understand this if I had ever set up an entry point, > since I would have had to read the docs on how to do it. But I never > have. > > So, my point is that the information on what python code is actually > being called ought to be in the stub script file, as a comment if > nothing else, for discoverability reasons. > > I'm not bothered enough to work up a patch, though :) It is there already: # EASY-INSTALL-ENTRY-SCRIPT: 'pip==1.3.1','console_scripts','pip' Which says, load the entry point named 'pip' from the 'console_scripts' entry point group in the 'pip 1.3.1' distribution. The 'entry_points.txt' metadata file specifies that that entry point is a function named 'main' inside the 'pip' package itself. Ters. 
- -- =================================================================== Tres Seaver +1 540-429-0999 tseaver at palladion.com Palladion Software "Excellence by Design" http://palladion.com -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iEUEARECAAYFAlGlZesACgkQ+gerLs4ltQ50xACeJUBMjAvMBaOm63Viigz2bvkP S5gAl2w4WAxgasXie10DMtHJOyRRFvA= =34KH -----END PGP SIGNATURE----- From ncoghlan at gmail.com Wed May 29 04:40:32 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 29 May 2013 12:40:32 +1000 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) In-Reply-To: References: Message-ID: On Wed, May 29, 2013 at 5:41 AM, Russell E. Owen wrote: > In article , > Łukasz Langa wrote: > >> Hello, >> Since the initial version, several minor changes have been made to the >> PEP. The history is visible on hg.python.org. The most important >> change in this version is that I introduced ABC support and completed >> a reference implementation. >> >> No open issues remain from my point of view. > > Is it true that this cannot be used for instance and class methods? It > dispatches based on the first argument, which is "self" for instance > methods, whereas the second argument would almost certainly be the > argument one would want to use for conditional dispatch. Correct. OO and generic functions are different development paradigms, and there are limitations on mixing them. Generic functions are for stateless algorithms, which expect to receive all required input through their arguments. By contrast, class and instance methods expect to receive some state implicitly - in many respects, they *already are* generic functions. Thus, this is really a request for dual dispatch in disguise: you want to first dispatch on the class or instance (through method dispatch) and *then* dispatch on the second argument (through generic function dispatch). 
Dual dispatch is much harder than single dispatch and "functools.singledispatch" does not and should not support it (it's in the name). As PJE noted, you *can* use singledispatch with staticmethods, as that eliminates the dual dispatch behaviour by removing the class and instance based dispatch step. You can also register already bound class and instance methods as implementations for a generic function, as that also resolves the dual dispatch in a way that means the single dispatch implementation doesn't even need to be aware it is happening. I expect we will see improved tools for integrating class based dispatch and generic function dispatch in the future, but we should *not* try to engineer a solution up front. Doing so would involve too much guessing about possible use cases, rather than letting the design be informed by the *actual* use cases that emerge in practice. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed May 29 05:01:57 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 29 May 2013 13:01:57 +1000 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <20130528192325.GJ2038@unaka.lan> References: <20130524155629.7597bdb0@anarchist> <20130527183836.GG2038@unaka.lan> <20130528132201.68ed6f98@anarchist> <20130528192325.GJ2038@unaka.lan> Message-ID: On Wed, May 29, 2013 at 5:23 AM, Toshio Kuratomi wrote: > On Tue, May 28, 2013 at 01:22:01PM -0400, Barry Warsaw wrote: >> On May 27, 2013, at 11:38 AM, Toshio Kuratomi wrote: >> >> >- If upstream doesn't deal with it, then we use a "python3-" prefix. This >> > matches with our package naming so it seemed to make sense. (But >> > Barry's point about locate and tab completion and such would be a reason >> > to revisit this... Perhaps standardizing on /usr/bin/foo2-python3 >> > [pathological case of having both package version and interpreter >> > version in the name.] 
>> >> Note that the Gentoo example also takes into account versions that might act >> differently based on the interpreter's implementation. So a -python3 suffix >> may not be enough. Maybe now we're getting into PEP 425 compatibility tag >> territory. >> > This is an interesting, unmapped area in Fedora at the moment... I > was hoping to talk to Nick and the Fedora python maintainer at our next > Fedora conference. > > I've been looking at how Fedora's ruby guidelines are implemented wrt > alternate interpreters and wondering if we could do something similar for > python: > > https://fedoraproject.org/wiki/Packaging:Ruby#Different_Interpreters_Compatibility > > I'm not sure yet how much of that I'd (or Nick and the python maintainer > [bkabrda, the current python maintainer is the one who wrote the rubypick > script]) would want to use in python -- replacing /usr/bin/python with a > script that chooses between CPython and pypy based on user preference gave > me an instinctual feeling of dread the first time I looked at it but it > seems to be working well for the ruby folks. > > My current feeling is that I wouldn't use this same system for interpreters > which are not mostly compatible (for instance, python2 vs python3). but I > also haven't devoted much actual time to thinking about whether that might > have some advantages. PEP 432 is also related, as it includes the "pysystem" proposal [1] (an alternate Python CLI that will default to -Es behaviour, but is otherwise similar to the standard "python" interpreter). The rest of the discussion though makes me think we may actually need a *nix equivalent of PEP 397 (which describes the "py" launcher we created to get around the limitations of Windows file associations). 
Between that and the interpreter identification mechanism defined for the PEP 425 compatibility tags it should be possible to come up with an upstream solution for 3.4 that the distros can backport to work with earlier versions (similar to the way users can download the Windows launcher directly from https://bitbucket.org/pypa/pylauncher/downloads even though we only started shipping it upstream as part of the Python 3.3 installer) Cheers, Nick. [1] http://www.python.org/dev/peps/pep-0432/#a-system-python-executable -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed May 29 07:09:15 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 29 May 2013 15:09:15 +1000 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> Message-ID: On Wed, May 29, 2013 at 10:14 AM, Brett Cannon wrote: >> (FWIW, I think "ModuleManager" is a rather bad name :-) > > I'm open to suggestions, but the thing does manage the module so it at > least makes sense. I suggest ModuleInitialiser as the CM name, with a helper function to make usage read better: with initialise_module(name) as m: # Module initialisation code goes here # Module is rolled back if initialisation fails Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From solipsis at pitrou.net Wed May 29 08:08:14 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 29 May 2013 08:08:14 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) References: Message-ID: <20130529080814.205fa8ce@fsol> On Wed, 29 May 2013 12:40:32 +1000 Nick Coghlan wrote: > On Wed, May 29, 2013 at 5:41 AM, Russell E. Owen wrote: > > In article , > > Łukasz Langa wrote: > > > >> Hello, > >> Since the initial version, several minor changes have been made to the > >> PEP. The history is visible on hg.python.org. 
The most important > >> change in this version is that I introduced ABC support and completed > >> a reference implementation. > >> > >> No open issues remain from my point of view. > > > > Is it true that this cannot be used for instance and class methods? It > > dispatches based on the first argument, which is "self" for instance > > methods, whereas the second argument would almost certainly be the > > argument one would want to use for conditional dispatch. > > Correct. OO and generic functions are different development paradigms, > and there are limitations on mixing them. Generic functions are for > stateless algorithms, which expect to receive all required input > through their arguments. By contrast, class and instance methods > expect to receive some state implicitly - in many respects, they > *already are* generic functions. There are actual use cases for generic methods, think pickle.py. (also, often a "stateless" function will eventually become stateful, if used as part of a sufficiently complex application / library; e.g. some logging will be added, or some kind of configuration object) Regards Antoine. From solipsis at pitrou.net Wed May 29 08:16:02 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 29 May 2013 08:16:02 +0200 Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support) References: <20130529080814.205fa8ce@fsol> Message-ID: <20130529081602.49fb5946@fsol> On Wed, 29 May 2013 08:08:14 +0200 Antoine Pitrou wrote: > On Wed, 29 May 2013 12:40:32 +1000 > Nick Coghlan wrote: > > On Wed, May 29, 2013 at 5:41 AM, Russell E. Owen wrote: > > > In article , > > > Łukasz Langa wrote: > > > > > >> Hello, > > >> Since the initial version, several minor changes have been made to the > > >> PEP. The history is visible on hg.python.org. The most important > > >> change in this version is that I introduced ABC support and completed > > >> a reference implementation. 
> > >> > > >> No open issues remain from my point of view. > > > > > > Is it true that this cannot be used for instance and class methods? It > > > dispatches based on the first argument, which is "self" for instance > > > methods, whereas the second argument would almost certainly be the > > > argument one would want to use for conditional dispatch. > > > > Correct. OO and generic functions are different development paradigms, > > and there are limitations on mixing them. Generic functions are for > > stateless algorithms, which expect to receive all required input > > through their arguments. By contrast, class and instance methods > > expect to receive some state implicitly - in many respects, they > > *already are* generic functions. > > There are actual use cases for generic methods, think pickle.py. That said, I admit this is a case where the generic method use is private, i.e. is not exposed for other code to extend. (the public extension protocol being in the form of plain methods: __getstate__, etc.) Regards Antoine. From brett at python.org Wed May 29 15:04:01 2013 From: brett at python.org (Brett Cannon) Date: Wed, 29 May 2013 09:04:01 -0400 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> Message-ID: On May 29, 2013 1:09 AM, "Nick Coghlan" wrote: > > On Wed, May 29, 2013 at 10:14 AM, Brett Cannon wrote: > >> (FWIW, I think "ModuleManager" is a rather bad name :-) > > > > I'm open to suggestions, but the thing does manage the module so it at > > least makes sense. > > I suggest ModuleInitialiser as the CM name, with a helper function to > make usage read better: > > with initialise_module(name) as m: > # Module initialisation code goes here > # Module is rolled back if initialisation fails But you're not initializing the module; more like getting the module, either new or from sys.modules. 
But I thought ModuleGetter seemed too Java-like. Could hide the class behind a get_module function though. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed May 29 16:28:43 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 30 May 2013 00:28:43 +1000 Subject: [Python-Dev] Structural cleanups to the main CPython repo In-Reply-To: References: <20130528143117.516cc583@pitrou.net> <20130528182018.41d53a45@pitrou.net> <51A4DFB3.6080901@v.loewis.de> Message-ID: On Wed, May 29, 2013 at 12:19 PM, Nick Coghlan wrote: > Anyway, I'll come up with some specific patches and put them on the > tracker, starting with moving the source files for the binary > executables and making the simpler pythonrun/lifecycle split. I can > look into splitting lifecycle.c into separate bootstrap and shutdown > translation units after those less controversial changes have been > reviewed (the split may not even be all that practical outside the PEP > 432 branch, since it would involve exposing quite a few currently > static variables to the linker). I started with the simplest part, adding a new Programs directory: http://bugs.python.org/issue18093 Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rdmurray at bitdance.com Wed May 29 16:33:54 2013 From: rdmurray at bitdance.com (R. 
David Murray) Date: Wed, 29 May 2013 10:33:54 -0400 Subject: [Python-Dev] decoding setuptools entry point scripts (was: Bilingual scripts) In-Reply-To: References: <20130524155629.7597bdb0@anarchist> <20130524202358.C57F9250BDB@webabinitio.net> <20130528113500.4d406948@anarchist> <20130528154124.3F820250498@webabinitio.net> <20130528215244.31F25250498@webabinitio.net> Message-ID: <20130529143403.A211F2504B4@webabinitio.net> On Tue, 28 May 2013 22:20:33 -0400, Tres Seaver wrote: > > So, my point is that the information on what python code is actually > > being called ought to be in the stub script file, as a comment if > > nothing else, for discoverability reasons. > > > > I'm not bothered enough to work up a patch, though :) > > It is there already: > > # EASY-INSTALL-ENTRY-SCRIPT: 'pip==1.3.1','console_scripts','pip' > > Which says, load the entry point named 'pip' from the 'console_scripts' > entry point group in the 'pip 1.3.1' distribution. > > The 'entry_points.txt' metadata file specifies that that entry point is a > function named 'main' inside the 'pip' package itself. Ah, but you had to *decode* that for me, using your non-local expert's knowledge. I assume 'main' is defined in or imported into pip's __init__? Now, if the comment had said: # Call pip.main (per the specification in the pip entry of the # console_scripts section of pip-1.3.1-egg-info/entrypoints.txt). then I would have known everything I needed to know without either consulting the *implementor's* documentation for setuptools or an expert such as yourself. Of that, as a *user*, the first two words are the only thing I'm interested in, but the other information could be handy in debugging certain specialized and unlikely issues, such as when someone has manually changed the entrypoints.txt file. Note that the comment still requires you to know python import semantics...but if you don't know that much you wouldn't get far looking at the source code anyway. 
The dir/filename lets you 'find' the entrypoints.txt file even if you don't know where on your system that file is installed. --David From ncoghlan at gmail.com Wed May 29 16:34:40 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 30 May 2013 00:34:40 +1000 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> Message-ID: On Wed, May 29, 2013 at 11:04 PM, Brett Cannon wrote: >> with initialise_module(name) as m: >> # Module initialisation code goes here >> # Module is rolled back if initialisation fails > > But you're not initializing the module; more like getting the module, either > new or from sys.modules. But I thought ModuleGetter seemed too Java-like. > Could hide the class behind a get_module function though. The point is to provide a useful mnemonic for *why* you would use this context manager, and the reason is because the body of the with statement is going to initialize the contents, and you want to unwind things appropriately if that fails. initializing_module is probably a better name than initialized_module, though (since it isn't initialized yet on entry - instead, that's what should be the case by the end of the statement) Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From brett at python.org Wed May 29 16:47:16 2013 From: brett at python.org (Brett Cannon) Date: Wed, 29 May 2013 10:47:16 -0400 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> Message-ID: On Wed, May 29, 2013 at 10:34 AM, Nick Coghlan wrote: > On Wed, May 29, 2013 at 11:04 PM, Brett Cannon wrote: >>> with initialise_module(name) as m: >>> # Module initialisation code goes here >>> # Module is rolled back if initialisation fails >> >> But you're not initializing the module; more like getting the module, either >> new or from sys.modules. But I thought ModuleGetter seemed too Java-like. >> Could hide the class behind a get_module function though. > > The point is to provide a useful mnemonic for *why* you would use this > context manager, and the reason is because the body of the with > statement is going to initialize the contents, and you want to unwind > things appropriately if that fails. You should use this context manager to get the correct module to initialize/execute/whatever, e.g. contextlib.closing is about what the context manager is going to do for you, not what you are doing to the object it returned. > > initializing_module is probably a better name than initialized_module, > though (since it isn't initialized yet on entry - instead, that's what > should be the case by the end of the statement) I am willing to compromise to module_to_initialize, module_to_init, or module_to_load. Pick one. 
=) From ncoghlan at gmail.com Wed May 29 16:59:02 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 30 May 2013 00:59:02 +1000 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> Message-ID: On Thu, May 30, 2013 at 12:47 AM, Brett Cannon wrote: > I am willing to compromise to module_to_initialize, module_to_init, or > module_to_load. Pick one. =) I see this as *really* similar to a database transaction, and those start with "session.begin()". Could you tolerate "with begin_module_init(name) as m:"? We could even document the ability to check m.__initializing__ to see whether this is a reload() or not. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rdmurray at bitdance.com Wed May 29 17:58:21 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Wed, 29 May 2013 11:58:21 -0400 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> Message-ID: <20130529155822.54EE6250BCF@webabinitio.net> On Thu, 30 May 2013 00:59:02 +1000, Nick Coghlan wrote: > On Thu, May 30, 2013 at 12:47 AM, Brett Cannon wrote: > > I am willing to compromise to module_to_initialize, module_to_init, or > > module_to_load. Pick one. =) > > I see this as *really* similar to a database transaction, and those > start with "session.begin()". > > Could you tolerate "with begin_module_init(name) as m:"? But for a transaction, it is 'with session', not 'with begin_session'. With 'begin_module_init' I would have no idea what 'm' was. With Brett's 'module_to_init' I have an intuitive idea about what 'm' is. And if 'm' isn't a module, then module_manager would be better. 
(Note that I haven't grokked what Brett's context manager is actually doing/returning, I'm speaking here as an ignorant reader of someone else's code :) > We could even document the ability to check m.__initializing__ to see > whether this is a reload() or not. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/rdmurray%40bitdance.com From solipsis at pitrou.net Wed May 29 18:04:55 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 29 May 2013 18:04:55 +0200 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> <20130529155822.54EE6250BCF@webabinitio.net> Message-ID: <20130529180455.2151c27b@pitrou.net> Le Wed, 29 May 2013 11:58:21 -0400, "R. David Murray" a écrit : > On Thu, 30 May 2013 00:59:02 +1000, Nick Coghlan > wrote: > > On Thu, May 30, 2013 at 12:47 AM, Brett Cannon > > wrote: > > > I am willing to compromise to module_to_initialize, > > > module_to_init, or module_to_load. Pick one. =) > > > > I see this as *really* similar to a database transaction, and those > > start with "session.begin()". > > > > Could you tolerate "with begin_module_init(name) as m:"? > > But for a transaction, it is 'with session', not 'with begin_session'. or "with transaction.begin()", or "with transaction.commit_on_success()", depending on the API :-) > With 'begin_module_init' I would have no idea what 'm' was. With > Brett's 'module_to_init' I have an intuitive idea about what 'm' is. Agreed. Regards Antoine. 
From brett at python.org Wed May 29 18:25:45 2013 From: brett at python.org (Brett Cannon) Date: Wed, 29 May 2013 12:25:45 -0400 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: <20130529155822.54EE6250BCF@webabinitio.net> References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> <20130529155822.54EE6250BCF@webabinitio.net> Message-ID: On Wed, May 29, 2013 at 11:58 AM, R. David Murray wrote: > On Thu, 30 May 2013 00:59:02 +1000, Nick Coghlan wrote: >> On Thu, May 30, 2013 at 12:47 AM, Brett Cannon wrote: >> > I am willing to compromise to module_to_initialize, module_to_init, or >> > module_to_load. Pick one. =) >> >> I see this as *really* similar to a database transaction, and those >> start with "session.begin()". >> >> Could you tolerate "with begin_module_init(name) as m:"? > > But for a transaction, it is 'with session', not 'with begin_session'. > > With 'begin_module_init' I would have no idea what 'm' was. With > Brett's 'module_to_init' I have an intuitive idea about what 'm' is. > And if 'm' isn't a module, then module_manager would be better. > > (Note that I haven't grokked what Brett's context manager is actually > doing/returning, I'm speaking here as an ignorant reader of someone > else's code :) In case you want to suggest a name, the context manager returns the module that should be initialized/loaded. So typical usage will be:: class Loader: def load_module(self, fullname): with importlib.util.module_to_init(fullname) as module: # Load/initialize the module return module Basically the manager either gets the module from sys.modules if it is already there (a reload) or creates a new module and sticks it into sys.modules so other imports will grab the right module object. If there is an exception and the module was new, it deletes it from sys.modules to prevent stale modules from sticking around. 
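[Editorial sketch, not part of the thread.] The behaviour Brett describes above can be approximated in a few lines. This is only a minimal sketch under stated assumptions, not the importlib implementation: the name module_to_init is simply one of the candidates floated in the thread, the helper is written with contextlib.contextmanager for brevity, and the __initializing__ bookkeeping mentioned elsewhere in the discussion is left out. The Loader class and its "answer" attribute are hypothetical and exist only to show usage.

```python
import contextlib
import sys
import types


@contextlib.contextmanager
def module_to_init(name):
    """Yield the module a loader should initialize (hypothetical sketch)."""
    # A name already present in sys.modules means this is a reload.
    is_reload = name in sys.modules
    if is_reload:
        module = sys.modules[name]
    else:
        # New module: publish it early so nested imports see the same object.
        module = types.ModuleType(name)
        sys.modules[name] = module
    try:
        yield module
    except Exception:
        # Only a freshly created module is rolled back; a reloaded module
        # stays in sys.modules even if re-initialization fails.
        if not is_reload:
            sys.modules.pop(name, None)
        raise


class Loader:
    """Toy loader that 'initializes' a module by setting one attribute."""

    def load_module(self, fullname):
        with module_to_init(fullname) as module:
            module.answer = 42  # stand-in for executing the module's code
            return module
```

Under these assumptions, a failed first load leaves no entry behind in sys.modules, while a failed reload leaves the previously loaded module in place, which matches the description in the message above.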
From rdmurray at bitdance.com Wed May 29 18:49:43 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Wed, 29 May 2013 12:49:43 -0400 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> <20130529155822.54EE6250BCF@webabinitio.net> Message-ID: <20130529164944.03EF3250BD1@webabinitio.net> On Wed, 29 May 2013 12:25:45 -0400, Brett Cannon wrote: > In case you want to suggest a name, the context manager returns the > module that should be initialized/loaded. So typical usage will be:: > > class Loader: > def load_module(self, fullname): > with importlib.util.module_to_init(fullname) as module: > # Load/initialize the module > return module > > Basically the manager either gets the module from sys.modules if it is > already there (a reload) or creates a new module and sticks it into > sys.modules so other imports will grab the right module object. If > there is an exception and the module was new, it deletes it from > sys.modules to prevent stale modules from sticking around. So it is a context manager to handle the exception? Now I think I see where Nick is coming from. How about 'managed_initialization'? That seems closer to the 'closing' model, to me. It isn't as clear about what it is returning, though, since you are passing it a name and getting back a module, whereas in the closing case you get back the same object you send in. Perhaps 'managed_module'? 
--David From brett at python.org Wed May 29 18:55:01 2013 From: brett at python.org (Brett Cannon) Date: Wed, 29 May 2013 12:55:01 -0400 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: <20130529164944.03EF3250BD1@webabinitio.net> References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> <20130529155822.54EE6250BCF@webabinitio.net> <20130529164944.03EF3250BD1@webabinitio.net> Message-ID: On Wed, May 29, 2013 at 12:49 PM, R. David Murray wrote: > On Wed, 29 May 2013 12:25:45 -0400, Brett Cannon wrote: >> In case you want to suggest a name, the context manager returns the >> module that should be initialized/loaded. So typical usage will be:: >> >> class Loader: >> def load_module(self, fullname): >> with importlib.util.module_to_init(fullname) as module: >> # Load/initialize the module >> return module >> >> Basically the manager either gets the module from sys.modules if it is >> already there (a reload) or creates a new module and sticks it into >> sys.modules so other imports will grab the right module object. If >> there is an exception and the module was new, it deletes it from >> sys.modules to prevent stale modules from sticking around. > > So it is a context manager to handle the exception? It's to choose the right module (sys.modules or new) and if the module is new and there is an exception to delete it from sys.modules (other small details like setting __initializing__ but that's not important). So both __enter__ and __exit__ have logic. > Now I think I see > where Nick is coming from. > > How about 'managed_initialization'? That seems closer to the 'closing' > model, to me. It isn't as clear about what it is returning, though, > since you are passing it a name and getting back a module, whereas in > the closing case you get back the same object you send in. True. > > Perhaps 'managed_module'? managed_module is better than managed_initialization.
From ericsnowcurrently at gmail.com Wed May 29 20:00:44 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 29 May 2013 12:00:44 -0600 Subject: [Python-Dev] performance testing recommendations in devguide Message-ID: The devguide doesn't have anything on performance testing that I could find. We do have a number of relatively useful resources in this space though, like pybench and (eventually) speed.python.org. I'd like to add a page to the devguide on performance testing, including an explanation of our performance goals, how to test for them, and what tools are available.

Tools I'm aware of:
* pybench (relatively limited in real-world usefulness)
* timeit module (for quick comparisons)
* benchmarks repo (real-world performance test suite)
* speed.python.org (would omit for now)

Things to test:
* speed
* memory (tools? tests?)

Critically sensitive performance subjects
* interpreter start-up time
* module import overhead
* attribute lookup overhead (including MRO traversal)
* function call overhead
* instance creation overhead
* dict performance (the underlying namespace type)
* tuple performance (packing/unpacking, integral container type)
* string performance

What would be important to say in the devguide regarding Python performance and testing it? What would you add/subtract from the above? How important is testing memory performance? How do we avoid performance regressions? Thanks! -eric From solipsis at pitrou.net Wed May 29 20:10:44 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 29 May 2013 20:10:44 +0200 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> <20130529155822.54EE6250BCF@webabinitio.net> <20130529164944.03EF3250BD1@webabinitio.net> Message-ID: <20130529201044.0c27a19e@fsol> On Wed, 29 May 2013 12:55:01 -0400 Brett Cannon wrote: > > Perhaps 'managed_module'?
> > managed_module is better than managed_initialization. I don't understand how it's "managed". "manage", "manager", etc. is the kind of dumb words everybody uses when they don't manage (!) to explain what they're talking about. My vote is for "module_to_init", "uninitialized_module", "pristine_module", etc. Regards Antoine. From barry at python.org Wed May 29 20:20:27 2013 From: barry at python.org (Barry Warsaw) Date: Wed, 29 May 2013 14:20:27 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <20130528200245.4be5532c@fsol> References: <20130524155629.7597bdb0@anarchist> <20130525095316.010f2b34@fsol> <20130528132718.2e0410be@anarchist> <20130528200245.4be5532c@fsol> Message-ID: <20130529142027.1d372f02@anarchist> On May 28, 2013, at 08:02 PM, Antoine Pitrou wrote: >On Tue, 28 May 2013 13:27:18 -0400 >Barry Warsaw wrote: >> On May 25, 2013, at 09:53 AM, Antoine Pitrou wrote: >> >> >How about always running the version specific targets, e.g. >> >nosetests-2.7? >> >> We have nosetests-2.7 and nosetests3 in /usr/bin, but we generally recommend >> folks not use these, especially for things like (build time) package tests. >> It's harder to iterate over when the installed versions are unknown >> statically, e.g. if you wanted to run all the tests over all available >> versions of Python. > >It sounds like you want a dedicated script or utility for this ("run >all the tests over all available versions of Python") rather than hack >it every time you package a Python library. There is some support for this in some of the Debian helpers, e.g. pybuild, but that's not in widespread use. One problem is that there's no definitive way to know how to run a package's test suite (and won't be even in after PEP 426). I tried to generate some momentum around trying to standardize this, but it didn't get anywhere. Still, that's just one small aspect of the problem. >Your use case also doesn't seem to impact end-users. Depends on who the end-users are. 
It definitely impacts developers. >> This is why I would really like to see all scripts provide a -m equivalent >> for command line invocation. This might be a little awkward for < Python >> 2.7 (where IIRC -m doesn't work with packages). > >Do you still support Python < 2.7? Debian still does support 2.6, but hopefully not for long! -Barry From barry at python.org Wed May 29 20:34:18 2013 From: barry at python.org (Barry Warsaw) Date: Wed, 29 May 2013 14:34:18 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <20130528192325.GJ2038@unaka.lan> References: <20130524155629.7597bdb0@anarchist> <20130527183836.GG2038@unaka.lan> <20130528132201.68ed6f98@anarchist> <20130528192325.GJ2038@unaka.lan> Message-ID: <20130529143418.43cf715b@anarchist> On May 28, 2013, at 12:23 PM, Toshio Kuratomi wrote: >> Note that the Gentoo example also takes into account versions that might act >> differently based on the interpreter's implementation. So a -python3 suffix >> may not be enough. Maybe now we're getting into PEP 425 compatibility tag >> territory. >> > This is an interesting, unmapped area in Fedora at the moment... I >was hoping to talk to Nick and the Fedora python maintainer at our next >Fedora conference. > >I've been looking at how Fedora's ruby guidelines are implemented wrt >alternate interpreters and wondering if we could do something similar for >python: > >https://fedoraproject.org/wiki/Packaging:Ruby#Different_Interpreters_Compatibility Very interesting. It was something like this, albeit replacing _jruby_ or _mri_ with something like --py2 or --py3 that I had in mind. However... 
>I'm not sure yet how much of that I'd (or Nick and the python maintainer >[bkabrda, the current python maintainer is the one who wrote the rubypick >script]) would want to use in python -- replacing /usr/bin/python with a >script that chooses between CPython and pypy based on user preference gave >me an instinctual feeling of dread the first time I looked at it but it >seems to be working well for the ruby folks. ... it kind of gives me the heebie-jeebies too. I think *most* scripts wouldn't need this kind of variability though. For example, lsb_release only needs to run with one version of Python, so: % head -1 /usr/bin/lsb_release #! /usr/bin/python3 -Es is just fine. I wouldn't want to replace /usr/bin/python with a selectable interpreter (see also PEP 394), but if we had something like /usr/bin/multipy which acted like rubypick for the few, very limited examples where it's needed, then it might be useful to do so. I would very definitely want to get consensus on the mechanism and api between the various Linux distros here so it works the same on F/RH, D/U, Gentoo and any others. >My current feeling is that I wouldn't use this same system for interpreters >which are not mostly compatible (for instance, python2 vs python3). but I >also haven't devoted much actual time to thinking about whether that might >have some advantages. Seems like for Python, that would be the most important use case, but maybe I have blinders on for the issue at hand. -Barry
From barry at python.org Wed May 29 20:38:16 2013 From: barry at python.org (Barry Warsaw) Date: Wed, 29 May 2013 14:38:16 -0400 Subject: [Python-Dev] Bilingual scripts In-Reply-To: References: <20130524155629.7597bdb0@anarchist> <20130527183836.GG2038@unaka.lan> <20130528132201.68ed6f98@anarchist> <20130528192325.GJ2038@unaka.lan> Message-ID: <20130529143816.448c0d1a@anarchist> On May 29, 2013, at 01:01 PM, Nick Coghlan wrote: >PEP 432 is also related, as it includes the "pysystem" proposal [1] >(an alternate Python CLI that will default to -Es behaviour, but is >otherwise similar to the standard "python" interpreter). I *knew* this was being specified somewhere, but I couldn't find it in either the tracker or PEP summary. As an aside Nick, what do you think about splitting the pysystem proposal out of PEP 432? I think they could certainly live as independent PEPs albeit perhaps the pysystem one dependent on 432. >The rest of the discussion though makes me think we may actually need >a *nix equivalent of PEP 397 (which describes the "py" launcher we >created to get around the limitations of Windows file associations). Perhaps! >Between that and the interpreter identification mechanism defined for >the PEP 425 compatibility tags it should be possible to come up with >an upstream solution for 3.4 that the distros can backport to work >with earlier versions (similar to the way users can download the >Windows launcher directly from >https://bitbucket.org/pypa/pylauncher/downloads even though we only >started shipping it upstream as part of the Python 3.3 installer) We're getting pretty close to a real idea here. :) -Barry From rdmurray at bitdance.com Wed May 29 20:43:12 2013 From: rdmurray at bitdance.com (R.
David Murray) Date: Wed, 29 May 2013 14:43:12 -0400 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: <20130529201044.0c27a19e@fsol> References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> <20130529155822.54EE6250BCF@webabinitio.net> <20130529164944.03EF3250BD1@webabinitio.net> <20130529201044.0c27a19e@fsol> Message-ID: <20130529184312.9776E250BD5@webabinitio.net> On Wed, 29 May 2013 20:10:44 +0200, Antoine Pitrou wrote: > On Wed, 29 May 2013 12:55:01 -0400 > Brett Cannon wrote: > > > Perhaps 'managed_module'? > > > > managed_module is better than managed_initialization. > > I don't understand how it's "managed". "manage", "manager", etc. is the > kind of dumb words everybody uses when they don't manage (!) to explain > what they're talking about. > > My vote is for "module_to_init", "uninitialized_module", > "pristine_module", etc. I don't really have a horse in this race (that is, whatever is chosen, my vote will be 0 on it unless someone comes up with something brilliant :), but I'll just point out that those names do not give any clue as to why the thing is a context manager instead of a function that just returns the uninitialized module. --David From rdmurray at bitdance.com Wed May 29 20:56:46 2013 From: rdmurray at bitdance.com (R. David Murray) Date: Wed, 29 May 2013 14:56:46 -0400 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: <20130529201044.0c27a19e@fsol> References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> <20130529155822.54EE6250BCF@webabinitio.net> <20130529164944.03EF3250BD1@webabinitio.net> <20130529201044.0c27a19e@fsol> Message-ID: <20130529185647.37A7C250BD3@webabinitio.net> On Wed, 29 May 2013 20:10:44 +0200, Antoine Pitrou wrote: > On Wed, 29 May 2013 12:55:01 -0400 > Brett Cannon wrote: > > > Perhaps 'managed_module'? 
> > > > managed_module is better than managed_initialization. > > I don't understand how it's "managed". "manage", "manager", etc. is the > kind of dumb words everybody uses when they don't manage (!) to explain > what they're talking about. > > My vote is for "module_to_init", "uninitialized_module", > "pristine_module", etc. Actually, you are right, 'managed_module' isn't much if any better than those. Our problem is that there are two concepts we are trying to cram into one name: what the context manager is managing, and the object that the context manager gives you on entry to the with block. There probably isn't a good answer. I suppose that one approach would be to have a module_initializer context manager return self and then separately call a method on it to actually load the module inside the with body. But adding more typing to solve a naming issue seems...odd. --David From carlosnepomuceno at outlook.com Wed May 29 20:59:21 2013 From: carlosnepomuceno at outlook.com (Carlos Nepomuceno) Date: Wed, 29 May 2013 21:59:21 +0300 Subject: [Python-Dev] performance testing recommendations in devguide In-Reply-To: References: Message-ID: ---------------------------------------- > Date: Wed, 29 May 2013 12:00:44 -0600 > From: ericsnowcurrently at gmail.com > To: python-dev at python.org > Subject: [Python-Dev] performance testing recommendations in devguide > > The devguide doesn't have anything on performance testing that I could > find. We do have a number of relatively useful resources in this > space though, like pybench and (eventually) speed.python.org. I'd > like to add a page to the devguide on performance testing, including > an explanation of our performance goals, how to test for them, and > what tools are available. Thanks Eric! I was looking for that kind of place!
;) > Tools I'm aware of: > * pybench (relatively limited in real-world usefulness) > * timeit module (for quick comparisons) > * benchmarks repo (real-world performance test suite) > * speed.python.org (would omit for now) Why isn't PyBench considered reliable [1]? What do you mean by "benchmarks repo"? http://hg.python.org/benchmarks ? > Things to test: > * speed > * memory (tools? tests?) > > Critically sensitive performance subjects > * interpreter start-up time > * module import overhead > * attribute lookup overhead (including MRO traversal) > * function call overhead > * instance creation overhead > * dict performance (the underlying namespace type) > * tuple performance (packing/unpacking, integral container type) > * string performance > > What would be important to say in the devguide regarding Python > performance and testing it? I've just discovered insertion at the end is faster than at the start of a list. I'd like to see things like that not only in the devguide but also in the docs (http://docs.python.org/). I found it in Dan's presentation [2] but I'm not sure it isn't in the docs somewhere. > What would you add/subtract from the > above? Threading performance! > How important is testing memory performance? How do we avoid > performance regressions? Thanks! Testing and making it faster! ;) Of course we need a baseline (benchmarks database) to compare and check improvements. > -eric [1] "pybench - run the standard Python PyBench benchmark suite. This is considered an unreliable, unrepresentative benchmark; do not base decisions off it. It is included only for completeness."
Source: http://hg.python.org/benchmarks/file/dccd52b95a71/README.txt [2] http://stromberg.dnsalias.org/~dstromberg/Intro-to-Python/Intro%20to%20Python%202010.pdf From solipsis at pitrou.net Wed May 29 21:18:01 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 29 May 2013 21:18:01 +0200 Subject: [Python-Dev] performance testing recommendations in devguide References: Message-ID: <20130529211801.4339de36@fsol> Hi, On Wed, 29 May 2013 12:00:44 -0600 Eric Snow wrote: > The devguide doesn't have anything on performance testing that I could > find. See http://bugs.python.org/issue17449 > Tools I'm aware of: > * pybench (relatively limited in real-world usefulness) > * timeit module (for quick comparisons) > * benchmarks repo (real-world performance test suite) > * speed.python.org (would omit for now) > > Things to test: > * speed > * memory (tools? tests?) You can use the "-m" option to perf.py. > Critically sensitive performance subjects > * interpreter start-up time There are startup tests in the benchmark suite. > * module import overhead > * attribute lookup overhead (including MRO traversal) > * function call overhead > * instance creation overhead > * dict performance (the underlying namespace type) > * tuple performance (packing/unpacking, integral container type) > * string performance These are all micro-benchmark fodder rather than high-level concerns (e.g. "startup time" is a high-level concern potentially impacted by "module import overhead", but only if the latter is a significant contributor to startup time). > How do we avoid performance regressions? Right now we don't have any automated way to detect them. Regards Antoine.
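As an illustration of the "quick comparisons" use of timeit from Eric's list, the list-insertion observation Carlos mentions could be checked with something like the sketch below. The helper names (append_at_end, insert_at_start, best_of) and the sizes are made up for this example. In line with Antoine's caveat, this is micro-benchmark fodder: it shows one way to take such a measurement, not a representative benchmark.

```python
import timeit


def append_at_end(n):
    """Build a list of n items by appending at the end."""
    items = []
    for i in range(n):
        items.append(i)
    return items


def insert_at_start(n):
    """Build a list of n items by inserting at index 0."""
    items = []
    for i in range(n):
        items.insert(0, i)
    return items


def best_of(func, n, repeat=3, number=10):
    """Return the best total time (seconds) over `repeat` runs."""
    return min(timeit.repeat(lambda: func(n), repeat=repeat, number=number))


if __name__ == "__main__":
    n = 2000
    print("append at end:  ", best_of(append_at_end, n))
    print("insert at start:", best_of(insert_at_start, n))
```

Taking the minimum over several repeats follows the usual timeit advice, since higher timings mostly reflect interference from other processes rather than the code under test.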
From solipsis at pitrou.net Wed May 29 21:19:43 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 29 May 2013 21:19:43 +0200 Subject: [Python-Dev] performance testing recommendations in devguide References: Message-ID: <20130529211943.0d2390fc@fsol> Hi, On Wed, 29 May 2013 21:59:21 +0300 Carlos Nepomuceno wrote: > > [1] "pybench - run the standard Python PyBench benchmark suite. This is considered > an unreliable, unrepresentative benchmark; do not base decisions > off it. It is included only for completeness." "unrepresentative" is the main criticism against pybench. PyBench is a suite of micro-benchmarks (almost nano-benchmarks, actually :-)) that don't try to simulate any real-world situation. PyBench may also be unreliable, because its tests are so static that they could be optimized away by a clever enough (JIT) compiler. Regards Antoine. From fijall at gmail.com Wed May 29 22:04:34 2013 From: fijall at gmail.com (Maciej Fijalkowski) Date: Wed, 29 May 2013 22:04:34 +0200 Subject: [Python-Dev] performance testing recommendations in devguide In-Reply-To: <20130529211943.0d2390fc@fsol> References: <20130529211943.0d2390fc@fsol> Message-ID: On Wed, May 29, 2013 at 9:19 PM, Antoine Pitrou wrote: > > Hi, > > On Wed, 29 May 2013 21:59:21 +0300 > Carlos Nepomuceno wrote: >> >> [1] "pybench - run the standard Python PyBench benchmark suite. This is considered >> an unreliable, unrepresentative benchmark; do not base decisions >> off it. It is included only for completeness." > > "unrepresentative" is the main criticism against pybench. PyBench is a > suite of micro-benchmarks (almost nano-benchmarks, actually :-)) that > don't try to simulate any real-world situation. > > PyBench may also be unreliable, because its tests are so static that > they could be optimized away by a clever enough (JIT) compiler. > > Regards > > Antoine. For what it's worth, PyBench is bad because it's micro-only.
A lot of stuff only shows up in larger examples, especially on an optimizing compiler. The proposed list also contains only micro-benchmarks, which will have the exact same problem as pybench. From brett at python.org Wed May 29 22:22:39 2013 From: brett at python.org (Brett Cannon) Date: Wed, 29 May 2013 16:22:39 -0400 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: <20130529185647.37A7C250BD3@webabinitio.net> References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> <20130529155822.54EE6250BCF@webabinitio.net> <20130529164944.03EF3250BD1@webabinitio.net> <20130529201044.0c27a19e@fsol> <20130529185647.37A7C250BD3@webabinitio.net> Message-ID: On Wed, May 29, 2013 at 2:56 PM, R. David Murray wrote: > On Wed, 29 May 2013 20:10:44 +0200, Antoine Pitrou wrote: >> On Wed, 29 May 2013 12:55:01 -0400 >> Brett Cannon wrote: >> > > Perhaps 'managed_module'? >> > >> > managed_module is better than managed_initialization. >> >> I don't understand how it's "managed". "manage", "manager", etc. is the >> kind of dumb words everybody uses when they don't manage (!) to explain >> what they're talking about. >> >> My vote is for "module_to_init", "uninitialized_module", >> "pristine_module", etc. I don't like uninitialized_module or pristine_module as that isn't guaranteed thanks to reloading; seems misleading. > > Actually, you are right, 'managed_module' isn't much if any better > than those. > > Our problem is that there are two concepts we are trying to cram into > one name: what the context manager is managing, and the object that the > context manager gives you on entry to the with block. There probably > isn't a good answer. > > I suppose that one approach would be to have a module_initializer context > manager return self and then separately call a method on it to actually > load the module inside the with body. But adding more typing to solve > a naming issue seems...odd.
That would make me feel icky, so I won't do it. So module_to_init it is unless someone can convince me the bikeshed is a different colour. From mal at egenix.com Wed May 29 22:36:52 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 29 May 2013 22:36:52 +0200 Subject: [Python-Dev] performance testing recommendations in devguide In-Reply-To: <20130529211943.0d2390fc@fsol> References: <20130529211943.0d2390fc@fsol> Message-ID: <51A666E4.1000505@egenix.com> On 29.05.2013 21:19, Antoine Pitrou wrote: > > Hi, > > On Wed, 29 May 2013 21:59:21 +0300 > Carlos Nepomuceno wrote: >> >> [1] "pybench - run the standard Python PyBench benchmark suite. This is considered >> an unreliable, unrepresentative benchmark; do not base decisions >> off it. It is included only for completeness." > > "unrepresentative" is the main criticism against pybench. PyBench is a > suite of micro-benchmarks (almost nano-benchmarks, actually :-)) that > don't try to simulate any real-world situation. > > PyBench may also be unreliable, because its tests are so static that > they could be optimized away by a clever enough (JIT) compiler. Correct. pybench was written to test and verify CPython interpreter optimizations and also to detect changes which resulted in performance degradation of very basic operations such as attribute lookups, method calls, simple integer math, etc. It was never meant to be representative of anything :-) At the time, we only had pystone as "benchmark" and things like high precision timers were not yet readily available as they are now. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, May 29 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-07-01: EuroPython 2013, Florence, Italy ... 
33 days to go ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From storchaka at gmail.com Wed May 29 22:50:27 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Wed, 29 May 2013 23:50:27 +0300 Subject: [Python-Dev] performance testing recommendations in devguide In-Reply-To: References: Message-ID: 29.05.13 21:00, Eric Snow wrote: > Critically sensitive performance subjects > * interpreter start-up time > * module import overhead > * attribute lookup overhead (including MRO traversal) > * function call overhead > * instance creation overhead > * dict performance (the underlying namespace type) > * tuple performance (packing/unpacking, integral container type) > * string performance * regular expressions performance * IO performance From ericsnowcurrently at gmail.com Thu May 30 07:33:11 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 29 May 2013 23:33:11 -0600 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: <20130529164944.03EF3250BD1@webabinitio.net> References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> <20130529155822.54EE6250BCF@webabinitio.net> <20130529164944.03EF3250BD1@webabinitio.net> Message-ID: On Wed, May 29, 2013 at 10:49 AM, R. David Murray wrote: > Perhaps 'managed_module'? I was just thinking the same thing.
-eric From ericsnowcurrently at gmail.com Thu May 30 08:10:07 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 30 May 2013 00:10:07 -0600 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> <20130529155822.54EE6250BCF@webabinitio.net> <20130529164944.03EF3250BD1@webabinitio.net> <20130529201044.0c27a19e@fsol> <20130529185647.37A7C250BD3@webabinitio.net> Message-ID: On Wed, May 29, 2013 at 2:22 PM, Brett Cannon wrote: > So module_to_init it is unless someone can convince me the bikeshed is > a different colour. Whatever the name is, it should reflect what is happening during the with statement, and more particularly that the thing will end at the end of the with statement. managed_module() seems fine to me though it could still imply the lifetime of the module rather than the management. During the with statement the module is managed, and I expect it's clear that the management is relative to the import system. However, it could also make sense to split the function into two pieces: getting the module and handling it properly in the face of exceptions in a with statement. So, importlib.util.get_module() and ModuleType.managed():

    class Loader:
        def load_module(self, fullname):
            module = importlib.util.get_module(fullname)
            with module.managed():
                # Load/initialize the module
                return module

If ModuleType.managed() returned the module, you could do it on one line:

    class Loader:
        def load_module(self, fullname):
            with importlib.util.get_module(fullname).managed() as module:
                # Load/initialize the module
                return module

On second thought, that "one-liner" is a little too busy.
And if it's a problem as a method on ModuleType, make it importlib.util.managed_module():

    class Loader:
        def load_module(self, fullname):
            module = importlib.util.get_module(fullname)
            with importlib.util.managed_module(module):
                # Load/initialize the module
                return module

It would be nice to have both parts in one function. It would be less boilerplate for the "common" case that way, is easier to read, and eliminates the risk of someone not realizing they need both parts. However, I'm not sure it buys us that much, the separate-part approach helps keep the two concepts distinct (for better or for worse), and each piece could be separately useful. Maybe have a third function that wraps the other two or have managed_module() accept strings (and then call get_module() internally). -eric From mark at hotpy.org Thu May 30 10:34:09 2013 From: mark at hotpy.org (Mark Shannon) Date: Thu, 30 May 2013 09:34:09 +0100 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> Message-ID: <51A70F01.8040602@hotpy.org> On 29/05/13 01:14, Brett Cannon wrote: > On Tue, May 28, 2013 at 5:40 PM, Antoine Pitrou wrote: >> On Tue, 28 May 2013 23:29:46 +0200 (CEST) >> brett.cannon wrote: >>> >>> +.. class:: ModuleManager(name) >>> + >>> + A :term:`context manager` which provides the module to load. The module will >>> + either come from :attr:`sys.modules` in the case of reloading or a fresh >>> + module if loading a new module. Proper cleanup of :attr:`sys.modules` occurs >>> + if the module was new and an exception was raised. >> >> What use case does this API solve? > > See http://bugs.python.org/issue18088 for the other part of this > story. I'm basically replacing what importlib.util.module_for_loader > does after I realized there is no way in a subclass to override > what/how attributes are set on a module before the code object is > executed.
Instead of using the decorator people will be able to use > this context manager with a new method to get the same effect with the > ability to better control attribute initialization. > >> (FWIW, I think "ModuleManager" is a rather bad name :-) +1. XxxManager is what Java programmers call their classes when they are forced to have an unnecessary class because they don't have 1st class functions or modules. (I don't like 'Context Manager' either, but it's too late to change it :( ) > > I'm open to suggestions, but the thing does manage the module so it at > least makes sense. But what do you mean by managing? 'Manage' has many meanings. Once you've answered that question you should have your name. Cheers, Mark. From asolano at icai.es Thu May 30 10:42:25 2013 From: asolano at icai.es (Alfredo Solano) Date: Thu, 30 May 2013 10:42:25 +0200 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: <51A70F01.8040602@hotpy.org> References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> <51A70F01.8040602@hotpy.org> Message-ID: <51A710F1.5090207@icai.es> Hi, What about ModuleProxy? From the dictionary: prox?y /?pr?ks?/ Noun The authority to represent someone else, esp. in voting. A person authorized to act on behalf of another. Synonyms deputy - representative - agent - substitute Alfredo On 05/30/2013 10:34 AM, Mark Shannon wrote: > > > On 29/05/13 01:14, Brett Cannon wrote: >> On Tue, May 28, 2013 at 5:40 PM, Antoine Pitrou >> wrote: >>> On Tue, 28 May 2013 23:29:46 +0200 (CEST) >>> brett.cannon wrote: >>>> >>>> +.. class:: ModuleManager(name) >>>> + >>>> + A :term:`context manager` which provides the module to load. The >>>> module will >>>> + either come from :attr:`sys.modules` in the case of reloading or >>>> a fresh >>>> + module if loading a new module. Proper cleanup of >>>> :attr:`sys.modules` occurs >>>> + if the module was new and an exception was raised. 
>>> >>> What use case does this API solve? >> >> See http://bugs.python.org/issue18088 for the other part of this >> story. I'm basically replacing what importlib.util.module_for_loader >> does after I realized there is no way in a subclass to override >> what/how attributes are set on a module before the code object is >> executed. Instead of using the decorator people will be able to use >> this context manager with a new method to get the same effect with the >> ability to better control attribute initialization. >> >>> (FWIW, I think "ModuleManager" is a rather bad name :-) > > +1. XxxManager is what Java programmers call their classes when they > are forced to have an > unnecessary class because they don't have 1st class functions or modules. > > (I don't like 'Context Manager' either, but it's too late to change it > :( ) > >> >> I'm open to suggestions, but the thing does manage the module so it at >> least makes sense. > > But what do you mean by managing? 'Manage' has many meanings. > Once you've answered that question you should have your name. > > Cheers, > Mark.
> _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/asolano%40icai.es From ncoghlan at gmail.com Thu May 30 10:53:01 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 30 May 2013 18:53:01 +1000 Subject: [Python-Dev] Bilingual scripts In-Reply-To: <20130529143816.448c0d1a@anarchist> References: <20130524155629.7597bdb0@anarchist> <20130527183836.GG2038@unaka.lan> <20130528132201.68ed6f98@anarchist> <20130528192325.GJ2038@unaka.lan> <20130529143816.448c0d1a@anarchist> Message-ID: On 30 May 2013 04:40, "Barry Warsaw" wrote: > > On May 29, 2013, at 01:01 PM, Nick Coghlan wrote: > > >PEP 432 is also related, as it includes the "pysystem" proposal [1] > >(an alternate Python CLI that will default to -Es behaviour, but is > >otherwise similar to the standard "python" interpreter). > > I *knew* this was being specified somewhere, but I couldn't find it in either > the tracker or PEP summary. As an aside Nick, what do you think about > splitting the pysystem proposal out of PEP 432? I think they could certainly > live as independent PEPs albeit perhaps the pysystem one dependent on 432. Sure. You could probably even implement it without PEP 432, it would just be somewhat painful to replicate the current CLI behaviour. Cheers, Nick. > > > >The rest of the discussion though makes me think we may actually need > >a *nix equivalent of PEP 397 (which describes the "py" launcher we > >created to get around the limitations of Windows file associations). > > Perhaps! 
> > >Between that and the interpreter identification mechanism defined for > >the PEP 425 compatibility tags it should be possible to come up with > >an upstream solution for 3.4 that the distros can backport to work > >with earlier versions (similar to the way users can download the > >Windows launcher directly from > >https://bitbucket.org/pypa/pylauncher/downloads even though we only > >started shipping it upstream as part of the Python 3.3 installer) > > We're getting pretty close to a real idea here. :) > > -Barry > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Thu May 30 11:01:10 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 30 May 2013 19:01:10 +1000 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> <20130529155822.54EE6250BCF@webabinitio.net> <20130529164944.03EF3250BD1@webabinitio.net> <20130529201044.0c27a19e@fsol> <20130529185647.37A7C250BD3@webabinitio.net> Message-ID: On 30 May 2013 06:25, "Brett Cannon" wrote: > > On Wed, May 29, 2013 at 2:56 PM, R. David Murray wrote: > > On Wed, 29 May 2013 20:10:44 +0200, Antoine Pitrou wrote: > >> On Wed, 29 May 2013 12:55:01 -0400 > >> Brett Cannon wrote: > >> > > Perhaps 'managed_module'? > >> > > >> > managed_module is better than managed_initialization. > >> > >> I don't understand how it's "managed". "manage", "manager", etc. is the > >> kind of dumb words everybody uses when they don't manage (!) to explain > >> what they're talking about. > >> > >> My vote is for "module_to_init", "uninitialized_module", > >> "pristine_module", etc. 
>
> I don't like uninitialized_module or pristine_module as that isn't
> guaranteed thanks to reloading; seems misleading.
>
> >
> > Actually, you are right, 'managed_module' isn't much if any better
> > than those.
> >
> > Our problem is that there are two concepts we are trying to cram into
> > one name: what the context manager is managing, and the object that the
> > context manager gives you on entry to the with block. There probably
> > isn't a good answer.
> >
> > I suppose that one approach would be to have a module_initializer context
> > manager return self and then separately call a method on it to actually
> > load the module inside the with body. But adding more typing to solve
> > a naming issue seems...odd.
>
> That would make me feel icky, so I won't do it.
>
> So module_to_init it is unless someone can convince me the bikeshed is
> a different colour.

+1 to that bikeshed colour. It covers what we're returning (a module) and
what we plan to do with it that needs a with statement (initialising it).

Cheers,
Nick.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lukasz at langa.pl Thu May 30 13:08:23 2013
From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=)
Date: Thu, 30 May 2013 13:08:23 +0200
Subject: [Python-Dev] Segmentation fault on 3.4 with --pydebug
Message-ID: <0E175BCC-6CD5-479D-A7A9-B07F98F3A78F@langa.pl>

This happens after Benjamin's changes in 83937. Anybody else seeing this?
Intel i5 2.4 GHz, Mac OS X 10.8.3, clang $ hg up default $ make distclean $ MACOSX_DEPLOYMENT_TARGET=10.8 ./configure --with-pydebug $ make $ ./python.exe -Wd -m test.regrtest test_exceptions [1/1] test_exceptions Fatal Python error: Segmentation fault Current thread 0x00007fff74254180: File "/Users/ambv/Documents/Projekty/Python/cpython/py34/Lib/test/test_exceptions.py", line 453 in f File "/Users/ambv/Documents/Projekty/Python/cpython/py34/Lib/test/test_exceptions.py", line 453 in f File "/Users/ambv/Documents/Projekty/Python/cpython/py34/Lib/test/test_exceptions.py", line 453 in f ... (repeated a 100 times) Command terminated abnormally. Everything runs fine without --with-pydebug (or before 83937 with --with-pydebug). -- Best regards, ?ukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From dmi.baranov at gmail.com Thu May 30 13:25:35 2013 From: dmi.baranov at gmail.com (Dmitriy Baranov) Date: Thu, 30 May 2013 14:25:35 +0300 Subject: [Python-Dev] Segmentation fault on 3.4 with --pydebug In-Reply-To: <0E175BCC-6CD5-479D-A7A9-B07F98F3A78F@langa.pl> References: <0E175BCC-6CD5-479D-A7A9-B07F98F3A78F@langa.pl> Message-ID: No for me: $ ./python -Wd -m test.regrtest test_exceptions [1/1] test_exceptions 1 test OK. $ uname -a Linux 3.2.0-32-generic #51-Ubuntu SMP Wed Sep 26 21:32:50 UTC 2012 i686 i686 i386 GNU/Linux Please look at issue18075 2013/5/30 ?ukasz Langa : > This happens after Benjamin's changes in 83937. Anybody else seeing this? 
> > Intel i5 2.4 GHz, Mac OS X 10.8.3, clang > > $ hg up default > $ make distclean > $ MACOSX_DEPLOYMENT_TARGET=10.8 ./configure --with-pydebug > $ make > $ ./python.exe -Wd -m test.regrtest test_exceptions > [1/1] test_exceptions > Fatal Python error: Segmentation fault > > Current thread 0x00007fff74254180: > File > "/Users/ambv/Documents/Projekty/Python/cpython/py34/Lib/test/test_exceptions.py", > line 453 in f > File > "/Users/ambv/Documents/Projekty/Python/cpython/py34/Lib/test/test_exceptions.py", > line 453 in f > File > "/Users/ambv/Documents/Projekty/Python/cpython/py34/Lib/test/test_exceptions.py", > line 453 in f > ... (repeated a 100 times) > Command terminated abnormally. > > > > Everything runs fine without --with-pydebug (or before 83937 with > --with-pydebug). > > -- > Best regards, > ?ukasz Langa > > WWW: http://lukasz.langa.pl/ > Twitter: @llanga > IRC: ambv on #python-dev > > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/dmi.baranov%40gmail.com > From a.cavallo at cavallinux.eu Thu May 30 13:53:19 2013 From: a.cavallo at cavallinux.eu (a.cavallo at cavallinux.eu) Date: Thu, 30 May 2013 13:53:19 +0200 Subject: [Python-Dev] Segmentation fault on 3.4 with --pydebug In-Reply-To: <0E175BCC-6CD5-479D-A7A9-B07F98F3A78F@langa.pl> References: <0E175BCC-6CD5-479D-A7A9-B07F98F3A78F@langa.pl> Message-ID: <2d94e32be4ab62abd5c2e9fd90be6ef6@cavallinux.eu> What's the stack trace? $> gdb --args ./python.exe -Wd -m test.regrtest test_exceptions and once in gdb: gdb> bt That should point on where it happened. I hope this help On 2013-05-30 13:08, ?ukasz Langa wrote: > This happens after Benjamin's changes in 83937. Anybody else seeing > this? 
> > Intel i5 2.4 GHz, Mac OS X 10.8.3, clang > > $ hg up default > $ make distclean > $ MACOSX_DEPLOYMENT_TARGET=10.8 ./configure --with-pydebug > > $ make > $ ./python.exe -Wd -m test.regrtest test_exceptions > > [1/1] test_exceptions > Fatal Python error: Segmentation fault > > Current thread 0x00007fff74254180: > File > > "/Users/ambv/Documents/Projekty/Python/cpython/py34/Lib/test/test_exceptions.py", > line 453 in f > File > > "/Users/ambv/Documents/Projekty/Python/cpython/py34/Lib/test/test_exceptions.py", > line 453 in f > File > > "/Users/ambv/Documents/Projekty/Python/cpython/py34/Lib/test/test_exceptions.py", > line 453 in f > ... (repeated a 100 times) > Command terminated abnormally. > > Everything runs fine without --with-pydebug (or before 83937 with > --with-pydebug). From ronaldoussoren at mac.com Thu May 30 14:45:38 2013 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 30 May 2013 14:45:38 +0200 Subject: [Python-Dev] Segmentation fault on 3.4 with --pydebug In-Reply-To: <0E175BCC-6CD5-479D-A7A9-B07F98F3A78F@langa.pl> References: <0E175BCC-6CD5-479D-A7A9-B07F98F3A78F@langa.pl> Message-ID: <23E046BA-E625-4D49-BA1A-41F3DDB6AE84@mac.com> On 30 May, 2013, at 13:08, ?ukasz Langa wrote: > This happens after Benjamin's changes in 83937. Anybody else seeing this? > > Intel i5 2.4 GHz, Mac OS X 10.8.3, clang > > $ hg up default > $ make distclean > $ MACOSX_DEPLOYMENT_TARGET=10.8 ./configure --with-pydebug > $ make > $ ./python.exe -Wd -m test.regrtest test_exceptions > [1/1] test_exceptions > Fatal Python error: Segmentation fault > > Current thread 0x00007fff74254180: > File "/Users/ambv/Documents/Projekty/Python/cpython/py34/Lib/test/test_exceptions.py", line 453 in f > File "/Users/ambv/Documents/Projekty/Python/cpython/py34/Lib/test/test_exceptions.py", line 453 in f > File "/Users/ambv/Documents/Projekty/Python/cpython/py34/Lib/test/test_exceptions.py", line 453 in f > ... (repeated a 100 times) > Command terminated abnormally. 
> > > > Everything runs fine without --with-pydebug (or before 83937 with --with-pydebug). Issue #18075 contains a patch. I probably won't have time to commit until sunday, but feel free to apply the patch yourself :-) Ronald From ron3200 at gmail.com Thu May 30 13:50:51 2013 From: ron3200 at gmail.com (Ron Adam) Date: Thu, 30 May 2013 06:50:51 -0500 Subject: [Python-Dev] cpython: Introduce importlib.util.ModuleManager which is a context manager to In-Reply-To: <51A70F01.8040602@hotpy.org> References: <3bKp9Q5tqHz7LkF@mail.python.org> <20130528234038.323899c7@fsol> <51A70F01.8040602@hotpy.org> Message-ID: On 05/30/2013 03:34 AM, Mark Shannon wrote: > > > On 29/05/13 01:14, Brett Cannon wrote: >> On Tue, May 28, 2013 at 5:40 PM, Antoine Pitrou wrote: >>> On Tue, 28 May 2013 23:29:46 +0200 (CEST) >>> brett.cannon wrote: >>>> >>>> +.. class:: ModuleManager(name) >>>> + >>>> + A :term:`context manager` which provides the module to load. The >>>> module will >>>> + either come from :attr:`sys.modules` in the case of reloading or a >>>> fresh >>>> + module if loading a new module. Proper cleanup of >>>> :attr:`sys.modules` occurs >>>> + if the module was new and an exception was raised. >>> (FWIW, I think "ModuleManager" is a rather bad name :-) > +1. XxxManager is what Java programmers call their classes when they are > forced to have an > unnecessary class because they don't have 1st class functions or modules. > > (I don't like 'Context Manager' either, but it's too late to change it :( ) >> I'm open to suggestions, but the thing does manage the module so it at >> least makes sense. > > But what do you mean by managing? 'Manage' has many meanings. > Once you've answered that question you should have your name. It manages the context, as in the above reference to context manager. In this case the context is the loading and unloading of a module. Having context managers names end with manager helps indicate how it's used. But other verb+er combinations also work. 
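To make "the context is the loading and unloading of a module" concrete, here is a rough sketch of a context manager with the behaviour the quoted doc text describes. It is not the importlib code from the commit: the function name borrows the module_to_init suggestion made elsewhere in this thread, the body is an assumption based only on that doc text, and "demo_mod" is an invented module name.

```python
import contextlib
import sys
import types


@contextlib.contextmanager
def module_to_init(name):
    """Provide the module to load, cleaning up sys.modules on failure."""
    module = sys.modules.get(name)
    is_reload = module is not None
    if not is_reload:
        # Fresh module: publish it in sys.modules before the body runs.
        module = types.ModuleType(name)
        sys.modules[name] = module
    try:
        yield module
    except Exception:
        # Only a module created here is removed; a reloaded one is kept.
        if not is_reload:
            sys.modules.pop(name, None)
        raise


# Hypothetical usage: "demo_mod" is an invented module name.
with module_to_init("demo_mod") as mod:
    mod.answer = 42
```

In this sketch a loader would set attributes and execute the module's code inside the with block; if that raises for a new module, the half-initialised entry is dropped from sys.modules again.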
Taking a hint from the first few words of the __doc__ string gives us an
obvious alternative.

     ModuleProvider

Cheers,
   Ron

From benjamin at python.org Thu May 30 18:19:45 2013
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 30 May 2013 09:19:45 -0700
Subject: [Python-Dev] Segmentation fault on 3.4 with --pydebug
In-Reply-To: <0E175BCC-6CD5-479D-A7A9-B07F98F3A78F@langa.pl>
References: <0E175BCC-6CD5-479D-A7A9-B07F98F3A78F@langa.pl>
Message-ID:

2013/5/30 Łukasz Langa :
> This happens after Benjamin's changes in 83937. Anybody else seeing this?

Remember you need the hash to fully identify hg changesets. :)

>
> Intel i5 2.4 GHz, Mac OS X 10.8.3, clang
>
> $ hg up default
> $ make distclean
> $ MACOSX_DEPLOYMENT_TARGET=10.8 ./configure --with-pydebug
> $ make
> $ ./python.exe -Wd -m test.regrtest test_exceptions
> [1/1] test_exceptions
> Fatal Python error: Segmentation fault

As noted, it's infinite recursion. Without optimization I've noticed
clang is very inefficient with respect to stack space, so for example,
each PyEval_FrameEx frame is 1/2 KB.

--
Regards,
Benjamin

From lukasz at langa.pl Thu May 30 21:31:51 2013
From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=)
Date: Thu, 30 May 2013 21:31:51 +0200
Subject: [Python-Dev] Segmentation fault on 3.4 with --pydebug
In-Reply-To: <23E046BA-E625-4D49-BA1A-41F3DDB6AE84@mac.com>
References: <0E175BCC-6CD5-479D-A7A9-B07F98F3A78F@langa.pl>
 <23E046BA-E625-4D49-BA1A-41F3DDB6AE84@mac.com>
Message-ID:

On 30 maj 2013, at 14:45, Ronald Oussoren wrote:

> Issue #18075 contains a patch. I probably won't have time to commit until sunday, but feel free to apply the patch yourself :-)

I did just that. Fixed, thanks!

--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ezio.melotti at gmail.com Thu May 30 22:35:07 2013
From: ezio.melotti at gmail.com (Ezio Melotti)
Date: Thu, 30 May 2013 23:35:07 +0300
Subject: [Python-Dev] performance testing recommendations in devguide
In-Reply-To:
References:
Message-ID:

Hi,

On Wed, May 29, 2013 at 9:00 PM, Eric Snow wrote:
> ...
>
> What would be important to say in the devguide regarding Python
> performance and testing it?

In the devguide I would only add information that is specific to
benchmarking the interpreter.
A separate "Benchmarking HOWTO" that covers generic topics
could/should be added to docs.python.org.

Best Regards,
Ezio Melotti

> What would you add/subtract from the
> above? How important is testing memory performance? How do we avoid
> performance regressions? Thanks!
>
> -eric

From lukasz at langa.pl Fri May 31 01:47:59 2013
From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=)
Date: Fri, 31 May 2013 01:47:59 +0200
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support)
In-Reply-To:
References:
Message-ID: <59C93B26-D554-4EEC-BF18-FBF61C40FA7D@langa.pl>

On 29 maj 2013, at 04:40, Nick Coghlan wrote:

> I expect we will see improved tools for integrating class based
> dispatch and generic function dispatch in the future, but we should
> *not* try to engineer a solution up front. Doing so would involve too
> much guessing about possible use cases, rather than letting the design
> be informed by the *actual* use cases that emerge in practice.

I thought about this over the last two days and I've concluded Nick
is right here. I really don't want functools.singledispatch to be
undercooked when introduced. However, it seems we fleshed out the PEP
and the reference implementation to do one thing only, and do it well.
The rest is guesswork. It's better to build on a simple foundation than
to provide a solution waiting for the problem (see: annotations).
So, unless anyone strongly objects, I think we shouldn't bother to
special-case instance methods and class methods.

"Code wins arguments":

from functools import singledispatch

class State:
    def __init__(self):
        self.add.register(int, self.add_int)
        self.add.register(float, self.add_float)
        self.add.register(complex, self.add_complex)
        self.sum = 0

    @staticmethod
    @singledispatch
    def add(arg):
        raise TypeError("This type is not supported.")

    def add_int(self, arg):
        self.sum += arg

    def add_float(self, arg):
        self.sum += int(round(arg))

    def add_complex(self, arg):
        self.sum += int(round(arg.real))

if __name__ == '__main__':
    state = State()
    state.add(1)
    state.add(2.51)
    state.add(3.7+4j)
    assert state.sum == 8
    state.add(2.50)
    assert state.sum == 10
    try:
        state.add("string")
        assert False, "TypeError not raised."
    except TypeError:
        pass   # properly refused to add a string

--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

From lukasz at langa.pl Fri May 31 01:51:50 2013
From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=)
Date: Fri, 31 May 2013 01:51:50 +0200
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support)
In-Reply-To: <59C93B26-D554-4EEC-BF18-FBF61C40FA7D@langa.pl>
References: <59C93B26-D554-4EEC-BF18-FBF61C40FA7D@langa.pl>
Message-ID:

On 31 maj 2013, at 01:47, Łukasz Langa wrote:

> class State:
>     def __init__(self):
>         self.add.register(int, self.add_int)

Ouch, I realized this is wrong just after I hit "Send".
self.add is a staticmethod so this registration will overload
on every instance. Which is obviously bad.

Lesson learned: don't post code at 2 AM.

--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lukasz at langa.pl Fri May 31 03:05:46 2013
From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=)
Date: Fri, 31 May 2013 03:05:46 +0200
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support)
In-Reply-To:
References: <59C93B26-D554-4EEC-BF18-FBF61C40FA7D@langa.pl>
Message-ID: <2ECE07A2-3E72-4DFD-928F-D1C110793C25@langa.pl>

On 31 maj 2013, at 01:51, Łukasz Langa wrote:

> On 31 maj 2013, at 01:47, Łukasz Langa wrote:
>
>> class State:
>>     def __init__(self):
>>         self.add.register(int, self.add_int)
>
> Ouch, I realized this is wrong just after I hit "Send".
> self.add is a staticmethod so this registration will overload
> on every instance. Which is obviously bad.

So, after some embarrassing head banging, here's the correct solution:

https://gist.github.com/ambv/5682351

So, it *is* possible to make instance-level and class-level registration
work with the existing @singledispatch code and a bit of plumbing.
Obviously, all that is not necessary for actual static methods.

Back to the point, though. I don't feel we should complicate the
code, tests and documentation by introducing special handling
for methods. In terms of pure type-driven single dispatch, we
have a solution that was intentionally simple from the get-go.

The next step will be predicate dispatch anyway ;))

What do you think?
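For readers without access to the gist, one possible way to get the instance-level registration mentioned above with an unmodified @singledispatch is to build a separate generic function per instance. This is only an illustrative sketch under that assumption; it is not taken from the gist and may differ from the approach used there.

```python
from functools import singledispatch


class State:
    def __init__(self):
        self.sum = 0

        # A fresh generic function per instance, so registrations made
        # here are not shared with other State objects.
        @singledispatch
        def add(arg):
            raise TypeError("This type is not supported.")

        add.register(int, self.add_int)
        add.register(float, self.add_float)
        add.register(complex, self.add_complex)
        self.add = add

    def add_int(self, arg):
        self.sum += arg

    def add_float(self, arg):
        self.sum += int(round(arg))

    def add_complex(self, arg):
        self.sum += int(round(arg.real))


first, second = State(), State()
first.add(1)
first.add(2.51)
second.add(3.7 + 4j)
```

The trade-off in this sketch is one registry per object instead of one per class, which costs some memory but keeps each instance's bound methods out of every other instance's dispatch table.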
--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

From chrism at plope.com Fri May 31 07:13:02 2013
From: chrism at plope.com (Chris McDonough)
Date: Fri, 31 May 2013 01:13:02 -0400
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support)
In-Reply-To: <2ECE07A2-3E72-4DFD-928F-D1C110793C25@langa.pl>
References: <59C93B26-D554-4EEC-BF18-FBF61C40FA7D@langa.pl>
 <2ECE07A2-3E72-4DFD-928F-D1C110793C25@langa.pl>
Message-ID: <1369977182.3082.6.camel@thinko>

On Fri, 2013-05-31 at 03:05 +0200, Łukasz Langa wrote:
> On 31 maj 2013, at 01:51, Łukasz Langa wrote:
>
> Back to the point, though. I don't feel we should complicate the
> code, tests and documentation by introducing special handling
> for methods. In terms of pure type-driven single dispatch, we
> have a solution that was intentionally simple from the get-go.
>
> The next step will be predicate dispatch anyway ;))
>
> What do you think?

+1. It's incredibly useful and easy to document as-is.

- C

From ncoghlan at gmail.com Fri May 31 09:00:45 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 31 May 2013 17:00:45 +1000
Subject: [Python-Dev] PEP 443 - Single-dispatch generic functions (including ABC support)
In-Reply-To: <1369977182.3082.6.camel@thinko>
References: <59C93B26-D554-4EEC-BF18-FBF61C40FA7D@langa.pl>
 <2ECE07A2-3E72-4DFD-928F-D1C110793C25@langa.pl>
 <1369977182.3082.6.camel@thinko>
Message-ID:

On Fri, May 31, 2013 at 3:13 PM, Chris McDonough wrote:
> On Fri, 2013-05-31 at 03:05 +0200, Łukasz Langa wrote:
>> On 31 maj 2013, at 01:51, Łukasz Langa wrote:
>> >
>> Back to the point, though. I don't feel we should complicate the
>> code, tests and documentation by introducing special handling
>> for methods. In terms of pure type-driven single dispatch, we
>> have a solution that was intentionally simple from the get-go.
>>
>> The next step will be predicate dispatch anyway ;))
>>
>> What do you think?
>
> +1.
It's incredibly useful and easy to document as-is. Yep, I suggest asking Benjamin for pronouncement - this looks great to me. I should finally be able to clean up the pkgutils.walk_packages docs by explaining how to register new handlers for custom importers :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From paul at mad-scientist.net Fri May 31 09:52:50 2013 From: paul at mad-scientist.net (Paul Smith) Date: Fri, 31 May 2013 03:52:50 -0400 Subject: [Python-Dev] Problem building Python 2.7.5 with separate sysroot Message-ID: <1369986770.4119.43.camel@homebase> Hi all. I'm trying to build Python 2.7.5 on a GNU/Linux (Linux Mint 14) system, but using a different sysroot (that is, a separate /usr/include, /usr/lib, etc., not the real one for my system). I have shell script wrappers around GCC and its various tools that invoke it with the right paths to force this to happen, and when I call Python's configure I send along "CC=sysroot-gcc", etc. for all the various tools. Note that it's not really a cross-compilation because the target is also a GNU/Linux system on the same hardware architecture. The majority of Python builds just fine like this. However, I'm having serious problems building modules such as fcntl, etc. Looking at the output from the makefile, I can see that somehow, someone is forcibly adding "-I/usr/include/x86_64-linux-gnu" to the link line: building 'termios' extension sysroot-gcc -pthread -fPIC -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. 
-IInclude -I/usr/include/x86_64-linux-gnu -I/common/sysroot/tools/usr/include -I/common/sysroot/tools/usr/include-fixed -I/usr/local/include -I/home/workspaces/psmith/python/obj/src/Python-2.7.5/Include -I/home/workspaces/psmith/python/obj/bld/python -c /home/workspaces/psmith/python/obj/src/Python-2.7.5/Modules/termios.c -o build/temp.linux-x86_64-2.7/home/workspaces/psmith/python/obj/src/Python-2.7.5/Modules/termios.o This fails miserably because the headers in /usr/include/x86_64-linux-gnu do not play at all nicely with my other sysroot headers. Ditto for other extensions like fcntl, etc. I've searched high and low in the Python source, generated makefiles, config.log, etc. and I cannot find where this -I flag is coming from anywhere. I found the --oldincludedir flag to configure and set it to point into my sysroot as well, but that didn't help: the /usr/include still appears when building these extensions. Can anyone tell me where Python is getting these -I flags and what I need to do to tell it to NOT use those flags when building extensions? I'd also like to remove the -I/usr/local/include, although this is not actually causing me problems right now. From nad at acm.org Fri May 31 10:21:40 2013 From: nad at acm.org (Ned Deily) Date: Fri, 31 May 2013 01:21:40 -0700 Subject: [Python-Dev] Problem building Python 2.7.5 with separate sysroot References: <1369986770.4119.43.camel@homebase> Message-ID: In article <1369986770.4119.43.camel at homebase>, Paul Smith wrote: > Hi all. I'm trying to build Python 2.7.5 on a GNU/Linux (Linux Mint 14) > system, but using a different sysroot (that is, a separate > /usr/include, /usr/lib, etc., not the real one for my system). This list is for the development of Python itself, not about using or installing it. Python-list (AKA comp.lang.python) is the right list to ask such questions. That said ... > However, I'm having serious problems building modules such as fcntl, > etc. 
Looking at the output from the makefile, I can see that somehow,
> someone is forcibly adding "-I/usr/include/x86_64-linux-gnu" to the link
> line:
[...]

... include file and library file selections for building standard
library modules are handled by the top-level setup.py file in the source
tree. That's where /usr/local/... is added and chances are that the
above header is being added by add_gcc_paths() in setup.py.

--
 Ned Deily,
 nad at acm.org

From lukasz at langa.pl Fri May 31 11:46:58 2013
From: lukasz at langa.pl (=?utf-8?Q?=C5=81ukasz_Langa?=)
Date: Fri, 31 May 2013 11:46:58 +0200
Subject: [Python-Dev] PEP 443 - request for pronouncement
Message-ID: <0748525E-61D5-46F7-8ECD-E8C7C3A43B5C@langa.pl>

Hello python-dev,

PEP 443 is ready for final review. I'm attaching the latest
version below for convenience. The full history of changes
is available here: http://hg.python.org/peps/log/tip/pep-0443.txt

A reference implementation for PEP 443 is available at:
http://hg.python.org/features/pep-443/file/tip/Lib/functools.py#l363

with relevant tests here:
http://hg.python.org/features/pep-443/file/tip/Lib/test/test_functools.py#l855

and documentation here:
http://hg.python.org/features/pep-443/file/tip/Doc/library/functools.rst#l189

There's also an official backport for 2.6 - 3.3 already up:
https://pypi.python.org/pypi/singledispatch


PEP: 443
Title: Single-dispatch generic functions
Version: $Revision$
Last-Modified: $Date$
Author: Łukasz Langa
Discussions-To: Python-Dev
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 22-May-2013
Post-History: 22-May-2013, 25-May-2013, 31-May-2013
Replaces: 245, 246, 3124


Abstract
========

This PEP proposes a new mechanism in the ``functools`` standard library
module that provides a simple form of generic programming known as
single-dispatch generic functions.

A **generic function** is composed of multiple functions implementing
the same operation for different types.
Which implementation should be used during a call is determined by the
dispatch algorithm. When the implementation is chosen based on the type
of a single argument, this is known as **single dispatch**.


Rationale and Goals
===================

Python has always provided a variety of built-in and standard-library
generic functions, such as ``len()``, ``iter()``, ``pprint.pprint()``,
``copy.copy()``, and most of the functions in the ``operator`` module.
However, it currently:

1. does not have a simple or straightforward way for developers to
   create new generic functions,

2. does not have a standard way for methods to be added to existing
   generic functions (i.e., some are added using registration
   functions, others require defining ``__special__`` methods, possibly
   by monkeypatching).

In addition, it is currently a common anti-pattern for Python code to
inspect the types of received arguments, in order to decide what to do
with the objects. For example, code may wish to accept either an object
of some type, or a sequence of objects of that type.

Currently, the "obvious way" to do this is by type inspection, but this
is brittle and closed to extension. Abstract Base Classes make it easier
to discover present behaviour, but don't help adding new behaviour.
A developer using an already-written library may be unable to change how
their objects are treated by such code, especially if the objects they
are using were created by a third party.

Therefore, this PEP proposes a uniform API to address dynamic
overloading using decorators.


User API
========

To define a generic function, decorate it with the ``@singledispatch``
decorator. Note that the dispatch happens on the type of the first
argument, so create your function accordingly::

  >>> from functools import singledispatch
  >>> @singledispatch
  ... def fun(arg, verbose=False):
  ...     if verbose:
  ...         print("Let me just say,", end=" ")
  ...     print(arg)

To add overloaded implementations to the function, use the
``register()`` attribute of the generic function. It is a decorator,
taking a type parameter and decorating a function implementing the
operation for that type::

  >>> @fun.register(int)
  ... def _(arg, verbose=False):
  ...     if verbose:
  ...         print("Strength in numbers, eh?", end=" ")
  ...     print(arg)
  ...
  >>> @fun.register(list)
  ... def _(arg, verbose=False):
  ...     if verbose:
  ...         print("Enumerate this:")
  ...     for i, elem in enumerate(arg):
  ...         print(i, elem)

To enable registering lambdas and pre-existing functions, the
``register()`` attribute can be used in a functional form::

  >>> def nothing(arg, verbose=False):
  ...     print("Nothing.")
  ...
  >>> fun.register(type(None), nothing)

The ``register()`` attribute returns the undecorated function, which
enables decorator stacking, pickling, as well as creating unit tests for
each variant independently::

  >>> from decimal import Decimal
  >>> @fun.register(float)
  ... @fun.register(Decimal)
  ... def fun_num(arg, verbose=False):
  ...     if verbose:
  ...         print("Half of your number:", end=" ")
  ...     print(arg / 2)
  ...
  >>> fun_num is fun
  False

When called, the generic function dispatches on the type of the first
argument::

  >>> fun("Hello, world.")
  Hello, world.
  >>> fun("test.", verbose=True)
  Let me just say, test.
  >>> fun(42, verbose=True)
  Strength in numbers, eh? 42
  >>> fun(['spam', 'spam', 'eggs', 'spam'], verbose=True)
  Enumerate this:
  0 spam
  1 spam
  2 eggs
  3 spam
  >>> fun(None)
  Nothing.
  >>> fun(1.23)
  0.615

Where there is no registered implementation for a specific type, its
method resolution order is used to find a more generic implementation.
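As an illustration of that fallback, here is a minimal sketch. The classes ``Base`` and ``Derived`` and the function ``describe`` are hypothetical names chosen for this example; they are not part of the PEP.

```python
from functools import singledispatch


class Base:
    pass


class Derived(Base):
    pass


@singledispatch
def describe(arg):
    # Default implementation, registered for ``object``.
    return "object"


@describe.register(Base)
def _(arg):
    return "Base"


# Derived has no implementation of its own, so its MRO
# (Derived, Base, object) is searched and Base's one is used.
results = [describe(Derived()), describe(Base()), describe(3)]
```

Here ``results`` ends up as ``["Base", "Base", "object"]``: the ``int`` argument falls all the way back to the default implementation.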
To check which implementation the generic function will choose for
a given type, use the ``dispatch()`` attribute::

  >>> fun.dispatch(float)
  <function fun_num at 0x...>
  >>> fun.dispatch(dict)
  <function fun at 0x...>

To access all registered implementations, use the read-only ``registry``
attribute::

  >>> fun.registry.keys()
  dict_keys([<class 'NoneType'>, <class 'int'>, <class 'object'>,
            <class 'decimal.Decimal'>, <class 'list'>,
            <class 'float'>])
  >>> fun.registry[float]
  <function fun_num at 0x...>
  >>> fun.registry[object]
  <function fun at 0x...>

The proposed API is intentionally limited and opinionated, so as to
ensure it is easy to explain and use, as well as to maintain consistency
with existing members in the ``functools`` module.


Implementation Notes
====================

The functionality described in this PEP is already implemented in the
``pkgutil`` standard library module as ``simplegeneric``. Because this
implementation is mature, the goal is to move it largely as-is. The
reference implementation is available on hg.python.org [#ref-impl]_.

The dispatch type is specified as a decorator argument. An alternative
form using function annotations has been considered but its inclusion
has been deferred. As of May 2013, this usage pattern is out of scope
for the standard library [#pep-0008]_ and the best practices for
annotation usage are still debated.

Based on the current ``pkgutil.simplegeneric`` implementation and
following the convention on registering virtual subclasses on Abstract
Base Classes, the dispatch registry will not be thread-safe.

Abstract Base Classes
---------------------

The ``pkgutil.simplegeneric`` implementation relied on several forms of
method resolution order (MRO). ``@singledispatch`` removes special
handling of old-style classes and Zope's ExtensionClasses. More
importantly, it introduces support for Abstract Base Classes (ABC).

When a generic function implementation is registered for an ABC, the
dispatch algorithm switches to a mode of MRO calculation for the
provided argument which includes the relevant ABCs.
The algorithm is as follows::

  def _compose_mro(cls, haystack):
      """Calculates the MRO for a given class `cls`, including relevant
      abstract base classes from `haystack`."""
      bases = set(cls.__mro__)
      mro = list(cls.__mro__)
      for regcls in haystack:
          if regcls in bases or not issubclass(cls, regcls):
              continue   # either present in the __mro__ or unrelated
          for index, base in enumerate(mro):
              if not issubclass(base, regcls):
                  break
          if base in bases and not issubclass(regcls, base):
              # Conflict resolution: put classes present in __mro__
              # and their subclasses first.
              index += 1
          mro.insert(index, regcls)
      return mro

In its most basic form, it returns the MRO for the given type::

  >>> _compose_mro(dict, [])
  [<class 'dict'>, <class 'object'>]

When the haystack consists of ABCs that the specified type is a subclass
of, they are inserted in a predictable order::

  >>> _compose_mro(dict, [Sized, MutableMapping, str,
  ...                     Sequence, Iterable])
  [<class 'dict'>, <class 'collections.abc.MutableMapping'>,
   <class 'collections.abc.Iterable'>, <class 'collections.abc.Sized'>,
   <class 'object'>]

While this mode of operation is significantly slower, all dispatch
decisions are cached. The cache is invalidated on registering new
implementations on the generic function or when user code calls
``register()`` on an ABC to register a new virtual subclass. In the
latter case, it is possible to create a situation with ambiguous
dispatch, for instance::

  >>> from collections import Iterable, Container
  >>> class P:
  ...     pass
  >>> Iterable.register(P)
  <class '__main__.P'>
  >>> Container.register(P)
  <class '__main__.P'>

Faced with ambiguity, ``@singledispatch`` refuses the temptation to
guess::

  >>> @singledispatch
  ... def g(arg):
  ...     return "base"
  ...
  >>> g.register(Iterable, lambda arg: "iterable")
  <function <lambda> at 0x108b49110>
  >>> g.register(Container, lambda arg: "container")
  <function <lambda> at 0x108b491c8>
  >>> g(P())
  Traceback (most recent call last):
  ...
  RuntimeError: Ambiguous dispatch: <class 'collections.abc.Container'>
  or <class 'collections.abc.Iterable'>

Note that this exception would not be raised if ``Iterable`` and
``Container`` had been provided as base classes during class definition.
In this case dispatch happens in the MRO order::

  >>> class Ten(Iterable, Container):
  ...     def __iter__(self):
  ...         for i in range(10):
  ...             yield i
  ...     def __contains__(self, value):
  ...         return value in range(10)
  ...
  >>> g(Ten())
  'iterable'


Usage Patterns
==============

This PEP proposes extending behaviour only of functions specifically
marked as generic. Just as a base class method may be overridden by
a subclass, so too may a function be overloaded to provide custom
functionality for a given type.

Universal overloading does not equal *arbitrary* overloading, in the
sense that we need not expect people to randomly redefine the behavior
of existing functions in unpredictable ways. To the contrary, generic
function usage in actual programs tends to follow very predictable
patterns and registered implementations are highly discoverable in the
common case.

If a module is defining a new generic operation, it will usually also
define any required implementations for existing types in the same
place. Likewise, if a module is defining a new type, then it will
usually define implementations there for any generic functions that it
knows or cares about. As a result, the vast majority of registered
implementations can be found adjacent to either the function being
overloaded, or to a newly-defined type for which the implementation is
adding support.

It is only in rather infrequent cases that one will have implementations
registered in a module that contains neither the function nor the
type(s) for which the implementation is added. In the absence of
incompetence or deliberate intention to be obscure, the few
implementations that are not registered adjacent to the relevant type(s)
or function(s) will generally not need to be understood or known about
outside the scope where those implementations are defined. (Except in
the "support modules" case, where best practice suggests naming them
accordingly.)

As mentioned earlier, single-dispatch generics are already prolific
throughout the standard library.
A clean, standard way of doing them
provides a way forward to refactor those custom implementations to use
a common one, opening them up for user extensibility at the same time.


Alternative approaches
======================

In PEP 3124 [#pep-3124]_ Phillip J. Eby proposes a full-grown solution
with overloading based on arbitrary rule sets (with the default
implementation dispatching on argument types), as well as interfaces,
adaptation and method combining. PEAK-Rules [#peak-rules]_ is
a reference implementation of the concepts described in PJE's PEP.

Such a broad approach is inherently complex, which makes reaching
a consensus hard. In contrast, this PEP focuses on a single piece of
functionality that is simple to reason about. It's important to note
this does not preclude the use of other approaches now or in the future.

In a 2005 article on Artima [#artima2005]_ Guido van Rossum presents
a generic function implementation that dispatches on types of all
arguments on a function. The same approach was chosen in Andrey Popp's
``generic`` package available on PyPI [#pypi-generic]_, as well as David
Mertz's ``gnosis.magic.multimethods`` [#gnosis-multimethods]_.

While this seems desirable at first, I agree with Fredrik Lundh's
comment that "if you design APIs with pages of logic just to sort out
what code a function should execute, you should probably hand over the
API design to someone else". In other words, the single argument
approach proposed in this PEP is not only easier to implement but also
clearly communicates that dispatching on a more complex state is an
anti-pattern. It also has the virtue of corresponding directly with the
familiar method dispatch mechanism in object oriented programming. The
only difference is whether the custom implementation is associated more
closely with the data (object-oriented methods) or the algorithm
(single-dispatch overloading).

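To illustrate that distinction, the sketch below computes an area both
ways. The ``Circle``, ``Square`` and ``area`` names are invented for this
example and do not come from the reference implementation:

```python
import math
from functools import singledispatch


class Circle:
    def __init__(self, radius):
        self.radius = radius

    # Object-oriented style: the implementation lives with the data.
    def area(self):
        return math.pi * self.radius ** 2


class Square:
    def __init__(self, side):
        self.side = side


# Single-dispatch style: the implementations live with the algorithm,
# so Square gains support without being modified.
@singledispatch
def area(shape):
    raise TypeError("no area() implementation for {!r}".format(type(shape)))


@area.register(Circle)
def _(shape):
    return shape.area()  # reuse the method where one already exists


@area.register(Square)
def _(shape):
    return shape.side ** 2


print(area(Circle(1.0)))
print(area(Square(3)))  # 9
```

In both styles the choice of code depends on one object only: ``self``
for the method, the first argument for the generic function.
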
PyPy's RPython offers ``extendabletype`` [#pairtype]_, a metaclass which
enables classes to be externally extended. In combination with
``pairtype()`` and ``pair()`` factories, this offers a form of
single-dispatch generics.


Acknowledgements
================

Apart from Phillip J. Eby's work on PEP 3124 [#pep-3124]_ and
PEAK-Rules, influences include Paul Moore's original issue
[#issue-5135]_ that proposed exposing ``pkgutil.simplegeneric`` as part
of the ``functools`` API, Guido van Rossum's article on multimethods
[#artima2005]_, and discussions with Raymond Hettinger on a general
pprint rewrite. Huge thanks to Nick Coghlan for encouraging me to create
this PEP and providing initial feedback.


References
==========

.. [#ref-impl]
   http://hg.python.org/features/pep-443/file/tip/Lib/functools.py#l359

.. [#pep-0008] PEP 8 states in the "Programming Recommendations"
   section that "the Python standard library will not use function
   annotations as that would result in a premature commitment to
   a particular annotation style".
   (http://www.python.org/dev/peps/pep-0008)

.. [#pep-3124] http://www.python.org/dev/peps/pep-3124/

.. [#peak-rules] http://peak.telecommunity.com/DevCenter/PEAK_2dRules

.. [#artima2005]
   http://www.artima.com/weblogs/viewpost.jsp?thread=101605

.. [#pypi-generic] http://pypi.python.org/pypi/generic

.. [#gnosis-multimethods]
   http://gnosis.cx/publish/programming/charming_python_b12.html

.. [#pairtype]
   https://bitbucket.org/pypy/pypy/raw/default/rpython/tool/pairtype.py

.. [#issue-5135] http://bugs.python.org/issue5135


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

From lukasz at langa.pl Fri May 31 12:34:56 2013
From: lukasz at langa.pl (Łukasz Langa)
Date: Fri, 31 May 2013 12:34:56 +0200
Subject: [Python-Dev] PEP 443 - request for pronouncement
In-Reply-To:
References: <0748525E-61D5-46F7-8ECD-E8C7C3A43B5C@langa.pl>
Message-ID:

On 31 May 2013, at 12:18, Gustavo Carneiro wrote:

> It is not clear from the PEP (up until the end of the User API section
> at least) when, if ever, is this implementation of fun ever called. I
> mean, what type of 'arg' triggers a dispatch to this function body?

I added a sentence clarifying that. See the commit:
http://hg.python.org/peps/rev/4d6c827944c4

Does that address your concern?

> So my comment is just about clarity of the PEP text. I do not wish to
> interfere with pronouncement.

Sure thing. Thanks for your feedback!

--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev

From gjcarneiro at gmail.com Fri May 31 12:37:30 2013
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Fri, 31 May 2013 11:37:30 +0100
Subject: [Python-Dev] PEP 443 - request for pronouncement
In-Reply-To:
References: <0748525E-61D5-46F7-8ECD-E8C7C3A43B5C@langa.pl>
Message-ID:

On Fri, May 31, 2013 at 11:34 AM, Łukasz Langa wrote:

> On 31 May 2013, at 12:18, Gustavo Carneiro wrote:
>
> > It is not clear from the PEP (up until the end of the User API section
> at least) when, if ever, is this implementation of fun ever called. I
> mean, what type of 'arg' triggers a dispatch to this function body?
>
> I added a sentence clarifying that. See the commit:
> http://hg.python.org/peps/rev/4d6c827944c4
>
> Does that address your concern?
>

Yes, much better now. Thank you!

From paul at mad-scientist.net Fri May 31 13:35:29 2013
From: paul at mad-scientist.net (Paul Smith)
Date: Fri, 31 May 2013 07:35:29 -0400
Subject: [Python-Dev] Problem building Python 2.7.5 with separate sysroot
In-Reply-To:
References: <1369986770.4119.43.camel@homebase>
Message-ID: <1370000129.4119.53.camel@homebase>

On Fri, 2013-05-31 at 01:21 -0700, Ned Deily wrote:
> In article <1369986770.4119.43.camel at homebase>,
> Paul Smith wrote:
>
> > Hi all. I'm trying to build Python 2.7.5 on a GNU/Linux (Linux Mint 14)
> > system, but using a different sysroot (that is, a separate
> > /usr/include, /usr/lib, etc., not the real one for my system).
>
> This list is for the development of Python itself, not about using or
> installing it. Python-list (AKA comp.lang.python) is the right list to
> ask such questions. That said ...
>
> > However, I'm having serious problems building modules such as fcntl,
> > etc. Looking at the output from the makefile, I can see that somehow,
> > someone is forcibly adding "-I/usr/include/x86_64-linux-gnu" to the link
> > line:
[...]
>
> ... include file and library file selections for building standard
> library modules are handled by the top-level setup.py file in the source
> tree. That's where /usr/local/... is added and chances are that the
> above header is being added by add_gcc_paths() in setup.py.

Yes, thank you. It seems to me (keeping with the theme of this mailing
list) that the add_multiarch_paths() function in setup.py is not right.
The first step, which asks the compiler about multi-arch, is OK because
it's using my alternate compiler which reports no multiarch. But then it
proceeds to run the local host version of dpkg-architecture. I see that
it adds the -t flag if cross-compiling, which I'm not, but even that is
not fixing the issue.

If you're building on a system which is Debian derived with multi-arch
support you will ALWAYS have your local "/usr/include" (plus some
multiarch suffix) -- and '/usr/lib' + multiarch -- added to your include
and lib paths; this is not good.

My case may be unusual but even in a more formal cross-compilation
environment it's not good to add /usr/include/..., or base such a
decision on the behavior of the _build_ system.

From gjcarneiro at gmail.com Fri May 31 12:18:07 2013
From: gjcarneiro at gmail.com (Gustavo Carneiro)
Date: Fri, 31 May 2013 11:18:07 +0100
Subject: [Python-Dev] PEP 443 - request for pronouncement
In-Reply-To: <0748525E-61D5-46F7-8ECD-E8C7C3A43B5C@langa.pl>
References: <0748525E-61D5-46F7-8ECD-E8C7C3A43B5C@langa.pl>
Message-ID:

Sorry, maybe I am too late to comment on this, but,

>>> @singledispatch
... def fun(arg, verbose=False):
...     if verbose:
...         print("Let me just say,", end=" ")
...     print(arg)

It is not clear from the PEP (up until the end of the User API section at
least) when, if ever, is this implementation of fun ever called. I mean,
what type of 'arg' triggers a dispatch to this function body?

I am guessing that when the arg does not match the type of any of the
other registered functions, this function body is used by default. But it
is only a guess, the PEP doesn't state this clearly.

If my guess is true, would it be reasonable to update the example "def
fun" code to reflect this, e.g., to print("Warning: I do not know what to
do with arg {} of type {}".format(arg, type(arg))).

So my comment is just about clarity of the PEP text. I do not wish to
interfere with pronouncement.

Thanks.

On Fri, May 31, 2013 at 10:46 AM, Łukasz Langa wrote:

> Hello python-dev,
>
> PEP 443 is ready for final review. I'm attaching the latest
> version below for convenience.
The full history of changes > is available here: http://hg.python.org/peps/log/tip/pep-0443.txt > > A reference implementation for PEP 443 is available at: > http://hg.python.org/features/pep-443/file/tip/Lib/functools.py#l363 > > with relevant tests here: > > http://hg.python.org/features/pep-443/file/tip/Lib/test/test_functools.py#l855 > > and documentation here: > > http://hg.python.org/features/pep-443/file/tip/Doc/library/functools.rst#l189 > > There's also an official backport for 2.6 - 3.3 already up: > https://pypi.python.org/pypi/singledispatch > > > > PEP: 443 > Title: Single-dispatch generic functions > Version: $Revision$ > Last-Modified: $Date$ > Author: ?ukasz Langa > Discussions-To: Python-Dev > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 22-May-2013 > Post-History: 22-May-2013, 25-May-2013, 31-May-2013 > Replaces: 245, 246, 3124 > > > Abstract > ======== > > This PEP proposes a new mechanism in the ``functools`` standard library > module that provides a simple form of generic programming known as > single-dispatch generic functions. > > A **generic function** is composed of multiple functions implementing > the same operation for different types. Which implementation should be > used during a call is determined by the dispatch algorithm. When the > implementation is chosen based on the type of a single argument, this is > known as **single dispatch**. > > > Rationale and Goals > =================== > > Python has always provided a variety of built-in and standard-library > generic functions, such as ``len()``, ``iter()``, ``pprint.pprint()``, > ``copy.copy()``, and most of the functions in the ``operator`` module. > However, it currently: > > 1. does not have a simple or straightforward way for developers to > create new generic functions, > > 2. 
does not have a standard way for methods to be added to existing > generic functions (i.e., some are added using registration > functions, others require defining ``__special__`` methods, possibly > by monkeypatching). > > In addition, it is currently a common anti-pattern for Python code to > inspect the types of received arguments, in order to decide what to do > with the objects. For example, code may wish to accept either an object > of some type, or a sequence of objects of that type. > > Currently, the "obvious way" to do this is by type inspection, but this > is brittle and closed to extension. Abstract Base Classes make it easier > to discover present behaviour, but don't help adding new behaviour. > A developer using an already-written library may be unable to change how > their objects are treated by such code, especially if the objects they > are using were created by a third party. > > Therefore, this PEP proposes a uniform API to address dynamic > overloading using decorators. > > > User API > ======== > > To define a generic function, decorate it with the ``@singledispatch`` > decorator. Note that the dispatch happens on the type of the first > argument, create your function accordingly:: > > >>> from functools import singledispatch > >>> @singledispatch > ... def fun(arg, verbose=False): > ... if verbose: > ... print("Let me just say,", end=" ") > ... print(arg) > > To add overloaded implementations to the function, use the > ``register()`` attribute of the generic function. It is a decorator, > taking a type parameter and decorating a function implementing the > operation for that type:: > > >>> @fun.register(int) > ... def _(arg, verbose=False): > ... if verbose: > ... print("Strength in numbers, eh?", end=" ") > ... print(arg) > ... > >>> @fun.register(list) > ... def _(arg, verbose=False): > ... if verbose: > ... print("Enumerate this:") > ... for i, elem in enumerate(arg): > ... 
print(i, elem) > > To enable registering lambdas and pre-existing functions, the > ``register()`` attribute can be used in a functional form:: > > >>> def nothing(arg, verbose=False): > ... print("Nothing.") > ... > >>> fun.register(type(None), nothing) > > The ``register()`` attribute returns the undecorated function which > enables decorator stacking, pickling, as well as creating unit tests for > each variant independently:: > > >>> @fun.register(float) > ... @fun.register(Decimal) > ... def fun_num(arg, verbose=False): > ... if verbose: > ... print("Half of your number:", end=" ") > ... print(arg / 2) > ... > >>> fun_num is fun > False > > When called, the generic function dispatches on the type of the first > argument:: > > >>> fun("Hello, world.") > Hello, world. > >>> fun("test.", verbose=True) > Let me just say, test. > >>> fun(42, verbose=True) > Strength in numbers, eh? 42 > >>> fun(['spam', 'spam', 'eggs', 'spam'], verbose=True) > Enumerate this: > 0 spam > 1 spam > 2 eggs > 3 spam > >>> fun(None) > Nothing. > >>> fun(1.23) > 0.615 > > Where there is no registered implementation for a specific type, its > method resolution order is used to find a more generic implementation. > To check which implementation will the generic function choose for > a given type, use the ``dispatch()`` attribute:: > > >>> fun.dispatch(float) > > >>> fun.dispatch(dict) > > > To access all registered implementations, use the read-only ``registry`` > attribute:: > > >>> fun.registry.keys() > dict_keys([, , , > , , > ]) > >>> fun.registry[float] > > >>> fun.registry[object] > > > The proposed API is intentionally limited and opinionated, as to ensure > it is easy to explain and use, as well as to maintain consistency with > existing members in the ``functools`` module. > > > Implementation Notes > ==================== > > The functionality described in this PEP is already implemented in the > ``pkgutil`` standard library module as ``simplegeneric``. 
Because this > implementation is mature, the goal is to move it largely as-is. The > reference implementation is available on hg.python.org [#ref-impl]_. > > The dispatch type is specified as a decorator argument. An alternative > form using function annotations has been considered but its inclusion > has been deferred. As of May 2013, this usage pattern is out of scope > for the standard library [#pep-0008]_ and the best practices for > annotation usage are still debated. > > Based on the current ``pkgutil.simplegeneric`` implementation and > following the convention on registering virtual subclasses on Abstract > Base Classes, the dispatch registry will not be thread-safe. > > Abstract Base Classes > --------------------- > > The ``pkgutil.simplegeneric`` implementation relied on several forms of > method resultion order (MRO). ``@singledispatch`` removes special > handling of old-style classes and Zope's ExtensionClasses. More > importantly, it introduces support for Abstract Base Classes (ABC). > > When a generic function implementation is registered for an ABC, the > dispatch algorithm switches to a mode of MRO calculation for the > provided argument which includes the relevant ABCs. The algorithm is as > follows:: > > def _compose_mro(cls, haystack): > """Calculates the MRO for a given class `cls`, including relevant > abstract base classes from `haystack`.""" > bases = set(cls.__mro__) > mro = list(cls.__mro__) > for regcls in haystack: > if regcls in bases or not issubclass(cls, regcls): > continue # either present in the __mro__ or unrelated > for index, base in enumerate(mro): > if not issubclass(base, regcls): > break > if base in bases and not issubclass(regcls, base): > # Conflict resolution: put classes present in __mro__ > # and their subclasses first. 
> index += 1 > mro.insert(index, regcls) > return mro > > In its most basic form, it returns the MRO for the given type:: > > >>> _compose_mro(dict, []) > [, ] > > When the haystack consists of ABCs that the specified type is a subclass > of, they are inserted in a predictable order:: > > >>> _compose_mro(dict, [Sized, MutableMapping, str, > ... Sequence, Iterable]) > [, , > , , > ] > > While this mode of operation is significantly slower, all dispatch > decisions are cached. The cache is invalidated on registering new > implementations on the generic function or when user code calls > ``register()`` on an ABC to register a new virtual subclass. In the > latter case, it is possible to create a situation with ambiguous > dispatch, for instance:: > > >>> from collections import Iterable, Container > >>> class P: > ... pass > >>> Iterable.register(P) > > >>> Container.register(P) > > > Faced with ambiguity, ``@singledispatch`` refuses the temptation to > guess:: > > >>> @singledispatch > ... def g(arg): > ... return "base" > ... > >>> g.register(Iterable, lambda arg: "iterable") > at 0x108b49110> > >>> g.register(Container, lambda arg: "container") > at 0x108b491c8> > >>> g(P()) > Traceback (most recent call last): > ... > RuntimeError: Ambiguous dispatch: > or > > Note that this exception would not be raised if ``Iterable`` and > ``Container`` had been provided as base classes during class definition. > In this case dispatch happens in the MRO order:: > > >>> class Ten(Iterable, Container): > ... def __iter__(self): > ... for i in range(10): > ... yield i > ... def __contains__(self, value): > ... return value in range(10) > ... > >>> g(Ten()) > 'iterable' > > > Usage Patterns > ============== > > This PEP proposes extending behaviour only of functions specifically > marked as generic. Just as a base class method may be overridden by > a subclass, so too may a function be overloaded to provide custom > functionality for a given type. 
> > Universal overloading does not equal *arbitrary* overloading, in the > sense that we need not expect people to randomly redefine the behavior > of existing functions in unpredictable ways. To the contrary, generic > function usage in actual programs tends to follow very predictable > patterns and registered implementations are highly-discoverable in the > common case. > > If a module is defining a new generic operation, it will usually also > define any required implementations for existing types in the same > place. Likewise, if a module is defining a new type, then it will > usually define implementations there for any generic functions that it > knows or cares about. As a result, the vast majority of registered > implementations can be found adjacent to either the function being > overloaded, or to a newly-defined type for which the implementation is > adding support. > > It is only in rather infrequent cases that one will have implementations > registered in a module that contains neither the function nor the > type(s) for which the implementation is added. In the absence of > incompetence or deliberate intention to be obscure, the few > implementations that are not registered adjacent to the relevant type(s) > or function(s), will generally not need to be understood or known about > outside the scope where those implementations are defined. (Except in > the "support modules" case, where best practice suggests naming them > accordingly.) > > As mentioned earlier, single-dispatch generics are already prolific > throughout the standard library. A clean, standard way of doing them > provides a way forward to refactor those custom implementations to use > a common one, opening them up for user extensibility at the same time. > > > Alternative approaches > ====================== > > In PEP 3124 [#pep-3124]_ Phillip J. 
Eby proposes a full-grown solution > with overloading based on arbitrary rule sets (with the default > implementation dispatching on argument types), as well as interfaces, > adaptation and method combining. PEAK-Rules [#peak-rules]_ is > a reference implementation of the concepts described in PJE's PEP. > > Such a broad approach is inherently complex, which makes reaching > a consensus hard. In contrast, this PEP focuses on a single piece of > functionality that is simple to reason about. It's important to note > this does not preclude the use of other approaches now or in the future. > > In a 2005 article on Artima [#artima2005]_ Guido van Rossum presents > a generic function implementation that dispatches on types of all > arguments on a function. The same approach was chosen in Andrey Popp's > ``generic`` package available on PyPI [#pypi-generic]_, as well as David > Mertz's ``gnosis.magic.multimethods`` [#gnosis-multimethods]_. > > While this seems desirable at first, I agree with Fredrik Lundh's > comment that "if you design APIs with pages of logic just to sort out > what code a function should execute, you should probably hand over the > API design to someone else". In other words, the single argument > approach proposed in this PEP is not only easier to implement but also > clearly communicates that dispatching on a more complex state is an > anti-pattern. It also has the virtue of corresponding directly with the > familiar method dispatch mechanism in object oriented programming. The > only difference is whether the custom implementation is associated more > closely with the data (object-oriented methods) or the algorithm > (single-dispatch overloading). > > PyPy's RPython offers ``extendabletype`` [#pairtype]_, a metaclass which > enables classes to be externally extended. In combination with > ``pairtype()`` and ``pair()`` factories, this offers a form of > single-dispatch generics. > > > Acknowledgements > ================ > > Apart from Phillip J. 
Eby's work on PEP 3124 [#pep-3124]_ and > PEAK-Rules, influences include Paul Moore's original issue > [#issue-5135]_ that proposed exposing ``pkgutil.simplegeneric`` as part > of the ``functools`` API, Guido van Rossum's article on multimethods > [#artima2005]_, and discussions with Raymond Hettinger on a general > pprint rewrite. Huge thanks to Nick Coghlan for encouraging me to create > this PEP and providing initial feedback. > > > References > ========== > > .. [#ref-impl] > http://hg.python.org/features/pep-443/file/tip/Lib/functools.py#l359 > > .. [#pep-0008] PEP 8 states in the "Programming Recommendations" > section that "the Python standard library will not use function > annotations as that would result in a premature commitment to > a particular annotation style". > (http://www.python.org/dev/peps/pep-0008) > > .. [#pep-3124] http://www.python.org/dev/peps/pep-3124/ > > .. [#peak-rules] http://peak.telecommunity.com/DevCenter/PEAK_2dRules > > .. [#artima2005] > http://www.artima.com/weblogs/viewpost.jsp?thread=101605 > > .. [#pypi-generic] http://pypi.python.org/pypi/generic > > .. [#gnosis-multimethods] > http://gnosis.cx/publish/programming/charming_python_b12.html > > .. [#pairtype] > https://bitbucket.org/pypy/pypy/raw/default/rpython/tool/pairtype.py > > .. [#issue-5135] http://bugs.python.org/issue5135 > > > Copyright > ========= > > This document has been placed in the public domain. > > > .. > Local Variables: > mode: indented-text > indent-tabs-mode: nil > sentence-end-double-space: t > fill-column: 70 > coding: utf-8 > End: > > > > -- > Best regards, > ?ukasz Langa > > WWW: http://lukasz.langa.pl/ > Twitter: @llanga > IRC: ambv on #python-dev > > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/gjcarneiro%40gmail.com > -- Gustavo J. A. M. 
Carneiro "The universe is always one step beyond logic." -- Frank Herbert -------------- next part -------------- An HTML attachment was scrubbed... URL: From status at bugs.python.org Fri May 31 18:07:31 2013 From: status at bugs.python.org (Python tracker) Date: Fri, 31 May 2013 18:07:31 +0200 (CEST) Subject: [Python-Dev] Summary of Python tracker Issues Message-ID: <20130531160731.0BDE7560C8@psf.upfronthosting.co.za> ACTIVITY SUMMARY (2013-05-24 - 2013-05-31) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 3997 (+25) closed 25884 (+34) total 29881 (+59) Open issues with patches: 1780 Issues opened (45) ================== #6386: importing yields unexpected results when initial script is a s http://bugs.python.org/issue6386 reopened by ncoghlan #17269: getaddrinfo segfaults on OS X when provided with invalid argum http://bugs.python.org/issue17269 reopened by ned.deily #18050: embedded interpreter or virtualenv fails with "ImportError: ca http://bugs.python.org/issue18050 opened by samueljohn #18052: IDLE 3.3.2 Windows taskbar icon regression http://bugs.python.org/issue18052 opened by terry.reedy #18053: Add checks for Misc/NEWS in make patchcheck http://bugs.python.org/issue18053 opened by Yogesh.Chaudhari #18054: Add more exception related assertions to unittest http://bugs.python.org/issue18054 opened by ncoghlan #18055: Stop using imp in IDLE http://bugs.python.org/issue18055 opened by brett.cannon #18056: Document importlib._bootstrap.NamespaceLoader http://bugs.python.org/issue18056 opened by brett.cannon #18057: Register NamespaceLoader with importlib.abc.Loader http://bugs.python.org/issue18057 opened by brett.cannon #18058: Define is_package for NamespaceLoader http://bugs.python.org/issue18058 opened by brett.cannon #18059: Add multibyte encoding support to pyexpat http://bugs.python.org/issue18059 opened by serhiy.storchaka 
#18060: Updating _fields_ of a derived struct type yields a bad cif http://bugs.python.org/issue18060 opened by lauri.alanko #18061: m68k Python 3.3 test results http://bugs.python.org/issue18061 opened by mirabilos #18062: m68k FPU precision issue http://bugs.python.org/issue18062 opened by mirabilos #18064: IDLE: add current directory to open_module http://bugs.python.org/issue18064 opened by terry.reedy #18065: set __path__ = [] for frozen packages http://bugs.python.org/issue18065 opened by brett.cannon #18066: Remove SGI-specific code from pty.py http://bugs.python.org/issue18066 opened by akuchling #18068: pickle + weakref.proxy(self) http://bugs.python.org/issue18068 opened by phd #18069: Subprocess picks the wrong executable on Windows http://bugs.python.org/issue18069 opened by berdario #18071: _osx_support compiler_fixup http://bugs.python.org/issue18071 opened by samueljohn #18073: pickle.Unpickler may read too many bytes, causing hangs with b http://bugs.python.org/issue18073 opened by jm #18076: Implement importlib.util.decode_source_bytes() http://bugs.python.org/issue18076 opened by brett.cannon #18078: threading.Condition to allow notify on a specific waiter http://bugs.python.org/issue18078 opened by JBernardo #18081: test_logging failure in WarningsTest on buildbots http://bugs.python.org/issue18081 opened by r.david.murray #18082: Inconsistent behavior of IOBase methods on closed files http://bugs.python.org/issue18082 opened by dwight.guth #18083: _sysconfigdata.py is installed in an arch-independent director http://bugs.python.org/issue18083 opened by automatthias #18085: Verifying refcounts.dat http://bugs.python.org/issue18085 opened by serhiy.storchaka #18088: Create importlib.abc.Loader.init_module_attrs() http://bugs.python.org/issue18088 opened by brett.cannon #18089: Create importlib.abc.InspectLoader.load_module() http://bugs.python.org/issue18089 opened by brett.cannon #18090: dict_contains first argument declared register, and 
shouldn't
http://bugs.python.org/issue18090  opened by larry
#18091: Remove PyNoArgsFunction
http://bugs.python.org/issue18091  opened by larry
#18092: Python 2.7.5 installation broken on OpenSuse 12.2
http://bugs.python.org/issue18092  opened by Andreas.Jung
#18093: Move main functions to a separate Programs directory
http://bugs.python.org/issue18093  opened by ncoghlan
#18094: Skip tests in test_uuid not silently
http://bugs.python.org/issue18094  opened by serhiy.storchaka
#18095: unable to invoke socket.connect with AF_UNSPEC
http://bugs.python.org/issue18095  opened by Roman.Valov
#18096: bad library order returned by python-config.in
http://bugs.python.org/issue18096  opened by taylor
#18099: wsgiref sets Content-Length: 0 on 304 Not Modified
http://bugs.python.org/issue18099  opened by flox
#18100: socket.sendall() cannot send buffers of 2GB or more
http://bugs.python.org/issue18100  opened by Tom.van.Leeuwen
#18101: Tk.split() doesn't work with nested Unicode strings
http://bugs.python.org/issue18101  opened by serhiy.storchaka
#18102: except-clause with own exception class inside generator can le
http://bugs.python.org/issue18102  opened by hagen
#18103: Create a GUI test framework for Idle
http://bugs.python.org/issue18103  opened by terry.reedy
#18104: Idle: make human-mediated GUI tests usable
http://bugs.python.org/issue18104  opened by terry.reedy
#18105: ElementTree writes invalid files when UTF-16 encoding is speci
http://bugs.python.org/issue18105  opened by Adam.Urban
#18106: There are unused variables in Lib/test/test_collections.py
http://bugs.python.org/issue18106  opened by vajrasky
#18108: shutil.chown should support dir_fd and follow_symlinks keyword
http://bugs.python.org/issue18108  opened by cjwatson

Most recent 15 issues with no replies (15)
==========================================

#18108: shutil.chown should support dir_fd and follow_symlinks keyword
http://bugs.python.org/issue18108
#18101: Tk.split() doesn't work with nested Unicode strings
http://bugs.python.org/issue18101
#18099: wsgiref sets Content-Length: 0 on 304 Not Modified
http://bugs.python.org/issue18099
#18096: bad library order returned by python-config.in
http://bugs.python.org/issue18096
#18095: unable to invoke socket.connect with AF_UNSPEC
http://bugs.python.org/issue18095
#18094: Skip tests in test_uuid not silently
http://bugs.python.org/issue18094
#18089: Create importlib.abc.InspectLoader.load_module()
http://bugs.python.org/issue18089
#18088: Create importlib.abc.Loader.init_module_attrs()
http://bugs.python.org/issue18088
#18082: Inconsistent behavior of IOBase methods on closed files
http://bugs.python.org/issue18082
#18081: test_logging failure in WarningsTest on buildbots
http://bugs.python.org/issue18081
#18076: Implement importlib.util.decode_source_bytes()
http://bugs.python.org/issue18076
#18073: pickle.Unpickler may read too many bytes, causing hangs with b
http://bugs.python.org/issue18073
#18068: pickle + weakref.proxy(self)
http://bugs.python.org/issue18068
#18066: Remove SGI-specific code from pty.py
http://bugs.python.org/issue18066
#18065: set __path__ = [] for frozen packages
http://bugs.python.org/issue18065

Most recent 15 issues waiting for review (15)
=============================================

#18106: There are unused variables in Lib/test/test_collections.py
http://bugs.python.org/issue18106
#18101: Tk.split() doesn't work with nested Unicode strings
http://bugs.python.org/issue18101
#18094: Skip tests in test_uuid not silently
http://bugs.python.org/issue18094
#18093: Move main functions to a separate Programs directory
http://bugs.python.org/issue18093
#18078: threading.Condition to allow notify on a specific waiter
http://bugs.python.org/issue18078
#18066: Remove SGI-specific code from pty.py
http://bugs.python.org/issue18066
#18059: Add multibyte encoding support to pyexpat
http://bugs.python.org/issue18059
#18055: Stop using imp in IDLE
http://bugs.python.org/issue18055
#18049: Re-enable threading test
on OSX
http://bugs.python.org/issue18049
#18045: get_python_version is not import in bdist_rpm.py
http://bugs.python.org/issue18045
#18038: Unhelpful error message on invalid encoding specification
http://bugs.python.org/issue18038
#18033: Example for Profile Module shows incorrect method
http://bugs.python.org/issue18033
#18020: html.escape 10x slower than cgi.escape
http://bugs.python.org/issue18020
#18015: python 2.7.5 fails to unpickle namedtuple pickled by 2.7.3 or
http://bugs.python.org/issue18015
#18013: cgi.FieldStorage does not parse W3C sample
http://bugs.python.org/issue18013

Top 10 most discussed issues (10)
=================================

#16832: Expose cache validity checking support in ABCMeta
http://bugs.python.org/issue16832  19 msgs
#17987: test.support.captured_stderr, captured_stdin not documented
http://bugs.python.org/issue17987  16 msgs
#17947: Code, test, and doc review for PEP-0435 Enum
http://bugs.python.org/issue17947  12 msgs
#18062: m68k FPU precision issue
http://bugs.python.org/issue18062  12 msgs
#18078: threading.Condition to allow notify on a specific waiter
http://bugs.python.org/issue18078  9 msgs
#1693050: \w not helpful for non-Roman scripts
http://bugs.python.org/issue1693050  9 msgs
#5124: IDLE - pasting text doesn't delete selection
http://bugs.python.org/issue5124  8 msgs
#12641: Remove -mno-cygwin from distutils
http://bugs.python.org/issue12641  8 msgs
#18085: Verifying refcounts.dat
http://bugs.python.org/issue18085  8 msgs
#18090: dict_contains first argument declared register, and shouldn't
http://bugs.python.org/issue18090  8 msgs

Issues closed (34)
==================

#8240: ssl.SSLSocket.write may fail on non-blocking sockets
http://bugs.python.org/issue8240  closed by pitrou
#15392: Create a unittest framework for IDLE
http://bugs.python.org/issue15392  closed by terry.reedy
#17206: Py_XDECREF() expands its argument multiple times
http://bugs.python.org/issue17206  closed by python-dev
#17272: request.full_url: unexpected
results on assignment
http://bugs.python.org/issue17272  closed by orsenthil
#17700: Update Curses HOWTO for 3.4
http://bugs.python.org/issue17700  closed by akuchling
#17746: test_shutil.TestWhich.test_non_matching_mode fails when runnin
http://bugs.python.org/issue17746  closed by serhiy.storchaka
#17768: _decimal: allow NUL fill character
http://bugs.python.org/issue17768  closed by skrah
#17953: sys.modules cannot be reassigned
http://bugs.python.org/issue17953  closed by brett.cannon
#18009: os.write.__doc__ is misleading
http://bugs.python.org/issue18009  closed by python-dev
#18011: Inconsistency between b32decode() documentation, docstring and
http://bugs.python.org/issue18011  closed by serhiy.storchaka
#18025: Buffer overflow in BufferedIOBase.readinto()
http://bugs.python.org/issue18025  closed by serhiy.storchaka
#18040: SIGINT catching regression on windows in 2.7
http://bugs.python.org/issue18040  closed by tim.golden
#18046: Simplify and clarify logging internals
http://bugs.python.org/issue18046  closed by python-dev
#18047: Descriptors get invoked in old-style objects and classes
http://bugs.python.org/issue18047  closed by rhettinger
#18051: get_python_version undefined in bdist_rpm.py
http://bugs.python.org/issue18051  closed by Craig.McDaniel
#18063: m68k struct alignment issue vs.
PyException_HEAD
http://bugs.python.org/issue18063  closed by pitrou
#18067: In considering the current directory for importing modules, Py
http://bugs.python.org/issue18067  closed by r.david.murray
#18070: change importlib.util.module_for_loader to unconditionally set
http://bugs.python.org/issue18070  closed by brett.cannon
#18072: Add an implementation of importlib.abc.InspectLoader.get_code
http://bugs.python.org/issue18072  closed by brett.cannon
#18074: argparse: Namespace can contain critical characters
http://bugs.python.org/issue18074  closed by Sworddragon
#18075: Infinite recursion tests triggering a segfault
http://bugs.python.org/issue18075  closed by lukasz.langa
#18077: dis.dis raises IndexError
http://bugs.python.org/issue18077  closed by serhiy.storchaka
#18079: documentation tutorial 3.1.3 typo
http://bugs.python.org/issue18079  closed by serhiy.storchaka
#18080: setting CC no longer overrides default linker for extension mo
http://bugs.python.org/issue18080  closed by ned.deily
#18084: wave.py should use sys.byteorder to determine endianess
http://bugs.python.org/issue18084  closed by serhiy.storchaka
#18086: Create importlib.util.set_attrs_post_import()
http://bugs.python.org/issue18086  closed by brett.cannon
#18087: os.listdir don't show hidden option
http://bugs.python.org/issue18087  closed by serhiy.storchaka
#18097: spam
http://bugs.python.org/issue18097  closed by amaury.forgeotdarc
#18098: "Build Applet.app" build fails on OS X 10.8
http://bugs.python.org/issue18098  closed by ned.deily
#18107: 'str(long)' can be made faster
http://bugs.python.org/issue18107  closed by arigo
#967161: pty.spawn() enhancements
http://bugs.python.org/issue967161  closed by akuchling
#706406: fix bug #685846: raw_input defers signals
http://bugs.python.org/issue706406  closed by amaury.forgeotdarc
#592703: HTTPS does not handle pipelined requests
http://bugs.python.org/issue592703  closed by amaury.forgeotdarc
#1554133: PyOS_InputHook() and related API funcs.
not documented
http://bugs.python.org/issue1554133  closed by akuchling

From nad at acm.org Fri May 31 23:14:34 2013
From: nad at acm.org (Ned Deily)
Date: Fri, 31 May 2013 14:14:34 -0700
Subject: [Python-Dev] Problem building Python 2.7.5 with separate sysroot
References: <1369986770.4119.43.camel@homebase> <1370000129.4119.53.camel@homebase>
Message-ID:

In article <1370000129.4119.53.camel at homebase>, Paul Smith wrote:
> It seems to me (keeping with the theme of this mailing list) that the
> add_multiarch_paths() function in setup.py is not right. [...]

If you think there is a problem, please open an issue for it on the Python bug tracker: http://bugs.python.org. Thanks!

--
Ned Deily, nad at acm.org