From masklinn at masklinn.net Mon Oct 4 10:34:33 2010 From: masklinn at masklinn.net (Masklinn) Date: Mon, 4 Oct 2010 10:34:33 +0200 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: Message-ID: <7E929BBF-952E-4B86-BBDE-E7C8AD437337@masklinn.net> On 2010-10-04, at 05:04 , Eviatar Bach wrote: > Hello, > > I have a proposal of making the range() function inclusive; that is, > range(3) would generate 0, 1, 2, and 3, as opposed to 0, 1, and 2. Not only > is it more intuitive, it also seems to be used often, with coders often > writing range(0, example+1) to get the intended result. It would be easy to > implement, and though significant, is not any more drastic than changing > print to a function in Python 3. Of course, if this were done, slicing > behaviour would have to be adjusted accordingly. > > What are your thoughts? Same as the others: 0. This is a discussion for python-ideas, I'm CCing that list 1. This is a major backwards compatibility breakage, and one which is entirely silent (`print` from keyword to function wasn't) 2. It loses not only well-known behavior but interesting properties as well (`range(n)` has exactly `n` elements. With your proposal, it has ``n+1`` breaking ``for i in range(5)`` to iterate 5 times as well as ``for i in range(len(collection))`` for cases where e.g. ``enumerate`` is not good enough or too slow) 3. As well as the relation between range and slices 4. I fail to see how it is more intuitive (let alone more practical, see previous points) 5. If you want an inclusive range, I'd recommend proposing a flag on `range` (e.g. ``inclusive=True``) rather than such a drastic breakage of ``range``'s behavior. That, at least, might have a chance. Changing the existing default behavior of range most definitely doesn't. I'd be -1 on your proposal, -0 on adding a flag to ``range`` (I can't recall the half-open ``range`` having bothered me recently, if ever) From ncoghlan at gmail.com Mon Oct 4 14:59:51 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 4 Oct 2010 22:59:51 +1000 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: Message-ID: On Mon, Oct 4, 2010 at 5:27 PM, Xavier Morel wrote: > Same as the others: > 0. This is a discussion for python-ideas, I'm CCing that list > 1. This is a major backwards compatibility breakage, and one which is entirely silent (`print` from keyword to function wasn't) > 2. It loses not only well-known behavior but interesting properties as well (`range(n)` has exactly `n` elements. With your proposal, it has ``n+1`` breaking ``for i in range(5)`` to iterate 5 times as well as ``for i in range(len(collection))`` for cases where e.g. ``enumerate`` is not good enough or too slow) > 3. As well as the relation between range and slices > 4. I fail to see how it is more intuitive (let alone more practical, see previous points) > 5. If you want an inclusive range, I'd recommend proposing a flag on `range` (e.g. ``inclusive=True``) rather than such a drastic breakage of ``range``'s behavior. That, at least, might have a chance. Changing the existing default behavior of range most definitely doesn't. A flag doesn't have any chance either - you spell inclusive ranges by including a "+1" on the stop value. Closed ranges actually do superficially appear more intuitive (especially to new programmers) because we often use inclusive ranges in ordinary speech ("10-15 people" allows 15 people, "ages 8-12" includes 12 year olds, "from A-Z" includes items starting with "Z").
However, there are some cases where we naturally use half-open ranges as well (such as "between 10 and 12" excluding 12:01 to 12:59) or explicitly invoke exclusive ranges as being easier to deal with (such as the "under 13s", "under 19s", etc. naming schemes used for age brackets in junior sports). However, as soon as you move into the mathematical world (including programming), closed ranges turn out to require constant adjustments in the arithmetic, so it is far more natural to use half-open ranges consistently. Xavier noted the two most important properties of half-closed ranges for Python: they match the definition of subtraction, such that len(range(start, stop)) = (stop - start), and they match the definition of slicing as being half-open. As to whether slicing itself being half-open is beneficial, the value of that becomes clear once you start trying to manipulate ranges: With half-open slices, the following is true: s == s[:i] + s[i:] With inclusive slices (which would be needed to complement inclusive range), you would need either a -1 on the stop value of the first slice, or a +1 on the start value of the second slice. Similarly, if you know the length of the slice you want, then you can grab it via s[i:i+slice_len], while you'd need a -1 correction on the stop value if slices were inclusive. There are other benefits to half-open ranges when it comes to (approximately) continuous spectra like time values, floating point numbers and lexically ordered strings. Being able to say things like "10:00" <= x < "12:00", 10.0 <= x < 12.0, "a" <= x < "n" is much clearer than trying to specify their closed range equivalents. While that isn't specifically applicable to the range() builtin, it is another factor in why it is important to drink the "half-open ranges are your friend" Kool-aid as a serious programmer. Cheers, Nick. P.S. many of the points above are just rephrased from http://www.siliconbrain.com/ranges.htm, which is the first hit when Googling "half-open ranges" -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From cmjohnson.mailinglist at gmail.com Tue Oct 5 10:54:04 2010 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Mon, 4 Oct 2010 22:54:04 -1000 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: Message-ID: Changing range would only make sense if lists were also changed to start at 1 instead of 0, and that's never gonna happen. It's a massively backwards incompatible change with no real offsetting advantage. Still, if you were designing a brand new language today, would you have arrays/lists start at 0 or 1? (Or compromise and do .5?) I personally lean towards 1, since I recall being frequently tripped up by the first element in an array being a[0] way back when I first learned C++ in the 20th century. But maybe this was because I had been messed up by writing BASIC for loops from 1 to n before that? Is there anyone with teaching experience here? Is this much of a problem for young people learning Python (or any other zero-based indexing language) as their first language? What do you guys think? Now that simplifying pointer arithmetic isn't such an important consideration, is it still better to do zero-based indexing? -- Carl Johnson
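The invariants Nick lists above are easy to check directly; here is a minimal sketch using nothing but builtins, with arbitrary example values (the variable names are purely illustrative):

s = list("abcdefg")
start, stop, n = 2, 5, 3

# A half-open range has exactly (stop - start) elements.
assert len(range(start, stop)) == stop - start

# A sequence splits cleanly at any index, with no +1/-1 corrections.
for i in range(len(s) + 1):
    assert s == s[:i] + s[i:]

# A slice of known length n starting at i is simply s[i:i+n].
assert len(s[start:start + n]) == n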
From masklinn at masklinn.net Tue Oct 5 11:08:35 2010 From: masklinn at masklinn.net (Masklinn) Date: Tue, 5 Oct 2010 11:08:35 +0200 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: Message-ID: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> On 2010-10-05, at 10:54 , Carl M. Johnson wrote: > Changing range would only make sense if lists were also changed to > start at 1 instead of 0, and that's never gonna happen. It's a > massively backwards incompatible change with no real offsetting > advantage. > > Still, if you were designing a brand new language today, would you > have arrays/lists start at 0 or 1? (Or compromise and do .5?) I > personally lean towards 1, since I recall being frequently tripped up > by the first element in an array being a[0] way back when I first > learn C++ in the 20th century. But maybe this was because I had been > messed up by writing BASIC for loops from 1 to n before that? Is there > anyone with teaching experience here? Is this much of a problem for > young people learning Python (or any other zero-based indexing > language) as their first language? > > What do you guys think?
Now that simplifying pointer arithmetic isn't > such an important consideration, is it still better to do zero-based > indexing? I will refer to EWD 831[0], which talks about ranges and starting indexes without *once* referring to pointers. Pointers are in fact entirely irrelevant to the discussion: FORTRAN and ALGOL 60, among many others, used 1-indexed collections. Some languages (Ada, I believe, though I am by no means certain) also allow for arbitrary starting indexes. [0] http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF From cmjohnson.mailinglist at gmail.com Tue Oct 5 12:52:39 2010 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Tue, 5 Oct 2010 00:52:39 -1000 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> Message-ID: I did some research before posting and saw that they talked about that Dijkstra paper on C2's page about zero indexing, and honestly, I count it as a point in favor of starting with 1. Dijkstra was a great computer scientist but a terrible computer programmer (with the exception of "Goto Considered Harmful" [a headline he didn't actually give to his article]) in the sense that he understood how to do things mathematically but not how to take into account the human factors in such a way that one can get normal people to program well. His theory that we should all be proving the correctness of our programs is, to my way of thinking, a crank's theory. If regular people can't be trusted to program, they certainly can't be trusted to write correctness proofs, which is a harder task, not a simpler one. Moreover, this ignores all of the stuff that Paul Graham would eventually say about the joys of exploratory programming, or to give an earlier reference, the need to build one to throw away as Brooks said. Proving correctness presumes that you know what you want to program before you start programming it, which is only rarely the case, mostly in the computer science classroom. So, I don't consider Dijkstra's expertise to be worth relying on in matters of programming, as distinct from matters of computer science. In the particular case, the correct way to represent an integer between 2 and 12 wouldn't be a, b, c, or d. It would be i in range(2, 12) (if we were creating a new language that was 1-indexed and range was likewise adjusted), the list [1] would be range(1), and the empty list would be range(0), so the whole issue could be neatly sidestepped. :-) As for l == l[:x] + l[x:y] + l[y:] where y > x, I think a case can be made that it would be less confusing as l == l[:x] + l[x+1:y] + l[y+1:], since you don't want to start again with x or y. You just ended at x. When you pick up again, you want to start at x+1 and y+1 so that you don't get the x-th and y-th elements again. ;-) Of course this is speculation on my part. Maybe students of programming find 1-indexing just as confusing as 0-indexing. Any pedagogues want to chime in? -- Carl Johnson From masklinn at masklinn.net Tue Oct 5 13:05:43 2010 From: masklinn at masklinn.net (Masklinn) Date: Tue, 5 Oct 2010 13:05:43 +0200 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> Message-ID: <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> On 2010-10-05, at 12:52 , Carl M. Johnson wrote: > In the particular case, the correct way to represent an integer > between 2 and 12 wouldn't be a, b, c, or d.
It would be i in range(2, > 12) You don't seem to realize that a, b, c and d are behaviors of languages, and that `range` can map to any of these 4 behaviors. The current `range` implements behavior `a`, the proposed one implements behavior c. a, b, c and d are simply descriptions of these behaviors in mathematical terms so as not to rely on language-specific concepts. > (if we were creating a new language that was 1 indexed and range > was likewise adjusted), the list [1] would be range(1), and the empty > list would be range(0), so the whole issue could be neatly > sidestepped. :-) I fail to see what gets sidestepped there. Ignored at best. > As for l == l[:x] + l[x:y] + l[y:] where y > x, I think a case can be > made that it would be less confusing as l == l[:x] + l[x+1:y] + > l[y+1:], since you don't want to start again with x or y. Why not? > You just > ended at x. When you pick up again, you want to start at x+1 and y+1 > so that you don't get the x-th and y-th elements again. ;-) Yes indeed, as you demonstrate here closed ranges greatly complexify the code one has to write compared to half-closed ranges. From bborcic at gmail.com Tue Oct 5 13:45:56 2010 From: bborcic at gmail.com (Boris Borcic) Date: Tue, 05 Oct 2010 13:45:56 +0200 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: Message-ID: Nick Coghlan wrote: > [...] Being able to say things like > "10:00"<= x< '12:00", 10.0<= x< 12.0, "a"<= x< "n" are much > clearer than trying to specify their closed range equivalents. makes one wonder about syntax like : for 10 <= x < 20 : blah(x) Mh, I suppose with rich comparisons special methods, it's possible to turn chained comparisons into range factories without introducing new syntax. Something more like for x in (10 <= step(1) < 20) : blah(x) From cmjohnson.mailinglist at gmail.com Tue Oct 5 13:51:10 2010 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Tue, 5 Oct 2010 01:51:10 -1000 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> Message-ID: On Tue, Oct 5, 2010 at 1:05 AM, Masklinn wrote: >> (if we were creating a new language that was 1 indexed and range >> was likewise adjusted), the list [1] would be range(1), and the empty >> list would be range(0), so the whole issue could be neatly >> sidestepped. :-) > I fail to see what gets sidestepped there. Ignored at best. He was trying to be language neutral by writing using < and <= but that's part of his problem. He's too much of a mathematician. Rewriting things so that they don't use < or <= at all is the best way to explain things to a non-math person. If you say "range(1, 5) gives a range from 1 to 5" your explanation doesn't have to use < or <= at all. This is unlike a C-like language where you would write int i=2; i<12; i++. So the question of what mathematics "really" underlies it can be sidestepped by using a language that many people know better than the language of mathematics: the English language. >> As for l == l[:x] + l[x:y] + l[y:] where y > x, I think a case can be >> made that it would be less confusing as l == l[:x] + l[x+1:y] + >> l[y+1:], since you don't want to start again with x or y. > Why not? Because (speaking naively) I already FizzBuzzed the x-th element before. I don't want to double FizzBuzz it. So that means I should start up again with the +1 element. >> You just >> ended at x. 
When you pick up again, you want to start at x+1 and y+1 >> so that you don't get the x-th and y-th elements again. ;-) > Yes indeed, as you demonstrate here closed ranges greatly complexify the code one has to write compared to half-closed ranges. Yup. TANSTAAFL. That's why we shouldn't actually bother to change things: you lose on the backend what you gain on the frontend. I'm just curious about whether starting programmers have a strong preference for one or the other convention or whether both are confusing. From masklinn at masklinn.net Tue Oct 5 14:10:46 2010 From: masklinn at masklinn.net (Masklinn) Date: Tue, 5 Oct 2010 14:10:46 +0200 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> Message-ID: <107B0DE5-08EC-4933-8B02-32AE0CCE7BD2@masklinn.net> On 2010-10-05, at 13:51 , Carl M. Johnson wrote: > On Tue, Oct 5, 2010 at 1:05 AM, Masklinn wrote: > >>> (if we were creating a new language that was 1 indexed and range >>> was likewise adjusted), the list [1] would be range(1), and the empty >>> list would be range(0), so the whole issue could be neatly >>> sidestepped. :-) >> I fail to see what gets sidestepped there. Ignored at best. > > He was trying to be language neutral by writing using < and <= but > that's part of his problem. He's too much of a mathematician. > Rewriting things so that they don't use < or <= at all is the best way > to explain things to a non-math person. If you say "range(1, 5) gives > a range from 1 to 5" your explanation doesn't have to use < or <= at > all. But again, you don't sidestep anything. "a range from 1 to 5" is ambiguous and can be understood as any of the 4 relations Dijkstra provides. So it's only a good way to explain it in that 0. it doesn't expose a reader to semi-mathematical notation anybody over 12 should be able to understand and 1. it avoids any semblance of unambiguity and instead decides to leave all interpretation to the reader. > This is unlike a C-like language where you would write int i=2; > i<12; i++. Uh what? > So the question of what mathematics "really" underlies it > can be sidestepped by using a language that many people know better > than the language of mathematics: the English language. No. As I said, it doesn't sidestep the issue but ignores it by replacing perfectly unambiguous notation by utterly ambiguous description. From fuzzyman at voidspace.org.uk Tue Oct 5 15:07:41 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 5 Oct 2010 14:07:41 +0100 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> Message-ID: On 5 October 2010 12:51, Carl M. Johnson wrote: > [snip...] > > Yup. TANSTAAFL. That's why we shouldn't actually bother to change > things: you lose on the backend what you gain on the frontend. I'm > just curious about whether starting programmers have a strong > preference for one or the other convention or whether both are > confusing. > Both teaching new programmers and programmers coming from other languages I've found them confused by the range behaviour and usually end up having to apologise for it (a sure sign of a language wart). It is *good* that range(5) produces 5 values (0 to 4) but *weird* that range(3, 10) doesn't include the 10. Changing it now would be *very* backwards incompatible of course. Python 4 perhaps? 
All the best, Michael Foord > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk -------------- next part -------------- An HTML attachment was scrubbed... URL: From ctb at msu.edu Tue Oct 5 15:13:56 2010 From: ctb at msu.edu (C. Titus Brown) Date: Tue, 5 Oct 2010 06:13:56 -0700 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> Message-ID: <20101005131356.GA21646@idyll.org> On Tue, Oct 05, 2010 at 02:07:41PM +0100, Michael Foord wrote: > On 5 October 2010 12:51, Carl M. Johnson wrote: > > > [snip...] > > > > Yup. TANSTAAFL. That's why we shouldn't actually bother to change > > things: you lose on the backend what you gain on the frontend. I'm > > just curious about whether starting programmers have a strong > > preference for one or the other convention or whether both are > > confusing. > > Both teaching new programmers and programmers coming from other languages > I've found them confused by the range behaviour and usually end up having to > apologise for it (a sure sign of a language wart). > > It is *good* that range(5) produces 5 values (0 to 4) but *weird* that > range(3, 10) doesn't include the 10. > > Changing it now would be *very* backwards incompatible of course. Python 4 > perhaps? Doesn't it make sense that len(range(5)) == 5 and for i in range(5): ... mimics the C/C++ behavior of for (i = 0; i < 5; i++) ... ? --titus -- C. Titus Brown, ctb at msu.edu From fuzzyman at voidspace.org.uk Tue Oct 5 15:16:20 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 05 Oct 2010 14:16:20 +0100 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: <20101005131356.GA21646@idyll.org> References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <20101005131356.GA21646@idyll.org> Message-ID: <4CAB2524.1010008@voidspace.org.uk> On 05/10/2010 14:13, C. Titus Brown wrote: > On Tue, Oct 05, 2010 at 02:07:41PM +0100, Michael Foord wrote: >> On 5 October 2010 12:51, Carl M. Johnsonwrote: >> >>> [snip...] >>> >>> Yup. TANSTAAFL. That's why we shouldn't actually bother to change >>> things: you lose on the backend what you gain on the frontend. I'm >>> just curious about whether starting programmers have a strong >>> preference for one or the other convention or whether both are >>> confusing. >> Both teaching new programmers and programmers coming from other languages >> I've found them confused by the range behaviour and usually end up having to >> apologise for it (a sure sign of a language wart). >> >> It is *good* that range(5) produces 5 values (0 to 4) but *weird* that >> range(3, 10) doesn't include the 10. >> >> Changing it now would be *very* backwards incompatible of course. Python 4 >> perhaps? > Doesn't it make sense that > > len(range(5)) == 5 > > and > > for i in range(5): > ... > > mimics the C/C++ behavior of > > for (i = 0; i< 5; i++) ... > Yes. That is why I said that the current behaviour of range for a single input is *good*. Perhaps I should have been clearer; it is only the behaviour of range(x, y) that I've found people-new-to-python confused by. All the best, Michael > ? > > --titus -- http://www.voidspace.org.uk/blog READ CAREFULLY. 
By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (?BOGUS AGREEMENTS?) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer. From alexander.belopolsky at gmail.com Tue Oct 5 16:33:14 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 5 Oct 2010 10:33:14 -0400 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: <4CAB2524.1010008@voidspace.org.uk> References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <20101005131356.GA21646@idyll.org> <4CAB2524.1010008@voidspace.org.uk> Message-ID: On Tue, Oct 5, 2010 at 9:16 AM, Michael Foord wrote: > ... Perhaps I should have been clearer; it is only the > behaviour of range(x, y) that I've found people-new-to-python confused by. Teach them about range(x, y, z) and once you cover negative z they will stop complaining about range(x, y). :-) At least you don't have to deal with range vs. xrange in 3.x anymore. IMO, range([start,] stop[, step]) is one of the worst interfaces in python. Is there any other function with an optional *first* argument? Why range(date(2010, 1, 1), date(2010, 2, 1), timedelta(1)) cannot be used to produce days in January? Why range(2**300) succeeds, but len(range(2**300)) raises OverflowError? No, I don't think much can be done about it. Py3k has already done everything that was practical about improving range(..). From masklinn at masklinn.net Tue Oct 5 16:47:33 2010 From: masklinn at masklinn.net (Masklinn) Date: Tue, 5 Oct 2010 16:47:33 +0200 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <20101005131356.GA21646@idyll.org> <4CAB2524.1010008@voidspace.org.uk> Message-ID: On 2010-10-05, at 16:33 , Alexander Belopolsky wrote: > On Tue, Oct 5, 2010 at 9:16 AM, Michael Foord wrote: >> ... Perhaps I should have been clearer; it is only the >> behaviour of range(x, y) that I've found people-new-to-python confused by. > Teach them about range(x, y, z) and once you cover negative z they > will stop complaining about range(x, y). :-) > > At least you don't have to deal with range vs. xrange in 3.x anymore. > IMO, range([start,] stop[, step]) is one of the worst interfaces in > python. Is there any other function with an optional *first* > argument? Dict, kinda, though the other arguments are keywords so it probably doesn't count. > Why range(date(2010, 1, 1), date(2010, 2, 1), timedelta(1)) > cannot be used to produce days in January? Likewise for range('a', 'e'). Range only working on integers is definitely annoying compared to the equivalent construct in Haskell for instance, or Ruby (though ruby has the issue of indistinguishable half-closed and fully-closed ranges when using the operator version). > Why range(2**300) > succeeds, but len(range(2**300)) raises OverflowError? The former overflows in Python 2. It doesn't in Python 3 due to `range` being an iterable not a list. 
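A date-producing range like the one Alexander asks about is easy to hand-roll as a generator; the helper below is hypothetical (nothing like it exists in the standard library) and only handles positive steps:

from datetime import date, timedelta

def generic_range(start, stop, step):
    # Half-open range over any type that supports "+" and "<".
    current = start
    while current < stop:
        yield current
        current += step

# All the days in January 2010, i.e. the moral equivalent of
# range(date(2010, 1, 1), date(2010, 2, 1), timedelta(1)):
january = list(generic_range(date(2010, 1, 1), date(2010, 2, 1), timedelta(days=1)))
assert len(january) == 31
assert january[0] == date(2010, 1, 1) and january[-1] == date(2010, 1, 31)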
From fuzzyman at voidspace.org.uk Tue Oct 5 16:51:20 2010 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 05 Oct 2010 15:51:20 +0100 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <20101005131356.GA21646@idyll.org> <4CAB2524.1010008@voidspace.org.uk> Message-ID: <4CAB3B68.4020304@voidspace.org.uk> On 05/10/2010 15:33, Alexander Belopolsky wrote: > On Tue, Oct 5, 2010 at 9:16 AM, Michael Foord wrote: >> ... Perhaps I should have been clearer; it is only the >> behaviour of range(x, y) that I've found people-new-to-python confused by. > Teach them about range(x, y, z) and once you cover negative z they > will stop complaining about range(x, y). :-) Well, it probably doesn't help (for those coming to Python from languages other than C) that some languages do-the-right-thing with ranges. <0.5 wink> $ irb >> (1..3).to_a => [1, 2, 3] All the best, Michael Foord > At least you don't have to deal with range vs. xrange in 3.x anymore. > IMO, range([start,] stop[, step]) is one of the worst interfaces in > python. Is there any other function with an optional *first* > argument? Why range(date(2010, 1, 1), date(2010, 2, 1), timedelta(1)) > cannot be used to produce days in January? Why range(2**300) > succeeds, but len(range(2**300)) raises OverflowError? > > No, I don't think much can be done about it. Py3k has already done > everything that was practical about improving range(..). -- http://www.voidspace.org.uk/ From alexander.belopolsky at gmail.com Tue Oct 5 17:02:12 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 5 Oct 2010 11:02:12 -0400 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <20101005131356.GA21646@idyll.org> <4CAB2524.1010008@voidspace.org.uk> Message-ID: On Tue, Oct 5, 2010 at 10:47 AM, Masklinn wrote: .. >> ?Why range(2**300) >> succeeds, but len(range(2**300)) raises OverflowError? > The former overflows in Python 2. It doesn't in Python 3 due to `range` being an iterable not a list. This particular wart is the subject of issue 2690. http://bugs.python.org/issue2690 From masklinn at masklinn.net Tue Oct 5 17:03:11 2010 From: masklinn at masklinn.net (Masklinn) Date: Tue, 5 Oct 2010 17:03:11 +0200 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: <4CAB3B68.4020304@voidspace.org.uk> References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <20101005131356.GA21646@idyll.org> <4CAB2524.1010008@voidspace.org.uk> <4CAB3B68.4020304@voidspace.org.uk> Message-ID: <9FC12AF7-688B-40D5-A77F-4C2ED20FAA61@masklinn.net> On 2010-10-05, at 16:51 , Michael Foord wrote: > On 05/10/2010 15:33, Alexander Belopolsky wrote: >> On Tue, Oct 5, 2010 at 9:16 AM, Michael Foord wrote: >>> ... Perhaps I should have been clearer; it is only the >>> behaviour of range(x, y) that I've found people-new-to-python confused by. >> Teach them about range(x, y, z) and once you cover negative z they >> will stop complaining about range(x, y). :-) > > Well, it probably doesn't help (for those coming to Python from languages other than C) that some languages do-the-right-thing with ranges. 
<0.5 wink> > > $ irb > >> (1..3).to_a > => [1, 2, 3] > > All the best, True, likewise for Haskell: Prelude> [0..5] [0,1,2,3,4,5] On the other hand (for Ruby), >> (1...3).to_a => [1, 2] Ruby is also a bit different in that ranges are generally used more for containment-testing (via when) and there is a separate Fixnum.upto for iteration. From denis.spir at gmail.com Tue Oct 5 21:23:27 2010 From: denis.spir at gmail.com (spir) Date: Tue, 5 Oct 2010 21:23:27 +0200 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: Message-ID: <20101005212327.7a8965ff@o> On Tue, 05 Oct 2010 13:45:56 +0200 Boris Borcic wrote: > Nick Coghlan wrote: > > [...] Being able to say things like > > "10:00"<= x< '12:00", 10.0<= x< 12.0, "a"<= x< "n" are much > > clearer than trying to specify their closed range equivalents. > > makes one wonder about syntax like : > > for 10 <= x < 20 : > blah(x) > > > Mh, I suppose with rich comparisons special methods, it's possible to turn > chained comparisons into range factories without introducing new syntax. > Something more like > > > for x in (10 <= step(1) < 20) : > blah(x) About notation, even if loved right-hand-half-open intervals, I would wonder about [a,b] noting it. I guess 99.9% of programmers and novices (even purely amateur) have learnt about intervals at school in math courses. Both notations I know of use [a,b] for closed intervals, while half-open ones are noted either [a,b[ or [a,b). Thus, for me, the present C/python/etc notation is at best misleading. So, what about a hypothetical language using directly math *unambiguous* notation, thus also letting programmers chose their preferred semantics (without fooling others)? End of war? Denis -- -- -- -- -- -- -- vit esse estrany ? spir.wikidot.com From python at mrabarnett.plus.com Tue Oct 5 21:43:29 2010 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 05 Oct 2010 20:43:29 +0100 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: <20101005212327.7a8965ff@o> References: <20101005212327.7a8965ff@o> Message-ID: <4CAB7FE1.2040902@mrabarnett.plus.com> On 05/10/2010 20:23, spir wrote: > On Tue, 05 Oct 2010 13:45:56 +0200 > Boris Borcic wrote: > >> Nick Coghlan wrote: >>> [...] Being able to say things like >>> "10:00"<= x< '12:00", 10.0<= x< 12.0, "a"<= x< "n" are much >>> clearer than trying to specify their closed range equivalents. >> >> makes one wonder about syntax like : >> >> for 10<= x< 20 : >> blah(x) >> >> >> Mh, I suppose with rich comparisons special methods, it's possible to turn >> chained comparisons into range factories without introducing new syntax. >> Something more like >> >> >> for x in (10<= step(1)< 20) : >> blah(x) > > About notation, even if loved right-hand-half-open intervals, I would wonder about [a,b] noting it. I guess 99.9% of programmers and novices (even purely amateur) have learnt about intervals at school in math courses. Both notations I know of use [a,b] for closed intervals, while half-open ones are noted either [a,b[ or [a,b). Thus, for me, the present C/python/etc notation is at best misleading. > So, what about a hypothetical language using directly math *unambiguous* notation, thus also letting programmers chose their preferred semantics (without fooling others)? End of war? > [Oops! Post sent to wrong list!] Dijkstra came to his conclusion after seeing the results of students using the programming language Mesa, which does support all 4 forms of interval. 
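Boris Borcic's chained-comparison idea quoted above can in fact be prototyped with rich comparison methods alone; the step class below is purely hypothetical, ignores negative steps and error handling, and is meant only as a sketch showing that no new syntax would be required:

class step:
    # A chained comparison such as (10 <= step(1) < 20) records its bounds
    # via the rich comparison hooks, and the (truthy) result can then be
    # iterated over as a range.
    def __init__(self, size=1):
        self.size = size
        self.lower = self.upper = None
        self.lower_inclusive = True
        self.upper_inclusive = False

    def __ge__(self, other):    # lower <= step(...) resolves to step.__ge__
        self.lower, self.lower_inclusive = other, True
        return self

    def __gt__(self, other):    # lower < step(...)
        self.lower, self.lower_inclusive = other, False
        return self

    def __le__(self, other):    # step(...) <= upper
        self.upper, self.upper_inclusive = other, True
        return self

    def __lt__(self, other):    # step(...) < upper
        self.upper, self.upper_inclusive = other, False
        return self

    def __iter__(self):
        current = self.lower if self.lower_inclusive else self.lower + self.size
        while current < self.upper or (self.upper_inclusive and current == self.upper):
            yield current
            current += self.size

assert list(10 <= step(1) < 20) == list(range(10, 20))
assert list(10 <= step(1) <= 20) == list(range(10, 21))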
From tjreedy at udel.edu Tue Oct 5 23:41:03 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 05 Oct 2010 17:41:03 -0400 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: Message-ID: On 10/5/2010 4:54 AM, Carl M. Johnson wrote: > Changing range would only make sense if lists were also changed to > start at 1 instead of 0, and that's never gonna happen. It's a > massively backwards incompatible change with no real offsetting > advantage. > > Still, if you were designing a brand new language today, would you > have arrays/lists start at 0 or 1? (Or compromise and do .5?) I > personally lean towards 1, since I recall being frequently tripped up > by the first element in an array being a[0] way back when I first > learn C++ in the 20th century. But maybe this was because I had been > messed up by writing BASIC for loops from 1 to n before that? Is there > anyone with teaching experience here? Is this much of a problem for > young people learning Python (or any other zero-based indexing > language) as their first language? > > What do you guys think? Now that simplifying pointer arithmetic isn't > such an important consideration, is it still better to do zero-based > indexing? Sequences are often used as and can be viewed as tabular representations of functions for equally spaced inputs a+0*b, a+1*b, ..., a+i*b, .... In the simplest case, a==0 and b==1, so that the sequence directly maps counts 0,1,2,... to values. Without the 0 index, one must subtract 1 from each index to have the same effect. Pointer arithmetic is an example of the utility of keeping the 0 term, but only one such example of many. When one uses iterators instead of sequences, as in more common in Python 3, there is no inherent index to worry about or argue over. def inner_product(p,q): # no equal, finite len() check! sum = 0 for a,b in zip(p,q): sum += a*b No index in sight. -- Terry Jan Reedy From greg.ewing at canterbury.ac.nz Wed Oct 6 01:33:25 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 06 Oct 2010 12:33:25 +1300 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> Message-ID: <4CABB5C5.6000904@canterbury.ac.nz> Carl M. Johnson wrote: > I'm > just curious about whether starting programmers have a strong > preference for one or the other convention or whether both are > confusing. Starting programmers don't have enough experience to judge which will be less confusing in the long run, so their opinion shouldn't be given overriding weight when designing a language intended for real-life use. Speaking as an experienced programmer, I'm convinced that Python has made the right choice. Not because Dijkstra or any other authority says so, but because of my own personal experiences. -- Greg From bruce at leapyear.org Wed Oct 6 01:48:08 2010 From: bruce at leapyear.org (Bruce Leban) Date: Tue, 5 Oct 2010 16:48:08 -0700 Subject: [Python-ideas] [Python-Dev] Inclusive Range In-Reply-To: <4CABB5C5.6000904@canterbury.ac.nz> References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <4CABB5C5.6000904@canterbury.ac.nz> Message-ID: With 1-based indexes, sometimes you have to add 1 and sometimes subtract 1 and sometimes neither. 0-based indexes avoid that problem. Personally, I think changing any of this behavior has about the same probability of success as adding bleen . 
--- Bruce http://www.vroospeak.com http://j.mp/gruyere-security On Tue, Oct 5, 2010 at 4:33 PM, Greg Ewing wrote: > Carl M. Johnson wrote: > >> I'm >> just curious about whether starting programmers have a strong >> preference for one or the other convention or whether both are >> confusing. >> > > Starting programmers don't have enough experience to judge > which will be less confusing in the long run, so their > opinion shouldn't be given overriding weight when designing > a language intended for real-life use. > > Speaking as an experienced programmer, I'm convinced that > Python has made the right choice. Not because Dijkstra or > any other authority says so, but because of my own personal > experiences. > > -- > Greg > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimjjewett at gmail.com Wed Oct 6 15:18:08 2010 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 6 Oct 2010 09:18:08 -0400 Subject: [Python-ideas] Inclusive Range In-Reply-To: <20101005131356.GA21646@idyll.org> References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <20101005131356.GA21646@idyll.org> Message-ID: On 10/5/10, C. Titus Brown wrote: > On Tue, Oct 05, 2010 at 02:07:41PM +0100, Michael Foord wrote: >> It is *good* that range(5) produces 5 values (0 to 4) If not for compatibility, the 5 values (1,2,3,4,5) would be even better. But even in a new language, changing the rest of the language so that (1,2,3,4,5) was more useful might not be a win. > Doesn't it make sense that ... for i in range(5): > mimics the C/C++ behavior of for (i = 0; i < 5; i++) If not for assumed familiarity with C idioms, why shouldn't it instead match for (i=1; i<=5; i++) -jJ From ncoghlan at gmail.com Wed Oct 6 15:58:48 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 6 Oct 2010 23:58:48 +1000 Subject: [Python-ideas] Inclusive Range In-Reply-To: References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <20101005131356.GA21646@idyll.org> Message-ID: On the more general topic of *teaching* 0-based indexing, the best explanation I've seen is the one where 1-based indexing is explained as referring directly to the items in the sequence, while 0-based indexing numbers the implicit gaps between items and then returns the item immediately after the identified gap. Slicing for 0-based indexing can then be explained without needing to talk about half-open ranges at all - you just grab everything between the two identified gaps*. I think the main point here is that these are not independent design decisions - the behaviour of range() (or its equivalent), indexing, slicing, enumeration and anything else related to sequences all comes back to a single fundamental design choice of 1-based vs 0-based indexing. Once you make that initial decision (regardless of the merits either way), other decisions are going to flow from it as consequences, and it isn't really something a language can ever practically tinker with. Cheers, Nick. *(unfortunately, it's a bit trickier to mesh that otherwise clear and concise explanation cleanly with Python's definition of ranges and slicing with negative step values, since those offset everything by one, such that "list(reversed(range(1, 5, 1])) == list(range(4, 0, -1])". 
If I was going to ask for a change to anything in Python's indexing semantics, it would be for negative step values to create ranges that were half-open at the beginning rather than the end, such that reversing a slice just involved swapping the start value with the stop value and negating the step value. As it is, you also have to subtract one from both the start and stop value to get the original range of values back. However, just like the idea of ranges starting from 1 rather than 0, the idea of negative slices giving ranges half-open at the start rather than the end is also doomed by significant problems with backwards compatibility. For a new language, you might be able to make the argument that the alternative behaviour is a better design choice. For an existing one like Python, any possible benefits are so nebulous as to not be worth the inevitable hassle involved in changing the semantics) -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From masklinn at masklinn.net Wed Oct 6 16:36:35 2010 From: masklinn at masklinn.net (Masklinn) Date: Wed, 6 Oct 2010 16:36:35 +0200 Subject: [Python-ideas] Inclusive Range In-Reply-To: References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <20101005131356.GA21646@idyll.org> Message-ID: <583BFEEC-09C8-4B8F-827C-43B4D6403F45@masklinn.net> On 2010-10-06, at 15:58 , Nick Coghlan wrote: > *(unfortunately, it's a bit trickier to mesh that otherwise clear and > concise explanation cleanly with Python's definition of ranges and > slicing with negative step values, since those offset everything by > one, such that "list(reversed(range(1, 5, 1])) == list(range(4, 0, > -1])". I'm not sure about that at all: the index is still right before the item, which is why the last item is `-1` rather than `-0`. And everything flows from that again. From rrr at ronadam.com Wed Oct 6 21:21:03 2010 From: rrr at ronadam.com (Ron Adam) Date: Wed, 06 Oct 2010 14:21:03 -0500 Subject: [Python-ideas] improvements to slicing In-Reply-To: References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <20101005131356.GA21646@idyll.org> Message-ID: <4CACCC1F.4010209@ronadam.com> On 10/06/2010 08:58 AM, Nick Coghlan wrote: > If I was going to ask for a change to anything in Python's > indexing semantics, it would be for negative step values to create > ranges that were half-open at the beginning rather than the end, such > that reversing a slice just involved swapping the start value with the > stop value and negating the step value. Yes, negative slices are very tricky to get right. They could use some attention I think. > As it is, you also have to > subtract one from both the start and stop value to get the original > range of values back. However, just like the idea of ranges starting > from 1 rather than 0, the idea of negative slices giving ranges > half-open at the start rather than the end is also doomed by > significant problems with backwards compatibility. For a new language, > you might be able to make the argument that the alternative behaviour > is a better design choice. For an existing one like Python, any > possible benefits are so nebulous as to not be worth the inevitable > hassle involved in changing the semantics) We don't need to change the current range function/generator to add inclusive or closed ranges. Just add a closed_range() function to the itertools or math module. 
[n for n in closed_range(-5, 5, 2)] --> [-5, -3, -1, 1, 3, 5] I just noticed the __getslice__ method is no longer on sequences. (?) My preference is for slicing to be based more on practical terms for manipulating sequences rather than be defined in a purely mathematical way. 1. Have the direction determine by the start and stop values rather than than by the step value so that the following is true. "abcdefg"[start:stop:step] == "abcdefg"[start:stop][::step] Reversing the slice can be done by simply swapping the start and stop. Negating the slice too would give you ... "abcdefg"[start:stop:step] == "abcdefg"[stop:start:-step] Negating the step would not always give you the reverse sequence for steps larger than 1, because the result may not contain the same values. >>> 'abcd'[::2] 'ac' >>> 'abcd'[::-2] 'db' This is the current behavior and wouldn't change. A positive step value would step from the left, and a negative step value would step from the right of the slice determined by start and stop. This already works if you don't give stop and start values. >>> "abcdefg"[::2] 'aceg' >>> "abcdefg"[::-2] 'geca' And these can be used in for loops or list comps. >>> [c for c in "abcdefg"[::2]] ['a', 'c', 'e', 'g'] If we could add a width value to slices we would be able to do this. >>> "abcdefg"[::2:2] 'abcdefg' ;-) As unimpressive as that looked, when used in a for loop or list comp it would give us an easy and useful way to step through data. [cc for cc in "abcdefg"[::2:2]] --> ['ab', 'cd', 'ef', 'g'] You could also spell that as... list("abcdefg")[::2:2]) --> ['ab', 'cd', 'ef', 'g'] The problems start when you try to use actual index values to specify start and stop ranges. You can't index the last element with an explicit stop value. >>> "abcdefg"[0:-1] 'abcdef' >>> "abcdefg"[0:-0] '' But we can use "None" which is awkward and requires testing the stop value when the index is supplied by a variable. >>> 'abcdefg'[:None] 'abcdefg' I'm not sure how to fix this one. We've been living with this for a long time so it's not like we need to fix it all at once. Negative indexes can be confusing. >>> "abcdefg"[-5:5] 'cde' # OK >>> "abcdefg"[5:-5] '' # Expected "edc'" here, not ''. >>> "abcdefg"[5:-5:-1] 'fed' # Expected reverse of '' here, # or 'cde', not 'fed'. With the suggested change we get... >>> "abcdefg"[-5:5] 'cde' # Stays the same. >>> "abcdefg"[5:-5] 'edc' # Swapping start and stop reverses it. >>> "abcdefg"[5:-5:-1] 'cde' # Negating the step, reverses it again. I think these are easier to use than the current behavior. It doesn't change slices using positive indexes and steps so maybe it's not so backward incompatible to sneak in. ;-) Ron From rrr at ronadam.com Thu Oct 7 01:05:01 2010 From: rrr at ronadam.com (Ron Adam) Date: Wed, 06 Oct 2010 18:05:01 -0500 Subject: [Python-ideas] A general purpose range iterator (Re: improvements to slicing) In-Reply-To: References: Message-ID: <4CAD009D.6000805@ronadam.com> On 10/06/2010 03:41 PM, Nick Coghlan wrote: > On Thu, Oct 7, 2010 at 5:21 AM, Ron Adam wrote: >> I think these are easier to use than the current behavior. It doesn't >> change slices using positive indexes and steps so maybe it's not so backward >> incompatible to sneak in. ;-) > > I think that sound you just heard was thousands of SciPy users crying > out in horror ;) LOL, and my apologies to the SciPi users. I did try to google to find routines where the stop and start indexes converge and result in an empty list as Spir suggested, but my google foo seems to be broken today. 
Maybe someone can point me in the right google direction. > Given a "do over", there a few things I would change about Python's > range generation and extended slicing. Others would clearly change a > few different things. Given the dual barriers of "rough consensus and > running code", I don't think there are any *specific* changes that > would make it through the gauntlet. Yes, any changes would probably need to be done in a way that can be imported and live along side the current range and slice. > The idea of a *generalised* range generator is in interesting one > though. One that was simply: > > _d = object() > def irange(start=_d, stop=_d, step=1, *, include_start=True, > include_stop=False): > # Match signature of range while still allowing stop=val as the > only keyword argument > if stop is _d: > start, stop = 0, start > elif start is _d: > start = 0 > if include_start: > yield start > current = start > while 1: > current += step > if current>= stop: > break > yield current > if include_stop and current == stop: > yield stop > > Slower than builtin range() for the integer case, but works with > arbitrary types (e.g. float, Decimal, datetime) It wouldn't be that much slower if it returns another range object with the index's adjusted. But then it wouldn't be able to use the arbitrary types. A wild idea that keeps nudging my neurons is that sequence iterators or index objectes like slice objects could maybe be added before applying them to a final sequence. Sort of an iter math. It isn't as simple as just putting the iterators in a list and iterating the iterators in order, although that works for some things. Ron From denis.spir at gmail.com Wed Oct 6 22:39:11 2010 From: denis.spir at gmail.com (spir) Date: Wed, 6 Oct 2010 22:39:11 +0200 Subject: [Python-ideas] improvements to slicing In-Reply-To: <4CACCC1F.4010209@ronadam.com> References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <20101005131356.GA21646@idyll.org> <4CACCC1F.4010209@ronadam.com> Message-ID: <20101006223911.7fea02e1@o> On Wed, 06 Oct 2010 14:21:03 -0500 Ron Adam wrote: > 1. Have the direction determine by the start and stop values rather than > than by the step value so that the following is true. > > "abcdefg"[start:stop:step] == "abcdefg"[start:stop][::step] Please provide an example with current and proposed semantics. If I understand correctly, this does not work in practice. When range bounds are variable (result from computation), the upper one can happen to be smaller than the upper one and we just want the resulting sub-sequence to be empty. This is normal and common use case, and this is good. (upper <= lower) ==> [] Else many routine would have to special-case (upper < lower). Your proposal, again if I understand, would break this semantics, instead returning a sub-sequence in reverse order. Denis -- -- -- -- -- -- -- vit esse estrany ? 
spir.wikidot.com From raymond.hettinger at gmail.com Wed Oct 6 22:35:39 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Wed, 6 Oct 2010 13:35:39 -0700 Subject: [Python-ideas] improvements to slicing In-Reply-To: <4CACCC1F.4010209@ronadam.com> References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <20101005131356.GA21646@idyll.org> <4CACCC1F.4010209@ronadam.com> Message-ID: <991280BF-31D6-4B2B-B591-1E7A1DAB467B@gmail.com> On Oct 6, 2010, at 12:21 PM, Ron Adam wrote: > We don't need to change the current range function/generator to add inclusive or closed ranges. Just add a closed_range() function to the itertools or math module. > > [n for n in closed_range(-5, 5, 2)] --> [-5, -3, -1, 1, 3, 5] If I were a betting man, I would venture that you could post a recipe for closed_range(), publicize it on various mailing lists, mention it in talks, and find that it would almost never get used. There's nothing wrong with the idea, but the YAGNI factor will be hard to overcome. IMO, this would become cruft on the same day it gets added to the library. OTOH for numerical applications, there is utility for a floating point variant, something like linspace() in MATLAB. Raymond From denis.spir at gmail.com Wed Oct 6 22:11:59 2010 From: denis.spir at gmail.com (spir) Date: Wed, 6 Oct 2010 22:11:59 +0200 Subject: [Python-ideas] Inclusive Range In-Reply-To: References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <20101005131356.GA21646@idyll.org> Message-ID: <20101006221159.73659a9b@o> On Wed, 6 Oct 2010 23:58:48 +1000 Nick Coghlan wrote: > On the more general topic of *teaching* 0-based indexing, the best explanation I've seen is the one where 1-based indexing is explained as referring directly to the items in the sequence, while 0-based indexing numbers the implicit gaps between items and then returns the item immediately after the identified gap. Slicing for 0-based indexing can then be explained without needing to talk about half-open ranges at all - you just grab everything between the two identified gaps*. In my experience, the only explanation that makes sense for newcomers is that 1-based indexes are just ordinary ordinals like we use everyday, while 0-based ones are _offsets_ measured from the start. It does not really help in practice (people do errors anyway), but at least they understand the logic so can reason when needed, namely to correct their errors. > I think the main point here is that these are not independent design > decisions - the behaviour of range() (or its equivalent), indexing, > slicing, enumeration and anything else related to sequences all comes > back to a single fundamental design choice of 1-based vs 0-based > indexing. I think there are languages with base 0 & closed range, unless it is base 1 & half-open range. Any convention works, practically. Also, the logics every supporter of the C convention, namely the famous tewt by EWD, reverses your argumentation: he show the advantages of helf-open intervals (according to his opinion), then that 0-based indexes fit better with this kind of intervals (ditto). Denis -- -- -- -- -- -- -- vit esse estrany ? 
spir.wikidot.com From ncoghlan at gmail.com Wed Oct 6 22:41:21 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 7 Oct 2010 06:41:21 +1000 Subject: [Python-ideas] A general purpose range iterator (Re: improvements to slicing) Message-ID: On Thu, Oct 7, 2010 at 5:21 AM, Ron Adam wrote: > I think these are easier to use than the current behavior. It doesn't > change slices using positive indexes and steps so maybe it's not so backward > incompatible to sneak in. ;-) I think that sound you just heard was thousands of SciPy users crying out in horror ;) Given a "do over", there are a few things I would change about Python's range generation and extended slicing. Others would clearly change a few different things. Given the dual barriers of "rough consensus and running code", I don't think there are any *specific* changes that would make it through the gauntlet. The idea of a *generalised* range generator is an interesting one though. One that was simply:

_d = object()
def irange(start=_d, stop=_d, step=1, *, include_start=True, include_stop=False):
    # Match signature of range while still allowing stop=val as the only keyword argument
    if stop is _d:
        start, stop = 0, start
    elif start is _d:
        start = 0
    if include_start:
        yield start
    current = start
    while 1:
        current += step
        if current >= stop:
            break
        yield current
    if include_stop and current == stop:
        yield stop

Slower than builtin range() for the integer case, but works with arbitrary types (e.g. float, Decimal, datetime) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From andy at insectnation.org Thu Oct 7 12:35:15 2010 From: andy at insectnation.org (Andy Buckley) Date: Thu, 07 Oct 2010 12:35:15 +0200 Subject: [Python-ideas] improvements to slicing In-Reply-To: <991280BF-31D6-4B2B-B591-1E7A1DAB467B@gmail.com> References: <94F5C718-92AC-41FA-B67E-509D82ECA634@masklinn.net> <076105EE-0F93-4F83-B48C-4B559E132C7E@masklinn.net> <20101005131356.GA21646@idyll.org> <4CACCC1F.4010209@ronadam.com> <991280BF-31D6-4B2B-B591-1E7A1DAB467B@gmail.com> Message-ID: <4CADA263.1020309@insectnation.org> On 06/10/10 22:35, Raymond Hettinger wrote: > > On Oct 6, 2010, at 12:21 PM, Ron Adam wrote: >> We don't need to change the current range function/generator to add inclusive or closed ranges. Just add a closed_range() function to the itertools or math module. >> >> [n for n in closed_range(-5, 5, 2)] --> [-5, -3, -1, 1, 3, 5] > > If I were a betting man, I would venture that you could post > a recipe for closed_range(), publicize it on various mailing > lists, mention it in talks, and find that it would almost never > get used. > > There's nothing wrong with the idea, but the YAGNI factor > will be hard to overcome. IMO, this would become cruft on > the same day it gets added to the library. There are plenty of places in my code where I would find such a thing useful, though... usually where I'm working with pre-determined integer codes (one very specific use-case: elementary particle ID codes, which are integers constructed from quantum number values) and it's simply more elegant and intuitive to specify a range whose requested upper bound is a valid code rather than valid_code+1. IMHO, an extra keyword on range/xrange would allow writing nicer code where applicable, without crufting up the library with whole extra functions. Depends on what you consider more crufty, I suppose, but I agree that ~no-one is going to find and import a new range function.
numpy.linspace uses "endpoint" as the name for such a keyword: http://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html#numpy.linspace but again no-one wants to depend on numpy *just* to get that functionality! So how about range(start, realend, endpoint=True) xrange(start, realend, endpoint=True) with endpoint=False as default? No backward compatibility or performance issues to my (admittedly inexpert) eye. Andy From steve at pearwood.info Mon Oct 11 01:17:54 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 11 Oct 2010 10:17:54 +1100 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> Message-ID: <201010111017.56101.steve@pearwood.info> On Mon, 11 Oct 2010 05:57:21 am Paul McGuire wrote: > Just as an exercise, I wanted to try my hand at adding a function to > the compiled Python C code. An interesting optimization that I read > about (where? don't recall) finds the minimum and maximum elements of > a sequence in a single pass, with a 25% reduction in number of > comparison operations: > - the sequence elements are read in pairs > - each pair is compared to find smaller/greater > - the smaller is compared to current min > - the greater is compared to current max > > So each pair is applied to the running min/max values using 3 > comparisons, vs. 4 that would be required if both were compared to > both min and max. > > This feels somewhat similar to how divmod returns both quotient and > remainder of a single division operation. > > This would be potentially interesting for those cases where min and > max are invoked on the same sequence one after the other, and > especially so if the sequence elements were objects with expensive > comparison operations. Perhaps more importantly, it is ideal for the use-case where you have an iterator. You can't call min() and then max(), as min() consumes the iterator leaving nothing for max(). It may be undesirable to convert the iterator to a list first -- it may be that the number of items in the data stream is too large to fit into memory all at once, but even if it is small, it means you're now walking the stream three times when one would do. To my mind, minmax() is as obvious and as useful a built-in as divmod(), but if there is resistance to making such a function a built-in, perhaps it could go into itertools. (I would prefer it to keep the same signature as min() and max(), namely that it will take either a single iterable argument or multiple arguments.) I've experimented with minmax() myself. Not surprisingly, the performance of a pure Python version doesn't even come close to the built-ins. I'm +1 on the idea. Presumably follow-ups should go to python-ideas. -- Steven D'Aprano From zac256 at gmail.com Mon Oct 11 02:55:51 2010 From: zac256 at gmail.com (Zac Burns) Date: Sun, 10 Oct 2010 17:55:51 -0700 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: <201010111017.56101.steve@pearwood.info> References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> Message-ID: This could be generalized and placed into itertools if we create a function (say, apply for lack of a better name at the moment) that takes in an iterable and creates new iterables that yield each from the original (avoiding the need for a list) holding only one in memory. 
Then you could pass the whatever function you wanted to run the iterables over an get the result back in a tuple. Eg: itertools.apply(iterable, min, max) ~= (min(iterable), max(iterable)) This class that creates 'associated iterables' from an original iterable where each new iterable has to be iterated over at the same time might also be useful in other contexts and could be added to itertools as well. Unfortunately this solution seems incompatable with the implementations with for loops in min and max (EG: How do you switch functions at the right time?) So it might take some tweaking. -- Zachary Burns (407)590-4814 Aim - Zac256FL On Sun, Oct 10, 2010 at 4:17 PM, Steven D'Aprano wrote: > On Mon, 11 Oct 2010 05:57:21 am Paul McGuire wrote: > > Just as an exercise, I wanted to try my hand at adding a function to > > the compiled Python C code. An interesting optimization that I read > > about (where? don't recall) finds the minimum and maximum elements of > > a sequence in a single pass, with a 25% reduction in number of > > comparison operations: > > - the sequence elements are read in pairs > > - each pair is compared to find smaller/greater > > - the smaller is compared to current min > > - the greater is compared to current max > > > > So each pair is applied to the running min/max values using 3 > > comparisons, vs. 4 that would be required if both were compared to > > both min and max. > > > > This feels somewhat similar to how divmod returns both quotient and > > remainder of a single division operation. > > > > This would be potentially interesting for those cases where min and > > max are invoked on the same sequence one after the other, and > > especially so if the sequence elements were objects with expensive > > comparison operations. > > Perhaps more importantly, it is ideal for the use-case where you have an > iterator. You can't call min() and then max(), as min() consumes the > iterator leaving nothing for max(). It may be undesirable to convert > the iterator to a list first -- it may be that the number of items in > the data stream is too large to fit into memory all at once, but even > if it is small, it means you're now walking the stream three times when > one would do. > > To my mind, minmax() is as obvious and as useful a built-in as divmod(), > but if there is resistance to making such a function a built-in, > perhaps it could go into itertools. (I would prefer it to keep the same > signature as min() and max(), namely that it will take either a single > iterable argument or multiple arguments.) > > I've experimented with minmax() myself. Not surprisingly, the > performance of a pure Python version doesn't even come close to the > built-ins. > > I'm +1 on the idea. > > Presumably follow-ups should go to python-ideas. > > > > -- > Steven D'Aprano > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
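For reference, a pure-Python sketch of the pairwise min/max approach described in the quoted text might look like the following. This is only my own illustration of the idea (the thread is about a C-level implementation), and the helper name is mine:

def minmax(iterable):
    # Single pass; about 3 comparisons per pair instead of 4.
    it = iter(iterable)
    try:
        lo = hi = next(it)
    except StopIteration:
        raise ValueError("minmax() arg is an empty sequence")
    for a in it:
        try:
            b = next(it)
        except StopIteration:
            # Odd number of remaining items: fold the last one in directly.
            if a < lo:
                lo = a
            if a > hi:
                hi = a
            break
        if b < a:
            a, b = b, a       # order the pair so that a <= b
        if a < lo:
            lo = a            # only the smaller needs comparing to the minimum
        if b > hi:
            hi = b            # only the larger needs comparing to the maximum
    return lo, hi

For example, minmax(iter([3, 1, 4, 1, 5, 9, 2, 6])) returns (1, 9) while consuming the iterator only once.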
URL: From masklinn at masklinn.net Mon Oct 11 07:50:14 2010 From: masklinn at masklinn.net (Masklinn) Date: Mon, 11 Oct 2010 07:50:14 +0200 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> Message-ID: On 2010-10-11, at 02:55 , Zac Burns wrote: > > Unfortunately this solution seems incompatable with the implementations with > for loops in min and max (EG: How do you switch functions at the right > time?) So it might take some tweaking. As far as I know, there is no way to force lockstep iteration of arbitrary functions in Python. Though an argument could be made for adding coroutine capabilities to builtins and library functions taking iterables, I don't think that's on the books. As a result, this function would devolve into something along the lines of def apply(iterable, *funcs): return map(lambda c: c[0](c[1]), zip(funcs, tee(iterable, len(funcs)))) which would run out of memory on very long or nigh-infinite iterables due to tee memoizing all the content of the iterator. From taleinat at gmail.com Mon Oct 11 22:18:41 2010 From: taleinat at gmail.com (Tal Einat) Date: Mon, 11 Oct 2010 22:18:41 +0200 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> Message-ID: Masklinn wrote: > On 2010-10-11, at 02:55 , Zac Burns wrote: >> >> Unfortunately this solution seems incompatable with the implementations with >> for loops in min and max (EG: How do you switch functions at the right >> time?) So it might take some tweaking. > As far as I know, there is no way to force lockstep iteration of arbitrary functions in Python. Though an argument could be made for adding coroutine capabilities to builtins and library functions taking iterables, I don't think that's on the books. > > As a result, this function would devolve into something along the lines of > > ? ?def apply(iterable, *funcs): > ? ? ? ?return map(lambda c: c[0](c[1]), zip(funcs, tee(iterable, len(funcs)))) > > which would run out of memory on very long or nigh-infinite iterables due to tee memoizing all the content of the iterator. We recently needed exactly this -- to do several running calculations in parallel on an iterable. We avoided using co-routines and just created a RunningCalc class with a simple interface, and implemented various running calculations as sub-classes, e.g. min, max, average, variance, n-largest. This isn't very fast, but since generating the iterated values is computationally heavy, this is fast enough for our uses. Having a standard method to do this in Python, with implementations for common calculations in the stdlib, would have been nice. I wouldn't mind trying to work up a PEP for this, if there is support for the idea. - Tal Einat From p.f.moore at gmail.com Tue Oct 12 17:51:03 2010 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 12 Oct 2010 16:51:03 +0100 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> Message-ID: On 11 October 2010 21:18, Tal Einat wrote: > We recently needed exactly this -- to do several running calculations > in parallel on an iterable. 
We avoided using co-routines and just > created a RunningCalc class with a simple interface, and implemented > various running calculations as sub-classes, e.g. min, max, average, > variance, n-largest. This isn't very fast, but since generating the > iterated values is computationally heavy, this is fast enough for our > uses. > > Having a standard method to do this in Python, with implementations > for common calculations in the stdlib, would have been nice. > > I wouldn't mind trying to work up a PEP for this, if there is support > for the idea. The "consumer" interface as described in http://effbot.org/zone/consumer.htm sounds about right for this:

class Rmin(object):
    def __init__(self):
        self.running_min = None
    def feed(self, val):
        if self.running_min is None:
            self.running_min = val
        else:
            self.running_min = min(self.running_min, val)
    def close(self):
        pass

rmin = Rmin()
for val in iter:
    rmin.feed(val)
print rmin.running_min

Paul. From taleinat at gmail.com Tue Oct 12 21:41:00 2010 From: taleinat at gmail.com (Tal Einat) Date: Tue, 12 Oct 2010 21:41:00 +0200 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> Message-ID: Paul Moore wrote: > On 11 October 2010 21:18, Tal Einat wrote: >> We recently needed exactly this -- to do several running calculations >> in parallel on an iterable. We avoided using co-routines and just >> created a RunningCalc class with a simple interface, and implemented >> various running calculations as sub-classes, e.g. min, max, average, >> variance, n-largest. This isn't very fast, but since generating the >> iterated values is computationally heavy, this is fast enough for our >> uses. >> >> Having a standard method to do this in Python, with implementations >> for common calculations in the stdlib, would have been nice. >> >> I wouldn't mind trying to work up a PEP for this, if there is support >> for the idea. > > The "consumer" interface as described in > http://effbot.org/zone/consumer.htm sounds about right for this: >
> class Rmin(object):
>     def __init__(self):
>         self.running_min = None
>     def feed(self, val):
>         if self.running_min is None:
>             self.running_min = val
>         else:
>             self.running_min = min(self.running_min, val)
>     def close(self):
>         pass
>
> rmin = Rmin()
> for val in iter:
>     rmin.feed(val)
> print rmin.running_min

That's what I was thinking about too. How about something along these lines? http://pastebin.com/DReBL56T I just worked that up now and would like some comments and suggestions. It could either turn into a PEP or an external library, depending on popularity here. - Tal Einat From p.f.moore at gmail.com Tue Oct 12 22:33:01 2010 From: p.f.moore at gmail.com (Paul Moore) Date: Tue, 12 Oct 2010 21:33:01 +0100 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> Message-ID: On 12 October 2010 20:41, Tal Einat wrote: > That's what I was thinking about too. > > How about something along these lines? > http://pastebin.com/DReBL56T > > I just worked that up now and would like some comments and > suggestions. It could either turn into a PEP or an external library, > depending on popularity here. Looks reasonable.
I'd suspect it would be more appropriate as an external library rather than going directly into the stdlib. Also, when I've needed something like this in the past (for simulation code, involving iterators with millions of entries) speed has been pretty critical, so something pure-python like this might not have been enough. Maybe it's something that would be appropriate for numpy? But I like the idea in general. I don't see the need for the RunningCalc base class (duck typing rules!) and I'd be tempted to add dummy close methods, to conform to the published consumer protocol (even though it's not a formal Python standard). I wouldn't necessarily use the given apply function, either, but that's a matter of taste (I would suggest you change the name, though, to avoid reusing the old apply builtin's name, which was something entirely different). Paul From dan at programmer-art.org Wed Oct 13 22:04:28 2010 From: dan at programmer-art.org (Daniel G. Taylor) Date: Wed, 13 Oct 2010 16:04:28 -0400 Subject: [Python-ideas] Pythonic Dates, Times, and Deltas Message-ID: <4CB610CC.1070009@programmer-art.org> Hey, I've recently been doing a lot of work with dates related to payment and statistics processing at work and have run into several annoyances with the built-in datetime, date, time, timedelta, etc classes, even when adding in relativedelta. They are awkward, non-intuitive and not at all Pythonic to me. Over the past year I've written up a library for making my life a bit easier and figured I would post some information here to see what others think, and to gauge whether or not such a library might be PEP-worthy. My original post about it was here: http://programmer-art.org/articles/programming/pythonic-date The github project page is here: http://github.com/danielgtaylor/paodate This is code that is and has been running in production environments for months but may still contain bugs. I have tried to include unit tests and ample documentation. I'd love to get some feedback and people's thoughts. I would also love to hear what others find is difficult or missing from the built-in date and time handling. Take care, -- Daniel G. Taylor http://programmer-art.org/ From phd at phd.pp.ru Wed Oct 13 22:30:08 2010 From: phd at phd.pp.ru (Oleg Broytman) Date: Thu, 14 Oct 2010 00:30:08 +0400 Subject: [Python-ideas] Pythonic Dates, Times, and Deltas In-Reply-To: <4CB610CC.1070009@programmer-art.org> References: <4CB610CC.1070009@programmer-art.org> Message-ID: <20101013203008.GA27423@phd.pp.ru> On Wed, Oct 13, 2010 at 04:04:28PM -0400, Daniel G. Taylor wrote: > http://programmer-art.org/articles/programming/pythonic-date Have you ever tried mxDateTime. Do you consider it unpythonic? Oleg. -- Oleg Broytman http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From mal at egenix.com Wed Oct 13 22:42:27 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 13 Oct 2010 22:42:27 +0200 Subject: [Python-ideas] Pythonic Dates, Times, and Deltas In-Reply-To: <4CB610CC.1070009@programmer-art.org> References: <4CB610CC.1070009@programmer-art.org> Message-ID: <4CB619B3.4050203@egenix.com> Daniel G. Taylor wrote: > Hey, > > I've recently been doing a lot of work with dates related to payment and > statistics processing at work and have run into several annoyances with > the built-in datetime, date, time, timedelta, etc classes, even when > adding in relativedelta. They are awkward, non-intuitive and not at all > Pythonic to me. 
Over the past year I've written up a library for making > my life a bit easier and figured I would post some information here to > see what others think, and to gauge whether or not such a library might > be PEP-worthy. > > My original post about it was here: > > http://programmer-art.org/articles/programming/pythonic-date > > The github project page is here: > > http://github.com/danielgtaylor/paodate > > This is code that is and has been running in production environments for > months but may still contain bugs. I have tried to include unit tests > and ample documentation. I'd love to get some feedback and people's > thoughts. > > I would also love to hear what others find is difficult or missing from > the built-in date and time handling. mxDateTime implements most of these ideas: http://www.egenix.com/products/python/mxBase/mxDateTime/ It's been in production use for more than 13 years now and has proven to be very versatile in practice; YMMV, of course. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 13 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From dan at programmer-art.org Wed Oct 13 22:59:38 2010 From: dan at programmer-art.org (Daniel G. Taylor) Date: Wed, 13 Oct 2010 16:59:38 -0400 Subject: [Python-ideas] Pythonic Dates, Times, and Deltas In-Reply-To: <4CB619B3.4050203@egenix.com> References: <4CB610CC.1070009@programmer-art.org> <4CB619B3.4050203@egenix.com> Message-ID: <4CB61DBA.9060801@programmer-art.org> On 10/13/2010 04:42 PM, M.-A. Lemburg wrote: > mxDateTime implements most of these ideas: > > http://www.egenix.com/products/python/mxBase/mxDateTime/ > > It's been in production use for more than 13 years now and > has proven to be very versatile in practice; YMMV, of course. Hah, that is a very nice looking library. I wish I had looked into it before writing my own. Looks like it still doesn't allow write access to many properties in date or delta objects, but looks to have a lot of really useful stuff in it. I'll be taking a closer look shortly. Any idea why this hasn't made it into Python's standard library while being around for 13 years? Seems like it would be extremely useful in the standard distribution. Take care, -- Daniel G. Taylor http://programmer-art.org/ From alexander.belopolsky at gmail.com Wed Oct 13 23:17:36 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 13 Oct 2010 17:17:36 -0400 Subject: [Python-ideas] Pythonic Dates, Times, and Deltas In-Reply-To: <4CB610CC.1070009@programmer-art.org> References: <4CB610CC.1070009@programmer-art.org> Message-ID: On Wed, Oct 13, 2010 at 4:04 PM, Daniel G. Taylor wrote: > ... and have run into several annoyances with the > built-in datetime, date, time, timedelta, etc classes, even when adding in > relativedelta. They are awkward, non-intuitive and not at all Pythonic to > me. 
There seems to be no shortage of blogosphere rants about how awkward python datetime module is, but once patches are posted on the tracker to improve it, nobody seems to be interested in reviewing them. I has been suggested that C implementation presented a high barrier to entry for people to get involved in datetime module development. This was one of the reasons I pushed for including a pure python equivalent in 3.2. Unfortunately, getting datetime.py into SVN tree was not enough to spark new interest in improving the module. Maybe this will change with datetime.py making it into a released version. .. > My original post about it was here: > > http://programmer-art.org/articles/programming/pythonic-date > This post is severely lacking in detail, so I cannot tell how your library solves your announced problems, but most of them seem to be easy with datetime: * Make it easy to make a Date from anything - a timestamp, date, datetime, tuple, etc. >>> from datetime import * >>> datetime.utcfromtimestamp(0) datetime.datetime(1970, 1, 1, 0, 0) >>> datetime.utcfromtimestamp(0).date() datetime.date(1970, 1, 1) * Make it easy to turn a Date into anything datetime.timetuple() will convert datetime to a tuple. There is an open ticket to simplify datetime to timestamp conversion http://bugs.python.org/issue2736 but it is already easy enough: >>> (datetime.now() - datetime(1970,1,1)).total_seconds() 1286989863.82536 * Make it easy and pythonic to add/subtract one or more days, weeks, months, or years monthdelta addition was discussed at http://bugs.python.org/issue5434, but did not get enough interest. The rest seems to be easy enough with timedetla. * Make it easy to get a tuple of the start and end of the month Why would you want this? Start of the month is easy: just date(year, month, 1). End of the month is often unnecessary because it is more pythonic to work with semi-open ranges and use first of the next month instead. From dan at programmer-art.org Wed Oct 13 23:45:33 2010 From: dan at programmer-art.org (Daniel G. Taylor) Date: Wed, 13 Oct 2010 17:45:33 -0400 Subject: [Python-ideas] Pythonic Dates, Times, and Deltas In-Reply-To: References: <4CB610CC.1070009@programmer-art.org> Message-ID: <4CB6287D.8070101@programmer-art.org> On 10/13/2010 05:17 PM, Alexander Belopolsky wrote: > On Wed, Oct 13, 2010 at 4:04 PM, Daniel G. Taylor > wrote: >> ... and have run into several annoyances with the >> built-in datetime, date, time, timedelta, etc classes, even when adding in >> relativedelta. They are awkward, non-intuitive and not at all Pythonic to >> me. > > There seems to be no shortage of blogosphere rants about how awkward > python datetime module is, but once patches are posted on the tracker > to improve it, nobody seems to be interested in reviewing them. I has > been suggested that C implementation presented a high barrier to entry > for people to get involved in datetime module development. This was > one of the reasons I pushed for including a pure python equivalent in > 3.2. Unfortunately, getting datetime.py into SVN tree was not enough > to spark new interest in improving the module. Maybe this will change > with datetime.py making it into a released version. This at least sounds like some progress is being made, so that makes me happy. I'd be glad to work on stuff if I knew it has the potential to make a difference and be accepted upstream and if it doesn't require me rewriting every little thing in the module. 
I'm not really sure where to start as all I really want is a nice wrapper to make working with dates seem intuitive and friendly. > .. >> My original post about it was here: >> >> http://programmer-art.org/articles/programming/pythonic-date >> > > This post is severely lacking in detail, so I cannot tell how your > library solves your announced problems, but most of them seem to be > easy with datetime: Yeah sorry it was mostly just a frustrated rant and then the start of my wrapper implementation. > * Make it easy to make a Date from anything - a timestamp, date, > datetime, tuple, etc. > >>>> from datetime import * >>>> datetime.utcfromtimestamp(0) > datetime.datetime(1970, 1, 1, 0, 0) >>>> datetime.utcfromtimestamp(0).date() > datetime.date(1970, 1, 1) Why does it not have this in the constructor? Where else in the standard lib does anything behave like this? My solution was to just dump whatever you want into the constructor and you get a Date object which can be converted to anything else via simple properties. > * Make it easy to turn a Date into anything > > datetime.timetuple() will convert datetime to a tuple. There is an > open ticket to simplify datetime to timestamp conversion > > http://bugs.python.org/issue2736 I'll be happy when this is fixed :-) > but it is already easy enough: > >>>> (datetime.now() - datetime(1970,1,1)).total_seconds() > 1286989863.82536 This is new in Python 2.7 it seems, before you had to calculate it by hand which was annoying to me. Now this seems okay. > * Make it easy and pythonic to add/subtract one or more days, weeks, > months, or years > > monthdelta addition was discussed at http://bugs.python.org/issue5434, > but did not get enough interest. The rest seems to be easy enough > with timedetla. And that means yet another module I have to import with various functions I have to use to manipulate an object rather than methods of the object itself. This doesn't seem Pythonic to me... > * Make it easy to get a tuple of the start and end of the month > > Why would you want this? Start of the month is easy: just date(year, > month, 1). End of the month is often unnecessary because it is more > pythonic to work with semi-open ranges and use first of the next month > instead. It's just for convenience really. For an example, I used it for querying a database for invoices in certain date ranges and for managing e.g. monthly recurring charges. It's just way more convenient and makes my code very easy to read where it counts - within the complex logic controlling when we charge credit cards. The less complex code there the better, because typos and bugs cost real money. Even if the tuples returned contained e.g. the first day of this and next month instead of the last day of the month it's still useful to have these properties that return the tuples (at least to me), as it saves some manual work each time. Take care, -- Daniel G. 
Taylor http://programmer-art.org/ From taleinat at gmail.com Wed Oct 13 23:54:31 2010 From: taleinat at gmail.com (Tal Einat) Date: Wed, 13 Oct 2010 23:54:31 +0200 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> Message-ID: On Mon, Oct 11, 2010 at 10:18 PM, Tal Einat wrote: > Masklinn wrote: >> On 2010-10-11, at 02:55 , Zac Burns wrote: >>> >>> Unfortunately this solution seems incompatable with the implementations with >>> for loops in min and max (EG: How do you switch functions at the right >>> time?) So it might take some tweaking. >> As far as I know, there is no way to force lockstep iteration of arbitrary functions in Python. Though an argument could be made for adding coroutine capabilities to builtins and library functions taking iterables, I don't think that's on the books. >> >> As a result, this function would devolve into something along the lines of
>>
>>    def apply(iterable, *funcs):
>>        return map(lambda c: c[0](c[1]), zip(funcs, tee(iterable, len(funcs))))
>>
>> which would run out of memory on very long or nigh-infinite iterables due to tee memoizing all the content of the iterator. > > We recently needed exactly this -- to do several running calculations > in parallel on an iterable. We avoided using co-routines and just > created a RunningCalc class with a simple interface, and implemented > various running calculations as sub-classes, e.g. min, max, average, > variance, n-largest. This isn't very fast, but since generating the > iterated values is computationally heavy, this is fast enough for our > uses. > > Having a standard method to do this in Python, with implementations > for common calculations in the stdlib, would have been nice. > > I wouldn't mind trying to work up a PEP for this, if there is support > for the idea. After some thought, I've found a way to make running several "running calculations" in parallel fast. Speed should be comparable to having used the non-running variants. The method is to give each running calculation "blocks" of values instead of just one at a time. The apply_in_parallel(iterable, block_size=1000, *running_calcs) function would get blocks of values from the iterable and pass them to each running calculation separately. So RunningMax would look something like this:

class RunningMax(RunningCalc):
    def __init__(self):
        self.max_value = None

    def feed(self, value):
        if self.max_value is None or value > self.max_value:
            self.max_value = value

    def feedMultiple(self, values):
        self.feed(max(values))

feedMultiple() would have a naive default implementation in the base class. Now this is non-trivial and can certainly be useful. Thoughts? Comments? - Tal Einat From dag.odenhall at gmail.com Thu Oct 14 00:16:32 2010 From: dag.odenhall at gmail.com (Dag Odenhall) Date: Thu, 14 Oct 2010 00:16:32 +0200 Subject: [Python-ideas] Pythonic Dates, Times, and Deltas In-Reply-To: <4CB610CC.1070009@programmer-art.org> References: <4CB610CC.1070009@programmer-art.org> Message-ID: <1287008192.4178.9.camel@gumri> On Wed, 2010-10-13 at 16:04 -0400, Daniel G. Taylor wrote: > Hey, > > I've recently been doing a lot of work with dates related to payment and > statistics processing at work and have run into several annoyances with > the built-in datetime, date, time, timedelta, etc classes, even when > adding in relativedelta. They are awkward, non-intuitive and not at all > Pythonic to me.
Over the past year I've written up a library for making > my life a bit easier and figured I would post some information here to > see what others think, and to gauge whether or not such a library might > be PEP-worthy. > > My original post about it was here: > > http://programmer-art.org/articles/programming/pythonic-date > > The github project page is here: > > http://github.com/danielgtaylor/paodate > > This is code that is and has been running in production environments for > months but may still contain bugs. I have tried to include unit tests > and ample documentation. I'd love to get some feedback and people's > thoughts. > > I would also love to hear what others find is difficult or missing from > the built-in date and time handling. > > Take care, > Not convinced your library is very Pythonic. Why a tuple attribute instead of having date objects be iterable so you can do tuple(Date())? How does the fancy formats deal with locales? Is there support for ISO 8601? Should probably be the __str__. +1 on the general idea, though. From alexander.belopolsky at gmail.com Thu Oct 14 00:52:52 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 13 Oct 2010 18:52:52 -0400 Subject: [Python-ideas] Pythonic Dates, Times, and Deltas In-Reply-To: <4CB6287D.8070101@programmer-art.org> References: <4CB610CC.1070009@programmer-art.org> <4CB6287D.8070101@programmer-art.org> Message-ID: On Wed, Oct 13, 2010 at 5:45 PM, Daniel G. Taylor wrote: .. >> * Make it easy to make a Date from anything - a timestamp, date, >> datetime, tuple, etc. >> >>>>> from datetime import * >>>>> datetime.utcfromtimestamp(0) >> >> datetime.datetime(1970, 1, 1, 0, 0) >>>>> >>>>> datetime.utcfromtimestamp(0).date() >> >> datetime.date(1970, 1, 1) > > Why does it not have this in the constructor? Because "explicit is better than implicit." > Where else in the standard lib does anything behave like this? float.fromhex is one example. This said, if I was starting from scratch, I would make date/datetime constructors take a single positional argument that could be a string (interpreted as ISO timestamp), tuple (broken down components), or another date/datetime object. This would make date/datetime constructors more similar to those of numeric types. I would not add datetime(int) or datetime(float), however, because numeric timestamps are too ambiguous and not necessary for purely calendaric calculations. From ncoghlan at gmail.com Thu Oct 14 01:14:55 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 14 Oct 2010 09:14:55 +1000 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> Message-ID: On Thu, Oct 14, 2010 at 7:54 AM, Tal Einat wrote: > class RunningMax(RunningCalc): > ? ?def __init__(self): > ? ? ? ?self.max_value = None > > ? ?def feed(self, value): > ? ? ? ?if self.max_value is None or value > self.max_value: > ? ? ? ? ? ?self.max_value = value > > ? ?def feedMultiple(self, values): > ? ? ? ?self.feed(max(values)) > > feedMultiple() would have a naive default implementation in the base class. > > Now this is non-trivial and can certainly be useful. Thoughts? Comments? Why use feed() rather than the existing generator send() API? 
def runningmax(default_max=None):
    max_value = default_max
    while 1:
        value = max(yield max_value)
        if max_value is None or value > max_value:
            max_value = value

That said, I think this kind of thing requires too many additional assumptions about how things are driven to make a particularly good candidate for standard library inclusion without use in PyPI library first. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From taleinat at gmail.com Thu Oct 14 02:13:05 2010 From: taleinat at gmail.com (Tal Einat) Date: Thu, 14 Oct 2010 02:13:05 +0200 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> Message-ID: On Thu, Oct 14, 2010 at 1:14 AM, Nick Coghlan wrote: > Why use feed() rather than the existing generator send() API? >
> def runningmax(default_max=None):
>     max_value = default_max
>     while 1:
>         value = max(yield max_value)
>         if max_value is None or value > max_value:
>             max_value = value

I tried using generators for this and it came out very clumsy. For one thing, using generators for this requires first calling next() once to run the generator up to the first yield, which makes the user-facing API very confusing. Generators also have to yield a value at every iteration, which is unnecessary here. Finally, the feedMultiple optimization is impossible with a generator-based implementation. > That said, I think this kind of thing requires too many additional > assumptions about how things are driven to make a particularly good > candidate for standard library inclusion without use in PyPI library > first. I'm not sure. "Rolling your own" for this isn't too difficult, so many developers will prefer to do so rather than add another dependency from PyPI. On the other hand, Python's standard library includes various simple utilities that make relatively simple things easier, standardized and well tested. Additionally, I think this fits in very nicely with Python's embracing of iterators, and complements the itertools library well. While I'm at it, I'd like to mention that I am aiming at a single very simple common usage pattern:

from RunningCalc import apply_in_parallel, RunningCount, RunningNLargest, RunningNSmallest

count, largest10, smallest10 = apply_in_parallel(data, RunningCount(), RunningNLargest(10), RunningNSmallest(10))

Implementing new running calculation classes would also be very simple. - Tal Einat From birbag at gmail.com Thu Oct 14 10:02:24 2010 From: birbag at gmail.com (Marco Mariani) Date: Thu, 14 Oct 2010 10:02:24 +0200 Subject: [Python-ideas] Pythonic Dates, Times, and Deltas In-Reply-To: References: <4CB610CC.1070009@programmer-art.org> Message-ID: On 13 October 2010 23:17, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: * Make it easy to get a tuple of the start and end of the month > > Why would you want this? Start of the month is easy: just date(year, > month, 1). End of the month is often unnecessary because it is more > pythonic to work with semi-open ranges and use first of the next month > instead. > Except next month may well be in next year.. blah And I don't care about pythonic ranges if I have to push the values through a BETWEEN query in SQL.

import calendar
import datetime

end = datetime.date(year, month, calendar.monthrange(year, month)[1])

-------------- next part -------------- An HTML attachment was scrubbed...
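Marco's snippet can be wrapped into a small helper. The following is my own sketch (not code from the thread, and the month_bounds name is hypothetical), showing both the closed pair that suits a SQL BETWEEN and the half-open alternative built from the first day of the next month:

import calendar
import datetime

def month_bounds(year, month, closed=True):
    # First and last day of the month; monthrange()[1] is the number of days in the month.
    first = datetime.date(year, month, 1)
    last = datetime.date(year, month, calendar.monthrange(year, month)[1])
    if closed:
        return first, last                                  # e.g. BETWEEN first AND last
    return first, last + datetime.timedelta(days=1)         # half-open: first day of next month

Adding one day to the last day of the month rolls over into the next month (and the next year after December), so the half-open form needs no special casing.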
URL: From steve at pearwood.info Thu Oct 14 13:23:31 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 14 Oct 2010 22:23:31 +1100 Subject: [Python-ideas] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> Message-ID: <201010142223.32257.steve@pearwood.info> On Thu, 14 Oct 2010 08:54:31 am you wrote: > After some thought, I've found a way to make running several "running > calculations" in parallel fast. Speed should be comparable to having > used the non-running variants. Speed "should be" comparable? Are you guessing or have you actually timed it? And surely the point of all this extra coding is to make something run *faster*, not "comparable to", the sequential algorithm? > The method is to give each running calculation "blocks" of values > instead of just one at a time. The apply_in_parallel(iterable, > block_size=1000, *running_calcs) function would get blocks of values > from the iterable and pass them to each running calculation > separately. So RunningMax would look something like this: >
> class RunningMax(RunningCalc):
>     def __init__(self):
>         self.max_value = None
>
>     def feed(self, value):
>         if self.max_value is None or value > self.max_value:
>             self.max_value = value
>
>     def feedMultiple(self, values):
>         self.feed(max(values))

As I understand it, you have a large list of data, and you want to calculate a number of statistics on it. The naive algorithm does each calculation sequentially:

a = min(data)
b = max(data)
c = f(data)  # some other statistics
d = g(data)
...
x = z(data)

If the calculations take t1, t2, t3, ..., tx time, then the sequential calculation takes sum(t1, t2, ..., tx) plus a bit of overhead. If you do it in parallel, this should reduce the time to max(t1, t2, ..., tx) plus a bit of overhead, potentially a big saving. But unless I've missed something significant, all you are actually doing is slicing data up into small pieces, then calling each function min, max, f, g, ..., z on each piece sequentially:

block = data[:size]
a = min(block)
b = max(block)
c = f(block)
...
block = data[size:2*size]
a = min(a, min(block))
b = max(b, max(block))
c = f(c, f(block))
...
block = data[2*size:3*size]
a = min(a, min(block))
b = max(b, max(block))
c = f(c, f(block))
...

Each function still runs sequentially, but you've increased the amount of overhead a lot: you're slicing and dicing the data, plus calling each function multiple times. And since each "running calculation" class needs to be hand-written to suit the specifics of the calculation, that's a lot of extra work just to get something which I expect will run slower than the naive sequential algorithm. I'm also distracted by the names, RunningMax and RunningCalc. RunningFoo doesn't mean "do Foo in parallel", it means to return intermediate calculations. For example, if I ran a function called RunningMax on this list:
For example, if I ran a function called RunningMax on this list: [1, 2, 1, 5, 7, 3, 4, 6, 8, 6] I would expect it to yield or return: [1, 2, 5, 7, 8] -- Steven D'Aprano From taleinat at gmail.com Thu Oct 14 14:05:25 2010 From: taleinat at gmail.com (Tal Einat) Date: Thu, 14 Oct 2010 14:05:25 +0200 Subject: [Python-ideas] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: <201010142223.32257.steve@pearwood.info> References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010142223.32257.steve@pearwood.info> Message-ID: On Thu, Oct 14, 2010 at 1:23 PM, Steven D'Aprano wrote: > On Thu, 14 Oct 2010 08:54:31 am you wrote: > >> After some thought, I've found a way to make running several "running >> calculations" in parallel fast. Speed should be comparable to having >> used the non-running variants. > > Speed "should be" comparable? Are you guessing or have you actually > timed it? > > And surely the point of all this extra coding is to make something run > *faster*, not "comparable to", the sequential algorithm? The use-case I'm targeting is when you can't hold all of the data in memory, and it is relatively "expensive" to generate it, e.g. a large and complex database query. In this case just running the sequential functions one at a time requires generating the data several times, once per function. My goal is to facilitate running several computations on a single iterator without keeping all of the data in memory. In almost all cases this will be slower than having run each of the sequential functions one at a time, if it were possible to keep all of the data in memory. The grouping optimization aims to reduce the overhead. - Tal Einat From masklinn at masklinn.net Thu Oct 14 14:06:09 2010 From: masklinn at masklinn.net (Masklinn) Date: Thu, 14 Oct 2010 14:06:09 +0200 Subject: [Python-ideas] Pythonic Dates, Times, and Deltas In-Reply-To: References: <4CB610CC.1070009@programmer-art.org> Message-ID: On 2010-10-14, at 10:02 , Marco Mariani wrote: > On 13 October 2010 23:17, Alexander Belopolsky < > alexander.belopolsky at gmail.com> wrote: > * Make it easy to get a tuple of the start and end of the month >> >> Why would you want this? Start of the month is easy: just date(year, >> month, 1). End of the month is often unnecessary because it is more >> pythonic to work with semi-open ranges and use first of the next month >> instead. > > Except next month may well be in next year.. blah > > And I don't care about pythonic ranges if I have to push the values through > a BETWEEN query in SQL. > > import calendar > import datetime > > end = datetime.date(year, month, calendar.monthrange(year, month)[1]) There's also dateutil, which exposes some ideas of mx.DateTime on top of the built-in datetime, including relativedelta. As a result, you can get the last day of the current month by going backwards one day from the first day of next month: >>> datetime.now().date() + relativedelta(months=+1, day=+1, days=-1) datetime.date(2010, 10, 31) Or (clearer order of operations): >>> datetime.now().date() + relativedelta(months=+1, day=+1) + relativedelta(days=-1) datetime.date(2010, 10, 31) (note that in both cases the "+" sign is of course optional). Parameters without an `s` postfix are absolute (day=1 sets the day of the current datetime to 1, similar to using .replace), parameters with an `s` are offsets (`days=+1` takes tomorrow). From danielgtaylor at gmail.com Thu Oct 14 20:51:05 2010 From: danielgtaylor at gmail.com (Daniel G. 
Taylor) Date: Thu, 14 Oct 2010 14:51:05 -0400 Subject: [Python-ideas] Pythonic Dates, Times, and Deltas In-Reply-To: References: <4CB610CC.1070009@programmer-art.org> Message-ID: <4CB75119.7000701@gmail.com> On 10/14/2010 08:06 AM, Masklinn wrote: > On 2010-10-14, at 10:02 , Marco Mariani wrote: >> On 13 October 2010 23:17, Alexander Belopolsky< >> alexander.belopolsky at gmail.com> wrote: >> * Make it easy to get a tuple of the start and end of the month >>> >>> Why would you want this? Start of the month is easy: just date(year, >>> month, 1). End of the month is often unnecessary because it is more >>> pythonic to work with semi-open ranges and use first of the next month >>> instead. >> >> Except next month may well be in next year.. blah >> >> And I don't care about pythonic ranges if I have to push the values through >> a BETWEEN query in SQL. >> >> import calendar >> import datetime >> >> end = datetime.date(year, month, calendar.monthrange(year, month)[1]) > > There's also dateutil, which exposes some ideas of mx.DateTime on top of the built-in datetime, including relativedelta. > > As a result, you can get the last day of the current month by going backwards one day from the first day of next month: > >>>> datetime.now().date() + relativedelta(months=+1, day=+1, days=-1) > datetime.date(2010, 10, 31) > > Or (clearer order of operations): > >>>> datetime.now().date() + relativedelta(months=+1, day=+1) + relativedelta(days=-1) > datetime.date(2010, 10, 31) > > (note that in both cases the "+" sign is of course optional). > > Parameters without an `s` postfix are absolute (day=1 sets the day of the current datetime to 1, similar to using .replace), parameters with an `s` are offsets (`days=+1` takes tomorrow). FWIW my library does the same sort of stuff using relativedelta internally, just sugar coats it heavily ;-) Take care, -- Daniel G. Taylor http://programmer-art.org/ From dan at programmer-art.org Thu Oct 14 20:54:30 2010 From: dan at programmer-art.org (Daniel G. Taylor) Date: Thu, 14 Oct 2010 14:54:30 -0400 Subject: [Python-ideas] Pythonic Dates, Times, and Deltas In-Reply-To: <1287008192.4178.9.camel@gumri> References: <4CB610CC.1070009@programmer-art.org> <1287008192.4178.9.camel@gumri> Message-ID: <4CB751E6.20605@programmer-art.org> On 10/13/2010 06:16 PM, Dag Odenhall wrote: > Not convinced your library is very Pythonic. Why a tuple attribute > instead of having date objects be iterable so you can do tuple(Date())? How do you envision this working for days, weeks, months, years? E.g. getting the min/max Date objects for today, for next week, for this current month, etc. I'm very open to ideas here; I just implemented what made sense to me at the time. > How does the fancy formats deal with locales? It internally uses datetime.strftime, so will behave however that behaves with regard to locales. > Is there support for ISO 8601? Should probably be the __str__. Not built-in other than supporting a strftime method. This is a good idea and I will probably add it. > +1 on the general idea, though. Thanks :-) Take care, -- Daniel G. Taylor http://programmer-art.org/ From davejakeman at hotmail.com Thu Oct 14 22:58:52 2010 From: davejakeman at hotmail.com (Dave Jakeman) Date: Thu, 14 Oct 2010 20:58:52 +0000 Subject: [Python-ideas] String Subtraction Message-ID: I'm new to Python and this is my first suggestion, so please bear with me: I believe there is a simple but useful string operation missing from Python: subtraction. 
This is best described by way of example:

>>> "Mr Nesbit has learnt the first lesson of not being seen." - "Nesbit "
'Mr has learnt the first lesson of not being seen.'
>>> "Can't have egg, bacon, spam and sausage without the spam." - " spam"
'Can't have egg, bacon, and sausage without the spam.'
>>> "I'll bite your legs off!" - "arms"
'I'll bite your legs off!'

If b and c were strings, then:

a = b - c

would be equivalent to:

if b.find(c) < 0:
    a = b
else:
    a = b[:b.find(c)] + b[b.find(c)+len(c):]

The operation would remove from the minuend the first occurrence (searching from left to right) of the subtrahend. In the case of no match, the minuend would be returned unmodified. To those unfamiliar with string subtraction, it might seem non-intuitive, but it's a useful programming construct. Many things can be done with it and it's a good way to keep code simple. I think it would be preferable to the current interpreter response:

>>> record = line - newline
Traceback (most recent call last):
  File "", line 1, in
TypeError: unsupported operand type(s) for -: 'str' and 'str'

As the interpreter currently checks for this attempted operation, it seems it would be straightforward to add the code needed to do something useful with it. I don't think there would be backward compatibility issues, as this would be a new feature in place of a fatal error. -------------- next part -------------- An HTML attachment was scrubbed... URL: From lvh at laurensvh.be Thu Oct 14 23:15:28 2010 From: lvh at laurensvh.be (Laurens Van Houtven) Date: Thu, 14 Oct 2010 23:15:28 +0200 Subject: [Python-ideas] String Subtraction In-Reply-To: References: Message-ID: People already do this with the s1.replace(s2, "") idiom. I'm not sure what the added value is. Your equivalent implementation looks pretty strange and complex: how is it different from str.replace with the empty string as second argument? cheers lvh -------------- next part -------------- An HTML attachment was scrubbed... URL: From dougal85 at gmail.com Thu Oct 14 23:16:07 2010 From: dougal85 at gmail.com (Dougal Matthews) Date: Thu, 14 Oct 2010 22:16:07 +0100 Subject: [Python-ideas] String Subtraction In-Reply-To: References: Message-ID: On 14 October 2010 21:58, Dave Jakeman wrote: > If b and c were strings, then: > > a = b - c > > would be equivalent to: >
> if b.find(c) < 0:
>     a = b
> else:
>     a = b[:b.find(c)] + b[b.find(c)+len(c):]
>
Or more simply...

a = b.replace(c, '')

Dougal -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwm-keyword-python.b4bdba at mired.org Thu Oct 14 23:23:11 2010 From: mwm-keyword-python.b4bdba at mired.org (Mike Meyer) Date: Thu, 14 Oct 2010 17:23:11 -0400 Subject: [Python-ideas] String Subtraction In-Reply-To: References: Message-ID: <20101014172311.13909a3d@bhuda.mired.org> On Thu, 14 Oct 2010 20:58:52 +0000 Dave Jakeman wrote: > I'm new to Python and this is my first suggestion, so please bear with me: > > I believe there is a simple but useful string operation missing from Python: subtraction. This is best described by way of example:
> > If b and c were strings, then: > > a = b - c > > would be equivalent to: The existing construct a = b.replace(c, '', 1) The problem isn't that it's non-intuitive (there's only one intuitive interface, and it's got nothing to do with computers), it's that there are a wealth of "intuitive" meanings. A case can be made that it should mean the same as any of thise: a = b.replace(c, '') a = b.replace(c, ' ', 1) a = b.replace(c, ' ') For that matter, it might also mean the same thing as any of these: a = re.sub(r'\s*%s\s*' % c, '', b, 1) a = re.sub(r'\s*%s\s*' % c, '', b) a = re.sub(r'\s*%s\s*' % c, ' ', b, 1) a = re.sub(r'\s*%s\s*' % c, ' ', b) a = re.sub(r'%s\s*' % c, '', b, 1) a = re.sub(r'%s\s*' % c, '', b) a = re.sub(r'%s\s*' % c, ' ', b, 1) a = re.sub(r'%s\s*' % c, ' ', b) a = re.sub(r'\s*%s' % c, '', b, 1) a = re.sub(r'\s*%s' % c, '', b) a = re.sub(r'\s*%s' % c, ' ', b, 1) a = re.sub(r'\s*%s' % c, ' ', b) Unless you can make a clear case as to why exactly one of those cases is different enough from the others to warrant a syntax all it's own, It's probably best to be explicit about the desired behavior. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From masklinn at masklinn.net Thu Oct 14 23:49:25 2010 From: masklinn at masklinn.net (Masklinn) Date: Thu, 14 Oct 2010 23:49:25 +0200 Subject: [Python-ideas] String Subtraction In-Reply-To: <20101014172311.13909a3d@bhuda.mired.org> References: <20101014172311.13909a3d@bhuda.mired.org> Message-ID: <8418E1CE-73CB-400C-912A-7709CD0B8018@masklinn.net> On 2010-10-14, at 23:23 , Mike Meyer wrote: > > The problem isn't that it's non-intuitive (there's only one intuitive > interface, and it's got nothing to do with computers), it's that there > are a wealth of "intuitive" meanings. A case can be made that it > should mean the same as any of thise: Still, from my experience with numbers I would expect `a + b - b == a`, even if the order in which these operations are applied is important and not irrelevant. From neatnate at gmail.com Fri Oct 15 00:01:42 2010 From: neatnate at gmail.com (Nathan Schneider) Date: Thu, 14 Oct 2010 18:01:42 -0400 Subject: [Python-ideas] Pythonic Dates, Times, and Deltas In-Reply-To: <4CB751E6.20605@programmer-art.org> References: <4CB610CC.1070009@programmer-art.org> <1287008192.4178.9.camel@gumri> <4CB751E6.20605@programmer-art.org> Message-ID: I'm glad to see there's interest in solving this (seems I'm not alone in seeing date/time support as the ugly stepchild of the Python standard library). For what it's worth, not too long ago I ended up writing a bunch of convenience functions to instantiate and convert between existing date/time representations (datetime objects, time tuples, timestamps, and string representations). The result is here, in case anyone's interested: http://www.cs.cmu.edu/~nschneid/docs/temporal.py Cheers, Nathan On Thu, Oct 14, 2010 at 2:54 PM, Daniel G. Taylor wrote: > On 10/13/2010 06:16 PM, Dag Odenhall wrote: >> >> Not convinced your library is very Pythonic. Why a tuple attribute >> instead of having date objects be iterable so you can do tuple(Date())? > > How do you envision this working for days, weeks, months, years? E.g. > getting the min/max Date objects for today, for next week, for this current > month, etc. > > I'm very open to ideas here; I just implemented what made sense to me at the > time. > >> How does the fancy formats deal with locales? 
> > It internally uses datetime.strftime, so will behave however that behaves > with regard to locales. > >> Is there support for ISO 8601? Should probably be the __str__. > > Not built-in other than supporting a strftime method. This is a good idea > and I will probably add it. > >> +1 on the general idea, though. > > Thanks :-) > > Take care, > -- > Daniel G. Taylor > http://programmer-art.org/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From mwm-keyword-python.b4bdba at mired.org Fri Oct 15 00:13:26 2010 From: mwm-keyword-python.b4bdba at mired.org (Mike Meyer) Date: Thu, 14 Oct 2010 18:13:26 -0400 Subject: [Python-ideas] String Subtraction In-Reply-To: <8418E1CE-73CB-400C-912A-7709CD0B8018@masklinn.net> References: <20101014172311.13909a3d@bhuda.mired.org> <8418E1CE-73CB-400C-912A-7709CD0B8018@masklinn.net> Message-ID: <20101014181326.38579cfc@bhuda.mired.org> On Thu, 14 Oct 2010 23:49:25 +0200 Masklinn wrote: > On 2010-10-14, at 23:23 , Mike Meyer wrote: > > > > The problem isn't that it's non-intuitive (there's only one intuitive > > interface, and it's got nothing to do with computers), it's that there > > are a wealth of "intuitive" meanings. A case can be made that it > > should mean the same as any of these: > > Still, from my experience with numbers I would expect `a + b - b == a`, even if the order in which these operations are applied is important and not irrelevant. Well, if you use the standard left-to right ordering, that equality doesn't hold for the proposed meaning for string subtraction: ("xyzzy and " + "xyzzy") - "xyzzy" = " and xyzzy" != "xyzzy and " It won't hold for any of the definition I proposed either - not if a contains a copy of b. Come to think of it, it doesn't hold for the computer representation of numbers, either. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From denis.spir at gmail.com Fri Oct 15 00:16:18 2010 From: denis.spir at gmail.com (spir) Date: Fri, 15 Oct 2010 00:16:18 +0200 Subject: [Python-ideas] String Subtraction In-Reply-To: <20101014172311.13909a3d@bhuda.mired.org> References: <20101014172311.13909a3d@bhuda.mired.org> Message-ID: <20101015001618.2c19634e@o> On Thu, 14 Oct 2010 17:23:11 -0400 Mike Meyer wrote: > The problem isn't that it's non-intuitive (there's only one intuitive > interface, and it's got nothing to do with computers), it's that there > are a wealth of "intuitive" meanings. Maybe have string.erase(sub,n) be a "more intuitive" shortcut for string.replace(sub,'',n)? Denis -- -- -- -- -- -- -- vit esse estrany ? spir.wikidot.com From bruce at leapyear.org Fri Oct 15 01:45:18 2010 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 14 Oct 2010 16:45:18 -0700 Subject: [Python-ideas] String Subtraction In-Reply-To: <20101015001618.2c19634e@o> References: <20101014172311.13909a3d@bhuda.mired.org> <20101015001618.2c19634e@o> Message-ID: Here's a useful function along these lines, which ideally would be string.remove(): def remove(s, sub, maxremove=None, sep=None): """Removes instances of sub from the string. Args: s: The string to be modified. sub: The substring to be removed. maxremove: If specified, the maximum number of instances to be removed (starting from the left). If omitted, removes all instances. sep: Optionally, the separators to be removed. 
If the separator appears on both sides of a removed substring, one of the separators is removed. >>> remove('test,blah,blah,blah,this', 'blah') 'test,,,,this' >>> remove('test,blah,blah,blah,this', 'blah', maxremove=2) 'test,,,blah,this' >>> remove('test,blah,blah,blah,this', 'blah', sep=',') 'test,this' >>> remove('test,blah,blah,blah,this', 'blah', maxremove=2, sep=',') 'test,blah,this' >>> remove('foo(1)blah(2)blah(3)bar', 'blah', 1) 'foo(1)(2)blah(3)bar' """ processed = '' remaining = s while maxremove is None or maxremove > 0: parts = remaining.split(sub, 1) if len(parts) == 1: return processed + remaining processed += parts[0] remaining = parts[1] if sep and processed.endswith(sep) and remaining.startswith(sep): remaining = remaining[len(sep):] if maxremove is not None: maxremove -= 1 return processed + remaining --- Bruce Latest blog post: http://www.vroospeak.com/2010/10/today-we-are-all-chileans.html Learn how hackers think: http://j.mp/gruyere-security On Thu, Oct 14, 2010 at 3:16 PM, spir wrote: > On Thu, 14 Oct 2010 17:23:11 -0400 > Mike Meyer wrote: > > > The problem isn't that it's non-intuitive (there's only one intuitive > > interface, and it's got nothing to do with computers), it's that there > > are a wealth of "intuitive" meanings. > > Maybe have string.erase(sub,n) be a "more intuitive" shortcut for > string.replace(sub,'',n)? > > Denis > -- -- -- -- -- -- -- > vit esse estrany ? > > spir.wikidot.com > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben+python at benfinney.id.au Fri Oct 15 02:51:43 2010 From: ben+python at benfinney.id.au (Ben Finney) Date: Fri, 15 Oct 2010 11:51:43 +1100 Subject: [Python-ideas] Pythonic Dates, Times, and Deltas References: <4CB610CC.1070009@programmer-art.org> <4CB619B3.4050203@egenix.com> <4CB61DBA.9060801@programmer-art.org> Message-ID: <874ocouvkw.fsf@benfinney.id.au> "Daniel G. Taylor" writes: > Any idea why this hasn't made it into Python's standard library while > being around for 13 years? Seems like it would be extremely useful in > the standard distribution. One barrier is that its license terms are incompatible with redistribution under the terms of the Python license. I'd love to see the mx code released under compatible license terms, but am not optimistic. -- \ ?He that would make his own liberty secure must guard even his | `\ enemy from oppression.? ?Thomas Paine | _o__) | Ben Finney From rrr at ronadam.com Fri Oct 15 04:09:06 2010 From: rrr at ronadam.com (Ron Adam) Date: Thu, 14 Oct 2010 21:09:06 -0500 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> Message-ID: <4CB7B7C2.8090401@ronadam.com> On 10/13/2010 07:13 PM, Tal Einat wrote: > On Thu, Oct 14, 2010 at 1:14 AM, Nick Coghlan wrote: >> Why use feed() rather than the existing generator send() API? >> >> def runningmax(default_max=None): >> max_value = default_max >> while 1: >> value = max(yield max_value) >> if max_value is None or value> max_value: >> max_value = value > > I tried using generators for this and it came out very clumsy. For one > thing, using generators for this requires first calling next() once to > run the generator up to the first yield, which makes the user-facing > API very confusing. 
Generators also have to yield a value at every > iteration, which is unnecessary here. Finally, the feedMultiple > optimization is impossible with a generator-based implementation. Something I noticed about the min and max functions is that they treat values and iterable slightly different. # This works >>> min(1, 2) 1 >>> min([1, 2]) 1 # The gotcha >>> min([1]) 1 >>> min(1) Traceback (most recent call last): File "", line 1, in TypeError: 'int' object is not iterable So you need a function like the following to make it handle single values and single iterables the same. def xmin(value): try: return min(value) except TypeError: return min([value]) Then you can do... @consumer def Running_Min(out_value=None): while 1: in_value = yield out_value if in_value is not None: if out_value is None: out_value = xmin(in_value) else: out_value = xmin(out_value, xmin(in_value)) Or for your class... def xmax(value): try: return max(value) except TypeError: return max([value]) class RunningMax(RunningCalc): def __init__(self): self.max_value = None def feed(self, value): if value is not None: if self.max_value is None: self.max_value = xmax(value) else: self.max_value = xmax(self.max_value, xmax(value)) Now if they could handle None a bit better we might be able to get rid of the None checks too. ;-) Cheers, Ron From guido at python.org Fri Oct 15 04:14:02 2010 From: guido at python.org (Guido van Rossum) Date: Thu, 14 Oct 2010 19:14:02 -0700 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: <4CB7B7C2.8090401@ronadam.com> References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> Message-ID: Why would you ever want to write min(1)? (Or min(x) where x is not iterable.) --Guido On Thu, Oct 14, 2010 at 7:09 PM, Ron Adam wrote: > > > On 10/13/2010 07:13 PM, Tal Einat wrote: >> >> On Thu, Oct 14, 2010 at 1:14 AM, Nick Coghlan wrote: >>> >>> Why use feed() rather than the existing generator send() API? >>> >>> def runningmax(default_max=None): >>> ? ?max_value = default_max >>> ? ?while 1: >>> ? ? ? ?value = max(yield max_value) >>> ? ? ? ?if max_value is None or value> ?max_value: >>> ? ? ? ? ? ?max_value = value >> >> I tried using generators for this and it came out very clumsy. For one >> thing, using generators for this requires first calling next() once to >> run the generator up to the first yield, which makes the user-facing >> API very confusing. Generators also have to yield a value at every >> iteration, which is unnecessary here. Finally, the feedMultiple >> optimization is impossible with a generator-based implementation. > > Something I noticed about the min and max functions is that they treat > values and iterable slightly different. > > # This works >>>> min(1, 2) > 1 >>>> min([1, 2]) > 1 > > > # The gotcha >>>> min([1]) > 1 >>>> min(1) > Traceback (most recent call last): > ?File "", line 1, in > TypeError: 'int' object is not iterable > > > So you need a function like the following to make it handle single values > and single iterables the same. > > def xmin(value): > ? ?try: > ? ? ? ?return min(value) > ? ?except TypeError: > ? ? ? ?return min([value]) > > Then you can do... > > @consumer > def Running_Min(out_value=None): > ? ?while 1: > ? ? ? ?in_value = yield out_value > ? ? ? ?if in_value is not None: > ? ? ? ? ? ?if out_value is None: > ? ? ? ? ? ? ? ?out_value = xmin(in_value) > ? ? ? ? ? ?else: > ? ? ? ? ? ? ? 
?out_value = xmin(out_value, xmin(in_value)) > > > Or for your class... > > def xmax(value): > ? ?try: > ? ? ? ?return max(value) > ? ?except TypeError: > ? ? ? ?return max([value]) > > class RunningMax(RunningCalc): > ? ?def __init__(self): > ? ? ? ?self.max_value = None > > ? ?def feed(self, value): > ? ? ? ?if value is not None: > ? ? ? ? ? ?if self.max_value is None: > ? ? ? ? ? ? ? ?self.max_value = xmax(value) > ? ? ? ? ? ?else: > ? ? ? ? ? ? ? ?self.max_value = xmax(self.max_value, xmax(value)) > > > Now if they could handle None a bit better we might be able to get rid of > the None checks too. ?;-) > > Cheers, > ? Ron > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (python.org/~guido) From taleinat at gmail.com Fri Oct 15 05:05:43 2010 From: taleinat at gmail.com (Tal Einat) Date: Fri, 15 Oct 2010 05:05:43 +0200 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: <4CB7B7C2.8090401@ronadam.com> References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> Message-ID: On Fri, Oct 15, 2010 at 4:09 AM, Ron Adam wrote: > > > On 10/13/2010 07:13 PM, Tal Einat wrote: >> >> On Thu, Oct 14, 2010 at 1:14 AM, Nick Coghlan wrote: >>> >>> Why use feed() rather than the existing generator send() API? >>> >>> def runningmax(default_max=None): >>> ? ?max_value = default_max >>> ? ?while 1: >>> ? ? ? ?value = max(yield max_value) >>> ? ? ? ?if max_value is None or value> ?max_value: >>> ? ? ? ? ? ?max_value = value >> >> I tried using generators for this and it came out very clumsy. For one >> thing, using generators for this requires first calling next() once to >> run the generator up to the first yield, which makes the user-facing >> API very confusing. Generators also have to yield a value at every >> iteration, which is unnecessary here. Finally, the feedMultiple >> optimization is impossible with a generator-based implementation. > > Something I noticed about the min and max functions is that they treat > values and iterable slightly different. Sorry, my bad. The max in "value = max(yield max_value)" was an error, it should have been removed. As Guido mentioned, there is never a reason to do max(value) where value is not an iterable. - Tal From steve at pearwood.info Fri Oct 15 05:12:03 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 15 Oct 2010 14:12:03 +1100 Subject: [Python-ideas] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010142223.32257.steve@pearwood.info> Message-ID: <201010151412.03572.steve@pearwood.info> On Thu, 14 Oct 2010 11:05:25 pm you wrote: > On Thu, Oct 14, 2010 at 1:23 PM, Steven D'Aprano wrote: > > On Thu, 14 Oct 2010 08:54:31 am you wrote: > >> After some thought, I've found a way to make running several > >> "running calculations" in parallel fast. Speed should be > >> comparable to having used the non-running variants. > > > > Speed "should be" comparable? Are you guessing or have you actually > > timed it? > > > > And surely the point of all this extra coding is to make something > > run *faster*, not "comparable to", the sequential algorithm? > > The use-case I'm targeting is when you can't hold all of the data in > memory, and it is relatively "expensive" to generate it, e.g. 
a large > and complex database query. In this case just running the sequential > functions one at a time requires generating the data several times, > once per function. My goal is to facilitate running several > computations on a single iterator without keeping all of the data in > memory. Okay, fair enough, but I think that's enough of a specialist need that it doesn't belong as a built-in or even in the standard library. I suspect that, even for your application, a more sensible approach would be to write a single function to walk over the data once, doing all the calculations you need. E.g. if your data is numeric, and you need (say) the min, max, mean (average), standard deviation and standard error, rather than doing a separate pass for each function, you can do them all in a single pass: sum = 0 sum_sq = 0 count = 0 smallest = sys.maxint biggest = -sys.maxint for x in data: count += 1 sum += x sum_sq += x**2 smallest = min(smallest, x) biggest = max(biggest, x) mean = sum/count std_dev = math.sqrt((sum_sq + sum**2)/(count-1)) std_err = std_dev/math.sqrt(count) That expression for the standard deviation is from memory, don't trust it, I've probably got it wrong! Naturally, if you don't know what functions you need to call until runtime it will require a bit more cleverness. A general approach might be a functional approach based on reduce: def multireduce(functions, initial_values, data): values = list(initial_values) for x in data: for i, func in enumerate(functions): values[i] = func(x, values[i]) return values The point is that if generating the data is costly, the best approach is to lazily generate the data once only, with the minimal overhead and maximum flexibility. -- Steven D'Aprano From digitalxero at gmail.com Fri Oct 15 05:22:59 2010 From: digitalxero at gmail.com (Dj Gilcrease) Date: Thu, 14 Oct 2010 23:22:59 -0400 Subject: [Python-ideas] String Subtraction In-Reply-To: References: <20101014172311.13909a3d@bhuda.mired.org> <20101015001618.2c19634e@o> Message-ID: On Thu, Oct 14, 2010 at 7:45 PM, Bruce Leban wrote: > Here's a useful function along these lines, which ideally would be > string.remove(): > def remove(s, sub, maxremove=None, sep=None): > ??"""Removes instances of sub from the string. > ??Args: > ?? ?s: The string to be modified. > ?? ?sub: The substring to be removed. > ?? ?maxremove: If specified, the maximum number of instances to be > ?? ? ? ?removed (starting from the left). If omitted, removes all instances. > ?? ?sep: Optionally, the separators to be removed. If the separator appears > ?? ? ? ?on both sides of a removed substring, one of the separators is > removed. > ??>>> remove('test,blah,blah,blah,this', 'blah') > ??'test,,,,this' > ??>>> remove('test,blah,blah,blah,this', 'blah', maxremove=2) > ??'test,,,blah,this' > ??>>> remove('test,blah,blah,blah,this', 'blah', sep=',') > ??'test,this' > ??>>> remove('test,blah,blah,blah,this', 'blah', maxremove=2, sep=',') > ??'test,blah,this' > ??>>> remove('foo(1)blah(2)blah(3)bar', 'blah', 1) > ??'foo(1)(2)blah(3)bar' > ??""" Could be written as def remove(string, sub, max_remove=-1, sep=None): if sep: sub = sub + sep return string.replace(sub, '', max_remove) t = 'test,blah,blah,blah,this' print(remove(t, 'blah')) print(remove(t, 'blah', 2)) print(remove(t, 'blah', sep=',')) print(remove(t, 'blah', 2, ',')) print(remove('foo(1)blah(2)blah(3)bar', 'blah', 1)) Dj Gilcrease ?____ ( | ? ? \ ?o ? ?() ? | ?o ?|`| ? | ? ? ?| ? ? ?/`\_/| ? ? ?| | ? ,__ ? ,_, ? ,_, ? __, ? ?, ? ,_, _| ? ? ?| | ? ?/ ? ? ?| ?| ? |/ ? 
/ ? ? ?/ ? | ? |_/ ?/ ? ?| ? / \_|_/ (/\___/ ?|/ ?/(__,/ ?|_/|__/\___/ ? ?|_/|__/\__/|_/\,/ ?|__/ ? ? ? ? ?/| ? ? ? ? ?\| From bruce at leapyear.org Fri Oct 15 06:40:00 2010 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 14 Oct 2010 21:40:00 -0700 Subject: [Python-ideas] String Subtraction In-Reply-To: References: <20101014172311.13909a3d@bhuda.mired.org> <20101015001618.2c19634e@o> Message-ID: Your code operates differently for "test blah,this". My code produces "test ,this" while yours produces "test this". Eliding multiple separators is perhaps more useful when sep=' ' but I used commas because they're easier to see. An alternative design removes one separator either before or after a removed string (but not both). That would work better for an example like this: >>> remove('The Illuminati fnord are everywhere fnord.', 'fnord', sep=' ') 'The Illuminati are everywhere.' Neither version of this may have sufficient utility to be added to standard library. --- Bruce http://www.vroospeak.com http://j.mp/gruyere-security On Thu, Oct 14, 2010 at 8:22 PM, Dj Gilcrease wrote: > On Thu, Oct 14, 2010 at 7:45 PM, Bruce Leban wrote: > > Here's a useful function along these lines, which ideally would be > > string.remove(): > > def remove(s, sub, maxremove=None, sep=None): > > """Removes instances of sub from the string. > > Args: > > s: The string to be modified. > > sub: The substring to be removed. > > maxremove: If specified, the maximum number of instances to be > > removed (starting from the left). If omitted, removes all > instances. > > sep: Optionally, the separators to be removed. If the separator > appears > > on both sides of a removed substring, one of the separators is > > removed. > > >>> remove('test,blah,blah,blah,this', 'blah') > > 'test,,,,this' > > >>> remove('test,blah,blah,blah,this', 'blah', maxremove=2) > > 'test,,,blah,this' > > >>> remove('test,blah,blah,blah,this', 'blah', sep=',') > > 'test,this' > > >>> remove('test,blah,blah,blah,this', 'blah', maxremove=2, sep=',') > > 'test,blah,this' > > >>> remove('foo(1)blah(2)blah(3)bar', 'blah', 1) > > 'foo(1)(2)blah(3)bar' > > """ > > Could be written as > > def remove(string, sub, max_remove=-1, sep=None): > if sep: > sub = sub + sep > return string.replace(sub, '', max_remove) > > t = 'test,blah,blah,blah,this' > print(remove(t, 'blah')) > print(remove(t, 'blah', 2)) > print(remove(t, 'blah', sep=',')) > print(remove(t, 'blah', 2, ',')) > print(remove('foo(1)blah(2)blah(3)bar', 'blah', 1)) > > > Dj Gilcrease > ____ > ( | \ o () | o |`| > | | /`\_/| | | ,__ ,_, ,_, __, , ,_, > _| | | / | | |/ / / | |_/ / | / \_|_/ > (/\___/ |/ /(__,/ |_/|__/\___/ |_/|__/\__/|_/\,/ |__/ > /| > \| > -------------- next part -------------- An HTML attachment was scrubbed... URL: From taleinat at gmail.com Fri Oct 15 17:36:19 2010 From: taleinat at gmail.com (Tal Einat) Date: Fri, 15 Oct 2010 17:36:19 +0200 Subject: [Python-ideas] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: <201010151412.03572.steve@pearwood.info> References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010142223.32257.steve@pearwood.info> <201010151412.03572.steve@pearwood.info> Message-ID: On Fri, Oct 15, 2010 at 5:12 AM, Steven D'Aprano wrote: > On Thu, 14 Oct 2010 11:05:25 pm you wrote: >> The use-case I'm targeting is when you can't hold all of the data in >> memory, and it is relatively "expensive" to generate it, e.g. a large >> and complex database query. 
In this case just running the sequential >> functions one at a time requires generating the data several times, >> once per function. My goal is to facilitate running several >> computations on a single iterator without keeping all of the data in >> memory. > > Okay, fair enough, but I think that's enough of a specialist need that > it doesn't belong as a built-in or even in the standard library. I don't see this as a specialist need. This is relevant to any piece of code which receives an iterator and doesn't know whether it is feasible to keep all of its items in memory. The way I see it, Python's embracing of iterators is what makes this commonly useful. > I suspect that, even for your application, a more sensible approach > would be to write a single function to walk over the data once, doing > all the calculations you need. E.g. if your data is numeric, and you > need (say) the min, max, mean (average), standard deviation and > standard error, rather than doing a separate pass for each function, > you can do them all in a single pass: > > sum = 0 > sum_sq = 0 > count = 0 > smallest = sys.maxint > biggest = -sys.maxint > for x in data: > count += 1 > sum += x > sum_sq += x**2 > smallest = min(smallest, x) > biggest = max(biggest, x) > mean = sum/count > std_dev = math.sqrt((sum_sq + sum**2)/(count-1)) > std_err = std_dev/math.sqrt(count) What you suggest is that each programmer rolls his own code, which is reasonable for tasks which are not very common and are easy enough to implement. The problem is that in this case, the straightforward solution you suggest has both efficiency and numerical stability problems. These are actually quite tricky to understand and sort out. In light of this, a standard implementation which avoids common stumbling blocks and errors could have its place in the standard library. IIRC these were the reasons for the inclusion of the bisect module, for example. Regarding the numerical stability issues, these don't arise just in extreme edge-cases. Even a simple running average calculation for some large numbers, or numbers whose average is near zero, can have significant errors. Variance and standard deviation are even more problematic in this respect. > Naturally, if you don't know what functions you need to call until > runtime it will require a bit more cleverness. A general approach might > be a functional approach based on reduce: > > def multireduce(functions, initial_values, data): > ? ?values = list(initial_values) > ? ?for x in data: > ? ? ? ?for i, func in enumerate(functions): > ? ? ? ? ? ?values[i] = func(x, values[i]) > ? ?return values > > The point is that if generating the data is costly, the best approach is > to lazily generate the data once only, with the minimal overhead and > maximum flexibility. This is precisely what I am suggesting! The only difference is that I suggest using objects with a simple API instead of functions, to allow more flexibility. Some things are hard to implement using just a function as you suggest, and various optimizations are impossible. 
- Tal Einat From rrr at ronadam.com Fri Oct 15 19:13:54 2010 From: rrr at ronadam.com (Ron Adam) Date: Fri, 15 Oct 2010 12:13:54 -0500 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> Message-ID: <4CB88BD2.4010901@ronadam.com> My apologies, I clicked "reply" instead of "reply list" last night. After thinking about this a bit more, it isn't a matter of never needing to do it. The min and max functions wouldn't be able to compare a series of lists as individual values without a keyword switch to choose the specific behavior for a single item. ie... list of items or an item that happens to be a list. The below examples would not be able to compare sequences correctly. Ron On 10/14/2010 09:14 PM, Guido van Rossum wrote: > Why would you ever want to write min(1)? (Or min(x) where x is not iterable.) Basically to allow easier duck typing without having to check weather x is an iterable. This isn't a big deal or a must have. It's just one solution to a problem presented here. My own thoughts is that little tweaks like this may be helpful when using functions in indirect ways where it's nice not to have to do additional value, type, or attribute checking. [Tal also says] > As Guido mentioned, there is never a reason to do max(value) where > value is not an iterable. Well, you can always avoid doing it, but that doesn't mean it wouldn't be nice to have sometimes. Take a look at the following three coroutines that do the same exact thing. Which is easier to read and which would be considered the more Pythonic. def xmin(*args, **kwds): # Allow min to work with a single non-iterable value. if len(args) == 1 and not hasattr(args[0], "__iter__"): return min(args, **kwds) else: return min(*args, **kwds) # Accept values or chunks of values and keep a running minimum. @consumer def Running_Min(out_value=None): while 1: in_value = yield out_value if in_value is not None: if out_value is None: out_value = xmin(in_value) else: out_value = xmin(out_value, xmin(in_value)) @consumer def Running_Min(out_value=None): while 1: in_value = yield out_value if in_value is not None: if not hasattr(in_value, "__iter__"): in_value = [in_value] if out_value is None: out_value = min(in_value) else: out_value = min(out_value, min(in_value)) @consumer def Running_Min(out_value=None): while 1: in_value = yield out_value if in_value is not None: if not hasattr(in_value, "__iter__"): if out_value is None: out_value = in_value else: out_value = min(out_value, in_value) else: if out_value is None: out_value = min(in_value) else: out_value = min(out_value, min(in_value)) From g.brandl at gmx.net Fri Oct 15 19:27:10 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 15 Oct 2010 19:27:10 +0200 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: <4CB88BD2.4010901@ronadam.com> References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> Message-ID: Am 15.10.2010 19:13, schrieb Ron Adam: > [Tal also says] >> As Guido mentioned, there is never a reason to do max(value) where >> value is not an iterable. > > Well, you can always avoid doing it, but that doesn't mean it wouldn't be > nice to have sometimes. Take a look at the following three coroutines that > do the same exact thing. 
Which is easier to read and which would be > considered the more Pythonic. > > > def xmin(*args, **kwds): > # Allow min to work with a single non-iterable value. > if len(args) == 1 and not hasattr(args[0], "__iter__"): > return min(args, **kwds) > else: > return min(*args, **kwds) I don't understand this function. Why wouldn't you simply always call return min(args, **kwds) ? Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From rrr at ronadam.com Fri Oct 15 20:09:17 2010 From: rrr at ronadam.com (Ron Adam) Date: Fri, 15 Oct 2010 13:09:17 -0500 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> Message-ID: <4CB898CD.6000207@ronadam.com> On 10/15/2010 12:27 PM, Georg Brandl wrote: > Am 15.10.2010 19:13, schrieb Ron Adam: > >> [Tal also says] >>> As Guido mentioned, there is never a reason to do max(value) where >>> value is not an iterable. >> >> Well, you can always avoid doing it, but that doesn't mean it wouldn't be >> nice to have sometimes. Take a look at the following three coroutines that >> do the same exact thing. Which is easier to read and which would be >> considered the more Pythonic. >> >> >> def xmin(*args, **kwds): >> # Allow min to work with a single non-iterable value. >> if len(args) == 1 and not hasattr(args[0], "__iter__"): >> return min(args, **kwds) >> else: >> return min(*args, **kwds) > > I don't understand this function. Why wouldn't you simply always call > > return min(args, **kwds) > > ? Because it would always interpret a list of values as a single item. This function looks at args and if its a single value without an "__iter__" method, it passes it to min as min([value], **kwds) instead of min(value, **kwds). Another way to do this would be to use a try-except... try: return min(*args, **kwds) except TypeError: return min(args, **kwds) Ron From arnodel at googlemail.com Fri Oct 15 21:25:49 2010 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Fri, 15 Oct 2010 20:25:49 +0100 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: <4CB898CD.6000207@ronadam.com> References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> Message-ID: On 15 October 2010 19:09, Ron Adam wrote: > > > On 10/15/2010 12:27 PM, Georg Brandl wrote: >> >> Am 15.10.2010 19:13, schrieb Ron Adam: >> >>> [Tal also says] >>>> >>>> As Guido mentioned, there is never a reason to do max(value) where >>>> value is not an iterable. >>> >>> Well, you can always avoid doing it, but that doesn't mean it wouldn't be >>> nice to have sometimes. ?Take a look at the following three coroutines >>> that >>> do the same exact thing. ?Which is easier to read and which would be >>> considered the more Pythonic. >>> >>> >>> def xmin(*args, **kwds): >>> ? ? ?# Allow min to work with a single non-iterable value. >>> ? ? ?if len(args) == 1 and not hasattr(args[0], "__iter__"): >>> ? ? ? ? ?return min(args, **kwds) >>> ? ? ?else: >>> ? ? ? ? 
?return min(*args, **kwds) >> >> I don't understand this function. ?Why wouldn't you simply always call >> >> ? ?return min(args, **kwds) >> >> ? > > Because it would always interpret a list of values as a single item. > > This function looks at args and if its a single value without an "__iter__" > method, it passes it to min as min([value], **kwds) instead of min(value, > **kwds). But there are many iterable objects which are also comparable (hence it makes sense to consider their min/max), for example strings. So we get: xmin("foo", "bar", "baz") == "bar" xmin("foo", "bar") == "bar" but: xmin("foo") == "f" This will create havoc in your running min routine. (Notice the same will hold for min() but at least you know that min(x) considers x as an iterable and complains if it isn't) -- Arnaud From g.brandl at gmx.net Fri Oct 15 21:25:27 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 15 Oct 2010 21:25:27 +0200 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: <4CB898CD.6000207@ronadam.com> References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> Message-ID: Am 15.10.2010 20:09, schrieb Ron Adam: >> I don't understand this function. Why wouldn't you simply always call >> >> return min(args, **kwds) >> >> ? > > Because it would always interpret a list of values as a single item. > > This function looks at args and if its a single value without an "__iter__" > method, it passes it to min as min([value], **kwds) instead of min(value, > **kwds). > > Another way to do this would be to use a try-except... > > try: > return min(*args, **kwds) > except TypeError: > return min(args, **kwds) And that's just gratuitous. If you have the sequence of items to compare already as an iterable, there is absolutely no need to unpack them using *args. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From raymond.hettinger at gmail.com Fri Oct 15 22:01:37 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 15 Oct 2010 13:01:37 -0700 Subject: [Python-ideas] Fwd: stats module Was: minmax() function ... References: <1F769A68-3C17-482B-A252-7DB6BFF7F37B@gmail.com> Message-ID: <379FDE7A-D612-4701-8E11-2E9A36EFF3CB@gmail.com> Drat. This should have gone to python-ideas. Re-sending. Begin forwarded message: > From: Raymond Hettinger > Date: October 15, 2010 1:00:16 PM PDT > To: Python-Dev Dev > Subject: Fwd: [Python-ideas] stats module Was: minmax() function ... > > Hello guys. If you don't mind, I would like to hijack your thread :-) > > ISTM, that the minmax() idea is really just an optimization request. > A single-pass minmax() is easily coded in simple, pure-python, > so really the discussion is about how to remove the loop overhead > (there isn't much you can do about the cost of the two compares > which is where most of the time would be spent anyway). > > My suggestion is to aim higher. There is no reason a single pass > couldn't also return min/max/len/sum and perhaps even other summary > statistics like sum(x**2) so that you can compute standard deviation > and variance. 
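(A minimal sketch of that single-pass idea; the function name and dict result are just for illustration, and it uses the plain sum-of-squares formula, which is exactly where the precision concerns raised elsewhere in this thread come in:)

import math

def summary_stats(iterable):
    # One pass over the data: min, max, len, sum and sum of squares,
    # with mean and standard deviation derived at the end.
    count = 0
    total = 0.0
    total_sq = 0.0
    smallest = largest = None
    for x in iterable:
        count += 1
        total += x
        total_sq += x * x
        if smallest is None or x < smallest:
            smallest = x
        if largest is None or x > largest:
            largest = x
    if count < 2:
        raise ValueError("need at least two values")
    mean = total / count
    # textbook computational formula for the sample variance;
    # clamp tiny negative rounding errors before taking the square root
    variance = max((total_sq - total * total / count) / (count - 1), 0.0)
    return {'min': smallest, 'max': largest, 'len': count, 'sum': total,
            'mean': mean, 'stdev': math.sqrt(variance)}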
> > A few years ago, Guido and other python devvers supported a > proposal I made to create a stats module, but I didn't have time > to develop it. The basic idea was that python's batteries should > include most of the functionality available on advanced student > calculators. Another idea behind it was that we could invisibility > do-the-right-thing under the hood to help users avoid numerical > problems (i.e. math.fsum(s)/len(s) is a more accurate way to > compute an average because it doesn't lose precision when > building-up the intermediate sums). > > I think the creativity and energy of this group is much better directed > at building a quality stats module (perhaps with some R-like capabilities). > That would likely be a better use of energy than bike-shedding > about ways to speed-up a trivial piece of code that is ultimately > constrained by the cost of the compares per item. > > my-two-cents, > > > Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From rrr at ronadam.com Fri Oct 15 22:00:53 2010 From: rrr at ronadam.com (Ron Adam) Date: Fri, 15 Oct 2010 15:00:53 -0500 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> Message-ID: <4CB8B2F5.2020507@ronadam.com> On 10/15/2010 02:04 PM, Arnaud Delobelle wrote: >> Because it would always interpret a list of values as a single item. >> >> This function looks at args and if its a single value without an "__iter__" >> method, it passes it to min as min([value], **kwds) instead of min(value, >> **kwds). > > But there are many iterable objects which are also comparable (hence > it makes sense to consider their min/max), for example strings. > > So we get: > > xmin("foo", "bar", "baz") == "bar" > xmin("foo", "bar") == "bar" > > but: > > xmin("foo") == "f" > > This will create havoc in your running min routine. > > (Notice the same will hold for min() but at least you know that min(x) > considers x as an iterable and complains if it isn't) Yes There doesn't seem to be a way to generalize min/max in a way to handle all the cases without knowing the context. So in a coroutine version of Tals class, you would need to pass a hint along with the value. Ron From g.brandl at gmx.net Fri Oct 15 22:52:43 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 15 Oct 2010 22:52:43 +0200 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: <4CB8B2F5.2020507@ronadam.com> References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> <4CB8B2F5.2020507@ronadam.com> Message-ID: Am 15.10.2010 22:00, schrieb Ron Adam: >> (Notice the same will hold for min() but at least you know that min(x) >> considers x as an iterable and complains if it isn't) > > Yes > > There doesn't seem to be a way to generalize min/max in a way to handle all > the cases without knowing the context. I give up. You see an issue where there is none. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. 
Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From masklinn at masklinn.net Fri Oct 15 22:56:46 2010 From: masklinn at masklinn.net (Masklinn) Date: Fri, 15 Oct 2010 22:56:46 +0200 Subject: [Python-ideas] Fwd: stats module Was: minmax() function ... In-Reply-To: <379FDE7A-D612-4701-8E11-2E9A36EFF3CB@gmail.com> References: <1F769A68-3C17-482B-A252-7DB6BFF7F37B@gmail.com> <379FDE7A-D612-4701-8E11-2E9A36EFF3CB@gmail.com> Message-ID: On 2010-10-15, at 22:01 , Raymond Hettinger wrote: > Drat. This should have gone to python-ideas. > Re-sending. > > Begin forwarded message: > >> From: Raymond Hettinger >> Date: October 15, 2010 1:00:16 PM PDT >> To: Python-Dev Dev >> Subject: Fwd: [Python-ideas] stats module Was: minmax() function ... >> >> Hello guys. If you don't mind, I would like to hijack your thread :-) >> >> ISTM, that the minmax() idea is really just an optimization request. >> A single-pass minmax() is easily coded in simple, pure-python, >> so really the discussion is about how to remove the loop overhead >> (there isn't much you can do about the cost of the two compares >> which is where most of the time would be spent anyway). >> >> My suggestion is to aim higher. There is no reason a single pass >> couldn't also return min/max/len/sum and perhaps even other summary >> statistics like sum(x**2) so that you can compute standard deviation >> and variance. >> >> A few years ago, Guido and other python devvers supported a >> proposal I made to create a stats module, but I didn't have time >> to develop it. The basic idea was that python's batteries should >> include most of the functionality available on advanced student >> calculators. Another idea behind it was that we could invisibility >> do-the-right-thing under the hood to help users avoid numerical >> problems (i.e. math.fsum(s)/len(s) is a more accurate way to >> compute an average because it doesn't lose precision when >> building-up the intermediate sums). >> >> I think the creativity and energy of this group is much better directed >> at building a quality stats module (perhaps with some R-like capabilities). >> That would likely be a better use of energy than bike-shedding >> about ways to speed-up a trivial piece of code that is ultimately >> constrained by the cost of the compares per item. >> >> my-two-cents, >> >> >> Raymond I think I'd still go with composable coroutines, the kind of stuff dabeaz shows/promotes in his training sessions and stuff. Maybe with a higher-level interface making their usage easier, but they seem a perfect fit for that kind of stuff where you create arbitrary data pipes including forks and joins. As others mentioned, generator-based coroutines in Python have to be primed (by calling next() once on them) which is kind-of a pain, but the decorator to "fix" that is easy enough to write. From steve at pearwood.info Sat Oct 16 02:11:21 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 16 Oct 2010 11:11:21 +1100 Subject: [Python-ideas] stats module Was: minmax() function ... Message-ID: <201010161111.21847.steve@pearwood.info> Seconding Raymond's 'drat'. Resending to python-ideas. On Sat, 16 Oct 2010 07:00:16 am Raymond Hettinger wrote: > Hello guys. If you don't mind, I would like to hijack your thread > :-) Please do :) > A few years ago, Guido and other python devvers supported a > proposal I made to create a stats module, but I didn't have time > to develop it. [...] 
> I think the creativity and energy of this group is much better > directed at building a quality stats module (perhaps with some R-like > capabilities). +1 Are you still interested in working on it, or is this a subtle hint that somebody else should do so? -- Steven D'Aprano From raymond.hettinger at gmail.com Sat Oct 16 02:33:02 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 15 Oct 2010 17:33:02 -0700 Subject: [Python-ideas] stats module Was: minmax() function ... In-Reply-To: <201010161111.21847.steve@pearwood.info> References: <201010161111.21847.steve@pearwood.info> Message-ID: <622121A3-6A51-4735-A292-9F82502BB623@gmail.com> On Oct 15, 2010, at 5:11 PM, Steven D'Aprano wrote: >> A few years ago, Guido and other python devvers supported a >> proposal I made to create a stats module, but I didn't have time >> to develop it. > [...] >> I think the creativity and energy of this group is much better >> directed at building a quality stats module (perhaps with some R-like >> capabilities). > > +1 > > Are you still interested in working on it, or is this a subtle hint that > somebody else should do so? Hmm, perhaps this would be less subtle: HEY, WHY DON'T YOU GUYS GO TO WORK ON A STATS MODULE! There, that should do it :-) Raymond From sunqiang at gmail.com Sat Oct 16 02:49:15 2010 From: sunqiang at gmail.com (sunqiang) Date: Sat, 16 Oct 2010 08:49:15 +0800 Subject: [Python-ideas] [Python-Dev] Fwd: stats module Was: minmax() function ... In-Reply-To: References: <1F769A68-3C17-482B-A252-7DB6BFF7F37B@gmail.com> Message-ID: On Sat, Oct 16, 2010 at 8:05 AM, geremy condra wrote: > On Fri, Oct 15, 2010 at 1:00 PM, Raymond Hettinger > wrote: >> Hello guys. ?If you don't mind, I would like to hijack your thread :-) >> >> ISTM, that the minmax() idea is really just an optimization request. >> A single-pass minmax() is easily coded in simple, pure-python, >> so really the discussion is about how to remove the loop overhead >> (there isn't much you can do about the cost of the two compares >> which is where most of the time would be spent anyway). >> >> My suggestion is to aim higher. ? There is no reason a single pass >> couldn't also return min/max/len/sum and perhaps even other summary >> statistics like sum(x**2) so that you can compute standard deviation >> and variance. > > +1 from me. Here's a normal cdf and chi squared cdf approximation I > use for randomness testing. They may need to refined for inclusion, > but you're welcome to use them if you'd like. > > from math import sqrt, erf > > def normal_cdf(x, mu=0, sigma=1): > ? ? ? ?"""Approximates the normal cumulative distribution""" > ? ? ? ?return (1/2) * (1 + erf((x+mu)/(sigma*sqrt(2)))) > > def chi_squared_cdf(x, k): > ? ? ? ?"""Approximates the cumulative chi-squared statistic with k degrees > of freedom.""" > ? ? ? ?numerator = 1 - (2/(9*k)) - ((x/k)**(1/3)) > ? ? ? ?denominator = (1/3) * sqrt(2/k) > ? ? ? ?return normal_cdf(numerator/denominator) > >> A few years ago, Guido and other python devvers supported a >> proposal I made to create a stats module, but I didn't have time >> to develop it. ?The basic idea was that python's batteries should >> include most of the functionality available on advanced student >> calculators. ?Another idea behind it was that we could invisibility >> do-the-right-thing under the hood to help users avoid numerical >> problems (i.e. math.fsum(s)/len(s) is a more accurate way to >> compute an average because it doesn't lose precision when >> building-up the intermediate sums). 
> > Can you give some other examples? Sage does some of this and I > frequently find it annoying, actually, but I'm not sure if you're > referring to the same things there. have seen a blog post[1] several months ago from reddit[2], maybe it worth a reading. [1]: http://www.johndcook.com/blog/2010/06/07/math-library-functions-that-seem-unnecessary/ [2]: http://www.reddit.com/r/programming/comments/ccbja/math_library_functions_that_seem_unnecessary/ > Geremy Condra > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: http://mail.python.org/mailman/options/python-dev/sunqiang%40gmail.com > From rrr at ronadam.com Sat Oct 16 07:31:36 2010 From: rrr at ronadam.com (Ron Adam) Date: Sat, 16 Oct 2010 00:31:36 -0500 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> <4CB8B2F5.2020507@ronadam.com> Message-ID: <4CB938B8.4050709@ronadam.com> On 10/15/2010 03:52 PM, Georg Brandl wrote: > Am 15.10.2010 22:00, schrieb Ron Adam: > >>> (Notice the same will hold for min() but at least you know that min(x) >>> considers x as an iterable and complains if it isn't) >> >> Yes >> >> There doesn't seem to be a way to generalize min/max in a way to handle all >> the cases without knowing the context. > > I give up. You see an issue where there is none. Sorry for the delay, I was away for the day... Thanks for trying George, it really wasn't an issue. I was thinking about it from the point of view of, would it be possible to make min and max easier to use in indirect ways. As I found out, those functions depend on both the number of arguments, and the context they are used in, to do the right thing. Change either and you may get unexpected results. In the example where *args was used... I had left out the function def of min(*args, **kwds) where you would have saw that args, was just unpacking the arguments, and not the list object being passed to min. My mistake. Cheers, Ron From taleinat at gmail.com Sat Oct 16 12:59:26 2010 From: taleinat at gmail.com (Tal Einat) Date: Sat, 16 Oct 2010 12:59:26 +0200 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: <201010111017.56101.steve@pearwood.info> References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> Message-ID: On Mon, Oct 11, 2010 at 1:17 AM, Steven D'Aprano wrote: > On Mon, 11 Oct 2010 05:57:21 am Paul McGuire wrote: >> Just as an exercise, I wanted to try my hand at adding a function to >> the compiled Python C code. ?An interesting optimization that I read >> about (where? don't recall) finds the minimum and maximum elements of >> a sequence in a single pass, with a 25% reduction in number of >> comparison operations: >> - the sequence elements are read in pairs >> - each pair is compared to find smaller/greater >> - the smaller is compared to current min >> - the greater is compared to current max >> >> So each pair is applied to the running min/max values using 3 >> comparisons, vs. 4 that would be required if both were compared to >> both min and max. >> >> This feels somewhat similar to how divmod returns both quotient and >> remainder of a single division operation. 
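(A rough pure-Python sketch of that pairing scheme; the function name and error handling are chosen here only for illustration:)

def minmax(iterable):
    # Take items in pairs: order the pair (1 comparison), then compare the
    # smaller against the running minimum and the larger against the running
    # maximum (2 more) -- 3 comparisons per pair instead of 4.
    it = iter(iterable)
    try:
        lo = hi = next(it)
    except StopIteration:
        raise ValueError("minmax() arg is an empty sequence")
    for a in it:
        try:
            b = next(it)
        except StopIteration:
            # odd item left over: compare it directly
            if a < lo:
                lo = a
            elif a > hi:
                hi = a
            break
        if b < a:
            a, b = b, a
        if a < lo:
            lo = a
        if b > hi:
            hi = b
    return lo, hi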
>> >> This would be potentially interesting for those cases where min and >> max are invoked on the same sequence one after the other, and >> especially so if the sequence elements were objects with expensive >> comparison operations. > > Perhaps more importantly, it is ideal for the use-case where you have an > iterator. You can't call min() and then max(), as min() consumes the > iterator leaving nothing for max(). It may be undesirable to convert > the iterator to a list first -- it may be that the number of items in > the data stream is too large to fit into memory all at once, but even > if it is small, it means you're now walking the stream three times when > one would do. > > To my mind, minmax() is as obvious and as useful a built-in as divmod(), > but if there is resistance to making such a function a built-in, > perhaps it could go into itertools. (I would prefer it to keep the same > signature as min() and max(), namely that it will take either a single > iterable argument or multiple arguments.) > > I've experimented with minmax() myself. Not surprisingly, the > performance of a pure Python version doesn't even come close to the > built-ins. > > I'm +1 on the idea. > > Presumably follow-ups should go to python-ideas. The discussion which followed this up has digressed quite a bit, but I'd like to mention that I'm +1 on having an efficient minmax() function available. - Tal From jan.koprowski at gmail.com Sun Oct 17 09:27:37 2010 From: jan.koprowski at gmail.com (Jan Koprowski) Date: Sun, 17 Oct 2010 09:27:37 +0200 Subject: [Python-ideas] dict.hash - optimized per module Message-ID: Hi, My name is Jan and this is my first post on this group. So hello :) I'm very sorry if my idea is so naive as to be ridiculous but I believe it is worth to ask. I'm just watched "The Mighty Dictionary" video conference from Atlanta deliver by Brandon Craig Rhodes. After watching I made graph, using presented at conference library dictinfo, for __builtin__.__dict__. When I saw few collisions I think "Why this module doesn't have their own hashing function implementation which allow to avoid collision in this set of names?". My second think was "Why each Python module doesn't have their own internal hashing function which doesn't produce collisions in scope of his names". Maybe my thoughts was silly but is this doesn't speed Python a little? I'm aware that this doesn't work for locals or globals dict but may be an improvement in places where set of names is constant or predictable like builtins Python modules. What do You think? Greetings from Poland, -- ><> Jan Koprowski From steve at pearwood.info Sun Oct 17 11:41:34 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 17 Oct 2010 20:41:34 +1100 Subject: [Python-ideas] dict.hash - optimized per module In-Reply-To: References: Message-ID: <201010172041.34847.steve@pearwood.info> On Sun, 17 Oct 2010 06:27:37 pm Jan Koprowski wrote: > After watching I made graph, using presented at conference library > dictinfo, for __builtin__.__dict__. > When I saw few collisions I think "Why this module doesn't have > their own hashing function implementation which allow to avoid > collision in this set of names?". Python 2.6 has 143 builtin names, and zero collisions: >>> hashes = {} >>> import __builtin__ >>> for name in __builtin__.__dict__: ... h = hash(name) ... if h in hashes: print "Collision for", name ... L = hashes.setdefault(h, []) ... L.append(name) ... 
>>> len(hashes) 143 >>> filter(lambda x: len(x) > 1, hashes.values()) [] >>> next(hashes.iteritems()) (29257728, ['bytearray']) > My second think was "Why each > Python module doesn't have their own internal hashing function which > doesn't produce collisions in scope of his names". Firstly, the occasional collision doesn't matter much. Secondly, your idea would mean that every module would need it's own custom-made hash function. Writing good hash functions is hard. The Python hash function is very, very good. Expecting developers to produce *dozens* of hash functions equally as good is totally impractical. -- Steven D'Aprano From pyideas at rebertia.com Sun Oct 17 11:52:27 2010 From: pyideas at rebertia.com (Chris Rebert) Date: Sun, 17 Oct 2010 02:52:27 -0700 Subject: [Python-ideas] dict.hash - optimized per module In-Reply-To: <201010172041.34847.steve@pearwood.info> References: <201010172041.34847.steve@pearwood.info> Message-ID: On Sun, Oct 17, 2010 at 2:41 AM, Steven D'Aprano wrote: > On Sun, 17 Oct 2010 06:27:37 pm Jan Koprowski wrote: >> ? After watching I made graph, using presented at conference library >> dictinfo, for __builtin__.__dict__. >> ? When I saw few collisions I think "Why this module doesn't have >> their own hashing function implementation which allow to avoid >> collision in this set of names?". > Firstly, the occasional collision doesn't matter much. > > Secondly, your idea would mean that every module would need it's own > custom-made hash function. Writing good hash functions is hard. The > Python hash function is very, very good. Expecting developers to > produce *dozens* of hash functions equally as good is totally > impractical. Actually, there's already software to automatically generate such functions; e.g. http://www.gnu.org/software/gperf/ Not that this makes the suggestion any more tractable though. Cheers, Chris From solipsis at pitrou.net Sun Oct 17 12:52:40 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 17 Oct 2010 12:52:40 +0200 Subject: [Python-ideas] dict.hash - optimized per module References: <201010172041.34847.steve@pearwood.info> Message-ID: <20101017125240.0ef893ee@pitrou.net> On Sun, 17 Oct 2010 20:41:34 +1100 Steven D'Aprano wrote: > On Sun, 17 Oct 2010 06:27:37 pm Jan Koprowski wrote: > > > After watching I made graph, using presented at conference library > > dictinfo, for __builtin__.__dict__. > > When I saw few collisions I think "Why this module doesn't have > > their own hashing function implementation which allow to avoid > > collision in this set of names?". > > Python 2.6 has 143 builtin names, and zero collisions: It depends what you call collisions. Collisions during bucket lookup, or during hash value comparison (that is, after you selected a bucket)? For the former, here is the calculation assuming an overallocation factor of 4 (which, IIRC, is the one used in the dict implementation): >>> import builtins >>> d = builtins.__dict__ >>> m = len(d) * 4 >>> for name in d: ... h = hash(name) % m ... if h in hashes: print("Collision for", name) ... hashes.setdefault(h, []).append(name) ... 
Collision for True Collision for FutureWarning Collision for license Collision for KeyboardInterrupt Collision for UserWarning Collision for RuntimeError Collision for MemoryError Collision for Ellipsis Collision for UnicodeError Collision for Exception Collision for tuple Collision for delattr Collision for setattr Collision for ArithmeticError Collision for property Collision for KeyError Collision for PendingDeprecationWarning Collision for map Collision for AssertionError >>> len(d) 130 >>> len(hashes) 110 > > My second think was "Why each > > Python module doesn't have their own internal hashing function which > > doesn't produce collisions in scope of his names". The real answer here is that Python needs hash values to be globally valid. Both for semantics (module dicts are regular dicts and should be usable as such), and for efficiency (having an unique hash function means the precalculated hash value can be stored for critical types such as str). Regards Antoine. From steve at pearwood.info Sun Oct 17 18:57:58 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 18 Oct 2010 03:57:58 +1100 Subject: [Python-ideas] stats module Was: minmax() function ... In-Reply-To: <622121A3-6A51-4735-A292-9F82502BB623@gmail.com> References: <201010161111.21847.steve@pearwood.info> <622121A3-6A51-4735-A292-9F82502BB623@gmail.com> Message-ID: <201010180357.59264.steve@pearwood.info> On Sat, 16 Oct 2010 11:33:02 am Raymond Hettinger wrote: > > Are you still interested in working on it, or is this a subtle hint > > that somebody else should do so? > > Hmm, perhaps this would be less subtle: > HEY, WHY DON'T YOU GUYS GO TO WORK ON A STATS MODULE! http://pypi.python.org/pypi/stats It is not even close to production ready. It needs unit tests. The API should be considered unstable. There's no 3.x version yet. Obviously it has no real-world usage. But if anyone would like to contribute, critique or criticize, I welcome feedback or assistance, or even just encouragement. -- Steven D'Aprano From daniel at stutzbachenterprises.com Sun Oct 17 19:05:33 2010 From: daniel at stutzbachenterprises.com (Daniel Stutzbach) Date: Sun, 17 Oct 2010 12:05:33 -0500 Subject: [Python-ideas] stats module Was: minmax() function ... In-Reply-To: <201010180357.59264.steve@pearwood.info> References: <201010161111.21847.steve@pearwood.info> <622121A3-6A51-4735-A292-9F82502BB623@gmail.com> <201010180357.59264.steve@pearwood.info> Message-ID: On Sun, Oct 17, 2010 at 11:57 AM, Steven D'Aprano wrote: > It is not even close to production ready. It needs unit tests. The API > should be considered unstable. There's no 3.x version yet. Obviously it > has no real-world usage. But if anyone would like to contribute, > critique or criticize, I welcome feedback or assistance, or even just > encouragement. > Would you consider hosting it on BitBucket or GitHub? It would make collaboration easier. -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sun Oct 17 19:16:42 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 18 Oct 2010 04:16:42 +1100 Subject: [Python-ideas] stats module Was: minmax() function ... 
In-Reply-To: References: <201010161111.21847.steve@pearwood.info> <201010180357.59264.steve@pearwood.info> Message-ID: <201010180416.43278.steve@pearwood.info> On Mon, 18 Oct 2010 04:05:33 am Daniel Stutzbach wrote: > On Sun, Oct 17, 2010 at 11:57 AM, Steven D'Aprano wrote: > > It is not even close to production ready. It needs unit tests. The > > API should be considered unstable. There's no 3.x version yet. > > Obviously it has no real-world usage. But if anyone would like to > > contribute, critique or criticize, I welcome feedback or > > assistance, or even just encouragement. > > Would you consider hosting it on BitBucket or GitHub? It would make > collaboration easier. Yes I would. I suppose if I ask people for their preferred hosting provider, I'll get 30 different opinions and start a flame-war... -- Steven D'Aprano From masklinn at masklinn.net Sun Oct 17 19:33:21 2010 From: masklinn at masklinn.net (Masklinn) Date: Sun, 17 Oct 2010 19:33:21 +0200 Subject: [Python-ideas] stats module Was: minmax() function ... In-Reply-To: <201010180416.43278.steve@pearwood.info> References: <201010161111.21847.steve@pearwood.info> <201010180357.59264.steve@pearwood.info> <201010180416.43278.steve@pearwood.info> Message-ID: <932D6642-B025-43D5-A9B0-BBA53B777FC8@masklinn.net> On 2010-10-17, at 19:16 , Steven D'Aprano wrote: > On Mon, 18 Oct 2010 04:05:33 am Daniel Stutzbach wrote: >> On Sun, Oct 17, 2010 at 11:57 AM, Steven D'Aprano > wrote: >>> It is not even close to production ready. It needs unit tests. The >>> API should be considered unstable. There's no 3.x version yet. >>> Obviously it has no real-world usage. But if anyone would like to >>> contribute, critique or criticize, I welcome feedback or >>> assistance, or even just encouragement. >> Would you consider hosting it on BitBucket or GitHub? It would make >> collaboration easier. > Yes I would. > > I suppose if I ask people for their preferred hosting provider, I'll get > 30 different opinions and start a flame-war? If you're a bit bored, you can always host on both at the same time via hg-git [0]. 99% of the population[1] should be happy enough if it's available through both git and mercurial. [0] http://hg-git.github.com/ [1] yep, really, I didn't make that up at all. From debatem1 at gmail.com Sun Oct 17 19:36:28 2010 From: debatem1 at gmail.com (geremy condra) Date: Sun, 17 Oct 2010 10:36:28 -0700 Subject: [Python-ideas] stats module Was: minmax() function ... In-Reply-To: <201010180416.43278.steve@pearwood.info> References: <201010161111.21847.steve@pearwood.info> <201010180357.59264.steve@pearwood.info> <201010180416.43278.steve@pearwood.info> Message-ID: On Sun, Oct 17, 2010 at 10:16 AM, Steven D'Aprano wrote: > On Mon, 18 Oct 2010 04:05:33 am Daniel Stutzbach wrote: >> On Sun, Oct 17, 2010 at 11:57 AM, Steven D'Aprano > wrote: >> > It is not even close to production ready. It needs unit tests. The >> > API should be considered unstable. There's no 3.x version yet. >> > Obviously it has no real-world usage. But if anyone would like to >> > contribute, critique or criticize, I welcome feedback or >> > assistance, or even just encouragement. >> >> Would you consider hosting it on BitBucket or GitHub? ?It would make >> collaboration easier. > > Yes I would. > > I suppose if I ask people for their preferred hosting provider, I'll get > 30 different opinions and start a flame-war... Like that's ever stopped you ;) I've been working on this as well, and somehow wound up with a totally different module. 
I'll have mine up someplace later tonight, but we should consider merging them. Geremy Condra From tjreedy at udel.edu Mon Oct 18 00:10:13 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 17 Oct 2010 18:10:13 -0400 Subject: [Python-ideas] dict.hash - optimized per module In-Reply-To: References: Message-ID: On 10/17/2010 3:27 AM, Jan Koprowski wrote: > Hi, > > My name is Jan and this is my first post on this group. So hello :) > I'm very sorry if my idea is so naive as to be ridiculous but I > believe it is worth to ask. Worth asking but not worth doing (or, in a sense, already done for function local namespaces). As Antoine said, strings have their hash computed just once. Recomputing a namespace-depending hash for each lookup would take far longer than the occational collision. For function local names, names are assigned a index at compile time so that runtime lookup is a super-quick index operation. If you want, call it perfect hashing with hashes computed once at compile time ;-). -- Terry Jan Reedy From merwok at netwok.org Mon Oct 18 14:45:14 2010 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Mon, 18 Oct 2010 14:45:14 +0200 Subject: [Python-ideas] stats module Was: minmax() function ... In-Reply-To: <201010180357.59264.steve@pearwood.info> References: <201010161111.21847.steve@pearwood.info> <622121A3-6A51-4735-A292-9F82502BB623@gmail.com> <201010180357.59264.steve@pearwood.info> Message-ID: <4CBC415A.6040701@netwok.org> [Sorry if this comes twice, connection errors here] > http://pypi.python.org/pypi/stats Isn?t it a potential source of errors that the module name is so close to that of stat? Regards From merwok at netwok.org Sun Oct 17 19:30:32 2010 From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=) Date: Sun, 17 Oct 2010 19:30:32 +0200 Subject: [Python-ideas] stats module Was: minmax() function ... In-Reply-To: <201010180357.59264.steve@pearwood.info> References: <201010161111.21847.steve@pearwood.info> <622121A3-6A51-4735-A292-9F82502BB623@gmail.com> <201010180357.59264.steve@pearwood.info> Message-ID: <4CBB32B8.6080605@netwok.org> > http://pypi.python.org/pypi/stats Isn?t it a potential source of errors that the module name is so close to that of stat? Regards From daniel at stutzbachenterprises.com Mon Oct 18 22:48:35 2010 From: daniel at stutzbachenterprises.com (Daniel Stutzbach) Date: Mon, 18 Oct 2010 15:48:35 -0500 Subject: [Python-ideas] stats module Was: minmax() function ... In-Reply-To: <201010180416.43278.steve@pearwood.info> References: <201010161111.21847.steve@pearwood.info> <201010180357.59264.steve@pearwood.info> <201010180416.43278.steve@pearwood.info> Message-ID: On Sun, Oct 17, 2010 at 12:16 PM, Steven D'Aprano wrote: > On Mon, 18 Oct 2010 04:05:33 am Daniel Stutzbach wrote: > > Would you consider hosting it on BitBucket or GitHub? It would make > > collaboration easier. > > Yes I would. > > I suppose if I ask people for their preferred hosting provider, I'll get > 30 different opinions and start a flame-war... That's likely, yes. Just make an executive decision. :-) -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Mon Oct 18 23:57:09 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 19 Oct 2010 08:57:09 +1100 Subject: [Python-ideas] stats module Was: minmax() function ... 
In-Reply-To: <4CBC415A.6040701@netwok.org> References: <201010161111.21847.steve@pearwood.info> <201010180357.59264.steve@pearwood.info> <4CBC415A.6040701@netwok.org> Message-ID: <201010190857.09731.steve@pearwood.info> On Mon, 18 Oct 2010 11:45:14 pm ?ric Araujo wrote: > [Sorry if this comes twice, connection errors here] > > > http://pypi.python.org/pypi/stats > > Isn?t it a potential source of errors that the module name is so > close to that of stat? The name is not set in stone. Any name is likely to lead to potential errors -- it took me five years to stop writing "import maths", and I still never remember whether I want to import date or datetime. But it's not a critical error -- it's pretty obvious when you've imported the wrong module. -- Steven D'Aprano From pingebre at yahoo.com Tue Oct 19 21:17:13 2010 From: pingebre at yahoo.com (Peter Ingebretson) Date: Tue, 19 Oct 2010 12:17:13 -0700 (PDT) Subject: [Python-ideas] Proposal for an enhanced reload mechanism Message-ID: <944134.31586.qm@web34408.mail.mud.yahoo.com> The builtin reload function is very useful for iterative development, but it is also limited.? Because references to types and functions in the old version of the module may persist after reloading, the builtin reload function is typically only useful in simple use cases. This is a proposal (pre-PEP?) for an enhanced reloading mechanism especially designed for iterative development: https://docs.google.com/document/pub?id=1GeVVC0pXTz1O6cK5mo-EaOJFqrL3PErO4okmHBlTeuw The basic plan is to use the existing cycle-detecting GC to remap references from objects in the old module to equivalent objects in the new module. I have a patch against the current 3.2 branch that adds the gc.remap function (and unit tests, etc...) but not any of the additional reloading functionality. I have a separate prototype of the reloading module as well, but it only implements a portion of the proposal (one module at a time, and dicts/sets are not fixed up). A few questions: 1) Does this approach seem reasonable?? Has anyone tried something similar and run into unsolvable problems? 2) Would there be interest in a PEP for enhanced reloading? I would be happy to rewrite the proposal in PEP form if people think it would be worthwhile. 3) Should I submit my gc.remap patch to the issue tracker?? Because the change to visitproc modifies the ABI I would like to get that portion of the proposal in before 3.2 goes final. Since the bulk of the change is adding one method to the gc module I was hoping it might be accepted without requiring a PEP. From ckaynor at zindagigames.com Tue Oct 19 21:53:46 2010 From: ckaynor at zindagigames.com (Chris Kaynor) Date: Tue, 19 Oct 2010 12:53:46 -0700 Subject: [Python-ideas] Proposal for an enhanced reload mechanism In-Reply-To: <944134.31586.qm@web34408.mail.mud.yahoo.com> References: <944134.31586.qm@web34408.mail.mud.yahoo.com> Message-ID: On Tue, Oct 19, 2010 at 12:17 PM, Peter Ingebretson wrote: > The builtin reload function is very useful for iterative development, but > it is also limited. Because references to types and functions in the old > version of the module may persist after reloading, the builtin reload > function is typically only useful in simple use cases. > > This is a proposal (pre-PEP?) 
for an enhanced reloading mechanism > especially designed for iterative development: > > > https://docs.google.com/document/pub?id=1GeVVC0pXTz1O6cK5mo-EaOJFqrL3PErO4okmHBlTeuw > > The basic plan is to use the existing cycle-detecting GC to remap > references from objects in the old module to equivalent objects in the new > module. > > I have a patch against the current 3.2 branch that adds the gc.remap > function (and unit tests, etc...) but not any of the additional reloading > functionality. I have a separate prototype of the reloading module as well, > but it only implements a portion of the proposal (one module at a time, and > dicts/sets are not fixed up). > > A few questions: > > 1) Does this approach seem reasonable? Has anyone tried something similar > and run into unsolvable problems? > > 2) Would there be interest in a PEP for enhanced reloading? I would be > happy to rewrite the proposal in PEP form if people think it would be > worthwhile. > > 3) Should I submit my gc.remap patch to the issue tracker? Because the > change to visitproc modifies the ABI I would like to get that portion of the > proposal in before 3.2 goes final. Since the bulk of the change is adding > one method to the gc module I was hoping it might be accepted without > requiring a PEP. > > > What happens if you change the __init__ or __new__ methods of an object or if you change a class's metaclass? It seems like those types of changes would be impossible to propagate to existing objects, and without propagating them any changes to existing objects may (are likely?) to break the object. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pingebre at yahoo.com Tue Oct 19 22:25:49 2010 From: pingebre at yahoo.com (Peter Ingebretson) Date: Tue, 19 Oct 2010 13:25:49 -0700 (PDT) Subject: [Python-ideas] Proposal for an enhanced reload mechanism In-Reply-To: Message-ID: <741796.36646.qm@web34402.mail.mud.yahoo.com> --- On Tue, 10/19/10, Chris Kaynor wrote: This is a proposal (pre-PEP?) for an enhanced reloading mechanism especially designed for iterative development: https://docs.google.com/document/pub?id=1GeVVC0pXTz1O6cK5mo-EaOJFqrL3PErO4okmHBlTeuw The basic plan is to use the existing cycle-detecting GC to remap references from objects in the old module to equivalent objects in the new module. What happens if you change the __init__ or __new__ methods of an object or if you change a class's metaclass? It seems like those types of changes would be impossible to propagate to existing objects, and without propagating them any changes to existing objects may (are likely?) to break the object.Yes, this is a limitation of the approach. ?More generally, any logic that has already runand would execute differently with the reloaded module has the potential to break things. Even with this limitation I think the approach is still valuable. ?I spend far less time modifying__new__ methods and metaclasses than I spend changing the implementation and API ofother class- and module-level methods. The issue of old instances not having members that are added in a new __init__ isproblematic, but there are several workarounds such as temporarily?wrapping the newmember in a property, or potentially the @reloadable decorator alluded to in the doc. -------------- next part -------------- An HTML attachment was scrubbed... 
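As a rough sketch of the property workaround mentioned above (illustration only -- the class and attribute names are invented, and this is not part of the proposed gc.remap patch): an attribute that a newer __init__ sets can be fronted by a property with a fallback, so instances created before the reload stay usable until they are replaced.

    # Hypothetical example of the "wrap the new member in a property" idea.
    class Widget:
        def __init__(self):
            # The reloaded __init__ now sets a new attribute ...
            self._retries = 3

        # ... but instances created before the reload never ran it, so a
        # property with a default keeps old instances working.
        @property
        def retries(self):
            return getattr(self, "_retries", 3)

        @retries.setter
        def retries(self, value):
            self._retries = value

    old = object.__new__(Widget)   # stands in for a pre-reload instance
    print(old.retries)             # -> 3, via the fallback
    old.retries = 5
    print(old.retries)             # -> 5

Once the old instances have been recreated (or have had the attribute set), the temporary property can be dropped again.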
URL: From pingebre at yahoo.com Tue Oct 19 22:46:24 2010 From: pingebre at yahoo.com (Peter Ingebretson) Date: Tue, 19 Oct 2010 13:46:24 -0700 (PDT) Subject: [Python-ideas] Proposal for an enhanced reload mechanism Message-ID: <718804.70750.qm@web34406.mail.mud.yahoo.com> (Sorry, I sent an html-formatted email by accident) --- On Tue, 10/19/10, Chris Kaynor wrote: > > This is a proposal (pre-PEP?) for an enhanced reloading mechanism > > especially designed for iterative development: > > > > https://docs.google.com/document/pub?id=1GeVVC0pXTz1O6cK5mo-EaOJFqrL3PErO4okmHBlTeuw > > > > The basic plan is to use the existing cycle-detecting GC to remap > > references from objects in the old module to equivalent objects in > > the new module. > > What happens if you change the __init__ or __new__ methods of an object > or if you change a class's metaclass? It seems like those types of > changes would be impossible to propagate to existing objects, and without > propagating them any changes to existing objects may (are likely?) to > break the object. Yes, this is a limitation of the approach. ?More generally, any logic that has already run and would execute differently with the reloaded module has the potential to break things. Even with this limitation I think the approach is still valuable. ?I spend far less time modifying __new__ methods and metaclasses than I spend changing the implementation and API ofother class- and module-level methods. The issue of old instances not having members that are added in a new __init__ is problematic, but there are several workarounds such as temporarily?wrapping the new member in a property, or potentially the @reloadable decorator alluded to in the doc. From ziade.tarek at gmail.com Tue Oct 19 23:26:04 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Tue, 19 Oct 2010 23:26:04 +0200 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths Message-ID: Hello There's one feature I want to add in distutils2: the develop command setuptools provides. Basically it adds a "link" file into site-packages, and does some magic at startup to load the path that is contained in the link file. The use case is to be able to have a project added in the python path without installing it. I am not a huge fan of adding files in site-packages for this though, and the magic it supposes. I thought of another mechanism: a persistent list of paths site.py would load. So the idea is to have two files: - a site.cfg at the python level, with a persistent list of paths - a .local/site.cfg at the user level for user-defined paths. Then distutils2 would add/remove paths in these files in its develop command. This file could contain paths and also possibly sitedirs. Does this sound crazy ? Tarek -- Tarek Ziad? | http://ziade.org From ianb at colorstudy.com Wed Oct 20 00:03:14 2010 From: ianb at colorstudy.com (Ian Bicking) Date: Tue, 19 Oct 2010 17:03:14 -0500 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: On Tue, Oct 19, 2010 at 4:26 PM, Tarek Ziad? wrote: > Hello > > There's one feature I want to add in distutils2: the develop command > setuptools provides. Basically it adds a "link" file into > site-packages, and does some magic at startup to load the path that is > contained in the link file. The use case is to be able to have a > project added in the python path without installing it. 
> The link file is a red herring -- setuptools adds an entry to easy-install.pth that points to the directory. It would work equally as well to add a .pth file for the specific package (though .pth files append to the path, so if you already have a package installed and then a .pth file pointing to a development version, then it won't work as expected, hence the magic in easy-install.pth). -- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ziade.tarek at gmail.com Wed Oct 20 00:12:16 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Wed, 20 Oct 2010 00:12:16 +0200 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: On Wed, Oct 20, 2010 at 12:03 AM, Ian Bicking wrote: > On Tue, Oct 19, 2010 at 4:26 PM, Tarek Ziad? wrote: >> >> Hello >> >> There's one feature I want to add in distutils2: the develop command >> setuptools provides. Basically it adds a "link" file into >> site-packages, and does some magic at startup to load the path that is >> contained in the link file. The use case is to be able to have a >> project added in the python path without installing it. > > The link file is a red herring -- setuptools adds an entry to > easy-install.pth that points to the directory.? It would work equally as > well to add a .pth file for the specific package (though .pth files append > to the path, so if you already have a package installed and then a .pth file > pointing to a development version, then it won't work as expected, hence the > magic in easy-install.pth). Yes, or a develop.pth file containing those paths, like Carl proposed on IRC. a .cfg is not really helping indeed. But we would need to have the metadata built and stored somewhere. A specific directory maybe for them. > -- > Ian Bicking? |? http://blog.ianbicking.org > -- Tarek Ziad? | http://ziade.org From p.f.moore at gmail.com Wed Oct 20 11:57:03 2010 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 20 Oct 2010 10:57:03 +0100 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: On 19 October 2010 22:26, Tarek Ziad? wrote: > There's one feature I want to add in distutils2: the develop command > setuptools provides. Basically it adds a "link" file into > site-packages, and does some magic at startup to load the path that is > contained in the link file. The use case is to be able to have a > project added in the python path without installing it. Can you explain the requirement in more detail? I don't use the setuptools develop command, so I don't have the background, but it seems to me that what you're proposing can be done simply by adding the relevant directory to PYTHONPATH. That's all I ever do when developing (but my needs are pretty simple, so there may well be subtle problems with that approach). Paul From ziade.tarek at gmail.com Wed Oct 20 15:36:23 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Wed, 20 Oct 2010 15:36:23 +0200 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: On Wed, Oct 20, 2010 at 11:57 AM, Paul Moore wrote: > On 19 October 2010 22:26, Tarek Ziad? wrote: >> There's one feature I want to add in distutils2: the develop command >> setuptools provides. Basically it adds a "link" file into >> site-packages, and does some magic at startup to load the path that is >> contained in the link file. 
The use case is to be able to have a >> project added in the python path without installing it. > > Can you explain the requirement in more detail? I don't use the > setuptools develop command, so I don't have the background, but it > seems to me that what you're proposing can be done simply by adding > the relevant directory to PYTHONPATH. That's all I ever do when > developing (but my needs are pretty simple, so there may well be > subtle problems with that approach). Sorry that was vague indeed. It goes a little bit farther than than: the project packages and modules have to be found in the path, but we also need to publish the project metadata that would be installed in a normal installation, so our browsing/query APIs can find the project. So, if a project 'Boo' has two packages 'foo' and 'bar' and a module 'baz.py', we need those in the path but also the Boo.dist-info directory that is created at installation time (see PEP 376). Setuptools' metadata directory is called Boo.egg-info, and distutils 1 has a file called Boo.egg-info since python 2.5 And since a python project can publish several top level directories, all of them needs to be added in the path. so adding the current dir to PYTHONPATH will not work in every case even if the metadata are built and dropped there. I am not sure what would be the best way to handle this, maybe having these metadata built in place, then listing all the paths that need to be included and write them to a .pth file Distutils2 manage. So: 0. have a distutils2.pth file installed with distutils2 Then, to add the project in the path: 1. build the project metadata in-place 2. get the project paths by listing its packages and directories (by invoking a pseudo-install command) 3. inject these paths in distutils2.pth To remove it: 1. get the project paths by listing its packages and directories 2. remove these paths from distutils2.pth Another problem I see is that any module or package that is not listed by the project and that would not be installed in the site-packages might be added in the path, but that's probably not a huge issue. The goal is to be able to avoid re-installing a project you are working on to try it, every time you make a change. This is used a lot, and in particular with virtualenv. So in any case, it turns out .pth files are a good way to do this so I guess this thread does not belong to python-ideas anymore. Cross-posting to the D2 Mailing list to move it there ! Tarek -- Tarek Ziad? | http://ziade.org From ncoghlan at gmail.com Wed Oct 20 15:38:56 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 20 Oct 2010 23:38:56 +1000 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: On Wed, Oct 20, 2010 at 7:57 PM, Paul Moore wrote: > On 19 October 2010 22:26, Tarek Ziad? wrote: >> There's one feature I want to add in distutils2: the develop command >> setuptools provides. Basically it adds a "link" file into >> site-packages, and does some magic at startup to load the path that is >> contained in the link file. The use case is to be able to have a >> project added in the python path without installing it. > > Can you explain the requirement in more detail? I don't use the > setuptools develop command, so I don't have the background, but it > seems to me that what you're proposing can be done simply by adding > the relevant directory to PYTHONPATH. 
That's all I ever do when > developing (but my needs are pretty simple, so there may well be > subtle problems with that approach). A different idea along these lines that I've been pondering is an actual -p path option for the interpreter command line, that allowed a sequence of directories to be provided that would be prepended to PYTHONPATH (and hence included in sys.path). So if you're wanting to test two different versions of a module (from a parent directory containing the two versions in separate subdirectories): python -p versionA run_tests.py python -p versionB run_tests.py For more permanent additions to sys.path, PYTHONPATH (possibly in conjunction with virtualenv) is reasonable answer. Zipfile and directory execution covers execution of more complex applications containing multiple files as if they were simple scripts. The main piece I see missing from the puzzle is the ability to easily switch back and forth between multiple versions of a support package or library without mucking with persistent state like the environment variables or the filesystem. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From p.f.moore at gmail.com Wed Oct 20 16:00:54 2010 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 20 Oct 2010 15:00:54 +0100 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: On 20 October 2010 14:36, Tarek Ziad? wrote: > It goes a little bit farther than than: the project packages and > modules have to be found in the path, but we also need to publish the > project metadata that would be installed in a normal installation, so > our browsing/query APIs can find the project. Maybe I'm still missing something, but are you saying that the metadata query APIs don't respect PYTHONPATH? Is there are reason why they can't? > So, if a project 'Boo' has two packages 'foo' and 'bar' and a module > 'baz.py', we need those in the path but also the Boo.dist-info > directory that is created at installation time (see PEP 376). > Setuptools' metadata directory is called Boo.egg-info, and distutils 1 > has a file called Boo.egg-info since python 2.5 ... and I'd expect the dist-info directory to be located by searching PYTHONPATH > And since a python project can publish several top level directories, > all of them needs to be added in the path. so adding the current dir > to PYTHONPATH will not work in every case even if the metadata are > built and dropped there. So, project Foo publishes packages bar and baz. MyDir Foo __init__.py bar __init__.py baz __init__.py Foo-N.M-pyx.y.dist-info (Is that right? I'm rusty on the structure. That's how it looks in Python 2.7) So the directory MyDir is on PYTHONPATH. Then Foo.bar and Foo.baz are visible, and the dist-info file is on PYTHONPATH for introspection. If you're saying that Foo *isn't* a package itself, so Foo/__init__.py doesn't exist, and bar and baz should be visible unqualified, then I begin to see your issue (although my first reaction is to say "don't do that, then" :-)). But don't you then just need to search *parents* of elements of PYTHONPATH as well for the metadata search? If that's an issue then doesn't that mean you've got other problems with how people structure their directories? Actually, I suspect my picture above is wrong, as I can't honestly see that mandating that the dist-info file be a *sibling* (in an arbitrarily cluttered directory) of the project directory, is sensible... But I'm probably not seeing the real issues here. 
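Drawn out as a tree, the layout sketched above would look roughly like this (the names and the N.M / pyx.y placeholders are Paul's, and as he says the exact dist-info naming may be slightly off):

    MyDir/                          <- this directory is on PYTHONPATH
        Foo/
            __init__.py
            bar/
                __init__.py
            baz/
                __init__.py
        Foo-N.M-pyx.y.dist-info/    <- sibling of the package directory

With MyDir on the path, Foo.bar and Foo.baz are importable and the dist-info directory sits directly on a sys.path entry where metadata tools can find it.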
All I would say is, don't let the needs of more unusual configurations over-complicate basic usage. Paul. From ziade.tarek at gmail.com Wed Oct 20 16:27:21 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Wed, 20 Oct 2010 16:27:21 +0200 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: On Wed, Oct 20, 2010 at 4:00 PM, Paul Moore wrote: ... > > If you're saying that Foo *isn't* a package itself, so Foo/__init__.py > doesn't exist, and bar and baz should be visible unqualified, then I > begin to see your issue (although my first reaction is to say "don't > do that, then" :-)). But don't you then just need to search *parents* > of elements of PYTHONPATH as well for the metadata search? If that's > an issue then doesn't that mean you've got other problems with how > people structure their directories? Actually, I suspect my picture > above is wrong, as I can't honestly see that mandating that the > dist-info file be a *sibling* (in an arbitrarily cluttered directory) > of the project directory, is sensible... yeah that the main issue: we can't make assumptions on how the source tree looks in the project, so adding the root path will not work all the time. Some people even have two separate root packages. Which is not a good layout, but allowed.. In Zope, I think the convention is to use a src/ directory so that's another level. Since distutils1 and distutils2 will let you provide in their options a list of packages and modules, I think it's the only sane way to get a list of paths we can then add in the path. > > But I'm probably not seeing the real issues here. > > All I would say is, don't let the needs of more unusual configurations > over-complicate basic usage. The trouble is: adding in PYTHONPATH the root of the source of your project can be different from what it would be once installed in Python. Now the question is: if 90% of the projects out there would work by adding the root, then this is might be overkill. I am afraid it's way less though... Tarek -- Tarek Ziad? | http://ziade.org From ianb at colorstudy.com Wed Oct 20 18:02:03 2010 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 20 Oct 2010 11:02:03 -0500 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: On Wed, Oct 20, 2010 at 8:36 AM, Tarek Ziad? wrote: > So, if a project 'Boo' has two packages 'foo' and 'bar' and a module > 'baz.py', we need those in the path but also the Boo.dist-info > directory that is created at installation time (see PEP 376). > Setuptools' metadata directory is called Boo.egg-info, and distutils 1 > has a file called Boo.egg-info since python 2.5 > So do it the same way as Setuptools -- setup.py egg_info writes the info to the root of the packages (which might be src/ for some libraries) and when that is added to the path, then the directory will be scanned and the metadata found. And setup.py develop calls egg_info. Replace egg with dist and it's all good, right? -- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ziade.tarek at gmail.com Wed Oct 20 18:13:20 2010 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Wed, 20 Oct 2010 18:13:20 +0200 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: On Wed, Oct 20, 2010 at 6:02 PM, Ian Bicking wrote: > On Wed, Oct 20, 2010 at 8:36 AM, Tarek Ziad? 
wrote: >> >> So, if a project 'Boo' has two packages 'foo' and 'bar' and a module >> 'baz.py', we need those in the path but also the Boo.dist-info >> directory that is created at installation time (see PEP 376). >> Setuptools' metadata directory is called Boo.egg-info, and distutils 1 >> has a file called Boo.egg-info since python 2.5 > > So do it the same way as Setuptools -- setup.py egg_info writes the info to > the root of the packages (which might be src/ for some libraries) and when > that is added to the path, then the directory will be scanned and the > metadata found.? And setup.py develop calls egg_info.? Replace egg with dist > and it's all good, right? Not quite, since packages can be located in other (and several) places than directly there. (See my answer to Paul) So I am trying to write this options_to_paths() code to see how things can work > -- > Ian Bicking? |? http://blog.ianbicking.org > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -- Tarek Ziad? | http://ziade.org From rrr at ronadam.com Wed Oct 20 20:46:42 2010 From: rrr at ronadam.com (Ron Adam) Date: Wed, 20 Oct 2010 13:46:42 -0500 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: <4CBF3912.5050009@ronadam.com> On 10/20/2010 08:38 AM, Nick Coghlan wrote: > On Wed, Oct 20, 2010 at 7:57 PM, Paul Moore wrote: >> On 19 October 2010 22:26, Tarek Ziad? wrote: >>> There's one feature I want to add in distutils2: the develop command >>> setuptools provides. Basically it adds a "link" file into >>> site-packages, and does some magic at startup to load the path that is >>> contained in the link file. The use case is to be able to have a >>> project added in the python path without installing it. >> >> Can you explain the requirement in more detail? I don't use the >> setuptools develop command, so I don't have the background, but it >> seems to me that what you're proposing can be done simply by adding >> the relevant directory to PYTHONPATH. That's all I ever do when >> developing (but my needs are pretty simple, so there may well be >> subtle problems with that approach). > > A different idea along these lines that I've been pondering is an > actual -p path option for the interpreter command line, that allowed a > sequence of directories to be provided that would be prepended to > PYTHONPATH (and hence included in sys.path). > > So if you're wanting to test two different versions of a module (from > a parent directory containing the two versions in separate > subdirectories): > > python -p versionA run_tests.py > python -p versionB run_tests.py > > For more permanent additions to sys.path, PYTHONPATH (possibly in > conjunction with virtualenv) is reasonable answer. Zipfile and > directory execution covers execution of more complex applications > containing multiple files as if they were simple scripts. > > The main piece I see missing from the puzzle is the ability to easily > switch back and forth between multiple versions of a support package > or library without mucking with persistent state like the environment > variables or the filesystem. Yes, I don't like changing the system wide environment variables and file system options. It's too easy to break other things that depend on them. How about adding the ability to use a .pth file from the current program directory? 
Ron From ianb at colorstudy.com Wed Oct 20 21:20:40 2010 From: ianb at colorstudy.com (Ian Bicking) Date: Wed, 20 Oct 2010 14:20:40 -0500 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: On Wed, Oct 20, 2010 at 9:27 AM, Tarek Ziad? wrote: > On Wed, Oct 20, 2010 at 4:00 PM, Paul Moore wrote: > ... > > > > If you're saying that Foo *isn't* a package itself, so Foo/__init__.py > > doesn't exist, and bar and baz should be visible unqualified, then I > > begin to see your issue (although my first reaction is to say "don't > > do that, then" :-)). But don't you then just need to search *parents* > > of elements of PYTHONPATH as well for the metadata search? If that's > > an issue then doesn't that mean you've got other problems with how > > people structure their directories? Actually, I suspect my picture > > above is wrong, as I can't honestly see that mandating that the > > dist-info file be a *sibling* (in an arbitrarily cluttered directory) > > of the project directory, is sensible... > > yeah that the main issue: we can't make assumptions on how the source > tree looks in the project, so adding the root path will not work all > the time. Some people even have two separate root packages. Which is > not a good layout, but allowed.. In Zope, I think the convention is to > use a src/ directory so that's another level. > Setuptools puts the files in the src/ directory in that case. More complicated layouts simply aren't supported, and generally no one complains as more complicated layouts are uncommon and a sign someone's head is somewhere very different than where they would be if they were using setup.py develop. -- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From flub at devork.be Thu Oct 21 01:35:37 2010 From: flub at devork.be (Floris Bruynooghe) Date: Thu, 21 Oct 2010 00:35:37 +0100 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: [sorry, forgot to include the list address before] Hi On 20 October 2010 15:27, Tarek Ziad? wrote: > On Wed, Oct 20, 2010 at 4:00 PM, Paul Moore wrote: >> But I'm probably not seeing the real issues here. >> >> All I would say is, don't let the needs of more unusual configurations >> over-complicate basic usage. > > The trouble is: adding in PYTHONPATH the root of the source of your > project can be different from what it would be once installed in > Python. Now the question is: if 90% of the projects out there would > work by adding the root, then this is might be overkill. I am afraid > it's way less though... I've read your and Ian's responses and still don't understand what setup.py develop brings to the party which can't be done with simple PYTHONPATH. Excuse me if I also completely misunderstand what develop does but it sounds like it's going to add an in-development version of a project on a users's sys.path (at the front?) until it's undone again somehow (is there a "setup.py undevelop"?). This just seems dangerous to me since it will affect all python programs run by that user. If I understand correctly this whole "develop" dance is for when you have two inter-depended packages in development at the same time. 
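For readers not familiar with the mechanism under discussion: a .pth file is plain text placed in a site directory (such as site-packages). At startup, site.py appends each existing directory listed in it to sys.path, skipping blank lines and lines starting with '#', and executing lines that start with 'import'. A sketch with made-up file names and paths:

    # develop.pth -- dropped into site-packages (hypothetical file).
    # Every line that is not a comment or an import line is appended to
    # sys.path for *every* program run with this interpreter.
    /home/me/src/myproject
    /home/me/src/otherproject/src

    # The PYTHONPATH alternative only affects processes that inherit the
    # environment variable, and its entries come before site-packages:
    PYTHONPATH=/home/me/src/myproject:/home/me/src/otherproject/src python run_tests.py

That difference -- persistent and interpreter-wide versus per-environment -- is essentially the trade-off being argued about here.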
If manually setting PYTHONPATH correctly in this situation is too complicated then my feeling is there's nothing wrong with some sort of helper which manipulates PYTHONPATH for you, something like spaw a new shell and set the environment in that correctly. But placing things in files makes this permanent for the user and just seems the wrong way to go to me. Again, apologies if I understand the problem wrongly. But I too am worried about too many complexities and "magic". One of my main issues with setuptools is that it tries to handle my python environment (sys.path) outside of normally expected python mechanisms by modifying various custom files. I would hate to see distutils2 repeat this. Regards Floris -- Debian GNU/Linux -- The Power of Freedom www.debian.org | www.gnu.org | www.kernel.org From ncoghlan at gmail.com Thu Oct 21 04:32:32 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 21 Oct 2010 12:32:32 +1000 Subject: [Python-ideas] Add a command line option to adjust sys.path? (was Re: Add a site.cfg to keep a persistent list of paths) Message-ID: On Thu, Oct 21, 2010 at 4:46 AM, Ron Adam wrote: > > > On 10/20/2010 08:38 AM, Nick Coghlan wrote: >> A different idea along these lines that I've been pondering is an >> actual -p path option for the interpreter command line, that allowed a >> sequence of directories to be provided that would be prepended to >> PYTHONPATH (and hence included in sys.path). >> >> So if you're wanting to test two different versions of a module (from >> a parent directory containing the two versions in separate >> subdirectories): >> >> python -p versionA run_tests.py >> python -p versionB run_tests.py >> >> For more permanent additions to sys.path, PYTHONPATH (possibly in >> conjunction with virtualenv) is reasonable answer. Zipfile and >> directory execution covers execution of more complex applications >> containing multiple files as if they were simple scripts. >> >> The main piece I see missing from the puzzle is the ability to easily >> switch back and forth between multiple versions of a support package >> or library without mucking with persistent state like the environment >> variables or the filesystem. > > Yes, I don't like changing the system wide environment variables and file > system options. It's too easy to break other things that depend on them. > > How about adding the ability to use a .pth file from the current program > directory? A simple check to see if a supplied path was a directory or not would let us do both with one new option: -p Specify a directory or a .pth file (see site module docs) to be prepended to sys.path distutils2 could then provide a way to generate an appropriate .pth file instead of installing a distribution. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From rrr at ronadam.com Thu Oct 21 06:36:01 2010 From: rrr at ronadam.com (Ron Adam) Date: Wed, 20 Oct 2010 23:36:01 -0500 Subject: [Python-ideas] Add a command line option to adjust sys.path? 
(was Re: Add a site.cfg to keep a persistent list of paths) In-Reply-To: References: Message-ID: <4CBFC331.8020309@ronadam.com> On 10/20/2010 09:32 PM, Nick Coghlan wrote: > On Thu, Oct 21, 2010 at 4:46 AM, Ron Adam wrote: >> >> >> On 10/20/2010 08:38 AM, Nick Coghlan wrote: >>> A different idea along these lines that I've been pondering is an >>> actual -p path option for the interpreter command line, that allowed a >>> sequence of directories to be provided that would be prepended to >>> PYTHONPATH (and hence included in sys.path). >>> >>> So if you're wanting to test two different versions of a module (from >>> a parent directory containing the two versions in separate >>> subdirectories): >>> >>> python -p versionA run_tests.py >>> python -p versionB run_tests.py >>> >>> For more permanent additions to sys.path, PYTHONPATH (possibly in >>> conjunction with virtualenv) is reasonable answer. Zipfile and >>> directory execution covers execution of more complex applications >>> containing multiple files as if they were simple scripts. >>> >>> The main piece I see missing from the puzzle is the ability to easily >>> switch back and forth between multiple versions of a support package >>> or library without mucking with persistent state like the environment >>> variables or the filesystem. >> >> Yes, I don't like changing the system wide environment variables and file >> system options. It's too easy to break other things that depend on them. >> >> How about adding the ability to use a .pth file from the current program >> directory? > > A simple check to see if a supplied path was a directory or not would > let us do both with one new option: > > -p Specify a directory or a .pth file (see site module docs) to be > prepended to sys.path Prepending would be great. ;-) > distutils2 could then provide a way to generate an appropriate .pth > file instead of installing a distribution. Where would the .pth file be and how would I run the application if I don't know I need to specify a .pth file? How would I know I need to specify a .pth file? (ie... if I'm trying to figure out what is wrong on some one else's computer.) If you have a default .pth file in the same directory as the .py file being run, then that would give a way to specify an alternative or local library of modules and packages that is program specific without doing anything special. It would be included in the distribution files as well, so distuitils2 doesn't have to generate anything. +1 on the -p option with .pth files also. Can .pth files use environment variables? Cheers, Ron From ncoghlan at gmail.com Thu Oct 21 08:43:15 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 21 Oct 2010 16:43:15 +1000 Subject: [Python-ideas] Add a command line option to adjust sys.path? (was Re: Add a site.cfg to keep a persistent list of paths) In-Reply-To: <4CBFC331.8020309@ronadam.com> References: <4CBFC331.8020309@ronadam.com> Message-ID: On Thu, Oct 21, 2010 at 2:36 PM, Ron Adam wrote: > Where would the .pth file be and how would I run the application if I don't > know I need to specify a .pth file? ?How would I know I need to specify a > .pth file? ?(ie... if I'm trying to figure out what is wrong on some one > else's computer.) This idea is only aimed at developers. To run an actual Python application that needs additional modules, either install it properly or put it in a zipfile or directory, put a __main__.py at the top level and just run the zipfile/directory directly. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? 
|?? Brisbane, Australia From p.f.moore at gmail.com Thu Oct 21 13:21:56 2010 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 21 Oct 2010 12:21:56 +0100 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: On 21 October 2010 00:35, Floris Bruynooghe wrote: > I've read your and Ian's responses and still don't understand what > setup.py develop brings to the party which can't be done with simple > PYTHONPATH. I'm glad it's not just me! > Again, apologies if I understand the problem wrongly. ?But I too am > worried about too many complexities and "magic". ?One of my main > issues with setuptools is that it tries to handle my python > environment (sys.path) outside of normally expected python mechanisms > by modifying various custom files. ?I would hate to see distutils2 > repeat this. I think that the key issue here is that PEP 376 introduces the idea of a "distribution" which is a somewhat vaguely defined concept, which can contain one or more packages or modules. Distributions don't have a well-defined directory structure, and don't participate properly in Python's standard import mechanism (PEP 302, PYTHONPATH, all that stuff). The distribution metadata (dist-info directory) is not package-based, and so doesn't fit the model. Suggestions: 1. PEP 376 clearly defines what a "distribution" (installed or otherwise) is, in terms of directory structure, whether/how it supports PEP302-style non-filesystem access, etc. I don't see a reason here why we can't mandate some structure, rather than leaving things as a "free for all" like the current setuptools/adhoc approach. 2. Mechanisms for dealing with distributions are *only* discussed in terms of the PEP 376 definitions, so we have a common understanding. As a first cut, I'd say that a distribution is defined purely in terms of its metadata (dist-info directory). On that basis, there should be a definition of where dist-info directories are searched for, PEP 376 seems to state that this is only in site-packages ("This PEP proposes an installation format inspired by one of the options in the EggFormats standard, the one that uses a distinct directory located in the site-packages directory."). And yet, this whole "develop" discussion seems to be about locating dist-info directories located elsewhere. Having said that, PEP 376 later states: get_distributions() -> iterator of Distribution instances. Provides an iterator that looks for .dist-info directories in sys.path and returns Distribution instances for each one of them. This implies dist-info directories are searched for in sys.path. OK, fine. That's broader than just site-packages, but still well-defined and acceptable. And that's where I get my expectations that manipulating PYTHONPATH should work. So what's this directory structure we're talking about with Foo containing two packages, and Foo.dist-info being alongside Foo? Foo itself isn't on PYTHONPATH, so why should Foo.dist-info be found at all? Based on PEP 376, it's not meant to be found. Maybe if this *is* a requirement, it needs a change to PEP 376, which I guess means the PEP discussion and approval process needs to be gone through again. I, for one, would be OK with that, as I remain to be convinced that the complexity and confusion is worth it. Paul. 
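To make the quoted get_distributions() wording concrete, here is a minimal sketch of that search. It is an illustration only, not the distutils2 implementation: a real version would also read the metadata inside each directory and cope with zipped or PEP 302 path entries rather than only real directories.

    import os
    import sys

    def iter_dist_info_dirs():
        """Yield *.dist-info directories found directly on sys.path entries."""
        for entry in sys.path:
            if not os.path.isdir(entry):
                continue
            for name in sorted(os.listdir(entry)):
                if name.endswith(".dist-info"):
                    yield os.path.join(entry, name)

    for path in iter_dist_info_dirs():
        print(path)

Only directories listed directly on sys.path are examined, which is why the thread keeps coming back to what exactly ends up on the path during development.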
From doug.hellmann at gmail.com Thu Oct 21 14:14:46 2010 From: doug.hellmann at gmail.com (Doug Hellmann) Date: Thu, 21 Oct 2010 08:14:46 -0400 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: <884AE57C-ED90-4F10-8C26-32EE5E48B94A@gmail.com> On Oct 21, 2010, at 7:21 AM, Paul Moore wrote: > On 21 October 2010 00:35, Floris Bruynooghe wrote: > >> I've read your and Ian's responses and still don't understand what >> setup.py develop brings to the party which can't be done with simple >> PYTHONPATH. > > I'm glad it's not just me! Using develop does more than just modify the import path. It also generates the meta data, such as entry points, and re-generates any console scripts defined by my setup.py so that they point to the version of code in the sandbox. After I run develop, any Python process on the system using the same python interpreter will run the code in my sandbox instead of the version "installed" in site-packages. That includes any of the command line programs or plugins defined in my setup.py, and even applies to processes that don't run as my user. I use these features every day, since our application depends on a few daemons that run as root (it's a system management app, so it needs root privileges to do almost anything interesting). Doug From solipsis at pitrou.net Thu Oct 21 14:17:50 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 21 Oct 2010 14:17:50 +0200 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths References: Message-ID: <20101021141750.43a0c78d@pitrou.net> On Thu, 21 Oct 2010 12:21:56 +0100 Paul Moore wrote: > On 21 October 2010 00:35, Floris Bruynooghe wrote: > > > I've read your and Ian's responses and still don't understand what > > setup.py develop brings to the party which can't be done with simple > > PYTHONPATH. > > I'm glad it's not just me! How does PYTHONPATH work with C extensions? Besides, how do you manage your PYTHONPATH when you have multiple packages in "develop" mode, depending on each other? Regards Antoine. From benjamin at python.org Thu Oct 21 16:06:24 2010 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 21 Oct 2010 14:06:24 +0000 (UTC) Subject: [Python-ideas] New 3.x restriction on number of keyword arguments References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> Message-ID: Raymond Hettinger writes: > > One of the use cases for named tuples is to have them be automatically created from a SQL query or CSV header. > Sometimes (but not often), those can have a huge number of columns. In Python 2.x, it worked just fine -- we > had a test for a named tuple with 5000 fields. In Python 3.x, there is a SyntaxError when there are more than > 255 fields. I'm not sure why you think this is new. It's been true from at least 2.5 as far as I can see. From ianb at colorstudy.com Thu Oct 21 16:44:50 2010 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 21 Oct 2010 09:44:50 -0500 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: References: Message-ID: On Wed, Oct 20, 2010 at 6:35 PM, Floris Bruynooghe wrote: > > On 20 October 2010 15:27, Tarek Ziad? wrote: > > On Wed, Oct 20, 2010 at 4:00 PM, Paul Moore wrote: > > >> But I'm probably not seeing the real issues here. > >> > >> All I would say is, don't let the needs of more unusual configurations > >> over-complicate basic usage. 
> > > > The trouble is: adding in PYTHONPATH the root of the source of your > > project can be different from what it would be once installed in > > Python. Now the question is: if 90% of the projects out there would > > work by adding the root, then this is might be overkill. I am afraid > > it's way less though... > > I've read your and Ian's responses and still don't understand what > setup.py develop brings to the party which can't be done with simple > PYTHONPATH. Excuse me if I also completely misunderstand what develop > does but it sounds like it's going to add an in-development version of > a project on a users's sys.path (at the front?) until it's undone > again somehow (is there a "setup.py undevelop"?). pip uninstall would unlink it (pip install -e calls setup.py develop as well). setup.py develop is persistent unlike PYTHONPATH. > This just seems > dangerous to me since it will affect all python programs run by that > user. > Hence virtualenv, which solves your other concerns. > If I understand correctly this whole "develop" dance is for when you > have two inter-depended packages in development at the same time. If > manually setting PYTHONPATH correctly in this situation is too > complicated then my feeling is there's nothing wrong with some sort of > helper which manipulates PYTHONPATH for you, something like spaw a new > shell and set the environment in that correctly. But placing things > in files makes this permanent for the user and just seems the wrong > way to go to me. > > Again, apologies if I understand the problem wrongly. But I too am > worried about too many complexities and "magic". One of my main > issues with setuptools is that it tries to handle my python > environment (sys.path) outside of normally expected python mechanisms > by modifying various custom files. I would hate to see distutils2 > repeat this. > Note if you use pip, it uses setuptools in a way where only setup.py develop uses .pth files, and otherwise the path is similar to how it is with distutils alone (except with that extra metadata, as Doug mentions). -- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.f.moore at gmail.com Thu Oct 21 16:56:04 2010 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 21 Oct 2010 15:56:04 +0100 Subject: [Python-ideas] Add a site.cfg to keep a persistent list of paths In-Reply-To: <884AE57C-ED90-4F10-8C26-32EE5E48B94A@gmail.com> References: <884AE57C-ED90-4F10-8C26-32EE5E48B94A@gmail.com> Message-ID: On 21 October 2010 13:14, Doug Hellmann wrote: > > On Oct 21, 2010, at 7:21 AM, Paul Moore wrote: > >> On 21 October 2010 00:35, Floris Bruynooghe wrote: >> >>> I've read your and Ian's responses and still don't understand what >>> setup.py develop brings to the party which can't be done with simple >>> PYTHONPATH. >> >> I'm glad it's not just me! > > Using develop does more than just modify the import path. > > It also generates the meta data, such as entry points, and re-generates any console scripts defined by > my setup.py so that they point to the version of code in the sandbox. ?After I run develop, any Python > process on the system using the same python interpreter will run the code in my sandbox instead of the > version "installed" in site-packages. ?That includes any of the command line programs or plugins defined > in my setup.py, and even applies to processes that don't run as my user. 
> > I use these features every day, since our application depends on a few daemons that run as root (it's a > system management app, so it needs root privileges to do almost anything interesting). Note - my understanding is that this discussion is about metadata discovery for distutils2, *not* about setuptools' develop feature (which AIUI does far more than is being proposed at the moment). Specifically, I thought we were just talking about metadata here. As far as this discussion goes, entry points and console scripts aren't included. That's not to say they aren't useful, just that they are a separate discussion. In case it's not obvious, I'm a strong -1 on simply importing setuptools functionality into distutils2 wholesale, without discussion/review. Paul. From g.brandl at gmx.net Thu Oct 21 17:07:42 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 21 Oct 2010 17:07:42 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> Message-ID: Am 21.10.2010 16:06, schrieb Benjamin Peterson: > Raymond Hettinger writes: > >> >> One of the use cases for named tuples is to have them be automatically created > from a SQL query or CSV header. >> Sometimes (but not often), those can have a huge number of columns. In Python > 2.x, it worked just fine -- we >> had a test for a named tuple with 5000 fields. In Python 3.x, there is a > SyntaxError when there are more than >> 255 fields. > > I'm not sure why you think this is new. It's been true from at least 2.5 as far > as I can see. You must be talking of a different restriction. This snippet works fine in 2.7, but raises a SyntaxError in 3.1: exec("def f(" + ", ".join("a%d" % i for i in range(1000)) + "): pass") Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From mal at egenix.com Thu Oct 21 17:41:12 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 21 Oct 2010 17:41:12 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> Message-ID: <4CC05F18.3060607@egenix.com> Georg Brandl wrote: > Am 21.10.2010 16:06, schrieb Benjamin Peterson: >> Raymond Hettinger writes: >> >>> >>> One of the use cases for named tuples is to have them be automatically created >> from a SQL query or CSV header. >>> Sometimes (but not often), those can have a huge number of columns. In Python >> 2.x, it worked just fine -- we >>> had a test for a named tuple with 5000 fields. In Python 3.x, there is a >> SyntaxError when there are more than >>> 255 fields. >> >> I'm not sure why you think this is new. It's been true from at least 2.5 as far >> as I can see. > > You must be talking of a different restriction. This snippet works fine in > 2.7, but raises a SyntaxError in 3.1: > > exec("def f(" + ", ".join("a%d" % i for i in range(1000)) + "): pass") The AST code in 2.7 raises this error for function/method calls only. In 3.2, it also raises the error for function/method definitions. Looking at the AST code, the limitation appears somewhat arbitrary. 
There's no comment in the code suggesting a reason for the limit and it's still possible to pass in more arguments via *args and **kws - but without the built-in argument checking. Could someone provide some insight ? Note that it's not uncommon to have more than 255 possible function/method arguments in generated code, e.g. in database abstraction layers. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 21 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From alexander.belopolsky at gmail.com Thu Oct 21 18:13:49 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 21 Oct 2010 12:13:49 -0400 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <4CC05F18.3060607@egenix.com> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC05F18.3060607@egenix.com> Message-ID: On Thu, Oct 21, 2010 at 11:41 AM, M.-A. Lemburg wrote: .. > Looking at the AST code, the limitation appears somewhat arbitrary. > There's no comment in the code suggesting a reason for the limit and > it's still possible to pass in more arguments via *args and **kws - > but without the built-in argument checking. > > Could someone provide some insight ? > My understanding is that the limitation comes from bytecode generation phase, not AST. See also Guido's http://bugs.python.org/issue1636#msg58760. According to Python manual section for opcodes, CALL_FUNCTION(argc) Calls a function. The low byte of argc indicates the number of positional parameters, the high byte the number of keyword parameters. On the stack, the opcode finds the keyword parameters first. For each keyword argument, the value is on top of the key. Below the keyword parameters, the positional parameters are on the stack, with the right-most parameter on top. Below the parameters, the function object to call is on the stack. Pops all function arguments, and the function itself off the stack, and pushes the return value. http://docs.python.org/dev/py3k/library/dis.html?highlight=opcode#opcode-CALL_FUNCTION From mal at egenix.com Thu Oct 21 19:31:48 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 21 Oct 2010 19:31:48 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC05F18.3060607@egenix.com> Message-ID: <4CC07904.6070100@egenix.com> Alexander Belopolsky wrote: > On Thu, Oct 21, 2010 at 11:41 AM, M.-A. Lemburg wrote: > .. >> Looking at the AST code, the limitation appears somewhat arbitrary. >> There's no comment in the code suggesting a reason for the limit and >> it's still possible to pass in more arguments via *args and **kws - >> but without the built-in argument checking. >> >> Could someone provide some insight ? >> > > My understanding is that the limitation comes from bytecode generation > phase, not AST. > > See also Guido's http://bugs.python.org/issue1636#msg58760. 
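A toy illustration of the bytecode-level packing referred to here (not CPython source): CALL_FUNCTION's operand carries the positional count in its low byte and the keyword count in its high byte, which is where a 255-per-kind ceiling arises at the bytecode level.

    # Toy illustration only -- how CALL_FUNCTION's argc packs two counts
    # into one operand (low byte positional, high byte keyword).
    def pack_argc(n_positional, n_keyword):
        assert n_positional <= 255 and n_keyword <= 255  # one byte each
        return n_positional | (n_keyword << 8)

    def unpack_argc(argc):
        return argc & 0xFF, argc >> 8

    print(pack_argc(2, 1))      # -> 258
    print(unpack_argc(258))     # -> (2, 1)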
> > According to Python manual section for opcodes, > > CALL_FUNCTION(argc) > > Calls a function. The low byte of argc indicates the number of > positional parameters, the high byte the number of keyword parameters. > On the stack, the opcode finds the keyword parameters first. For each > keyword argument, the value is on top of the key. Below the keyword > parameters, the positional parameters are on the stack, with the > right-most parameter on top. Below the parameters, the function object > to call is on the stack. Pops all function arguments, and the function > itself off the stack, and pushes the return value. > > http://docs.python.org/dev/py3k/library/dis.html?highlight=opcode#opcode-CALL_FUNCTION Thanks for the insight. Even with the one byte per position and keywords arguments limitation imposed by the byte code, the checks in ast.c are a bit too simple, since they apply a limit on the sum of positional and keyword args, whereas the byte code and VM can deal with up to 255 positional and 255 keyword arguments. if (nposargs + nkwonlyargs > 255) { ast_error(n, "more than 255 arguments"); return NULL; } I think this should be: if (nposargs > 255) ast_error(n, "more than 255 positional arguments"); return NULL; } if (nkwonlyargs > 255) ast_error(n, "more than 255 keyword arguments"); return NULL; } There's a patch somewhere that turns Python's VM into a 16 or 32-bit byte code machine. Perhaps it's time to have a look at that again. Do other Python implementations have such limitations ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 21 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From alexander.belopolsky at gmail.com Thu Oct 21 19:36:48 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 21 Oct 2010 13:36:48 -0400 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <4CC07904.6070100@egenix.com> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC05F18.3060607@egenix.com> <4CC07904.6070100@egenix.com> Message-ID: On Thu, Oct 21, 2010 at 1:31 PM, M.-A. Lemburg wrote: .. > There's a patch somewhere that turns Python's VM into a 16 or > 32-bit byte code machine. Perhaps it's time to have a look at that > again. > This sounds like a reference to wpython: http://code.google.com/p/wpython/ I hope 255 argument limitation can be removed by simpler means. From mal at egenix.com Thu Oct 21 19:46:06 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 21 Oct 2010 19:46:06 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC05F18.3060607@egenix.com> <4CC07904.6070100@egenix.com> Message-ID: <4CC07C5E.4060907@egenix.com> Alexander Belopolsky wrote: > On Thu, Oct 21, 2010 at 1:31 PM, M.-A. Lemburg wrote: > .. >> There's a patch somewhere that turns Python's VM into a 16 or >> 32-bit byte code machine. Perhaps it's time to have a look at that >> again. 
>> > > This sounds like a reference to wpython: > > http://code.google.com/p/wpython/ Indeed. That's what I was thinking of. > I hope 255 argument limitation can be removed by simpler means. Probably, but why not take this as a chance to improve other aspects of the CPython VM as well ? Here's a presentation by Cesare Di Mauro, the author of the patch: http://wpython.googlecode.com/files/Beyond%20Bytecode%20-%20A%20Wordcode-based%20Python.pdf -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 21 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From cesare.di.mauro at gmail.com Thu Oct 21 19:56:57 2010 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Thu, 21 Oct 2010 19:56:57 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <4CC07C5E.4060907@egenix.com> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC05F18.3060607@egenix.com> <4CC07904.6070100@egenix.com> <4CC07C5E.4060907@egenix.com> Message-ID: Hi Marc > I hope 255 argument limitation can be removed by simpler means. Probably, but why not take this as a chance to improve other > aspects of the CPython VM as well ? > > Here's a presentation by Cesare Di Mauro, the author of the > patch: > > > http://wpython.googlecode.com/files/Beyond%20Bytecode%20-%20A%20Wordcode-based%20Python.pdf > > -- > Marc-Andre Lemburg > eGenix.com > This presentation was made for wpython 1.0 alpha, which was the first release I made. Last year I released the second (and last), wpython 1.1, which carries several other changes and optimizations. You can find the new project here: http://code.google.com/p/wpython2/ and the presentation here: http://wpython2.googlecode.com/files/Cleanup%20and%20new%20optimizations%20in%20WPython%201.1.pdf Cesare -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin at python.org Thu Oct 21 20:15:55 2010 From: benjamin at python.org (Benjamin Peterson) Date: Thu, 21 Oct 2010 18:15:55 +0000 (UTC) Subject: [Python-ideas] =?utf-8?q?New_3=2Ex_restriction_on_number_of_keywo?= =?utf-8?q?rd=09arguments?= References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> Message-ID: Georg Brandl writes: > You must be talking of a different restriction. I assumed Raymond was talking about calling a function with > 255 args. From g.brandl at gmx.net Thu Oct 21 22:08:46 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 21 Oct 2010 22:08:46 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> Message-ID: Am 21.10.2010 20:15, schrieb Benjamin Peterson: > Georg Brandl writes: > >> You must be talking of a different restriction. > > I assumed Raymond was talking about calling a function with > 255 args. And I assumed Raymond was talking about defining a function with > 255 args. Whatever, both instances should be fixed. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. 
No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From cesare.di.mauro at gmail.com Fri Oct 22 09:18:01 2010 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Fri, 22 Oct 2010 09:18:01 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> Message-ID: 2010/10/21 Benjamin Peterson > Georg Brandl writes: > > > You must be talking of a different restriction. > > I assumed Raymond was talking about calling a function with > 255 args. > I think that having max 255 args and 255 kwargs is a good and reasonable limit which we can live on, and helps the virtual machine implementation (and implementors :P). Python won't lose its "power" and "generality" if one VM (albeit the "mainstream" / "official" one) have some limits. We already have some other ones, such as max 65536 constants, names, globals and locals. Another one is the maximum 20 blocks for code object. Who thinks that such limits must be removed? I think that having more than 255 arguments for a function call is a very rare case for which a workaround (may be passing a tuple/list or a dictionary) can be a better solution than having to introduce a brand new opcode to handle it. Changing the current opcode(s) is a very bad idea, since common cases will slow down. Cesare -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Fri Oct 22 19:44:08 2010 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 22 Oct 2010 18:44:08 +0100 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> Message-ID: <4CC1CD68.4020507@mrabarnett.plus.com> On 22/10/2010 08:18, Cesare Di Mauro wrote: > 2010/10/21 Benjamin Peterson > > > Georg Brandl writes: > > > You must be talking of a different restriction. > > I assumed Raymond was talking about calling a function with > 255 args. > > > I think that having max 255 args and 255 kwargs is a good and reasonable > limit which we can live on, and helps the virtual machine implementation > (and implementors :P). > > Python won't lose its "power" and "generality" if one VM (albeit the > "mainstream" / "official" one) have some limits. > > We already have some other ones, such as max 65536 constants, names, > globals and locals. Another one is the maximum 20 blocks for code > object. Who thinks that such limits must be removed? > The BDFL thinks that 255 is too low. > I think that having more than 255 arguments for a function call is a > very rare case for which a workaround (may be passing a tuple/list or a > dictionary) can be a better solution than having to introduce a brand > new opcode to handle it. > > Changing the current opcode(s) is a very bad idea, since common cases > will slow down. 
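A minimal sketch of the limit and of the dictionary workaround mentioned above, assuming the CPython 3.1/3.2 behaviour described in this thread (the function f and the generated source string are purely illustrative):

def f(**keywords):
    return len(keywords)

# A literal call with 300 keyword arguments trips the compile-time check.
call_src = "f(" + ", ".join("k%d=%d" % (i, i) for i in range(300)) + ")"
try:
    compile(call_src, "<generated>", "eval")
except SyntaxError as exc:
    print(exc)        # more than 255 arguments

# The workaround: build a dict once and unpack it, which avoids the limit.
kwargs = dict(("k%d" % i, i) for i in range(300))
print(f(**kwargs))    # -> 300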
> From solipsis at pitrou.net Fri Oct 22 19:53:19 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Oct 2010 19:53:19 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1CD68.4020507@mrabarnett.plus.com> Message-ID: <20101022195319.5a5043f9@pitrou.net> On Fri, 22 Oct 2010 18:44:08 +0100 MRAB wrote: > On 22/10/2010 08:18, Cesare Di Mauro wrote: > > > > I think that having max 255 args and 255 kwargs is a good and reasonable > > limit which we can live on, and helps the virtual machine implementation > > (and implementors :P). > > > > Python won't lose its "power" and "generality" if one VM (albeit the > > "mainstream" / "official" one) have some limits. > > > > We already have some other ones, such as max 65536 constants, names, > > globals and locals. Another one is the maximum 20 blocks for code > > object. Who thinks that such limits must be removed? > > > The BDFL thinks that 255 is too low. The BDFL can propose a patch :) Cheers Antoine. From mal at egenix.com Fri Oct 22 20:35:18 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 22 Oct 2010 20:35:18 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> Message-ID: <4CC1D966.2080007@egenix.com> Cesare Di Mauro wrote: > 2010/10/21 Benjamin Peterson > >> Georg Brandl writes: >> >>> You must be talking of a different restriction. >> >> I assumed Raymond was talking about calling a function with > 255 args. >> > > I think that having max 255 args and 255 kwargs is a good and reasonable > limit which we can live on, and helps the virtual machine implementation > (and implementors :P). > > Python won't lose its "power" and "generality" if one VM (albeit the > "mainstream" / "official" one) have some limits. > > We already have some other ones, such as max 65536 constants, names, globals > and locals. Another one is the maximum 20 blocks for code object. Who thinks > that such limits must be removed? > > I think that having more than 255 arguments for a function call is a very > rare case for which a workaround (may be passing a tuple/list or a > dictionary) can be a better solution than having to introduce a brand new > opcode to handle it. It's certainly rare when writing applications by hand, but such limits can be reached with code generators wrapping external resources such as database query rows, spreadsheet rows, sensor data input, etc. We've had such a limit before (number of lines in a module) and that was raised for the same reason. > Changing the current opcode(s) is a very bad idea, since common cases will > slow down. I'm sure there are ways to avoid that, e.g. by using EXTENDED_ARG for such cases. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 22 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From qrczak at knm.org.pl Fri Oct 22 20:52:01 2010 From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk) Date: Fri, 22 Oct 2010 20:52:01 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> Message-ID: 2010/10/22 Cesare Di Mauro : > I think that having more than 255 arguments for a function call is a very > rare case for which a workaround (may be passing a tuple/list or a > dictionary) can be a better solution than having to introduce a brand new > opcode to handle it. It does not need a new opcode. The bytecode can create an argument tuple explicitly and pass it like it passes *args. -- Marcin Kowalczyk From cesare.di.mauro at gmail.com Fri Oct 22 22:31:10 2010 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Fri, 22 Oct 2010 22:31:10 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <4CC1D966.2080007@egenix.com> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> Message-ID: 2010/10/22 M.-A. Lemburg > Cesare Di Mauro wrote: > > I think that having more than 255 arguments for a function call is a very > > rare case for which a workaround (may be passing a tuple/list or a > > dictionary) can be a better solution than having to introduce a brand new > > opcode to handle it. > > It's certainly rare when writing applications by hand, but such > limits can be reached with code generators wrapping external resources > such as database query rows, spreadsheet rows, sensor data input, etc. > > We've had such a limit before (number of lines in a module) and that > was raised for the same reason. > > > Changing the current opcode(s) is a very bad idea, since common cases > will > > slow down. > > I'm sure there are ways to avoid that, e.g. by using EXTENDED_ARG > for such cases. > > -- > Marc-Andre Lemburg > eGenix.com > I've patched Python 3.2 alpha 3 with a rough solution using EXTENDED_ARG for CALL_FUNCTION* opcodes, raising the arguments and keywords limits to 65535 maximum. I hope it'll be enough. 
:)

In ast.c:

ast_for_arguments:
        if (nposargs > 65535 || nkwonlyargs > 65535) {
            ast_error(n, "more than 65535 arguments");
            return NULL;
        }

ast_for_call:
        if (nargs + ngens > 65535 || nkeywords > 65535) {
            ast_error(n, "more than 65535 arguments");
            return NULL;
        }

In compile.c:

opcode_stack_effect:
#define NARGS(o) (((o) & 0xff) + ((o) >> 8 & 0xff00) + 2*(((o) >> 8 & 0xff) + ((o) >> 16 & 0xff00)))
        case CALL_FUNCTION:
            return -NARGS(oparg);
        case CALL_FUNCTION_VAR:
        case CALL_FUNCTION_KW:
            return -NARGS(oparg)-1;
        case CALL_FUNCTION_VAR_KW:
            return -NARGS(oparg)-2;
#undef NARGS
#define NARGS(o) (((o) % 256) + 2*(((o) / 256) % 256))
        case MAKE_FUNCTION:
            return -NARGS(oparg) - ((oparg >> 16) & 0xffff);
        case MAKE_CLOSURE:
            return -1 - NARGS(oparg) - ((oparg >> 16) & 0xffff);
#undef NARGS

compiler_call_helper:
    int len;
    int code = 0;

    len = asdl_seq_LEN(args) + n;
    n = len & 0xff | (len & 0xff00) << 8;
    VISIT_SEQ(c, expr, args);
    if (keywords) {
        VISIT_SEQ(c, keyword, keywords);
        len = asdl_seq_LEN(keywords);
        n |= (len & 0xff | (len & 0xff00) << 8) << 8;
    }

In ceval.c:

PyEval_EvalFrameEx:
        TARGET_WITH_IMPL(CALL_FUNCTION_VAR, _call_function_var_kw)
        TARGET_WITH_IMPL(CALL_FUNCTION_KW, _call_function_var_kw)
        TARGET(CALL_FUNCTION_VAR_KW)
        _call_function_var_kw:
        {
            int na = oparg & 0xff | oparg >> 8 & 0xff00;
            int nk = (oparg & 0xff00 | oparg >> 8 & 0xff0000) >> 8;

call_function:
    int na = oparg & 0xff | oparg >> 8 & 0xff00;
    int nk = (oparg & 0xff00 | oparg >> 8 & 0xff0000) >> 8;

A quick example:

s = '''def f(*Args, **Keywords):
    print('Got', len(Args), 'arguments and', len(Keywords), 'keywords')

def g():
    f(''' + ', '.join(str(i) for i in range(500)) + ', ' + ', '.join('k{} = {}'.format(i, i) for i in range(500)) + ''')

g()
'''

c = compile(s, '', 'exec')
eval(c)
from dis import dis
dis(g)

The output is:

Got 500 arguments and 500 keywords

  5           0 LOAD_GLOBAL              0 (f)
              3 LOAD_CONST               1 (0)
              6 LOAD_CONST               2 (1)
[...]
           1497 LOAD_CONST             499 (498)
           1500 LOAD_CONST             500 (499)
           1503 LOAD_CONST             501 ('k0')
           1506 LOAD_CONST               1 (0)
           1509 LOAD_CONST             502 ('k1')
           1512 LOAD_CONST               2 (1)
[...]
           4491 LOAD_CONST             999 ('k498')
           4494 LOAD_CONST             499 (498)
           4497 LOAD_CONST            1000 ('k499')
           4500 LOAD_CONST             500 (499)
           4503 EXTENDED_ARG           257
           4506 CALL_FUNCTION      16905460
           4509 POP_TOP
           4510 LOAD_CONST               0 (None)
           4513 RETURN_VALUE

The dis module seems to have some problem displaying the correct extended value, but I have no time now to check and fix it.

Anyway, I'm still unconvinced of the need to raise the function def/call limits.

Cesare
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cesare.di.mauro at gmail.com Fri Oct 22 22:36:49 2010 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Fri, 22 Oct 2010 22:36:49 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> Message-ID: 2010/10/22 Marcin 'Qrczak' Kowalczyk > 2010/10/22 Cesare Di Mauro : > > > I think that having more than 255 arguments for a function call is a very > > rare case for which a workaround (may be passing a tuple/list or a > > dictionary) can be a better solution than having to introduce a brand new > > opcode to handle it. > > It does not need a new opcode. The bytecode can create an argument > tuple explicitly and pass it like it passes *args. > > -- > Marcin Kowalczyk > It'll be too slow. Current CALL_FUNCTION* uses "packed" ints, not PyLongObject ints. Having a tuple you need (at least) to extract the PyLongs, and convert them to ints, before using them.
Cesare -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Sat Oct 23 00:36:30 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Sat, 23 Oct 2010 00:36:30 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> Message-ID: <4CC211EE.1050308@egenix.com> Cesare Di Mauro wrote: > 2010/10/22 M.-A. Lemburg > >> Cesare Di Mauro wrote: >>> I think that having more than 255 arguments for a function call is a very >>> rare case for which a workaround (may be passing a tuple/list or a >>> dictionary) can be a better solution than having to introduce a brand new >>> opcode to handle it. >> >> It's certainly rare when writing applications by hand, but such >> limits can be reached with code generators wrapping external resources >> such as database query rows, spreadsheet rows, sensor data input, etc. >> >> We've had such a limit before (number of lines in a module) and that >> was raised for the same reason. >> >>> Changing the current opcode(s) is a very bad idea, since common cases >> will >>> slow down. >> >> I'm sure there are ways to avoid that, e.g. by using EXTENDED_ARG >> for such cases. >> >> -- >> Marc-Andre Lemburg >> eGenix.com >> > > I've patched Python 3.2 alpha 3 with a rough solution using EXTENDED_ARG for > CALL_FUNCTION* opcodes, raising the arguments and keywords limits to 65535 > maximum. I hope it'll be enough. :) Sure, we don't have to raise it to 2**64 :-) Looks like a pretty simple fix, indeed. I wish we could get rid off all the byte shifting and div'ery use in the byte compiler - I'm pretty sure that such operations are rather slow nowadays compared to working with 16-bit or 32-bit integers and dropping the notion of taking the word "byte" in byte code literally. 
> In ast.c: > > ast_for_arguments: > if (nposargs > 65535 || nkwonlyargs > 65535) { > ast_error(n, "more than 65535 arguments"); > return NULL; > } > > ast_for_call: > if (nargs + ngens > 65535 || nkeywords > 65535) { > ast_error(n, "more than 65535 arguments"); > return NULL; > } > > > In compile.c: > > opcode_stack_effect: > #define NARGS(o) (((o) & 0xff) + ((o) >> 8 & 0xff00) + 2*(((o) >> 8 & 0xff) > + ((o) >> 16 & 0xff00))) > case CALL_FUNCTION: > return -NARGS(oparg); > case CALL_FUNCTION_VAR: > case CALL_FUNCTION_KW: > return -NARGS(oparg)-1; > case CALL_FUNCTION_VAR_KW: > return -NARGS(oparg)-2; > #undef NARGS > #define NARGS(o) (((o) % 256) + 2*(((o) / 256) % 256)) > case MAKE_FUNCTION: > return -NARGS(oparg) - ((oparg >> 16) & 0xffff); > case MAKE_CLOSURE: > return -1 - NARGS(oparg) - ((oparg >> 16) & 0xffff); > #undef NARGS > > compiler_call_helper: > int len; > int code = 0; > > len = asdl_seq_LEN(args) + n; > n = len & 0xff | (len & 0xff00) << 8; > VISIT_SEQ(c, expr, args); > if (keywords) { > VISIT_SEQ(c, keyword, keywords); > len = asdl_seq_LEN(keywords); > n |= (len & 0xff | (len & 0xff00) << 8) << 8; > } > > > In ceval.c: > > PyEval_EvalFrameEx: > TARGET_WITH_IMPL(CALL_FUNCTION_VAR, _call_function_var_kw) > TARGET_WITH_IMPL(CALL_FUNCTION_KW, _call_function_var_kw) > TARGET(CALL_FUNCTION_VAR_KW) > _call_function_var_kw: > { > int na = oparg & 0xff | oparg >> 8 & 0xff00; > int nk = (oparg & 0xff00 | oparg >> 8 & 0xff0000) >> 8; > > > call_function: > int na = oparg & 0xff | oparg >> 8 & 0xff00; > int nk = (oparg & 0xff00 | oparg >> 8 & 0xff0000) >> 8; > > > A quick example: > > s = '''def f(*Args, **Keywords): > print('Got', len(Args), 'arguments and', len(Keywords), 'keywords') > > def g(): > f(''' + ', '.join(str(i) for i in range(500)) + ', ' + ', '.join('k{} = > {}'.format(i, i) for i in range(500)) + ''') > > g() > ''' > > c = compile(s, '', 'exec') > eval(c) > from dis import dis > dis(g) > > > The output is: > > Got 500 arguments and 500 keywords > > 5 0 LOAD_GLOBAL 0 (f) > 3 LOAD_CONST 1 (0) > 6 LOAD_CONST 2 (1) > [...] > 1497 LOAD_CONST 499 (498) > 1500 LOAD_CONST 500 (499) > 1503 LOAD_CONST 501 ('k0') > 1506 LOAD_CONST 1 (0) > 1509 LOAD_CONST 502 ('k1') > 1512 LOAD_CONST 2 (1) > [...] > 4491 LOAD_CONST 999 ('k498') > 4494 LOAD_CONST 499 (498) > 4497 LOAD_CONST 1000 ('k499') > 4500 LOAD_CONST 500 (499) > 4503 EXTENDED_ARG 257 > 4506 CALL_FUNCTION 16905460 > 4509 POP_TOP > 4510 LOAD_CONST 0 (None) > 4513 RETURN_VALUE > > The dis module seems to have some problem displaying the correct extended > value, but I have no time now to check and fix it. > > Anyway, I'm still unconvinced of the need to raise the function def/call > limits. It may seem strange to have functions, methods or object constructors with more than 255 parameters, but as I said: when using code generators, the generators don't care whether they use 100 or 300 parameters. Even if just 10 parameters are actually used later on. However, the user will care a lot if the generators fail due such limits and then become unusable. As example, take a database query method that exposes 3-4 parameters for each query field. In more complex database schemas that you find in e.g. data warehouse applications, it is not uncommon to have 100+ query fields or columns in a data table. With the current limit in function/call argument counts, such a model could not be mapped directly to Python. 
Instead, you'd have to turn to solutions based on other data structures that are not automatically checked by Python when calling methods/functions. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 22 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From solipsis at pitrou.net Sat Oct 23 00:45:08 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 23 Oct 2010 00:45:08 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> Message-ID: <20101023004508.6a6c1373@pitrou.net> On Sat, 23 Oct 2010 00:36:30 +0200 "M.-A. Lemburg" wrote: > > It may seem strange to have functions, methods or object constructors > with more than 255 parameters, but as I said: when using code generators, > the generators don't care whether they use 100 or 300 parameters. Why not make the code generators smarter? From cesare.di.mauro at gmail.com Sat Oct 23 08:07:48 2010 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Sat, 23 Oct 2010 08:07:48 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <4CC211EE.1050308@egenix.com> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> Message-ID: 2010/10/23 M.-A. Lemburg > > I wish we could get rid off all the byte shifting and div'ery > use in the byte compiler - I'm pretty sure that such operations > are rather slow nowadays compared to working with 16-bit or 32-bit > integers and dropping the notion of taking the word "byte" > in byte code literally. > Unfortunately we can't remove such shift & masking operations, even on non-byte(code) compilers/VMs. In wpython I handle 16 or 32 bits opcodes (it works on multiple of 16 bits words), but I have: - specialized opcodes to call functions and procedures (functions which trashes the result) which handle the most common cases (84-85% on average from that stats that I have collected from some projects and standard library); I have packed 4 bits nargs and 4 bits nkwargs into a single byte in order to obtain a short (and fast), 16 bits opcode; - big endian systems still need to extract and "rotate" the bytes to get the correct word(s) value. So, even on words (and longwords) representations, they are need. The good thing is that they can be handled a bit fast because oparg stays in one register, and na and nk vars read (and manipulate) it independently, so a (common) out-of-order processor can do a good work, scheduling and parallelize such instructions, leaving a few final dependencies (when recombining shift and/or mask partial results). Some work can also be done reordering the instructions to enhance execution on in-order processors. 
It may seem strange to have functions, methods or object constructors > with more than 255 parameters, but as I said: when using code generators, > the generators don't care whether they use 100 or 300 parameters. Even if > just 10 parameters are actually used later on. However, the user > will care a lot if the generators fail due such limits and then become > unusable. > > As example, take a database query method that exposes 3-4 parameters > for each query field. In more complex database schemas that you find > in e.g. data warehouse applications, it is not uncommon to have > 100+ query fields or columns in a data table. > > With the current > limit in function/call argument counts, such a model could not be > mapped directly to Python. Instead, you'd have to turn to solutions > based on other data structures that are not automatically checked > by Python when calling methods/functions. > > -- > Marc-Andre Lemburg > I understood the problem, but I don't know if this is the correct solution. Anyway, now there's at least one solution. :) Cesare -------------- next part -------------- An HTML attachment was scrubbed... URL: From scott+python-ideas at scottdial.com Sat Oct 23 06:55:35 2010 From: scott+python-ideas at scottdial.com (Scott Dial) Date: Sat, 23 Oct 2010 00:55:35 -0400 Subject: [Python-ideas] Add a command line option to adjust sys.path? (was Re: Add a site.cfg to keep a persistent list of paths) In-Reply-To: References: <4CBFC331.8020309@ronadam.com> Message-ID: <4CC26AC7.6060801@scottdial.com> On 10/21/2010 2:43 AM, Nick Coghlan wrote: > This idea is only aimed at developers. To run an actual Python > application that needs additional modules, either install it properly > or put it in a zipfile or directory, put a __main__.py at the top > level and just run the zipfile/directory directly. If this is only aimed at developers, then those developers why isn't, PYTHONPATH="versionA:${PYTHONPATH}" python run_tests.py PYTHONPATH="versionB:${PYTHONPATH}" python run_tests.py , completely and utterly sufficient for the job. -- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From ncoghlan at gmail.com Sat Oct 23 16:21:09 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 24 Oct 2010 00:21:09 +1000 Subject: [Python-ideas] Add a command line option to adjust sys.path? (was Re: Add a site.cfg to keep a persistent list of paths) In-Reply-To: <4CC26AC7.6060801@scottdial.com> References: <4CBFC331.8020309@ronadam.com> <4CC26AC7.6060801@scottdial.com> Message-ID: On Sat, Oct 23, 2010 at 2:55 PM, Scott Dial wrote: > On 10/21/2010 2:43 AM, Nick Coghlan wrote: >> This idea is only aimed at developers. To run an actual Python >> application that needs additional modules, either install it properly >> or put it in a zipfile or directory, put a __main__.py at the top >> level and just run the zipfile/directory directly. > > If this is only aimed at developers, then those developers why isn't, > > PYTHONPATH="versionA:${PYTHONPATH}" python run_tests.py > PYTHONPATH="versionB:${PYTHONPATH}" python run_tests.py > > , completely and utterly sufficient for the job. Without the addition of the ability to supply a .pth file instead, I would tend to agree with you. There's a reason I'd never actually made the suggestion before, despite first thinking of it ages ago. (Although, I'll also point out that your suggestion doesn't work on Windows, which has its own idiosyncratic way of dealing with environment variables). 
The proposed command line switch would also be compatible with -E, which is *not* the case for any approach based on modifying PYTHONPATH. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From rrr at ronadam.com Sat Oct 23 20:32:15 2010 From: rrr at ronadam.com (Ron Adam) Date: Sat, 23 Oct 2010 13:32:15 -0500 Subject: [Python-ideas] Add a command line option to adjust sys.path? (was Re: Add a site.cfg to keep a persistent list of paths) In-Reply-To: References: <4CBFC331.8020309@ronadam.com> <4CC26AC7.6060801@scottdial.com> Message-ID: <4CC32A2F.6040408@ronadam.com> On 10/23/2010 09:21 AM, Nick Coghlan wrote: > On Sat, Oct 23, 2010 at 2:55 PM, Scott Dial > wrote: >> On 10/21/2010 2:43 AM, Nick Coghlan wrote: >>> This idea is only aimed at developers. To run an actual Python >>> application that needs additional modules, either install it properly >>> or put it in a zipfile or directory, put a __main__.py at the top >>> level and just run the zipfile/directory directly. >> >> If this is only aimed at developers, then those developers why isn't, >> >> PYTHONPATH="versionA:${PYTHONPATH}" python run_tests.py >> PYTHONPATH="versionB:${PYTHONPATH}" python run_tests.py >> >> , completely and utterly sufficient for the job. > > Without the addition of the ability to supply a .pth file instead, I > would tend to agree with you. There's a reason I'd never actually made > the suggestion before, despite first thinking of it ages ago. > (Although, I'll also point out that your suggestion doesn't work on > Windows, which has its own idiosyncratic way of dealing with > environment variables). > > The proposed command line switch would also be compatible with -E, > which is *not* the case for any approach based on modifying > PYTHONPATH. When you say "developers", do you mean developers of python, or developers with python? I presumed the later. Ron From scott+python-ideas at scottdial.com Sat Oct 23 20:56:46 2010 From: scott+python-ideas at scottdial.com (Scott Dial) Date: Sat, 23 Oct 2010 14:56:46 -0400 Subject: [Python-ideas] Add a command line option to adjust sys.path? (was Re: Add a site.cfg to keep a persistent list of paths) In-Reply-To: <4CC32A2F.6040408@ronadam.com> References: <4CBFC331.8020309@ronadam.com> <4CC26AC7.6060801@scottdial.com> <4CC32A2F.6040408@ronadam.com> Message-ID: <4CC32FEE.3020708@scottdial.com> On 10/23/2010 2:32 PM, Ron Adam wrote: > When you say "developers", do you mean developers of python, or > developers with python? I presumed the later. I intended "developers" to mean anyone proficient with the use of python as a tool or anyone who should be bothered to read "--help" to find out about how to much with the path (e.g., PYTHONPATH). > On 10/23/2010 09:21 AM, Nick Coghlan wrote: >> The proposed command line switch would also be compatible with -E, >> which is *not* the case for any approach based on modifying >> PYTHONPATH. Does anyone actually use -E? Is that a critical feature worth adding yet another way to add something to sys.path for? I don't find "-p" to be a confusing addition to the switch flag set, so I would say I am mostly a -0 on adding another flag for this purpose unless it has serious advantages over PYTHONPATH. 
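For illustration only, here is a rough sketch of what such a switch could do, written as a wrapper script; the name, the single-argument interface and the simplified .pth handling (plain directory names, no import lines) are assumptions, not an agreed design:

# prepend_path.py -- hypothetical stand-in for a "-p <dir-or-.pth>" option
import os
import runpy
import sys

def main():
    extra, script = sys.argv[1], sys.argv[2]
    if extra.endswith('.pth'):
        with open(extra) as f:
            entries = [line.strip() for line in f
                       if line.strip() and not line.startswith('#')]
    else:
        entries = [extra]
    sys.path[:0] = [os.path.abspath(e) for e in entries]
    sys.argv = sys.argv[2:]
    runpy.run_path(script, run_name='__main__')

if __name__ == '__main__':
    main()

Usage would then look like "python prepend_path.py versionA run_tests.py", independent of PYTHONPATH and of how the shell handles environment variables.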
-- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From ianb at colorstudy.com Sat Oct 23 21:32:37 2010 From: ianb at colorstudy.com (Ian Bicking) Date: Sat, 23 Oct 2010 14:32:37 -0500 Subject: [Python-ideas] pythonv / Python path Message-ID: The recent path discussion reminded me of a project I talked about with Larry Hastings at the last PyCon about virtualenv and what could possibly be included into Python. Larry worked on a prototype that I was supposed to do something with, and then I didn't, which is lame of me but deserves some note: http://bitbucket.org/larry/pythonv/src It satisfies several requirements that I feel virtualenv accomplishes and a lot of other systems do not; but it also has a few useful features virtualenv doesn't have and is much simpler (mostly because it has a compiled component, and changes the system site.py). The features I think are important: * Works with "environments", which is a set of paths and installed components that work together (instead of just ad hoc single path extensions like adding one entry to PYTHONPATH) * Modifies sys.prefix, so all the existing installation tools respect the new environment * Works with #!, which basically means it needs its own per-environment interpreter, as #! is so completely broken that it can't have any real arguments (though it occurs to me that a magic comment could work) * Doesn't use environmental variables (actually it uses them internally, but not in a way that is exposed to developers) -- for instance, hg should not be affected by whatever development you are doing just because it happens to be written in Python Anyway, I think what Larry did with pythonv accomplishes a lot of these things, and probably some more constraints that I've forgotten about. It does have a more complicated/dangerous installation procedure than virtualenv (but if it was part of Python proper that would be okay). -- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From debatem1 at gmail.com Sat Oct 23 22:14:18 2010 From: debatem1 at gmail.com (geremy condra) Date: Sat, 23 Oct 2010 13:14:18 -0700 Subject: [Python-ideas] stats module Was: minmax() function ... In-Reply-To: <201010180357.59264.steve@pearwood.info> References: <201010161111.21847.steve@pearwood.info> <622121A3-6A51-4735-A292-9F82502BB623@gmail.com> <201010180357.59264.steve@pearwood.info> Message-ID: On Sun, Oct 17, 2010 at 9:57 AM, Steven D'Aprano wrote: > On Sat, 16 Oct 2010 11:33:02 am Raymond Hettinger wrote: >> > Are you still interested in working on it, or is this a subtle hint >> > that somebody else should do so? >> >> Hmm, perhaps this would be less subtle: >> HEY, WHY DON'T YOU GUYS GO TO WORK ON A STATS MODULE! > > > http://pypi.python.org/pypi/stats > > It is not even close to production ready. It needs unit tests. The API > should be considered unstable. There's no 3.x version yet. Obviously it > has no real-world usage. But if anyone would like to contribute, > critique or criticize, I welcome feedback or assistance, or even just > encouragement. Just an update on this, there's a sprint planned to work on this this coming Thursday at the University of Washington in Seattle. We'll also be set up for people to join us remotely if anybody's interested. 
Here's the link to the signup: http://goo.gl/PJn4 Geremy Condra From brett at python.org Sat Oct 23 23:27:25 2010 From: brett at python.org (Brett Cannon) Date: Sat, 23 Oct 2010 14:27:25 -0700 Subject: [Python-ideas] pythonv / Python path In-Reply-To: References: Message-ID: Is this email meant to simply point out the existence of pythonv, or to start a conversation about whether something should be tweaked in Python so as to make pythonv/virtualenv easier to implement/use? If it's the latter then let's have the conversation! This was brought up at the PyCon US 2010 language summit and the consensus was that modifying Python to make something like virtualenv or pythonv easier to implement is completely acceptable and something worth doing. On Sat, Oct 23, 2010 at 12:32, Ian Bicking wrote: > The recent path discussion reminded me of a project I talked about with > Larry Hastings at the last PyCon about virtualenv and what could possibly be > included into Python.? Larry worked on a prototype that I was supposed to do > something with, and then I didn't, which is lame of me but deserves some > note: > > http://bitbucket.org/larry/pythonv/src > > It satisfies several requirements that I feel virtualenv accomplishes and a > lot of other systems do not; but it also has a few useful features > virtualenv doesn't have and is much simpler (mostly because it has a > compiled component, and changes the system site.py). > > The features I think are important: > > * Works with "environments", which is a set of paths and installed > components that work together (instead of just ad hoc single path extensions > like adding one entry to PYTHONPATH) > * Modifies sys.prefix, so all the existing installation tools respect the > new environment > * Works with #!, which basically means it needs its own per-environment > interpreter, as #! is so completely broken that it can't have any real > arguments (though it occurs to me that a magic comment could work) > * Doesn't use environmental variables (actually it uses them internally, but > not in a way that is exposed to developers) -- for instance, hg should not > be affected by whatever development you are doing just because it happens to > be written in Python > > Anyway, I think what Larry did with pythonv accomplishes a lot of these > things, and probably some more constraints that I've forgotten about.? It > does have a more complicated/dangerous installation procedure than > virtualenv (but if it was part of Python proper that would be okay). > > -- > Ian Bicking? |? http://blog.ianbicking.org > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > From ianb at colorstudy.com Sat Oct 23 23:33:38 2010 From: ianb at colorstudy.com (Ian Bicking) Date: Sat, 23 Oct 2010 16:33:38 -0500 Subject: [Python-ideas] pythonv / Python path In-Reply-To: References: Message-ID: On Sat, Oct 23, 2010 at 4:27 PM, Brett Cannon wrote: > Is this email meant to simply point out the existence of pythonv, or > to start a conversation about whether something should be tweaked in > Python so as to make pythonv/virtualenv easier to implement/use? > Both? I have felt guilty for not following up on what Larry did, so this is my other-people-should-think-about-this-too email. > If it's the latter then let's have the conversation! 
This was brought > up at the PyCon US 2010 language summit and the consensus was that > modifying Python to make something like virtualenv or pythonv easier > to implement is completely acceptable and something worth doing. > OK, sure! Mostly it's about changing site.py. The pythonv executable is itself very simple, just a shim to make #! easier. For Windows it would have to be different (maybe similar to Setuptools' cmd.exe), but... I think it's possible, and I'd just hope some Windows people would explore what specifically is needed. virtualenv has another feature, which isn't part of pythonv and needn't be part of Python, which is to bootstrap installation tools. -- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Sun Oct 24 00:05:30 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Sat, 23 Oct 2010 15:05:30 -0700 Subject: [Python-ideas] pythonv / Python path In-Reply-To: References: Message-ID: <92006216-1056-4B98-867A-41559A894B1B@gmail.com> On Oct 23, 2010, at 2:27 PM, Brett Cannon wrote: > Is this email meant to simply point out the existence of pythonv, or > to start a conversation about whether something should be tweaked in > Python so as to make pythonv/virtualenv easier to implement/use? > > If it's the latter then let's have the conversation! This was brought > up at the PyCon US 2010 language summit and the consensus was that > modifying Python to make something like virtualenv or pythonv easier > to implement is completely acceptable and something worth doing. +1 Raymond From brett at python.org Sun Oct 24 00:11:44 2010 From: brett at python.org (Brett Cannon) Date: Sat, 23 Oct 2010 15:11:44 -0700 Subject: [Python-ideas] pythonv / Python path In-Reply-To: References: Message-ID: On Sat, Oct 23, 2010 at 14:33, Ian Bicking wrote: > On Sat, Oct 23, 2010 at 4:27 PM, Brett Cannon wrote: >> >> Is this email meant to simply point out the existence of pythonv, or >> to start a conversation about whether something should be tweaked in >> Python so as to make pythonv/virtualenv easier to implement/use? > > Both?? I have felt guilty for not following up on what Larry did, so this is > my other-people-should-think-about-this-too email. > >> >> If it's the latter then let's have the conversation! This was brought >> up at the PyCon US 2010 language summit and the consensus was that >> modifying Python to make something like virtualenv or pythonv easier >> to implement is completely acceptable and something worth doing. > > OK, sure!? Mostly it's about changing site.py. OK, what exactly needs to change? >? The pythonv executable is > itself very simple, just a shim to make #! easier. But that's in C, though, right? What exactly does it do? It would be best to make it if the shim can be in Python so that other VMs can work with it. -Brett >? For Windows it would > have to be different (maybe similar to Setuptools' cmd.exe), but... I think > it's possible, and I'd just hope some Windows people would explore what > specifically is needed. > > virtualenv has another feature, which isn't part of pythonv and needn't be > part of Python, which is to bootstrap installation tools. > > -- > Ian Bicking? |? 
http://blog.ianbicking.org > From ianb at colorstudy.com Sun Oct 24 00:16:52 2010 From: ianb at colorstudy.com (Ian Bicking) Date: Sat, 23 Oct 2010 17:16:52 -0500 Subject: [Python-ideas] pythonv / Python path In-Reply-To: References: Message-ID: On Sat, Oct 23, 2010 at 5:11 PM, Brett Cannon wrote: > On Sat, Oct 23, 2010 at 14:33, Ian Bicking wrote: > > On Sat, Oct 23, 2010 at 4:27 PM, Brett Cannon wrote: > >> > >> Is this email meant to simply point out the existence of pythonv, or > >> to start a conversation about whether something should be tweaked in > >> Python so as to make pythonv/virtualenv easier to implement/use? > > > > Both? I have felt guilty for not following up on what Larry did, so this > is > > my other-people-should-think-about-this-too email. > > > >> > >> If it's the latter then let's have the conversation! This was brought > >> up at the PyCon US 2010 language summit and the consensus was that > >> modifying Python to make something like virtualenv or pythonv easier > >> to implement is completely acceptable and something worth doing. > > > > OK, sure! Mostly it's about changing site.py. > > OK, what exactly needs to change? > Well, add a notion of "prefixes", where the system sys.prefix is one item, but the environment location is the "active" sys.prefix. Then most of the site.py changes can follow logically from that (whatever you do for the one prefix, do for all prefixes). Then there's a matter of using an environmental variable to add a new prefix (or multiple prefixes -- inheritable virtualenvs, if virtualenv allowed such a thing). In the pythonv implementation it sets that variable, and site.py deletes that variable (it could be a command-line switch, that's just slightly hard to implement -- but it's not intended as an inheritable attribute of the execution environment like PYTHONPATH). > The pythonv executable is > > itself very simple, just a shim to make #! easier. > > But that's in C, though, right? What exactly does it do? It would be > best to make it if the shim can be in Python so that other VMs can > work with it. > #! doesn't work with a Python target, otherwise it would be easy to implement in Python. #! is awful. -- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Sun Oct 24 01:26:25 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 24 Oct 2010 12:26:25 +1300 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> Message-ID: <4CC36F21.3040306@canterbury.ac.nz> Cesare Di Mauro wrote: > I think that having max 255 args and 255 kwargs is a good and reasonable > limit which we can live on, and helps the virtual machine implementation Is there any corresponding limit to the number of arguments to tuple and dict constructor? If not, the limit could perhaps be circumvented without changing the VM by having the compiler convert calls with large numbers of args into code that builds an appropriate tuple and dict and makes a *args/**kwds call. 
-- Greg From greg.ewing at canterbury.ac.nz Sun Oct 24 01:52:25 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 24 Oct 2010 12:52:25 +1300 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <4CC211EE.1050308@egenix.com> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> Message-ID: <4CC37539.6090900@canterbury.ac.nz> M.-A. Lemburg wrote: > I wish we could get rid off all the byte shifting and div'ery > use in the byte compiler I think it's there to take care of endianness issues. -- Greg From brett at python.org Sun Oct 24 02:12:35 2010 From: brett at python.org (Brett Cannon) Date: Sat, 23 Oct 2010 17:12:35 -0700 Subject: [Python-ideas] pythonv / Python path In-Reply-To: References: Message-ID: On Sat, Oct 23, 2010 at 15:16, Ian Bicking wrote: > On Sat, Oct 23, 2010 at 5:11 PM, Brett Cannon wrote: >> >> On Sat, Oct 23, 2010 at 14:33, Ian Bicking wrote: >> > On Sat, Oct 23, 2010 at 4:27 PM, Brett Cannon wrote: >> >> >> >> Is this email meant to simply point out the existence of pythonv, or >> >> to start a conversation about whether something should be tweaked in >> >> Python so as to make pythonv/virtualenv easier to implement/use? >> > >> > Both?? I have felt guilty for not following up on what Larry did, so >> > this is >> > my other-people-should-think-about-this-too email. >> > >> >> >> >> If it's the latter then let's have the conversation! This was brought >> >> up at the PyCon US 2010 language summit and the consensus was that >> >> modifying Python to make something like virtualenv or pythonv easier >> >> to implement is completely acceptable and something worth doing. >> > >> > OK, sure!? Mostly it's about changing site.py. >> >> OK, what exactly needs to change? > > Well, add a notion of "prefixes", where the system sys.prefix is one item, > but the environment location is the "active" sys.prefix.? Then most of the > site.py changes can follow logically from that (whatever you do for the one > prefix, do for all prefixes).? Then there's a matter of using an > environmental variable to add a new prefix (or multiple prefixes -- > inheritable virtualenvs, if virtualenv allowed such a thing).? In the > pythonv implementation it sets that variable, and site.py deletes that > variable (it could be a command-line switch, that's just slightly hard to > implement -- but it's not intended as an inheritable attribute of the > execution environment like PYTHONPATH). OK, so it sounds like site.py would just need a restructuring. That's sounds like just a technical challenge and not a backwards-compatibility one. Am I right? > >> >? The pythonv executable is >> > itself very simple, just a shim to make #! easier. >> >> But that's in C, though, right? What exactly does it do? It would be >> best to make it if the shim can be in Python so that other VMs can >> work with it. > > #! doesn't work with a Python target, otherwise it would be easy to > implement in Python.? #! is awful. As in a #! can't target a Python script that has been chmod'ed to be executable? 
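A rough sketch of the multi-prefix restructuring of site.py discussed above; the environment variable name and the POSIX-style directory layout are assumptions for illustration, not the actual pythonv implementation:

import os
import sys

def add_prefix_site_dirs():
    # treat the "active" environment prefix (if any) plus the base
    # installation as a list of prefixes, adding site-packages for each
    prefixes = []
    env_prefix = os.environ.pop('PYTHON_ENV_PREFIX', None)  # illustrative name
    if env_prefix:
        prefixes.append(env_prefix)
    prefixes.append(sys.prefix)
    for prefix in prefixes:
        sitedir = os.path.join(prefix, 'lib',
                               'python%d.%d' % sys.version_info[:2],
                               'site-packages')
        if os.path.isdir(sitedir) and sitedir not in sys.path:
            sys.path.insert(0, sitedir)
    return prefixes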
From cesare.di.mauro at gmail.com Sun Oct 24 08:31:04 2010 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Sun, 24 Oct 2010 08:31:04 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <4CC36F21.3040306@canterbury.ac.nz> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC36F21.3040306@canterbury.ac.nz> Message-ID: 2010/10/24 Greg Ewing > Cesare Di Mauro wrote: > >> I think that having max 255 args and 255 kwargs is a good and reasonable >> limit which we can live on, and helps the virtual machine implementation >> > > Is there any corresponding limit to the number of arguments to > tuple and dict constructor? AFAIK there's no such limit. However, I'll use BUILD_TUPLE and BUILD_MAP opcodes for such purpose, because they are faster. > If not, the limit could perhaps be > circumvented without changing the VM by having the compiler > convert calls with large numbers of args into code that builds > an appropriate tuple and dict and makes a *args/**kwds call. > > -- > Greg I greatly prefer this solution, but it's a bit more complicated when there are *arg and/or **kwargs special arguments. If we have > 255 args and *args is defined, we need to: 1) emit BUILD_TUPLE after pushed the regular arguments 2) emit LOAD_GLOBAL("tuple") 3) push *args 4) emit CALL_FUNCTION(1) to convert *args to a tuple 5) emit BINARY_ADD to append *args to the regular arguments 6) emit CALL_FUNCTION_VAR If we have > 255 kwargs and **kwargs defined, we need to: 1) emit BUILD_MAP after pushed the regular keyword arguments 2) emit LOAD_ATTR("update") 3) push **kwargs 4) emit CALL_FUNCTION(1) to update the regular keyword arguments with the ones in **kwargs 5) emit CALL_FUNCTION_KW And, finally, combining all the above in the worst case. But, as I said, I prefer this one to handle "complex" cases instead of changing the VM slowing the common ones. Cesare -------------- next part -------------- An HTML attachment was scrubbed... URL: From phd at phd.pp.ru Sun Oct 24 17:40:05 2010 From: phd at phd.pp.ru (Oleg Broytman) Date: Sun, 24 Oct 2010 19:40:05 +0400 Subject: [Python-ideas] pythonv / Python path In-Reply-To: References: Message-ID: <20101024154005.GA21511@phd.pp.ru> On Sat, Oct 23, 2010 at 05:12:35PM -0700, Brett Cannon wrote: > On Sat, Oct 23, 2010 at 15:16, Ian Bicking wrote: > > #! doesn't work with a Python target, otherwise it would be easy to > > implement in Python.? #! is awful. > > As in a #! can't target a Python script that has been chmod'ed to be executable? It also handles parameters in strange ways. One can write #!/usr/bin/pytnon -O but not #!/usr/bin/env pytnon -O Some operating systems understand that, some ignore -O completely, but most OSes interpret "python -O" as one parameter and emit "/usr/bin/env: python -O: No such file or directory" error. Oleg. -- Oleg Broytman http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. 
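A source-level sketch of the transformation Cesare outlines above for oversized calls that also use *args/**kwargs; this only shows what the generated code would be equivalent to, the real change would live in the compiler:

def f(*args, **kwargs):
    return len(args), len(kwargs)

many_args = tuple(range(300))          # stands in for >255 positional arguments
extra_args = (1000, 1001)              # the original *args
many_kwargs = dict(("k%d" % i, i) for i in range(300))
extra_kwargs = {"z": 42}               # the original **kwargs

pos = many_args + tuple(extra_args)    # BUILD_TUPLE, tuple(*args), BINARY_ADD
kw = dict(many_kwargs)                 # BUILD_MAP of the regular keywords
kw.update(extra_kwargs)                # LOAD_ATTR "update" plus a call
print(f(*pos, **kw))                   # CALL_FUNCTION_VAR_KW -> (302, 301)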
From lie.1296 at gmail.com Sun Oct 24 23:59:16 2010 From: lie.1296 at gmail.com (Lie Ryan) Date: Mon, 25 Oct 2010 08:59:16 +1100 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: <4CB8B2F5.2020507@ronadam.com> References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> <4CB8B2F5.2020507@ronadam.com> Message-ID: On 10/16/10 07:00, Ron Adam wrote: > > > On 10/15/2010 02:04 PM, Arnaud Delobelle wrote: > >>> Because it would always interpret a list of values as a single item. >>> >>> This function looks at args and if its a single value without an >>> "__iter__" >>> method, it passes it to min as min([value], **kwds) instead of >>> min(value, >>> **kwds). >> >> But there are many iterable objects which are also comparable (hence >> it makes sense to consider their min/max), for example strings. >> >> So we get: >> >> xmin("foo", "bar", "baz") == "bar" >> xmin("foo", "bar") == "bar" >> >> but: >> >> xmin("foo") == "f" >> >> This will create havoc in your running min routine. >> >> (Notice the same will hold for min() but at least you know that min(x) >> considers x as an iterable and complains if it isn't) > > Yes > > There doesn't seem to be a way to generalize min/max in a way to handle > all the cases without knowing the context. > > So in a coroutine version of Tals class, you would need to pass a hint > along with the value. > > Ron There is a way, by using threading, and injecting a thread-safe tee into max/min/otherFuncs (over half of the code is just for implementing thread-safe tee). Using this, there is no need to make any modification to min/max. I suppose it might be possible to convert this to using the new coroutine proposal (though I haven't been following the proposal close enough). The code is quite slow (and ugly), but it can handle large generators (or infinite generators). The memory shouldn't grow if the functions in *funcs takes more or less similar amount of time (which is true in case of max and min); if *funcs need to take both very fast and very slow codes at the same time, some more code can be added for load-balancing by stalling faster threads' request for more items, until the slower threads finishes. Pros: - no modification to max/min Cons: - slow, since itertools.tee() is reimplemented in pure-python - thread is uninterruptible import threading, itertools class counting_dict(dict): """ A thread-safe dict that allows its items to be accessed max_access times, after that the item is deleted. 
    >>> d = counting_dict(2)
    >>> d['a'] = 'e'
    >>> d['a']
    'e'
    >>> d['a']
    'e'
    >>> d['a']
    Traceback (most recent call last):
      File "", line 1, in
      File "", line 10, in __getitem__
    KeyError: 'a'
    """
    def __init__(self, max_access):
        self.max_access = max_access
    def __setitem__(self, key, item):
        super().__setitem__(key,
            [item, self.max_access, threading.Lock()] )
    def __getitem__(self, key):
        val = super().__getitem__(key)
        item, count, lock = val
        with lock:
            val[1] -= 1
            if val[1] == 0:
                del self[key]
            return item

def tee(iterable, n=2):
    """ like itertools.tee(), but thread-safe """
    def _tee():
        for i in itertools.count():
            try:
                yield cache[i]
            except KeyError:
                producer_next()
                yield cache[i]
    def produce(next):
        for i in itertools.count():
            cache[i] = next()
            yield
    produce.lock = threading.Lock()
    def producer_next():
        with produce.lock:
            next(producer); next(producer); next(producer); next(producer);
    cache = counting_dict(n)
    it = iter(iterable)
    producer = produce(it.__next__)
    return [_tee() for _ in range(n)]

def parallel_reduce(iterable, *funcs):
    class Worker(threading.Thread):
        def __init__(self, source, func):
            super().__init__()
            self.source = source
            self.func = func
        def run(self):
            self.result = self.func(self.source)
    sources = tee(iterable, len(funcs))
    threads = []
    for func, source in zip(funcs, sources):
        thread = Worker(source, func)
        thread.setDaemon(True)
        threads.append(thread)
    for thread in threads:
        thread.start()
    # this lets Ctrl+C work, it doesn't actually terminate
    # currently running threads
    for thread in threads:
        while thread.isAlive():
            thread.join(100)
    return tuple(thread.result for thread in threads)

# test code
import random, time
parallel_reduce([4, 6, 2, 3, 5, 7, 2, 3, 7, 8, 9, 6, 2, 10], min, max)
l = (random.randint(-1000000000, 1000000000) for _ in range(100000))
start = time.time(); parallel_reduce(l, min, min, max, min, max); time.time() - start

From guido at python.org Mon Oct 25 04:37:32 2010 From: guido at python.org (Guido van Rossum) Date: Sun, 24 Oct 2010 19:37:32 -0700 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> <4CB8B2F5.2020507@ronadam.com> Message-ID: On Sun, Oct 24, 2010 at 2:59 PM, Lie Ryan wrote: > There is a way, by using threading, and injecting a thread-safe tee into > max/min/otherFuncs (over half of the code is just for implementing > thread-safe tee). Using this, there is no need to make any modification > to min/max. I suppose it might be possible to convert this to using the > new coroutine proposal (though I haven't been following the proposal > close enough). > > The code is quite slow (and ugly), but it can handle large generators > (or infinite generators). The memory shouldn't grow if the functions in > *funcs takes more or less similar amount of time (which is true in case > of max and min); if *funcs need to take both very fast and very slow > codes at the same time, some more code can be added for load-balancing > by stalling faster threads' request for more items, until the slower > threads finishes. > > Pros: > - no modification to max/min > > Cons: > - slow, since itertools.tee() is reimplemented in pure-python > - thread is uninterruptible
[snip]

This should not require threads.
Here's a bare-bones sketch using generators: def reduce_collector(func): outcome = None while True: try: val = yield except GeneratorExit: break if outcome is None: outcome = val else: outcome = func(outcome, val) raise StopIteration(outcome) def parallel_reduce(iterable, funcs): collectors = [reduce_collector(func) for func in funcs] values = [None for _ in collectors] for i, coll in enumerate(collectors): try: next(coll) except StopIteration as err: values[i] = err.args[0] collectors[i] = None for val in iterable: for i, coll in enumerate(collectors): if coll is not None: try: coll.send(val) except StopIteration as err: values[i] = err.args[0] collectors[i] = None for i, coll in enumerate(collectors): if coll is not None: try: coll.throw(GeneratorExit) except StopIteration as err: values[i] = err.args[0] return values def main(): it = range(100) print(parallel_reduce(it, [min, max])) if __name__ == '__main__': main() -- --Guido van Rossum (python.org/~guido) From jh at improva.dk Mon Oct 25 12:19:14 2010 From: jh at improva.dk (Jacob Holm) Date: Mon, 25 Oct 2010 12:19:14 +0200 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> <4CB8B2F5.2020507@ronadam.com> Message-ID: <4CC559A2.8090305@improva.dk> On 2010-10-25 04:37, Guido van Rossum wrote: > This should not require threads. > > Here's a bare-bones sketch using generators: > If you don't care about allowing the funcs to raise StopIteration, this can actually be simplified to: def reduce_collector(func): try: outcome = yield except GeneratorExit: outcome = None else: while True: try: val = yield except GeneratorExit: break outcome = func(outcome, val) raise StopIteration(outcome) def parallel_reduce(iterable, funcs): collectors = [reduce_collector(func) for func in funcs] values = [None for _ in collectors] for coll in collectors: next(coll) for val in iterable: for coll in collectors: coll.send(val) for i, coll in enumerate(collectors): try: coll.throw(GeneratorExit) except StopIteration as err: values[i] = err.args[0] return values More interesting (to me at least) is that this is an excellent example of why I would like to see a version of PEP380 where "close" on a generator can return a value (AFAICT the version of PEP380 on http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not mention this possibility, or even link to the heated discussion we had on python-ideas around march/april 2009). Assuming that "close" on a reduce_collector generator instance returns the value of the StopIteration raised by the "return" statements, we can simplify the code even further: def reduce_collector(func): try: outcome = yield except GeneratorExit: return None while True: try: val = yield except GeneratorExit: return outcome outcome = func(outcome, val) def parallel_reduce(iterable, funcs): collectors = [reduce_collector(func) for func in funcs] for coll in collectors: next(coll) for val in iterable: for coll in collectors: coll.send(val) return [coll.close() for coll in collectors] Yes, this is only saving a few lines, but I find it *much* more readable... 
- Jacob From denis.spir at gmail.com Mon Oct 25 15:49:32 2010 From: denis.spir at gmail.com (spir) Date: Mon, 25 Oct 2010 15:49:32 +0200 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') Message-ID: <20101025154932.06be2faf@o> Hello, A recommended idiom to construct a text from bits -- usually when the bits themselves are constructed by mapping on a sequence -- is to store the intermediate results and then only join() them all at once. Since I discovered this idiom I find myself constantly use it, to the point of having a func doing that in my python toolkit: def textFromMap(seq , map=None , sep='' , ldelim='',rdelim=''): if (map is None): return "%s%s%s" %(ldelim , sep.join(str(e) for e in seq) , rdelim) else: return "%s%s%s" %(ldelim , sep.join(str(map(e)) for e in seq) , rdelim) Example use: class LispList(list): def __repr__(self): return textFromMap(self , repr , ' ' , '(',')') print LispList([1, 2, 3]) # --> (1 2 3) Is there any similar routine in Python? If yes, please inform me off list and excuse the noise. Else, I wonder whether such a routine would be useful as builtin, precisely since it is a common and recommended idiom. The issues with not having it, according to me, are that the expression is somewhat complicated and, more importantly, hardly tells the reader what it means & does -- even when "unfolded" into 2 or more lines of code: elements = (map(e) for e in seq) elementTexts = (str(e) for e in elements) content = sep.join(elementTexts) text = "%s%s%s" %(ldelim , content , rdelim) There are 2 discussable choices in the func above: * Unlike join(), it converts to str automagically. * It takes optional delimiter parameters which complicate the interface (but are really handy for my common use cases :-) Also, the map parameter is optional in case there is no mapping at all, which is more common if the func "string-ifies" itself. If ever you find this proposal sensible, then what should be the routine's name? And where to integrate it in the language? I think there are at least 3 options: 1. A simple func textFromMap(seq, ...) 2. A static method of str str.fromMap(seq, ...) 3. A method for iterables (1) seq.textFromMap(...) (I personly find the latter more correct semantically (2).) What do you think? Denis (1) I don't know exactly what should be the top class, if any. (2) I think the same about join: should be "seq.join(sep)" since for me the object on which the method applies is seq, not sep. -- -- -- -- -- -- -- vit esse estrany ? spir.wikidot.com From masklinn at masklinn.net Mon Oct 25 16:10:56 2010 From: masklinn at masklinn.net (Masklinn) Date: Mon, 25 Oct 2010 16:10:56 +0200 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: <20101025154932.06be2faf@o> References: <20101025154932.06be2faf@o> Message-ID: <6E25ADF7-129D-4396-A569-06CB2BA5D7C7@masklinn.net> On 2010-10-25, at 15:49 , spir wrote: > Hello, > > > A recommended idiom to construct a text from bits -- usually when the bits themselves are constructed by mapping on a sequence -- is to store the intermediate results and then only join() them all at once. 
Since I discovered this idiom I find myself constantly use it, to the point of having a func doing that in my python toolkit: > > def textFromMap(seq , map=None , sep='' , ldelim='',rdelim=''): > if (map is None): > return "%s%s%s" %(ldelim , sep.join(str(e) for e in seq) , rdelim) > else: > return "%s%s%s" %(ldelim , sep.join(str(map(e)) for e in seq) , rdelim) > > Example use: > > class LispList(list): > def __repr__(self): > return textFromMap(self , repr , ' ' , '(',')') > print LispList([1, 2, 3]) # --> (1 2 3) > > Is there any similar routine in Python? If yes, please inform me off list and excuse the noise. Else, I wonder whether such a routine would be useful as builtin, precisely since it is a common and recommended idiom. The issues with not having it, according to me, are that the expression is somewhat complicated and, more importantly, hardly tells the reader what it means & does -- even when "unfolded" into 2 or more lines of code: > > elements = (map(e) for e in seq) > elementTexts = (str(e) for e in elements) > content = sep.join(elementTexts) > text = "%s%s%s" %(ldelim , content , rdelim) > I really am not sure you gain so much over the current `sep.join(str(map(e)) for e in seq))`, even with the addition of ldelim and rdelim which end-up in arguments-soup/noise (5 arguments in the worst case is quite a lot). The name is also strange, and hints at needing function composition more than a new builtin. > 3. A method for iterables (1) seq.textFromMap(...) > (I personly find the latter more correct semantically (2).) > > (2) I think the same about join: should be "seq.join(sep)" since for me the object on which the method applies is seq, not sep. > This is also the choice of e.g. Ruby, but it has a severe limitation: Python doesn't have any `Iterable` type, yet `join` can be used with any iterable including generators or callable-iterators. Thus you can not put it on the iterable or sequence, or you have to prepare some kind of iterable mixin. This issue might be solved/solvable via the new abstract base classes, but I'm so sure about it (do you explicitly have to mix-in an abc to use its methods?). In fact, Ruby 1.8 does have that limitation (though it's arguably not the worst limitation ever): `Array#join` exists but not `Enumerable#join`. They tried to add `Enumerable#join` in 1.9.1 (though a fairly strange, recursive version of it) then took it out then added it back again (or something, I got lost around there). And in any case since there is no requirement for enumerable collections to mix Enumerable in, you can have enumerable collections with no join support. From guido at python.org Mon Oct 25 17:13:19 2010 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Oct 2010 08:13:19 -0700 Subject: [Python-ideas] Possible PEP 380 tweak Message-ID: [Changed subject] > On 2010-10-25 04:37, Guido van Rossum wrote: >> This should not require threads. >> >> Here's a bare-bones sketch using generators: [...] On Mon, Oct 25, 2010 at 3:19 AM, Jacob Holm wrote: > If you don't care about allowing the funcs to raise StopIteration, this > can actually be simplified to: [...] Indeed, I realized this after posting. :-) I had several other ideas for improvements, e.g. being able to pass an initial value to the reduce-like function or even being able to supply a reduce-like function of one's own. 
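For concreteness, the first of those ideas might look something like the following. This is only a minimal sketch; the extra "initial" parameter is illustrative (not part of the earlier example) and it keeps the same conventions as that sketch, i.e. GeneratorExit to stop and StopIteration(value) to report the result:

    def reduce_collector(func, initial=None):
        # 'initial' is the illustrative addition: the fold starts from it
        # instead of from the first value received.
        outcome = initial
        while True:
            try:
                val = yield
            except GeneratorExit:
                break
            if outcome is None:
                outcome = val
            else:
                outcome = func(outcome, val)
        raise StopIteration(outcome)

Fed the same way as before (prime with next(), send() each value, then throw GeneratorExit and read the value out of the StopIteration), reduce_collector(lambda a, b: a + b, 0) acts as a running sum; passing the extra argument through parallel_reduce() itself would need a small tweak to that driver.
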
> More interesting (to me at least) is that this is an excellent example > of why I would like to see a version of PEP380 where "close" on a > generator can return a value (AFAICT the version of PEP380 on > http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not > mention this possibility, or even link to the heated discussion we had > on python-ideas around march/april 2009). Can you dig up the link here? I recall that discussion but I don't recall a clear conclusion coming from it -- just heated debate. Based on my example I have to agree that returning a value from close() would be nice. There is a little detail, how multiple arguments to StopIteration should be interpreted, but that's not so important if it's being raised by a return statement. > Assuming that "close" on a reduce_collector generator instance returns > the value of the StopIteration raised by the "return" statements, we can > simplify the code even further: > > > def reduce_collector(func): > ? ?try: > ? ? ? ?outcome = yield > ? ?except GeneratorExit: > ? ? ? ?return None > ? ?while True: > ? ? ? ?try: > ? ? ? ? ? ?val = yield > ? ? ? ?except GeneratorExit: > ? ? ? ? ? ?return outcome > ? ? ? ?outcome = func(outcome, val) > > def parallel_reduce(iterable, funcs): > ? ?collectors = [reduce_collector(func) for func in funcs] > ? ?for coll in collectors: > ? ? ? ?next(coll) > ? ?for val in iterable: > ? ? ? ?for coll in collectors: > ? ? ? ? ? ?coll.send(val) > ? ?return [coll.close() for coll in collectors] > > > Yes, this is only saving a few lines, but I find it *much* more readable... I totally agree that not having to call throw() and catch whatever it bounces back is much nicer. (Now I wish there was a way to avoid the "try..except GeneratorExit" construct in the generator, but I think I should stop while I'm ahead. :-) The interesting thing is that I've been dealing with generators used as coroutines or tasks intensely on and off since July, and I haven't had a single need for any of the three patterns that this example happened to demonstrate: - the need to "prime" the generator in a separate step - throwing and catching GeneratorExit - getting a value from close() (I did have a lot of use for send(), throw(), and extracting a value from StopIteration.) In my context, generators are used to emulate concurrently running tasks, and "yield" is always used to mean "block until this piece of async I/O is complete, and wake me up with the result". This is similar to the "classic" trampoline code found in PEP 342. In fact, when I wrote the example for this thread, I fumbled a bit because the use of generators there is different than I had been using them (though it was no doubt thanks to having worked with them intensely that I came up with the example quickly). So, it is clear that generators are extremely versatile, and PEP 380 deserves several good use cases to explain all the API subtleties. BTW, while I have you, what do you think of Greg's "cofunctions" proposal? -- --Guido van Rossum (python.org/~guido) From lvh at laurensvh.be Mon Oct 25 18:00:40 2010 From: lvh at laurensvh.be (Laurens Van Houtven) Date: Mon, 25 Oct 2010 18:00:40 +0200 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: <20101025154932.06be2faf@o> References: <20101025154932.06be2faf@o> Message-ID: Hm. I suppose the need for this would be slightly mitigated if I understood why str.join does not try to convert the elements of the iterable it is passed to strs (and analogously for unicode). 
Does anyone know what the rationale for that is? lvh -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Oct 25 19:12:02 2010 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Oct 2010 10:12:02 -0700 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: References: <20101025154932.06be2faf@o> Message-ID: On Mon, Oct 25, 2010 at 9:00 AM, Laurens Van Houtven wrote: > Hm. I suppose the need for this would be slightly mitigated if I understood > why str.join does not try to convert the elements of the iterable it is > passed to strs (and analogously for unicode). > Does anyone know what the rationale for that is? To detect buggy code. -- --Guido van Rossum (python.org/~guido) From __peter__ at web.de Mon Oct 25 19:10:56 2010 From: __peter__ at web.de (Peter Otten) Date: Mon, 25 Oct 2010 19:10:56 +0200 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> <4CB8B2F5.2020507@ronadam.com> Message-ID: Guido van Rossum wrote: > On Sun, Oct 24, 2010 at 2:59 PM, Lie Ryan > wrote: >> There is a way, by using threading, and injecting a thread-safe tee into >> max/min/otherFuncs (over half of the code is just for implementing >> thread-safe tee). Using this, there is no need to make any modification >> to min/max. I suppose it might be possible to convert this to using the >> new coroutine proposal (though I haven't been following the proposal >> close enough). >> >> The code is quite slow (and ugly), but it can handle large generators >> (or infinite generators). The memory shouldn't grow if the functions in >> *funcs takes more or less similar amount of time (which is true in case >> of max and min); if *funcs need to take both very fast and very slow >> codes at the same time, some more code can be added for load-balancing >> by stalling faster threads' request for more items, until the slower >> threads finishes. >> >> Pros: >> - no modification to max/min >> >> Cons: >> - slow, since itertools.tee() is reimplemented in pure-python >> - thread is uninterruptible > [snip] > > This should not require threads. > > Here's a bare-bones sketch using generators: > outcome = func(outcome, val) I don't think the generator-based approach is equivalent to what Lie Ryan's threaded code does. You are calling max(a, b) 99 times while Lie calls max(items) once. Is it possible to calculate min(items) and max(items) simultaneously using generators? I don't see how... Peter From guido at python.org Mon Oct 25 20:53:40 2010 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Oct 2010 11:53:40 -0700 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> <4CB8B2F5.2020507@ronadam.com> Message-ID: > Guido van Rossum wrote: [...] >> This should not require threads. >> >> Here's a bare-bones sketch using generators: [...] On Mon, Oct 25, 2010 at 10:10 AM, Peter Otten <__peter__ at web.de> wrote: > I don't think the generator-based approach is equivalent to what Lie Ryan's > threaded code does. 
You are calling max(a, b) 99 times while Lie calls > max(items) once. True. Nevertheless, my point stays: you shouldn't have to use threads to do such concurrent computations over a single-use iterable. Threads too slow and since there is no I/O multiplexing they don't offer advantages. > Is it possible to calculate min(items) and max(items) simultaneously using > generators? I don't see how... No, this is why the reduce-like approach is better for such cases. Otherwise you keep trying to fit a square peg into a round hold. -- --Guido van Rossum (python.org/~guido) From tjreedy at udel.edu Mon Oct 25 21:27:52 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 25 Oct 2010 15:27:52 -0400 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: <20101025154932.06be2faf@o> References: <20101025154932.06be2faf@o> Message-ID: On 10/25/2010 9:49 AM, spir wrote: > Hello, > > > A recommended idiom to construct a text from bits -- usually when the > bits themselves are constructed by mapping on a sequence -- is to > store the intermediate results and then only join() them all at once. > Since I discovered this idiom I find myself constantly use it, to the > point of having a func doing that in my python toolkit: > > def textFromMap(seq , map=None , sep='' , ldelim='',rdelim=''): 'map' is a bad parameter name as it 1. reuses the builtin name and 2. uses it for a parameter (the mapped function) of that builtin. ... > (2) I think the same about join: should be "seq.join(sep)" since for > me the object on which the method applies is seq, not sep. The two parameters for the join function are a string and an iterable of strings. There is no 'iterable of strings' class, so that leaves the string class to attach it to as a method. (It once *was* just a function in the string module before it and other functions were so attached.) The fact that the function produces a string is another reason it should be a string method. Ditto for bytes and iterable of bytes. -- Terry Jan Reedy From rrr at ronadam.com Mon Oct 25 21:53:57 2010 From: rrr at ronadam.com (Ron Adam) Date: Mon, 25 Oct 2010 14:53:57 -0500 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: Message-ID: <4CC5E055.9010009@ronadam.com> On 10/25/2010 10:13 AM, Guido van Rossum wrote: > [Changed subject] > >> On 2010-10-25 04:37, Guido van Rossum wrote: >>> This should not require threads. >>> >>> Here's a bare-bones sketch using generators: > [...] > > On Mon, Oct 25, 2010 at 3:19 AM, Jacob Holm wrote: >> If you don't care about allowing the funcs to raise StopIteration, this >> can actually be simplified to: > [...] > > Indeed, I realized this after posting. :-) I had several other ideas > for improvements, e.g. being able to pass an initial value to the > reduce-like function or even being able to supply a reduce-like > function of one's own. > >> More interesting (to me at least) is that this is an excellent example >> of why I would like to see a version of PEP380 where "close" on a >> generator can return a value (AFAICT the version of PEP380 on >> http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not >> mention this possibility, or even link to the heated discussion we had >> on python-ideas around march/april 2009). > > Can you dig up the link here? > > I recall that discussion but I don't recall a clear conclusion coming > from it -- just heated debate. > > Based on my example I have to agree that returning a value from > close() would be nice. 
There is a little detail, how multiple > arguments to StopIteration should be interpreted, but that's not so > important if it's being raised by a return statement. > >> Assuming that "close" on a reduce_collector generator instance returns >> the value of the StopIteration raised by the "return" statements, we can >> simplify the code even further: >> >> >> def reduce_collector(func): >> try: >> outcome = yield >> except GeneratorExit: >> return None >> while True: >> try: >> val = yield >> except GeneratorExit: >> return outcome >> outcome = func(outcome, val) >> >> def parallel_reduce(iterable, funcs): >> collectors = [reduce_collector(func) for func in funcs] >> for coll in collectors: >> next(coll) >> for val in iterable: >> for coll in collectors: >> coll.send(val) >> return [coll.close() for coll in collectors] >> >> >> Yes, this is only saving a few lines, but I find it *much* more readable... > > I totally agree that not having to call throw() and catch whatever it > bounces back is much nicer. (Now I wish there was a way to avoid the > "try..except GeneratorExit" construct in the generator, but I think I > should stop while I'm ahead. :-) This is how my mind wants to write this. @consumer def reduce_collector(func): try: value = yield # No value to yield here. while True: value = func((yield), value) # or here. except YieldError: # next was called not send. yield = value def parallel_reduce(iterable, funcs): collectors = [reduce_collector(func) for func in funcs] for v in iterable: for coll in collectors: coll.send(v) return [next(c) for c in collectors] It nicely separates input and output parts of a co-function, which can be tricky to get right when you have to receive and send at the same yield. Maybe in Python 4k? Oh well. :-) > The interesting thing is that I've been dealing with generators used > as coroutines or tasks intensely on and off since July, and I haven't > had a single need for any of the three patterns that this example > happened to demonstrate: > > - the need to "prime" the generator in a separate step Having a consumer decorator would be good. def consumer(f): @wraps(f) def wrapper(*args, **kwds): coroutine = f(*args, **kwds) next(coroutine) return coroutine return wrapper Or maybe it would be possible for python to autostart a generator if it's sent a value before it's started? Currently you get an almost useless TypeError. The reason it's almost useless is unless you are testing for it right after you create the generator, you can't (easily) be sure it's not from someplace inside the generator. Ron From rrr at ronadam.com Mon Oct 25 22:08:06 2010 From: rrr at ronadam.com (Ron Adam) Date: Mon, 25 Oct 2010 15:08:06 -0500 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CC5E055.9010009@ronadam.com> References: <4CC5E055.9010009@ronadam.com> Message-ID: <4CC5E3A6.1000303@ronadam.com> Minor correction... On 10/25/2010 02:53 PM, Ron Adam wrote: > @consumer > def reduce_collector(func): > try: > value = yield # No value to yield here. > while True: > value = func((yield), value) # or here. > except YieldError: > # next was called not send. > yield = value This line should have been "yield value" not "yield = value". 
> def parallel_reduce(iterable, funcs): > collectors = [reduce_collector(func) for func in funcs] > for v in iterable: > for coll in collectors: > coll.send(v) > return [next(c) for c in collectors] From guido at python.org Mon Oct 25 22:21:07 2010 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Oct 2010 13:21:07 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CC5E055.9010009@ronadam.com> References: <4CC5E055.9010009@ronadam.com> Message-ID: On Mon, Oct 25, 2010 at 12:53 PM, Ron Adam wrote: > This is how my mind wants to write this. > > @consumer > def reduce_collector(func): > ? ?try: > ? ? ? ?value = yield ? ? ? ? ? ?# No value to yield here. > ? ? ? ?while True: > ? ? ? ? ? ?value = func((yield), value) ? ? ? ?# or here. > ? ?except YieldError: IIUC this works today if you substitute GeneratorExit and use c.close() instead of next(c) below. (I don't recall why I split it out into two different try/except blocks but it doesn't seem necessary. As for being able to distinguish next(c) from c.send(None), that's a few language revisions too late. Perhaps more to the point, I don't like that idea; it breaks the general treatment of things that return None and throwing away values. (Long, long, long ago there were situations where Python balked when you threw away a non-None value. The feature was boohed off the island and it's better this way.) > ? ? ? ?# next was called not send. > ? ? ? ?yield value I object to overloading yield for both a *resumable* operation and returning a (final) value; that's why PEP 380 will let you write "return value". (Many alternatives were considered but we always come back to the simple "return value".) > def parallel_reduce(iterable, funcs): > ? ?collectors = [reduce_collector(func) for func in funcs] > ? ?for v in iterable: > ? ? ? ?for coll in collectors: > ? ? ? ? ? ?coll.send(v) > ? ?return [next(c) for c in collectors] I really object to using next() for both getting the return value and the next yielded value. Jacob's proposal to spell this as c.close() sounds much better to me. > It nicely separates input and output parts of a co-function, which can be > tricky to get right when you have to receive and send at the same yield. I don't think there was a problem with this in my code (or if there was you didn't solve it). > Maybe in Python 4k? ?Oh well. :-) Nah. >> The interesting thing is that I've been dealing with generators used >> as coroutines or tasks intensely on and off since July, and I haven't >> had a single need for any of the three patterns that this example >> happened to demonstrate: >> >> - the need to "prime" the generator in a separate step > > Having a consumer decorator would be good. > > def consumer(f): > ? ?@wraps(f) > ? ?def wrapper(*args, **kwds): > ? ? ? ?coroutine = f(*args, **kwds) > ? ? ? ?next(coroutine) > ? ? ? ?return coroutine > ? ?return wrapper This was proposed during the PEP 380 discussions. I still don't like it because I can easily imagine situations where sending an initial None falls totally naturally out of the sending logic (as it does for my async tasks use case), and it would be a shame if the generator's declaration prevented this. > Or maybe it would be possible for python to autostart a generator if it's > sent a value before it's started? ?Currently you get an almost useless > TypeError. ?The reason it's almost useless is unless you are testing for it > right after you create the generator, you can't (easily) be sure it's not > from someplace inside the generator. 
I'd be okay with this raising a different exception (though for compatibility it would have to subclass TypeError). I'd also be okay with having a property on generator objects that let you inspect the state. There should really be three states: not yet started, started, finished -- and of course "started and currently executing" but that one is already exposed via g.gi_running. Changing the behavior on .send(val) doesn't strike me as a good idea, because the caller would be missing the first value yielded! IOW I want to support this use case but not make it the central driving use case for the API design. -- --Guido van Rossum (python.org/~guido) From raymond.hettinger at gmail.com Mon Oct 25 23:11:26 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Mon, 25 Oct 2010 14:11:26 -0700 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> <4CB8B2F5.2020507@ronadam.com> Message-ID: <5D61A18E-3BAE-43DA-B6A9-892BB7E925AB@gmail.com> >> Is it possible to calculate min(items) and max(items) simultaneously using >> generators? I don't see how... > > No, this is why the reduce-like approach is better for such cases. > Otherwise you keep trying to fit a square peg into a round hold. Which, of course, is neither good for the peg, nor for the hole ;-) no-square-pegs-in-round-holes-ly yours, Raymond From eric at trueblade.com Tue Oct 26 01:44:11 2010 From: eric at trueblade.com (Eric Smith) Date: Mon, 25 Oct 2010 19:44:11 -0400 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <20101023004508.6a6c1373@pitrou.net> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> Message-ID: <4CC6164B.5040201@trueblade.com> On 10/22/2010 6:45 PM, Antoine Pitrou wrote: > On Sat, 23 Oct 2010 00:36:30 +0200 > "M.-A. Lemburg" wrote: >> >> It may seem strange to have functions, methods or object constructors >> with more than 255 parameters, but as I said: when using code generators, >> the generators don't care whether they use 100 or 300 parameters. > > Why not make the code generators smarter? Because it makes more sense to fix it in one place than force the burden of coding around an arbitrary limit upon each such code generator. Eric. From jnoller at gmail.com Tue Oct 26 02:56:48 2010 From: jnoller at gmail.com (Jesse Noller) Date: Mon, 25 Oct 2010 20:56:48 -0400 Subject: [Python-ideas] PyCon 2011 Reminder: Call for Proposals, Posters and Tutorials - us.pycon.org Message-ID: PyCon 2011 Reminder: Call for Proposals, Posters and Tutorials - us.pycon.org =============================================== Well, it's October 25th! The leaves have turned and the deadline for submitting main-conference talk proposals expires in 7 days (November 1st, 2010)! We are currently accepting main conference talk proposals: http://us.pycon.org/2011/speaker/proposals/ Tutorial Proposals: http://us.pycon.org/2011/speaker/proposals/tutorials/ Poster Proposals: http://us.pycon.org/2011/speaker/posters/cfp/ PyCon 2011 will be held March 9th through the 17th, 2011 in Atlanta, Georgia. (Home of some of the best southern food you can possibly find on Earth!) 
The PyCon conference days will be March 11-13, preceded by two tutorial days (March 9-10), and followed by four days of development sprints (March 14-17). We are also proud to announce that we have booked our first Keynote speaker - Hilary Mason, her bio: "Hilary is the lead scientist at bit.ly, where she is finding sense in vast data sets. She is a former computer science professor with a background in machine learning and data mining, has published numerous academic papers, and regularly releases code on her personal site, http://www.hilarymason.com/. She has discovered two new species, loves to bake cookies, and asks way too many questions." We're really looking forward to having her this year as a keynote speaker! Remember, we've also added an "Extreme" talk track this year - no introduction, no fluff - only the pure technical meat! For more information on "Extreme Talks" see: http://us.pycon.org/2011/speaker/extreme/ We look forward to seeing you in Atlanta! Please also note - registration for PyCon 2011 will also be capped at a maximum of 1,500 delegates, including speakers. When registration opens (soon), you're going to want to make sure you register early! Speakers with accepted talks will have a guaranteed slot. We have published all registration prices online at: http://us.pycon.org/2011/tickets/ Important Dates November 1st, 2010: Talk proposals due. December 15th, 2010: Acceptance emails sent. January 19th, 2011: Early bird registration closes. March 9-10th, 2011: Tutorial days at PyCon. March 11-13th, 2011: PyCon main conference. March 14-17th, 2011: PyCon sprints days. Contact Emails: Van Lindberg (Conference Chair) - van at python.org Jesse Noller (Co-Chair) - jnoller at python.org PyCon Organizers list: pycon-organizers at python.org From rrr at ronadam.com Tue Oct 26 03:01:29 2010 From: rrr at ronadam.com (Ron Adam) Date: Mon, 25 Oct 2010 20:01:29 -0500 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC5E055.9010009@ronadam.com> Message-ID: <4CC62869.6090503@ronadam.com> On 10/25/2010 03:21 PM, Guido van Rossum wrote: > On Mon, Oct 25, 2010 at 12:53 PM, Ron Adam wrote: >> This is how my mind wants to write this. >> >> @consumer >> def reduce_collector(func): >> try: >> value = yield # No value to yield here. >> while True: >> value = func((yield), value) # or here. >> except YieldError: > > IIUC this works today if you substitute GeneratorExit and use > c.close() instead of next(c) below. (I don't recall why I split it out > into two different try/except blocks but it doesn't seem necessary. I tried it, c.close() doesn't work yet, but it does work with c.throw(GeneratorExit) :-) But that still uses yield to get the value. I used a different way of starting the generator that checks for a value being yielded. class GeneratorStartError(TypeError): pass def start(g): value = next(g) if value is not None: raise GeneratorStartError('started generator yielded a value') return g def reduce_collector(func): value = None try: value = yield while True: value = func((yield), value) except GeneratorExit: yield value def parallel_reduce(iterable, funcs): collectors = [start(reduce_collector(func)) for func in funcs] for v in iterable: for coll in collectors: coll.send(v) return [c.throw(GeneratorExit) for c in collectors] def main(): it = range(100) print(parallel_reduce(it, [min, max])) if __name__ == '__main__': main() > As for being able to distinguish next(c) from c.send(None), that's a > few language revisions too late. 
Perhaps more to the point, I don't > like that idea; it breaks the general treatment of things that return > None and throwing away values. (Long, long, long ago there were > situations where Python balked when you threw away a non-None value. > The feature was boohed off the island and it's better this way.) I'm not sure I follow the relationship you suggest. No values would be thrown away. Or did you mean that it should be ok to throw away values? I don't think it would prevent that either. What the YieldError case really does is give the generator a bit more control. As far as the calling routine that uses it is concerned, it just works. What happend inside the generator is completely transparent to the routine using the generator. If the calling routine does see a YieldError, it means it probably was a bug. >> # next was called not send. >> yield value > > I object to overloading yield for both a *resumable* operation and > returning a (final) value; that's why PEP 380 will let you write > "return value". (Many alternatives were considered but we always come > back to the simple "return value".) That works for me. I think lot of people will find it easy to learn. >> def parallel_reduce(iterable, funcs): >> collectors = [reduce_collector(func) for func in funcs] >> for v in iterable: >> for coll in collectors: >> coll.send(v) >> return [next(c) for c in collectors] > > I really object to using next() for both getting the return value and > the next yielded value. Jacob's proposal to spell this as c.close() > sounds much better to me. If c.close also throws the GeneratorExit and returns a value, that would be cool. Thanks. I take it that the objections have more to do with style and coding practices rather than what is possible. >> It nicely separates input and output parts of a co-function, which can be >> tricky to get right when you have to receive and send at the same yield. > > I don't think there was a problem with this in my code (or if there > was you didn't solve it). There wasn't in this code. This is one of those areas where it can be really difficult to find the correct way to express a co-function that does both input and output, but not necessarily in a fixed order. I begin almost any co-function with this at the top of the loop and later trim it up if parts of it aren't needed. out_value = None while True: in_value = yield out_value out_value = None ... # rest of loop to check in_value and modify out_value As long as None isn't a valid data item, this works most of the time. >> Maybe in Python 4k? Oh well. :-) > > Nah. I'm ok with that. Ron From guido at python.org Tue Oct 26 03:34:40 2010 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Oct 2010 18:34:40 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CC62869.6090503@ronadam.com> References: <4CC5E055.9010009@ronadam.com> <4CC62869.6090503@ronadam.com> Message-ID: On Mon, Oct 25, 2010 at 6:01 PM, Ron Adam wrote: > > On 10/25/2010 03:21 PM, Guido van Rossum wrote: >> >> On Mon, Oct 25, 2010 at 12:53 PM, Ron Adam ?wrote: >>> >>> This is how my mind wants to write this. >>> >>> @consumer >>> def reduce_collector(func): >>> ? ?try: >>> ? ? ? ?value = yield ? ? ? ? ? ?# No value to yield here. >>> ? ? ? ?while True: >>> ? ? ? ? ? ?value = func((yield), value) ? ? ? ?# or here. >>> ? ?except YieldError: >> >> IIUC this works today if you substitute GeneratorExit and use >> c.close() instead of next(c) below. 
(I don't recall why I split it out >> into two different try/except blocks but it doesn't seem necessary. > > I tried it, c.close() doesn't work yet, but it does work with > c.throw(GeneratorExit) :-) ? But that still uses yield to get the value. Yeah, sorry, I didn't mean to say that g.close() would return the value, but that you can use GeneratorExit here. g.close() *does* throw GeneratorExit (that's PEP 342); but it doesn't return the value yet. I like adding that to PEP 380 though. > I used a different way of starting the generator that checks for a value > being yielded. > > > > class GeneratorStartError(TypeError): pass > > def start(g): > ? ?value = next(g) > ? ?if value is not None: > ? ? ? ?raise GeneratorStartError('started generator yielded a value') > ? ?return g Whatever tickles your fancy. I just don't think this deserves a builtin. > def reduce_collector(func): > ? ?value = None > ? ?try: > ? ? ? ?value = yield > ? ? ? ?while True: > ? ? ? ? ? ?value = func((yield), value) > ? ?except GeneratorExit: > ? ? ? ?yield value Even today, I would much prefer using raise StopIteration(value) over yield value (or yield Return(value)). Reusing yield to return a value just looks wrong to me, there are too many ways to get confused (and this area doesn't need more of that :-). > def parallel_reduce(iterable, funcs): > ? ?collectors = [start(reduce_collector(func)) for func in funcs] > ? ?for v in iterable: > ? ? ? ?for coll in collectors: > ? ? ? ? ? ?coll.send(v) > ? ?return [c.throw(GeneratorExit) for c in collectors] > > def main(): > ? ?it = range(100) > ? ?print(parallel_reduce(it, [min, max])) > > if __name__ == '__main__': > ? ?main() > > > >> As for being able to distinguish next(c) from c.send(None), that's a >> few language revisions too late. Perhaps more to the point, I don't >> like that idea; it breaks the general treatment of things that return >> None and throwing away values. (Long, long, long ago there were >> situations where Python balked when you threw away a non-None value. >> The feature was boohed off the island and it's better this way.) > > I'm not sure I follow the relationship you suggest. ?No values would be > thrown away. ?Or did you mean that it should be ok to throw away values? I > don't think it would prevent that either. Well maybe I was misunderstanding your proposed YieldError. You didn't really explain it -- you just used it and assumed everybody understood what you meant. My assumption was that you meant for YieldError to be raised if yield was used as an expression (not a statement) but next() was called instead of send(). My response was that it's ugly to make a distinction between x = del x # Or just not use x and But maybe I misunderstood what you meant. > What the YieldError case really does is give the generator a bit more > control. ?As far as the calling routine that uses it is concerned, it just > works. ?What happend inside the generator is completely transparent to the > routine using the generator. ?If the calling routine does see a YieldError, > it means it probably was a bug. That sounds pretty close to the rules for GeneratorExit. >>> ? ? ? ?# next was called not send. >>> ? ? ? ?yield value >> >> I object to overloading yield for both a *resumable* operation and >> returning a (final) value; that's why PEP 380 will let you write >> "return value". (Many alternatives were considered but we always come >> back to the simple "return value".) > > That works for me. ?I think lot of people will find it easy to learn. 
> > >>> def parallel_reduce(iterable, funcs): >>> ? ?collectors = [reduce_collector(func) for func in funcs] >>> ? ?for v in iterable: >>> ? ? ? ?for coll in collectors: >>> ? ? ? ? ? ?coll.send(v) >>> ? ?return [next(c) for c in collectors] >> >> I really object to using next() for both getting the return value and >> the next yielded value. Jacob's proposal to spell this as c.close() >> sounds much better to me. > > If c.close also throws the GeneratorExit and returns a value, that would be > cool. Thanks. It does throw GeneratorExit (that's the whole reason for GeneratorExit's existence :-). > I take it that the objections have more to do with style and coding > practices rather than what is possible. Yeah, it's my gut and that's hard to reason with but usually right. (See also: http://www.amazon.com/How-We-Decide-Jonah-Lehrer/dp/0618620117 ) >>> It nicely separates input and output parts of a co-function, which can be >>> tricky to get right when you have to receive and send at the same yield. >> >> I don't think there was a problem with this in my code (or if there >> was you didn't solve it). > > There wasn't in this code. ?This is one of those areas where it can be > really difficult to find the correct way to express a co-function that does > both input and output, but not necessarily in a fixed order. Maybe for that one should use a "channel" abstraction, like Go (and before it, CSP)? I noticed that Monocle (http://github.com/saucelabs/monocle) has a demo of that in its "experimental" module (but the example is kind of silly). > I begin almost any co-function with this at the top of the loop and later > trim it up if parts of it aren't needed. > > ? out_value = None > ? while True: > ? ? ?in_value = yield out_value > ? ? ?out_value = None > ? ? ?... > ? ? ?# rest of loop to check in_value and modify out_value > > As long as None isn't a valid data item, this works most of the time. > > >>> Maybe in Python 4k? ?Oh well. :-) >> >> Nah. > > I'm ok with that. > > Ron > -- --Guido van Rossum (python.org/~guido) From rrr at ronadam.com Tue Oct 26 03:36:13 2010 From: rrr at ronadam.com (Ron Adam) Date: Mon, 25 Oct 2010 20:36:13 -0500 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: Message-ID: <4CC6308D.9090201@ronadam.com> On 10/25/2010 10:13 AM, Guido van Rossum wrote: > BTW, while I have you, what do you think of Greg's "cofunctions" proposal? Well, my .5 cents worth, for what it's worth. I'm still undecided. Because of the many optimazations python has had in the last year on speeding up attribute access, (thanks guys!), classes don't get penalized as much as they use to be. So I'd like to see some speed comparisons with using class's vs co-functions. I think the class's are much easier to use and may not be as slow as some may think. Ron From jh at improva.dk Tue Oct 26 03:35:33 2010 From: jh at improva.dk (Jacob Holm) Date: Tue, 26 Oct 2010 03:35:33 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: Message-ID: <4CC63065.9040507@improva.dk> On 2010-10-25 17:13, Guido van Rossum wrote: > On Mon, Oct 25, 2010 at 3:19 AM, Jacob Holm wrote: >> More interesting (to me at least) is that this is an excellent example >> of why I would like to see a version of PEP380 where "close" on a >> generator can return a value (AFAICT the version of PEP380 on >> http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not >> mention this possibility, or even link to the heated discussion we had >> on python-ideas around march/april 2009). 
> > Can you dig up the link here? > > I recall that discussion but I don't recall a clear conclusion coming > from it -- just heated debate. > Well here is a recap of the end of the discussion about how to handle generator return values and g.close(). Gregs conclusion that g.close() should not return a value: http://mail.python.org/pipermail/python-ideas/2009-April/003959.html My reply (ordered list of ways to handle return values in generators): http://mail.python.org/pipermail/python-ideas/2009-April/003984.html Some arguments for storing the return value on the generator: http://mail.python.org/pipermail/python-ideas/2009-April/004008.html Some support for that idea from Nick: http://mail.python.org/pipermail/python-ideas/2009-April/004012.html You're not convinced by Gregs argument: http://mail.python.org/pipermail/python-ideas/2009-April/003985.html Greg arguing that using GeneratorExit this way is bad: http://mail.python.org/pipermail/python-ideas/2009-April/004001.html You add a new complete proposal including g.close() returning a value: http://mail.python.org/pipermail/python-ideas/2009-April/003944.html I point out some problems e.g. with the handling of return values: http://mail.python.org/pipermail/python-ideas/2009-April/003981.html Then the discussion goes on at length about the problems of using a coroutine decorator with yield-from. At one point I am arguing for generators to keep a reference to the last value yielded: http://mail.python.org/pipermail/python-ideas/2009-April/004032.html And you reply that storing "unnatural" state on the generator or frame object is a bad idea: http://mail.python.org/pipermail/python-ideas/2009-April/004034.html From which I concluded that having g.close() return a value (the same on each successive call) would be a no-go: http://mail.python.org/pipermail/python-ideas/2009-April/004040.html Which you confirmed: http://mail.python.org/pipermail/python-ideas/2009-April/004041.html The latest draft (#13) I have been able to find was announced in http://mail.python.org/pipermail/python-ideas/2009-April/004189.html And can be found at http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/attachment-0001.txt I had some later suggestions for how to change the expansion, see e.g. http://mail.python.org/pipermail/python-ideas/2009-April/004195.html (I find that version easier to reason about even now 1? years later) > Based on my example I have to agree that returning a value from > close() would be nice. There is a little detail, how multiple > arguments to StopIteration should be interpreted, but that's not so > important if it's being raised by a return statement. > Right. I would assume that the return value of g.close() if we ever got one was to be taken from the first argument to the StopIteration. What killed the proposal last time was the question of what should happen when you call g.close() on an exhausted generator. My preferred solution was (and is) that the generator should save the value from the terminating StopIteration (or None if it ended by some other means) and that g.close() should return that value each time and g.next(), g.send() and g.throw() should raise a StopIteration with the value. Unless you have changed your position on storing the return value, that solution is dead in the water. For this use case we don't actually need to call close() on an exhausted generator so perhaps there is *some* use in only returning a value when the generator is actually running. Here's a stupid idea... 
let g.close take an optional argument that it can return if the generator is already exhausted and let it return the value from the StopIteration otherwise. def close(self, default=None): if self.gi_frame is None: return default try: self.throw(GeneratorExit) except StopIteration as e: return e.args[0] except GeneratorExit: return None else: raise RuntimeError('generator ignored GeneratorExit') > I totally agree that not having to call throw() and catch whatever it > bounces back is much nicer. (Now I wish there was a way to avoid the > "try..except GeneratorExit" construct in the generator, but I think I > should stop while I'm ahead. :-) > > The interesting thing is that I've been dealing with generators used > as coroutines or tasks intensely on and off since July, and I haven't > had a single need for any of the three patterns that this example > happened to demonstrate: > > - the need to "prime" the generator in a separate step > - throwing and catching GeneratorExit > - getting a value from close() > > (I did have a lot of use for send(), throw(), and extracting a value > from StopIteration.) > I think these things (at least priming and close()) are mostly an issue when using coroutines from non-coroutines. That means it is likely to be common in small examples where you write the whole program, but less common when you are writing small(ish) parts of a larger framework. Throwing and catching GeneratorExit is not common, and according to some shouldn't be used for this purpose at all. > In my context, generators are used to emulate concurrently running > tasks, and "yield" is always used to mean "block until this piece of > async I/O is complete, and wake me up with the result". This is > similar to the "classic" trampoline code found in PEP 342. > > In fact, when I wrote the example for this thread, I fumbled a bit > because the use of generators there is different than I had been using > them (though it was no doubt thanks to having worked with them > intensely that I came up with the example quickly). > This sounds a lot like working in a "larger framework" to me. :) > So, it is clear that generators are extremely versatile, and PEP 380 > deserves several good use cases to explain all the API subtleties. > I like your example because it matches the way I would have used generators to solve it. OTOH, it is not hard to rewrite parallel_reduce as a traditional function. In fact, the result is a bit shorter and quite a bit faster so it is not a good example of what you need generators for. > BTW, while I have you, what do you think of Greg's "cofunctions" proposal? > I'll have to get back to you on that. - Jacob From guido at python.org Tue Oct 26 05:14:34 2010 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Oct 2010 20:14:34 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CC63065.9040507@improva.dk> References: <4CC63065.9040507@improva.dk> Message-ID: On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm wrote: > On 2010-10-25 17:13, Guido van Rossum wrote: >> On Mon, Oct 25, 2010 at 3:19 AM, Jacob Holm wrote: >>> More interesting (to me at least) is that this is an excellent example >>> of why I would like to see a version of PEP380 where "close" on a >>> generator can return a value (AFAICT the version of PEP380 on >>> http://www.python.org/dev/peps/pep-0380 is not up-to-date and does not >>> mention this possibility, or even link to the heated discussion we had >>> on python-ideas around march/april 2009). >> >> Can you dig up the link here? 
>> >> I recall that discussion but I don't recall a clear conclusion coming >> from it -- just heated debate. >> > > > Well here is a recap of the end of the discussion about how to handle > generator return values and g.close(). Thanks, very thorough! > ?Gregs conclusion that g.close() should not return a value: > ?http://mail.python.org/pipermail/python-ideas/2009-April/003959.html > > ?My reply (ordered list of ways to handle return values in generators): > ?http://mail.python.org/pipermail/python-ideas/2009-April/003984.html > > ?Some arguments for storing the return value on the generator: > ?http://mail.python.org/pipermail/python-ideas/2009-April/004008.html > > ?Some support for that idea from Nick: > ?http://mail.python.org/pipermail/python-ideas/2009-April/004012.html > > ?You're not convinced by Gregs argument: > ?http://mail.python.org/pipermail/python-ideas/2009-April/003985.html > > ?Greg arguing that using GeneratorExit this way is bad: > ?http://mail.python.org/pipermail/python-ideas/2009-April/004001.html > > ?You add a new complete proposal including g.close() returning a value: > ?http://mail.python.org/pipermail/python-ideas/2009-April/003944.html > > ?I point out some problems e.g. with the handling of return values: > ?http://mail.python.org/pipermail/python-ideas/2009-April/003981.html > > ?Then the discussion goes on at length about the problems of using a > ?coroutine decorator with yield-from. ?At one point I am arguing for > ?generators to keep a reference to the last value yielded: > ?http://mail.python.org/pipermail/python-ideas/2009-April/004032.html > > ?And you reply that storing "unnatural" state on the generator or > ?frame object is a bad idea: > ?http://mail.python.org/pipermail/python-ideas/2009-April/004034.html > > ?From which I concluded that having g.close() return a value (the same > ?on each successive call) would be a no-go: > ?http://mail.python.org/pipermail/python-ideas/2009-April/004040.html > > ?Which you confirmed: > ?http://mail.python.org/pipermail/python-ideas/2009-April/004041.html > > > The latest draft (#13) I have been able to find was announced in > http://mail.python.org/pipermail/python-ideas/2009-April/004189.html > > And can be found at > http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/attachment-0001.txt Hmm... It does look like the PEP editors dropped the ball on this one (or maybe Greg didn't mail it directly to them). It doesn't seem there are substantial differences with the published version at http://www.python.org/dev/peps/pep-0380/ though, close() still doesn't return a value. > I had some later suggestions for how to change the expansion, see e.g. > http://mail.python.org/pipermail/python-ideas/2009-April/004195.html ?(I > find that version easier to reason about even now 1? years later) Hopefully you & Greg can agree on a new draft. I like this to make progress and I really want this to appear in 3.3. But I don't have the time to do the editing and reviewing of the PEP. >> Based on my example I have to agree that returning a value from >> close() would be nice. There is a little detail, how multiple >> arguments to StopIteration should be interpreted, but that's not so >> important if it's being raised by a return statement. >> > > Right. ?I would assume that the return value of g.close() if we ever got > one was to be taken from the first argument to the StopIteration. That's a reasonable position. 
Monocle currently makes it so that using yield Return(x, y, z) [which in my view should be spelled raise Return(x, y, z0] is equivalent to return x, y, z, but there's no real need if the latter syntax is actually supported. > What killed the proposal last time was the question of what should > happen when you call g.close() on an exhausted generator. ?My preferred > solution was (and is) that the generator should save the value from the > terminating StopIteration (or None if it ended by some other means) and > that g.close() should return that value each time and g.next(), g.send() > and g.throw() should raise a StopIteration with the value. > Unless you have changed your position on storing the return value, that > solution is dead in the water. I haven't changed my position. Closing a file twice doesn't do anything the second time either. > For this use case we don't actually need to call close() on an exhausted > generator so perhaps there is *some* use in only returning a value when > the generator is actually running. :-) > Here's a stupid idea... let g.close take an optional argument that it > can return if the generator is already exhausted and let it return the > value from the StopIteration otherwise. > > def close(self, default=None): > ? ?if self.gi_frame is None: > ? ? ? ?return default > ? ?try: > ? ? ? ?self.throw(GeneratorExit) > ? ?except StopIteration as e: > ? ? ? ?return e.args[0] > ? ?except GeneratorExit: > ? ? ? ?return None > ? ?else: > ? ? ? ?raise RuntimeError('generator ignored GeneratorExit') You'll have to explain why None isn't sufficient. >> I totally agree that not having to call throw() and catch whatever it >> bounces back is much nicer. (Now I wish there was a way to avoid the >> "try..except GeneratorExit" construct in the generator, but I think I >> should stop while I'm ahead. :-) >> >> The interesting thing is that I've been dealing with generators used >> as coroutines or tasks intensely on and off since July, and I haven't >> had a single need for any of the three patterns that this example >> happened to demonstrate: >> >> - the need to "prime" the generator in a separate step >> - throwing and catching GeneratorExit >> - getting a value from close() >> >> (I did have a lot of use for send(), throw(), and extracting a value >> from StopIteration.) >> > > I think these things (at least priming and close()) are mostly an issue > when using coroutines from non-coroutines. ?That means it is likely to > be common in small examples where you write the whole program, but less > common when you are writing small(ish) parts of a larger framework. > > Throwing and catching GeneratorExit is not common, and according to some > shouldn't be used for this purpose at all. Well, *throwing* it is close()'s job. And *catching* it ought to be pretty rare. Maybe this idiom would be better: def sum(): total = 0 try: while True: value = yield total += value finally: return total >> In my context, generators are used to emulate concurrently running >> tasks, and "yield" is always used to mean "block until this piece of >> async I/O is complete, and wake me up with the result". This is >> similar to the "classic" trampoline code found in PEP 342. >> >> In fact, when I wrote the example for this thread, I fumbled a bit >> because the use of generators there is different than I had been using >> them (though it was no doubt thanks to having worked with them >> intensely that I came up with the example quickly). 
>> > > This sounds a lot like working in a "larger framework" to me. :) Possibly. I realize that I have code something like this: next_input = None while ...not done yet...: output = gen.send(next_input) next_input = ...computed from output... # many variations which quite naturally computes next_input from output but it does start out with an initial value of None for next_input in order to prime the pump. >> So, it is clear that generators are extremely versatile, and PEP 380 >> deserves several good use cases to explain all the API subtleties. >> > > I like your example because it matches the way I would have used > generators to solve it. ?OTOH, it is not hard to rewrite parallel_reduce > as a traditional function. ?In fact, the result is a bit shorter and > quite a bit faster so it is not a good example of what you need > generators for. I'm not sure I understand. Maybe you meant to rewrite it as a class? There's some state that wouldn't have a good place to live without either a class or a (generator) stackframe to survive. >> BTW, while I have you, what do you think of Greg's "cofunctions" proposal? >> > > I'll have to get back to you on that. > > - Jacob > -- --Guido van Rossum (python.org/~guido) From guido at python.org Tue Oct 26 05:25:08 2010 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Oct 2010 20:25:08 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> Message-ID: By the way, here's how to emulate the value-returning-close() on a generator, assuming the generator uses raise StopIteration(x) to mean return x: def gclose(gen): try: gen.throw(GeneratorExit) except StopIteration, err: if err.args: return err.args[0] except GeneratorExit: pass return None I like this because it's fairly straightforward (except for the detail of having to also catch GeneratorExit). In fact it would be a really simple change to gen_close() in genobject.c -- the only change needed there would be to return err.args[0]. I like small evolutionary improvements to APIs. -- --Guido van Rossum (python.org/~guido) From rrr at ronadam.com Tue Oct 26 08:57:32 2010 From: rrr at ronadam.com (Ron Adam) Date: Tue, 26 Oct 2010 01:57:32 -0500 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC5E055.9010009@ronadam.com> <4CC62869.6090503@ronadam.com> Message-ID: <4CC67BDC.8010601@ronadam.com> On 10/25/2010 08:34 PM, Guido van Rossum wrote: > On Mon, Oct 25, 2010 at 6:01 PM, Ron Adam wrote: >> >> On 10/25/2010 03:21 PM, Guido van Rossum wrote: >>> >>> On Mon, Oct 25, 2010 at 12:53 PM, Ron Adam wrote: >>>> >>>> This is how my mind wants to write this. >>>> >>>> @consumer >>>> def reduce_collector(func): >>>> try: >>>> value = yield # No value to yield here. >>>> while True: >>>> value = func((yield), value) # or here. >>>> except YieldError: > Well maybe I was misunderstanding your proposed YieldError. You didn't > really explain it -- you just used it and assumed everybody understood > what you meant. Sorry about that, it is too easy to think something is clear on these boards when in fact it's isn't as clear as we (I in this case) think it is. hmmm ... I feel a bit embarrassed because I wasn't really meaning to try to convince you to do this. It's just what first came to mind when I asked myself, "if there was an easier way to write it, how would I do it?". As you pointed out, it isn't that much different from the c.close() example Jacob gave. 
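As an aside on the gclose() helper shown above, a rough usage sketch pairing it with a made-up summing generator that follows the thread's convention of raising StopIteration(x) to mean "return x" (so it is written for the Python of that era, where raising StopIteration inside a generator is allowed):

def summer():
    total = 0
    try:
        while True:
            value = yield
            total += value
    except GeneratorExit:
        raise StopIteration(total)   # "return" the running total

g = summer()
next(g)             # prime the generator
for x in (1, 2, 3):
    g.send(x)
print(gclose(g))    # -> 6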
To me, that is a nice indication that you (and Jacob and Greg) are on the right track. I think YieldError is an interesting concept, but it requires too many changes to make it work. ( I just wish I could be of more help here. :-/ ) Cheers, Ron From __peter__ at web.de Tue Oct 26 10:12:54 2010 From: __peter__ at web.de (Peter Otten) Date: Tue, 26 Oct 2010 10:12:54 +0200 Subject: [Python-ideas] Possible PEP 380 tweak References: <4CC63065.9040507@improva.dk> Message-ID: Guido van Rossum wrote: >> I like your example because it matches the way I would have used >> generators to solve it. OTOH, it is not hard to rewrite parallel_reduce >> as a traditional function. In fact, the result is a bit shorter and >> quite a bit faster so it is not a good example of what you need >> generators for. > > I'm not sure I understand. Maybe you meant to rewrite it as a class? > There's some state that wouldn't have a good place to live without > either a class or a (generator) stackframe to survive. How about def parallel_reduce(items, funcs): items = iter(items) try: first = next(items) except StopIteration: raise TypeError accu = [first] * len(funcs) for b in items: accu = [f(a, b) for f, a in zip(funcs, accu)] return accu Peter From solipsis at pitrou.net Tue Oct 26 10:50:18 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 26 Oct 2010 10:50:18 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <4CC6164B.5040201@trueblade.com> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> Message-ID: <1288083018.3547.0.camel@localhost.localdomain> Le lundi 25 octobre 2010 ? 19:44 -0400, Eric Smith a ?crit : > On 10/22/2010 6:45 PM, Antoine Pitrou wrote: > > On Sat, 23 Oct 2010 00:36:30 +0200 > > "M.-A. Lemburg" wrote: > >> > >> It may seem strange to have functions, methods or object constructors > >> with more than 255 parameters, but as I said: when using code generators, > >> the generators don't care whether they use 100 or 300 parameters. > > > > Why not make the code generators smarter? > > Because it makes more sense to fix it in one place than force the burden > of coding around an arbitrary limit upon each such code generator. Sure, but in the absence of anyone providing a patch for CPython, it is still a possible resolution. Regards Antoine. From mal at egenix.com Tue Oct 26 11:31:41 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 26 Oct 2010 11:31:41 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <1288083018.3547.0.camel@localhost.localdomain> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> <1288083018.3547.0.camel@localhost.localdomain> Message-ID: <4CC69FFD.7080102@egenix.com> Antoine Pitrou wrote: > Le lundi 25 octobre 2010 ? 19:44 -0400, Eric Smith a ?crit : >> On 10/22/2010 6:45 PM, Antoine Pitrou wrote: >>> On Sat, 23 Oct 2010 00:36:30 +0200 >>> "M.-A. Lemburg" wrote: >>>> >>>> It may seem strange to have functions, methods or object constructors >>>> with more than 255 parameters, but as I said: when using code generators, >>>> the generators don't care whether they use 100 or 300 parameters. >>> >>> Why not make the code generators smarter? 
I don't see a way to work around the limitation without starting every single wrapper object's .__init__() with a test routine that checks the parameters in Python - and that's not really feasible since it would kill performance. You'd also have to move all **kws parameters to locals in order to emulate the normal Python parameter invokation of the method. >> Because it makes more sense to fix it in one place than force the burden >> of coding around an arbitrary limit upon each such code generator. > > Sure, but in the absence of anyone providing a patch for CPython, it is > still a possible resolution. Cesare already posted a patch based on using EXTENDED_ARG. Should we reopen that old ticket or create a new one ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 26 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From solipsis at pitrou.net Tue Oct 26 11:44:38 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 26 Oct 2010 11:44:38 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <4CC69FFD.7080102@egenix.com> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> <1288083018.3547.0.camel@localhost.localdomain> <4CC69FFD.7080102@egenix.com> Message-ID: <1288086278.3547.8.camel@localhost.localdomain> > >>> Why not make the code generators smarter? > > I don't see a way to work around the > limitation without starting every single wrapper object's .__init__() > with a test routine that checks the parameters in Python - and that's > not really feasible since it would kill performance. Have you considered that having 200 or 300 keyword arguments might already kill performance? I don't think our function invocation code is tuned for such a number. > Cesare already posted a patch based on using EXTENDED_ARG. Should we > reopen that old ticket or create a new one ? Was there an old ticket open? I have only seen a piece of code on python-ideas. Regardless, whether one or the other doesn't really matter, as long as it's recorded somewhere :) Regards Antoine. From cesare.di.mauro at gmail.com Tue Oct 26 11:45:56 2010 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Tue, 26 Oct 2010 11:45:56 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <4CC69FFD.7080102@egenix.com> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> <1288083018.3547.0.camel@localhost.localdomain> <4CC69FFD.7080102@egenix.com> Message-ID: 2010/10/26 M.-A. Lemburg > Antoine Pitrou wrote: > > Sure, but in the absence of anyone providing a patch for CPython, it is > > still a possible resolution. > > Cesare already posted a patch based on using EXTENDED_ARG. 
Should we > reopen that old ticket or create a new one ? > > -- > Marc-Andre Lemburg > I can provide another patch that will not use EXTENDED_ARG (no VM changes), and uses *args and/or **kwargs function calls when there are more than 255 arguments or keyword arguments. But I need some days. If needed, I'll post it at most on this week-end. Cesare -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Oct 26 11:51:52 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 26 Oct 2010 19:51:52 +1000 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: References: <20101025154932.06be2faf@o> Message-ID: On Tue, Oct 26, 2010 at 2:00 AM, Laurens Van Houtven wrote: > Hm. I suppose the need for this would be slightly mitigated if I understood > why str.join does not try to convert the elements of the iterable it is > passed to strs (and analogously for unicode). > Does anyone know what the rationale for that is? To elaborate on Guido's answer, omitting automatic coercion makes it fairly easy to coerce via str, repr or ascii (as appropriate), or else to implicitly assert that all the inputs should be strings (or buffers) already. Once you put automatic coercion inside str.join, the last option becomes comparatively hard to do. Note that easy coercion in str.join is one of the use cases that prompted us to keep map as a builtin though: sep.join(map(str, seq)) sep.join(map(repr, seq)) sep.join(map(ascii, seq)) sep.join(seq) The genexp equivalents are both slower and harder to read than the simple map invocations. To elaborate on Terry's answer as well - when join was the function string.join, people often had troubling remembering if the sequence or the separator argument came first. With the str method, while some people may find it odd to have the method invocation on the separator, they typically don't forget the order once they learn it for the first time. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From steve at pearwood.info Tue Oct 26 12:10:59 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 26 Oct 2010 21:10:59 +1100 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> <4CB8B2F5.2020507@ronadam.com> Message-ID: <4CC6A933.3080605@pearwood.info> Guido van Rossum wrote: > This should not require threads. 
> > Here's a bare-bones sketch using generators: > > def reduce_collector(func): > outcome = None > while True: > try: > val = yield > except GeneratorExit: > break > if outcome is None: > outcome = val > else: > outcome = func(outcome, val) > raise StopIteration(outcome) > > def parallel_reduce(iterable, funcs): > collectors = [reduce_collector(func) for func in funcs] > values = [None for _ in collectors] > for i, coll in enumerate(collectors): > try: > next(coll) > except StopIteration as err: > values[i] = err.args[0] > collectors[i] = None > for val in iterable: > for i, coll in enumerate(collectors): > if coll is not None: > try: > coll.send(val) > except StopIteration as err: > values[i] = err.args[0] > collectors[i] = None > for i, coll in enumerate(collectors): > if coll is not None: > try: > coll.throw(GeneratorExit) > except StopIteration as err: > values[i] = err.args[0] > return values Perhaps I'm missing something, but to my mind, that's an awfully complicated solution for such a simple problem. Here's my attempt: def multi_reduce(iterable, funcs): it = iter(iterable) collectors = [next(it)]*len(funcs) for i, f in enumerate(funcs): x = next(it) collectors[i] = f(collectors[i], x) return collectors I've called it multi_reduce rather than parallel_reduce, because it doesn't execute the functions in parallel. By my testing on Python 3.1.1, multi_reduce is consistently ~30% faster than the generator based solution for lists with 1000 - 10,000,000 items. So what am I missing? What does your parallel_reduce give us that multi_reduce doesn't? -- Steven From ncoghlan at gmail.com Tue Oct 26 12:36:08 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 26 Oct 2010 20:36:08 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> Message-ID: On Tue, Oct 26, 2010 at 1:14 PM, Guido van Rossum wrote: > On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm wrote: >> Throwing and catching GeneratorExit is not common, and according to some >> shouldn't be used for this purpose at all. > > Well, *throwing* it is close()'s job. And *catching* it ought to be > pretty rare. Maybe this idiom would be better: > > def sum(): > ?total = 0 > ?try: > ? ?while True: > ? ? ?value = yield > ? ? ?total += value > ?finally: > ? ?return total Rereading my previous post that Jacob linked, I'm still a little uncomfortable with the idea of people deliberately catching GeneratorExit to turn it into a normal value return to be reported by close(). That said, I'm even less comfortable with the idea of encouraging the moral equivalent of a bare except clause :) I see two realistic options here: 1. Use GeneratorExit for this, have g.close() return a value and I (and others that agree with me) just get the heck over it. 2. Add a new GeneratorReturn exception and a new g.finish() method that follows the same basic algorithm Guido suggested, only with a different exception type: class GeneratorReturn(Exception): # Note: ordinary exception, unlike GeneratorExit pass def finish(gen): try: gen.throw(GeneratorReturn) raise RuntimeError("Generator ignored GeneratorReturn") except StopIteration as err: if err.args: return err.args[0] except GeneratorReturn: pass return None (Why "finish" as the suggested name for the method? I'd prefer "return", but that's a keyword and "return_" is somewhat ugly. 
Pairing GeneratorReturn with finish() is my second choice, for the "OK, time to wrap things up and complete your assigned task" connotations, as compared to the "drop everything and clean up the mess" connotations of GeneratorExit and close()) I'd personally be +1 on option 2 (since it addresses the immediate use case while maintaining appropriate separation of concerns between guaranteed resource cleanup and graceful completion of coroutines) and -0 on option 1 (unsurprising, given my previously stated objections to failing to maintain appropriate separation of concerns). (I should note that this differs from the previous suggestion of a GeneratorReturn exception in the context of PEP 380. Those suggestions were to use it as a replacement for StopIteration when a generator contained a return statement. The suggestion here is to instead use it as a replacement for GeneratorExit in order to request prompt-but-graceful completion of a generator rather than just bailing out immediately). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From mal at egenix.com Tue Oct 26 12:38:44 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 26 Oct 2010 12:38:44 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> <1288083018.3547.0.camel@localhost.localdomain> <4CC69FFD.7080102@egenix.com> Message-ID: <4CC6AFB4.3040003@egenix.com> Cesare Di Mauro wrote: > 2010/10/26 M.-A. Lemburg > >> Antoine Pitrou wrote: >>> Sure, but in the absence of anyone providing a patch for CPython, it is >>> still a possible resolution. >> >> Cesare already posted a patch based on using EXTENDED_ARG. Should we >> reopen that old ticket or create a new one ? >> >> -- >> Marc-Andre Lemburg >> > > I can provide another patch that will not use EXTENDED_ARG (no VM changes), > and uses *args and/or **kwargs function calls when there are more than 255 > arguments or keyword arguments. > > But I need some days. > > If needed, I'll post it at most on this week-end. You mean a version that pushes the *args tuple and **kws dict on the stack and then uses those for calling the function/method ? I think that would be a lot more efficient than pushing/popping hundreds of parameters on/off the stack. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 26 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From cesare.di.mauro at gmail.com Tue Oct 26 13:10:56 2010 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Tue, 26 Oct 2010 13:10:56 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <4CC6AFB4.3040003@egenix.com> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> <1288083018.3547.0.camel@localhost.localdomain> <4CC69FFD.7080102@egenix.com> <4CC6AFB4.3040003@egenix.com> Message-ID: 2010/10/26 M.-A. Lemburg > Cesare Di Mauro wrote: > > I can provide another patch that will not use EXTENDED_ARG (no VM > changes), > > and uses *args and/or **kwargs function calls when there are more than > 255 > > arguments or keyword arguments. > > > > But I need some days. > > > > If needed, I'll post it at most on this week-end. > > You mean a version that pushes the *args tuple and **kws dict > on the stack and then uses those for calling the function/method ? > > I think that would be a lot more efficient than pushing/popping > hundreds of parameters on/off the stack. > > -- > Marc-Andre Lemburg > I was referring to the solution (which I prefer) that I proposed answering to Greg, two days ago. Unfortunately, the stack must be used whatever the solution we will use. Pushing the "final" tuple and/or dictionary is a possible optimization, but we can use it only when we have a tuple or dict of constants; otherwise we need to use the stack. Good case: f(1, 2, 3, a = 1, b = 2) We can push (1, 2, 3) tuple and {'a' : 1, 'b' : 2}, then calling f with CALL_FUNCTION_VAR_KW opcode passing narg = nkarg = 0. Worst case: f(1, x, 3, a = x, b = 2) We can't push the tuple and dict as a whole, because they need first to be built using the stack. The good case is possible, and I have already done some work in wpython collecting constants on parameters push (even partial constant sequences), but some additional work must be done recognizing costants-only tuple / dict. However, the worst case rest unresolved. Cesare -------------- next part -------------- An HTML attachment was scrubbed... URL: From bborcic at gmail.com Tue Oct 26 14:04:49 2010 From: bborcic at gmail.com (Boris Borcic) Date: Tue, 26 Oct 2010 14:04:49 +0200 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: References: <20101025154932.06be2faf@o> Message-ID: Nick Coghlan wrote: > With the str method, while some > people may find it odd to have the method invocation on the separator, > they typically don't forget the order once they learn it for the first > time. OTOH, it is a pain that join and split aren't *both* methods on the separator. Imho, 71% of what makes it strange for join to be a method on the separator, is that split doesn't follow the same convention. Cheers, BB From jh at improva.dk Tue Oct 26 14:22:11 2010 From: jh at improva.dk (Jacob Holm) Date: Tue, 26 Oct 2010 14:22:11 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> Message-ID: <4CC6C7F3.6090405@improva.dk> On 2010-10-26 05:14, Guido van Rossum wrote: > On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm wrote: >> On 2010-10-25 17:13, Guido van Rossum wrote: >>> Can you dig up the link here? 
>> >> Well here is a recap of the end of the discussion about how to handle >> generator return values and g.close(). > > Thanks, very thorough! I had to read through it myself to remember what actually happened, and thought you (and the rest of the world) might as well benefit from the notes I made. >> The latest draft (#13) I have been able to find was announced in >> http://mail.python.org/pipermail/python-ideas/2009-April/004189.html >> >> And can be found at >> http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/attachment-0001.txt > > Hmm... It does look like the PEP editors dropped the ball on this one > (or maybe Greg didn't mail it directly to them). It doesn't seem there > are substantial differences with the published version at > http://www.python.org/dev/peps/pep-0380/ though, close() still doesn't > return a value. > IIRC, there are a few minor semantic differences in how non-generators are handled. I haven't made a detailed comparison. >> I had some later suggestions for how to change the expansion, see e.g. >> http://mail.python.org/pipermail/python-ideas/2009-April/004195.html (I >> find that version easier to reason about even now 1? years later) > > Hopefully you & Greg can agree on a new draft. I like this to make > progress and I really want this to appear in 3.3. But I don't have the > time to do the editing and reviewing of the PEP. IIRC, this was just a presentation issue - the two expansions were supposed to be equivalent. It might become relevant if we want to change something in the definition, because we need a common base to discuss from. My version is (intended to be) simpler to reason about in the sense that things that should be handled the same are only written once. >> What killed the proposal last time was the question of what should >> happen when you call g.close() on an exhausted generator. My preferred >> solution was (and is) that the generator should save the value from the >> terminating StopIteration (or None if it ended by some other means) and >> that g.close() should return that value each time and g.next(), g.send() >> and g.throw() should raise a StopIteration with the value. >> Unless you have changed your position on storing the return value, that >> solution is dead in the water. > > I haven't changed my position. Closing a file twice doesn't do > anything the second time either. > Ok >> Here's a stupid idea... let g.close take an optional argument that it >> can return if the generator is already exhausted and let it return the >> value from the StopIteration otherwise. >> >> def close(self, default=None): >> if self.gi_frame is None: >> return default >> try: >> self.throw(GeneratorExit) >> except StopIteration as e: >> return e.args[0] >> except GeneratorExit: >> return None >> else: >> raise RuntimeError('generator ignored GeneratorExit') > > You'll have to explain why None isn't sufficient. > It is not really necessary, but seemed "cleaner" somehow. Think of "g.close(default)" as "get me the result if possible, and this default otherwise". Then think of dict.get()... An even cleaner solution might be Nicks "g.finish()" proposal, which I will comment on separately. >> I think these things (at least priming and close()) are mostly an issue >> when using coroutines from non-coroutines. That means it is likely to >> be common in small examples where you write the whole program, but less >> common when you are writing small(ish) parts of a larger framework. 
>> >> Throwing and catching GeneratorExit is not common, and according to some >> shouldn't be used for this purpose at all. > > Well, *throwing* it is close()'s job. And *catching* it ought to be > pretty rare. Maybe this idiom would be better: > > def sum(): > total = 0 > try: > while True: > value = yield > total += value > finally: > return total > This is essentially the same as a bare except. I think there is general agreement that that is a bad idea. >>> So, it is clear that generators are extremely versatile, and PEP 380 >>> deserves several good use cases to explain all the API subtleties. >>> >> >> I like your example because it matches the way I would have used >> generators to solve it. OTOH, it is not hard to rewrite parallel_reduce >> as a traditional function. In fact, the result is a bit shorter and >> quite a bit faster so it is not a good example of what you need >> generators for. > > I'm not sure I understand. Maybe you meant to rewrite it as a class? > There's some state that wouldn't have a good place to live without > either a class or a (generator) stackframe to survive. > See the reply by Peter Otten (and my reply to him). You mentioned some possible extensions though. At a guess, at least some of these would benefit greatly from the use of generators. Maybe such an extension would be a better example? - Jacob From ncoghlan at gmail.com Tue Oct 26 14:35:24 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 26 Oct 2010 22:35:24 +1000 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: References: <20101025154932.06be2faf@o> Message-ID: On Tue, Oct 26, 2010 at 10:04 PM, Boris Borcic wrote: > Nick Coghlan wrote: >> >> With the str method, while some >> people may find it odd to have the method invocation on the separator, >> they typically don't forget the order once they learn it for the first >> time. > > OTOH, it is a pain that join and split aren't *both* methods on the > separator. Imho, 71% of what makes it strange for join to be a method on the > separator, is that split doesn't follow the same convention. But split only makes sense for strings, not arbitrary sequences. It's the other way around for join. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From alexander.belopolsky at gmail.com Tue Oct 26 16:00:33 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 26 Oct 2010 10:00:33 -0400 Subject: [Python-ideas] Move Demo scripts under Lib Message-ID: I originally proposed this under the Demo and Tools cleanup issue [1]. The idea was to create a new package "demo" in the standard library which will host selected demo programs or modules that currently reside in the Demo/ directory of the source distribution. There are several advantages to this approach: 1. Discoverability. Currently, various distributions place demo scripts in different places or not include them at all. There is no easy way for an end user to discover them. With a demo package, there will be a natural place in the python manual to document demo scripts and users will be able to run them using -m option. IDEs will be able to present demo source code and documentation consistently. 2. Test coverage. One of the points raised in [1] was that Demo scripts are not routinely tested. 
While it is not strictly necessary to move them under Lib to enable testing, doing so will put these scripts on the same footing as the rest of the standard library modules eliminating an unnecessary barrier to writing tests. 3. Quality/relevance. Many scripts in Demo are very old and do not reflect modern best practices. By picking and choosing what goes to Lib/demo, we can improve the demo collection without removing older scripts that some may find useful. One objection raised to this idea was that Demo scripts do not have the same stability of the API and backward compatibility requirements as the rest of the standard library. I don't think this is a serious issue. As long as we don't start importing demo modules from other stdlib modules, there is no impact on the stdlib itself from changing demo APIs. Users may be warned that their production programs should not depend on the demo modules. I think the word "demo" itself suggests that. What do you think? [1] http://bugs.python.org/issue7962#msg111677 From mal at egenix.com Tue Oct 26 16:15:47 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 26 Oct 2010 16:15:47 +0200 Subject: [Python-ideas] Move Demo scripts under Lib In-Reply-To: References: Message-ID: <4CC6E293.7060309@egenix.com> Alexander Belopolsky wrote: > I originally proposed this under the Demo and Tools cleanup issue [1]. > The idea was to create a new package "demo" in the standard library > which will host selected demo programs or modules that currently > reside in the Demo/ directory of the source distribution. There are > several advantages to this approach: > > 1. Discoverability. Currently, various distributions place demo > scripts in different places or not include them at all. There is no > easy way for an end user to discover them. With a demo package, there > will be a natural place in the python manual to document demo scripts > and users will be able to run them using -m option. IDEs will be able > to present demo source code and documentation consistently. > > 2. Test coverage. One of the points raised in [1] was that Demo > scripts are not routinely tested. While it is not strictly necessary > to move them under Lib to enable testing, doing so will put these > scripts on the same footing as the rest of the standard library > modules eliminating an unnecessary barrier to writing tests. > > 3. Quality/relevance. Many scripts in Demo are very old and do not > reflect modern best practices. By picking and choosing what goes to > Lib/demo, we can improve the demo collection without removing older > scripts that some may find useful. > > One objection raised to this idea was that Demo scripts do not have > the same stability of the API and backward compatibility requirements > as the rest of the standard library. I don't think this is a serious > issue. As long as we don't start importing demo modules from other > stdlib modules, there is no impact on the stdlib itself from changing > demo APIs. Users may be warned that their production programs should > not depend on the demo modules. I think the word "demo" itself > suggests that. > > What do you think? Calling a stdlib package "demo" or "example" is not a good idea, since those are rather common package names in existing applications. I also don't really see the point in moving *scripts* to the stdlib. The lib modules are usually not executable or meant for execution and you'd normally expect scripts to be under .../bin rather than .../lib. 
Why don't you turn the ones you find useful into PyPI packages to install separately ? > [1] http://bugs.python.org/issue7962#msg111677 -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 26 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From bborcic at gmail.com Tue Oct 26 16:32:10 2010 From: bborcic at gmail.com (Boris Borcic) Date: Tue, 26 Oct 2010 16:32:10 +0200 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: References: <20101025154932.06be2faf@o> Message-ID: Nick Coghlan wrote: >>> With the str method, while some >>> people may find it odd to have the method invocation on the separator, >>> they typically don't forget the order once they learn it for the first >>> time. >> >> OTOH, it is a pain that join and split aren't *both* methods on the >> separator. Imho, 71% of what makes it strange for join to be a method on the >> separator, is that split doesn't follow the same convention. > > But split only makes sense for strings, not arbitrary sequences. It's > the other way around for join. I don't feel your "the other way around" makes clear sense. The /split/ function depends on two string parameters, what allows a design choice on which one should be the object when making it a method call. I have been burned more than once with internalizing that /join/ is a method on the separator, just to (re-)discover that such is *not* the case of the converse method /split/ - although it could (and therefore should, to minimize cognitive dissonance). IOW, instead of whining that there is no way to make join a method on what "we should" think of as the prominent object (ie the sequence/iterator) and then half-heartedly promote sep.join as the solution, let's take the sep.join idiom seriously together with its implication that a core object role for a string, is to act as a separator. And let's then propagate that notion, to a *coherent* definition of split that makes it as well a method on the separator. Cheers, BB From mal at egenix.com Tue Oct 26 16:37:29 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 26 Oct 2010 16:37:29 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> <1288083018.3547.0.camel@localhost.localdomain> <4CC69FFD.7080102@egenix.com> <4CC6AFB4.3040003@egenix.com> Message-ID: <4CC6E7A9.4030205@egenix.com> Cesare Di Mauro wrote: > 2010/10/26 M.-A. Lemburg > >> Cesare Di Mauro wrote: >> > I can provide another patch that will not use EXTENDED_ARG (no VM >> changes), >>> and uses *args and/or **kwargs function calls when there are more than >> 255 >>> arguments or keyword arguments. >>> >>> But I need some days. >>> >>> If needed, I'll post it at most on this week-end. 
>> >> You mean a version that pushes the *args tuple and **kws dict >> on the stack and then uses those for calling the function/method ? >> >> I think that would be a lot more efficient than pushing/popping >> hundreds of parameters on/off the stack. >> >> -- >> Marc-Andre Lemburg >> > > I was referring to the solution (which I prefer) that I proposed answering > to Greg, two days ago. > > Unfortunately, the stack must be used whatever the solution we will use. > > Pushing the "final" tuple and/or dictionary is a possible optimization, but > we can use it only when we have a tuple or dict of constants; otherwise we > need to use the stack. > > Good case: f(1, 2, 3, a = 1, b = 2) > We can push (1, 2, 3) tuple and {'a' : 1, 'b' : 2}, then calling f with > CALL_FUNCTION_VAR_KW opcode passing narg = nkarg = 0. > > Worst case: f(1, x, 3, a = x, b = 2) > We can't push the tuple and dict as a whole, because they need first to be > built using the stack. > > The good case is possible, and I have already done some work in wpython > collecting constants on parameters push (even partial constant sequences), > but some additional work must be done recognizing costants-only tuple / > dict. > > However, the worst case rest unresolved. I don't understand. What is the difference between pushing values on the stack and building a tuple/dict and then pushing those on the stack ? In your worst case example, the compiler would first build a tuple/dict using the args already on the stack (BUILD_TUPLE, BUILD_MAP) and then call the function with this tuple/dict combination - you'd basically move the tuple/dict building to the compiler rather than having the CALL* opcodes do this internally. It would essentially run: f(*(1,x,3), **{'a':x, 'b':2}) and bypass the "max. number of opcode args" limit without degrading performance, since BUILD_TUPLE et al. essentially run the same code for building the call arguments as the helpers for calling a function. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 26 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From dirkjan at ochtman.nl Tue Oct 26 16:33:38 2010 From: dirkjan at ochtman.nl (Dirkjan Ochtman) Date: Tue, 26 Oct 2010 16:33:38 +0200 Subject: [Python-ideas] Move Demo scripts under Lib In-Reply-To: References: Message-ID: On Tue, Oct 26, 2010 at 16:00, Alexander Belopolsky wrote: > What do you think? After browsing through the Demo dir a bit, I came away thinking most of these should just be removed from the repository. I think there's enough demo material out there on the internet (for example in the cookbook), a lot of it of higher quality than what we have in the Demo dir right now. Maybe it makes sense to have a basic tkinter app to get you started. And some of the smaller functions or classes could possibly be used in the documentation. But as it is, it seems silly to waste developer time on stuff that few people look at or make use of (I'm assuming this from the fact that they have previously been neglected). 
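Returning to the keyword-argument limit discussed earlier in the thread, a small sketch of the equivalence M.-A. Lemburg points out, i.e. the *args/**kwargs spelling that a code generator could emit for oversized calls (f and x are made-up names):

def f(*args, **kwargs):
    return len(args), sorted(kwargs)

x = 2
r1 = f(1, x, 3, a=x, b=2)                # explicit call syntax, subject to the 255-per-call limit
r2 = f(*(1, x, 3), **{'a': x, 'b': 2})   # equivalent call a code generator could emit instead
assert r1 == r2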
Back to the original question: I don't think moving the Demo stuff to the Lib dir is a good idea, simply because the Lib dir should contain libraries, not applications or scripts. Writing a section for the documentation seems a better way to solve the discoverability problem, testing could be done even in the Demo dir (with more structure if need be), and quality control could just as well be exercised in the current location. Cheers, Dirkjan From alexander.belopolsky at gmail.com Tue Oct 26 16:50:57 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 26 Oct 2010 10:50:57 -0400 Subject: [Python-ideas] Move Demo scripts under Lib In-Reply-To: <4CC6E293.7060309@egenix.com> References: <4CC6E293.7060309@egenix.com> Message-ID: On Tue, Oct 26, 2010 at 10:15 AM, M.-A. Lemburg wrote: .. > Calling a stdlib package "demo" or "example" is not a good idea, > since those are rather common package names in existing > applications. > Since proposed "demo" package is not intended to be imported by applications, there is not a problem if they are shadowed by application modules. There are plenty of common names in the stdlib. I don't remember this cited as a problem in the past. For example, recently added collections and subprocess modules were likely to conflict with names used by applications. The "test" package has been installed with stdlib for ages and it often conflicts with user test packages. I believe applications commonly using a "demo" or "example" package is an argument for rather than against my idea. (What's good for users is probably good for stdlib.) Finally, if the name conflict is indeed an issue, it is not hard to come up with a less common name: "pydemo", "python_examples", etc. > I also don't really see the point in moving *scripts* to the stdlib. I gave three reasons in my first post. The first is specifically for *scripts*: to be able to run them using python -m without having to know an obscure path or polluting system path. > The lib modules are usually not executable or meant for execution > and you'd normally expect scripts to be under .../bin rather than > .../lib. Most of stdlib modules are in fact executable with python -m. Just grep for 'if __name__ == "__main__":' line. While most demo scripts are self-contained programs, many are examples on how to write modules or classes. See for example Demo/classes. Furthermore, while users can run demos, presumably the main purpose of demos is to present the source code in them. I believe it is more natural too look for python source code along PYTHONPATH than along PATH. I don't think any demo scripts are suitable to be installed under .../bin. In fact, Red Hat distribution installs them under /usr/lib/pythonX.Y/Demo. > Why don't you turn the ones you find useful into PyPI packages > to install separately ? That's a good way to make them *less* discoverable than they currently are and make even fewer distributions include them by default. BTW, what is the purpose of the "Demo" directory to begin with? I would naively assume that it is the place where new users would look to get the idea of what they can do with python. If this is the case, it completely misses the target because new users are unlikely to have a source distribution or look under /usr/lib/pythonX.Y/Demo or other system specific place. 
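As a rough sketch of what that proposal would mean in practice (all names below are illustrative, not part of the proposal), the demo package would just be ordinary modules that -m can find:

Lib/demo/__init__.py
Lib/demo/turtledemo.py      # wrapping what is now Demo/turtle/turtleDemo.py

$ python -m demo.turtledemo     # run a demo by its module name
$ python -m pydoc demo          # show the package docs and list the available demos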
From jh at improva.dk Tue Oct 26 16:44:31 2010 From: jh at improva.dk (Jacob Holm) Date: Tue, 26 Oct 2010 16:44:31 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> Message-ID: <4CC6E94F.3090702@improva.dk> On 2010-10-26 12:36, Nick Coghlan wrote: > On Tue, Oct 26, 2010 at 1:14 PM, Guido van Rossum wrote: >> Well, *throwing* it is close()'s job. And *catching* it ought to be >> pretty rare. Maybe this idiom would be better: >> >> def sum(): >> total = 0 >> try: >> while True: >> value = yield >> total += value >> finally: >> return total > > Rereading my previous post that Jacob linked, I'm still a little > uncomfortable with the idea of people deliberately catching > GeneratorExit to turn it into a normal value return to be reported by > close(). That said, I'm even less comfortable with the idea of > encouraging the moral equivalent of a bare except clause :) What Nick said. :) > I see two realistic options here: > > 1. Use GeneratorExit for this, have g.close() return a value and I > (and others that agree with me) just get the heck over it. > This has the benefit of not needing an extra method/function and an extra exception for this style of programming. It still has the refactoring problem I mention below. That might be fixable in a similar way though. (Hmm thinking about this gives me a strong sense of deja-vu). > 2. Add a new GeneratorReturn exception and a new g.finish() method > that follows the same basic algorithm Guido suggested, only with a > different exception type: > > class GeneratorReturn(Exception): # Note: ordinary exception, unlike > GeneratorExit > pass > > def finish(gen): > try: > gen.throw(GeneratorReturn) > raise RuntimeError("Generator ignored GeneratorReturn") > except StopIteration as err: > if err.args: > return err.args[0] > except GeneratorReturn: > pass > return None > I like this. Having a separate function lets you explicitly request a return value and making it fail loudly when called on an exhausted generator feels just right given the prohibition against saving the "True" return value anywhere. Also, using a different exception lets the generator distinguish between the "close" and "finish" cases, and making it an ordinary exception makes it clear that it is *intended* to be caught. All good stuff. I am not sure that returning None when finish() cathes GeneratorReturn is a good idea though. If you call finish on a generator you expect it to do something about it and return a value. If the GeneratorReturn escapes, it is a sign that the generator was not written to expect this and so it likely an error. OTOH, I am not sure it always is so maybe allowing it is OK. I just don't know. How does it fit with the current PEP 380, and esp. the refactoring principle? It seems like we need to special-case the GeneratorReturn exception somehow. Perhaps like this: [...] try: _s = yield _y + except GeneratorReturn as _e: + try: + _m = _i.finish + except AttributeError: + raise _e # XXX RuntimeError? + raise YieldFromFinished(_m()) except GeneratorExit as _e: [...] Where YieldFromFinished inherits from GeneratorReturn, and has a 'value' attribute like the new StopIteration. Without something like this a function that is written to work with "finish" is unlikely to be refactorable. With this, the trivial case of perfect delegation can be written as: def outer(): try: return yield from inner() except YieldFromFinished as e: return e.value and a slightly more complex case... 
def outer2(): try: a = yield from innerA() except YieldFromFinished as e: return e.value try: b = yield from innerB() except YieldFromFinished as e: return a+e.value return a+b the "outer2" example shows why the special casing is needed. If outer2.finish() is called while outer2 is suspended in innerA, a GeneratorReturn would be thrown directly into innerA. Since innerA is supposed to be expecting this, it returns a value immediately which would then be the return value of the yield-from. outer2 would then erroneously continue to the "b = yield from innerB()" line, which unless innerB immediately raised StopIteration would yield a value causing the outer2.finish() to raise a RuntimeError... We can avoid the extra YieldFromFinished exception if we let the new GeneratorReturn exception grow a value attribute instead and use it for both purposes. But then the distinction between a GeneratorReturn that is thrown in by "finish" (which has no associated value) and the GeneratorReturn raised by the yield-from (which has) gets blurred a bit. Another idea is to actually replace YieldFromFinished with StopIteration or a GeneratorReturn inheriting from StopIteration. That would mean we could drop the first try-except block in each of the above example generators because the "finished" result from the inner function is returned directly anyway. On the other hand, that could easily lead to subtle bugs if you forget a try...except block that is actually needed, like the second block in outer2. A different way to handle this would be to change the PEP 380 expansion as follows: [...] - except GeneratorExit as _e: + except (GeneratorReturn, GeneratorExit) as _e: [...] What this means is that only the outermost generator would see the GeneratorReturn. If the outermost generator is suspended using yield-from, and finish() is called. The inner generator is simply closed and the GeneratorReturn re-raised. This version is only really useful for delegating to generators that *don't* return a value, but it is simpler and at least it allows *some* use of yield-from with "finish". > (Why "finish" as the suggested name for the method? I'd prefer > "return", but that's a keyword and "return_" is somewhat ugly. Pairing > GeneratorReturn with finish() is my second choice, for the "OK, time > to wrap things up and complete your assigned task" connotations, as > compared to the "drop everything and clean up the mess" connotations > of GeneratorExit and close()) I like the names. GeneratorFinish might work as well for the exception, but I like GeneratorReturn better for its connection with "return". > > I'd personally be +1 on option 2 (since it addresses the immediate use > case while maintaining appropriate separation of concerns between > guaranteed resource cleanup and graceful completion of coroutines) and > -0 on option 1 (unsurprising, given my previously stated objections to > failing to maintain appropriate separation of concerns). > I agree the "finish" idea looks far better for generators without yield-from. It is unfortunate that extending it to work with yield-from isn't prettier that it is though. > (I should note that this differs from the previous suggestion of a > GeneratorReturn exception in the context of PEP 380. Those suggestions > were to use it as a replacement for StopIteration when a generator > contained a return statement. 
The suggestion here is to instead use it > as a replacement for GeneratorExit in order to request > prompt-but-graceful completion of a generator rather than just bailing > out immediately). I agree the name fits this use better than the original. Too bad some of my suggestions above are starting to blur the line between GeneratorReturn and StopIteration again. - Jacob From guido at python.org Tue Oct 26 16:56:51 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 26 Oct 2010 07:56:51 -0700 Subject: [Python-ideas] Move Demo scripts under Lib In-Reply-To: References: Message-ID: On Tue, Oct 26, 2010 at 7:33 AM, Dirkjan Ochtman wrote: > On Tue, Oct 26, 2010 at 16:00, Alexander Belopolsky > wrote: >> What do you think? > > After browsing through the Demo dir a bit, I came away thinking most > of these should just be removed from the repository. +1. Most of them are either quick hacks I once wrote and didn't know where to put (dutree.py, repeat.py come to mind) or in some cases contributed 3rd party code that was looking for a home. I think that all of these ought to live somewhere else and I have no problem with tossing out the entire Demo and Tools directories -- anything that's not needed as part of the build should go. (Though a few things might indeed be moved into the stdlib if they are useful enough.) > I think there's > enough demo material out there on the internet (for example in the > cookbook), a lot of it of higher quality than what we have in the Demo > dir right now. Maybe it makes sense to have a basic tkinter app to get > you started. And some of the smaller functions or classes could > possibly be used in the documentation. But as it is, it seems silly to > waste developer time on stuff that few people look at or make use of > (I'm assuming this from the fact that they have previously been > neglected). None of that belongs in the core distro any more. > Back to the original question: I don't think moving the Demo stuff to > the Lib dir is a good idea, simply because the Lib dir should contain > libraries, not applications or scripts. Writing a section for the > documentation seems a better way to solve the discoverability problem, > testing could be done even in the Demo dir (with more structure if > need be), and quality control could just as well be exercised in the > current location. If there are demos that are useful for testing, move them into Lib/test/. -- --Guido van Rossum (python.org/~guido) From alexander.belopolsky at gmail.com Tue Oct 26 17:13:27 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 26 Oct 2010 11:13:27 -0400 Subject: [Python-ideas] Move Demo scripts under Lib In-Reply-To: References: Message-ID: On Tue, Oct 26, 2010 at 10:33 AM, Dirkjan Ochtman wrote: > On Tue, Oct 26, 2010 at 16:00, Alexander Belopolsky > wrote: >> What do you think? > > After browsing through the Demo dir a bit, I came away thinking most > of these should just be removed from the repository. I think there's > enough demo material out there on the internet (for example in the > cookbook), a lot of it of higher quality than what we have in the Demo > dir right now. Maybe it makes sense to have a basic tkinter app to get > you started. The one demo that I want to find a better place for is Demo/turtle. For a novice-oriented framework that turtle is, it is really a shame to require $ cd /Demo/turtle $ python turtleDemo.py to run the demo. 
I would much rather use $ python demo.turtle or $ python turtle.demo (the later would require converting turtle.py into a package) > And some of the smaller functions or classes could > possibly be used in the documentation. And most likely not be automatically tested contributing to users' confusion: "I copied it from the documentation and it does not work!" See http://bugs.python.org/issue10029 . > But as it is, it seems silly to > waste developer time on stuff that few people look at or make use of > (I'm assuming this from the fact that they have previously been > neglected). > It is debatable what is the cause and what is the effect here. > Back to the original question: I don't think moving the Demo stuff to > the Lib dir is a good idea, simply because the Lib dir should contain > libraries, not applications or scripts. Introduction of -m option has changed that IMO. For example, when I work with recent versions of python, I always run pydoc as python -m pydoc because pydoc script on the path amy not correspond to the same version of python that I use. The trace, timeit, dis and probably many other useful modules don't even have a corresponding script in the standard distribution. > Writing a section for the > documentation seems a better way to solve the discoverability problem, What exactly such a section should say? "In order to find demo scripts, pleas unpack the source distribution and look under the Demo directory"? > testing could be done even in the Demo dir (with more structure if > need be), and quality control could just as well be exercised in the > current location. This is a valid option and if running Demo tests is added to make test target, it has a fighting chance to work. However, if Demo test are organized differently from stdlib module tests, maintaining them will be more difficult than it needs to be. From guido at python.org Tue Oct 26 17:18:44 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 26 Oct 2010 08:18:44 -0700 Subject: [Python-ideas] Move Demo scripts under Lib In-Reply-To: References: Message-ID: On Tue, Oct 26, 2010 at 8:13 AM, Alexander Belopolsky wrote: > The one demo that I want to find a better place for is Demo/turtle. Sure, go for it. It is a special case because the turtle module is also in the stdlib and these are intended for a particular novice audience. Anything we can do to make things easier for those people to get start with is probably worth it. Ideally they could just double click some file and the demo would fire up, with a command-line alternative (for the geeks among them) e.g. "python -m turtledemo" . -- --Guido van Rossum (python.org/~guido) From guido at python.org Tue Oct 26 17:43:00 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 26 Oct 2010 08:43:00 -0700 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: <4CC6A933.3080605@pearwood.info> References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> <4CB8B2F5.2020507@ronadam.com> <4CC6A933.3080605@pearwood.info> Message-ID: On Tue, Oct 26, 2010 at 3:10 AM, Steven D'Aprano wrote: > Perhaps I'm missing something, but to my mind, that's an awfully complicated > solution for such a simple problem. Here's my attempt: > > def multi_reduce(iterable, funcs): > ? ?it = iter(iterable) > ? ?collectors = [next(it)]*len(funcs) > ? ?for i, f in enumerate(funcs): > ? ? ? 
?x = next(it) > ? ? ? ?collectors[i] = f(collectors[i], x) > ? ?return collectors > > I've called it multi_reduce rather than parallel_reduce, because it doesn't > execute the functions in parallel. By my testing on Python 3.1.1, > multi_reduce is consistently ~30% faster than the generator based solution > for lists with 1000 - 10,000,000 items. > > So what am I missing? What does your parallel_reduce give us that > multi_reduce doesn't? You're right, the point I wanted to prove was that generators are better than threads, but the code was based on emulating reduce(). The generalization that I was aiming for was that it is convenient to write a generator that does some kind of computation over a sequence of items and returns a result at the end, and then have a driver that feeds a single sequence to a bunch such generators. This is more powerful once you try to use reduce to compute e.g. the average of the numbers fed to it -- of course you can do it using a function of (state, value) but it is so much easier to write as a loop! (At least for me -- if you do nothing but write Haskell all day I'm sure it comes naturally. :-) def avg(): total = 0 count = 0 try: while True: value = yield total += value count += 1 except GeneratorExit: raise StopIteration(total / count) The essential boilerplate here is try: while True: value = yield except GeneratorExit: raise StopIteration() No doubt functional aficionados will snub this, but in Python, this should run much faster than the same thing written as a reduce-ready function, due to the function overhead (which wasn't a problem in the min/max example since those are built-ins). BTW This episode led me to better understand my objection against reduce() as the universal hammer: my problem with writing avg() using reduce is that the function one feeds into reduce is asymmetric -- its first argument must be some state, e.g. a tuple (total, count), and the second argument must be the next value. This is the moment that my head reliably explodes -- even though it has no problem visualizing reduce() using a *symmetric* function like +, min or max. Also note that the reduce() based solution would have to have a separate function to extract the desired result (total / count) from the state (total, count), and for multi_reduce() you would have to supply a separate list of functions for these or some other hacky approach. -- --Guido van Rossum (python.org/~guido) From alexander.belopolsky at gmail.com Tue Oct 26 17:49:49 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 26 Oct 2010 11:49:49 -0400 Subject: [Python-ideas] Move Demo scripts under Lib In-Reply-To: References: Message-ID: On Tue, Oct 26, 2010 at 11:18 AM, Guido van Rossum wrote: > On Tue, Oct 26, 2010 at 8:13 AM, Alexander Belopolsky > wrote: >> The one demo that I want to find a better place for is Demo/turtle. > > Sure, go for it. It is a special case because the turtle module is > also in the stdlib and these are intended for a particular novice > audience. Please see http://bugs.python.org/issue10199 for further discussion. 
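Guido's avg() example earlier in this digest is easier to follow with a concrete driver around it. The sketch below is editorial illustration rather than code from the thread: avg() is copied from Guido's post, while drive() and its name are invented here. It also relies on the pre-PEP 479 behaviour current at the time, where a StopIteration raised inside a generator escapes from throw(); from Python 3.3 on one would write "return total / count" in the except block instead.

def avg():
    total = 0
    count = 0
    try:
        while True:
            value = yield
            total += value
            count += 1
    except GeneratorExit:
        raise StopIteration(total / count)   # "return total / count" from 3.3 on

def drive(values, *factories):
    # Feed one data stream to several consumer generators, then collect
    # the value each one reports when asked to wrap up.
    consumers = [f() for f in factories]
    for c in consumers:
        next(c)                      # advance each consumer to its first yield
    for v in values:
        for c in consumers:
            c.send(v)
    results = []
    for c in consumers:
        try:
            c.throw(GeneratorExit)   # ask the consumer to finish up
        except StopIteration as e:
            results.append(e.args[0] if e.args else None)
        except GeneratorExit:
            results.append(None)     # consumer ignored the request
    return results

print(drive([1, 2, 3, 4], avg))      # [2.5]

Note that the driver uses throw(GeneratorExit) directly rather than close(), precisely because close() as currently defined discards the StopIteration value; that is the behaviour this thread is debating.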
From steve at pearwood.info Tue Oct 26 18:09:23 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 27 Oct 2010 03:09:23 +1100 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: References: <20101025154932.06be2faf@o> Message-ID: <4CC6FD33.4050305@pearwood.info> Boris Borcic wrote: > And let's then propagate that notion, to a *coherent* definition of > split that makes it as well a method on the separator. Let's not. Splitting is not something that you on the separator, it's something you do on the source string. I'm sure you wouldn't expect this: ":".find("key:value") => 3 Nor should we expect this: ":".split("key:value") => ["key", "value"] You perform a search *on* the source string, not the target substring. Likewise you split the source string, not the separator. -- Steven From masklinn at masklinn.net Tue Oct 26 18:33:41 2010 From: masklinn at masklinn.net (Masklinn) Date: Tue, 26 Oct 2010 18:33:41 +0200 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: <4CC6FD33.4050305@pearwood.info> References: <20101025154932.06be2faf@o> <4CC6FD33.4050305@pearwood.info> Message-ID: <5B4B42DF-25F5-4DC4-90B2-4AF5B7AF40D6@masklinn.net> On 2010-10-26, at 18:09 , Steven D'Aprano wrote: > Boris Borcic wrote: >> And let's then propagate that notion, to a *coherent* definition of split that makes it as well a method on the separator. > > Let's not. > > Splitting is not something that you on the separator, it's something you do on the source string. I'm sure you wouldn't expect this: Much as joining, that's a completely arbitrary decision. Python's take is that you split on a source and join on a separator, most APIs I've seen so far agree on the former but not on the latter. And Python has an API which splits on the separator anyway: re.RegexObject is not a value you can provide to str.split() as far as I know (whereas in Ruby String#split can take a string or a regex indifferently, so it's coherent in that it always splits on the source string, never on the separator). From guido at python.org Tue Oct 26 18:56:41 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 26 Oct 2010 09:56:41 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> Message-ID: On Tue, Oct 26, 2010 at 3:36 AM, Nick Coghlan wrote: > On Tue, Oct 26, 2010 at 1:14 PM, Guido van Rossum wrote: >> On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm wrote: >>> Throwing and catching GeneratorExit is not common, and according to some >>> shouldn't be used for this purpose at all. >> >> Well, *throwing* it is close()'s job. And *catching* it ought to be >> pretty rare. Maybe this idiom would be better: >> >> def sum(): >> ?total = 0 >> ?try: >> ? ?while True: >> ? ? ?value = yield >> ? ? ?total += value >> ?finally: >> ? ?return total > Rereading my previous post that Jacob linked, I'm still a little > uncomfortable with the idea of people deliberately catching > GeneratorExit to turn it into a normal value return to be reported by > close(). That said, I'm even less comfortable with the idea of > encouraging the moral equivalent of a bare except clause :) My bad. I should have stopped at "except GeneratorExit: return total". > I see two realistic options here: > > 1. Use GeneratorExit for this, have g.close() return a value and I > (and others that agree with me) just get the heck over it. This is still my preferred option. > 2. 
Add a new GeneratorReturn exception and a new g.finish() method > that follows the same basic algorithm Guido suggested, only with a > different exception type: > > class GeneratorReturn(Exception): # Note: ordinary exception, unlike > GeneratorExit > ?pass > > def finish(gen): > ?try: > ? gen.throw(GeneratorReturn) > ? raise RuntimeError("Generator ignored GeneratorReturn") > ?except StopIteration as err: > ? if err.args: > ? ? return err.args[0] > ?except GeneratorReturn: > ? pass > ?return None IMO there are already too many special exceptions and methods. > (Why "finish" as the suggested name for the method? I'd prefer > "return", but that's a keyword and "return_" is somewhat ugly. Pairing > GeneratorReturn with finish() is my second choice, for the "OK, time > to wrap things up and complete your assigned task" connotations, as > compared to the "drop everything and clean up the mess" connotations > of GeneratorExit and close()) > > I'd personally be +1 on option 2 (since it addresses the immediate use > case while maintaining appropriate separation of concerns between > guaranteed resource cleanup and graceful completion of coroutines) and > -0 on option 1 (unsurprising, given my previously stated objections to > failing to maintain appropriate separation of concerns). Hm, I guess I'm more in favor of minimal mechanism. The clincher for me is pretty much that the extended g.close() semantics are a very simple mod to the existing gen_close() function in genobject.c -- it currently always returns None but could very easily be changed to extract the return value from err.args when it catches StopIteration (but not GeneratorExit). it also looks like my proposal doesn't get in the way of anything -- if the generator doesn't catch GeneratorExit g.close() will return None, and if the caller of g.close() doesn't expect a value, they can just ignore it. Finally note that this still looks like a relatively esoteric use case: when using "var = yield from generator()" the the return value from the generator (written as "return X" and implemented as "raise StopIteration(X)") will automatically be delivered to var, and there's no need to call g.close(). In this case there is also no reason for the generator to catch GeneratorExit -- that is purely needed for the idiom of writing "inside-out iterators" using this pattern in the generator (as I mentioned on the parent thread): try: while True: value = yield except GeneratorExit: raise StopIteration() # Or "return " in PEP 380 syntax Now, if I may temporarily go into wild-and-crazy mode (this *is* python-ideas after all :-), we could invent some ad-hoc syntax for this pattern, e.g.: for value in yield: return IOW the special form: for in yield: would translate into: try: while True: = yield except GeneratorExit: pass If (and this is a big if) the while-True-yield-inside-try-except-GeneratorExit pattern somehow becomes popular we could reconsider this syntactic extension or some variant. (I have to add that the syntactic ice is a bit thin here, since "for in (yield)" already has a meaning, and a totally different one of course. A variant could be "for from yield" or some other abuse of keywords. But let me stop here before people think I've just volunteered my retirement... :-) > (I should note that this differs from the previous suggestion of a > GeneratorReturn exception in the context of PEP 380. Those suggestions > were to use it as a replacement for StopIteration when a generator > contained a return statement. 
The suggestion here is to instead use it > as a replacement for GeneratorExit in order to request > prompt-but-graceful completion of a generator rather than just bailing > out immediately). Noted. -- --Guido van Rossum (python.org/~guido) From guido at python.org Tue Oct 26 19:01:50 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 26 Oct 2010 10:01:50 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CC6C7F3.6090405@improva.dk> References: <4CC63065.9040507@improva.dk> <4CC6C7F3.6090405@improva.dk> Message-ID: On Tue, Oct 26, 2010 at 5:22 AM, Jacob Holm wrote: [...] >>> Here's a stupid idea... let g.close take an optional argument that it >>> can return if the generator is already exhausted and let it return the >>> value from the StopIteration otherwise. >>> >>> def close(self, default=None): >>> ? ?if self.gi_frame is None: >>> ? ? ? ?return default >>> ? ?try: >>> ? ? ? ?self.throw(GeneratorExit) >>> ? ?except StopIteration as e: >>> ? ? ? ?return e.args[0] >>> ? ?except GeneratorExit: >>> ? ? ? ?return None >>> ? ?else: >>> ? ? ? ?raise RuntimeError('generator ignored GeneratorExit') >> >> You'll have to explain why None isn't sufficient. > It is not really necessary, but seemed "cleaner" somehow. ?Think of > "g.close(default)" as "get me the result if possible, and this default > otherwise". ?Then think of dict.get()... Hm, I'd say there always is a result -- it just sometimes is None. I really don't want to make distinctions between falling off the end of the function, "return" without a value, "return None", "raise StopIteration()", "raise StopIteration(None)", or even (in response to a close() request) "raise GeneratorExit". > You mentioned some possible extensions though. ?At a guess, at least > some of these would benefit greatly from the use of generators. ?Maybe > such an extension would be a better example? Yes, see the avg() example I posted in the parent thread. -- --Guido van Rossum (python.org/~guido) From cesare.di.mauro at gmail.com Tue Oct 26 19:22:32 2010 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Tue, 26 Oct 2010 19:22:32 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <4CC6E7A9.4030205@egenix.com> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> <1288083018.3547.0.camel@localhost.localdomain> <4CC69FFD.7080102@egenix.com> <4CC6AFB4.3040003@egenix.com> <4CC6E7A9.4030205@egenix.com> Message-ID: 2010/10/26 M.-A. Lemburg > Cesare Di Mauro wrote: > > 2010/10/26 M.-A. Lemburg > > > > I was referring to the solution (which I prefer) that I proposed > answering > > to Greg, two days ago. > > > > Unfortunately, the stack must be used whatever the solution we will use. > > > > Pushing the "final" tuple and/or dictionary is a possible optimization, > but > > we can use it only when we have a tuple or dict of constants; otherwise > we > > need to use the stack. > > > > Good case: f(1, 2, 3, a = 1, b = 2) > > We can push (1, 2, 3) tuple and {'a' : 1, 'b' : 2}, then calling f with > > CALL_FUNCTION_VAR_KW opcode passing narg = nkarg = 0. > > > > Worst case: f(1, x, 3, a = x, b = 2) > > We can't push the tuple and dict as a whole, because they need first to > be > > built using the stack. 
> > > > The good case is possible, and I have already done some work in wpython > > collecting constants on parameters push (even partial constant > sequences), > > but some additional work must be done recognizing costants-only tuple / > > dict. > > > > However, the worst case rest unresolved. > > I don't understand. What is the difference between pushing values > on the stack and building a tuple/dict and then pushing those on > the stack ? > > In your worst case example, the compiler would first build > a tuple/dict using the args already on the stack (BUILD_TUPLE, > BUILD_MAP) and then call the function with this tuple/dict > combination - you'd basically move the tuple/dict building to > the compiler rather than having the CALL* opcodes do this > internally. > > It would essentially run: > > f(*(1,x,3), **{'a':x, 'b':2}) > > and bypass the "max. number of opcode args" limit without > degrading performance, since BUILD_TUPLE et al. essentially > run the same code for building the call arguments as the > helpers for calling a function. > > -- > Marc-Andre Lemburg > Yes, the idea is to let the compiler emit proper code to build the tuple/dict, instead of using the CALL_* to do it, in order to bypass the current limits. That's if we don't want to change the current CALL_* behavior, so speeding up the common cases and introducing a slower (but working) path for the uncommon ones. Another solution can be to introduce a specific opcode, but I don't see it well if the purpose is just to permit more than 255 arguments. At this time I have no other ideas to solve this problem. Please, let me know if there's interest on a new patch to implement the "compiler-based" solution. Cesare -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Tue Oct 26 19:33:53 2010 From: guido at python.org (Guido van Rossum) Date: Tue, 26 Oct 2010 10:33:53 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CC6E94F.3090702@improva.dk> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> Message-ID: On Tue, Oct 26, 2010 at 7:44 AM, Jacob Holm wrote: [...] > I like this. ?Having a separate function lets you explicitly request a > return value and making it fail loudly when called on an exhausted > generator feels just right given the prohibition against saving the > "True" return value anywhere. ?Also, using a different exception lets > the generator distinguish between the "close" and "finish" cases, and > making it an ordinary exception makes it clear that it is *intended* to > be caught. ?All good stuff. I don't know. There are places where failing loudly is the right thing to do (1 + 'a'). But when it comes to return values Python takes a pretty strong position that there's no difference between functions and procedures, that "return", "return None" and falling off the end all mean the same thing, and that it's totally fine to ignore a value or to return a value that will be of no interest for most callers. > I am not sure that returning None when finish() cathes GeneratorReturn > is a good idea though. ?If you call finish on a generator you expect it > to do something about it and return a value. ?If the GeneratorReturn > escapes, it is a sign that the generator was not written to expect this > and so it likely an error. ?OTOH, I am not sure it always is so maybe > allowing it is OK. ?I just don't know. > > How does it fit with the current PEP 380, and esp. the refactoring > principle? ? 
It seems like we need to special-case the GeneratorReturn > exception somehow. ?Perhaps like this: > > [...] > ?try: > ? ? ?_s = yield _y > + except GeneratorReturn as _e: > + ? ? try: > + ? ? ? ? _m = _i.finish > + ? ? except AttributeError: > + ? ? ? ? raise _e ?# XXX RuntimeError? > + ? ? raise YieldFromFinished(_m()) > ?except GeneratorExit as _e: > [...] > > Where YieldFromFinished inherits from GeneratorReturn, and has a 'value' > attribute like the new StopIteration. > > Without something like this a function that is written to work with > "finish" is unlikely to be refactorable. ? With this, the trivial case > of perfect delegation can be written as: > > def outer(): > ? ?try: > ? ? ? ?return yield from inner() > ? ?except YieldFromFinished as e: > ? ? ? ?return e.value > > and a slightly more complex case... > > def outer2(): > ? ?try: > ? ? ? ?a = yield from innerA() > ? ?except YieldFromFinished as e: > ? ? ? ?return e.value > ? ?try: > ? ? ? ?b = yield from innerB() > ? ?except YieldFromFinished as e: > ? ? ? ?return a+e.value > ? ?return a+b > > the "outer2" example shows why the special casing is needed. ?If > outer2.finish() is called while outer2 is suspended in innerA, a > GeneratorReturn would be thrown directly into innerA. ?Since innerA is > supposed to be expecting this, it returns a value immediately which > would then be the return value of the yield-from. ?outer2 would then > erroneously continue to the "b = yield from innerB()" line, which unless > innerB immediately raised StopIteration would yield a value causing the > outer2.finish() to raise a RuntimeError... > > We can avoid the extra YieldFromFinished exception if we let the new > GeneratorReturn exception grow a value attribute instead and use it for > both purposes. ?But then the distinction between a GeneratorReturn that > is thrown in by "finish" (which has no associated value) and the > GeneratorReturn raised by the yield-from (which has) gets blurred a bit. > > Another idea is to actually replace YieldFromFinished with StopIteration > or a GeneratorReturn inheriting from StopIteration. ?That would mean we > could drop the first try-except block in each of the above example > generators because the "finished" result from the inner function is > returned directly anyway. ?On the other hand, that could easily lead to > subtle bugs if you forget a try...except block that is actually needed, > like the second block in outer2. I'm afraid that all was too much to really reach my brain, which keeps telling me "he's commenting on Nick's proposal which I've already rejected". > A different way to handle this would be to change the PEP 380 expansion > as follows: > > [...] > - except GeneratorExit as _e: > + except (GeneratorReturn, GeneratorExit) as _e: > [...] That just strikes me as one more reason why a separate GeneratorReturn is a bad idea. In my ideal world, you almost never need to catch or raise StopIteration; you don't raise GeneratorExit (that is close()'s job) but you catch it to notice that your data source is finished, and then you return a value. (And see my crazy idea in my previous post to get rid of that too. :-) > What this means is that only the outermost generator would see the > GeneratorReturn. ?If the outermost generator is suspended using > yield-from, and finish() is called. ?The inner generator is simply > closed and the GeneratorReturn re-raised. 
?This version is only really > useful for delegating to generators that *don't* return a value, but it > is simpler and at least it allows *some* use of yield-from with "finish". > > >> (Why "finish" as the suggested name for the method? I'd prefer >> "return", but that's a keyword and "return_" is somewhat ugly. Pairing >> GeneratorReturn with finish() is my second choice, for the "OK, time >> to wrap things up and complete your assigned task" connotations, as >> compared to the "drop everything and clean up the mess" connotations >> of GeneratorExit and close()) > > I like the names. ?GeneratorFinish might work as well for the exception, > but I like GeneratorReturn better for its connection with "return". > > >> >> I'd personally be +1 on option 2 (since it addresses the immediate use >> case while maintaining appropriate separation of concerns between >> guaranteed resource cleanup and graceful completion of coroutines) and >> -0 on option 1 (unsurprising, given my previously stated objections to >> failing to maintain appropriate separation of concerns). >> > > I agree the "finish" idea looks far better for generators without > yield-from. ?It is unfortunate that extending it to work with yield-from > isn't prettier that it is though. > > > >> (I should note that this differs from the previous suggestion of a >> GeneratorReturn exception in the context of PEP 380. Those suggestions >> were to use it as a replacement for StopIteration when a generator >> contained a return statement. The suggestion here is to instead use it >> as a replacement for GeneratorExit in order to request >> prompt-but-graceful completion of a generator rather than just bailing >> out immediately). > > I agree the name fits this use better than the original. ?Too bad some > of my suggestions above are starting to blur the line between > GeneratorReturn and StopIteration again. So now I'm even more convinced that it's not worth it... -- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Tue Oct 26 19:44:53 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 26 Oct 2010 19:44:53 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> <1288083018.3547.0.camel@localhost.localdomain> <4CC69FFD.7080102@egenix.com> <4CC6AFB4.3040003@egenix.com> <4CC6E7A9.4030205@egenix.com> Message-ID: <20101026194453.46d42bf9@pitrou.net> On Tue, 26 Oct 2010 19:22:32 +0200 Cesare Di Mauro wrote: > > At this time I have no other ideas to solve this problem. > > Please, let me know if there's interest on a new patch to implement the > "compiler-based" solution. Have you timed the EXTENDED_ARG solution? Regards Antoine. From tjreedy at udel.edu Tue Oct 26 19:55:30 2010 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 26 Oct 2010 13:55:30 -0400 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> <4CB8B2F5.2020507@ronadam.com> <4CC6A933.3080605@pearwood.info> Message-ID: On 10/26/2010 11:43 AM, Guido van Rossum wrote: > You're right, the point I wanted to prove was that generators are > better than threads, but the code was based on emulating reduce(). 
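Spelled out with Terry's update() above (a sketch for illustration; the reduce call and the sample data are not from the thread), the reduce-based average needs exactly the extra extraction step Guido mentions:

from functools import reduce

def update(pair, item):
    # Terry's reduce-ready state function: pair is (count, total)
    return pair[0] + 1, pair[1] + item

count, total = reduce(update, [1, 2, 3, 4], (0, 0))
print(count, total, total / count)   # 4 10 2.5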
The > generalization that I was aiming for was that it is convenient to > write a generator that does some kind of computation over a sequence > of items and returns a result at the end, and then have a driver that > feeds a single sequence to a bunch such generators. This is more > powerful once you try to use reduce to compute e.g. the average of the > numbers fed to it -- of course you can do it using a function of > (state, value) but it is so much easier to write as a loop! (At least > for me -- if you do nothing but write Haskell all day I'm sure it > comes naturally. :-) > > def avg(): > total = 0 > count = 0 > try: > while True: > value = yield > total += value > count += 1 > except GeneratorExit: > raise StopIteration(total / count) The more traditional pull or grab (rather than push receive) version is def avg(it): total = 0 count = 0 for value in it: total += value count += 1 return total/count > The essential boilerplate here is > > try: > while True: > value = yield > > except GeneratorExit: > raise StopIteration() with corresponding boilersplate. I can see that the receiving generator version would be handy when you do not really want to package the producer into an iterator (perhaps because items are needed for other purposes also) and want to send items to the averager as they are produced, from the point of production. > No doubt functional aficionados will snub this, but in Python, this > should run much faster than the same thing written as a reduce-ready > function, due to the function overhead (which wasn't a problem in the > min/max example since those are built-ins). > > BTW This episode led me to better understand my objection against > reduce() as the universal hammer: my problem with writing avg() using > reduce is that the function one feeds into reduce is asymmetric -- its > first argument must be some state, e.g. a tuple (total, count), and > the second argument must be the next value. Not hard: def update(pair, item): return pair[0]+1, pair[1]+item > This is the moment that my > head reliably explodes -- even though it has no problem visualizing > reduce() using a *symmetric* function like +, min or max. > > Also note that the reduce() based solution would have to have a > separate function to extract the desired result (total / count) from > the state (total, count), and for multi_reduce() you would have to > supply a separate list of functions for these or some other hacky > approach. Reduce is extremely important as concept: any function of a sequence (or collection arbitrarily ordered) can be written as a post-processed reduction. In practice, at least for Python, it is better thought of as wide-spread template pattern, such as the boilerplate above, than just as a function. This is partly because Python does not have general function expressions (and should not!) and also because Python does have high function call overhead (because of signature flexibility). -- Terry Jan Reedy From bborcic at gmail.com Tue Oct 26 19:59:17 2010 From: bborcic at gmail.com (Boris Borcic) Date: Tue, 26 Oct 2010 19:59:17 +0200 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: <4CC6FD33.4050305@pearwood.info> References: <20101025154932.06be2faf@o> <4CC6FD33.4050305@pearwood.info> Message-ID: Steven D'Aprano wrote: > Boris Borcic wrote: > >> And let's then propagate that notion, to a *coherent* definition of >> split that makes it as well a method on the separator. > > Let's not. 
> > Splitting is not something that you on the separator, it's something you > do on the source string. I'm sure you wouldn't expect this: > > ":".find("key:value") > => 3 To be honest, my test for this type of questions is how likely I find myself using the bound method outside of immediate method call syntax, and I'd say having a specialized callable that will find specific content in whatever future argument, is more likely than the converse callable that will find occurences of whatever future argument in a fixed string. YMMV > > Nor should we expect this: > > ":".split("key:value") > => ["key", "value"] > > > You perform a search *on* the source string, not the target substring. > Likewise you split the source string, not the separator. To me, this sounds like giving too much weight to english language intuition. What really counts is not how it gets to be said in good english, but rather - what's the variable/object/value that, in the context of the action, tends to be the most stable focus of attention. And remember that most speakers of E as a second language, never become fully comfortable with E prepositions. Cheers, BB From brett at python.org Tue Oct 26 20:01:42 2010 From: brett at python.org (Brett Cannon) Date: Tue, 26 Oct 2010 11:01:42 -0700 Subject: [Python-ideas] Move Demo scripts under Lib In-Reply-To: References: Message-ID: On Tue, Oct 26, 2010 at 07:56, Guido van Rossum wrote: > On Tue, Oct 26, 2010 at 7:33 AM, Dirkjan Ochtman wrote: >> On Tue, Oct 26, 2010 at 16:00, Alexander Belopolsky >> wrote: >>> What do you think? >> >> After browsing through the Demo dir a bit, I came away thinking most >> of these should just be removed from the repository. > > +1. Most of them are either quick hacks I once wrote and didn't know > where to put (dutree.py, repeat.py come to mind) or in some cases > contributed 3rd party code that was looking for a home. I think that > all of these ought to live somewhere else and I have no problem with > tossing out the entire Demo and Tools directories -- anything that's > not needed as part of the build should go. (Though a few things might > indeed be moved into the stdlib if they are useful enough.) Just to toss in my +1, I have suggested doing this before and received push-back in the form of "it isn't hurting anyone". But considering how often the idea of trying to fix the directory comes up and never occurs, it is obviously wasting people's time keeping the directories around. So I say move the stuff needed as part fo the build or dev process (e.g., patchcheck is in the Tools directory) and then drop the directory. We can give a deadline of some release like Python 3.2b1 or b2 to move scripts people care about, and then simply do a mass deletion just before cutting a release. -Brett > >> I think there's >> enough demo material out there on the internet (for example in the >> cookbook), a lot of it of higher quality than what we have in the Demo >> dir right now. Maybe it makes sense to have a basic tkinter app to get >> you started. And some of the smaller functions or classes could >> possibly be used in the documentation. But as it is, it seems silly to >> waste developer time on stuff that few people look at or make use of >> (I'm assuming this from the fact that they have previously been >> neglected). > > None of that belongs in the core distro any more. 
> >> Back to the original question: I don't think moving the Demo stuff to >> the Lib dir is a good idea, simply because the Lib dir should contain >> libraries, not applications or scripts. Writing a section for the >> documentation seems a better way to solve the discoverability problem, >> testing could be done even in the Demo dir (with more structure if >> need be), and quality control could just as well be exercised in the >> current location. > > If there are demos that are useful for testing, move them into Lib/test/. > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From raymond.hettinger at gmail.com Tue Oct 26 20:26:35 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 26 Oct 2010 11:26:35 -0700 Subject: [Python-ideas] Move Demo scripts under Lib In-Reply-To: References: Message-ID: <171F8E33-A726-4955-A8F3-C3CE3829974C@gmail.com> >>> Back to the original question: I don't think moving the Demo stuff to >>> the Lib dir is a good idea, simply because the Lib dir should contain >>> libraries, not applications or scripts. Writing a section for the >>> documentation seems a better way to solve the discoverability problem, ... If any of the demos survive the purge, I agree that they should have their own docs. Otherwise, they might as well be invisible. Raymond From cesare.di.mauro at gmail.com Tue Oct 26 22:30:16 2010 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Tue, 26 Oct 2010 22:30:16 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <20101026194453.46d42bf9@pitrou.net> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> <1288083018.3547.0.camel@localhost.localdomain> <4CC69FFD.7080102@egenix.com> <4CC6AFB4.3040003@egenix.com> <4CC6E7A9.4030205@egenix.com> <20101026194453.46d42bf9@pitrou.net> Message-ID: 2010/10/26 Antoine Pitrou > On Tue, 26 Oct 2010 19:22:32 +0200 > Cesare Di Mauro > wrote: > > > > At this time I have no other ideas to solve this problem. > > > > Please, let me know if there's interest on a new patch to implement the > > "compiler-based" solution. > > Have you timed the EXTENDED_ARG solution? > > Regards > > Antoine. 
I made some a few minutes ago, and the results are unbelievable and counter-intuitive on my machine (Athlon64 2800+ socket 754, 2GB DDR 400, Windows 7 x64, Python 3.2a3 32 bits running at high priority): python.exe -m timeit -r 1 -n 100000000 -s "def f(): pass" "f()" Standard : 100000000 loops, best of 1: 0.348 usec per loop EXTENDED_ARG: 100000000 loops, best of 1: 0.341 usec per loop python.exe -m timeit -r 1 -n 100000000 -s "def f(x, y, z): pass" "f(1, 2, 3)" Standard : 100000000 loops, best of 1: 0.452 usec per loop EXTENDED_ARG: 100000000 loops, best of 1: 0.451 usec per loop python.exe -m timeit -r 1 -n 100000000 -s "def f(a = 1, b = 2, c = 3): pass" "f(a = 1, b = 2, c = 3)" Standard : 100000000 loops, best of 1: 0.578 usec per loop EXTENDED_ARG: 100000000 loops, best of 1: 0.556 usec per loop python.exe -m timeit -r 1 -n 100000000 -s "def f(x, y, z, a = 1, b = 2, c = 3): pass" "f(1, 2, 3, a = 1, b = 2, c = 3)" Standard : 100000000 loops, best of 1: 0.761 usec per loop EXTENDED_ARG: 100000000 loops, best of 1: 0.739 usec per loop python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args): pass" "f(1, 2, 3)" Standard : 100000000 loops, best of 1: 0.511 usec per loop EXTENDED_ARG: 100000000 loops, best of 1: 0.508 usec per loop python.exe -m timeit -r 1 -n 100000000 -s "def f(**Keys): pass" "f(a = 1, b = 2, c = 3)" Standard : 100000000 loops, best of 1: 0.789 usec per loop EXTENDED_ARG: 100000000 loops, best of 1: 0.784 usec per loop python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args, **Keys): pass" "f(1, 2, 3, a = 1, b = 2, c = 3)" Standard : 100000000 loops, best of 1: 1.01 usec per loop EXTENDED_ARG: 100000000 loops, best of 1: 1.01 usec per loop python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args, **Keys): pass" "f()" Standard : 100000000 loops, best of 1: 0.393 usec per loop EXTENDED_ARG: 100000000 loops, best of 1: 0.41 usec per loop I really can't explain it. Ouch! Cesare -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Oct 26 22:39:00 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 26 Oct 2010 22:39:00 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> <1288083018.3547.0.camel@localhost.localdomain> <4CC69FFD.7080102@egenix.com> <4CC6AFB4.3040003@egenix.com> <4CC6E7A9.4030205@egenix.com> <20101026194453.46d42bf9@pitrou.net> Message-ID: <1288125540.3547.22.camel@localhost.localdomain> > [snip lots of timeit results comparing unpatched and EXTENDED_ARG] > > I really can't explain it. Ouch! What do you mean exactly? There's no significant change at all. 
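For reference, the limit that the EXTENDED_ARG patch works around can be reproduced in a few lines. This is a sketch assuming the CPython versions under discussion in this thread; the compile-time cap on call arguments was only lifted much later, in Python 3.7.

# Build a call expression with 300 keyword arguments and try to compile it.
src = "f(" + ", ".join("a%d=%d" % (i, i) for i in range(300)) + ")"
try:
    compile(src, "<test>", "eval")
except SyntaxError as err:
    print(err)   # e.g. "more than 255 arguments" on the affected versions
else:
    print("no argument limit on this interpreter")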
From cesare.di.mauro at gmail.com Tue Oct 26 22:58:52 2010 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Tue, 26 Oct 2010 22:58:52 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: <1288125540.3547.22.camel@localhost.localdomain> References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> <1288083018.3547.0.camel@localhost.localdomain> <4CC69FFD.7080102@egenix.com> <4CC6AFB4.3040003@egenix.com> <4CC6E7A9.4030205@egenix.com> <20101026194453.46d42bf9@pitrou.net> <1288125540.3547.22.camel@localhost.localdomain> Message-ID: 2010/10/26 Antoine Pitrou > > > [snip lots of timeit results comparing unpatched and EXTENDED_ARG] > > > > I really can't explain it. Ouch! > > What do you mean exactly? There's no significant change at all. > I cannot explain why the unpatched version was slower than the patched one most of the times. I find it silly and illogical. Cesare -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Tue Oct 26 23:04:30 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 26 Oct 2010 23:04:30 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> <1288083018.3547.0.camel@localhost.localdomain> <4CC69FFD.7080102@egenix.com> <4CC6AFB4.3040003@egenix.com> <4CC6E7A9.4030205@egenix.com> <20101026194453.46d42bf9@pitrou.net> <1288125540.3547.22.camel@localhost.localdomain> Message-ID: <1288127070.3547.25.camel@localhost.localdomain> > I cannot explain why the unpatched version was slower than the patched > one most of the times. It just looks like measurement noise or, at worse, the side effect of slightly different code generation by the compiler. I don't think a ?1% variation on a desktop computer can be considered significant. (which means that the patch reaches its goal of not decreasing performance, anyway :-)) Regards Antoine. From g.brandl at gmx.net Tue Oct 26 23:04:47 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 26 Oct 2010 23:04:47 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> <1288083018.3547.0.camel@localhost.localdomain> <4CC69FFD.7080102@egenix.com> <4CC6AFB4.3040003@egenix.com> <4CC6E7A9.4030205@egenix.com> <20101026194453.46d42bf9@pitrou.net> <1288125540.3547.22.camel@localhost.localdomain> Message-ID: Am 26.10.2010 22:58, schrieb Cesare Di Mauro: > 2010/10/26 Antoine Pitrou > > > > > [snip lots of timeit results comparing unpatched and EXTENDED_ARG] > > > > I really can't explain it. Ouch! > > What do you mean exactly? There's no significant change at all. > > > I cannot explain why the unpatched version was slower than the patched one most > of the times. > > I find it silly and illogical. It rather seems that you're seeing statistics, and the impact of the change is not measurable. Nothing silly about it. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. 
Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From g.brandl at gmx.net Tue Oct 26 23:37:26 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 26 Oct 2010 23:37:26 +0200 Subject: [Python-ideas] Move Demo scripts under Lib In-Reply-To: References: Message-ID: Am 26.10.2010 20:01, schrieb Brett Cannon: > On Tue, Oct 26, 2010 at 07:56, Guido van Rossum wrote: >> On Tue, Oct 26, 2010 at 7:33 AM, Dirkjan Ochtman wrote: >>> On Tue, Oct 26, 2010 at 16:00, Alexander Belopolsky >>> wrote: >>>> What do you think? >>> >>> After browsing through the Demo dir a bit, I came away thinking most >>> of these should just be removed from the repository. >> >> +1. Most of them are either quick hacks I once wrote and didn't know >> where to put (dutree.py, repeat.py come to mind) or in some cases >> contributed 3rd party code that was looking for a home. I think that >> all of these ought to live somewhere else and I have no problem with >> tossing out the entire Demo and Tools directories -- anything that's >> not needed as part of the build should go. (Though a few things might >> indeed be moved into the stdlib if they are useful enough.) > > Just to toss in my +1, I have suggested doing this before and received > push-back in the form of "it isn't hurting anyone". But considering > how often the idea of trying to fix the directory comes up and never > occurs, it is obviously wasting people's time keeping the directories > around. So I say move the stuff needed as part fo the build or dev > process (e.g., patchcheck is in the Tools directory) and then drop the > directory. We can give a deadline of some release like Python 3.2b1 or > b2 to move scripts people care about, and then simply do a mass > deletion just before cutting a release. I've started a list of Demos and Tools here: https://spreadsheets.google.com/ccc?key=0AherhJVUN_I2dFNQdjNPMFdnOHVpdERSdWxqaXBkWWc&hl=en&authkey=CMWEn84C Please, feel free to complete and argue about fates. I'd like the corresponding actions taken by 3.2b1. (One note about the fate "showcase": it might be nice to keep a minimal set of demos from various topics as a kind of showcase what you can do with a few lines of Python.) Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From mal at egenix.com Tue Oct 26 23:46:38 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Tue, 26 Oct 2010 23:46:38 +0200 Subject: [Python-ideas] New 3.x restriction on number of keyword arguments In-Reply-To: References: <589C8BF5-F11F-4E10-A7ED-6627EF625E1C@gmail.com> <4CC1D966.2080007@egenix.com> <4CC211EE.1050308@egenix.com> <20101023004508.6a6c1373@pitrou.net> <4CC6164B.5040201@trueblade.com> <1288083018.3547.0.camel@localhost.localdomain> <4CC69FFD.7080102@egenix.com> <4CC6AFB4.3040003@egenix.com> <4CC6E7A9.4030205@egenix.com> <20101026194453.46d42bf9@pitrou.net> Message-ID: <4CC74C3E.8080909@egenix.com> Cesare Di Mauro wrote: > 2010/10/26 Antoine Pitrou > >> On Tue, 26 Oct 2010 19:22:32 +0200 >> Cesare Di Mauro >> wrote: >>> >>> At this time I have no other ideas to solve this problem. 
>>> >>> Please, let me know if there's interest on a new patch to implement the >>> "compiler-based" solution. >> >> Have you timed the EXTENDED_ARG solution? >> >> Regards >> >> Antoine. > > > I made some a few minutes ago, and the results are unbelievable and > counter-intuitive on my machine (Athlon64 2800+ socket 754, 2GB DDR 400, > Windows 7 x64, Python 3.2a3 32 bits running at high priority): > > python.exe -m timeit -r 1 -n 100000000 -s "def f(): pass" "f()" > Standard : 100000000 loops, best of 1: 0.348 usec per loop > EXTENDED_ARG: 100000000 loops, best of 1: 0.341 usec per loop > > python.exe -m timeit -r 1 -n 100000000 -s "def f(x, y, z): pass" "f(1, 2, > 3)" > Standard : 100000000 loops, best of 1: 0.452 usec per loop > EXTENDED_ARG: 100000000 loops, best of 1: 0.451 usec per loop > > python.exe -m timeit -r 1 -n 100000000 -s "def f(a = 1, b = 2, c = 3): pass" > "f(a = 1, b = 2, c = 3)" > Standard : 100000000 loops, best of 1: 0.578 usec per loop > EXTENDED_ARG: 100000000 loops, best of 1: 0.556 usec per loop > > python.exe -m timeit -r 1 -n 100000000 -s "def f(x, y, z, a = 1, b = 2, c = > 3): pass" "f(1, 2, 3, a = 1, b = 2, c = 3)" > Standard : 100000000 loops, best of 1: 0.761 usec per loop > EXTENDED_ARG: 100000000 loops, best of 1: 0.739 usec per loop > > python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args): pass" "f(1, 2, 3)" > Standard : 100000000 loops, best of 1: 0.511 usec per loop > EXTENDED_ARG: 100000000 loops, best of 1: 0.508 usec per loop > > python.exe -m timeit -r 1 -n 100000000 -s "def f(**Keys): pass" "f(a = 1, b > = 2, c = 3)" > Standard : 100000000 loops, best of 1: 0.789 usec per loop > EXTENDED_ARG: 100000000 loops, best of 1: 0.784 usec per loop > > python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args, **Keys): pass" "f(1, > 2, 3, a = 1, b = 2, c = 3)" > Standard : 100000000 loops, best of 1: 1.01 usec per loop > EXTENDED_ARG: 100000000 loops, best of 1: 1.01 usec per loop > > python.exe -m timeit -r 1 -n 100000000 -s "def f(*Args, **Keys): pass" "f()" > Standard : 100000000 loops, best of 1: 0.393 usec per loop > EXTENDED_ARG: 100000000 loops, best of 1: 0.41 usec per loop > > I really can't explain it. Ouch! Looks like a good solution to the problem - no performance loss and a much higher limit on the number of arguments. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 26 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ncoghlan at gmail.com Wed Oct 27 00:14:14 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 27 Oct 2010 08:14:14 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> Message-ID: On Wed, Oct 27, 2010 at 3:33 AM, Guido van Rossum wrote: > On Tue, Oct 26, 2010 at 7:44 AM, Jacob Holm wrote: >> A different way to handle this would be to change the PEP 380 expansion >> as follows: >> >> [...] >> - except GeneratorExit as _e: >> + except (GeneratorReturn, GeneratorExit) as _e: >> [...] 
> > That just strikes me as one more reason why a separate GeneratorReturn > is a bad idea. > > In my ideal world, you almost never need to catch or raise > StopIteration; you don't raise GeneratorExit (that is close()'s job) > but you catch it to notice that your data source is finished, and then > you return a value. (And see my crazy idea in my previous post to get > rid of that too. :-)

Jacob's "implications for PEP 380" exploration started to give me some doubts, but I think there are actually some flaws in his argument. Accordingly, I would like to make one more attempt at explaining why I think throwing in a separate exception for this use case is valuable (and *doesn't* require any changes to PEP 380).

As I see it, there's a bit of a disconnect between many PEP 380 use cases and any mechanism or idiom which translates a thrown in exception into an ordinary StopIteration. If you expect your thrown in exception to always terminate the generator in some fashion, adopting the latter idiom in your generator will make it potentially unsafe to use in a "yield from" expression that isn't the very last yield operation in any outer generator. Consider the following:

def example(arg):
    try:
        yield arg
    except GeneratorExit:
        return "Closed"
    return "Finished"

def outer_ok1(arg): # close() after next() returns "Closed"
    return yield from example(arg)

def outer_ok2(arg): # close() after next() returns None
    yield from example(arg)

def outer_broken(arg): # close() after next() gives RuntimeError
    val = yield from example(arg)
    yield val

# All 3 cases: close() before next() returns None
# All 3 cases: close() after 2x next() returns None

Using close() to say "give me your return value" creates the risk of hitting those runtime errors in a generator's __del__ method, and exceptions in __del__ are always a bit ugly. Keeping the "give me your return value" and "clean up your resources" concerns separate by adding a new method and thrown exception means that close() is less likely to unpredictably raise RuntimeError (and when it does, will reliably indicate a genuine bug in a generator somewhere that is suppressing GeneratorExit).

As far as PEP 380's semantics go, I think it should ignore the existence of anything like GeneratorReturn completely. Either one of the generators in the chain will catch the exception and turn it into StopIteration, or they won't. If they convert it to StopIteration, and they aren't the last generator in the chain, then maybe what actually needs to happen at the outermost level is something like this:

class GeneratorReturn(Exception):
    pass

def finish(gen):
    try:
        gen.throw(GeneratorReturn) # Ask generator to wrap things up
    except StopIteration as err:
        if err.args:
            return err.args[0]
    except GeneratorReturn:
        pass
    else:
        # Asking nicely didn't work, so force resource cleanup
        # and treat the result as if the generator had already
        # been exhausted or hadn't started yet
        gen.close()
    return None

Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Wed Oct 27 00:28:07 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 27 Oct 2010 08:28:07 +1000 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: References: <20101025154932.06be2faf@o> Message-ID: On Wed, Oct 27, 2010 at 12:24 AM, Boris Borcic wrote: > On Tue, Oct 26, 2010 at 2:35 PM, Nick Coghlan wrote: >> But split only makes sense for strings, not arbitrary sequences. It's >> the other way around for join.
> > I don't feel your "the other way around" makes clear sense. Indeed, I realised my comment was slightly ambigous some time after I posted it. "the other way around" refers to the English sentence, not to the Python parameter order (i.e. join makes sense for arbitrary sequences, not just strings). If you're looking for the relevant piece of the Zen here, it's "practicality beats purity". string.join used to be used primarily as a function, but people had trouble remembering the parameter order. Locking it in as a str method on the separator made the argument order easier to remember at the cost of making it somewhat unintuitive to learn in the first place (making it a method of the sequence being joined was not an option, since join accepts arbitrary iterables). Absolutely nothing has changed in the intervening years to affect the rationale of that decision, so you can rail against it all you want (with some justification) but you aren't going to change it. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From kristjan at ccpgames.com Wed Oct 27 06:02:11 2010 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=) Date: Wed, 27 Oct 2010 12:02:11 +0800 Subject: [Python-ideas] ExternalMemory Message-ID: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB5E@exchcn.ccp.ad.local> I find myself sometimes, when writing IO Code from C, wanting to pass memory that I have allocated internally and filled with data, to Python without copying it into a string object. To this end, I have locally created a (2.x) method called PyBuffer_FromMemoryAndDestructor(), which is the same as PyBuffer_FromMemory() except that it will call a provided destructor function with an optional arg, to release the memory address given, when no longer in use. First of all, I'd futilely like to suggest this change for 2.x. The existing PyBuffer_FromMemory() provides no lifetime management. Second, the ByBuffer object doesn't support the new Py_buffer interface, so you can't really use this then, like putting a memoryview around it. This is a fixable bug, otoh. Thirdly, in py3k I think the situation is different. There you would (probably, correct me if I'm wrong) emulate the old PyBuffer_FromMemory with a combination of the new PyBuffer_FromContiguous and a PyMemoryView_FromBuffer(). But this also does not allow any lifetime magement of the external memory. So, for py3k, I'd actually like to extend the Memoryview object, and provide something like PyMemoryView_FromExternal() that takes an optional pointer to a "void destructor(void *arg, void *ptr)) and an (void *arg), to be called when the buffer is released. K -------------- next part -------------- An HTML attachment was scrubbed... URL: From lie.1296 at gmail.com Wed Oct 27 07:07:39 2010 From: lie.1296 at gmail.com (Lie Ryan) Date: Wed, 27 Oct 2010 16:07:39 +1100 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> <4CB8B2F5.2020507@ronadam.com> Message-ID: On 10/26/10 05:53, Guido van Rossum wrote: >> Guido van Rossum wrote: > [...] >>> This should not require threads. >>> >>> Here's a bare-bones sketch using generators: > [...] 
> > On Mon, Oct 25, 2010 at 10:10 AM, Peter Otten <__peter__ at web.de> wrote: >> I don't think the generator-based approach is equivalent to what Lie Ryan's >> threaded code does. You are calling max(a, b) 99 times while Lie calls >> max(items) once. > > True. Nevertheless, my point stays: you shouldn't have to use threads > to do such concurrent computations over a single-use iterable. Threads > too slow and since there is no I/O multiplexing they don't offer > advantages. > >> Is it possible to calculate min(items) and max(items) simultaneously using >> generators? I don't see how... > > No, this is why the reduce-like approach is better for such cases. > Otherwise you keep trying to fit a square peg into a round hold. except the max(a, b) is an attempt to find square hole to fit the square peg, and the max([a]) attempt is trying to find a round peg to fit the round hole with. From lie.1296 at gmail.com Wed Oct 27 07:19:31 2010 From: lie.1296 at gmail.com (Lie Ryan) Date: Wed, 27 Oct 2010 16:19:31 +1100 Subject: [Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence In-Reply-To: <4CC6A933.3080605@pearwood.info> References: <1ADBA76FE1F94E2CAA5211B04E950B48@PTMCG> <201010111017.56101.steve@pearwood.info> <4CB7B7C2.8090401@ronadam.com> <4CB88BD2.4010901@ronadam.com> <4CB898CD.6000207@ronadam.com> <4CB8B2F5.2020507@ronadam.com> <4CC6A933.3080605@pearwood.info> Message-ID: On 10/26/10 21:10, Steven D'Aprano wrote: > def multi_reduce(iterable, funcs): > it = iter(iterable) > collectors = [next(it)]*len(funcs) > for i, f in enumerate(funcs): > x = next(it) > collectors[i] = f(collectors[i], x) > return collectors > > I've called it multi_reduce rather than parallel_reduce, because it > doesn't execute the functions in parallel. By my testing on Python > 3.1.1, multi_s designed for functions with the signature `func([object])` (a function that takes, as argument, a list of objects). I believereduce is consistently ~30% faster than the generator based > solution for lists with 1000 - 10,000,000 items. > > So what am I missing? What does your parallel_reduce give us that > multi_reduce doesn't? The parallel_reduce() is specifically designed for for functions with the signature `func([object])` (a function that takes, as argument, a list of objects). The idea is that, you can write your func() iteratively, and parallel_reduce() will somehow handle splitting work into multiple funcs, as if you tee() the iterator, but without caching the whole iterable. Maybe max and min is a bad example, as it happens to be the case that max and min have the alternative signature `func(int, int)` which makes it a better fit with the traditional reduce() approach (and as it happens to be, parallel_reduce() seems to be a bad name as well, since it's not related to the traditional reduce() in any way). And I believe you do miss something: >>> multi_reduce([1, 2, 3], [max, min]) [2, 1] >>> parallel_reduce([1, 2, 3], [max, min]) [3, 1] I don't think that my original attempt with threading is an ideal solution either, as Guido stated, it's too complicated for such a simple problem. cProfile even shows that 30% of the its time is spent waiting on acquiring locks. 
The ideal solution would probably require a way for a function to interrupt its own execution (when the teed iterator is exhausted, but there is still some item in iterable), let other part of code continues (the iterator feeder, and other funcs), and then resume where it was left off (which is why I think cofunction is probably the way to go, assuming I understand correctly what cofunction is). In diagram: Initially, producer creates a few funcs, and feeds them a suspendable teed-iterators: +--------+ +------------+ true iterator +---| func_1 |---| iterator_1 |--[1, ...] [1, 2, ..] | +--------+ +------------+ | | +**********+ | +--------+ +------------+ * producer *----+---| func_2 |---| iterator_2 |--[1, ...] +**********+ | +--------+ +------------+ | | +--------+ +------------+ +---| func_3 |---| iterator_3 |--[1, ...] +--------+ +------------+ First, func_1 is executed, and iterator_1 produce item 1: +********+ +************+ true iterator +---* func_1 *---* iterator_1 *--[*1*, ...] [*1*, 2, ..] | +********+ +************+ | | +----------+ | +--------+ +------------+ | producer |----+---| func_2 |---| iterator_2 |--[1, ...] +----------+ | +--------+ +------------+ | | +--------+ +------------+ +---| func_3 |---| iterator_3 |--[1, ...] +--------+ +------------+ then iterator_1 suspends execution, giving control back to producer: +========+ +============+ true iterator +---| func_1 |---| iterator_1 |--[...] [*1*, 2, ..] | +========+ +============+ | | +**********+ | +--------+ +------------+ * producer *----+---| func_2 |---| iterator_2 |--[1, ...] +**********+ | +--------+ +------------+ | | +--------+ +------------+ +---| func_3 |---| iterator_3 |--[1, ...] +--------+ +------------+ Then, producer give execution to func_2: +========+ +============+ true iterator +---| func_1 |---| iterator_1 |--[...] [*1*, 2, ..] | +========+ +============+ | | +----------+ | +********+ +************+ | producer |----+---* func_2 *---* iterator_2 *--[*1*, ...] +----------+ | +********+ +************+ | | +--------+ +------------+ +---| func_3 |---| iterator_3 |--[1, ...] +--------+ +------------+ func_2 processes item 1, then iterator_2 suspends and give control back to producer: +========+ +============+ true iterator +---| func_1 |---| iterator_1 |--[...] [*1*, 2, ..] | +========+ +============+ | | +**********+ | +========+ +============+ * producer *----+---| func_2 |---| iterator_2 |--[...] +**********+ | +========+ +============+ | | +--------+ +------------+ +---| func_3 |---| iterator_3 |--[1, ...] +--------+ +------------+ and now it's func_3's turn: +========+ +============+ true iterator +---| func_1 |---| iterator_1 |--[...] [*1*, 2, ..] | +========+ +============+ | | +----------+ | +========+ +============+ | producer |----+---| func_2 |---| iterator_2 |--[...] +----------+ | +========+ +============+ | | +********+ +************+ +---* func_3 *---* iterator_3 *--[*1*, ...] +********+ +************+ func_3 processes item 1, then iterator_3 suspends and give control back to producer: +========+ +============+ true iterator +---| func_1 |---| iterator_1 |--[...] [*1*, 2, ..] | +========+ +============+ | | +**********+ | +========+ +============+ * producer *----+---| func_2 |---| iterator_2 |--[...] +**********+ | +========+ +============+ | | +========+ +============+ +---| func_3 |---| iterator_3 |--[...] +========+ +============+ all funcs already consumed item 1, so producer advances (next()-ed)the "true iterator", and feeds it to the teed-iterator. 
+========+ +============+ true iterator +---| func_1 |---| iterator_1 |--[2, ...] [*2*, 3, ..] | +========+ +============+ | | +**********+ | +========+ +============+ * producer *----+---| func_2 |---| iterator_2 |--[2, ...] +**********+ | +========+ +============+ | | +========+ +============+ +---| func_3 |---| iterator_3 |--[2, ...] +========+ +============+ then producer resumes func_1, and it processes item 2: +********+ +************+ true iterator +---* func_1 *---* iterator_1 *--[*2*, ...] [*2*, 3, ..] | +********+ +************+ | | +----------+ | +========+ +============+ | producer |----+---| func_2 |---| iterator_2 |--[2, ...] +----------+ | +========+ +============+ | | +========+ +============+ +---| func_3 |---| iterator_3 |--[2, ...] +========+ +============+ then the same thing happens to func_2 and func_3; and repeat this until the "true iterator" is exhausted. When the true iterator is exhausted, producer signals iterator_1, iterator_2, and iterator_3 so they raises StopIteration, causing func_1, func_2, and func_3 to return a result. And producer collects the result into a list and return to the result to its caller. Basically, it is a form of cooperative multithreading where iterator_XXX (instead of func_xxx) decides when to suspend the execution of func_XXX (in this particular case, when its own cache is exhausted, but there is still some item in the true iterator). The advantage is that func_1, func_2, and func_3 can be written iteratively (i.e. as func([object])), as opposed to reduce-like approach. If performance is important, iterator_xxx can feed multiple items to func_xxx before suspending. Also, it should require no locking as object sharing and suspending execution is controlled by iterator_xxx (instead of the indeterministic preemptive threading). From denis.spir at gmail.com Wed Oct 27 09:10:28 2010 From: denis.spir at gmail.com (spir) Date: Wed, 27 Oct 2010 09:10:28 +0200 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: <4CC6FD33.4050305@pearwood.info> References: <20101025154932.06be2faf@o> <4CC6FD33.4050305@pearwood.info> Message-ID: <20101027091028.11322756@o> On Wed, 27 Oct 2010 03:09:23 +1100 Steven D'Aprano wrote: > Boris Borcic wrote: > > > And let's then propagate that notion, to a *coherent* definition of > > split that makes it as well a method on the separator. > > Let's not. > > Splitting is not something that you on the separator, it's something you > do on the source string. I'm sure you wouldn't expect this: > > ":".find("key:value") > => 3 > > Nor should we expect this: > > ":".split("key:value") > => ["key", "value"] > > > You perform a search *on* the source string, not the target substring. > Likewise you split the source string, not the separator. I completely share this view. Also, when one needs to split on multiple seps, repetitive seps, or even more complex separation schemes, it makes even less sense to see split applying on the sep, instead of on the string. Even less when splitting should remove empty parts generated by seps at both end or repeted seps. Note that it's precisely what split() without sep does: >>> s = " some \t little words " >>> s.split() ['some', 'little', 'words'] >>> s.split(' ') ['', 'some', '', 'little', '', '', 'words', '', ''] Finally, in any of such cases, join is _not_ a reverse function for split. split in the general case is not reversable because there is loss of information. 
It is possible only with a pattern limited to a single sep, no (implicit) repetition, and keeping empty parts at ends. Very fine that python's split semantics is so defined, one cannot think at split as reversible in general (*). Denis (*) Similar rule: one cannot rewrite original code from an AST: there is loss of information. One can only write code in a standard form that has same semantics (hopefully). -- -- -- -- -- -- -- vit esse estrany ? spir.wikidot.com From jh at improva.dk Wed Oct 27 09:57:16 2010 From: jh at improva.dk (Jacob Holm) Date: Wed, 27 Oct 2010 09:57:16 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6C7F3.6090405@improva.dk> Message-ID: <4CC7DB5C.9060304@improva.dk> On 2010-10-26 19:01, Guido van Rossum wrote: > On Tue, Oct 26, 2010 at 5:22 AM, Jacob Holm wrote: > [...] >>>> Here's a stupid idea... let g.close take an optional argument that it >>>> can return if the generator is already exhausted and let it return the >>>> value from the StopIteration otherwise. >>>> >>>> def close(self, default=None): >>>> if self.gi_frame is None: >>>> return default >>>> try: >>>> self.throw(GeneratorExit) >>>> except StopIteration as e: >>>> return e.args[0] >>>> except GeneratorExit: >>>> return None >>>> else: >>>> raise RuntimeError('generator ignored GeneratorExit') >>> >>> You'll have to explain why None isn't sufficient. > >> It is not really necessary, but seemed "cleaner" somehow. Think of >> "g.close(default)" as "get me the result if possible, and this default >> otherwise". Then think of dict.get()... > > Hm, I'd say there always is a result -- it just sometimes is None. I > really don't want to make distinctions between falling off the end of > the function, "return" without a value, "return None", "raise > StopIteration()", "raise StopIteration(None)", or even (in response to > a close() request) "raise GeneratorExit". None of these cover the distinction I am making. I want to distinguish between a non-exhausted and an exhausted generator. When calling close on a non-exhausted generator, the generator decides how to return by any one of the means you mentioned. In this case you are right that there is always a result. When calling close on an exhausted generator, the generator has no choice in the matter as the "true" return value was thrown away. We have to return *something*, but calling it the "result" of the generator is stretching it too far. Making it possible to return something other than None in this case seems to be analogous to dict.get(). If we chose to use a different method (e.g. Nicks "finish") for getting the "result", I would instead raise a RuntimeError when calling it on an exhausted generator. i.o.w, I would want it defined something like this: def finish(self): if self.gi_frame is None: raise RuntimeError('generator already finished') try: self.throw(GeneratorExit) except StopIteration as e: return e.args[0] except GeneratorExit: return None # XXX debatable but unimportant to me else: raise RuntimeError('generator ignored GeneratorExit') (possibly using a new GeneratorReturn exception instead) You might argue for using a different exception for signaling the exhausted case, e.g.: class GeneratorFinishedError(StandardError): """finish() called on exhaused generator.""" but that only really makes sense if you think calling finish without knowing whether the generator is exhausted is a reasonable thing to do. 
*If* that is the case, we should also consider adding a 'default' argument to finish which (if provided) could be returned instead of raising the exception (kind of like dict.pop). - Jacob From solipsis at pitrou.net Wed Oct 27 12:48:26 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 27 Oct 2010 12:48:26 +0200 Subject: [Python-ideas] ExternalMemory References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB5E@exchcn.ccp.ad.local> Message-ID: <20101027124826.7b925a8c@pitrou.net> On Wed, 27 Oct 2010 12:02:11 +0800 Kristj?n Valur J?nsson wrote: > > First of all, I'd futilely like to suggest this change for 2.x. The existing > PyBuffer_FromMemory() provides no lifetime management. By "futilely" you mean you know it won't be accepted, since 2.x is in bug fixes-only mode? :) > So, for py3k, I'd actually like to extend the Memoryview object, and > provide something like PyMemoryView_FromExternal() that takes an > optional pointer to a "void destructor(void *arg, void *ptr)) and an > (void *arg), to be called when the buffer is released. Sounds reasonable to me. Regards Antoine. From cmjohnson.mailinglist at gmail.com Wed Oct 27 13:22:41 2010 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Wed, 27 Oct 2010 01:22:41 -1000 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: References: <20101025154932.06be2faf@o> Message-ID: The downside of flipping the object and parameter of split is that there's no clear thing to translate "blah\tblah\nblah".split() ==> ['blah', 'blah', 'blah'] into. None.split(string) is crazy talk. Then again, the case can be made that split() doesn't behave like the other splits (it drops empty segments; it treats all whitespace the same), so maybe it shouldn't have the same name as the normal kind of split. I do think that it might be convenient to be able to do this: commasplit = ', '.divide #If we're going to imagine this, we should probably use a different name than "split" list1 = commasplit(string1) list2 = commasplit(string2) ? The same way that one can do: commajoin = ', '.join string1 = commajoin(list1) string2 = commajoin(list2) ? But the convention is too old and the advantage is too slight to bother with sort of bikeshedding now. Save it for when you design a new language to replace Python. :-) -- Carl From bborcic at gmail.com Wed Oct 27 14:17:44 2010 From: bborcic at gmail.com (Boris Borcic) Date: Wed, 27 Oct 2010 14:17:44 +0200 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: References: <20101025154932.06be2faf@o> Message-ID: Carl M. Johnson wrote: > The downside of flipping the object and parameter of split is that > there's no clear thing to translate "blah\tblah\nblah".split() ==> > ['blah', 'blah', 'blah'] into. None.split(string) is crazy talk. ''.join(seqofstr) is deemed better-looking than sum(seqofstr), isn't it? Imo, this entails an aesthetic canon in favor of ''.split in the above context. Note that currently s.split('') bombs, so there would be no functional behavior to save. > Then > again, the case can be made that split() doesn't behave like the other > splits (it drops empty segments; it treats all whitespace the same), > so maybe it shouldn't have the same name as the normal kind of split. > > I do think that it might be convenient to be able to do this: > > commasplit = ', '.divide #If we're going to imagine this, we should > probably use a different name than "split" > list1 = commasplit(string1) > list2 = commasplit(string2) > ? 
> > The same way that one can do: > > commajoin = ', '.join > string1 = commajoin(list1) > string2 = commajoin(list2) Yeah, that's a concrete rendition of my earlier point on bound methods. > ? > > But the convention is too old and the advantage is too slight to > bother with sort of bikeshedding now. Save it for when you design a > new language to replace Python. :-) Thanks :) But if I was to redesign the snake, I guess I might contemplate pieces = string/separator to mean pieces = string.split(separator) :) Cheers, BB From bborcic at gmail.com Wed Oct 27 15:16:46 2010 From: bborcic at gmail.com (Boris Borcic) Date: Wed, 27 Oct 2010 15:16:46 +0200 Subject: [Python-ideas] textFromMap(seq , map=None , sep='' , ldelim='', rdelim='') In-Reply-To: <20101027091028.11322756@o> References: <20101025154932.06be2faf@o> <4CC6FD33.4050305@pearwood.info> <20101027091028.11322756@o> Message-ID: spir wrote: > On Wed, 27 Oct 2010 03:09:23 +1100 > Steven D'Aprano wrote: > >> Boris Borcic wrote: >> >>> And let's then propagate that notion, to a *coherent* definition of >>> split that makes it as well a method on the separator. >> >> Let's not. >> >> Splitting is not something that you on the separator, it's something you >> do on the source string. I'm sure you wouldn't expect this: >> >> ":".find("key:value") >> => 3 >> >> Nor should we expect this: >> >> ":".split("key:value") >> => ["key", "value"] >> >> >> You perform a search *on* the source string, not the target substring. >> Likewise you split the source string, not the separator. > > I completely share this view. Pack behavior ! Where's the alpha male ? :) > Also, when one needs to split on multiple seps, repetitive seps, or even > more complex separation schemes, it makes even less sense to see split > applying on the sep, instead of on the string. Now that's a mighty strange argument, unless you think of /split/ as some sort of multimethod. I didn't mean to deprive you of your preferred swiss army knife :) Obviously the algorithm must change according to the sort of "separation scheme". Isn't it then a natural anticipation to see the dispatch effected along the lines of Python's native object orientation ? Maybe though, this is a case of the user overstepping into the private coding business of language implementors. But on the user's own coding side, the more complex the "separation scheme", the most likely it is that code written to achieve it using /split/, applies multiply on *changing* input "source string"s. What in turn would justify that the action name /split/ be bound more tightly to the relatively stable "separation scheme" than to the relatively unstable "source string". > Even less when splitting should remove empty parts generated by seps at both > end or repeted seps. Note that it's precisely what split() without sep does: > >>>> s = " some \t little words " >>>> s.split() > ['some', 'little', 'words'] >>>> s.split(' ') > ['', 'some', '', 'little', '', '', 'words', '', ''] /split/ currently behaves as it does currently, sure. If it was bound on the separator, s.split() could naturally be written ''.split(s) - so what's your point ? As I told Johnson, deeming ''.join(seqofstr) better-looking than sum(seqofstr) entails promotion of aesthetic sense in favor of ''.split... > > Finally, in any of such cases, join is _not_ a reverse function for split. > split in the general case is not reversable because there is loss of information. 
> It is possible only with a pattern limited to a single sep, no (implicit) repetition, >and keeping empty parts at ends. Very fine that python's split semantics > is so defined, one cannot think at split as reversible in general (*). Now that's gratuitous pedantry ! Note that given f = sep.join g = lambda t : t.split(sep) it is true that g(f(g(x)))==g(x) and f(g(f(y)))==f(y) for whatever values of sep, x, and y that do not provoke any exception. What covers all natural use cases with the notable exception of s.split(), iow sep=None. That is clearly enough to justify calling, as I did, /split/ the "converse" of /join/ (note the order, sep.join applied first, which eliminates sep=None as a use case) And iirc, the mathematical notion that best fits the idea, is not that of http://en.wikipedia.org/wiki/Inverse_function but that of http://en.wikipedia.org/wiki/Adjoint_functors Cheers, BB From rrr at ronadam.com Wed Oct 27 17:01:00 2010 From: rrr at ronadam.com (Ron Adam) Date: Wed, 27 Oct 2010 10:01:00 -0500 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> Message-ID: <4CC83EAC.7010001@ronadam.com> On 10/25/2010 10:25 PM, Guido van Rossum wrote: > By the way, here's how to emulate the value-returning-close() on a > generator, assuming the generator uses raise StopIteration(x) to mean > return x: > > def gclose(gen): > try: > gen.throw(GeneratorExit) > except StopIteration, err: > if err.args: > return err.args[0] > except GeneratorExit: > pass > return None > > I like this because it's fairly straightforward (except for the detail > of having to also catch GeneratorExit). > > In fact it would be a really simple change to gen_close() in > genobject.c -- the only change needed there would be to return > err.args[0]. I like small evolutionary improvements to APIs. Here's an interesting idea... It looks like a common case for consumer co-functions is they need to be started and then closed, so I'm wondering if we can make these work context managers? That may be a way to reduce the need for the try/except blocks inside the generators. with my_cofunction(args) as c: ... use c Regards, Ron From alexander.belopolsky at gmail.com Wed Oct 27 18:05:58 2010 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 27 Oct 2010 12:05:58 -0400 Subject: [Python-ideas] Move Demo scripts under Lib In-Reply-To: References: Message-ID: I would like to report a conclusion reached on the tracker to a wider audience before committing the changes. The new home for Demo/turtle is Lib/turtledemo. (Lib/turtle/demo alternative received no support and Lib/demo/turtle was not even in the running.) If anyone is interested in reviewing the patch, please see http://bugs.python.org/issue10199. Note that I tried to limit changes to what was necessary for running the demo script as python -m demoturtle. Running the scripts as unit tests and from python prompt will be subject of a separate issue. On Tue, Oct 26, 2010 at 11:49 AM, Alexander Belopolsky wrote: > On Tue, Oct 26, 2010 at 11:18 AM, Guido van Rossum wrote: >> On Tue, Oct 26, 2010 at 8:13 AM, Alexander Belopolsky >> wrote: >>> The one demo that I want to find a better place for is Demo/turtle. >> >> Sure, go for it. It is a special case because the turtle module is >> also in the stdlib and these are intended for a particular novice >> audience. > > Please see http://bugs.python.org/issue10199 for further discussion. 
> From jh at improva.dk Wed Oct 27 18:53:07 2010 From: jh at improva.dk (Jacob Holm) Date: Wed, 27 Oct 2010 18:53:07 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> Message-ID: <4CC858F3.4000602@improva.dk> On 2010-10-26 18:56, Guido van Rossum wrote: > Now, if I may temporarily go into wild-and-crazy mode (this *is* > python-ideas after all :-), we could invent some ad-hoc syntax for > this pattern, e.g.: > > for value in yield: > > return > > IOW the special form: > > for in yield: > > > would translate into: > > try: > while True: > = yield > > except GeneratorExit: > pass > > If (and this is a big if) the > while-True-yield-inside-try-except-GeneratorExit pattern somehow > becomes popular we could reconsider this syntactic extension or some > variant. (I have to add that the syntactic ice is a bit thin here, > since "for in (yield)" already has a meaning, and a totally > different one of course. A variant could be "for from yield" or > some other abuse of keywords. Hmm. This got me thinking. One thing I'd really like to see in python is something like the "channel" object from the go language (http://golang.org/). Based on PEP 380 or Gregs new cofunctions PEP (or perhaps even without any of them) it is possible to write a trampoline-based implementation of a channel object with "send" and "next" methods that work as expected. One thing that is *not* possible (I think) is to make that object iterable. Your wild idea above gave me a similar wild idea of my own. An extension to the cofunctions PEP that would make that possible. 1) Define a new "coiterator" protocol, consisting of a new special method __conext__, and a new StopCoIteration exception that the regular StopIteration inherits from. __conext__ should be a generator that yields as many times as necessary, then either raises StopCoIteration or returns a result (possibly by raising a StopIteration with a value). Add a new built-in "conext" cofunction that looks for a __conext__ method instead of a __next__ method. 2) Define a new "coiterable" protocol, consisting of a new special method __coiter__. __coiter__ is a regular function and should return an object implementing the "coiterator" protocol. Add a new built-in "coiter" function that looks for a __coiter__ method instead of an __iter__ method. (We could also make this a cofunction but for now I don't see the point). 3) Make sure that the for-loop in a cofunction: for val in coiterable: else: expands as: _it = coiter(coiterable) while True: try: val = cocall conext(_it) except StopCoIteration: break else: Which is exactly the same as in a normal function, except for the use of "coiter" and "cocall conext" instead of "iter" and "next", and the use of StopCoIteration instead of StopIteration. 3a) Alternatively define a new syntax for "coiterating" that expands as in 3 and whose use is an alternative indicator that this is a cofunction. All this to make it possible to write a code like this: def consumer(ch): for val in ch: cocall print(val) # XXX need a cocall somewhere def producer(ch): for val in range(10): cocall ch.send(val) def main() sched = scheduler() ch = channel() sched.add(consumer(ch)) sched.add(producer(ch)) sched.run() Thoughts? - Jacob From rrr at ronadam.com Wed Oct 27 18:18:58 2010 From: rrr at ronadam.com (Ron Adam) Date: Wed, 27 Oct 2010 11:18:58 -0500 Subject: [Python-ideas] PEP 380 close and contextmanagers? 
In-Reply-To: <4CC83EAC.7010001@ronadam.com> References: <4CC63065.9040507@improva.dk> <4CC83EAC.7010001@ronadam.com> Message-ID: <4CC850F2.7010202@ronadam.com> On 10/27/2010 10:01 AM, Ron Adam wrote: > > On 10/25/2010 10:25 PM, Guido van Rossum wrote: >> By the way, here's how to emulate the value-returning-close() on a >> generator, assuming the generator uses raise StopIteration(x) to mean >> return x: >> >> def gclose(gen): >> try: >> gen.throw(GeneratorExit) >> except StopIteration, err: >> if err.args: >> return err.args[0] >> except GeneratorExit: >> pass >> return None >> >> I like this because it's fairly straightforward (except for the detail >> of having to also catch GeneratorExit). >> >> In fact it would be a really simple change to gen_close() in >> genobject.c -- the only change needed there would be to return >> err.args[0]. I like small evolutionary improvements to APIs. > > Here's an interesting idea... > > It looks like a common case for consumer co-functions is they need to be > started and then closed, so I'm wondering if we can make these work > context managers? That may be a way to reduce the need for the > try/except blocks inside the generators. It looks like No context managers return values in the finally or __exit__ part of a context manager. Is there way to do that? Here's a context manager version of the min/max with nested coroutines, but it doesn't return a value from close. ###### from contextlib import contextmanager # New close function that enables returning a # value. def gclose(gen): try: gen.throw(GeneratorExit) except StopIteration as err: if err.args: return err.args[0] except GeneratorExit: pass return None # Showing both the class and geneator based # context managers for comparison and to better # see how these things may work. class Consumer: def __init__(self, cofunc): next(cofunc) self.cofunc = cofunc def __enter__(self): return self.cofunc def __exit__(self, *exc_info): gclose(self.cofunc) @contextmanager def consumer(cofunc): next(cofunc) try: yield cofunc finally: gclose(cofunc) class MultiConsumer: def __init__(self, cofuncs): for c in cofuncs: next(c) self.cofuncs = cofuncs def __enter__(self): return self.cofuncs def __exit__(self, *exc_info): for c in self.cofuncs: gclose(c) @contextmanager def multiconsumer(cofuncs): for c in cofuncs: next(c) try: yield cofuncs finally: for c in cofuncs: gclose(c) # Min/max coroutine example slpit into # nested coroutines for testing these ideas # in a more complex situation that may arise # when working with cofunctions and generators. # Question: # How to rewrite this so close returns # a final value? def reduce_i(f): i = yield while True: i = f(i, (yield i)) def reduce_it_to(funcs): with multiconsumer([reduce_i(f) for f in funcs]) as mc: values = None while True: i = yield values values = [c.send(i) for c in mc] def main(): with consumer(reduce_it_to([min, max])) as c: for i in range(100): value = c.send(i) print(value) if __name__ == '__main__': main() From guido at python.org Wed Oct 27 20:38:49 2010 From: guido at python.org (Guido van Rossum) Date: Wed, 27 Oct 2010 11:38:49 -0700 Subject: [Python-ideas] PEP 380 close and contextmanagers? In-Reply-To: <4CC850F2.7010202@ronadam.com> References: <4CC63065.9040507@improva.dk> <4CC83EAC.7010001@ronadam.com> <4CC850F2.7010202@ronadam.com> Message-ID: On Wed, Oct 27, 2010 at 9:18 AM, Ron Adam wrote: > > > On 10/27/2010 10:01 AM, Ron Adam wrote: > It looks like No context managers return values in the finally or __exit__ > part of a context manager. 
?Is there way to do that? How would that value be communicated to the code containing the with-clause? > Here's a context manager version of the min/max with nested coroutines, but > it doesn't return a value from close. > > > ###### > from contextlib import contextmanager > > > # New close function that enables returning a > # value. > > def gclose(gen): > ? try: > ? ? gen.throw(GeneratorExit) > ? except StopIteration as err: > ? ? if err.args: > ? ? ? return err.args[0] > ? except GeneratorExit: > ? ? pass > ? return None > > > # Showing ?both the class and geneator based > # context managers for comparison and to better > # see how these things may work. > > class Consumer: > ? ?def __init__(self, cofunc): > ? ? ? ?next(cofunc) > ? ? ? ?self.cofunc = cofunc > ? ?def __enter__(self): > ? ? ? ?return self.cofunc > ? ?def __exit__(self, *exc_info): > ? ? ? ?gclose(self.cofunc) > > @contextmanager > def consumer(cofunc): > ? ?next(cofunc) > ? ?try: > ? ? ? ?yield cofunc > ? ?finally: > ? ? ? ?gclose(cofunc) > > > class MultiConsumer: > ? ?def __init__(self, cofuncs): > ? ? ? ?for c in cofuncs: > ? ? ? ? ? ?next(c) > ? ? ? ?self.cofuncs = cofuncs > ? ?def __enter__(self): > ? ? ? ?return self.cofuncs > ? ?def __exit__(self, *exc_info): > ? ? ? ?for c in self.cofuncs: > ? ? ? ? ? ?gclose(c) > > @contextmanager > def multiconsumer(cofuncs): > ? ?for c in cofuncs: > ? ? ? ?next(c) > ? ?try: > ? ? ? ?yield cofuncs > ? ?finally: > ? ? ? ?for c in cofuncs: > ? ? ? ? ? ?gclose(c) So far so good. > # Min/max coroutine example slpit into > # nested coroutines for testing these ideas > # in a more complex situation that may arise > # when working with cofunctions and generators. > > # Question: > # ? ?How to rewrite this so close returns > # ? ?a final value? Change the function to catch GeneratorExit and when it catches that, raise StopIteration(). > def reduce_i(f): > ? ? i = yield > ? ? while True: > ? ? ? ? i = f(i, (yield i)) Unfortunately from here on till the end of your example my brain exploded. > def reduce_it_to(funcs): > ? ?with multiconsumer([reduce_i(f) for f in funcs]) as mc: > ? ? ? ?values = None > ? ? ? ?while True: > ? ? ? ? ? ?i = yield values > ? ? ? ? ? ?values = [c.send(i) for c in mc] Maybe you could have picked a better name than 'i' for this variable... > def main(): > ? ?with consumer(reduce_it_to([min, max])) as c: > ? ? ? ?for i in range(100): > ? ? ? ? ? ?value = c.send(i) > ? ? ? ?print(value) I sort of get what you are doing here but I think you left one abstraction out. Something like this: def blah(it, funcs): with consumer(reduce_it_to(funcs) as c: for i in it: value = c.send(i) return value def main(): print(blah(range(100), [min, max])) > if __name__ == '__main__': > ? ?main() -- --Guido van Rossum (python.org/~guido) From lie.1296 at gmail.com Wed Oct 27 19:59:44 2010 From: lie.1296 at gmail.com (Lie Ryan) Date: Thu, 28 Oct 2010 04:59:44 +1100 Subject: [Python-ideas] Move Demo scripts under Lib In-Reply-To: References: Message-ID: On 10/27/10 02:13, Alexander Belopolsky wrote: > Introduction of -m option has changed that IMO. For example, when I > work with recent versions of python, I always run pydoc as python -m > pydoc because pydoc script on the path amy not correspond to the same > version of python that I use. Shouldn't there be a pydoc2.6, pydoc3.1, and other pydocX.X that corresponds to each python version? Otherwise you should be able to create an alias in your shell. 
From jh at improva.dk Wed Oct 27 22:22:09 2010 From: jh at improva.dk (Jacob Holm) Date: Wed, 27 Oct 2010 22:22:09 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> Message-ID: <4CC889F1.8010603@improva.dk> On 2010-10-27 00:14, Nick Coghlan wrote: > Jacob's "implications for PEP 380" exploration started to give me some > doubts, but I think there are actually some flaws in his argument. I'm not sure I made much of an argument. I showed an example that assumed the change I was suggesting and explained what the problem would be without the change. Let me try another example: def filesum(fn): s = 0 with fd in open(fn): for line in fd: s += int(line) yield # be cooperative.. return s def multifilesum(): a = yield from filesum('fileA') b = yield from filesum('fileB') return a+b def main() g = multifilesum() for i in range(10): try: next(g) except StopIteration as e: r = e.value break else: r = g.finish() This tries to read at most 10 lines from 'fileA' + 'fileB' and returning their sums when interpreting each line as an integer. It works fine if there are at most 10 lines but is broken if 'fileA' has more than 10 lines. What's more, assuming latest PEP 380 + your "finish" and no other changes I don't see a simple way of fixing it. With my modification of your "finish" proposal you can add a few try...except blocks to the code and it will "just work (tm)"... > Accordingly, I would like to make one more attempt at explaining why I > think throwing in a separate exception for this use case is valuable > (and *doesn't* require any changes to PEP 380). > I am convinced that it does, at least if you want it to be useable with yield-from. But the same goes for any version that uses GeneratorExit. > As I see it, there's a bit of a disconnect between many PEP 380 use > cases and any mechanism or idiom which translates a thrown in > exception into an ordinary StopIteration. If you expect your thrown in > exception to always terminate the generator in some fashion, adopting > the latter idiom in your generator will make it potentially unsafe to > use in a "yield from" expression that isn't the very last yield > operation in any outer generator. > Right. This is the problem I'm trying to address by modifying the PEP expansion. > Consider the following: > > def example(arg): > try: > yield arg > except GeneratorExit > return "Closed" > return "Finished" > > def outer_ok1(arg): # close() after next() returns "Closed" > return yield from example(arg) > > def outer_ok2(arg): # close() after next() returns None > yield from example(arg) > > def outer_broken(arg): # close() after next() gives RuntimeError > val = yield from example(arg) > yield val > > # All 3 cases: close() before next() returns None > # All 3 cases: close() after 2x next() returns None > Actually, AFAICT outer_broken will *not* give a RuntimeError on close() after next(). This is due to the special-casing of GeneratorExit in PEP 380. That special-casing is also the basis for both my suggested modifications. In fact, in all 3 cases close() after next() would give None because the "inner" return value is discarded and the GeneratorExit reraised. Only when called directly would the inner "example" function return "Closed" on close() after next(). > Using close() to say "give me your return value" creates the risk of > hitting those runtime errors in a generator's __del__ method, Not really. 
Returning a value from close with no other changes does not change the risk of that happening. Of course I *do* think other changes are necessary, but then we'll need to look at those before concluding they are a problem... > and > exceptions in __del__ are always a bit ugly. > That they are. > Keeping the "give me your return value" and "clean up your resources" > concerns separate by adding a new method and thrown exception means > that close() is less likely to unpredictably raise RuntimeError (and > when it does, will reliably indicate a genuine bug in a generator > somewhere that is suppressing GeneratorExit). > > As far as PEP 380's semantics go, I think it should ignore the > existence of anything like GeneratorReturn completely. Either one of > the generators in the chain will catch the exception and turn it into > StopIteration, or they won't. If they convert it to StopIteration, and > they aren't the last generator in the chain, then maybe what actually > needs to happen at the outermost level is something like this: > > class GeneratorReturn(Exception): pass > > def finish(gen): > try: > gen.throw(GeneratorReturn) # Ask generator to wrap things up > except StopIteration as err: > if err.args: > return err.args[0] > except GeneratorReturn: > pass > else: > # Asking nicely didn't work, so force resource cleanup > # and treat the result as if the generator had already > # been exhausted or hadn't started yet > gen.close() > return None > This, I don't like. If we have a distinct method for "finishing" a generator and getting a return value, I want it to tell me if the return value was arrived at in some other way. Preferably with an exception, as in: def finish(self): if self.gi_frame is None: raise RuntimeError('finish() on exhausted/closed generator') try: self.throw(GeneratorReturn) except StopIteration as err: if err.args: return err.args[0] except GeneratorReturn: pass else: raise RuntimeError('generator ignored GeneratorReturn') return None The point of "finish" as I see it is not the "closing" part, but the "give me a result" part. Anyway, I am (probably) not going to argue much further for this. The only new thing that is on the table here is the "finish" function, and using a new exception. The use of a new exception solves some of the issues that you and Greg had earlier, but leaves the problem of using a value-returning close/finish with yield-from. (And Guido doesn't like it). Since noone seems interested in even considering a change to the PEP 380 expansion to fix this, i don't really see a any more I can contribute at this point. - Jacob From ncoghlan at gmail.com Thu Oct 28 00:00:36 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 28 Oct 2010 08:00:36 +1000 Subject: [Python-ideas] PEP 380 close and contextmanagers? In-Reply-To: <4CC850F2.7010202@ronadam.com> References: <4CC63065.9040507@improva.dk> <4CC83EAC.7010001@ronadam.com> <4CC850F2.7010202@ronadam.com> Message-ID: On Thu, Oct 28, 2010 at 2:18 AM, Ron Adam wrote: > > It looks like No context managers return values in the finally or __exit__ > part of a context manager. ?Is there way to do that? The return value from __exit__ is used to decide whether or not to suppress the exception (i.e. bool(__exit__()) == True will suppress the exception that was passed in). 
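To spell that out with a toy example (just a sketch of the protocol, not anything from the stdlib):

class suppress_keyerror:
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        # A truthy return value tells the interpreter to swallow the exception
        return exc_type is not None and issubclass(exc_type, KeyError)

with suppress_keyerror():
    {}["missing"]      # raises KeyError...
print("still here")    # ...which never propagates out of the with block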
There are a few CMs in the test suite (test.support) that provide info about things that happened during their with statement - they all use the trick of returning a stateful object from __enter__, then modifying the attributes of that object in __exit__. I seem to recall the CM variants of unittest.TestCase.assertRaises* doing the same thing (so you can poke and prod at the raised exception yourself). warnings.catch_warnings also appends encountered warnings to a list returned by __enter__ when record=True. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Thu Oct 28 00:52:59 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 28 Oct 2010 08:52:59 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CC889F1.8010603@improva.dk> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> Message-ID: On Thu, Oct 28, 2010 at 6:22 AM, Jacob Holm wrote: > Actually, AFAICT outer_broken will *not* give a RuntimeError on close() > after next(). ?This is due to the special-casing of GeneratorExit in PEP > 380. ?That special-casing is also the basis for both my suggested > modifications. Ah, you're quite right - I'd completely forgotten about the GeneratorExit special-casing in the PEP 380 semantics, so I was arguing from a faulty premise. With that error corrected, I can happily withdraw my objection to idioms that convert GeneratorExit to StopIteration (since any yield from expressions will reraise the GeneratorExit in that case). The "did-it-really-finish?" question can likely be answered by slightly improving generator state introspection from the Python level (as I believe Guido suggested earlier in the thread). That way close() can keep the gist of its current semantics (return something if the generator ends up in an inactive state, raise RuntimeError if it yields another value), while frameworks can object to other unexpected states if they want to. As it turns out, the information on generator state is already there, just not in a particularly user friendly format ("not started" = "g.gi_frame is not None and g.gi_frame.f_lasti == -1", "terminated" = "g.gi_frame is None"). So, without any modifications at all to the current incarnation of PEP 380, it is already possible to write: def finish(gen): frame = gen.gi_frame if frame is None: raise RuntimeError('finish() on exhausted/closed generator') if frame.f_lasti == -1: raise RuntimeError('finish() on not yet started generator') try: gen.throw(GeneratorExit) except StopIteration as err: if err.args: return err.args[0] return None except GeneratorExit: pass else: raise RuntimeError('Generator ignored GeneratorExit') raise RuntimeError('Generator failed to return a value') I think I'm finally starting to understand *your* question/concern though. Given the current PEP 380 expansion, the above definition of finish() and the following two generators: def g_inner(): yield return "Hello world!" def g_outer(): yield (yield from g_inner()) You would get the following result (as g_inner converts GeneratorExit to StopIteration, then yield from propogates that up the stack): >>> g = g_outer() >>> next(g) >>> finish(g) "Hello world!" Oops? 
I'm wondering if this part of the PEP 380 expansion: if _e is _x[1] or isinstance(_x[1], GeneratorExit): raise Should actually look like: if _e is _x[1]: raise if isinstance(_x[1], GeneratorExit): raise GeneratorExit(*_e.args) Once that distinction is made, you can more easily write helper functions and context managers that allow code to do the "right thing" according to the needs of a particular framework or application. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Thu Oct 28 00:54:30 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 28 Oct 2010 08:54:30 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> Message-ID: On Thu, Oct 28, 2010 at 8:52 AM, Nick Coghlan wrote: > On Thu, Oct 28, 2010 at 6:22 AM, Jacob Holm wrote: >> Actually, AFAICT outer_broken will *not* give a RuntimeError on close() >> after next(). ?This is due to the special-casing of GeneratorExit in PEP >> 380. ?That special-casing is also the basis for both my suggested >> modifications. > > Ah, you're quite right - I'd completely forgotten about the > GeneratorExit special-casing in the PEP 380 semantics, so I was > arguing from a faulty premise. With that error corrected, I can > happily withdraw my objection to idioms that convert GeneratorExit to > StopIteration (since any yield from expressions will reraise the > GeneratorExit in that case). Correction: they'll reraise StopIteration with the current PEP semantics, GeneratorExit with the proposed modification at the end of my last message. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From guido at python.org Thu Oct 28 01:46:30 2010 From: guido at python.org (Guido van Rossum) Date: Wed, 27 Oct 2010 16:46:30 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> Message-ID: Nick & Jacob, Unfortunately other things are in need of my attention and I am quickly lagging behind on this thread. I'll try to respond to some issues without specific quoting. If GeneratorReturn and finish() can be implemented in pure user code, then I think it should be up to every (framework) developer to provide their own API, using whatever constraints they chose. Without specific use cases it's hard to reason about API design. Still, I think it is reasonable to offer some basic behavior on the generator object, and I still think that the best compromise here is to let g.close() extract the return value from StopIteration if it catches it. If a framework decides not to use this, fine. For a user working without a framework this is still just a little nicer than having to figure out the required logic yourself. I am aware of four relevant states for generators. Here's how they work (in current Python): - initial state: execution is poised at the top of the function. g.throw() always bounces back the exception. g.close() moves it to the final state. g.next() starts it running. g.send() requires a None argument and is then the same as g.next(). - running state: the frame is active. none of g.next(), g.send(), g.throw() or g.close() work -- they all raise ValueError. - suspended state: execution is suspended at a yield. g.close() raises GeneratorExit and if the generator catches this it can do whatever it pleases. 
If it then raises StopIteration or GeneratorExit, g.close() is happy, if it raises another exception g.close() just passes that through, if it yields a value g.close() complains and raises RuntimeError(). - finished (exhausted) state: the generator has returned. g.close() always return None. g.throw() always bounces back the exception. g.next() and g.send() always raise StopIteration. I would be in favor of adding an introspection API to distinguish these four states and I think it would be a fine thing to add to Python 3.2 if anyone finds the time to produce a patch (Nick? You showed what these boil down to.) I note that in the initial state a generator has no choice in how to respond because it hasnt't yet had the opportunity to set up a try/except, so in this state it acts pretty much the same as in the exhausted state when receiving a throw() or close(). Regarding built-in syntax for Go-like channels, let's first see an implementation in userland become successful *or* see that it's impossible to write an efficient one before adding more to the language. Note that having a different expansion of a for-loop based on the run-time value or type of the iterable cannot be done -- the expansion can only vary based on the syntactic form. There are a few different conventions for using generators and yield-from; e.g. generators used as proper iterators with easy refactoring; generators used as tasks where yield X is used for blocking I/O operations; and generators used as "inverse generators" as in the parallel_reduce() example that initiated this thread. I don't particularly care about what kind of errors you get if a generator written for one convention is accidentally used by another convention, as long as it is made clear which convention is being used in each case. Frameworks/libraries can and probably should develop decorators to mark up the 2nd and 3rd conventions, but I don't think the *language* needs to go out of its way to enforce proper usage. -- --Guido van Rossum (python.org/~guido) From rrr at ronadam.com Thu Oct 28 02:00:52 2010 From: rrr at ronadam.com (Ron Adam) Date: Wed, 27 Oct 2010 19:00:52 -0500 Subject: [Python-ideas] PEP 380 close and contextmanagers? In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC83EAC.7010001@ronadam.com> <4CC850F2.7010202@ronadam.com> Message-ID: <4CC8BD34.3090700@ronadam.com> On 10/27/2010 01:38 PM, Guido van Rossum wrote: > On Wed, Oct 27, 2010 at 9:18 AM, Ron Adam wrote: >> >> >> On 10/27/2010 10:01 AM, Ron Adam wrote: >> It looks like No context managers return values in the finally or __exit__ >> part of a context manager. Is there way to do that? > > How would that value be communicated to the code containing the with-clause? I think that was what I was trying to figure out also. >> def reduce_i(f): >> i = yield >> while True: >> i = f(i, (yield i)) > > Unfortunately from here on till the end of your example my brain exploded. Mine did too, but I think it was a useful but strange experience. ;-) It forced me to take a break and think about the problem from a different viewpoint. Heres the conclusion I came to, but be forewarned, it's kind of anti-climatic. :-) The use of an exception to signal some bit of code, is a way to reach over a wall that also protects that bit of code. This seems to be a more common need when using coroutines, because it's more common to have some bits of code indirectly, direct some other bit of code. Generators already have a nice .throw() method that will return the value at the next yield. 
But we either have to choose an existing exception to throw, that has some other purpose, or make up a new one. When it comes to making up new ones, lots of other programmers may each call it something else. That isn't a big problem, but it may be nice if we had a standard exception for saying.. "Hey you!, send me a total or subtotal!". And that's all that it does. For now lets call it a ValueRequest exception. ValueRequest makes sense if you are throwing an exception, I think ValueReturn may make more sense if you are raising an exception. Or maybe there is something that reads well both ways? These both fit very nice with ValueError and it may make reading code easier if we make a distinction between a request and a return. Below is the previous example rewritten to do this. A ValueRequest doesn't stop anything or force anything to close, so it wont ever interfere, confuse, or complicate, code that uses other exceptions. You can always throw or catch one of these and raise something else if you need to. Since throwing it into a generator doesn't stop the generator, the generator can put the try-except into a larger loop and loop back to get more values and catch another ValueRequest at some later point. I feel that is a useful and handy thing to do. So here's the example again. The first version of this took advantage of yield's ability to send and get data at the same time to always send back an update (subtotal) to the parent routine. That's nearly free since a yield always sends something back anyway. (None if you don't give it something else.) But it's not always easy to do, or easy to understand if you do it. IE.. brain exploding stuff. In this version, data only flows into the coroutine until a ValueRequest exception is thrown at it, at which point it then yields back a total. *I can see where some routines may reverse the control, by throwing ValueReturns from the inside out, rather than ValueRequests from the outside in. Is it useful to distinquish between the two or should there be just one? *Yes this can be made to work with gclose() and return, but I feel that is more restrictive, and more complex, than it needs to be. *I still didn't figure out how to use the context managers to get rid of the try except. Oh well. ;-) from contextlib import contextmanager class ValueRequest(Exception): pass @contextmanager def consumer(cofunc, result=True): next(cofunc) try: yield cofunc finally: cofunc.close() @contextmanager def multiconsumer(cofuncs, result=True): for c in cofuncs: next(c) try: yield cofuncs finally: for c in cofuncs: c.close() # Min/max coroutine example slpit into # nested coroutines for testing these ideas # in a more complex situation that may arise # when working with cofunctions and generators. def reduce_item(f): try: x = yield while True: x = f(x, (yield)) except ValueRequest: yield x def reduce_group(funcs): with multiconsumer([reduce_item(f) for f in funcs]) as mc: try: while True: x = yield for c in mc: c.send(x) except ValueRequest: yield [c.throw(ValueRequest) for c in mc] def get_reductions(funcs, iterable): with consumer(reduce_group(funcs)) as c: for x in iterable: c.send(x) return c.throw(ValueRequest) def main(): funcs = [min, max] print(get_reductions(funcs, range(100))) s = "Python is fun for play, and great for work too." 
print(get_reductions(funcs, s)) if __name__ == '__main__': main() From rrr at ronadam.com Thu Oct 28 03:26:23 2010 From: rrr at ronadam.com (Ron Adam) Date: Wed, 27 Oct 2010 20:26:23 -0500 Subject: [Python-ideas] PEP 380 close and contextmanagers? In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC83EAC.7010001@ronadam.com> <4CC850F2.7010202@ronadam.com> Message-ID: <4CC8D13F.4070203@ronadam.com> On 10/27/2010 05:00 PM, Nick Coghlan wrote: > On Thu, Oct 28, 2010 at 2:18 AM, Ron Adam wrote: >> >> It looks like No context managers return values in the finally or __exit__ >> part of a context manager. Is there way to do that? > > The return value from __exit__ is used to decide whether or not to > suppress the exception (i.e. bool(__exit__()) == True will suppress > the exception that was passed in). > > There are a few CMs in the test suite (test.support) that provide info > about things that happened during their with statement - they all use > the trick of returning a stateful object from __enter__, then > modifying the attributes of that object in __exit__. I seem to recall > the CM variants of unittest.TestCase.assertRaises* doing the same > thing (so you can poke and prod at the raised exception yourself). > warnings.catch_warnings also appends encountered warnings to a list > returned by __enter__ when record=True. > > Cheers, > Nick. Thanks, I'll take a look. If for nothing else it will help me understand it better. BTW, The use case of the (min/max) examples doesn't fit that particular need. It turned out that just creating a custom exception and throwing it into the coroutine is probably the best and simplest way to do it. That's not to say that some of the other things Guido is thinking of won't benefit close() returning a value, but that particular example doesn't. Cheers, Ron From kristjan at ccpgames.com Thu Oct 28 04:05:22 2010 From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=) Date: Thu, 28 Oct 2010 10:05:22 +0800 Subject: [Python-ideas] ExternalMemory In-Reply-To: <20101027124826.7b925a8c@pitrou.net> References: <2E034B571A5CE44E949B9FCC3B6D24EE5761FB5E@exchcn.ccp.ad.local> <20101027124826.7b925a8c@pitrou.net> Message-ID: <2E034B571A5CE44E949B9FCC3B6D24EE5761FC52@exchcn.ccp.ad.local> Looking better at this, I don't think it is such a The MemoryView is designed to be a wrapper around the Py_buffer interface. Slicing it, for example, creates a new memoryview based on the same underlying object. Having the MemoryView get its data from two different places would be very hacky, I think. What is needed, I think, is a basic ExternalMemory C api object with a buffer interface that does what I describe. This exists in 2.7 (the BufferObject) but with the shortcomings I mentioned. But as far as I know, there is no similar object in py3k. K -----Original Message----- From: python-ideas-bounces+kristjan=ccpgames.com at python.org [mailto:python-ideas-bounces+kristjan=ccpgames.com at python.org] On Behalf Of Antoine Pitrou Sent: Wednesday, October 27, 2010 18:48 To: python-ideas at python.org Subject: Re: [Python-ideas] ExternalMemory > So, for py3k, I'd actually like to extend the Memoryview object, and > provide something like PyMemoryView_FromExternal() that takes an > optional pointer to a "void destructor(void *arg, void *ptr)) and an > (void *arg), to be called when the buffer is released. Sounds reasonable to me. Regards Antoine. 
_______________________________________________ Python-ideas mailing list Python-ideas at python.org http://mail.python.org/mailman/listinfo/python-ideas From guido at python.org Thu Oct 28 04:53:14 2010 From: guido at python.org (Guido van Rossum) Date: Wed, 27 Oct 2010 19:53:14 -0700 Subject: [Python-ideas] PEP 380 close and contextmanagers? In-Reply-To: <4CC8BD34.3090700@ronadam.com> References: <4CC63065.9040507@improva.dk> <4CC83EAC.7010001@ronadam.com> <4CC850F2.7010202@ronadam.com> <4CC8BD34.3090700@ronadam.com> Message-ID: On Wed, Oct 27, 2010 at 5:00 PM, Ron Adam wrote: > > On 10/27/2010 01:38 PM, Guido van Rossum wrote: >> >> On Wed, Oct 27, 2010 at 9:18 AM, Ron Adam ?wrote: >>> >>> >>> On 10/27/2010 10:01 AM, Ron Adam wrote: >>> It looks like No context managers return values in the finally or >>> __exit__ >>> part of a context manager. ?Is there way to do that? >> >> How would that value be communicated to the code containing the >> with-clause? > > I think that was what I was trying to figure out also. > >>> def reduce_i(f): >>> ? ? i = yield >>> ? ? while True: >>> ? ? ? ? i = f(i, (yield i)) >> >> Unfortunately from here on till the end of your example my brain exploded. > > Mine did too, but I think it was a useful but strange experience. ;-) > > It forced me to take a break and think about the problem from a different > viewpoint. ?Heres the conclusion I came to, but be forewarned, it's kind of > anti-climatic. ?:-) > > > The use of an exception to signal some bit of code, is a way to reach over a > wall that also protects that bit of code. ?This seems to be a more common > need when using coroutines, because it's more common to have some bits of > code indirectly, direct some other bit of code. > > Generators already have a nice .throw() method that will return the value at > the next yield. ?But we either have to choose an existing exception to > throw, that has some other purpose, or make up a new one. ?When it comes to > making up new ones, lots of other programmers may each call it something > else. > > That isn't a big problem, but it may be nice if we had a standard exception > for saying.. "Hey you!, send me a total or subtotal!". ?And that's all that > it does. ?For now lets call it a ValueRequest exception. > > ValueRequest makes sense if you are throwing an exception, ?I think > ValueReturn may make more sense if you are raising an exception. ?Or maybe > there is something that reads well both ways? ?These both fit very nice with > ValueError and it may make reading code easier if we make a distinction > between a request and a return. > > > Below is the previous example rewritten to do this. A ValueRequest doesn't > stop anything or force anything to close, so it wont ever interfere, > confuse, or complicate, code that uses other exceptions. ?You can always > throw or catch one of these and raise something else if you need to. > > Since throwing it into a generator doesn't stop the generator, the generator > can put the try-except into a larger loop and loop back to get more values > and catch another ValueRequest at some later point. ?I feel that is a useful > and handy thing to do. > > > So here's the example again. > > The first version of this took advantage of yield's ability to send and get > data at the same time to always send back an update (subtotal) to the parent > routine. ?That's nearly free since a yield always sends something back > anyway. (None if you don't give it something else.) 
?But it's not always > easy to do, or easy to understand if you do it. ?IE.. brain exploding stuff. > > In this version, data only flows into the coroutine until a ValueRequest > exception is thrown at it, at which point it then yields back a total. > > > *I can see where some routines may reverse the control, by throwing > ValueReturns from the inside out, rather than ValueRequests from the outside > in. ? Is it useful to distinquish between the two or should there be just > one? > > *Yes this can be made to work with gclose() and return, but I feel that is > more restrictive, and more complex, than it needs to be. > > *I still didn't figure out how to use the context managers to get rid of the > try except. Oh well. ?;-) > > > > from contextlib import contextmanager > > class ValueRequest(Exception): > ? ?pass > > @contextmanager > def consumer(cofunc, result=True): > ? ?next(cofunc) > ? ?try: > ? ? ? ?yield cofunc > ? ?finally: > ? ? ? ?cofunc.close() > > @contextmanager > def multiconsumer(cofuncs, result=True): > ? ?for c in cofuncs: > ? ? ? ?next(c) > ? ?try: > ? ? ? ?yield cofuncs > ? ?finally: > ? ? ? ?for c in cofuncs: > ? ? ? ? ? ?c.close() > > # Min/max coroutine example slpit into > # nested coroutines for testing these ideas > # in a more complex situation that may arise > # when working with cofunctions and generators. > > def reduce_item(f): > ? ?try: > ? ? ? ?x = yield > ? ? ? ?while True: > ? ? ? ? ? ?x = f(x, (yield)) > ? ?except ValueRequest: > ? ? ? ?yield x > > def reduce_group(funcs): > ? ?with multiconsumer([reduce_item(f) for f in funcs]) as mc: > ? ? ? ?try: > ? ? ? ? ? ?while True: > ? ? ? ? ? ? ? ?x = yield > ? ? ? ? ? ? ? ?for c in mc: > ? ? ? ? ? ? ? ? ? ?c.send(x) > ? ? ? ?except ValueRequest: > ? ? ? ? ? ?yield [c.throw(ValueRequest) for c in mc] > > def get_reductions(funcs, iterable): > ? ?with consumer(reduce_group(funcs)) as c: > ? ? ? ?for x in iterable: > ? ? ? ? ? ?c.send(x) > ? ? ? ?return c.throw(ValueRequest) > > def main(): > ? ?funcs = [min, max] > ? ?print(get_reductions(funcs, range(100))) > ? ?s = "Python is fun for play, and great for work too." > ? ?print(get_reductions(funcs, s)) > > if __name__ == '__main__': > ? ?main() Hm... Certainly interesting. My own (equally anti-climactic :-) conclusions would be: - Tastes differ - There is a point where yield gets overused - I am not convinced that using reduce as a paradigm here is right -- --Guido van Rossum (python.org/~guido) From rrr at ronadam.com Thu Oct 28 05:57:08 2010 From: rrr at ronadam.com (Ron Adam) Date: Wed, 27 Oct 2010 22:57:08 -0500 Subject: [Python-ideas] PEP 380 close and contextmanagers? In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC83EAC.7010001@ronadam.com> <4CC850F2.7010202@ronadam.com> <4CC8BD34.3090700@ronadam.com> Message-ID: <4CC8F494.5030601@ronadam.com> On 10/27/2010 09:53 PM, Guido van Rossum wrote: > Hm... Certainly interesting. My own (equally anti-climactic :-) > conclusions would be: > > - Tastes differ > > - There is a point where yield gets overused > > - I am not convinced that using reduce as a paradigm here is right I Agree. :-) This was a contrived example for the purpose of testing an idea. The concept being tested had nothing to do with reduce. It had to do with the interface and control mechanisms. 
Cheers, Ron From offline at offby1.net Thu Oct 28 08:32:43 2010 From: offline at offby1.net (Chris Rose) Date: Thu, 28 Oct 2010 00:32:43 -0600 Subject: [Python-ideas] Ordered storage of keyword arguments Message-ID: I'd like to resurrect a discussion that went on a little over a year ago [1] started by Michael Foord suggesting that it'd be nice if keyword arguments' storage was implemented as an ordered dict as opposed to the current unordered form. I'm interested in picking this up for implementation, which presumably will require moving the implementation of the existing ordereddict class into the C library. Are there any issues that this might cause in implementation on the py3k development line? [1] http://mail.python.org/pipermail/python-ideas/2009-April/004163.html -- Chris R. Not to be taken literally, internally, or seriously. From mal at egenix.com Thu Oct 28 10:13:09 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 28 Oct 2010 10:13:09 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: References: Message-ID: <4CC93095.3080704@egenix.com> Chris Rose wrote: > I'd like to resurrect a discussion that went on a little over a year > ago [1] started by Michael Foord suggesting that it'd be nice if > keyword arguments' storage was implemented as an ordered dict as > opposed to the current unordered form. > > I'm interested in picking this up for implementation, which presumably > will require moving the implementation of the existing ordereddict > class into the C library. > > Are there any issues that this might cause in implementation on the > py3k development line? > > [1] http://mail.python.org/pipermail/python-ideas/2009-April/004163.html Ordered dicts are a lot slower than normal dictionaries. I don't think that we can make such a change unless we want to make Python a lot slower at the same time. If you only want to learn about the definition order of the keywords you can use the inspect module. >>> def f(a,b,c=1,d=2): pass ... >>> inspect.getargspec(f) (['a', 'b', 'c', 'd'], None, None, (1, 2)) I don't see much use in having the order of providing the keyword arguments in a function call always available. Perhaps there's a way to have this optionally, i.e. by allowing odicts to be passed in as keyword argument dict ?! Where I do see a real use is making the order of class attribute and method definition accessible in Python (without having to use meta-class hacks like e.g. Django's ORM does). It would then be must easier to use classes to represent external resources that rely on order, e.g. database table schemas or XML schemas. Classes are created using a keyword-like dictionary as well, so the situation is similar. The major difference is that classes aren't created as often as functions are called. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 28 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From jh at improva.dk Thu Oct 28 10:24:28 2010 From: jh at improva.dk (Jacob Holm) Date: Thu, 28 Oct 2010 10:24:28 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> Message-ID: <4CC9333C.1030709@improva.dk> On 2010-10-28 01:46, Guido van Rossum wrote: > Nick & Jacob, > > Unfortunately other things are in need of my attention and I am > quickly lagging behind on this thread. > Too bad, but understandable. I'll try to be brief(er). > I'll try to respond to some issues without specific quoting. > > If GeneratorReturn and finish() can be implemented in pure user code, > then I think it should be up to every (framework) developer to provide > their own API, using whatever constraints they chose. Without specific > use cases it's hard to reason about API design. GeneratorReturn and finish *can* be implemented in pure user code, as long as you accept that the premature return has to use some other mechanism than "return" or StopIteration. > Still, I think it is > reasonable to offer some basic behavior on the generator object, and I > still think that the best compromise here is to let g.close() extract > the return value from StopIteration if it catches it. If a framework > decides not to use this, fine. For a user working without a framework > this is still just a little nicer than having to figure out the > required logic yourself. > This works only as long as you don't actually use yield-from, making it a bit of a strange match to that PEP. To get it to work *with* yield-from you need the reraised GeneratorExit to include the return value (possibly None) from the inner generator. I seem to have convinced Nick that the problem is real and that a modification to the expansion might be needed/desirable. > I am aware of four relevant states for generators. Here's how they > work (in current Python): > > - initial state: execution is poised at the top of the function. > g.throw() always bounces back the exception. g.close() moves it to the > final state. g.next() starts it running. g.send() requires a None > argument and is then the same as g.next(). > > - running state: the frame is active. none of g.next(), g.send(), > g.throw() or g.close() work -- they all raise ValueError. > > - suspended state: execution is suspended at a yield. g.close() raises > GeneratorExit and if the generator catches this it can do whatever it > pleases. If it then raises StopIteration or GeneratorExit, g.close() > is happy, if it raises another exception g.close() just passes that > through, if it yields a value g.close() complains and raises > RuntimeError(). > > - finished (exhausted) state: the generator has returned. g.close() > always return None. g.throw() always bounces back the exception. > g.next() and g.send() always raise StopIteration. > > I would be in favor of adding an introspection API to distinguish > these four states and I think it would be a fine thing to add to > Python 3.2 if anyone finds the time to produce a patch (Nick? You > showed what these boil down to.) > > I note that in the initial state a generator has no choice in how to > respond because it hasnt't yet had the opportunity to set up a > try/except, so in this state it acts pretty much the same as in the > exhausted state when receiving a throw() or close(). > Yes, I forgot about this case in the versions of "finish" that I wrote. 
Nick showed a better version that handled it properly. > Regarding built-in syntax for Go-like channels, let's first see an > implementation in userland become successful *or* see that it's > impossible to write an efficient one before adding more to the > language. > It is impossible in current python to use a for-loop or generator expression to loop over a Go-like channel without using threads for everything. (The only way to suspend the iteration is to suspend the thread, and then whatever code is supposed to write to the channel must be running in another thread) This is a shame, since the blocking nature of channels otherwise make them ideal for cooperative multitasking. Note, this restriction (no for-loop iteration without threads) does not make channels useless in current python, just much less convenient to work with. That, unfortunately, makes it less likely that a userland implementation will ever become successful. > Note that having a different expansion of a for-loop based on the > run-time value or type of the iterable cannot be done -- the expansion > can only vary based on the syntactic form. > The intent was to have a different expansion depending on the type of function containing the for-loop (as in regular/cofunction). I think I made a few errors though, so the new expansion doesn't actually work with regular iterables. If I get around to fixing it I'll post the fix in that thread. > There are a few different conventions for using generators and > yield-from; e.g. generators used as proper iterators with easy > refactoring; generators used as tasks where yield X is used for > blocking I/O operations; and generators used as "inverse generators" > as in the parallel_reduce() example that initiated this thread. I > don't particularly care about what kind of errors you get if a > generator written for one convention is accidentally used by another > convention, as long as it is made clear which convention is being used > in each case. Frameworks/libraries can and probably should develop > decorators to mark up the 2nd and 3rd conventions, but I don't think > the *language* needs to go out of its way to enforce proper usage. > Agreed, I think. - Jacob From pyideas at rebertia.com Thu Oct 28 10:38:09 2010 From: pyideas at rebertia.com (Chris Rebert) Date: Thu, 28 Oct 2010 01:38:09 -0700 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <4CC93095.3080704@egenix.com> References: <4CC93095.3080704@egenix.com> Message-ID: On Thu, Oct 28, 2010 at 1:13 AM, M.-A. Lemburg wrote: > I don't see much use in having the order of providing the > keyword arguments in a function call always available. > Perhaps there's a way to have this optionally, i.e. by > allowing odicts to be passed in as keyword argument dict ?! > > Where I do see a real use is making the order of class > attribute and method definition accessible in Python > (without having to use meta-class hacks like e.g. Django's > ORM does). So, you want to make class bodies use (C-implemented) OrderedDicts by default, thus rendering metaclass __prepare__() definitions for that purpose ("meta-class hacks") superfluous? 
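For concreteness, the boilerplate that would become unnecessary is roughly this (a sketch; the class and attribute names are invented):

from collections import OrderedDict

class OrderedMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        # namespace the class body executes in
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwds):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # capture the definition order before it is thrown away
        cls._field_order = [k for k in namespace if not k.startswith('_')]
        return cls

class Table(metaclass=OrderedMeta):
    name = str
    age = int
    email = str

print(Table._field_order)   # ['name', 'age', 'email']

If the class namespace were ordered by default, the __prepare__() part (and the metaclass, for this particular purpose) could simply go away.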
Cheers, Chris From jh at improva.dk Thu Oct 28 10:52:53 2010 From: jh at improva.dk (Jacob Holm) Date: Thu, 28 Oct 2010 10:52:53 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> Message-ID: <4CC939E5.5070700@improva.dk> On 2010-10-28 00:52, Nick Coghlan wrote: > On Thu, Oct 28, 2010 at 6:22 AM, Jacob Holm wrote: >> Actually, AFAICT outer_broken will *not* give a RuntimeError on close() >> after next(). This is due to the special-casing of GeneratorExit in PEP >> 380. That special-casing is also the basis for both my suggested >> modifications. > > Ah, you're quite right - I'd completely forgotten about the > GeneratorExit special-casing in the PEP 380 semantics, so I was > arguing from a faulty premise. With that error corrected, I can > happily withdraw my objection to idioms that convert GeneratorExit to > StopIteration (since any yield from expressions will reraise the > GeneratorExit in that case). > Looks like we are still not on exactly the same page though... You seem to be arguing from the version at http://www.python.org/dev/peps/pep-0380, whereas I am looking at http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/attachment-0001.txt, which is newer. > The "did-it-really-finish?" question can likely be answered by > slightly improving generator state introspection from the Python level > (as I believe Guido suggested earlier in the thread). That way close() > can keep the gist of its current semantics (return something if the > generator ends up in an inactive state, raise RuntimeError if it > yields another value), while frameworks can object to other unexpected > states if they want to. > > As it turns out, the information on generator state is already there, > just not in a particularly user friendly format ("not started" = > "g.gi_frame is not None and g.gi_frame.f_lasti == -1", "terminated" = > "g.gi_frame is None"). > > So, without any modifications at all to the current incarnation of PEP > 380, it is already possible to write: > > def finish(gen): > frame = gen.gi_frame > if frame is None: > raise RuntimeError('finish() on exhausted/closed generator') > if frame.f_lasti == -1: > raise RuntimeError('finish() on not yet started generator') > try: > gen.throw(GeneratorExit) > except StopIteration as err: > if err.args: > return err.args[0] > return None > except GeneratorExit: > pass > else: > raise RuntimeError('Generator ignored GeneratorExit') > raise RuntimeError('Generator failed to return a value') > Yes. I forgot about the "not yet started" case in my earlier versions. > I think I'm finally starting to understand *your* question/concern > though. Given the current PEP 380 expansion, the above definition of > finish() and the following two generators: > > def g_inner(): > yield > return "Hello world!" > > def g_outer(): > yield (yield from g_inner()) > > You would get the following result (as g_inner converts GeneratorExit > to StopIteration, then yield from propogates that up the stack): >>>> g = g_outer() >>>> next(g) >>>> finish(g) > "Hello world!" > > Oops? > Well. Not with the newest expansion. Not that the None you will get from that one is any better. 
> I'm wondering if this part of the PEP 380 expansion: > if _e is _x[1] or isinstance(_x[1], GeneratorExit): > raise > > Should actually look like: > if _e is _x[1]: > raise > if isinstance(_x[1], GeneratorExit): > raise GeneratorExit(*_e.args) > In the newer expansion, I would change: except GeneratorExit as _e: try: _m = getattr(_i, 'close') except AttributeError: pass else: _m() raise _e Into: except GeneratorExit as _e: try: _m = getattr(_i, 'close') except AttributeError: pass else: raise GeneratorExit(_m()) raise _e (Which can cleaned up a bit btw., by removing _e and using direct attribute access instead of getattr) > Once that distinction is made, you can more easily write helper > functions and context managers that allow code to do the "right thing" > according to the needs of a particular framework or application. > Yes. OTOH, I have argued for this change before with no luck. - Jacob From denis.spir at gmail.com Thu Oct 28 11:19:36 2010 From: denis.spir at gmail.com (spir) Date: Thu, 28 Oct 2010 11:19:36 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <4CC93095.3080704@egenix.com> References: <4CC93095.3080704@egenix.com> Message-ID: <20101028111936.1629eade@o> On Thu, 28 Oct 2010 10:13:09 +0200 "M.-A. Lemburg" wrote: > Ordered dicts are a lot slower than normal dictionaries. I don't > think that we can make such a change unless we want to make > Python a lot slower at the same time. Ruby has ordered hashes since 1.9 with apparently no relevant performance loss -- actually there was gain of performance due to improvement in other aspects of the language. See eg http://www.igvita.com/2009/02/04/ruby-19-internals-ordered-hash/ I have no idea how python dicts are implemented, especially how entries are held in "buckets". The trick for Ruby is that buckets are actually linked lists, entries are list nodes with pointers allowing linear search inside the bucket. To preserve insertion order, all what is needed is to add parallel pointers to each node representing a parallel list, namely insertion order. Iteration just follows this second sequence of pointers. (I find this solution rather elegant.) Denis -- -- -- -- -- -- -- vit esse estrany ? spir.wikidot.com From mal at egenix.com Thu Oct 28 11:27:27 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 28 Oct 2010 11:27:27 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: References: <4CC93095.3080704@egenix.com> Message-ID: <4CC941FF.6070408@egenix.com> Chris Rebert wrote: > On Thu, Oct 28, 2010 at 1:13 AM, M.-A. Lemburg wrote: > >> I don't see much use in having the order of providing the >> keyword arguments in a function call always available. >> Perhaps there's a way to have this optionally, i.e. by >> allowing odicts to be passed in as keyword argument dict ?! >> >> Where I do see a real use is making the order of class >> attribute and method definition accessible in Python >> (without having to use meta-class hacks like e.g. Django's >> ORM does). > > So, you want to make class bodies use (C-implemented) OrderedDicts by > default, thus rendering metaclass __prepare__() definitions for that > purpose ("meta-class hacks") superfluous? Yes. The http://www.python.org/dev/peps/pep-3115/ has an interesting paragraph: Another good suggestion was to simply use an ordered dict for all classes, and skip the whole 'custom dict' mechanism. This was based on the observation that most use cases for a custom dict were for the purposes of preserving order information. 
However, this idea has several drawbacks, first because it means that an ordered dict implementation would have to be added to the set of built-in types in Python, and second because it would impose a slight speed (and complexity) penalty on all class declarations. Later, several people came up with ideas for use cases for custom dictionaries other than preserving field orderings, so this idea was dropped. Some comments: An ordered dict in C could be optimized to keep the same performance as the regular dict by only storing the insertion index together with the dict item and not maintaining these in a separate list. Access to the order would be slower, but it would make its use in timing critical parts of CPython a lot more attractive. An ordered dict would then require more memory, but not necessarily introduce a performance hit. The quoted paragraph starts with "observation that most use cases for a custom dict were for the purposes of preserving order information" and ends with "several people came up with ideas for use cases for custom dictionaries other than preserving field orderings". I'd call that a standard case of over-generalization - a rather popular syndrom in Python-land :-) - but that left aside: the first part is very true. Python has always tried to make the most common use case simple, so asking programmers to use a meta-class to be able to access the order of definitions in a class definition isn't exactly what the normal Python programmer would expect. Named tuples and similar sequence/mapping hybrids could probably also benefit from having the order of definition readily available, either directly via an odict cls.__dict__ or via a new attribute cls.__deforder__ which provides the order information in form of a tuple of cls.__dict__ keys. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 28 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From jh at improva.dk Thu Oct 28 12:12:55 2010 From: jh at improva.dk (Jacob Holm) Date: Thu, 28 Oct 2010 12:12:55 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CC858F3.4000602@improva.dk> References: <4CC63065.9040507@improva.dk> <4CC858F3.4000602@improva.dk> Message-ID: <4CC94CA7.2030001@improva.dk> On 2010-10-27 18:53, Jacob Holm wrote: > Hmm. This got me thinking. One thing I'd really like to see in python > is something like the "channel" object from the go language > (http://golang.org/). > > Based on PEP 380 or Gregs new cofunctions PEP (or perhaps even without > any of them) it is possible to write a trampoline-based implementation > of a channel object with "send" and "next" methods that work as > expected. One thing that is *not* possible (I think) is to make that > object iterable. Your wild idea above gave me a similar wild idea of my > own. An extension to the cofunctions PEP that would make that possible. > Seems like I screwed up the semantics of the standard for-loop in that version. Let me try again... 
1) Add new exception StopCoIteration, inheriting from StandardError. Change the regular StopIteration to inherit from the new exception instead of directly from StandardError. This ensures that code that catches StopCoIteration also catches StopIteration, which I think is what we want. The new exception is needed because "cocall func()" can never raise the regular StopIteration (or any subclass thereof). This might actually be an argument for using a different exception for returning a value from a coroutine... 2) Allow __next__ on an object to be a cofunction. Add a __cocall__ to the built-in next(ob) that tries to uses cocall to call ob.__next__. def next__cocall__(ob, *args): if len(args)>1: raise TypeError try: _next = type(ob).__next__ except AttributeError: raise TypeError try: return cocall _next(ob) except StopCoIteration: if args: return args[0] raise 2a) Optionally allow __iter__ on an object to be a cofunction. Add a __cocall__ to the builtin iter. class _func_iter(object): def __init__(self, callable, sentinel): self.callable = callable self.sentinel = sentinel def __next__(self): v = cocall self.callable() if v is sentinel: raise StopCoIteration return v def iter__cocall__(*args): try: ob, = args except ValueError: try: callable, sentinel = args except ValueError: raise TypeError return _func_iter(callable, sentinel) try: _iter = type(ob).__iter__ except AttributeError: raise TypeError return cocall _iter(ob) 3) Change the for-loop in a cofunction: for val in iterable: else: so it expands into: _it = cocall iter(iterable) while True: try: val = cocall next(iterable) except StopCoIteration: break else: which is exactly the normal expansion, but using cocall to call iter and next, and catching StopCoIteration instead of StopIteration. Since cocall falls back to using a regular call, this should work well with all normal iterables. 3a) Alternatively define a new syntax for "coiterating", e.g. cocall for val in iterable: else: All this to make it possible to write a code like this: def consumer(ch): cocall for val in ch: print(val) def producer(ch): cocall for val in range(10): cocall ch.send(val) def main() sched = scheduler() ch = channel() sched.add(consumer(ch)) sched.add(producer(ch)) sched.run() Thoughts? - Jacob From solipsis at pitrou.net Thu Oct 28 13:10:54 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 28 Oct 2010 13:10:54 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments References: <4CC93095.3080704@egenix.com> <20101028111936.1629eade@o> Message-ID: <20101028131054.22e8ba25@pitrou.net> On Thu, 28 Oct 2010 11:19:36 +0200 spir wrote: > On Thu, 28 Oct 2010 10:13:09 +0200 > "M.-A. Lemburg" wrote: > > > Ordered dicts are a lot slower than normal dictionaries. I don't > > think that we can make such a change unless we want to make > > Python a lot slower at the same time. > > Ruby has ordered hashes since 1.9 with apparently no relevant > performance loss Performance would probably not suffer on micro-benchmarks (with everything fitting in the CPU's L1 cache), but making dicts bigger (by 66%: 5 pointer-sized fields per hash entry instead of 3) could be detrimental in real life workloads. Regards Antoine. From mal at egenix.com Thu Oct 28 13:52:35 2010 From: mal at egenix.com (M.-A. 
Lemburg) Date: Thu, 28 Oct 2010 13:52:35 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <20101028131054.22e8ba25@pitrou.net> References: <4CC93095.3080704@egenix.com> <20101028111936.1629eade@o> <20101028131054.22e8ba25@pitrou.net> Message-ID: <4CC96403.6030106@egenix.com> Antoine Pitrou wrote: > On Thu, 28 Oct 2010 11:19:36 +0200 > spir wrote: >> On Thu, 28 Oct 2010 10:13:09 +0200 >> "M.-A. Lemburg" wrote: >> >>> Ordered dicts are a lot slower than normal dictionaries. I don't >>> think that we can make such a change unless we want to make >>> Python a lot slower at the same time. >> >> Ruby has ordered hashes since 1.9 with apparently no relevant >> performance loss > > Performance would probably not suffer on micro-benchmarks (with > everything fitting in the CPU's L1 cache), but making dicts bigger > (by 66%: 5 pointer-sized fields per hash entry instead of 3) could > be detrimental in real life workloads. For function calls, yes. For class creation, I doubt that a few extra bytes would make much difference in real life - classes typically don't have thousands of methods or attributes :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 28 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From solipsis at pitrou.net Thu Oct 28 14:10:07 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 28 Oct 2010 14:10:07 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <4CC96403.6030106@egenix.com> References: <4CC93095.3080704@egenix.com> <20101028111936.1629eade@o> <20101028131054.22e8ba25@pitrou.net> <4CC96403.6030106@egenix.com> Message-ID: <1288267807.3705.0.camel@localhost.localdomain> > > Performance would probably not suffer on micro-benchmarks (with > > everything fitting in the CPU's L1 cache), but making dicts bigger > > (by 66%: 5 pointer-sized fields per hash entry instead of 3) could > > be detrimental in real life workloads. > > For function calls, yes. For class creation, I doubt that a few > extra bytes would make much difference in real life - classes typically > don't have thousands of methods or attributes :-) Right. I was talking about the prospect of making dicts ordered by default. Regards Antoine. From ncoghlan at gmail.com Thu Oct 28 14:18:25 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 28 Oct 2010 22:18:25 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CC939E5.5070700@improva.dk> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> Message-ID: On Thu, Oct 28, 2010 at 6:52 PM, Jacob Holm wrote: > On 2010-10-28 00:52, Nick Coghlan wrote: >> On Thu, Oct 28, 2010 at 6:22 AM, Jacob Holm wrote: >>> Actually, AFAICT outer_broken will *not* give a RuntimeError on close() >>> after next(). ?This is due to the special-casing of GeneratorExit in PEP >>> 380. ?That special-casing is also the basis for both my suggested >>> modifications. 
>> >> Ah, you're quite right - I'd completely forgotten about the >> GeneratorExit special-casing in the PEP 380 semantics, so I was >> arguing from a faulty premise. With that error corrected, I can >> happily withdraw my objection to idioms that convert GeneratorExit to >> StopIteration (since any yield from expressions will reraise the >> GeneratorExit in that case). >> > > Looks like we are still not on exactly the same page though... ?You seem > to be arguing from the version at > http://www.python.org/dev/peps/pep-0380, whereas I am looking at > http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/attachment-0001.txt, > which is newer. Ah, the comment earlier in the thread about the PEP not being up to date with the last discussion makes more sense now... Still, the revised expansion also does the right thing in the case that was originally bothering me, and I agree with your suggested tweak to that version. I've cc'ed Greg directly on this email - if he wants, I can check in an updated version of the PEP to bring the python.org version up to speed with the later discussions. With that small change to the yield from expansion, as well as the change to close to return the first argument to StopIteration (if any) and None otherwise, I think PEP 380 will be in a much better position to support user experimentation in this area. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Thu Oct 28 14:44:31 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 28 Oct 2010 22:44:31 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> Message-ID: On Thu, Oct 28, 2010 at 9:46 AM, Guido van Rossum wrote: > Nick & Jacob, > > Unfortunately other things are in need of my attention and I am > quickly lagging behind on this thread. > > I'll try to respond to some issues without specific quoting. > > If GeneratorReturn and finish() can be implemented in pure user code, > then I think it should be up to every (framework) developer to provide > their own API, using whatever constraints they chose. Without specific > use cases it's hard to reason about API design. Still, I think it is > reasonable to offer some basic behavior on the generator object, and I > still think that the best compromise here is to let g.close() extract > the return value from StopIteration if it catches it. If a framework > decides not to use this, fine. For a user working without a framework > this is still just a little nicer than having to figure out the > required logic yourself. Yep, we've basically agreed on that as the way forward as well. We have a small tweak to suggest for PEP 380 to avoid losing the return value from inner close() calls, and I've cc'ed Greg directly on the relevant message in order to move that idea forward (and bring the python.org version of the PEP up to date with the last posted version as well). That should provide a solid foundation for experimentation in user code in 3.3 without overcomplicating PEP 380 with stuff that will probably end up being YAGNI. > I would be in favor of adding an introspection API to distinguish > these four states and I think it would be a fine thing to add to > Python 3.2 if anyone finds the time to produce a patch (Nick? You > showed what these boil down to.) I've created a tracker issue proposing a simple inspect.getgeneratorstate() function (issue 10220). Cheers, Nick. -- Nick Coghlan?? |?? 
ncoghlan at gmail.com?? |?? Brisbane, Australia From guido at python.org Thu Oct 28 16:58:08 2010 From: guido at python.org (Guido van Rossum) Date: Thu, 28 Oct 2010 07:58:08 -0700 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <1288267807.3705.0.camel@localhost.localdomain> References: <4CC93095.3080704@egenix.com> <20101028111936.1629eade@o> <20101028131054.22e8ba25@pitrou.net> <4CC96403.6030106@egenix.com> <1288267807.3705.0.camel@localhost.localdomain> Message-ID: Let's see if someone can come up with an ordereddict implemented in C first and then benchmark the hell out of it. Once its performance is acceptable we can talk about using it for keyword args, class dicts, or even make it the one and only dict object -- but the latter would be a really high bar to pass. -- --Guido van Rossum (python.org/~guido) From guido at python.org Thu Oct 28 17:04:55 2010 From: guido at python.org (Guido van Rossum) Date: Thu, 28 Oct 2010 08:04:55 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> Message-ID: On Thu, Oct 28, 2010 at 5:44 AM, Nick Coghlan wrote: > Yep, we've basically agreed on that as the way forward as well. We > have a small tweak to suggest for PEP 380 to avoid losing the return > value from inner close() calls, This is my "gclose()" function, right? Or is there more to it? > and I've cc'ed Greg directly on the > relevant message in order to move that idea forward (and bring the > python.org version of the PEP up to date with the last posted version > as well). Greg's been remarkably quiet on this thread even though I cc'ed him early on. Have you heard back from him yet? > That should provide a solid foundation for experimentation in user > code in 3.3 without overcomplicating PEP 380 with stuff that will > probably end up being YAGNI. > >> I would be in favor of adding an introspection API to distinguish >> these four states and I think it would be a fine thing to add to >> Python 3.2 if anyone finds the time to produce a patch (Nick? You >> showed what these boil down to.) > > I've created a tracker issue proposing a simple > inspect.getgeneratorstate() function (issue 10220). I added a little something to the issue. -- --Guido van Rossum (python.org/~guido) From denis.spir at gmail.com Thu Oct 28 19:58:59 2010 From: denis.spir at gmail.com (spir) Date: Thu, 28 Oct 2010 19:58:59 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: References: <4CC93095.3080704@egenix.com> <20101028111936.1629eade@o> <20101028131054.22e8ba25@pitrou.net> <4CC96403.6030106@egenix.com> <1288267807.3705.0.camel@localhost.localdomain> Message-ID: <20101028195859.738d22b8@o> On Thu, 28 Oct 2010 07:58:08 -0700 Guido van Rossum wrote: > Let's see if someone can come up with an ordereddict implemented in C > first and then benchmark the hell out of it. > > Once its performance is acceptable we can talk about using it for > keyword args, class dicts, or even make it the one and only dict > object -- but the latter would be a really high bar to pass. > What does the current implementation use as buckets? deins -- -- -- -- -- -- -- vit esse estrany ? 
spir.wikidot.com From solipsis at pitrou.net Thu Oct 28 20:06:05 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 28 Oct 2010 20:06:05 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments References: <4CC93095.3080704@egenix.com> <20101028111936.1629eade@o> <20101028131054.22e8ba25@pitrou.net> <4CC96403.6030106@egenix.com> <1288267807.3705.0.camel@localhost.localdomain> <20101028195859.738d22b8@o> Message-ID: <20101028200605.14e829e3@pitrou.net> On Thu, 28 Oct 2010 19:58:59 +0200 spir wrote: > On Thu, 28 Oct 2010 07:58:08 -0700 > Guido van Rossum wrote: > > > Let's see if someone can come up with an ordereddict implemented in C > > first and then benchmark the hell out of it. > > > > Once its performance is acceptable we can talk about using it for > > keyword args, class dicts, or even make it the one and only dict > > object -- but the latter would be a really high bar to pass. > > > > What does the current implementation use as buckets? It uses an open addressing strategy. Each dict entry holds three pointer-sized fields: key object, value object, and cached hash value of the key. (set entries have only two fields, since they don't hold a value object) You'll find details in Include/dictobject.h and Objects/dictobject.c. Regards Antoine. From raymond.hettinger at gmail.com Thu Oct 28 20:10:24 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 28 Oct 2010 11:10:24 -0700 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <4CC96403.6030106@egenix.com> References: <4CC93095.3080704@egenix.com> <20101028111936.1629eade@o> <20101028131054.22e8ba25@pitrou.net> <4CC96403.6030106@egenix.com> Message-ID: <74CCD6FE-E625-4CDB-B9EE-9DA6D30713EB@gmail.com> On Oct 28, 2010, at 4:52 AM, M.-A. Lemburg wrote: > Antoine Pitrou wrote: >> On Thu, 28 Oct 2010 11:19:36 +0200 >> spir wrote: >>> On Thu, 28 Oct 2010 10:13:09 +0200 >>> "M.-A. Lemburg" wrote: >>> >>>> Ordered dicts are a lot slower than normal dictionaries. I don't >>>> think that we can make such a change unless we want to make >>>> Python a lot slower at the same time. >>> >>> Ruby has ordered hashes since 1.9 with apparently no relevant >>> performance loss >> >> Performance would probably not suffer on micro-benchmarks (with >> everything fitting in the CPU's L1 cache), but making dicts bigger >> (by 66%: 5 pointer-sized fields per hash entry instead of 3) could >> be detrimental in real life workloads. > > For function calls, yes. For class creation, I doubt that a few > extra bytes would make much difference in real life - classes typically > don't have thousands of methods or attributes :-) Last year, I experimented with this design (changing the dict implementation to be ordered by adding two fields for links). The effects are: * The expected 66% increase in size was unavoidable for large dicts. * For smaller dicts the link fields used indices instead of pointers and those indices were smaller than the existing fields (i.e. 8 bits per entry for tables under 256 rows, 16 bits per entry for tables under 65k rows). * Iteration speed improved for smaller dicts because we don't have to examine empty slots (we also get to eliminate the "search finger" hack). For larger dicts, results were mixed (because of the loss of locality of access). 
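For anyone who wants to picture the linking without digging into the C, the idea reduces to something like this pure Python toy (illustration only, not the actual patch -- deletion, None keys and so on are ignored):

class LinkedDict(dict):
    # Each key carries [prev, next] links so iteration can follow
    # insertion order instead of scanning the hash table for used slots.
    def __init__(self):
        super().__init__()
        self._links = {}                  # key -> [previous key, next key]
        self._first = self._last = None

    def __setitem__(self, key, value):
        if key not in self:
            if self._last is None:
                self._first = key
            else:
                self._links[self._last][1] = key
            self._links[key] = [self._last, None]
            self._last = key
        super().__setitem__(key, value)

    def __iter__(self):
        key = self._first
        while key is not None:
            yield key
            key = self._links[key][1]

d = LinkedDict()
d['b'] = 2
d['a'] = 1
d['c'] = 3
print(list(d))   # ['b', 'a', 'c']

In the C version the links live in the entries themselves (hence the two extra fields per entry), shrinking to 8- or 16-bit indices for small tables as described above.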
Raymond From jimjjewett at gmail.com Thu Oct 28 20:44:59 2010 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 28 Oct 2010 14:44:59 -0400 Subject: [Python-ideas] dict changes [was: Ordered storage of keyword arguments] Message-ID: On 10/28/10, Antoine Pitrou wrote: > On Thu, 28 Oct 2010 19:58:59 +0200 > spir wrote: >> What does the current implementation use as buckets? > It uses an open addressing strategy. Each dict entry holds three > pointer-sized fields: key object, value object, and cached hash value > of the key. > (set entries have only two fields, since they don't hold a value object) Has anyone benchmarked not storing the hash value here? For a string dict, that hash should already be available on the string object itself, so it is redundant. Keeping it obviously improves cache locality, but ... it also makes the dict objects 50% larger, and there is a chance that the strings themselves would already be in cache anyhow. And if strings were reliably interned, the comparison check should normally just be a pointer compare -- possibly fast enough that the "different hash" shortcut doesn't buy anything. [caveats about still needing to go to the slower dict implementation for string subclasses] -jJ From raymond.hettinger at gmail.com Thu Oct 28 21:06:38 2010 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 28 Oct 2010 12:06:38 -0700 Subject: [Python-ideas] dict changes [was: Ordered storage of keyword arguments] In-Reply-To: References: Message-ID: <02DB84F9-2D62-4A57-BA78-728C4E3ED399@gmail.com> On Oct 28, 2010, at 11:44 AM, Jim Jewett wrote: > >> It uses an open addressing strategy. Each dict entry holds three >> pointer-sized fields: key object, value object, and cached hash value >> of the key. >> (set entries have only two fields, since they don't hold a value object) > > Has anyone benchmarked not storing the hash value here That would be a small disaster. Either you call PyObject_Hash() for every probe (adding function call overhead for int and str, and adding tons of work for other types) or you can go directly to Py_RichCompareBool() which is never fast. I haven't timed this for dicts, but I did see major speed boosts in the performance of set-to-set operations when the internally stored hash was used instead of calling PyObject_Hash(). Raymond From greg.ewing at canterbury.ac.nz Thu Oct 28 22:14:46 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Oct 2010 09:14:46 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CC858F3.4000602@improva.dk> References: <4CC63065.9040507@improva.dk> <4CC858F3.4000602@improva.dk> Message-ID: <4CC9D9B6.8000005@canterbury.ac.nz> Jacob Holm wrote: > 1) Define a new "coiterator" protocol, consisting of a new special > method __conext__, and a new StopCoIteration exception that the regular > StopIteration inherits from. I don't think it's necessary to have a new protocol. All that's needed is to allow for the possibility of the __next__ method of an iterator being a cofunction. Under the current version of PEP 3152, with an explicit "cocall" operation, this would require a new kind of for-loop. Maybe using "cofor"? However, my current thinking on cofunctions is that cocalls should be implicit -- you declare a cofunction with "codef", and any call made within it can potentially be a cocall. In that case, there would be no need for new syntax -- the existing for-loop could just do the right thing when given an object whose __next__ method is a cofunction. 
Thinking about this has made me even more sure that implicit cocalls are the way to go, because it means that any other things we think of that need to take cofunctions into account can be fixed without having to introduce new syntax for each one. -- Greg From debatem1 at gmail.com Thu Oct 28 22:17:22 2010 From: debatem1 at gmail.com (geremy condra) Date: Thu, 28 Oct 2010 13:17:22 -0700 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <74CCD6FE-E625-4CDB-B9EE-9DA6D30713EB@gmail.com> References: <4CC93095.3080704@egenix.com> <20101028111936.1629eade@o> <20101028131054.22e8ba25@pitrou.net> <4CC96403.6030106@egenix.com> <74CCD6FE-E625-4CDB-B9EE-9DA6D30713EB@gmail.com> Message-ID: On Thu, Oct 28, 2010 at 11:10 AM, Raymond Hettinger wrote: > > On Oct 28, 2010, at 4:52 AM, M.-A. Lemburg wrote: > >> Antoine Pitrou wrote: >>> On Thu, 28 Oct 2010 11:19:36 +0200 >>> spir wrote: >>>> On Thu, 28 Oct 2010 10:13:09 +0200 >>>> "M.-A. Lemburg" wrote: >>>> >>>>> Ordered dicts are a lot slower than normal dictionaries. I don't >>>>> think that we can make such a change unless we want to make >>>>> Python a lot slower at the same time. >>>> >>>> Ruby has ordered hashes since 1.9 with apparently no relevant >>>> performance loss >>> >>> Performance would probably not suffer on micro-benchmarks (with >>> everything fitting in the CPU's L1 cache), but making dicts bigger >>> (by 66%: 5 pointer-sized fields per hash entry instead of 3) could >>> be detrimental in real life workloads. >> >> For function calls, yes. For class creation, I doubt that a few >> extra bytes would make much difference in real life - classes typically >> don't have thousands of methods or attributes :-) > > Last year, I experimented with this design (changing the dict implementation > to be ordered by adding two fields for links). The effects are: > > * The expected 66% increase in size was unavoidable for large dicts. > > * For smaller dicts the link fields used indices instead of pointers > and those indices were smaller than the existing fields (i.e. 8 bits > per entry for tables under 256 rows, 16 bits per entry for tables under > 65k rows). > > * Iteration speed improved for smaller dicts because we don't have > to examine empty slots (we also get to eliminate the "search > finger" hack). For larger dicts, results were mixed (because of the > loss of locality of access). > > > Raymond Is this available somewhere? I'd like to play around with this for a bit. Geremy Condra From greg.ewing at canterbury.ac.nz Thu Oct 28 23:22:39 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Oct 2010 10:22:39 +1300 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <4CC941FF.6070408@egenix.com> References: <4CC93095.3080704@egenix.com> <4CC941FF.6070408@egenix.com> Message-ID: <4CC9E99F.5030805@canterbury.ac.nz> M.-A. Lemburg wrote: > Python has always tried > to make the most common use case simple, so asking programmers to > use a meta-class to be able to access the order of definitions > in a class definition isn't exactly what the normal Python > programmer would expect. But needing to know the order of definitions in a class is a very uncommon thing to want to do in the first place. 
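A rough sketch of the kind of constructor I mean (untested, and the _order bookkeeping is only enough to make the point):

class OrderedDictionary(dict):
    # Positional (key, value) pairs only; keyword arguments are rejected
    # because they would arrive in arbitrary order.
    def __init__(self, *pairs, **kwds):
        if kwds:
            raise TypeError("pass (key, value) pairs, not keyword arguments")
        super().__init__(pairs)
        self._order = [k for k, v in pairs]

    def __iter__(self):
        return iter(self._order)

d = OrderedDictionary(('a', 1), ('b', 2), ('c', 3))
print(list(d))            # ['a', 'b', 'c']
# OrderedDictionary(a=1)  # -> TypeError, by design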
-- Greg From greg.ewing at canterbury.ac.nz Thu Oct 28 23:37:17 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Oct 2010 10:37:17 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CC94CA7.2030001@improva.dk> References: <4CC63065.9040507@improva.dk> <4CC858F3.4000602@improva.dk> <4CC94CA7.2030001@improva.dk> Message-ID: <4CC9ED0D.1010800@canterbury.ac.nz> Jacob Holm wrote: > The new exception is needed because "cocall func()" can never raise the > regular StopIteration (or any subclass thereof). Botheration, I hadn't thought of that! I'll have to think about this one. I still feel that it shouldn't be necessary to define any new protocol -- one ought to be able to simply write a __next__ cofunction that looks like a normal one in all respects except that it's defined with 'codef'. Maybe a StopIteration raised inside a cofunction shouldn't be synonymous with a return, but instead should be caught and tunnelled around the yield-from via another exception. -- Greg From greg.ewing at canterbury.ac.nz Fri Oct 29 00:03:45 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Oct 2010 11:03:45 +1300 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: References: Message-ID: <4CC9F341.8020404@canterbury.ac.nz> Chris Rose wrote: > I'd like to resurrect a discussion that went on a little over a year > ago [1] started by Michael Foord suggesting that it'd be nice if > keyword arguments' storage was implemented as an ordered dict as > opposed to the current unordered form. What's the use case for this? One of the reasons that keyword arguments are useful is that you don't have to care what order you write them in! -- Greg From solipsis at pitrou.net Fri Oct 29 00:11:51 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 29 Oct 2010 00:11:51 +0200 Subject: [Python-ideas] dict changes [was: Ordered storage of keyword arguments] In-Reply-To: References: Message-ID: <1288303911.3753.9.camel@localhost.localdomain> Le jeudi 28 octobre 2010 ? 14:44 -0400, Jim Jewett a ?crit : > > For a string dict, that hash should already be available on the string > object itself, so it is redundant. Keeping it obviously improves > cache locality, but ... it also makes the dict objects 50% larger, and > there is a chance that the strings themselves would already be in > cache anyhow. And if strings were reliably interned, the comparison > check should normally just be a pointer compare -- possibly fast > enough that the "different hash" shortcut doesn't buy anything. > [caveats about still needing to go to the slower dict implementation > for string subclasses] I've thought about that. The main annoyance is to be able to switch transparently between the two implementations. But I think it would be interesting to pursue that effort, since indeed dicts with interned keys are the most common case of dicts in the average Python workload. Saving 1/3 of the memory size on these dicts would be worthwhile IMO. (addressing itself would perhaps be a bit simpler, because of multiplying by 8 or 16 instead of multiplying by 12 or 24. But I doubt the difference would be noticeable) Regartds Antoine. 
From greg.ewing at canterbury.ac.nz Fri Oct 29 00:43:19 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Oct 2010 11:43:19 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> Message-ID: <4CC9FC87.1040600@canterbury.ac.nz> Nick Coghlan wrote: > On Thu, Oct 28, 2010 at 6:52 PM, Jacob Holm wrote: >>Looks like we are still not on exactly the same page though... You seem >>to be arguing from the version at >>http://www.python.org/dev/peps/pep-0380, whereas I am looking at >>http://mail.python.org/pipermail/python-ideas/attachments/20090419/c7d72ba8/attachment-0001.txt, >>which is newer. > > Still, the revised expansion also does the right thing in the case > that was originally bothering me, That attachment is slightly older than my own current draft, which is attached below. The differences in the expansion are as follows (- is the version linked to above, + is my current version): @@ -141,20 +141,21 @@ _s = yield _y except GeneratorExit as _e: try: - _m = getattr(_i, 'close') + _m = _i.close except AttributeError: pass else: _m() raise _e except BaseException as _e: + _x = sys.exc_info() try: - _m = getattr(_i, 'throw') + _m = _i.throw except AttributeError: raise _e else: try: - _y = _m(*sys.exc_info()) + _y = _m(*_x) except StopIteration as _e: _r = _e.value break Does this version still address your concerns? If so, please check it in as the latest version. -- Greg -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: yield-from-rev14.txt URL: From jh at improva.dk Fri Oct 29 02:45:00 2010 From: jh at improva.dk (Jacob Holm) Date: Fri, 29 Oct 2010 02:45:00 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CC9D9B6.8000005@canterbury.ac.nz> References: <4CC63065.9040507@improva.dk> <4CC858F3.4000602@improva.dk> <4CC9D9B6.8000005@canterbury.ac.nz> Message-ID: <4CCA190C.6050709@improva.dk> On 2010-10-28 22:14, Greg Ewing wrote: > Jacob Holm wrote: > >> 1) Define a new "coiterator" protocol, consisting of a new special >> method __conext__, and a new StopCoIteration exception that the regular >> StopIteration inherits from. > > I don't think it's necessary to have a new protocol. All > that's needed is to allow for the possibility of the > __next__ method of an iterator being a cofunction. > That is more or less exactly what I did for my second version. Turns out to be less simple than that because you need to "next" work as a cofunction as well, and there is a problem with raising StopIteration from a cofunction. > Under the current version of PEP 3152, with an explicit > "cocall" operation, this would require a new kind of > for-loop. Maybe using "cofor"? > > However, my current thinking on cofunctions is that > cocalls should be implicit -- you declare a cofunction > with "codef", and any call made within it can potentially > be a cocall. In that case, there would be no need for new > syntax -- the existing for-loop could just do the right > thing when given an object whose __next__ method is a > cofunction. > > Thinking about this has made me even more sure that > implicit cocalls are the way to go, because it means > that any other things we think of that need to take > cofunctions into account can be fixed without having > to introduce new syntax for each one. > Yes. 
Looking at a few examples using my toy implementation of Go channels made me realise just how awkward it would be to have to mark all cocall sites explicitly. With implicit cocalls and a for-loop changed to work with a cofunction __next__, working with channels can be made to look exactly like working with generators. For me, that would be a major selling point for the PEP. - Jacob From ncoghlan at gmail.com Fri Oct 29 02:54:38 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 29 Oct 2010 10:54:38 +1000 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <4CC9F341.8020404@canterbury.ac.nz> References: <4CC9F341.8020404@canterbury.ac.nz> Message-ID: On Fri, Oct 29, 2010 at 8:03 AM, Greg Ewing wrote: > What's the use case for this? One of the reasons that keyword > arguments are useful is that you don't have to care what order > you write them in! The use case is being able to interface naturally with any key-value API where order matters. For example: # Create an ordered dictionary (WRONG!) d = OrderedDictionary(a=1, b=2, c=3) # Order is actually arbitrary due to unordered kw dict Another example is an addition made to part of the json API (to accept an iterable of key-value pairs) to work around exactly this problem. Basically, if an API accepts an iterable of key-value pairs instead of a dictionary, it's a case where ordered keyword dictionaries would likely improve usability. That said, there are plenty of steps to be taken before the idea of using ordered dictionaries implicitly anywhere in the interpreter can even be seriously considered. Step 1 is to come up with a C-accelerated version of collections.OrderedDictionary, step 2 is to make it a builtin (odict?), step 3 is to consider using it for class namespaces and/or for keyword arguments by default, then step 4 would probably be to switch "dict=odict" and add a collections.UnorderedDictionary interface to the old dict implementation. The bar for progression (in terms of acceptable impacts on speed and memory usage) would get higher with each step along the path. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Fri Oct 29 03:17:15 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 29 Oct 2010 11:17:15 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> Message-ID: On Fri, Oct 29, 2010 at 1:04 AM, Guido van Rossum wrote: > On Thu, Oct 28, 2010 at 5:44 AM, Nick Coghlan wrote: >> Yep, we've basically agreed on that as the way forward as well. We >> have a small tweak to suggest for PEP 380 to avoid losing the return >> value from inner close() calls, > > This is my "gclose()" function, right? Or is there more to it? Yeah, the idea your gclose(), plus one extra tweak to the expansion of "yield from" to store the result of the inner close() call on a new GeneratorExit instance. 
To use a toy example: # Even this toy framework needs a little structure class EndSum(Exception): pass def gsum(): # Sums sent values until EndSum or GeneratorExit are thrown in tally = 0 try: while 1: tally += yield except (EndSum, GeneratorExit): pass return x def average_sums(): # Advances to a new sum when EndSum is thrown in # Finishes the last sum and averages them all when GeneratorExit is thrown in sums = [] try: while 1: sums.append(yield from gsum()) except GeneratorExit as ex: # Our proposed expansion tweak is to enable the next line sums.append(ex.args[0]) return sum(sums) / len(sums) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From guido at python.org Fri Oct 29 04:21:27 2010 From: guido at python.org (Guido van Rossum) Date: Thu, 28 Oct 2010 19:21:27 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> Message-ID: On Thu, Oct 28, 2010 at 6:17 PM, Nick Coghlan wrote: > On Fri, Oct 29, 2010 at 1:04 AM, Guido van Rossum wrote: >> On Thu, Oct 28, 2010 at 5:44 AM, Nick Coghlan wrote: >>> Yep, we've basically agreed on that as the way forward as well. We >>> have a small tweak to suggest for PEP 380 to avoid losing the return >>> value from inner close() calls, >> >> This is my "gclose()" function, right? Or is there more to it? > > Yeah, the idea your gclose(), plus one extra tweak to the expansion of > "yield from" to store the result of the inner close() call on a new > GeneratorExit instance. > > To use a toy example: > > ?# Even this toy framework needs a little structure > ?class EndSum(Exception): pass > > ?def gsum(): > ? ?# Sums sent values until EndSum or GeneratorExit are thrown in > ? ?tally = 0 > ? ?try: > ? ? ?while 1: > ? ? ? ?tally += yield > ? ?except (EndSum, GeneratorExit): > ? ? ?pass > ? ?return x You meant return tally. Right? > ?def average_sums(): > ? ?# Advances to a new sum when EndSum is thrown in > ? ?# Finishes the last sum and averages them all when GeneratorExit > is thrown in > ? ?sums = [] > ? ?try: > ? ? ?while 1: > ? ? ? ?sums.append(yield from gsum()) > ? ?except GeneratorExit as ex: > ? ? ?# Our proposed expansion tweak is to enable the next line > ? ? ?sums.append(ex.args[0]) > ? ?return sum(sums) / len(sums) Hmmm... That looks pretty complicated. Wouldn't it be much more straightforward if instead of value ... value EndSum value ... value EndSum value ... value GeneratorExit the input sequence was required to be value ... value EndSum value ... value EndSum value ... value *EndSum* GeneratorExit ? Then gsum() wouldn't have to catch EndSum at all, and I don't think the PEP would have to special-case GeneratorExit. average_sums() could simply have except GeneratorExit: return sum(sums) / len(sums) After all this is a fairly arbitrary protocol and the caller presumably can do whatever is required of it. If there are values between the last EndSum and the last GeneratorExit those will be ignored -- that is a case of garbage in garbage out. If you really wanted to catch that mistake there would be several ways to translate it reliably into some other exception -- or log it, or whatever. It is also defensible that a better design of the protocol would not require throwing EndSum but sending some agreed-upon marker value. -- --Guido van Rossum (python.org/~guido) From cmjohnson.mailinglist at gmail.com Fri Oct 29 05:17:14 2010 From: cmjohnson.mailinglist at gmail.com (Carl M. 
Johnson) Date: Thu, 28 Oct 2010 17:17:14 -1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> Message-ID: On Thu, Oct 28, 2010 at 4:21 PM, Guido van Rossum wrote: > On Thu, Oct 28, 2010 at 6:17 PM, Nick Coghlan wrote: >> To use a toy example: >> >> ?# Even this toy framework needs a little structure >> ?class EndSum(Exception): pass >> >> ?def gsum(): >> ? ?# Sums sent values until EndSum or GeneratorExit are thrown in >> ? ?tally = 0 >> ? ?try: >> ? ? ?while 1: >> ? ? ? ?tally += yield >> ? ?except (EndSum, GeneratorExit): >> ? ? ?pass >> ? ?return x > > You meant return tally. Right? > >> ?def average_sums(): >> ? ?# Advances to a new sum when EndSum is thrown in >> ? ?# Finishes the last sum and averages them all when GeneratorExit >> is thrown in >> ? ?sums = [] >> ? ?try: >> ? ? ?while 1: >> ? ? ? ?sums.append(yield from gsum()) >> ? ?except GeneratorExit as ex: >> ? ? ?# Our proposed expansion tweak is to enable the next line >> ? ? ?sums.append(ex.args[0]) >> ? ?return sum(sums) / len(sums) > This toy example is a little confusing to me because it has typos? which is natural when one is writing a program without being able to run it to debug it. So, I wrote a version of the accumulator/averager that will work in Python 2.7 (and I think 3, but I didn't test it): class ReturnValue(Exception): pass def prime_pump(gen): def f(*args, **kwargs): g = gen(*args, **kwargs) next(g) return g return f @prime_pump def accumulator(): total = 0 length = 0 try: while 1: value = yield total += value length += 1 print(length, value, total) except GeneratorExit: r = ReturnValue() r.total = total r.length = length raise r @contextmanager def get_sum(it): try: it.close() except ReturnValue as r: yield r.total @contextmanager def get_average(it): try: it.close() except ReturnValue as r: yield r.total / r.length def main(): running_total = accumulator() sums = accumulator() running_total.send(6) #For example, whatever running_total.send(7) with get_sum(running_total) as first_sum: sums.send(first_sum) running_total = accumulator() #Zero it out running_total.send(2) #For example, whatever running_total.send(2) running_total.send(5) running_total.send(8) with get_sum(running_total) as second_sum: sums.send(second_sum) #Get the average of the sums with get_average(sums) as r: return r main() So, I guess the question I have is how will the proposed extensions to the language make the above code prettier? One thing I can see is that if it's possible to return from inside a generator, it can be more straightforward to get the values out of the accumulator at the end: try: while 1: value = yield total += value length += 1 print(length, value, total) except GeneratorExit: return total, length With Guido's proposed "for item from yield" syntax, IIUC this can be prettied up even more as: for value from yield: total += value length += 1 return total, length Are there other benefits to the proposed extensions? How will the call sites be improved? I'm not sure how I would rewrite main() to be prettier/more clear in light of the proposals? 
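(A sketch for illustration, not from the original message: assuming generators may return a value and the value rides on StopIteration.value as PEP 380 proposes, and assuming an invented finish() helper much like the gclose() function Guido posts later in this thread, Carl's accumulator and its call sites could shrink to roughly the following.)

    def finish(gen):
        # Throw GeneratorExit in and pick the return value off StopIteration,
        # in the spirit of the gclose() helper shown further down the thread.
        try:
            gen.throw(GeneratorExit)
        except StopIteration as e:
            return e.value
        except GeneratorExit:
            return None

    def accumulator():
        # "return <value>" in a generator removes the need for a ReturnValue exception.
        total = length = 0
        try:
            while True:
                total += yield
                length += 1
        except GeneratorExit:
            return total, length

    def main():
        running_total = accumulator()
        next(running_total)              # prime the generator
        running_total.send(6)
        running_total.send(7)
        total, length = finish(running_total)
        return total, total / length

    print(main())                        # prints (13, 6.5)

(The consumer no longer needs a custom exception and the caller no longer needs a context manager; the result simply rides out on StopIteration.)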
Thanks, -- Carl Johnson From greg.ewing at canterbury.ac.nz Fri Oct 29 06:26:30 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Oct 2010 17:26:30 +1300 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: References: <4CC9F341.8020404@canterbury.ac.nz> Message-ID: <4CCA4CF6.6020006@canterbury.ac.nz> On 29/10/10 13:54, Nick Coghlan wrote: > The use case is being able to interface naturally with any key-value > API where order matters. > > For example: > > # Create an ordered dictionary (WRONG!) > d = OrderedDictionary(a=1, b=2, c=3) # Order is actually arbitrary due > to unordered kw dict I'd need convincing that the API wouldn't be better designed to take something other than keyword arguments: d = OrderedDictionary(('a', 1), ('b', 2), ('c', 3)) and have it refuse to accept keyword arguments to prevent accidents. -- Greg From greg.ewing at canterbury.ac.nz Fri Oct 29 06:35:20 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Oct 2010 17:35:20 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> Message-ID: <4CCA4F08.3020206@canterbury.ac.nz> On 29/10/10 15:21, Guido van Rossum wrote: > value ... value EndSum value ... value EndSum value ... value > *EndSum* GeneratorExit Seems to me that anything requiring asking for intermediate values while not stopping the computation entirely is going beyond what can reasonably be supported with a generator. I wouldn't like to see yield-from and/or the generator protocol made any more complicated in order to allow such things. -- Greg From offline at offby1.net Fri Oct 29 06:43:48 2010 From: offline at offby1.net (Chris Rose) Date: Thu, 28 Oct 2010 22:43:48 -0600 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <4CCA4CF6.6020006@canterbury.ac.nz> References: <4CC9F341.8020404@canterbury.ac.nz> <4CCA4CF6.6020006@canterbury.ac.nz> Message-ID: On Thu, Oct 28, 2010 at 10:26 PM, Greg Ewing wrote: > On 29/10/10 13:54, Nick Coghlan wrote: > >> The use case is being able to interface naturally with any key-value >> API where order matters. >> >> For example: >> >> # Create an ordered dictionary (WRONG!) >> d = OrderedDictionary(a=1, b=2, c=3) # Order is actually arbitrary due >> to unordered kw dict > > I'd need convincing that the API wouldn't be better designed > to take something other than keyword arguments: > > ?d = OrderedDictionary(('a', 1), ('b', 2), ('c', 3)) > > and have it refuse to accept keyword arguments to prevent > accidents. I'm hard pressed to see how an ordered dict and a dict should be expected to differ by such a degree; in every particular they behave the same, except in the case of the OrderedDict you specify your initial parameters in tuples? Eugh. I'm not saying that it's a big enough gap to justify the amount of work that so clearly needs to be done (and now that I've followed some of the more indepth comments here, as well as read over the documentation in dictobject.c, I get a sense of how big a deal this could end up being) but there's not a lot to be said for the current weird behaviour of the ordered dict constructor. -- Chris R. ====== Not to be taken literally, internally, or seriously. 
Twitter: http://twitter.com/offby1 From rrr at ronadam.com Fri Oct 29 06:25:46 2010 From: rrr at ronadam.com (Ron Adam) Date: Thu, 28 Oct 2010 23:25:46 -0500 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> Message-ID: <4CCA4CCA.1040809@ronadam.com> On 10/28/2010 08:17 PM, Nick Coghlan wrote: > On Fri, Oct 29, 2010 at 1:04 AM, Guido van Rossum wrote: >> On Thu, Oct 28, 2010 at 5:44 AM, Nick Coghlan wrote: >>> Yep, we've basically agreed on that as the way forward as well. We >>> have a small tweak to suggest for PEP 380 to avoid losing the return >>> value from inner close() calls, >> >> This is my "gclose()" function, right? Or is there more to it? > > Yeah, the idea your gclose(), plus one extra tweak to the expansion of > "yield from" to store the result of the inner close() call on a new > GeneratorExit instance. > > To use a toy example: > > # Even this toy framework needs a little structure > class EndSum(Exception): pass > > def gsum(): > # Sums sent values until EndSum or GeneratorExit are thrown in > tally = 0 > try: > while 1: > tally += yield > except (EndSum, GeneratorExit): > pass > return x > > def average_sums(): > # Advances to a new sum when EndSum is thrown in > # Finishes the last sum and averages them all when GeneratorExit > is thrown in > sums = [] > try: > while 1: > sums.append(yield from gsum()) > except GeneratorExit as ex: > # Our proposed expansion tweak is to enable the next line > sums.append(ex.args[0]) > return sum(sums) / len(sums) Nick, could you add a main() or calling routine? I'm having trouble seeing the complete logic without that. Cheers, Ron From greg.ewing at canterbury.ac.nz Fri Oct 29 07:25:56 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Oct 2010 18:25:56 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> Message-ID: <4CCA5AE4.7080403@canterbury.ac.nz> Guido van Rossum wrote: > I'd also like to convince you to change g.close() so that it captures > and returns the return value from StopIteration if it has one. Looking at this again, I find that I'm not really sure how this impacts PEP 380. The current expansion specifies that when a delegating generator is closed, the subgenerator's close() method is called, any value it returns is ignored, and GeneratorExit is re-raised. If that close() call were to return a value, what do you think should be done with it? -- Greg From stefan_ml at behnel.de Fri Oct 29 07:45:16 2010 From: stefan_ml at behnel.de (Stefan Behnel) Date: Fri, 29 Oct 2010 07:45:16 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <4CC9E99F.5030805@canterbury.ac.nz> References: <4CC93095.3080704@egenix.com> <4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz> Message-ID: Greg Ewing, 28.10.2010 23:22: > M.-A. Lemburg wrote: >> Python has always tried >> to make the most common use case simple, so asking programmers to >> use a meta-class to be able to access the order of definitions >> in a class definition isn't exactly what the normal Python >> programmer would expect. > > But needing to know the order of definitions in a class > is a very uncommon thing to want to do in the first > place. Uncommon, sure, but there are use cases. A couple of Python based DSLs use classes as namespaces. 
Think of SOAP interface classes or database table definitions. In these cases, users usually have the field/column order in the back of their head when they write or read the code. So it's actually a bit surprising and somewhat error prone when the fields show up in arbitrary (and unpredictable!) order at runtime. And even on the same system, the order can change arbitrarily when new fields are added. Stefan From rrr at ronadam.com Fri Oct 29 07:53:24 2010 From: rrr at ronadam.com (Ron Adam) Date: Fri, 29 Oct 2010 00:53:24 -0500 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> Message-ID: <4CCA6154.3030507@ronadam.com> On 10/28/2010 08:17 PM, Nick Coghlan wrote: > def average_sums(): > # Advances to a new sum when EndSum is thrown in > # Finishes the last sum and averages them all when GeneratorExit > is thrown in > sums = [] > try: > while 1: > sums.append(yield from gsum()) Wouldn't this need to be... gsum_ = gsum() next(gsum_) sums.append(yield from gsum_) Or does the yield from allow send on a just started generator? > except GeneratorExit as ex: > # Our proposed expansion tweak is to enable the next line > sums.append(ex.args[0]) > return sum(sums) / len(sums) Ron From stephen at xemacs.org Fri Oct 29 08:28:07 2010 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 29 Oct 2010 15:28:07 +0900 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: References: <4CC9F341.8020404@canterbury.ac.nz> <4CCA4CF6.6020006@canterbury.ac.nz> Message-ID: <87bp6dqzrc.fsf@uwakimon.sk.tsukuba.ac.jp> Chris Rose writes: > I'm hard pressed to see how an ordered dict and a dict should be > expected to differ by such a degree; in every particular they behave > the same, except in the case of the OrderedDict you specify your > initial parameters in tuples? Eugh. But initializing a dict with "dict(a=1, b=2)" is purely an accidental convenience based on the fact that a **kw argument is implemented as a dict. I find that syntax a bit disconcerting, actually, though it's natural on reflection. If you think of odict as an (efficient) associative list (order is primary function, random access via keys secondary), rather than an ordered mapping (random access via keys is primary function, order secondary) then the syntaxes ['a' : 1, 'b' : 2, 'c' : 3] # create an odict (surely that has been suggested before!) and def foo([**kw]): # pass kw as an odict pass are suggestive. I don't know whether either would be parsable by Python's parser, and I haven't thought about how the latter would deal with positional or kw-only arguments. From cmjohnson.mailinglist at gmail.com Fri Oct 29 08:32:37 2010 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Thu, 28 Oct 2010 20:32:37 -1000 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <87bp6dqzrc.fsf@uwakimon.sk.tsukuba.ac.jp> References: <4CC9F341.8020404@canterbury.ac.nz> <4CCA4CF6.6020006@canterbury.ac.nz> <87bp6dqzrc.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Thu, Oct 28, 2010 at 8:28 PM, Stephen J. Turnbull wrote: > If you think of odict as an (efficient) associative list (order is > primary function, random access via keys secondary), rather than an > ordered mapping (random access via keys is primary function, order > secondary) then the syntaxes > > ? ?['a' : 1, 'b' : 2, 'c' : 3] ? ?# create an odict > > (surely that has been suggested before!) 
and Yup: http://mail.python.org/pipermail/python-ideas/2009-June/004924.html GvR says "-100" :-O Archivally-yrs, -- Carl From greg.ewing at canterbury.ac.nz Fri Oct 29 09:18:05 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Oct 2010 20:18:05 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> Message-ID: <4CCA752D.5090904@canterbury.ac.nz> I've been pondering the whole close()-returning-a-value thing I've convinced myself once again that it's a bad idea. Essentially the problem is that we're trying to make the close() method, and consequently GeneratorExit, serve two different and incompatible roles. One role (the one it currently serves) is as an emergency bail-out mechanism. In that role, when we have a stack of generators delegating via yield-from, we want things to behave as thought the GeneratorExit originates in the innermost one and propagates back out of the entire stack. We don't want any of the intermediate generators to catch it and turn it into a StopIteration, because that would give the next outer one the misleading impression that it's business as usual, but it's not. This is why PEP 380 currently specifies that, after calling the close() method of the subgenerator, GeneratorExit is unconditionally re-raised in the delegating generator. The proponents of close()-returning-a-value, however, want GeneratorExit to serve another role: as a way of signalling to a consuming generator (i.e. one that is having values passed into it using send()) that there are no more values left to pass in. It seems to me that this is analogous to a function reading values from a file, or getting them from an iterator. The behaviour that's usually required in the presence of delegation is quite different in those cases. Consider a function f1, that calls another function f2, which loops reading from a file. When f2 reaches the end of the file, this is a signal that it should finish what it's doing and return a value to f1, which then continues in its usual way. Similarly, if f2 uses a for-loop to iterate over something, when the iterator is exhausted, f2 continues and returns normally. I don't see how GeneratorExit can be made to fulfil this role, i.e. as a "producer exhausted" signal, without compromising its existing one. And if that idea is dropped, the idea of close() returning a value no longer has much motivation that I can see. So how should "producer exhausted" be signalled, and how should the result of a consumer generator be returned? As for returning the result, I think it should be done using the existing PEP 380 mechanism, i.e. the generator executes a "return", consequently raising StopIteration with the value. A delegating generator will then see this as the result of a yield-from and continue normally. As for the signalling mechanism, I think that's entirely a matter for the producer and consumer to decide between themselves. One way would be to send() in a sentinel value, if there is a suitable out-of-band value available. Another would be to throw() in some pre-arranged exception, perhaps EOFError as a suggested convention. If we look at files as an analogy, we see a similar range of conventions. Most file reading operations return an empty string or bytes object on EOF. 
Some, such as readline(), raise an exception, because the empty element of the relevant type is also a valid return value. As an example, a consumer generator using None as a sentinel value might look like this: def summer(): tot = 0 while 1: x = yield if x is None: break tot += x return tot and a producer using it: s = summer() s.next() for x in values: s.send(x) try: s.send(None) except StopIteration as e: result = e.value Having to catch StopIteration is a little tedious, but it could easily be encapsulated in a helper function: def close_consumer(g, sentinel): try: g.send(sentinel) except StopIteration as e: return e.value The helper function could also take care of another issue that arises. What happens if a delegating consumer carries on after a subconsumer has finished and yields again? The analogous situation with files is trying to read from a file that has already signalled EOF before. In that case, the file simply signals EOF again. Similarly, calling next() on an exhausted iterator raises StopIteration again. So, if a "finished" consumer yields again, and we are using a sentinel value, the yield should return the sentinel again. We can get this behaviour by writing our helper function like this: def close_consumer(g, sentinel): while 1: try: g.send(sentinel) except StopIteration as e: return e.value So in summary, I think PEP 380 and current generator semantics are fine as they stand with regard to the behaviour of close(). Signalling the end of a stream of values to a consumer generator can and should be handled by convention, using existing facilities. -- Greg From masklinn at masklinn.net Fri Oct 29 09:47:38 2010 From: masklinn at masklinn.net (Masklinn) Date: Fri, 29 Oct 2010 09:47:38 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <4CCA4CF6.6020006@canterbury.ac.nz> References: <4CC9F341.8020404@canterbury.ac.nz> <4CCA4CF6.6020006@canterbury.ac.nz> Message-ID: On 2010-10-29, at 06:26 , Greg Ewing wrote: > On 29/10/10 13:54, Nick Coghlan wrote: > >> The use case is being able to interface naturally with any key-value >> API where order matters. >> >> For example: >> >> # Create an ordered dictionary (WRONG!) >> d = OrderedDictionary(a=1, b=2, c=3) # Order is actually arbitrary due >> to unordered kw dict > > I'd need convincing that the API wouldn't be better designed > to take something other than keyword arguments: > > d = OrderedDictionary(('a', 1), ('b', 2), ('c', 3)) Well that your version takes up nearly 3 times as many characters per item would be quite a ding against it I think. It is verbose and quite alien-looking, and thus not quite ideal for interfacing with systems where ordered keyword arguments are common place (a Python interface to Cocoa for instance, or to a Smalltalk-type system). Then again, you can easily counter that Smalltalk-type keyword arguments allow for repeated keys, whereas Python's don't allow this in any case, so the interface is broken in any case. 
Furthermore, it is downright painful to interface with on the other side of the equation, especially since Python has (as far as I know) no support for association lists, as it is necessary to either manually walk the list for the right keys or one has to deal with two different structures at once (a dict for k:v access and a list for order) From greg.ewing at canterbury.ac.nz Fri Oct 29 09:47:54 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Oct 2010 20:47:54 +1300 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: References: <4CC9F341.8020404@canterbury.ac.nz> <4CCA4CF6.6020006@canterbury.ac.nz> Message-ID: <4CCA7C2A.10005@canterbury.ac.nz> Chris Rose wrote: > I'm hard pressed to see how an ordered dict and a dict should be > expected to differ by such a degree; in every particular they behave > the same, except in the case of the OrderedDict you specify your > initial parameters in tuples? Well, you *can* specify them in tuples for an ordinary dict if you want: >>> d = dict([('a', 1), ('b', 2)]) >>> d {'a': 1, 'b': 2} The fact that keywords also work for an ordinary dict is really just a lucky fluke that works in the special case where the keys are identifier-like strings. In any other case you have to use tuples anyway. Also, you get away with it because dicts happen to be unordered. Expecting your luck in this area to extend to other data types is pushing things a bit, I think. -- Greg From greg.ewing at canterbury.ac.nz Fri Oct 29 10:08:44 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Oct 2010 21:08:44 +1300 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: References: <4CC9F341.8020404@canterbury.ac.nz> <4CCA4CF6.6020006@canterbury.ac.nz> <87bp6dqzrc.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <4CCA810C.1070501@canterbury.ac.nz> Carl M. Johnson wrote: > On Thu, Oct 28, 2010 at 8:28 PM, Stephen J. Turnbull wrote: > >> ['a' : 1, 'b' : 2, 'c' : 3] # create an odict >> >>(surely that has been suggested before!) and > > GvR says "-100" :-O That's a pity, because the next obvious step, for those who don't want to give up their keywords, would be [a = 1, b = 2, c = 3] :-) -- Greg From mal at egenix.com Fri Oct 29 10:30:26 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 29 Oct 2010 10:30:26 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <4CC9E99F.5030805@canterbury.ac.nz> References: <4CC93095.3080704@egenix.com> <4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz> Message-ID: <4CCA8622.5060405@egenix.com> Greg Ewing wrote: > M.-A. Lemburg wrote: >> Python has always tried >> to make the most common use case simple, so asking programmers to >> use a meta-class to be able to access the order of definitions >> in a class definition isn't exactly what the normal Python >> programmer would expect. > > But needing to know the order of definitions in a class > is a very uncommon thing to want to do in the first > place. I've already pointed to a couple of existing use cases where the authors had to play all sorts of tricks to access the order of such definitions. Since Python programs are executed sequentially (within the resp. scope) in the order given in the source file, it is quite natural to expect this order to be accessible somehow. If it were easier to access this order, a lot of the extra magic needed to map fixed order records to Python classes could go away. 
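(A sketch for illustration, not from the original message: the "fixed order records" case can already be served by a metaclass whose __prepare__ returns an OrderedDict, anticipating the recipe Carl posts near the end of this thread; RecordMeta and _field_order are invented names.)

    from collections import OrderedDict

    class RecordMeta(type):
        @classmethod
        def __prepare__(metacls, name, bases, **kwds):
            # Collect the class body in an ordered namespace.
            return OrderedDict()

        def __new__(metacls, name, bases, namespace, **kwds):
            cls = super().__new__(metacls, name, bases, dict(namespace))
            # Record the order in which non-dunder names were defined.
            cls._field_order = tuple(k for k in namespace if not k.startswith('__'))
            return cls

    class Person(metaclass=RecordMeta):
        name = str
        age = int
        email = str

    print(Person._field_order)    # ('name', 'age', 'email')

(An ORM or SOAP binding can then iterate _field_order instead of resorting to stack-frame tricks to recover the definition order.)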
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 29 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From solipsis at pitrou.net Fri Oct 29 10:49:31 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 29 Oct 2010 10:49:31 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments References: <4CC93095.3080704@egenix.com> <4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz> <4CCA8622.5060405@egenix.com> Message-ID: <20101029104931.665ee663@pitrou.net> On Fri, 29 Oct 2010 10:30:26 +0200 "M.-A. Lemburg" wrote: > Greg Ewing wrote: > > M.-A. Lemburg wrote: > >> Python has always tried > >> to make the most common use case simple, so asking programmers to > >> use a meta-class to be able to access the order of definitions > >> in a class definition isn't exactly what the normal Python > >> programmer would expect. > > > > But needing to know the order of definitions in a class > > is a very uncommon thing to want to do in the first > > place. > > I've already pointed to a couple of existing use cases where the > authors had to play all sorts of tricks to access the order of > such definitions. > > Since Python programs are executed sequentially (within the resp. > scope) in the order given in the source file, it is quite natural > to expect this order to be accessible somehow. > > If it were easier to access this order, a lot of the extra magic > needed to map fixed order records to Python classes could go > away. Interestingly, this order is already accessible on the code object used to build the class namespace: >>> def f(): ... class C: ... x = 5 ... def y(): pass ... z = 6 ... >>> code = f.__code__.co_consts[1] >>> code.co_names ('__name__', '__module__', 'x', 'y', 'z') Regards Antoine. From cmjohnson.mailinglist at gmail.com Fri Oct 29 10:50:26 2010 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Thu, 28 Oct 2010 22:50:26 -1000 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <4CCA8622.5060405@egenix.com> References: <4CC93095.3080704@egenix.com> <4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz> <4CCA8622.5060405@egenix.com> Message-ID: On Thu, Oct 28, 2010 at 10:30 PM, M.-A. Lemburg wrote: > If it were easier to access this order, a lot of the extra magic > needed to map fixed order records to Python classes could go > away. But, pending the creation of an odict of equal or greater speed, is there any reason we can't just make due for now by having the __prepare__ method of our relevant metaclasses return an odict? Sure, in Python 2 people used to have to do crazy stack frame hacks and such to preserve the ordering info for their ORMs, but now that we have __prepare__, I don't think the need for a better solution is particularly urgent. I agree it might be nice to have odicts everywhere, but it wouldn't be so nice that we need to sacrifice performance for it at the moment. 
Let someone forge the bell first, then we can talk about getting the cat to wear it. From mal at egenix.com Fri Oct 29 10:51:54 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 29 Oct 2010 10:51:54 +0200 Subject: [Python-ideas] dict changes [was: Ordered storage of keyword arguments] In-Reply-To: <1288303911.3753.9.camel@localhost.localdomain> References: <1288303911.3753.9.camel@localhost.localdomain> Message-ID: <4CCA8B2A.2060702@egenix.com> Antoine Pitrou wrote: > Le jeudi 28 octobre 2010 ? 14:44 -0400, Jim Jewett a ?crit : >> >> For a string dict, that hash should already be available on the string >> object itself, so it is redundant. Keeping it obviously improves >> cache locality, but ... it also makes the dict objects 50% larger, and >> there is a chance that the strings themselves would already be in >> cache anyhow. And if strings were reliably interned, the comparison >> check should normally just be a pointer compare -- possibly fast >> enough that the "different hash" shortcut doesn't buy anything. >> [caveats about still needing to go to the slower dict implementation >> for string subclasses] > > I've thought about that. The main annoyance is to be able to switch > transparently between the two implementations. But I think it would be > interesting to pursue that effort, since indeed dicts with interned keys > are the most common case of dicts in the average Python workload. Saving > 1/3 of the memory size on these dicts would be worthwhile IMO. Are you sure ? In the age of GB RAM, runtime performance appears to be more important than RAM usage. Moving the hash comparison out of the dict would likely cause (cache) locality to no longer trigger. > (addressing itself would perhaps be a bit simpler, because of > multiplying by 8 or 16 instead of multiplying by 12 or 24. But I doubt > the difference would be noticeable) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 29 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From solipsis at pitrou.net Fri Oct 29 11:06:22 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 29 Oct 2010 11:06:22 +0200 Subject: [Python-ideas] dict changes [was: Ordered storage of keyword arguments] In-Reply-To: <4CCA8B2A.2060702@egenix.com> References: <1288303911.3753.9.camel@localhost.localdomain> <4CCA8B2A.2060702@egenix.com> Message-ID: <1288343183.3565.26.camel@localhost.localdomain> Le vendredi 29 octobre 2010 ? 10:51 +0200, M.-A. Lemburg a ?crit : > > > > I've thought about that. The main annoyance is to be able to switch > > transparently between the two implementations. But I think it would be > > interesting to pursue that effort, since indeed dicts with interned keys > > are the most common case of dicts in the average Python workload. Saving > > 1/3 of the memory size on these dicts would be worthwhile IMO. > > Are you sure ? In the age of GB RAM, runtime performance appears > to be more important than RAM usage. 
> Moving the hash comparison out > of the dict would likely cause (cache) locality to no longer trigger. Good point. It probably depends on the collision rate. Also, a string key dict could be optimized for interned strings, in which case the hash comparison is unnecessary. (knowing whether the key is interned could be stored in e.g. the low-order bit of the key pointer) Regards Antoine. From mal at egenix.com Fri Oct 29 11:15:42 2010 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 29 Oct 2010 11:15:42 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <20101029104931.665ee663@pitrou.net> References: <4CC93095.3080704@egenix.com> <4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz> <4CCA8622.5060405@egenix.com> <20101029104931.665ee663@pitrou.net> Message-ID: <4CCA90BE.30206@egenix.com> Antoine Pitrou wrote: > On Fri, 29 Oct 2010 10:30:26 +0200 > "M.-A. Lemburg" wrote: >> Greg Ewing wrote: >>> M.-A. Lemburg wrote: >>>> Python has always tried >>>> to make the most common use case simple, so asking programmers to >>>> use a meta-class to be able to access the order of definitions >>>> in a class definition isn't exactly what the normal Python >>>> programmer would expect. >>> >>> But needing to know the order of definitions in a class >>> is a very uncommon thing to want to do in the first >>> place. >> >> I've already pointed to a couple of existing use cases where the >> authors had to play all sorts of tricks to access the order of >> such definitions. >> >> Since Python programs are executed sequentially (within the resp. >> scope) in the order given in the source file, it is quite natural >> to expect this order to be accessible somehow. >> >> If it were easier to access this order, a lot of the extra magic >> needed to map fixed order records to Python classes could go >> away. > > Interestingly, this order is already accessible on the code object used > to build the class namespace: > >>>> def f(): > ... class C: > ... x = 5 > ... def y(): pass > ... z = 6 > ... >>>> code = f.__code__.co_consts[1] >>>> code.co_names > ('__name__', '__module__', 'x', 'y', 'z') Interesting indeed and I kind of expected that order to be available somewhere via the compiler. Can this be generalized to arbitrary classes ? Would other Python implementations be able to provide the same information ? If so, we could add this order as .__deforder__ tuple to class objects. Thanks, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 29 2010) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From solipsis at pitrou.net Fri Oct 29 11:38:41 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 29 Oct 2010 11:38:41 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <4CCA90BE.30206@egenix.com> References: <4CC93095.3080704@egenix.com> <4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz> <4CCA8622.5060405@egenix.com> <20101029104931.665ee663@pitrou.net> <4CCA90BE.30206@egenix.com> Message-ID: <1288345121.3565.48.camel@localhost.localdomain> On Friday 29 October 2010 at 11:15 +0200, M.-A. Lemburg wrote: > > Would other Python implementations be able to provide the > same information ? Probably, yes. > Can this be generalized to arbitrary classes ? In CPython, it could be done by modifying the default __build_class__ function (which always gets called regardless of metaclasses and other stuff). Of course, it won't work if the metaclass forbids setting attributes on the class object. Here's a pure Python prototype: import builtins, types _old_build_class = builtins.__build_class__ def __build_class__(func, name, *bases, **kwds): cls = _old_build_class(func, name, *bases, **kwds) # Extract the code object used to create the class namespace co = func.__code__ cls.__deforder__ = tuple(n for n in co.co_names if n in cls.__dict__) return cls builtins.__build_class__ = __build_class__ class C: y = 5 z = staticmethod(len) def x(): pass print(C.__deforder__) From jh at improva.dk Fri Oct 29 12:13:16 2010 From: jh at improva.dk (Jacob Holm) Date: Fri, 29 Oct 2010 12:13:16 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCA752D.5090904@canterbury.ac.nz> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> Message-ID: <4CCA9E3C.8000604@improva.dk> On 2010-10-29 09:18, Greg Ewing wrote: > I've been pondering the whole close()-returning-a-value > thing I've convinced myself once again that it's a bad > idea. > And I still believe we could have made it work. However, I have been doing my own thinking about the whole of PEP 380, PEP 3152, for-loop co-iteration and so on. And I think I have an idea that improves the whole story. The main thing to note is that the expression form of yield-from is mostly intended to make it easier to have cofunctions that return a value, and that there is a problem with reusing StopIteration for that purpose. Now that we have an actual PEP 3152, we could choose to move the necessary support over there. Here is one way to do that: 1) Drop the current PEP 380 support for using "return <value>" inside a generator. That means no extended StopIteration and no expression form of "yield from". And since there are no return values, there is no problem with how "close" should treat them. 2) In PEP 3152, define "return <value>" in a cofunction to raise a new IterationResult exception with the value. (And treat falling off the edge of the function or returning without a value as "return None") 3) In PEP 3152, change the "cocall" expansion so that: <result> = cocall f(*args, **kwargs) Expands to: try: yield from f.__cocall__(*args, **kwargs) except IterationResult as e: <result> = e.value else: raise StopIteration (The same expansion would be used if cocalls are implicit of course). 
This ensures that a cofunction can raise StopIteration just as a regular function, which means we can extend the iterator protocol to support cofunctions with only minor changes. An interesting variation might be to keep the expression form of yield-from, but change its semantics so that it returns the StopIteration instance that was caught, instead of trying to extract a value. Then adding an IterationResult inheriting from StopIteration and using it for "return <value>" in a generator. That would make all current yield-from examples work with the minor change that the old: <x> = yield from <expr> would need to be written as <x> = (yield from <expr>).value And would have the benefit that the PEP 3152 expansion could reraise the actual StopIteration as in: e = yield from f.__cocall__(*args, **kwargs) if isinstance(e, IterationResult): <result> = e.value else: raise e The idea of returning the exception takes some getting used to, but it solves the problem with StopIteration and cofunctions, and I'm sure I can find some interesting uses for it by itself. Anyway.... Thoughts? - Jacob From steve at pearwood.info Fri Oct 29 13:10:21 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Oct 2010 22:10:21 +1100 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: References: <4CC93095.3080704@egenix.com> <4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz> <4CCA8622.5060405@egenix.com> Message-ID: <4CCAAB9D.3060909@pearwood.info> Carl M. Johnson wrote: > But, pending the creation of an odict of equal or greater speed, is > there any reason we can't just make due for now by having the > __prepare__ method of our relevant metaclasses return an odict? +1 Given that the need to care about the order of keyword arguments is likely to be rare, I'd like to see some recipes and/or metaclass helpers before changing the language. Besides... moratorium. -- Steven From guido at python.org Fri Oct 29 16:28:29 2010 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Oct 2010 07:28:29 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCA5AE4.7080403@canterbury.ac.nz> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> Message-ID: On Thu, Oct 28, 2010 at 10:25 PM, Greg Ewing wrote: > Guido van Rossum wrote: > >> I'd also like to convince you to change g.close() so that it captures >> and returns the return value from StopIteration if it has one. > > Looking at this again, I find that I'm not really sure how > this impacts PEP 380. The current expansion specifies that > when a delegating generator is closed, the subgenerator's > close() method is called, any value it returns is ignored, > and GeneratorExit is re-raised. > > If that close() call were to return a value, what do you > think should be done with it? I went over that myself in detail and ended up deciding that for "yield-from" nothing should be changed! The expansion in the PEP remains the same. But since this PEP also specifies "return value" it would be nice if there was a convenient way to capture this value, and close seems to be it. E.g. def gen(): total = 0 try: while True: total += yield except GeneratorExit: return total def main(): g = gen() for i in range(100): g.send(i) print(g.close()) This would print the total computed by gen(). 
-- --Guido van Rossum (python.org/~guido) From guido at python.org Fri Oct 29 21:13:18 2010 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Oct 2010 12:13:18 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCA752D.5090904@canterbury.ac.nz> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> Message-ID: On Fri, Oct 29, 2010 at 12:18 AM, Greg Ewing wrote: > I've been pondering the whole close()-returning-a-value > thing I've convinced myself once again that it's a bad > idea. > > Essentially the problem is that we're trying to make > the close() method, and consequently GeneratorExit, > serve two different and incompatible roles. > > One role (the one it currently serves) is as an > emergency bail-out mechanism. In that role, when we > have a stack of generators delegating via yield-from, > we want things to behave as thought the GeneratorExit > originates in the innermost one and propagates back > out of the entire stack. We don't want any of the > intermediate generators to catch it and turn it > into a StopIteration, because that would give the > next outer one the misleading impression that it's > business as usual, but it's not. This seems to be the crux of your objection. But if I look carefully at the expansion in the current version of PEP 380, I don't think this problem actually happens: If the outer generator catches GeneratorExit, it closes the inner generator (by calling its close method, if it exists) and then re-raises the GeneratorExit: except GeneratorExit as _e: try: _m = _i.close except AttributeError: pass else: _m() raise _e I would leave this expansion alone even if g.close() was changed to return the generator's return value. Could it be that you are thinking of your accelerated implementation, which IIRC has a shortcut whereby generator operations (next, send, throw) on the outer generator are *directly* passed to the inner generator when a yield-from is active? It looks to me as if using g.close() to capture the return value of a generator is not of much value when using yield-from, but it can be of value for the simpler pattern that started this thread. Here's an updated version: def gclose(gen): ## Not needed with PEP 380 try: gen.throw(GeneratorExit) except StopIteration as err: return err.args[0] except GeneratorExit: pass # Note: other exceptions are passed out untouched. 
return None def summer(): total = 0 try: while True: total += yield except GeneratorExit: raise StopIteration(total) ## return total def maxer(): highest = 0 try: while True: value = yield highest = max(highest, value) except GeneratorExit: raise StopIteration(highest) ## return highest def map_to_multiple(it, funcs): gens = [func() for func in funcs] # Create generators for gen in gens: next(gen) # Prime generators for value in it: for gen in gens: gen.send(value) return [gclose(gen) for gen in gens] ## [gen.close() for gen in gens] def main(): print(map_to_multiple(range(100), [summer, maxer])) main() -- --Guido van Rossum (python.org/~guido) From greg.ewing at canterbury.ac.nz Sat Oct 30 01:03:21 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 30 Oct 2010 12:03:21 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCA9E3C.8000604@improva.dk> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> <4CCA9E3C.8000604@improva.dk> Message-ID: <4CCB52B9.8010203@canterbury.ac.nz> Jacob Holm wrote: > The main thing to note is that the expression form of yield-from is > mostly intended to make it easier to have cofunctions that return a > value, and that there is a problem with reusing StopIteration for that > purpose. No, I don't think that's where the problem is, if I understand correctly which problem you're referring to. The fact that cofunctions as currently defined in PEP 3152 can't raise StopIteration and have it propagate to the caller is a problem with the use of StopIteration to signal the end of a cofunction. Whether the StopIteration carries a value or not is irrelevant. > And since there are no return values, there is no > problem with how "close" should treat them. There's no problem with that *now*, because close() is currently not defined as returning a value. A problem only arises if we try to overload close() to mean "no more data to send in, give me your result" as well as "bail out now and clean up". And as I pointed out, there are other ways of signalling end of data that work fine with things as they are. > 2) In PEP 3152, define "return " in a cofunction to raise a new > IterationResult exception with the value. That would have to apply to *all* forms of return, not just ones with a value. > = (yield from ).value > > have the benefit that the PEP 3152 expansion could reraise the > actual StopIteration as in: > > e = yield from f.__cocall__(*args, **kwargs) > if isinstance(e, IterationResult): > = e.value > else: > raise e There's another way to approach this: define cofunctions so that 'return' in one of its forms is the only way to raise an actual StopIteration, and any explicitly raised StopIteration gets wrapped in something else, such as CoStopIteration. The expansion would then be try: result = yield from f.__cocall__(*args, **kwargs) except CoStopIteration as e: raise e.value where e.value is the original StopIteration instance. This would have the advantage of not requiring any change to yield-from as it stands. 
-- Greg From greg.ewing at canterbury.ac.nz Sat Oct 30 01:30:54 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 30 Oct 2010 12:30:54 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> Message-ID: <4CCB592E.7030507@canterbury.ac.nz> Guido van Rossum wrote: > I went over that myself in detail and ended up deciding that for > "yield-from" nothing should be changed! The expansion in the PEP > remains the same. In that case, the proposal has nothing to do with PEP 380 and needn't be mentioned in it -- except perhaps to point out that using it in the presence of yield-from may not produce the expected result. > But since this PEP also specifies "return value" it would be nice if > there was a convenient way to capture this value, As long as you're willing to accept that if the generator you're closing is delegating using yield-from, the return value from the inner generator will get lost. To put it another way, if you design a generator to be used in this way (i.e. its caller using close() to finish it and get a value), you may find it awkward or impossible to later refactor it in certain ways. -- Greg From greg.ewing at canterbury.ac.nz Sat Oct 30 01:37:41 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 30 Oct 2010 12:37:41 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> Message-ID: <4CCB5AC5.8060907@canterbury.ac.nz> Guido van Rossum wrote: > But since this PEP also specifies "return value" it would be nice if > there was a convenient way to capture this value, and close seems to > be it. Sorry, I missed that bit -- you're right, it does need to be allowed for in PEP 380 if we're to do this. I'm still not convinced that it isn't a wrongheaded idea, though. The fact that it doesn't play well with yield-from gives off a very bad smell to me. It seems highly incongruous for the PEP to propose a feature that's incompatible with the main idea of the whole thing. -- Greg From jh at improva.dk Sat Oct 30 01:34:30 2010 From: jh at improva.dk (Jacob Holm) Date: Sat, 30 Oct 2010 01:34:30 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCB52B9.8010203@canterbury.ac.nz> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> <4CCA9E3C.8000604@improva.dk> <4CCB52B9.8010203@canterbury.ac.nz> Message-ID: <4CCB5A06.60408@improva.dk> On 2010-10-30 01:03, Greg Ewing wrote: > Jacob Holm wrote: > >> The main thing to note is that the expression form of yield-from is >> mostly intended to make it easier to have cofunctions that return a >> value, and that there is a problem with reusing StopIteration for that >> purpose. > > No, I don't think that's where the problem is, if I understand > correctly which problem you're referring to. The fact that cofunctions > as currently defined in PEP 3152 can't raise StopIteration and have it > propagate to the caller is a problem with the use of StopIteration > to signal the end of a cofunction. Exactly. 
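(A sketch for illustration, not from the original messages, written with the yield-from and return-value semantics as eventually implemented in Python 3.3: closing the outer generator makes the inner one compute its return value, but the PEP 380 expansion then re-raises GeneratorExit in the outer generator, so the value never reaches anyone.)

    def inner():
        total = 0
        try:
            while True:
                total += yield
        except GeneratorExit:
            return total               # 30 is computed here...

    def outer():
        subtotal = yield from inner()  # ...but this assignment never happens
        return subtotal * 2

    g = outer()
    next(g)
    g.send(10)
    g.send(20)
    print(g.close())                   # prints None; the 30 is discarded

(Guido's position in the adjacent messages is that needing the lost value after such a refactoring is the rare case; Greg's is that the loss makes close()-as-result-channel a poor fit for yield-from.)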
> Whether the StopIteration carries > a value or not is irrelevant. > It is relevant if we later want to distinguish between "return" and "raise StopIteration". >> And since there are no return values, there is no >> problem with how "close" should treat them. > > There's no problem with that *now*, because close() is currently > not defined as returning a value. A problem only arises if we > try to overload close() to mean "no more data to send in, give me > your result" as well as "bail out now and clean up". And as I > pointed out, there are other ways of signalling end of data that > work fine with things as they are. > That is what I meant. We were discussing whwther to add a new feature to PEP 380 inspired by having "return " in generators. If we dropped "return " from PEP 380 (with the intent of adding it to PEP 3152 instead), so would the basis for the new feature. End of discussion... AFAICT, adding these features in a consistent way is a lot easier in the context of PEP 3152. >> 2) In PEP 3152, define "return " in a cofunction to raise a new >> IterationResult exception with the value. > > That would have to apply to *all* forms of return, not just ones > with a value. > Of course. >> = (yield from ).value >> >> have the benefit that the PEP 3152 expansion could reraise the >> actual StopIteration as in: >> >> e = yield from f.__cocall__(*args, **kwargs) >> if isinstance(e, IterationResult): >> = e.value >> else: >> raise e > > There's another way to approach this: define cofunctions so that > 'return' in one of its forms is the only way to raise an actual > StopIteration, and any explicitly raised StopIteration gets wrapped > in something else, such as CoStopIteration. The expansion would > then be > > try: > result = yield from f.__cocall__(*args, **kwargs) > except CoStopIteration as e: > raise e.value > > where e.value is the original StopIteration instance. > > This would have the advantage of not requiring any change to > yield-from as it stands. > That's just ugly... I realize it could work, but I think that makes *both* PEPs more complex than necessary. My suggestion is to cut/change some features from PEP 380 that are in the way and then add them in a cleaner way to PEP 3152. This should simplify both PEPs, at the cost of reopening some of the earlier discussions. - Jacob From guido at python.org Sat Oct 30 01:45:28 2010 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Oct 2010 16:45:28 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCB592E.7030507@canterbury.ac.nz> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB592E.7030507@canterbury.ac.nz> Message-ID: On Fri, Oct 29, 2010 at 4:30 PM, Greg Ewing wrote: > Guido van Rossum wrote: > >> I went over that myself in detail and ended up deciding that for >> "yield-from" nothing should be changed! The expansion in the PEP >> remains the same. > > In that case, the proposal has nothing to do with PEP 380 > and needn't be mentioned in it -- except perhaps to point > out that using it in the presence of yield-from may > not produce the expected result. The connection is that it works well with returning values from generators, which *is* specified in PEP 380. So I think this does belong there. 
>> But since this PEP also specifies "return value" it would be nice if >> there was a convenient way to capture this value, > > As long as you're willing to accept that if the generator > you're closing is delegating using yield-from, the return > value from the inner generator will get lost. > > To put it another way, if you design a generator to be > used in this way (i.e. its caller using close() to finish > it and get a value), you may find it awkward or impossible > to later refactor it in certain ways. Only if after the refactoring the outer generator would need the return value of the interrupted yield-from expression in order to compute its return value. I think that's reasonable. (It might be possible to tweak the yield-from expansion so that the return value is assigned before GeneratorExit is re-raised, but that sounds fragile, and doesn't always apply, e.g. if the return value is not assigned to a local variable.) -- --Guido van Rossum (python.org/~guido) From jh at improva.dk Sat Oct 30 01:46:23 2010 From: jh at improva.dk (Jacob Holm) Date: Sat, 30 Oct 2010 01:46:23 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCB5AC5.8060907@canterbury.ac.nz> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> Message-ID: <4CCB5CCF.90504@improva.dk> On 2010-10-30 01:37, Greg Ewing wrote: > Guido van Rossum wrote: > >> But since this PEP also specifies "return value" it would be nice if >> there was a convenient way to capture this value, and close seems to >> be it. > > Sorry, I missed that bit -- you're right, it does need to be > allowed for in PEP 380 if we're to do this. I'm still not > convinced that it isn't a wrongheaded idea, though. The fact > that it doesn't play well with yield-from gives off a very > bad smell to me. > > It seems highly incongruous for the PEP to propose a feature > that's incompatible with the main idea of the whole thing. > Which is exactly why I'm suggesting dropping "return value" from PEP 380 and then doing it *right* in PEP 3152, which has a much better rationale for the "return value" feature anyway. - Jacob From ghazel at gmail.com Sat Oct 30 01:50:53 2010 From: ghazel at gmail.com (ghazel at gmail.com) Date: Fri, 29 Oct 2010 16:50:53 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCB5CCF.90504@improva.dk> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> Message-ID: On Fri, Oct 29, 2010 at 4:46 PM, Jacob Holm wrote: > On 2010-10-30 01:37, Greg Ewing wrote: >> Guido van Rossum wrote: >> >>> But since this PEP also specifies "return value" it would be nice if >>> there was a convenient way to capture this value, and close seems to >>> be it. >> >> Sorry, I missed that bit -- you're right, it does need to be >> allowed for in PEP 380 if we're to do this. I'm still not >> convinced that it isn't a wrongheaded idea, though. The fact >> that it doesn't play well with yield-from gives off a very >> bad smell to me. >> >> It seems highly incongruous for the PEP to propose a feature >> that's incompatible with the main idea of the whole thing. 
>> > > Which is exactly why I'm suggesting dropping "return value" from PEP 380 > and then doing it *right* in PEP 3152, which has a much better rationale > for the "return value" feature anyway. Why not split "return value" for generators in to its own PEP? There is currently a use case for it in frameworks which use generators for coroutines, without any dependency on PEP 380 or PEP 3152. -Greg From guido at python.org Sat Oct 30 01:54:36 2010 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Oct 2010 16:54:36 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCB5CCF.90504@improva.dk> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> Message-ID: On Fri, Oct 29, 2010 at 4:46 PM, Jacob Holm wrote: > On 2010-10-30 01:37, Greg Ewing wrote: >> Guido van Rossum wrote: >> >>> But since this PEP also specifies "return value" it would be nice if >>> there was a convenient way to capture this value, and close seems to >>> be it. >> >> Sorry, I missed that bit -- you're right, it does need to be >> allowed for in PEP 380 if we're to do this. I'm still not >> convinced that it isn't a wrongheaded idea, though. The fact >> that it doesn't play well with yield-from gives off a very >> bad smell to me. >> >> It seems highly incongruous for the PEP to propose a feature >> that's incompatible with the main idea of the whole thing. I don't think it is. > Which is exactly why I'm suggesting dropping "return value" from PEP 380 > and then doing it *right* in PEP 3152, which has a much better rationale > for the "return value" feature anyway. Oh, but I still don't like that PEP, and it has a much higher probability of failing completely. PEP 380 OTOH has my approval except for minor quibbles like g.close(). -- --Guido van Rossum (python.org/~guido) From guido at python.org Sat Oct 30 01:55:39 2010 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Oct 2010 16:55:39 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> Message-ID: On Fri, Oct 29, 2010 at 4:50 PM, wrote: > On Fri, Oct 29, 2010 at 4:46 PM, Jacob Holm wrote: >> On 2010-10-30 01:37, Greg Ewing wrote: >>> Guido van Rossum wrote: >>> >>>> But since this PEP also specifies "return value" it would be nice if >>>> there was a convenient way to capture this value, and close seems to >>>> be it. >>> >>> Sorry, I missed that bit -- you're right, it does need to be >>> allowed for in PEP 380 if we're to do this. I'm still not >>> convinced that it isn't a wrongheaded idea, though. The fact >>> that it doesn't play well with yield-from gives off a very >>> bad smell to me. >>> >>> It seems highly incongruous for the PEP to propose a feature >>> that's incompatible with the main idea of the whole thing. >>> >> >> Which is exactly why I'm suggesting dropping "return value" from PEP 380 >> and then doing it *right* in PEP 3152, which has a much better rationale >> for the "return value" feature anyway. > > Why not split "return value" for generators in to its own PEP? 
There > is currently a use case for it in frameworks which use generators for > coroutines, without any dependency on PEP 380 or PEP 3152. Either way it's not going in before Python 3.3... Aside from the moratorium, 3.2 is also too close to release. PS. Drop me a note to chat about Monocle. -- --Guido van Rossum (python.org/~guido) From jh at improva.dk Sat Oct 30 01:57:59 2010 From: jh at improva.dk (Jacob Holm) Date: Sat, 30 Oct 2010 01:57:59 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> Message-ID: <4CCB5F87.3020107@improva.dk> On 2010-10-30 01:50, ghazel at gmail.com wrote: > On Fri, Oct 29, 2010 at 4:46 PM, Jacob Holm wrote: >> Which is exactly why I'm suggesting dropping "return value" from PEP 380 >> and then doing it *right* in PEP 3152, which has a much better rationale >> for the "return value" feature anyway. > > Why not split "return value" for generators in to its own PEP? There > is currently a use case for it in frameworks which use generators for > coroutines, without any dependency on PEP 380 or PEP 3152. > You could do that, but I think 3152 would need to depend on the new PEP then. It would be quite strange to define a new class of "functions" and not have them able to return a value. - Jacob From jh at improva.dk Sat Oct 30 02:10:11 2010 From: jh at improva.dk (Jacob Holm) Date: Sat, 30 Oct 2010 02:10:11 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB592E.7030507@canterbury.ac.nz> Message-ID: <4CCB6263.9030602@improva.dk> On 2010-10-30 01:45, Guido van Rossum wrote: > On Fri, Oct 29, 2010 at 4:30 PM, Greg Ewing wrote: >> Guido van Rossum wrote: >>> But since this PEP also specifies "return value" it would be nice if >>> there was a convenient way to capture this value, >> >> As long as you're willing to accept that if the generator >> you're closing is delegating using yield-from, the return >> value from the inner generator will get lost. >> >> To put it another way, if you design a generator to be >> used in this way (i.e. its caller using close() to finish >> it and get a value), you may find it awkward or impossible >> to later refactor it in certain ways. > > Only if after the refactoring the outer generator would need the > return value of the interrupted yield-from expression in order to > compute its return value. I think that's reasonable. (It might be > possible to tweak the yield-from expansion so that the return value is > assigned before GeneratorExit is re-raised, but that sounds fragile, > and doesn't always apply, e.g. if the return value is not assigned to > a local variable.) > I have earlier proposed a simple change that would at least make the value available. Instead of reraising the original GeneratorExit after calling close on the subgenerator, you just raise a new GeneratorExit with the returned value as its first argument. Nick seemed in favor of this idea. - Jacob From cmjohnson.mailinglist at gmail.com Sat Oct 30 02:19:33 2010 From: cmjohnson.mailinglist at gmail.com (Carl M. 
Johnson) Date: Fri, 29 Oct 2010 14:19:33 -1000 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <4CCAAB9D.3060909@pearwood.info> References: <4CC93095.3080704@egenix.com> <4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz> <4CCA8622.5060405@egenix.com> <4CCAAB9D.3060909@pearwood.info> Message-ID: On Fri, Oct 29, 2010 at 1:10 AM, Steven D'Aprano wrote: > Given that the need to care about the order of keyword arguments is likely > to be rare, I'd like to see some recipes and/or metaclass helpers ?before > changing the language. The recipe is more or less directly in PEP 3115 (the PEP which established the __prepare__ attribute). There's not a lot to it. >>> class OrderedMetaclass(type): ... @classmethod ... def __prepare__(metacls, name, bases): # No keywords in this case ... return OrderedDict() ... ... def __new__(cls, name, bases, classdict): ... result = type.__new__(cls, name, bases, dict(classdict)) ... result.odict = classdict ... return result ... >>> >>> class OrderedClass(metaclass=OrderedMetaclass): ... a = 1 ... z = 2 ... b = 3 ... >>> OrderedClass.odict OrderedDict([('__module__', '__main__'), ('a', 1), ('z', 2), ('b', 3)]) Thinking about it a little more, if I were making an HTML tree type metaclass though, I wouldn't want to use an OrderedDict anyway, since it can't have duplicate elements, and I would want the interface to be something like: class body(Tree()): h1 = "Hello World!" p = "Lorem ipsum." p = "Dulce et decorum est." class div(Tree(id="content")): p = "Main thing" class div(Tree(id="footer")): p = "(C) 2010" So, I'd probably end up making my own custom kind of dict that didn't overwrite repeated names. Of course, for an ORM, you don't want repeated field names, so an OrderedDict would work. Anyway, this just goes to show how limited the applicability of switching to an odict in Python internals is. From jh at improva.dk Sat Oct 30 02:41:16 2010 From: jh at improva.dk (Jacob Holm) Date: Sat, 30 Oct 2010 02:41:16 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> Message-ID: <4CCB69AC.9020701@improva.dk> On 2010-10-30 01:54, Guido van Rossum wrote: > On Fri, Oct 29, 2010 at 4:46 PM, Jacob Holm wrote: >> Which is exactly why I'm suggesting dropping "return value" from PEP 380 >> and then doing it *right* in PEP 3152, which has a much better rationale >> for the "return value" feature anyway. > > Oh, but I still don't like that PEP, and it has a much higher > probability of failing completely. PEP 380 OTOH has my approval except > for minor quibbles like g.close(). > I agree that PEP 3152 is far from perfect at this point, but I like the basics. The reason I am so concerned with the "return value" semantics is that I see some problems we are having in PEP 3152 as indicating a likely flaw/misfeature in PEP 380. I would be much happier with both PEPs if they didn't conflict in this way. So much so, that I would rather miss a few features in PEP 380 in the *hope* of getting them right later with another PEP. To quote the Zen: "never is often better than *right* now" A PEP just for the "return value" shouldn't be too hard to add later if PEP 3152 doesn't work out, and at that point we should have a better idea about the best way of doing it. 
- Jacob From guido at python.org Sat Oct 30 03:15:57 2010 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Oct 2010 18:15:57 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCB69AC.9020701@improva.dk> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> <4CCB69AC.9020701@improva.dk> Message-ID: On Fri, Oct 29, 2010 at 5:41 PM, Jacob Holm wrote: > On 2010-10-30 01:54, Guido van Rossum wrote: >> On Fri, Oct 29, 2010 at 4:46 PM, Jacob Holm wrote: >>> Which is exactly why I'm suggesting dropping "return value" from PEP 380 >>> and then doing it *right* in PEP 3152, which has a much better rationale >>> for the "return value" feature anyway. >> >> Oh, but I still don't like that PEP, and it has a much higher >> probability of failing completely. PEP 380 OTOH has my approval except >> for minor quibbles like g.close(). > I agree that PEP 3152 is far from perfect at this point, but I like the > basics. I thought the basics weren't even decided? Implicit definitions, or implicit cocalls, terminology to be used, how to implement in Jython or IronPython, probably more (I can't keep the ideas of that PEP in my head so I end up blanking out on any discussion that mentions it). I truly wish it was easier to experiment with syntax -- it would be so much simpler if these PEPs could be accompanied by a library that people can just import to use the new syntax (even if it's a C extension) rather than by a patch to the core language. The need to "get it right in one shot" is keeping back the ability to experiment at any realistic scale, so all we see (on all sides) are trivial examples that may highlight proposed features and anticipated problems, but this is no way to gain experience with what the *real* problems would be. > The reason I am so concerned with the "return value" semantics > is that I see some problems we are having in PEP 3152 as indicating a > likely flaw/misfeature in PEP 380. ?I would be much happier with both > PEPs if they didn't conflict in this way. If there was a separate PEP specifying *just* returning a value from a generator and how to get at that value using g.close(), without yield-from, would those problems still exist? If not, that would be a reason to move those out in a separate PEP. Assume such a PEP (call it PEP X) existed, what would be the dependency tree? What would be the conflicts? Would PEP 3152 make sense with PEP X but without (the rest of) PEP 380? > So much so, that I would rather miss a few features in PEP 380 in the > *hope* of getting them right later with another PEP. Can you be specific? Which features? > To quote the Zen: > > ?"never is often better than *right* now" Um, Python 3.3 can hardly be referred to as "*right* now". There are plenty of arguments in the zen for PEP X, especially "If the implementation is easy to explain, it may be a good idea." Both returning a value from a generator and catching that value in g.close() are really easy to implement and the implementation is easy to explain. It's a small evolution from the current generator code. > A PEP just for the "return value" shouldn't be too hard to add later if > PEP 3152 doesn't work out, and at that point we should have a better > idea about the best way of doing it. It would a small miracle if PEP 3152 worked out. 
I'd much rather have a solid fallback position now. I'm not pushing for rushing PEP X to acceptance -- I'm just hoping we can write it now and discuss it on its own merits without too much concern for PEP 3152 or even PEP 380, although I personally still think that the interference with PEP 380 would minimal and not a reason for changing PEP X. BTW I don't think I like piggybacking a return value on GeneratorExit. Before you know it people will be writing except blocks catching GeneratorExit intending to catch one coming from inside but accidentally including a yield in the try block and catching one coming from the outside. The nice thing about how GeneratorExit works today is that you needn't worry about it coming from inside, since it always comes from the outside *first*. This means that if you catch a GeneratorExit, it is either one you threw into a generator yourself (it just bounced back, meaning the generator didn't handle it at all), or one that was thrown into you. But the pattern of catching GeneratorExit and responding by returning a value is a reasonable extension of the pattern of catching GeneratorExit and doing other cleanup. -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Sat Oct 30 05:07:41 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 30 Oct 2010 13:07:41 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> <4CCB69AC.9020701@improva.dk> Message-ID: On Sat, Oct 30, 2010 at 11:15 AM, Guido van Rossum wrote: > BTW I don't think I like piggybacking a return value on GeneratorExit. > Before you know it people will be writing except blocks catching > GeneratorExit intending to catch one coming from inside but > accidentally including a yield in the try block and catching one > coming from the outside. The nice thing about how GeneratorExit works > today is that you needn't worry about it coming from inside, since it > always comes from the outside *first*. This means that if you catch a > GeneratorExit, it is either one you threw into a generator yourself > (it just bounced back, meaning the generator didn't handle it at all), > or one that was thrown into you. But the pattern of catching > GeneratorExit and responding by returning a value is a reasonable > extension of the pattern of catching GeneratorExit and doing other > cleanup. (TLDR version: I'm -1 on Guido's modified close() semantics if there is no way to get the result out of a yield from expression that is terminated by GeneratorExit, but I'm +1 if we tweak PEP 380 to make the result available on the reraised GeneratorExit instance, thus allowing framework authors to develop ways to correctly unwind a generator stack in response to close()) Stepping back a bit, let's look at the ways a framework may "close" a generator-based operation (or substep of a generator). 1. Send in a sentinel value (often None, but you could easily reuse the exception types as sentinel values as well) 2. Throw in GeneratorExit explicitly 3. Throw in StopIteration explicitly 4. Throw in a different specific exception 5. Call g.close() Having close() return a value only helps with the last option, and only if the coroutine is set up to work that way. 
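For contrast, option 1 needs no cooperation from close() at all. A minimal sketch (illustrative names; it relies only on the PEP 380 behaviour of "return" in a generator raising StopIteration with a value):

def tally_until_sentinel():
    count = tally = 0
    while True:
        value = yield
        if value is None:              # None is the agreed "we're done" sentinel
            return count, tally
        count += 1
        tally += value

def drive():
    g = tally_until_sentinel()
    next(g)
    for x in (1, 2, 3):
        g.send(x)
    try:
        g.send(None)                   # send the sentinel instead of calling close()
    except StopIteration as ex:
        return ex.value                # (3, 6), delivered without any close() support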
Yield from also isn't innately set up to unwind correctly in any of these cases, without some form of framework based signalling from the inner generator to indicate whether or not the outer generator should continue or bail out. Now, *if* close() were set up to return a value, then that second point makes the idea less useful than it appears. To go back to the simple summing example (not my too-complicated-for-a-mailing-list-discussion version which I'm not going to try to rehabilitate):

def gtally():
    count = tally = 0
    try:
        while 1:
            tally += yield
            count += 1
    except GeneratorExit:
        pass
    return count, tally

Fairly straightforward, but one of the promises of PEP 380 is that it allows us to factor out some or all of a generator's internal logic without affecting the externally visible semantics. So, let's do that:

def gtally2():
    return (yield from gtally())

Unless the PEP 380 yield from expansion is changed, Guido's proposed "close() returns the value on StopIteration" just broke this equivalence for gtally2() - since the yield from expansion turns the StopIteration back into a GeneratorExit, the return value of gtally2.close is always going to be None instead of the expected (count, tally) 2-tuple. Since the value of the internal call to close() is thrown away completely, there is absolutely nothing the author of gtally2() can do to fix it (aside from not using yield from at all). To me, if Guido's idea is adopted, this outcome is as illogical and unacceptable as the following returning None:

def sum2(seq):
    return sum(seq)

We already thrashed out long ago that the yield from handling of GeneratorExit needs to work the way it does in order to serve its primary purpose of releasing resources, so allowing the inner StopIteration to propagate with the exception value attached is not an option. The question is whether or not there is a way to implement the return-value-from-close() idiom in a way that *doesn't* completely break the equivalence between gtally() and gtally2() above. I think there is: store the prospective return-value on the GeneratorExit instance and have the yield from expansion provide the most recent return value as it unwinds the stack. To avoid giving false impressions as to which level of the stack return values are from, gtally2() would need to be implemented a bit differently in order to *also* convert GeneratorExit to StopIteration:

def gtally2():
    # The PEP 380 equivalent of a "tail call" if g.close() returns a value
    try:
        yield from gtally()
    except GeneratorExit as ex:
        return ex.value

Specific proposed additions/modifications to PEP 380:

1. The new "value" attribute is added to GeneratorExit as well as StopIteration and is explicitly read/write

2. The semantics of the generator close method are modified to be:

def close(self):
    try:
        self.throw(GeneratorExit)
    except StopIteration as ex:
        return ex.value
    except GeneratorExit:
        return None # Ignore the value, as it didn't come from the outermost generator
    raise RuntimeError("Generator ignored GeneratorExit")

3.
The GeneratorExit handling semantics for the yield from expansion are modified to be: except GeneratorExit as _e: try: _m = _i.close except AttributeError: pass else: _e.value = _m() # Store close() result on the exception raise _e With these modifications, a framework could then quite easily provide a context manager to make the idiom a little more readable and hide the fact that GeneratorExit is being caught at all: class GenResult(): def __init__(self): self.value = None @contextmanager def generator_return(): result = GenResult() try: yield except GeneratorExit as ex: result.value = ex.value def gtally(): # The CM suppresses GeneratorExit, allowing us # to convert it to StopIteration count = tally = 0 with generator_return(): while 1: tally += yield count += 1 return count, tally def gtally2(): # The CM *also* collects the value of any inner # yield from expression, allowing easier tail calls with generator_return() as result: yield from gtally() return result.value Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From greg.ewing at canterbury.ac.nz Sat Oct 30 05:09:04 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 30 Oct 2010 16:09:04 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> Message-ID: <4CCB8C50.2010404@canterbury.ac.nz> Guido van Rossum wrote: > This seems to be the crux of your objection. But if I look carefully > at the expansion in the current version of PEP 380, I don't think this > problem actually happens: If the outer generator catches > GeneratorExit, it closes the inner generator (by calling its close > method, if it exists) and then re-raises the GeneratorExit: Yes, but if you want close() to cause the generator to finish normally, you *don't* want that to happen. You would have to surround the yield-from call with a try block to catch the GeneratorExit, and even then you would lose the return value from the inner generator, which you're probably going to want. > Could it be that you are thinking of your accelerated implementation, No, not really. The same issues arise either way. > It looks to me as if using g.close() to capture the return value of a > generator is not of much value when using yield-from, but it can be of > value for the simpler pattern that started this thread. My concern is that this feature would encourage designing generators with APIs that make it difficult to refactor the implementation using yield-from later on. Simple things don't always stay simple. > def summer(): > total = 0 > try: > while True: > total += yield > except GeneratorExit: > raise StopIteration(total) ## return total I don't see how this gains you much. The generator is about as complicated either way. The only thing that's simpler is the final step of getting the result, which in my version can be taken care of with a fairly generic helper function that could be provided by the stdlib. -- Greg From guido at python.org Sat Oct 30 05:11:24 2010 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Oct 2010 20:11:24 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CC63065.9040507@improva.dk> References: <4CC63065.9040507@improva.dk> Message-ID: On Mon, Oct 25, 2010 at 6:35 PM, Jacob Holm wrote: > I had some later suggestions for how to change the expansion, see e.g. 
> http://mail.python.org/pipermail/python-ideas/2009-April/004195.html (I > find that version easier to reason about even now 1½ years later) I like that style too. Here it is with some annotations.

_i = iter(EXPR)
_m, _a = next, (_i,)
# _m is a function or a bound method;
# _a is a tuple of arguments to call _m with;
# both are set to other values further down
while 1:
    # Move the generator along
    try:
        _y = _m(*_a)
    except StopIteration as _e:
        _r = _e.value
        break
    # Yield _y and process what came back in
    try:
        _s = yield _y
    except GeneratorExit as _e:
        # Request to exit
        try:
            # NOTE: This _m is unrelated to the other
            _m = _i.close
        except AttributeError:
            pass
        else:
            _m()
        raise _e  # Always exit
    except BaseException as _e:
        # An exception was thrown in; pass it along
        _a = sys.exc_info()
        try:
            _m = _i.throw
        except AttributeError:
            # Can't throw it in; throw it back out
            raise _e
    else:
        # A value was sent in; pass it along
        if _s is None:
            _m, _a = next, (_i,)
        else:
            _m, _a = _i.send, (_s,)
RESULT = _r

I do note that this is a bit subtle; I don't like the reusing of _m and it's hard to verify that _m and _a are set on every path that goes back to the top of the loop. -- --Guido van Rossum (python.org/~guido) From greg.ewing at canterbury.ac.nz Sat Oct 30 05:17:50 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 30 Oct 2010 16:17:50 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCB5A06.60408@improva.dk> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> <4CCA9E3C.8000604@improva.dk> <4CCB52B9.8010203@canterbury.ac.nz> <4CCB5A06.60408@improva.dk> Message-ID: <4CCB8E5E.5080504@canterbury.ac.nz> Jacob Holm wrote: > It is relevant if we later want to distinguish between "return" and > "raise StopIteration". We want to distinguish between return *without* a value and StopIteration too. > My suggestion is to cut/change some features from PEP 380 that are in > the way But having StopIteration carry a value is *not* one of the things that's in the way, as far as I can see. -- Greg From guido at python.org Sat Oct 30 05:26:40 2010 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Oct 2010 20:26:40 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCB8C50.2010404@canterbury.ac.nz> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> <4CCB8C50.2010404@canterbury.ac.nz> Message-ID: On Fri, Oct 29, 2010 at 8:09 PM, Greg Ewing wrote: > Guido van Rossum wrote: > >> This seems to be the crux of your objection. But if I look carefully >> at the expansion in the current version of PEP 380, I don't think this >> problem actually happens: If the outer generator catches >> GeneratorExit, it closes the inner generator (by calling its close >> method, if it exists) and then re-raises the GeneratorExit: > > Yes, but if you want close() to cause the generator to finish > normally, you *don't* want that to happen. You would have to > surround the yield-from call with a try block to catch the > GeneratorExit, Yeah, putting such a try-block around yield from works just as it works around plain yield: it captures the GeneratorExit thrown in.
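For instance, with the PEP 380 semantics just described (an illustrative sketch; the generator names are made up for the example):

def inner():
    try:
        while True:
            yield
    finally:
        print("inner cleaned up")              # the subgenerator is closed first

def outer():
    try:
        yield from inner()
    except GeneratorExit:
        print("outer caught GeneratorExit")    # execution resumes here; the yield from never completes
        raise

g = outer()
next(g)
g.close()    # prints "inner cleaned up", then "outer caught GeneratorExit"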
As a bonus, the inner generator is first closed, but the yield-from expression which was interrupted is not completed; just like anything else that raises an exception, execution of the code stops immediately and resumes at the except block. > and even then you would lose the return value > from the inner generator, which you're probably going to > want. Really? Can you show a realistic use case? (There was Nick's average-of-sums example but I think nobody liked it.) >> Could it be that you are thinking of your accelerated implementation, > > No, not really. The same issues arise either way. Ok. >> It looks to me as if using g.close() to capture the return value of a >> generator is not of much value when using yield-from, but it can be of >> value for the simpler pattern that started this thread. > > My concern is that this feature would encourage designing > generators with APIs that make it difficult to refactor the > implementation using yield-from later on. Simple things > don't always stay simple. Yeah, but there is also YAGNI. We shouldn't plan every simple thing to become complex; in fact we should expect most simple things to stay simple. Otherwise you'd never use lists and dicts but start with classes right away. >> def summer(): >> ?total = 0 >> ?try: >> ? ?while True: >> ? ? ?total += yield >> ?except GeneratorExit: >> ? ?raise StopIteration(total) ?## return total > > I don't see how this gains you much. The generator is about > as complicated either way. I'm just concerned about the following: > The only thing that's simpler is the final step of getting > the result, which in my version can be taken care of with > a fairly generic helper function that could be provided > by the stdlib. In my case too -- it would just be a method on the generator named close(). :-) In addition I like merging use cases that have some overlap, if the non-overlapping parts do not conflict. E.g. I believe the reason we all ended agreeing (at least last year :-) that returning a value should be done through StopIteration was that this makes it so that "return", "return None", "return " and falling of the end of the block are treated uniformly so that equivalences apply both ways. In the case of close(), I *like* that the response to close() can be either cleaning up or returning a value and that close() doesn't care which of the two you do (and in fact it can't tell the difference). -- --Guido van Rossum (python.org/~guido) From guido at python.org Sat Oct 30 05:47:04 2010 From: guido at python.org (Guido van Rossum) Date: Fri, 29 Oct 2010 20:47:04 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> <4CCB69AC.9020701@improva.dk> Message-ID: On Fri, Oct 29, 2010 at 8:07 PM, Nick Coghlan wrote: > On Sat, Oct 30, 2010 at 11:15 AM, Guido van Rossum wrote: >> BTW I don't think I like piggybacking a return value on GeneratorExit. >> Before you know it people will be writing except blocks catching >> GeneratorExit intending to catch one coming from inside but >> accidentally including a yield in the try block and catching one >> coming from the outside. The nice thing about how GeneratorExit works >> today is that you needn't worry about it coming from inside, since it >> always comes from the outside *first*. 
This means that if you catch a >> GeneratorExit, it is either one you threw into a generator yourself >> (it just bounced back, meaning the generator didn't handle it at all), >> or one that was thrown into you. But the pattern of catching >> GeneratorExit and responding by returning a value is a reasonable >> extension of the pattern of catching GeneratorExit and doing other >> cleanup. > > (TLDR version: I'm -1 on Guido's modified close() semantics if there > is no way to get the result out of a yield from expression that is > terminated by GeneratorExit, but I'm +1 if we tweak PEP 380 to make > the result available on the reraised GeneratorExit instance, thus > allowing framework authors to develop ways to correctly unwind a > generator stack in response to close()) > > Stepping back a bit, let's look at the ways a framework may "close" a > generator-based operation (or substep of a generator). > > 1. Send in a sentinel value (often None, but you could easily reuse > the exception types as sentinel values ?as well) > 2. Throw in GeneratorExit explicitly > 3. Throw in StopIteration explicitly Throwing in StopIteration seems more unnatural than any other option. > 4. Throw in a different specific exception > 5. Call g.close() > > Having close() return a value only helps with the last option, and > only if the coroutine is set up to work that way. Yield from also > isn't innately set up to unwind correctly in any of these cases, > without some form of framework based signalling from the inner > generator to indicate whether or not the outer generator should > continue or bail out. Yeah, there is definitely some kind of convention needed here. A framework or app can always choose not to use g.close() for this purpose (heck, several current frameworks use yield to return a value) and in some cases that's just the right thing. Just like in other flow control situations you can often choose between sentinel values, exceptions, or something else (e.g. flag variables that must be explicitly tested). > Now, *if* close() were set up to return a value, then that second > point makes the idea less useful than it appears. To go back to the > simple summing example (not my > too-complicated-for-a-mailing-list-discussion version which I'm not > going to try to rehabilitate): > > def gtally(): > ?count = tally = 0 > ?try: > ? ?while 1: > ? ? ?tally += yield > ? ? ?count += 1 > ?except GeneratorExit: > ? ?pass > ?return count, tally I like this example. > Fairly straightforward, but one of the promises of PEP 380 is that it > allows us to factor out some or all of a generator's internal logic > without affecting the externally visible semantics. So, let's do that: > > ?def gtally2(): > ? ?return (yield from gtally()) And I find this a good starting point. > Unless the PEP 380 yield from expansion is changed, Guido's proposed > "close() returns the value on StopIteration" just broke this > equivalence for gtally2() - since the yield from expansion turns the > StopIteration back into a GeneratorExit, the return value of > gtally2.close is always going to be None instead of the expected > (count, tally) 2-tuple. Since the value of the internal call to > close() is thrown away completely, there is absolute nothing the > author of gtally2() can do to fix it (aside from not using yield from > at all). Right, they could do something based on the (imperfect) equivalency between "yield from f()" and "for x in f(): yield x". 
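Concretely, such a workaround amounts to hand-rolled delegation (a sketch only, reusing gtally() from earlier in the thread; throw() forwarding and the other details handled by the real expansion are deliberately omitted):

def gtally2_manual():
    inner = gtally()
    value = next(inner)                  # prime the inner generator
    while True:
        try:
            sent = yield value
        except GeneratorExit:
            # Finish the inner generator ourselves and pass its result
            # along, instead of letting the yield from expansion discard it.
            try:
                inner.throw(GeneratorExit)
            except StopIteration as ex:
                return ex.value
            raise
        try:
            value = inner.send(sent)
        except StopIteration as ex:
            return ex.value

Under the proposed close() semantics this version would report gtally()'s result, at the cost of giving up yield from entirely.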
> To me, if Guido's idea is adopted, this outcome is as > illogical and unacceptable as the following returning None: > > ?def sum2(seq): > ? ?return sum(seq) Maybe. > We already thrashed out long ago that the yield from handling of > GeneratorExit needs to work the way it does in order to serve its > primary purpose of releasing resources, so allowing the inner > StopIteration to propagate with the exception value attached is not an > option. > > The question is whether or not there is a way to implement the > return-value-from-close() idiom in a way that *doesn't* completely > break the equivalence between gtally() and gtally2() above. I think > there is: store the prospective return-value on the GeneratorExit > instance and have the yield from expansion provide the most recent > return value as it unwinds the stack. > > To avoid giving false impressions as to which level of the stack > return values are from, gtally2() would need to be implemented a bit > differently in order to *also* convert GeneratorExit to StopIteration: > > ?def gtally2(): > ? ?# The PEP 380 equivalent of a "tail call" if g.close() returns a value > ? ?try: > ? ? ?yield from gtally() > ? ?except GeneratorExit as ex: > ? ? ?return ex.value Unfortunately this misses the goal of equivalency between gtally() and your original gtally2() by a mile. Having to add extra except clauses around each yield-from IMO defeats the purpose. > Specific proposed additions/modifications to PEP 380: > > 1. The new "value" attribute is added to GeneratorExit as well as > StopIteration and is explicitly read/write I already posted an argument against this. > 2. The semantics of the generator close method are modified to be: > > ?def close(self): > ? ?try: > ? ? ?self.throw(GeneratorExit) > ? ?except StopIteration as ex: > ? ? ?return ex.value > ? ?except GeneratorExit: > ? ? ?return None # Ignore the value, as it didn't come from the > outermost generator > ? ?raise RuntimeError("Generator ignored GeneratorExit") > > 3. ?The GeneratorExit handling semantics for the yield from expansion > are modified to be: > > ? ? ? ?except GeneratorExit as _e: > ? ? ? ? ? ?try: > ? ? ? ? ? ? ? ?_m = _i.close > ? ? ? ? ? ?except AttributeError: > ? ? ? ? ? ? ? ?pass > ? ? ? ? ? ?else: > ? ? ? ? ? ? ? ?_e.value = _m() # Store close() result on the exception > ? ? ? ? ? ?raise _e > > With these modifications, a framework could then quite easily provide > a context manager to make the idiom a little more readable and hide > the fact that GeneratorExit is being caught at all: > > class GenResult(): > ? ?def __init__(self): self.value = None > > @contextmanager > def generator_return(): > ? ?result = GenResult() > ? ?try: > ? ? ?yield > ? ?except GeneratorExit as ex: > ? ? ?result.value = ex.value > > def gtally(): > ?# The CM suppresses GeneratorExit, allowing us > ?# to convert it to StopIteration > ?count = tally = 0 > ?with generator_return(): > ? ?while 1: > ? ? ?tally += yield > ? ? ?count += 1 > ?return count, tally > > def gtally2(): > ?# The CM *also* collects the value of any inner > ?# yield from expression, allowing easier tail calls > ?with generator_return() as result: > ? ?yield from gtally() > ?return result.value I agree that you've poked a hole in my proposal. If we can change the expansion of yield-from to restore the equivalency between gtally() and the simplest gtally2(), thereby restoring the original refactoring principle, we might be able to save it. Otherwise I declare defeat. 
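Spelled out, the equivalence being asked for is roughly the following (a sketch of the desired behaviour under the proposed close(), not of any particular expansion):

def drive(make_gen, values):
    g = make_gen()
    next(g)
    for v in values:
        g.send(v)
    return g.close()    # under the proposal: the generator's return value

# Desired property: for any sequence of values,
#   drive(gtally, values) == drive(gtally2, values)
# where gtally2 is the plain "return (yield from gtally())" version.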
Right now I am too tired to think of such an expansion, but I recall trying my hand at one a few nights ago and realizing that I'd introduced another problem. So this does not look too hopeful, especially since I really don't like extending GeneratorExit for the purpose. -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Sat Oct 30 05:48:30 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 30 Oct 2010 13:48:30 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> <4CCB8C50.2010404@canterbury.ac.nz> Message-ID: On Sat, Oct 30, 2010 at 1:26 PM, Guido van Rossum wrote: > Really? Can you show a realistic use case? (There was Nick's > average-of-sums example but I think nobody liked it.) Yeah, I'm much happier with the tally example. It got rid of all the irrelevant framework-y parts :) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Sat Oct 30 06:05:20 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 30 Oct 2010 14:05:20 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> <4CCB69AC.9020701@improva.dk> Message-ID: On Sat, Oct 30, 2010 at 1:47 PM, Guido van Rossum wrote: > I agree that you've poked a hole in my proposal. If we can change the > expansion of yield-from to restore the equivalency between gtally() > and the simplest gtally2(), thereby restoring the original refactoring > principle, we might be able to save it. Otherwise I declare defeat. > Right now I am too tired to think of such an expansion, but I recall > trying my hand at one a few nights ago and realizing that I'd > introduced another problem. So this does not look too hopeful, > especially since I really don't like extending GeneratorExit for the > purpose. I tried to make the original version work as well, but always ran into one of two problems: - breaking GeneratorExit for resource cleanup - "leaking" inner return values so they looked like they came from the outer function. Here's a crazy idea though. What if gtally2() could be written as follows: def gtally2(): return from gtally() If we make the generator tail call explicit, then the interpreter can do the right thing (i.e. raise StopIteration-with-a-value instead of reraising GeneratorExit) and we don't need to try to shoehorn two different sets of semantics into the single yield-from construct. 
To give some formal semantics to the new statement: # RETURN FROM semantics _i = iter(EXPR) _m, _a = next, (_i,) # _m is a function or a bound method; # _a is a tuple of arguments to call _m with; # both are set to other values further down while 1: # Move the generator along # Unlike YIELD FROM, we allow StopIteration to # escape, since this is a tail call _y = _m(*_a) # Yield _y and process what came back in try: _s = yield _y except GeneratorExit as _e: # Request to exit try: # Don't reuse _m, since we're bailing out of the loop _close = _i.close except AttributeError: pass else: # Unlike YIELD FROM, we use StopIteration # to return the value of the inner close() call raise StopIteration(_close()) # If there is no inner close() attribute, return None raise StopIteration except BaseException as _e: # An exception was thrown in; pass it along _a = sys.exc_info() try: _m = _i.throw except AttributeError: # Can't throw it in; throw it back out raise _e else: # A value was sent in; pass it along if _s is None: _m, _a = next, (_i,) else: _m, _a = _i.send, (_s,) # Unlike YIELD FROM, this is a statement, so there is no RESULT Summary of the differences between return from and yield from: - statement, not an expression - an inner StopIteration is allowed to propogate - a thrown in GeneratorExit is converted to StopIteration Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Sat Oct 30 06:26:48 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 30 Oct 2010 14:26:48 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> <4CCB69AC.9020701@improva.dk> Message-ID: On Sat, Oct 30, 2010 at 2:05 PM, Nick Coghlan wrote: > To give some formal semantics to the new statement: Oops, those would be some-formal-but-incorrect semantics that made the StopIteration exceptions visible in the current frame. Fixed below to actually kill the current frame properly, letting the generator instance take care of turning the return value into a StopIteration exception. # RETURN FROM semantics _i = iter(EXPR) _m, _a = next, (_i,) # _m is a function or a bound method; # _a is a tuple of arguments to call _m with; # both are set to other values further down while 1: # Move the generator along # Unlike YIELD FROM, we convert StopIteration # into an immediate return (since this is a tail call) try: _y = _m(*_a) except StopIteration as _e: return _e.value # Yield _y and process what came back in try: _s = yield _y except GeneratorExit as _e: # Request to exit try: # Don't reuse _m, since we're bailing out of the loop _close = _i.close except AttributeError: pass else: # Unlike YIELD FROM, we return the # value of the inner close() call return _close() # If there is no inner close() attribute, # we just return None return except BaseException as _e: # An exception was thrown in; pass it along _a = sys.exc_info() try: _m = _i.throw except AttributeError: # Can't throw it in; throw it back out raise _e else: # A value was sent in; pass it along if _s is None: _m, _a = next, (_i,) else: _m, _a = _i.send, (_s,) # Unlike YIELD FROM, this is a statement, so there is no RESULT -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From greg.ewing at canterbury.ac.nz Sat Oct 30 08:12:54 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 30 Oct 2010 19:12:54 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> <4CCB69AC.9020701@improva.dk> Message-ID: <4CCBB766.3020104@canterbury.ac.nz> Guido van Rossum wrote: > I thought the basics weren't even decided? I'll be posting a new version soon that ought to pin things down more precisely, then we'll have something to talk about. > I truly wish it was easier to experiment with syntax -- it would be so > much simpler if these PEPs could be accompanied by a library that > people can just import to use the new syntax Hmmm. Maybe if there were an option to use a parser and compiler written in pure Python? It wouldn't be fast, but it would be easier to hack experimental features into. > If there was a separate PEP specifying *just* returning a value from a > generator and how to get at that value using g.close(), without > yield-from, would those problems still exist? I don't think it's necessary to move the value-returning part into another PEP, because it doesn't conflict with anything. But close() returning that value could easily be moved into another PEP that depended on PEP 380. PEP 3152 would still depend on 380, not on the new PEP. > Would PEP 3152 make sense with PEP X but without (the rest > of) PEP 380? For PEP 3152 to *not* depend on PEP 380, it would have to duplicate almost all of PEP 380's content. -- Greg From greg.ewing at canterbury.ac.nz Sat Oct 30 08:13:29 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 30 Oct 2010 19:13:29 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> <4CCB69AC.9020701@improva.dk> Message-ID: <4CCBB789.4030208@canterbury.ac.nz> Nick Coghlan wrote: > 1. Send in a sentinel value (often None, but you could easily reuse > the exception types as sentinel values as well) > 2. Throw in GeneratorExit explicitly > 3. Throw in StopIteration explicitly > 4. Throw in a different specific exception > 5. Call g.close() > > Yield from also > isn't innately set up to unwind correctly in any of these cases, On the contrary, I think it works perfectly well with 1, and also with 4 as long as the inner generator catches it in the right place. Note that you *don't* want to unwind in this situation, you want to continue with normal processing, in the same way that a function reading from a file continues with normal processing when it reaches the end of the file. > without some form of framework based signalling from the inner > generator to indicate whether or not the outer generator should > continue or bail out. No such signalling is necessary -- all it needs to do is return in the normal way. Or to put it another way, from the yield-from statement's point of view, the signal is that it raised StopIteration and not GeneratorExit. 
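For example, option 4 already works through a plain yield from, with no unwinding involved (illustrative names; this relies only on PEP 380 forwarding throw() to the inner generator):

class Done(Exception):
    pass

def inner_tally():
    count = tally = 0
    while True:
        try:
            value = yield
        except Done:                              # caught "in the right place"
            return count, tally
        count += 1
        tally += value

def outer_average():
    count, tally = yield from inner_tally()       # Done is forwarded to inner_tally
    return tally / count                          # normal processing continues here

def demo():
    g = outer_average()
    next(g)
    for x in (2, 4, 6):
        g.send(x)
    try:
        g.throw(Done)
    except StopIteration as ex:
        return ex.value                           # 4.0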
> To avoid giving false impressions as to which level of the stack > return values are from, gtally2() would need to be implemented a bit > differently in order to *also* convert GeneratorExit to StopIteration: > > def gtally2(): > # The PEP 380 equivalent of a "tail call" if g.close() returns a value > try: > yield from gtally() > except GeneratorExit as ex: > return ex.value Exactly, which I think is a horrible thing to have to do, and I'm loathe to make any modification to PEP 380 to support this kind of pattern. > With these modifications, a framework could then quite easily provide > a context manager to make the idiom a little more readable Which would be papering over an awful mess that has no need to exist in the first place, as long as you don't insist on using technique no. 5. -- Greg From greg.ewing at canterbury.ac.nz Sat Oct 30 08:16:44 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 30 Oct 2010 19:16:44 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> <4CCB8C50.2010404@canterbury.ac.nz> Message-ID: <4CCBB84C.2080809@canterbury.ac.nz> Guido van Rossum wrote: > On Fri, Oct 29, 2010 at 8:09 PM, Greg Ewing wrote: > >>and even then you would lose the return value >>from the inner generator, which you're probably going to >>want. > > Really? Can you show a realistic use case? Here's an attempt: def variancer(): # Compute variance of values sent in (details left # as an exercise) def stddever(): # Compute standard deviation of values sent in v = yield from variancer() return sqrt(v) -- Greg From greg.ewing at canterbury.ac.nz Sat Oct 30 08:18:20 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 30 Oct 2010 19:18:20 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> <4CCB69AC.9020701@improva.dk> Message-ID: <4CCBB8AC.2010402@canterbury.ac.nz> Nick Coghlan wrote: > Here's a crazy idea though. What if gtally2() could be written as follows: > > def gtally2(): > return from gtally() That seems like an excessively special case. Most of the time you're going to want to do some processing on the value, not just return it immediately. -- Greg From ncoghlan at gmail.com Sat Oct 30 08:25:36 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 30 Oct 2010 16:25:36 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCBB789.4030208@canterbury.ac.nz> References: <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> <4CCB69AC.9020701@improva.dk> <4CCBB789.4030208@canterbury.ac.nz> Message-ID: On Sat, Oct 30, 2010 at 4:13 PM, Greg Ewing wrote: > Nick Coghlan wrote: > >> 1. Send in a sentinel value (often None, but you could easily reuse >> the exception types as sentinel values ?as well) >> 2. Throw in GeneratorExit explicitly >> 3. Throw in StopIteration explicitly >> 4. Throw in a different specific exception >> 5. 
Call g.close() >> >> Having close() return a value only helps with the last option, and >> only if the coroutine is set up to work that way. Yield from also >> isn't innately set up to unwind correctly in any of these cases, > > On the contrary, I think it works perfectly well with 1, and > also with 4 as long as the inner generator catches it in the > right place. Yeah, I'd agree with that. Unwinding the stack correctly requires cooperation from all of the intervening layers and the logic for that is likely to be a little clumsy, but the issues are not insurmountable (my own "return from" suggestion requires cooperation as well, since the layers have to explicitly invoke the alternate semantics to indicate that return values should be passed through). "return from" would make more sense as its own PEP, with the construct possibly given a meaning in ordinary functions as well (e.g. the occasionally sought tail-call optimisation in recursive functions). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rrr at ronadam.com Sat Oct 30 08:42:18 2010 From: rrr at ronadam.com (Ron Adam) Date: Sat, 30 Oct 2010 01:42:18 -0500 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCB8C50.2010404@canterbury.ac.nz> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> <4CCB8C50.2010404@canterbury.ac.nz> Message-ID: <4CCBBE4A.3040503@ronadam.com> On 10/29/2010 10:09 PM, Greg Ewing wrote: > Guido van Rossum wrote: > >> This seems to be the crux of your objection. But if I look carefully >> at the expansion in the current version of PEP 380, I don't think this >> problem actually happens: If the outer generator catches >> GeneratorExit, it closes the inner generator (by calling its close >> method, if it exists) and then re-raises the GeneratorExit: > > Yes, but if you want close() to cause the generator to finish > normally, you *don't* want that to happen. You would have to > surround the yield-from call with a try block to catch the > GeneratorExit, and even then you would lose the return value > from the inner generator, which you're probably going to > want. Ok, after thinking about this for a while, I think the "yield from" would be too limited if it could only be used for consumers that must run until the end. That rules out a whole lot of pipes, filters and other things that consume-some, emit-some, consume-some_more, and emit-some_more. I think I figured out something that may be more flexible and isn't too complicated. The trick is how to tell the "yield from" to stop delegating on a particular exception. (And be explicit about it!)

# Inside a generator or sub-generator.
...
next(<generator>)                            # works in this frame.
yield from <generator> except <exception>    # Delegate until <exception>
value = next(<generator>)                    # works in this frame again.
...

The explicit "yield from .. except" is easier to understand. It also avoids the close and return issues. It should be easier to implement as well. And it doesn't require any "special" framework in the parent generator or the delegated sub-generator to work. Here's an example.

# I prefer to use a ValueRequest exception, but someone could use
# StopIteration or GeneratorExit, if it's useful for what they
# are doing.
class ValueRequest(Exception):
    pass

# A pretty standard generator that emits
# a total when an exception is thrown in.
# It doesn't need anything special in it
# so it can be delegated.
def gtally():
    count = tally = 0
    try:
        while 1:
            tally += yield
            count += 1
    except ValueRequest:
        yield count, tally

# An example of delegating until an Exception.
# The specified "exception" is not sent to the sub-generator.
# I think explicit is better than implicit here.
def gtally_averages():
    gt = gtally()
    next(gt)
    yield from gt except ValueRequest       #Catches exception
    count, tally = gt.throw(ValueRequest)   #Get tally
    yield tally / count

# This part also already works and has no new stuff in it.
# This part isn't aware of any delegating!
def main():
    gavg = gtally_averages()
    next(gavg)
    for x in range(100):
        gavg.send(x)
    print(gavg.throw(ValueRequest))

main()

It may be that a lot of pre-existing generators will already work with this. ;-) You can still use 'yield from <generator>' to delegate until the sub-generator ends. You just won't get a value in the same frame it was used in. The parent may get it instead. That may be useful in itself. Note: you *can't* put the yield from inside a try-except and do the same thing. The exception would go to the sub-generator instead. Which is one of the messy things we are trying to avoid doing. Cheers, Ron From ncoghlan at gmail.com Sat Oct 30 09:58:13 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 30 Oct 2010 17:58:13 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCBBE4A.3040503@ronadam.com> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> <4CCB8C50.2010404@canterbury.ac.nz> <4CCBBE4A.3040503@ronadam.com> Message-ID: On Sat, Oct 30, 2010 at 4:42 PM, Ron Adam wrote: > Ok, after thinking about this for a while, I think the "yield from" would be > too limited if it could only be used for consumers that must run until the > end. That rules out a whole lot of pipes, filters and other things that > consume-some, emit-some, consume-some_more, and emit-some_more. Indeed, the "stop-in-the-middle" aspect is tricky, but is the crux of what we're struggling with here. > I think I figured out something that may be more flexible and isn't too > complicated. Basically a way to use yield from, while declaring how to force the end of iteration? Interesting idea. However, I think sentinel values are likely a better way to handle this in a pure PEP 380 context. > Here's an example. Modifying this example to use sentinel values rather than throwing in exceptions actually makes it all fairly straightforward in a PEP 380 context. So maybe the moral of this whole thread is really "sentinel values good, sentinel exceptions bad".

# Helper function to finish off a generator by sending a sentinel value
def finish(g, sentinel=None):
    try:
        g.send(sentinel)
    except StopIteration as ex:
        return ex.value

def gtally(end_tally=None):
    # Tallies numbers until sentinel is passed in
    count = tally = 0
    value = object()
    while 1:
        value = yield
        if value is end_tally:
            return count, tally
        count += 1
        tally += value

def gaverage(end_avg=None):
    count, tally = (yield from gtally(end_avg))
    return tally / count

def main():
    g = gaverage()
    next(g)
    for x in range(100):
        g.send(x)
    return finish(g)

Even more complex cases, like my sum-of-averages example (or any equivalent multi-level construct) can be implemented without too much hassle, so long as "finish current action" and "start next action" are implemented as two separate steps so the outer layer has a chance to communicate with the outside world before diving into the inner layer.
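One possible shape for such a multi-level construct, sticking with sentinel values (the names and the two-sentinel protocol here are illustrative, not Nick's original sum-of-averages code):

NEXT_GROUP = object()        # finish the current inner tally, keep going
FINISH = object()            # finish the whole computation

def gtally_group(sentinels):
    count = tally = 0
    while True:
        value = yield
        if value in sentinels:
            return value, count, tally
        count += 1
        tally += value

def gaverages():
    averages = []
    while True:
        stop, count, tally = yield from gtally_group((NEXT_GROUP, FINISH))
        if count:
            averages.append(tally / count)
        if stop is FINISH:
            return averages

def demo():
    g = gaverages()
    next(g)
    for v in (1, 2, 3):
        g.send(v)
    g.send(NEXT_GROUP)       # close out the first group (average 2.0)
    for v in (10, 20):
        g.send(v)
    try:
        g.send(FINISH)       # close out the last group and the whole generator
    except StopIteration as ex:
        return ex.value      # [2.0, 15.0]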
I think we've thrashed this out enough that I, for one, want to see how PEP 380 peforms in the wild as it currently stands before we start tinkering any further. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From jh at improva.dk Sat Oct 30 10:58:57 2010 From: jh at improva.dk (Jacob Holm) Date: Sat, 30 Oct 2010 10:58:57 +0200 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA5AE4.7080403@canterbury.ac.nz> <4CCB5AC5.8060907@canterbury.ac.nz> <4CCB5CCF.90504@improva.dk> <4CCB69AC.9020701@improva.dk> Message-ID: <4CCBDE51.5040502@improva.dk> On 2010-10-30 03:15, Guido van Rossum wrote: > On Fri, Oct 29, 2010 at 5:41 PM, Jacob Holm wrote: >> I agree that PEP 3152 is far from perfect at this point, but I like the >> basics. > > I thought the basics weren't even decided? Implicit definitions, or > implicit cocalls, terminology to be used, how to implement in Jython > or IronPython, probably more (I can't keep the ideas of that PEP in my > head so I end up blanking out on any discussion that mentions it). > The basics I am talking about are: 1) adding a new __cocall__ protocol with semantics described in terms of generators 2) adding a simpler way to call the new functions, based on "yield from" > I truly wish it was easier to experiment with syntax -- it would be so > much simpler if these PEPs could be accompanied by a library that > people can just import to use the new syntax (even if it's a C > extension) rather than by a patch to the core language. > > The need to "get it right in one shot" is keeping back the ability to > experiment at any realistic scale, so all we see (on all sides) are > trivial examples that may highlight proposed features and anticipated > problems, but this is no way to gain experience with what the *real* > problems would be. > Right. >> The reason I am so concerned with the "return value" semantics >> is that I see some problems we are having in PEP 3152 as indicating a >> likely flaw/misfeature in PEP 380. I would be much happier with both >> PEPs if they didn't conflict in this way. > > If there was a separate PEP specifying *just* returning a value from a > generator and how to get at that value using g.close(), without > yield-from, would those problems still exist? If not, that would be a > reason to move those out in a separate PEP. Assume such a PEP (call it > PEP X) existed, what would be the dependency tree? What would be the > conflicts? Would PEP 3152 make sense with PEP X but without (the rest > of) PEP 380? > If "return value" was moved from PEP 380 to PEP X, we should remove or alter the expression form of "yield from". I am currently in favor of changing "yield from" to return the StopIteration instance that stopped the inner generator, because that allows you to use different StopIteration subclasses in different circumstances (e.g. exhausted, told to quit) and still let the calling code know which one it was. This is useful for PEP 3152 but I am sure it has other uses as well. It also means that PEP X return values are still useful with yield-from, without modifications to PEP 380. (But slightly less convenient because you have to extract the value yourself). In other words, with that change to the expression form of "yield from" PEP 380 and PEP X could be completely independent and complementary. 
PEP 3152 would naturally depend on PEP X, but AFAICT it depends on PEP 380 only for ease of presentation. With the proposed change to the expression form of "yield from" there would be no conflicts either. (The current conflict is really only with the use of the current "yield from" in the presentation. The desired semantics could be defined from scratch. We just really don't want to do that) >> So much so, that I would rather miss a few features in PEP 380 in the >> *hope* of getting them right later with another PEP. > > Can you be specific? Which features? > "return value" in generators, current expression form of "yield from". >> To quote the Zen: >> >> "never is often better than *right* now" > > Um, Python 3.3 can hardly be referred to as "*right* now". > True, but the "close to acceptance" state of PEP 380 means that changes there have much more of a "now" feel than changes to other PEPs. > There are plenty of arguments in the zen for PEP X, especially "If the > implementation is easy to explain, it may be a good idea." Both > returning a value from a generator and catching that value in > g.close() are really easy to implement and the implementation is easy > to explain. It's a small evolution from the current generator code. > >> A PEP just for the "return value" shouldn't be too hard to add later if >> PEP 3152 doesn't work out, and at that point we should have a better >> idea about the best way of doing it. > > It would a small miracle if PEP 3152 worked out. I'd much rather have > a solid fallback position now. I'm not pushing for rushing PEP X to > acceptance -- I'm just hoping we can write it now and discuss it on > its own merits without too much concern for PEP 3152 or even PEP 380, > although I personally still think that the interference with PEP 380 > would minimal and not a reason for changing PEP X. > You are right, PEP X should be very small. The main points in it would be: 1) Allow "return value" in a generator, making it raise (a subclass of) StopIteration with value as the first argument. (I am now in favor of using a subclass, and treating a bare "return" as "return None". Working with PEP 3152 made me realize that there are use cases for distinguishing between a "return" and a "raise StopIteration") 2) Change g.close() to extract and return the value or add a new g.finish() for that purpose. (I'd prefer using "finish" and adding a new exception for this instead of reusing GeneratorExit. The new method+exception would make it work in my modified PEP 380 without further modification, reusing close+GeneratorExit would have the same problems as we have now) - Jacob From denis.spir at gmail.com Sat Oct 30 11:02:51 2010 From: denis.spir at gmail.com (spir) Date: Sat, 30 Oct 2010 11:02:51 +0200 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: References: <4CC93095.3080704@egenix.com> <4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz> <4CCA8622.5060405@egenix.com> <4CCAAB9D.3060909@pearwood.info> Message-ID: <20101030110251.09e8083e@o> On Fri, 29 Oct 2010 14:19:33 -1000 "Carl M. Johnson" wrote: > Thinking about it a little more, if I were making an HTML tree type > metaclass though, I wouldn't want to use an OrderedDict anyway, since > it can't have duplicate elements, and I would want the interface to be > something like: > > class body(Tree()): > h1 = "Hello World!" > p = "Lorem ipsum." > p = "Dulce et decorum est." 
> class div(Tree(id="content")): > p = "Main thing" > class div(Tree(id="footer")): > p = "(C) 2010" > > So, I'd probably end up making my own custom kind of dict that didn't > overwrite repeated names. Ah, but that's a completely different issue. You seem to be talking of an appropriate data structure to represent (the equivalent of) a parse tree, or rather an AST. In most grammars there are sequence patterns representing composite data, such as funcDef:(parameterList block) in which (1) order is not meaningful (2) most often each "kind" of element happens only once (*). And there are repetitive patterns, such as block:statement*, in which elements "kinds" also repeat. Composite elements like func defs can be represented as dicts (ordered or not), but actually their meaning is of a "flexible record", a named tuple (**). It's _not_ a collection. The point is they can be indexed by "kind" (id est which patterns generated them). Repetitive elements do not have this nice property, they must be represented as sequences of elements _holding_ their kind. For this reason, tree nodes often hold the element "kind" in addition to their actual data and some metadata. Denis (*) But that's not always true, eg addition:(addOperand '+' addOperand). (**) I miss "free objects" in python for this reason -- people often use dicts instead. I'd like to be able to write: "return (color:c, position:p)", where the defined object is instance of Object directly, or maybe of Individual, meaning Object with a __dict__. class Individual: def __init__ (self, **slots): self.__dict__ = slots def __repr__(self): return "(%s)" % \ ' '.join("%s:%s" %(k,v) for (k,v) in self.__dict__.items()) print Individual(a=135, d=100) # (a:135 d:100) If only we had a literal notation for that :-) No more need of classes for singletons. -- -- -- -- -- -- -- vit esse estrany ? spir.wikidot.com From steve at pearwood.info Sat Oct 30 11:58:23 2010 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 30 Oct 2010 20:58:23 +1100 Subject: [Python-ideas] Ordered storage of keyword arguments In-Reply-To: <20101030110251.09e8083e@o> References: <4CC93095.3080704@egenix.com> <4CC941FF.6070408@egenix.com> <4CC9E99F.5030805@canterbury.ac.nz> <4CCA8622.5060405@egenix.com> <4CCAAB9D.3060909@pearwood.info> <20101030110251.09e8083e@o> Message-ID: <4CCBEC3F.3090403@pearwood.info> spir wrote: > (**) I miss "free objects" in python for this reason -- people often use dicts instead. I'd like to be able to write: "return (color:c, position:p)", where the defined object is instance of Object directly, or maybe of Individual, meaning Object with a __dict__. > class Individual: > def __init__ (self, **slots): > self.__dict__ = slots > def __repr__(self): > return "(%s)" % \ > ' '.join("%s:%s" %(k,v) for (k,v) in self.__dict__.items()) > print Individual(a=135, d=100) # (a:135 d:100) > If only we had a literal notation for that :-) No more need of classes for singletons. We almost do. >>> from collections import namedtuple as nt >>> nt("Individual", "a d")(a=135, d=100) Individual(a=135, d=100) Short enough to write in-line. 
Or you could do this: >>> Individual = nt("Individual", "a d") >>> Individual(99, 42) Individual(a=99, d=42) It shouldn't be hard to write a function similar to namedtuple that didn't require a declaration before hand, but picked up the field names from the keyword-only arguments given: >>> from collections import record # say >>> record(a=23, b=42) record(a=23, b=24) I leave that as an exercise to the readers :) -- Steven From guido at python.org Sat Oct 30 17:00:32 2010 From: guido at python.org (Guido van Rossum) Date: Sat, 30 Oct 2010 08:00:32 -0700 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCBB84C.2080809@canterbury.ac.nz> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> <4CCB8C50.2010404@canterbury.ac.nz> <4CCBB84C.2080809@canterbury.ac.nz> Message-ID: On Fri, Oct 29, 2010 at 11:16 PM, Greg Ewing wrote: > Guido van Rossum wrote: >> >> On Fri, Oct 29, 2010 at 8:09 PM, Greg Ewing >> wrote: >> >>> and even then you would lose the return value >>> from the inner generator, which you're probably going to >>> want. >> >> Really? Can you show a realistic use case? > > Here's an attempt: > > ?def variancer(): > ? ?# Compute variance of values sent in (details left > ? ?# as an exercise) > > ?def stddever(): > ? ?# Compute standard deviation of values sent in > ? ?v = yield from variancer() > ? ?return sqrt(v) Good. I have to get a crazy idea off my chest: maybe the collective hang-up is that GeneratorExit must be special-cased. Let me explore a bit. Take a binary tree node: class Node: def __init__(self, label, left=None, right=None): self.label, self.left, self.right = label, left, right And an inorder traversal function: def inorder(node): if node: yield from inorder(node.left) yield node yield from inorder(node.right) This is a nice example, and different from gtally(), and variance()/stddev(), because of the recursion. Now let's say we want to design a protocol whereby the consumer of the nodes yielded by inorder() can ask the traversal to be stopped. With the code above this is trivial, just call g.close() or throw any other exception in. But now let's first modify inorder() to also return a value computed from the nodes traversed so far. For simplicity I'll use the count: def inorder(node): if not node: return 0 count += yield from inorder(node.left) yield node count += 1 count += yield from inorder(node.right) return count How would we stop this enumeration *and* receive a count of the nodes already enumerated up to that point? Throwing in some exception is the easiest approach. Let's say we throw EOFError. My first attempt has a bug: def inorder(node): if not node: return 0 count = 0 count += yield from inorder(node.left) # Bug here try: count += 1 yield node except EOFError: return count count += yield from inorder(node.right) return count This has the fatal flaw of not responding promptly when the EOFError is caught by the left subtree, since it returns normally and the parent doesn't "see" the EOFError: on the way in it's thrown directly into the first yield-from, on the way out there's no way to distinguish between a regular return or an interrupted one. A potential fix is to return two values: an interrupted flag and a count. But this is pretty ugly (I'm not even going to show the code). 
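For what it's worth, a rough reconstruction of what that flag-threading variant might look like (this is not code from the mail, just an illustration of why it is ugly):

def inorder(node):
    # Every level now returns (interrupted, count) and has to check the
    # flag after each yield from -- which is exactly the ugliness.
    if not node:
        return False, 0
    count = 0
    interrupted, subcount = yield from inorder(node.left)
    count += subcount
    if interrupted:
        return True, count
    try:
        count += 1
        yield node
    except EOFError:
        return True, count
    interrupted, subcount = yield from inorder(node.right)
    count += subcount
    return interrupted, count

And every caller has to unpack a pair instead of a plain count.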
A different approach to fixing this is for the throwing code to keep throwing EOFError until the generator stops yielding values: def stop(g): while True: try: g.throw(EOFError) except StopIteration as err: return err.value I'm not concerned with the situation where the generator is already stopped; the EOFError will be bounced out, but that is the caller's problem, as they shouldn't have attempted to stop an already-stopped iterator. (Jacob is probably shaking his head here. :-) This solution doesn't quite work though, because the count returned will include the nodes that were yielded while the stack of generators was winding down. My pragmatic solution for this is to change the protocol so that stopping the generator means that the node yielded last should not be included in the count. If you envision the caller to be running a for-loop, think of calling stop() at the top of the loop rather than at the bottom. (Jacob is now again wondering how they'd get the count if the iterator runs till completion. :-) We can do this by modifying inorder() to bump the count after yielding rather than before: try: yield node except EOFError: return count count += 1 Now, to get back the semantics of getting the correct count *including* the last node seen by the caller, we can modify stop() to advance the generator by one more step: def stop(g): try: next(g) while True: g.throw(EOFError) except StopIteration as err: return err.value This works even if g was positioned after the last item to be yielded: in that case next(g) raises StopIteration. It still doesn't work if we use a for-loop to iterate through the end (Jacob nods :-) but I say they shouldn't be doing that, or they can write a little wrapper for iter() that *does* save the return value from StopIteration. (Okay, half of me says it would be fine to store it on the generator object. :-) [Dramatic pause] [Drumroll] What has this got to do with GeneratorExit and g.close()? I propose to modify g.close() to keep throwing GeneratorExit until the generator stops yielding values, and then capture the return value from StopIteration if that is what was raised. The beauty is then that the PEP 380 expansion can stop special-casing GeneratorExit: it just treats it as every other exception. And stddev() above works! (If you worry about infinite loops: you can get those anyway, by putting "while: True" in an "except GeneratorExit" block. I don't see much reason to worry more in this case.) -- --Guido van Rossum (python.org/~guido) From rrr at ronadam.com Sat Oct 30 18:54:15 2010 From: rrr at ronadam.com (Ron Adam) Date: Sat, 30 Oct 2010 11:54:15 -0500 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> <4CCB8C50.2010404@canterbury.ac.nz> <4CCBBE4A.3040503@ronadam.com> Message-ID: <4CCC4DB7.9050604@ronadam.com> On 10/30/2010 02:58 AM, Nick Coghlan wrote: > On Sat, Oct 30, 2010 at 4:42 PM, Ron Adam wrote: >> Ok, after thinking about this for a while, I think the "yield from" would be >> too limited if it could only be used for consumers that must run until the >> end. That rules out a whole lot of pipes, filters and other things that >> consume-some, emit-some, consume-some_more, and emit-some_more. > > Indeed, the "stop-in-the-middle" aspect is tricky, but is the crux of > what we're struggling with here. 
> >> I think I figured out something that may be more flexible and insn't too >> complicated. > > Basically a way to use yield from, while declaring how to force the > end of iteration? Interesting idea. Not iteration, iteration can continue. It signals the end of delegation, and returns control to the generator that initiated the delegation. > However, I think sentinel values are likely a better way to handle > this in a pure PEP 380 context. Sentinel values aren't always better because they require a extra comparison on each item. >> Here's an example. > > Modifying this example to use sentinel values rather than throwing in > exceptions actually makes it all fairly straightforward in a PEP 380 > context. So maybe the moral of this whole thread is really "sentinel > values good, sentinel exceptions bad". > > # Helper function to finish off a generator by sending a sentinel value > def finish(g, sentinel=None): > try: > g.send(sentinel) > except StopIteration as ex: > return ex.value > > def gtally(end_tally=None): > # Tallies numbers until sentinel is passed in > count = tally = 0 > value = object() Left over from earlier edit? > while 1: > value = yield > if value is end_tally: > return count, tally > count += 1 > tally += value The comparison is executed on every loop. A try-except would be outside the loop. > def gaverage(end_avg=None): > count, tally = (yield from gtally(end_avg)) > return tally / count > > def main(): > g = gaverage() > next(g) > for x in range(100): > g.send(x) > return finish(g) Cheers, Ron From greg.ewing at canterbury.ac.nz Sun Oct 31 02:09:26 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 31 Oct 2010 13:09:26 +1300 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> <4CCB8C50.2010404@canterbury.ac.nz> <4CCBB84C.2080809@canterbury.ac.nz> Message-ID: <4CCCB3B6.3010209@canterbury.ac.nz> Guido van Rossum wrote: > A different approach to fixing this is for the throwing code to keep > throwing EOFError until the generator stops yielding values: That's precisely what I would recommend. > This solution doesn't quite work though, because the count returned > will include the nodes that were yielded while the stack of generators > was winding down. > > My pragmatic solution for this is to change the > protocol so that stopping the generator means that the node yielded > last should not be included in the count. This whole example seems contrived to me, so it's hard to say whether this is a good or bad solution. > I propose to > modify g.close() to keep throwing GeneratorExit until the generator > stops yielding values, and then capture the return value from > StopIteration if that is what was raised. The beauty is then that the > PEP 380 expansion can stop special-casing GeneratorExit: it just > treats it as every other exception. This was actually suggested during the initial round of discussion, and shot down -- if I remember correctly, on the grounds that it could result in infinite loops. But if you're no longer concerned about that, it's worth considering. My concern is that this would be a fairly substantial change to the intended semantics of close() -- it would no longer be a way of aborting a generator and forcing it to clean up as quickly as possible. But maybe you don't mind losing that functionality? 
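A toy sketch of the behaviour difference in question (illustrative only; the stubborn generator is invented for this example, and the closing comment describes the proposal rather than anything that exists today):

def stubborn():
    for i in range(3):
        try:
            yield i
        except GeneratorExit:
            continue        # ignores the request and keeps producing
    return "done"

g = stubborn()
next(g)
try:
    g.close()               # current semantics: the generator yielded in
except RuntimeError as ex:  # response to GeneratorExit, so this is an error
    print("today:", ex)

# Under the proposed change, close() would keep throwing GeneratorExit,
# let the generator run three more steps, and finally return "done" --
# i.e. the generator is no longer forced to wind up as soon as possible.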
-- Greg From ncoghlan at gmail.com Sun Oct 31 02:34:35 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 31 Oct 2010 10:34:35 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> <4CCB8C50.2010404@canterbury.ac.nz> <4CCBB84C.2080809@canterbury.ac.nz> Message-ID: On Sun, Oct 31, 2010 at 1:00 AM, Guido van Rossum wrote: > What has this got to do with GeneratorExit and g.close()? I propose to > modify g.close() to keep throwing GeneratorExit until the generator > stops yielding values, and then capture the return value from > StopIteration if that is what was raised. The beauty is then that the > PEP 380 expansion can stop special-casing GeneratorExit: it just > treats it as every other exception. And stddev() above works! (If you > worry about infinite loops: you can get those anyway, by putting > "while: True" in an "except GeneratorExit" block. I don't see much > reason to worry more in this case.) I'm back to liking your general idea, but wanting to use a new method and exception for the task to keep the two sets of semantics orthogonal :) If we add a finish() method that corresponds to your stop() function, and a GeneratorReturn exception as a peer to GeneratorExit: class GeneratorReturn(BaseException): pass def finish(self): if g.gi_frame is None: return self._result # (or raise RuntimeError) try: next(self) while True: self.throw(GeneratorReturn) except StopIteration as ex: return ex.value Then your node counter iterator (nice example, btw) would simply look like: def count_nodes(node): if not node: return 0 count = 0 count += yield from count_nodes(node.left) try: yield node except GeneratorReturn: return count count += 1 # Only count nodes when next is called in response count += yield from count_nodes(node.right) return count Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Sun Oct 31 02:35:46 2010 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 31 Oct 2010 10:35:46 +1000 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: <4CCC4DB7.9050604@ronadam.com> References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> <4CCB8C50.2010404@canterbury.ac.nz> <4CCBBE4A.3040503@ronadam.com> <4CCC4DB7.9050604@ronadam.com> Message-ID: On Sun, Oct 31, 2010 at 2:54 AM, Ron Adam wrote: >> However, I think sentinel values are likely a better way to handle >> this in a pure PEP 380 context. > > Sentinel values aren't always better because they require a extra comparison > on each item. Yep, Guido's example made me realise I was wrong on that front. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From rrr at ronadam.com Sun Oct 31 08:26:11 2010 From: rrr at ronadam.com (Ron Adam) Date: Sun, 31 Oct 2010 02:26:11 -0500 Subject: [Python-ideas] Possible PEP 380 tweak In-Reply-To: References: <4CC63065.9040507@improva.dk> <4CC6E94F.3090702@improva.dk> <4CC889F1.8010603@improva.dk> <4CC939E5.5070700@improva.dk> <4CC9FC87.1040600@canterbury.ac.nz> <4CCA752D.5090904@canterbury.ac.nz> <4CCB8C50.2010404@canterbury.ac.nz> <4CCBBE4A.3040503@ronadam.com> <4CCC4DB7.9050604@ronadam.com> Message-ID: <4CCD1A13.1070400@ronadam.com> On 10/30/2010 07:35 PM, Nick Coghlan wrote: > On Sun, Oct 31, 2010 at 2:54 AM, Ron Adam wrote: >>> However, I think sentinel values are likely a better way to handle >>> this in a pure PEP 380 context. >> >> Sentinel values aren't always better because they require a extra comparison >> on each item. > > Yep, Guido's example made me realise I was wrong on that front. BTW: A sentinal could still work, and the 'except ' could be optional. The finish function isn't needed in this one. def gtally(end_tally): # Tallies numbers until sentinel is passed in count = tally = 0 while 1: value = yield if value is end_tally: break count += 1 tally += value yield count, tally def gaverage(end_avg): yield from gtally(end_avg) yield tally / count def main(): g = gaverage(None) next(g) for x in range(100): g.send(x) return g.send(None) Using sentinels not always wrong either. The data may have natural sentinel values in it. In those cases, value testing is what you want. I would like to be able to do it both ways myself. :-) Cheers, Ron From andre.roberge at gmail.com Sun Oct 31 17:55:36 2010 From: andre.roberge at gmail.com (Andre Roberge) Date: Sun, 31 Oct 2010 13:55:36 -0300 Subject: [Python-ideas] Accepting "?" as a valid character for identifiers Message-ID: In some languages (e.g. Scheme, Ruby, etc.), the question mark character (?) is a valid character for identifiers. I find that using it well can improve readability of programs written in those languages. Python 3 now allow all kinds of unicode characters in source code for identifiers. This is fantastic when one wants to teach programming to non-English speakers and have them use meaningful identifiers. While Python 3 does not allow ?, it does allow characters like ? ( http://en.wikipedia.org/wiki/Glottal_stop_%28letter%29) which can be used to good effect in writing valid identifiers such as functions that return either True or False, etc., thus improving (imo) readability. Given that one can legally mimic ? in Python identifiers, and given that the ? symbol is not used for anything in Python, would it be possible to consider allowing the use of ? as a valid character in an identifier? Andr? -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Sun Oct 31 18:39:58 2010 From: masklinn at masklinn.net (Masklinn) Date: Sun, 31 Oct 2010 18:39:58 +0100 Subject: [Python-ideas] Accepting "?" as a valid character for identifiers In-Reply-To: References: Message-ID: On 2010-10-31, at 17:55 , Andre Roberge wrote: > In some languages (e.g. Scheme, Ruby, etc.), the question mark character (?) > is a valid character for identifiers. I find that using it well can improve > readability of programs written in those languages. > > Python 3 now allow all kinds of unicode characters in source code for > identifiers. This is fantastic when one wants to teach programming to > non-English speakers and have them use meaningful identifiers. 
> > While Python 3 does not allow ?, it does allow characters like ? ( > http://en.wikipedia.org/wiki/Glottal_stop_%28letter%29) which can be used > to good effect in writing valid identifiers such as functions that return > either True or False, etc., thus improving (imo) readability. > > Given that one can legally mimic ? in Python identifiers, and given that the > ? symbol is not used for anything in Python, would it be possible to > consider allowing the use of ? as a valid character in an identifier? An other interesting postfix in the same line is "!" (for mutating methods). From songofacandy at gmail.com Sun Oct 31 18:48:12 2010 From: songofacandy at gmail.com (INADA Naoki) Date: Mon, 1 Nov 2010 02:48:12 +0900 Subject: [Python-ideas] Accepting "?" as a valid character for identifiers In-Reply-To: References: Message-ID: > An other interesting postfix in the same line is "!" (for mutating methods). bytearray is a mutable type but it's methods are designed for immutable bytes. I think bytearray should provide in-place mutation method. And '!' is good for the method. For example, bytearray.strip!() should be in-place. -- INADA Naoki? From g.brandl at gmx.net Sun Oct 31 18:51:53 2010 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 31 Oct 2010 17:51:53 +0000 Subject: [Python-ideas] Accepting "?" as a valid character for identifiers In-Reply-To: References: Message-ID: Am 31.10.2010 16:55, schrieb Andre Roberge: > In some languages (e.g. Scheme, Ruby, etc.), the question mark character (?) is > a valid character for identifiers. I find that using it well can improve > readability of programs written in those languages. > > Python 3 now allow all kinds of unicode characters in source code for > identifiers. This is fantastic when one wants to teach programming to > non-English speakers and have them use meaningful identifiers. > > While Python 3 does not allow ?, it does allow characters like ? > (http://en.wikipedia.org/wiki/Glottal_stop_%28letter%29) which can be used to > good effect in writing valid identifiers such as functions that return either > True or False, etc., thus improving (imo) readability. Really? if number.even?(): # do something Since in Python, function/method calls require parens -- as opposed to Ruby, and in Scheme the parens are somewhere else, this doesn't strike me as more readable, on the contrary, it looks more noisy. Same goes for mutating methods with "!" suffix -- it looks just awkward followed by parens. (Obvious objection: use a property. Obvious answer: pick a method with args.) Another drawback of introducing such a convention this late in the design of the language is that you can never have it applied consistently. Changing the builtin and stdlib instances alone would need hundreds of compatibility aliases. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From stefan_ml at behnel.de Sun Oct 31 20:10:37 2010 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 31 Oct 2010 20:10:37 +0100 Subject: [Python-ideas] Accepting "?" as a valid character for identifiers In-Reply-To: References: Message-ID: Georg Brandl, 31.10.2010 18:51: > Am 31.10.2010 16:55, schrieb Andre Roberge: >> In some languages (e.g. Scheme, Ruby, etc.), the question mark character (?) is >> a valid character for identifiers. 
I find that using it well can improve >> readability of programs written in those languages. >> >> Python 3 now allow all kinds of unicode characters in source code for >> identifiers. This is fantastic when one wants to teach programming to >> non-English speakers and have them use meaningful identifiers. >> >> While Python 3 does not allow ?, it does allow characters like ? >> (http://en.wikipedia.org/wiki/Glottal_stop_%28letter%29) which can be used to >> good effect in writing valid identifiers such as functions that return either >> True or False, etc., thus improving (imo) readability. > > Really? > > if number.even?(): > # do something > > Since in Python, function/method calls require parens -- as opposed to Ruby, > and in Scheme the parens are somewhere else, this doesn't strike me as more > readable, on the contrary, it looks more noisy. Same goes for mutating > methods with "!" suffix -- it looks just awkward followed by parens. Hmm, that reminds me. I think we should reconsider PEP 3117. There's still some value in it. http://www.python.org/dev/peps/pep-3117/ Stefan From greg.ewing at canterbury.ac.nz Sun Oct 31 21:58:28 2010 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 01 Nov 2010 09:58:28 +1300 Subject: [Python-ideas] Accepting "?" as a valid character for identifiers In-Reply-To: References: Message-ID: <4CCDD874.9010006@canterbury.ac.nz> Andre Roberge wrote: > In some languages (e.g. Scheme, Ruby, etc.), the question mark character > (?) is a valid character for identifiers. I find that using it well can > improve readability of programs written in those languages. Opinions differ on that. I find that having punctuation mixed in with identifiers makes the code *harder* to read. My wetware parser makes a clear distinction between characters that can be part of words and characters that separate words, and '?' falls very much into the latter category for me. Also, if we did this, it would preclude ever being able to use the characters concerned as operators in the future. -- Greg From solipsis at pitrou.net Sun Oct 31 22:07:22 2010 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 31 Oct 2010 22:07:22 +0100 Subject: [Python-ideas] Accepting "?" as a valid character for identifiers References: Message-ID: <20101031220722.5063a7f9@pitrou.net> On Sun, 31 Oct 2010 13:55:36 -0300 Andre Roberge wrote: > > While Python 3 does not allow ?, it does allow characters like ? ( > http://en.wikipedia.org/wiki/Glottal_stop_%28letter%29) which can be used > to good effect in writing valid identifiers such as functions that return > either True or False, etc., thus improving (imo) readability. > > Given that one can legally mimic ? in Python identifiers, and given that the > ? symbol is not used for anything in Python, would it be possible to > consider allowing the use of ? as a valid character in an identifier? The fact that it looks like some other Unicode character is not really a valid reason to allow it in identifiers. Regards Antoine. From ben+python at benfinney.id.au Sun Oct 31 23:51:55 2010 From: ben+python at benfinney.id.au (Ben Finney) Date: Mon, 01 Nov 2010 09:51:55 +1100 Subject: [Python-ideas] Accepting "?" as a valid character for identifiers References: Message-ID: <87hbg2kmb8.fsf@benfinney.id.au> Stefan Behnel writes: > Hmm, that reminds me. I think we should reconsider PEP 3117. There's > still some value in it. 
> > http://www.python.org/dev/peps/pep-3117/ I certainly got value out of reading it :-) -- \ "On the internet you simply can't outsource parenting." --Eliza | `\ Cussen, _Top 10 Internet Filter Lies_, The Punch, 2010-03-25 | _o__) | Ben Finney From ben+python at benfinney.id.au Sun Oct 31 23:59:23 2010 From: ben+python at benfinney.id.au (Ben Finney) Date: Mon, 01 Nov 2010 09:59:23 +1100 Subject: [Python-ideas] Accepting "?" as a valid character for identifiers References: Message-ID: <87d3qqklys.fsf@benfinney.id.au> Andre Roberge writes: > While Python 3 does not allow ?, it does allow characters like ʔ ( > http://en.wikipedia.org/wiki/Glottal_stop_%28letter%29) which can be used > to good effect in writing valid identifiers such as functions that return > either True or False, etc., thus improving (imo) readability. I consider "read-over-the-telephone-ability" to be an essential component of "readability". Your identifiers containing unpronounceable characters would kill that. Unless you're going to argue that you are writing identifiers taken from a natural language that allows unambiguous pronunciation of 'ʔ' with the same concision as other characters, of course. I certainly don't want to be spelling out "U+0294 LATIN LETTER GLOTTAL STOP" for a single character when I speak an identifier. -- \ "But it is permissible to make a judgment after you have | `\ examined the evidence. In some circles it is even encouraged." | _o__) --Carl Sagan, _The Burden of Skepticism_, 1987 | Ben Finney